A complete, neutral reference for building generative AI in a healthcare context on AWS without mishandling protected health information. What makes Amazon Bedrock HIPAA-eligible and what the AWS Business Associate Addendum (BAA) actually covers; how PHI must be handled (Guardrails PII/PHI redaction, the no-training guarantee, encryption with KMS, network isolation with PrivateLink and VPC endpoints); reference architectures for clinical and administrative use cases — documentation summarization, patient intake, and medical-coding assist; the audit, logging, and de-identification (Safe Harbor vs Expert Determination, Amazon Comprehend Medical) you need; and an explicit list of what NOT to do. The body is reference content; only the example and the offer tie back to CloudRoute.
Before any model, prompt, or architecture, healthcare GenAI on AWS rests on three facts: you need a signed BAA, you may only put protected health information through HIPAA-eligible services, and AWS securing the cloud does not mean your application is compliant. Get these three right and everything else is engineering.
HIPAA (the U.S. Health Insurance Portability and Accountability Act) governs how protected health information (PHI) — health information tied to an identifiable individual — is used, stored, transmitted, and disclosed. If your application touches PHI and you are a covered entity (a provider, health plan, or clearinghouse) or a business associate acting for one, HIPAA applies to that application, including the generative-AI parts of it.
AWS supports HIPAA workloads, but on specific terms. First, you must execute the AWS Business Associate Addendum (BAA) — a contract in which AWS agrees to act as your business associate and to safeguard PHI according to HIPAA. The BAA is self-service to accept through AWS Artifact (for most accounts) and applies across your AWS Organization. Without an executed BAA, you should not place PHI on AWS at all, regardless of how the architecture looks.
Second, the BAA only covers HIPAA-eligible services — a defined (and growing) list of AWS services AWS has brought into scope for PHI. Amazon Bedrock is on that list, as are the services you will build around it (Amazon S3, AWS KMS, Amazon ECS/EKS/Lambda, Amazon API Gateway, Amazon CloudWatch and CloudTrail, Amazon Comprehend Medical, and more). You must architect so that PHI only ever flows through eligible services — a single non-eligible service in the PHI path (an analytics tool, a logging sink, a third-party SaaS) breaks the boundary. The eligible-services list changes; confirm current scope on the AWS HIPAA-eligible-services page before you finalize a design.
Third, and most misunderstood: HIPAA on AWS is a shared responsibility. AWS is responsible for the security of the cloud (the physical infrastructure, the eligible services themselves). You are responsible for security in the cloud — how you configure encryption, IAM, networking, logging, retention, and the application logic, including what PHI you send to a model and what you do with the response. The phrase "Bedrock is HIPAA-eligible" describes AWS's half. Your half — the architecture in the rest of this guide — is what actually determines whether your application is compliant.
Before PHI touches a model on AWS: (1) an executed AWS BAA is in place, (2) every service in the PHI path is HIPAA-eligible (Bedrock is), and (3) you have designed your half of the shared-responsibility model — encryption, IAM, network isolation, logging, retention. Eligibility is necessary, not sufficient; compliance is your architecture.
The reason Bedrock is a credible place to build healthcare GenAI comes down to four properties working together: your data does not train the base models, it is encrypted everywhere, it can be fully network-isolated, and a guardrail can detect and redact PHI before it is ever exposed. Each closes a specific risk.
A foundation model is only safe for PHI if you can answer four questions: does my data leak into the model? is it encrypted at rest and in transit? can someone outside my account see it? and can PHI escape in a prompt or a response? Bedrock gives a concrete answer to each.
The single most important property for healthcare: Amazon Bedrock does not use your prompts, completions, or any data you submit to train the underlying foundation models, and your content is not shared with the model providers (Anthropic, Meta, Mistral, Amazon, Cohere, and the rest). Your data stays within your AWS account and the Region you call; it is used only to serve your request. This is what makes it defensible to send clinical text to Bedrock that you could never send to a consumer chatbot — the data is not absorbed into a model that could later surface it to someone else. When you fine-tune or customize a model on Bedrock, the resulting custom model is private to your account as well. This property should be confirmed against current AWS documentation and reflected in your own risk assessment, but it is the published behavior and the basis of most healthcare Bedrock designs.
PHI must be encrypted in transit and at rest, and on AWS that is table stakes. Traffic to Bedrock is over TLS. Data at rest — your S3 documents, your vector store, your logs, any fine-tuning data, and Bedrock-managed artifacts — is encrypted with AWS KMS. For a HIPAA workload, use customer-managed KMS keys (CMKs) rather than the default AWS-managed keys: a CMK gives you control over the key policy, key rotation, and — importantly for audit — a CloudTrail record of every decrypt, so you can prove who (or what service) accessed encrypted PHI and when. Scope key policies to least privilege so only the specific roles that must decrypt PHI can use the key.
By default, calls to the Bedrock API traverse the public AWS endpoint over TLS. For a PHI workload you almost always want to keep that traffic off the public internet entirely using AWS PrivateLink: create a VPC interface endpoint for Bedrock so your application calls the service over a private connection inside your VPC, never routing through an internet gateway. Combine this with security groups, private subnets, and VPC endpoint policies (and S3 gateway endpoints for your document store) so the entire PHI path — application to Bedrock, application to S3, application to the vector store — stays within your private network. This is one of the highest-value controls for satisfying a security reviewer: PHI never leaves AWS's private network on its way to the model.
Even with the model isolated, you still have to control what enters and leaves it. A Bedrock Guardrail is the policy layer that does this, and it is central to a healthcare design. The sensitive-information policy detects PII/PHI — names, addresses, phone numbers, emails, national-ID and medical-record numbers, and more — in both the prompt and the response, and for each type you choose to block the interaction or redact/mask the value (replacing it with a placeholder like {NAME} or {MRN}) so the conversation continues without exposing the data. You can add custom regex patterns for organization-specific identifiers (MRNs, encounter IDs, payer member numbers). Beyond PII, the guardrail enforces denied topics (e.g., refusing to give a diagnosis or treatment recommendation when the app is informational only), content filters, the prompt-attack filter, and contextual grounding + relevance so the model does not assert medical claims unsupported by your source content. One versioned guardrail becomes the single, auditable place your AI safety and PHI policy lives — see the amazon-bedrock-guardrails and amazon-bedrock-guardrails-pii-redaction siblings for the full mechanics.
| Control | Risk it closes | AWS mechanism | Healthcare-specific note |
|---|---|---|---|
| No-training guarantee | PHI absorbed into a shared model | Bedrock data-handling (account/Region-scoped) | Why clinical text can go to Bedrock but not a consumer chatbot |
| Encryption at rest | PHI readable on disk / in storage | AWS KMS — use customer-managed keys (CMKs) | CMK gives key control + a CloudTrail decrypt audit trail |
| Encryption in transit | PHI intercepted on the wire | TLS to all endpoints | Pair with PrivateLink to also avoid the public internet |
| Network isolation | PHI traversing the public internet | PrivateLink + VPC interface/gateway endpoints | Keeps the whole PHI path inside your VPC |
| Guardrails PII/PHI redaction | PHI leaking in a prompt or response | Sensitive-info policy (block/redact) + custom regex | Redact MRNs/encounter IDs both directions |
| Guardrails denied topics + grounding | Out-of-scope advice / hallucinated claims | Denied topics + contextual grounding thresholds | Keep informational apps from "diagnosing" |
| IAM least-privilege | Over-broad access to PHI + models | IAM roles/policies, scoped KMS key policy | Only the roles that must touch PHI can |
| Audit logging | No evidence of who accessed PHI | CloudTrail + CloudWatch + Bedrock invocation logging | The evidence a HIPAA audit expects |
PHI exposure is not uniform — it depends entirely on what the application does. The three most common healthcare GenAI use cases — documentation summarization, patient intake, and medical-coding assist — each have a different PHI profile and therefore a different reference architecture. The discipline is to match the controls to the actual exposure.
A useful framing: every healthcare GenAI architecture is some combination of "minimize the PHI that reaches the model," "isolate and encrypt the path it travels," "constrain what the model can say," and "log everything." What changes between use cases is the emphasis. The three patterns below show that emphasis shifting.
The use case: summarize clinical notes, visit transcripts, or a longitudinal record into a concise note for a clinician — an internal, clinician-facing assistant. PHI exposure is high (the input is clinical text full of PHI) but the audience is trusted (licensed staff), which shifts the design toward isolation and grounding rather than aggressive redaction. Reference shape: source documents in an encrypted (CMK) S3 bucket; the application runs in a private VPC and calls Bedrock over a PrivateLink endpoint; a generation model (e.g., Claude on Bedrock for strong long-document reasoning) produces the summary; a Guardrail enforces grounding (the summary must follow from the note, not invent findings) and the prompt-attack filter, with PII redaction tuned conservatively since clinicians need to see patient identifiers; every invocation is logged to CloudTrail/CloudWatch with the user identity. Because the output is clinical, treat it as decision-support, keep a clinician in the loop, and never let the summary auto-populate the record without review.
The use case: a patient-facing assistant that collects intake information, answers benefits or scheduling questions, or triages symptoms to the right resource. Exposure is two-directional and the audience is untrusted — patients will type PHI into prompts, and the assistant must never emit one patient's PHI to another or give clinical advice it is not authorized to give. This is the design where Guardrails work hardest: PII/PHI detection set to redact so collected identifiers are masked before storage/logging; denied topics for "diagnosis" and "treatment recommendation" if the app is informational; the prompt-attack filter on (public input is hostile by default); and, for any answers drawn from clinical or plan content, a Knowledge Base with contextual-grounding thresholds so answers are cited and supported. Strict per-user/session isolation in retrieval prevents cross-patient leakage. Authenticate before collecting PHI, and surface a human-handoff path for anything clinical.
The use case: suggest ICD-10 / CPT / HCPCS codes from clinical documentation to assist (not replace) a certified coder. Exposure is high and the bar is accuracy and auditability, because miscoding has financial and compliance consequences. Reference shape: a retrieval-augmented design (see the rag-on-aws sibling) where the model is grounded in the actual coding guidelines and the patient's documentation rather than relying on parametric memory; the model proposes codes with the supporting passage cited; a Guardrail enforces grounding and redacts PHI from any logs; and the proposed codes are always reviewed and confirmed by a human coder, with that decision logged. The point of the GenAI here is to accelerate and support a credentialed human, never to auto-submit codes — both for compliance and because grounded suggestions with citations are far more useful to a coder than an unexplained label.
| Use case | Audience | PHI exposure | Architecture emphasis | Key AWS pieces |
|---|---|---|---|---|
| Clinical documentation summarization | Clinicians (trusted) | High (input is clinical text) | Isolation + grounding; human review of output | Bedrock (Claude) + PrivateLink + KMS + Guardrail (grounding) |
| Patient intake / patient assistant | Patients (untrusted) | Two-directional | Redaction + denied topics + prompt-attack + per-session isolation | Bedrock + Guardrail (PII-redact) + Knowledge Base + auth |
| Medical-coding assist | Certified coders (trusted) | High + accuracy-critical | RAG grounding + citations + mandatory human confirm | Bedrock + Knowledge Base (coding guidelines) + Guardrail + audit |
| Admin / back-office (claims, prior-auth drafts) | Staff (trusted) | Variable | De-identify where possible + grounding + logging | Comprehend Medical (de-id) + Bedrock + Guardrail |
The most robust way to handle PHI in a generative-AI pipeline is, where the use case allows, to remove it before the model ever sees it. De-identified data is no longer PHI under HIPAA, which dramatically shrinks your risk surface. HIPAA defines two recognized methods, and AWS provides a service purpose-built to help.
De-identification is not always possible — a clinician summarizing a specific patient's record needs the identifiers — but for analytics, model evaluation, prompt-engineering datasets, training/fine-tuning corpora, and many administrative tasks, you can often strip PHI first. When you can, do: it is the single highest-leverage way to reduce compliance scope, because HIPAA simply stops applying to properly de-identified data.
Safe Harbor is the prescriptive method: remove the 18 specified categories of identifiers (names; geographic subdivisions smaller than a state; all date elements more specific than year for dates directly related to an individual; phone/fax numbers; email; SSNs; medical-record numbers; health-plan beneficiary numbers; account numbers; certificate/license numbers; vehicle and device identifiers; URLs and IP addresses; biometric identifiers; full-face photos; and any other unique identifying number, characteristic, or code) and have no actual knowledge that the remaining information could re-identify the individual. Expert Determination is the statistical method: a qualified expert applies accepted statistical or scientific principles and documents that the risk of re-identification is very small. Safe Harbor is simpler and more common for engineering pipelines; Expert Determination is used when you need to retain more granularity (e.g., specific dates) than Safe Harbor allows. Which method you use is a compliance decision to make with counsel, not an engineering default.
Amazon Comprehend Medical is a HIPAA-eligible NLP service that extracts information from unstructured clinical text — and, critically here, it has a PHI detection (DetectPHI) capability that identifies protected health information so you can redact or replace it programmatically at scale. The common pattern is a pre-processing step: run source documents through Comprehend Medical to detect PHI, redact or tokenize the detected entities, and only then pass the de-identified text into your Bedrock pipeline for summarization, classification, or RAG. This pairs naturally with Guardrails — Comprehend Medical de-identifies the corpus up front, and the Guardrail catches anything that slips through at inference time, giving you defense-in-depth rather than a single point of failure. As with any automated detector, validate recall on your own documents; de-identification you rely on for compliance should be tested, not assumed.
If the use case does not genuinely need identifiers, de-identify before the model sees the data — properly de-identified data is no longer PHI, so HIPAA stops applying and your risk surface shrinks. Use Safe Harbor (remove the 18 identifier categories) or Expert Determination (documented statistical method) per your compliance team, and Amazon Comprehend Medical to detect/redact PHI at scale. Keep Guardrails on anyway as a second layer.
HIPAA does not just require that PHI be protected — it requires that you can prove it. The audit and access-control layer is what turns a secure architecture into a defensible one, and it is exactly what a HIPAA assessment, an OCR inquiry, or a hospital security review will ask to see.
Three obligations drive this layer: access to PHI must be least-privilege and authenticated; every access and inference must be logged immutably; and PHI must be retained and disposed of according to policy. AWS provides a direct mechanism for each.
Most healthcare GenAI compliance failures are not exotic — they are a short list of avoidable mistakes. Knowing them is as valuable as knowing the architecture, because any one of them can turn a well-built system into a breach.
These are the patterns a HIPAA-experienced reviewer looks for first. Each has a simple fix already covered above; the failure is almost always omission, not impossibility.
Here is the order a HIPAA-experienced team actually builds in: contracts and scope first, then the secure foundation, then the model layer, then the controls, then the evidence. Skipping ahead to the model is the most common way teams end up reworking everything.
The gap between a defensible healthcare GenAI build and a compliance incident is a handful of specific decisions. This is the same application configured the safe way versus the way teams get it wrong — read the left column as the standard and the right column as the failure mode to avoid.
| Decision point | HIPAA-ready (do this) | Risky (avoid) | Why it matters |
|---|---|---|---|
| Contract | Executed AWS BAA before any PHI | PHI on AWS with no BAA | No BAA = not permitted to place PHI at all |
| Services in the PHI path | Only HIPAA-eligible services (Bedrock, S3, KMS, Comprehend Medical…) | A non-eligible analytics/SaaS/logging tool in the path | One non-eligible service breaks the boundary |
| Model data handling | Bedrock — no training on your data, account/Region-scoped | A consumer chatbot or unvetted API | Consumer tools may absorb/expose PHI |
| Encryption | Customer-managed KMS keys (CMK) + TLS | Default keys, or unencrypted stores/logs | CMK gives control + a decrypt audit trail |
| Network | PrivateLink / VPC endpoints — PHI stays private | Public Bedrock endpoint over the internet | Keeps PHI off the public internet |
| Prompt/response safety | Guardrail: PII/PHI redact + denied topics + grounding | Raw model calls, no redaction or scope limits | Stops PHI leaks + hallucinated/out-of-scope advice |
| Logging | CloudTrail + redacted, encrypted invocation logs | Verbose raw-PHI logs, under-protected | Logs are a classic silent PHI leak |
| Human oversight | Clinician/coder confirms clinical + coding output | Auto-apply model output to record/claims | GenAI assists credentialed humans, never replaces them |
Situation: The team wanted an assistant that summarized visit notes and prior records into a clean draft note for clinicians — but it was squarely a PHI workload, and they had no BAA, no PHI-safe architecture, and no compliance story to show the clinics evaluating them. They could not risk PHI leaking into logs or into a model that might surface it elsewhere, the assistant could not drift into making diagnoses, and the two engineers who could build it were fully committed to the core product. On top of that, the projected Bedrock and infrastructure bill made the founder hesitant to start while pre-revenue.
What CloudRoute did: CloudRoute matched them in under 24 hours to an AWS partner with HIPAA and Bedrock experience. The partner executed the AWS BAA, scoped the build to HIPAA-eligible services only, and stood up the secure foundation: a private VPC with PrivateLink to Bedrock and gateway endpoints to S3, customer-managed KMS keys on every PHI store, least-privilege IAM, and CloudTrail to an immutable log bucket. They built the summarizer on Claude on Bedrock with a single versioned Guardrail in front — PII/PHI detection (custom regex for MRNs and encounter IDs), denied topics covering "diagnosis" and "treatment recommendation," prompt-attack and content filters on, and contextual-grounding thresholds so the draft note could not assert findings not in the source. Invocation logging was encrypted and redacted, an adversarial test set drove threshold tuning, and a clinician-review step was mandatory before any draft reached the record. The partner also filed a Bedrock POC credit application plus an Activate application to fund the build.
Outcome: The assistant produced grounded draft notes, redacted PHI in logs, refused out-of-scope clinical advice, and gave the team a versioned policy artifact plus a logging/IAM/encryption story to put in front of the clinics' security reviewers. The build and the first months of inference ran entirely on approved AWS credits, so the team paid $0 out of pocket. CloudRoute's commission was paid by the partner from AWS engagement funding, not by the customer.
foundation: BAA + PrivateLink + CMK + least-privilege IAM · guardrail: PHI-redact + denied topics + grounding · human-in-the-loop · credits: POC + Activate · out-of-pocket: $0
Whatever your HIPAA-ready GenAI workload would cost on Bedrock, AWS credits can cover it. CloudRoute routes you to the right credit pool (Activate up to $100K, Bedrock POC $10K–$50K, GenAI Accelerator up to $1M) and a vetted, HIPAA-experienced AWS partner who builds it right — BAA, PrivateLink, KMS, PHI redaction, grounding, and the audit trail. Customer pays $0.