A focused, neutral reference for the one Amazon Bedrock Guardrails control that regulated teams care about most: the sensitive-information filter. How it detects PII in prompts and responses, the difference between blocking the request and masking/redacting the value, the built-in entity types, custom regex for your own identifiers, applying it on Converse/InvokeModel, to Agents and Knowledge Bases, and standalone over any model via the ApplyGuardrail API — plus how it maps to HIPAA, PCI DSS, and GDPR, how it combines with model-invocation logging and encryption, how to test it, the real limitations, and a reference architecture for regulated workloads.
A foundation model will happily echo a Social Security number a user pasted into a prompt, or surface a phone number it retrieved from a document. PII redaction is the Bedrock Guardrails control that stops sensitive values from flowing where they should not — in either direction — without you writing detection code.
Bedrock PII redaction is the behavior of the sensitive-information filter, one of the policy types inside an Amazon Bedrock Guardrail. A guardrail is a named, model-independent policy object that Bedrock evaluates against every interaction; the sensitive-information policy is the part of it that finds personally identifiable information and acts on it. Like every guardrail policy, it runs on both directions — the user input (someone pasting a customer's card number into a chat) and the model output (the model emitting an email address it read from a retrieved file).
The word "redaction" is doing specific work here. When the filter detects a sensitive value it can do one of two things: block the entire interaction (the request never reaches the model, or the response is withheld, and the user sees a message you configured), or anonymize it — what most people mean by "redaction" or "masking" — where the value is replaced in place with a placeholder token such as {EMAIL}, {PHONE}, or {SSN}, and the rest of the interaction proceeds. Anonymizing is what lets a support assistant keep helping a user who happened to include their phone number, while ensuring that number is never stored, logged, or returned verbatim.
Because the guardrail is a separate object from the model, the redaction policy is portable and auditable: it is defined in one place, it is the same no matter which foundation model the request hits, and it can be versioned and shown to a reviewer. This page is a deep dive on this single policy. For the full survey of every guardrail filter type — content filters, denied topics, word filters, contextual grounding — see the amazon-bedrock-guardrails sibling; this page assumes that context and goes deep on PII specifically.
Bedrock PII redaction is the sensitive-information policy in a Bedrock Guardrail: it detects PII (built-in entity types plus your own regex) in both the prompt and the response, and per type either blocks the interaction or masks the value in place with a placeholder — so sensitive data never reaches the model, the logs, or the user.
Detection is only as good as the things you tell it to look for. Bedrock ships a managed list of common PII entity types and lets you extend it with regular expressions for the identifiers that are specific to your business.
The sensitive-information filter recognizes two kinds of things: a managed set of built-in PII entity types, and any custom regular-expression patterns you define. You enable the entity types relevant to your workload and pick an action for each; you do not have to take the whole list.
Bedrock maintains detectors for the categories of PII that show up across most applications. These broadly fall into a few groups: general personal identifiers (name, email address, phone number, physical/mailing address, age); government and national IDs (US Social Security number, passport number, driver's-license number, and country-specific national identifiers); financial data (credit/debit-card number, card expiry and CVV, bank-account and routing numbers, SWIFT/IBAN); and technical identifiers (IP address, MAC address, URL, AWS access keys, and similar). The exact set evolves and varies by region, so treat any list as representative and confirm the currently supported entity types and regional coverage in the AWS Bedrock Guardrails documentation.
Each enabled entity type gets its own action — you can block on credit-card numbers (you never want one in the system at all) while choosing to mask names and emails (so conversations keep flowing). That per-type granularity is the point: the right policy is rarely "block everything" or "mask everything," it is a deliberate matrix of which identifiers are intolerable versus which are acceptable once anonymized.
The built-in list covers common PII, but every regulated business has identifiers that are private and specific to it: medical-record numbers (MRNs), insurance policy or claim IDs, internal customer/account numbers, case references, member IDs. For these you define a custom regular-expression pattern in the guardrail, give it a name and an action (block or mask, with your own placeholder), and the filter applies it exactly like a built-in type — on both input and output. This is how you close the gap between "PII the world recognizes" and "the sensitive identifiers that exist only in your domain."
A custom regex is deterministic literal-pattern matching, which makes it predictable and cheap but only as good as the pattern. An MRN format like an 8-digit number is easy; something with looser formatting needs care to avoid both misses (a real MRN that does not match) and false hits (an unrelated number that does). The disciplined approach is to write the pattern from the actual format spec, then validate it against a corpus of real and decoy values during testing (covered in section VI).
| Detector kind | Examples | How it decides | Actions available | Best for | Limitation |
|---|---|---|---|---|---|
| General identifiers | Name, email, phone, address, age | Managed detection | Block or mask | Everyday PII in chat/forms | Names are context-dependent |
| Government / national IDs | SSN, passport, driver's license | Managed detection | Block or mask | KYC, healthcare, gov workloads | Regional coverage varies |
| Financial data | Card number, CVV, IBAN, routing # | Managed detection | Block or mask | PCI-scoped flows | Block is usually safer than mask |
| Technical identifiers | IP, MAC, URL, access keys | Managed detection | Block or mask | Logs, secrets hygiene | IPs can be legitimately needed |
| Custom regex | MRN, policy ID, account #, claim ref | Your regular expression | Block or mask (your placeholder) | Org-specific identifiers | Only as good as the pattern |
The single most important configuration decision for this filter is, per entity type, whether to block the interaction or anonymize the value. They produce very different user experiences and very different risk profiles.
Both actions prevent the sensitive value from being exposed; they differ in what happens to the interaction around it.
With the block action, detecting the entity stops the interaction: an offending prompt never reaches the model, and an offending response is withheld; the caller receives the blocked-message text you configured (you can set different messages for the input and output directions). Block is the right choice for data you never want in the system at all — for many teams, full credit-card numbers (PCI), or in some designs any national-ID number. The trade-off is user experience: a single stray value rejects the whole turn, which can frustrate legitimate users who included something innocuous-looking.
With the anonymize action (mask/redact), the interaction proceeds but the detected value is replaced in place with a placeholder token — for example {EMAIL}, {PHONE}, {NAME}, or a custom tag for a custom regex. On the input side this means the model receives the prompt with PII already masked, so the raw value never reaches the model context at all; on the output side it means any PII the model produced (e.g., from retrieved content) is masked before it is returned. The conversation keeps flowing, which is why masking is the usual default for common identifiers like names, emails, and phone numbers in customer-facing assistants. The trade-off is that a masked value is, by design, gone — if downstream logic genuinely needed the real phone number, masking will have removed it, so masking belongs on the values you want to protect, not the ones the workflow depends on.
| Action | What happens to the turn | Does the model see the value? | User experience | Choose it for |
|---|---|---|---|---|
| Block | Interaction refused; configured message returned | No — request never proceeds | Hard stop on the whole turn | Data that must never be present (e.g. card numbers / PCI) |
| Redact / mask | Interaction continues; value replaced with placeholder | No — sees the placeholder, not the raw value | Seamless; conversation flows | Common PII you want protected, not blocked (names, emails, phones) |
Block the values that must never exist in the system (often full card numbers / PCI data); mask the everyday identifiers you want to protect but that should not kill a conversation (names, emails, phones, addresses). Set it per entity type — the right policy is a deliberate matrix, not one global switch. Remember masking on the input side means the model never even sees the raw value.
A redaction policy is only useful if it covers every path data can take. Because the sensitive-information filter lives in a model-independent guardrail, the same policy reaches direct model calls, autonomous agents, retrieval pipelines, and even models outside Bedrock.
There are three ways the same PII policy gets enforced, and for regulated workloads you typically use more than one.
The standard pattern: pass the guardrail ID and version on the Converse or InvokeModel call. Bedrock runs the sensitive-information policy on the input before the model sees it (masking or blocking PII in the prompt) and on the output before it returns to you (masking or blocking PII in the response). Because the guardrail is model-independent, the identical redaction policy applies whether the request goes to Claude, Amazon Nova, Llama, or Mistral — you can change models without re-implementing PII handling.
Attach the guardrail to a Bedrock Agent and the redaction policy screens every step of the agent's loop — user input, and the model output at each turn — which matters because agents read tool outputs and retrieved documents that can contain PII the user never typed. The same logic applies to Knowledge Bases / RAG: retrieved passages may carry PII from your source corpus, so screening the output catches a model that would otherwise surface, say, a customer's address pulled verbatim from an indexed document. For PII specifically, RAG is a common leak vector precisely because the sensitive data enters via retrieval rather than via the user — see the rag-on-aws and amazon-bedrock-knowledge-bases siblings.
The ApplyGuardrail API evaluates a piece of text against the guardrail and returns the assessment — what was detected, what was masked, whether it was blocked — without invoking a model. For PII redaction this is powerful in two ways. First, you can sanitize text at any point in your own pipeline: scrub a document before it is indexed into a Knowledge Base, redact a record before it is written to a log or a ticket, validate user input at the edge. Second, it lets you put Bedrock's PII redaction in front of models that are not on Bedrock — a SageMaker endpoint, a self-hosted open-weight model, or a third-party API — by calling ApplyGuardrail on the text before and after the external call. The redaction policy becomes a portable privacy layer over your entire AI stack, not just the Bedrock-hosted parts.
PII redaction is frequently the control that makes a generative-AI feature shippable in a regulated context. The mapping to the major regimes is direct, but the boundary of what the control does and does not cover has to be stated honestly.
The sensitive-information filter speaks the language regulators care about: detect the protected data, and prevent it from being stored, logged, or returned. How that maps:
The honest caveat, repeated because it matters: a guardrail is a technical control, not a certification. HIPAA, PCI DSS, and GDPR are organizational programs — BAAs and DPAs, scoping, access controls, encryption, logging, breach processes, audits — and PII redaction is one part. Treat it as defense-in-depth alongside the controls in the next section, not as something you can fully outsource the regulatory risk to.
Three things the redaction policy provides: a single defined place the PII-handling rules live, evidence the rules ran on every input and output, and a version history proving which policy was in force when. PII redaction maps to HIPAA/PCI/GDPR data-minimization; versioning maps to SOC 2 change-control. It is a control, not a certificate.
Redaction does not exist in isolation. The two places teams most often undo their own redaction are logging and storage — and the fix is to think of redaction, logging, and encryption as one design.
The most common self-inflicted leak in a regulated GenAI build is logging the raw data before it was redacted. Bedrock can capture model-invocation logging (the inputs and outputs of model calls, sent to Amazon CloudWatch Logs and/or Amazon S3), which is invaluable for debugging and audit — but if you log the raw request and then redact, you have a copy of the unredacted PII sitting in your logs. Order and ownership matter.
Design so that what gets persisted is the redacted form. With input-side masking, the value is already a placeholder by the time it enters the model context, which helps — but you still control your own application logs, and those are where raw user input most often lands. Treat any log that could contain pre-redaction text as in-scope sensitive data: restrict it with IAM least-privilege, encrypt it, set tight retention, and prefer logging the guardrail-processed (masked) text over the raw text. If you enable Bedrock model-invocation logging, lock down and encrypt the destination bucket/log group and scope retention deliberately — confirm exactly what the logs capture relative to your guardrail in the AWS docs.
Redaction reduces what sensitive data exists, but the data that does exist still needs the standard controls: encryption at rest and in transit (including KMS-managed keys for the buckets/log groups that touch model I/O), IAM least-privilege so only the services and roles that must invoke the model and read the logs can, and VPC/network controls as appropriate. And because the guardrail is the place your PII policy lives, define it in infrastructure-as-code, publish numbered versions, and reference a specific version at inference time — so the redaction policy is reviewable, roll-back-able, and auditable rather than buried in a console.
The classic mistake: redact for the user, but log the raw request for "debugging." Now the unredacted PII lives in CloudWatch/S3 and is in audit scope anyway. Persist the masked form, encrypt and lock down anything that could hold raw input, and confirm what Bedrock model-invocation logging captures relative to your guardrail.
A PII filter that is never tested is a liability dressed as a control. Both failure modes — letting real PII through, and over-redacting legitimate text — are real, and only testing tells you where you sit.
Before relying on the redaction policy, exercise it. The Bedrock console provides a test window where you submit sample inputs and responses against a draft guardrail and see exactly what is detected, what is masked, and what is blocked — without wiring it into your application. The discipline is to go beyond ad-hoc clicks and build a labelled test set you can run repeatedly (scriptable through the API, including ApplyGuardrail):
The goal is the right balance: high enough recall that real PII does not slip through, high enough precision that you are not masking order confirmations into uselessness. You will not hit perfect on either — so for high-stakes flows keep a human escalation path and layer redaction with the storage/logging controls above rather than treating it as the only line of defense.
The sensitive-information filter is strong and worth defaulting to for anything that touches personal data, but it is bounded. Deploying it well means knowing the edges.
PII redaction is the right default for any GenAI workload that touches personal or regulated data — but it is one technical control. Pair it with encryption, IAM least-privilege, redaction-aware logging, testing, and the rest of your compliance program. It makes regulated GenAI feasible; it does not make it automatically compliant.
A scannable summary of how the sensitive-information policy behaves under each action and detector kind: what happens to the turn, whether the model ever sees the raw value, and what to choose it for. Use it to set a per-entity-type policy rather than one global switch.
| Configuration | What it does | Model sees raw value? | Conversation continues? | Tunable? | Best for |
|---|---|---|---|---|---|
| Built-in type → block | Refuses the interaction; returns your message | No | No — turn stops | Per entity type | Data that must never exist (cards / PCI) |
| Built-in type → mask | Replaces value with a placeholder in place | No — sees the placeholder | Yes | Per entity type | Common PII to protect (name, email, phone) |
| Custom regex → block | Refuses on your pattern match | No | No — turn stops | Your pattern + action | Intolerable org IDs |
| Custom regex → mask | Masks your pattern with your placeholder | No — sees the placeholder | Yes | Your pattern + action | Org IDs to protect (MRN, policy ID) |
| Input-direction policy | Screens the prompt before the model | No (if masked/blocked) | Depends on action | Yes | Stopping users leaking PII in |
| Output-direction policy | Screens the response (incl. RAG content) | n/a | Depends on action | Yes | Stopping the model emitting PII out |
Situation: The team wanted a member-facing assistant that could answer questions about claims and benefits from their own documents, but members routinely paste sensitive data into chat — card numbers when asking about a payment, Social Security numbers, and the company's own member IDs and medical-record numbers — and the retrieved policy/claims documents themselves contained PII. They were operating in a HIPAA- and PCI-conscious environment, could not let card numbers or PHI land in their logs or be echoed back, needed something they could show a security reviewer, and did not want to burn seed runway on inference while still pre-revenue.
What CloudRoute did: CloudRoute matched them in under 24 hours to an AWS partner with healthcare/fintech and Bedrock experience. The partner built the assistant on a Knowledge Base over the documents and put a single Bedrock Guardrail in front of it, configured around the sensitive-information policy: credit-card numbers, CVV, and SSN set to block (never tolerated); names, emails, phone numbers, and addresses set to mask on both input and output; and custom regex patterns for member IDs and medical-record numbers set to mask with dedicated placeholders. They ran ApplyGuardrail at ingestion to scrub the documents before indexing, made application logs persist only the masked form and encrypted the model-invocation-logging destination with a tight retention window, scoped IAM to least-privilege, built a labelled positive/negative test set (real values plus near-miss decoys) and tuned against it, then versioned the whole guardrail in infrastructure-as-code and referenced the version explicitly. The partner also filed a Bedrock POC credit application plus an Activate application to fund the build.
Outcome: Card numbers and SSNs were blocked outright, names/emails/phones/addresses and the custom member-ID and MRN patterns were masked in both directions, retrieved documents could not surface PII, and raw sensitive data never reached the logs — giving the team a single versioned redaction policy to put in front of their reviewer. Inference, the knowledge base, and Guardrails were fully covered by the approved AWS credits, so the build ran at $0 out of pocket. CloudRoute's commission was paid by the partner from AWS engagement funding, not by the customer.
PII policy: block cards/SSN · mask names/emails/phones + custom MRN & member-ID regex · scrubbed at ingestion · redaction-aware logging · versioned in IaC · credits: POC + Activate · out-of-pocket: $0
Whatever your PII-redacting, HIPAA/PCI/GDPR-conscious GenAI workload would cost on Bedrock, AWS credits can cover it. CloudRoute routes you to the right credit pool (Activate up to $100K, Bedrock POC $10K–$50K, GenAI Accelerator up to $1M) and a vetted AWS partner who builds it right — sensitive-information filters, custom regex, redaction-aware logging, encryption, the audit trail. Customer pays $0.