A neutral, reference-grade guide to building generative AI on AWS inside a fintech compliance perimeter: how the SOC 2 and PCI-DSS obligations actually map to the AWS GenAI stack, the data-handling controls that move an audit (Bedrock's no-training default, Guardrails PII/PAN redaction, KMS encryption, PrivateLink, region-pinned residency), which use cases fintechs ship versus the autonomous credit and fraud decisions they should not, how to make the system auditable, the model-risk and governance frame, what to avoid, and a reference architecture — plus how AWS credits and a fintech-experienced partner make the build cost $0.
Before any architecture, get the frame right. "Compliant" is not a property of a model — it is a property of the whole system and the program around it. For a fintech, three things define the perimeter the GenAI workload has to live inside, and they pull in different directions.
SOC 2 is a controls attestation (Trust Services Criteria: security, availability, processing integrity, confidentiality, privacy) produced by an auditor against controls you define and operate. It is not a checklist AWS hands you — it is evidence that the controls you claim are real and were enforced over a period. For a GenAI feature this means: who can call the model, how data flowing through it is protected, how the system is monitored, and how changes (including to prompts, models, and guardrails) are governed. SOC 2 cares less about which model you use and more about whether you can show the control was in place and working.
PCI-DSS is prescriptive and unforgiving in a different way: it governs cardholder data (the primary account number — PAN — and related sensitive authentication data). The single most important PCI fact for a GenAI build is about scope: any system component that stores, processes, or transmits cardholder data is in scope and inherits the full weight of the standard. So the architectural goal is almost always to keep the model out of cardholder-data scope — tokenize, truncate, or redact the PAN before it can reach a prompt, a completion, or a log. A model that never sees a PAN is not a PCI system component for the cardholder-data flow.
There is a third layer fintechs forget at their peril: the sector-specific obligations beyond SOC 2 and PCI. Depending on what you do, that can include GLBA (financial privacy and the Safeguards Rule), state money-transmitter requirements, fair-lending law (ECOA / Reg B) if anything touches credit decisions, AML/KYC obligations, and — if you operate in the EU/UK — GDPR. The model-governance frame examiners increasingly apply to AI is descended from SR 11-7-style model-risk management. None of this is AWS-specific, but all of it shapes what your GenAI feature is allowed to do, which is covered in the use-case and governance sections below.
The practical synthesis: compliant GenAI on AWS = a model running on infrastructure that is in-scope for the attestations you need (Bedrock and the surrounding AWS services), wrapped in controls you architect and evidence (encryption, private networking, IAM, redaction, logging, governance), constrained to use cases that do not put the model in a position it cannot legally or operationally occupy. Get those three right and the audit is a paperwork exercise, not a re-architecture.
Keep cardholder data out of the GenAI workload. Tokenize or redact the PAN before the prompt is built, and ensure it can never land in a completion or a log. A model that never sees a PAN is not a PCI system component for that flow — which collapses your audit scope and removes the most dangerous failure mode at the same time.
A fintech does not pick a GenAI platform on capability alone — it picks the one that does not fight its compliance program. Two default properties of Amazon Bedrock are what make it the common fintech choice, and they map directly onto the questions an auditor and a CISO ask first.
Your data is not used to train the base models. On Amazon Bedrock, the prompts you send and the completions you receive are not used to train or improve the underlying foundation models, and they are not shared with the model providers. Your content stays in your AWS account. This is the single property that most often unblocks a fintech GenAI project, because the alternative — "our customers' financial data might end up in someone else's training set" — is a non-starter for a privacy or security reviewer. (Confirm current data-handling terms on the AWS Bedrock data-protection documentation; this is the default behavior, not an add-on.)
Your data stays in your account and region. Bedrock processes requests within the AWS Region you call, and the data does not leave your account boundary. That is the foundation of a data-residency story (covered in detail below) and of the "no third party sees our data" assurance. Combined with the no-training default, these two properties are why Bedrock, rather than a public consumer LLM API, is the default substrate for regulated workloads.
The services are in-scope for the attestations you need. AWS publishes its own SOC 1/2/3 reports and maintains PCI-DSS Level 1 service-provider status, with a documented list of services in scope (and HIPAA-eligible services for health-adjacent fintech). Amazon Bedrock and the surrounding building blocks (KMS, VPC, CloudTrail, S3, etc.) are designed to operate inside that posture. Crucially, this is inherited control, not delivered compliance: AWS attests to the security of the cloud; you are responsible for security in the cloud. Always confirm the current in-scope service list on the AWS compliance pages — scope changes over time.
The control primitives are first-class. The things a fintech audit demands — customer-managed encryption keys (KMS), private network paths (PrivateLink/VPC endpoints), granular access control (IAM), immutable audit logs (CloudTrail), and a policy/redaction layer (Bedrock Guardrails) — are native AWS services, not bolt-ons. That means the GenAI workload uses the same control plane as the rest of your AWS estate, so the evidence you already produce for SOC 2 extends to cover it rather than requiring a parallel program.
| What the reviewer asks | AWS answer | Default or you-configure? | Maps to |
|---|---|---|---|
| Is our data used to train the model? | No — prompts/outputs not used to train base models, not shared with providers | Default | SOC 2 confidentiality · GLBA |
| Where does the data physically go? | Stays in your account, processed in the Region you call | Default (you pick the Region) | Residency · GDPR · SOC 2 |
| Is the infrastructure attested? | AWS SOC 1/2/3 + PCI-DSS L1 service provider; Bedrock + core services in scope | Inherited (confirm scope) | SOC 2 · PCI-DSS |
| How is data encrypted? | KMS — customer-managed keys, encryption at rest + in transit | You configure (CMK) | PCI-DSS Req 3/4 · SOC 2 |
| Can it stay off the public internet? | PrivateLink / VPC interface endpoints to Bedrock | You configure | PCI-DSS Req 1 · SOC 2 |
| Who can call the model, and is it logged? | IAM least-privilege + CloudTrail audit trail | You configure | PCI-DSS Req 7/10 · SOC 2 |
| Can it leak PII / a PAN? | Bedrock Guardrails — PII detection + redaction, custom regex | You configure | PCI-DSS · GLBA · SOC 2 privacy |
This is the heart of a compliant GenAI build. Five controls do most of the work of turning "a model that calls Bedrock" into "a model an auditor signs off on." None of them are exotic; the discipline is applying all five, consistently, and being able to prove it.
The no-training default is a control even though you did not build it — and SOC 2 expects you to document the controls you rely on, including inherited ones. The action item is small but real: record, in your control narrative, that the workload runs on Amazon Bedrock specifically because prompts and completions are not used to train base models and are not shared with model providers, cite the AWS data-protection terms, and note the date you confirmed them. If you ever fine-tune or create a custom model, document that the fine-tuning data and the resulting model artifacts remain private to your account as well. The reviewer is not checking that you invented the control — they are checking that you know it exists and chose the platform deliberately.
PCI-DSS (Requirements 3 and 4) and SOC 2 both expect strong encryption of data at rest and in transit, and PCI in particular expects you to control the keys. Use AWS KMS customer-managed keys (CMKs) to encrypt everything in the GenAI data path: the S3 buckets holding source documents for a knowledge base, the vector store, any conversation transcripts or logs you retain, and any fine-tuning datasets and custom-model artifacts. Transit is TLS end to end. The CMK gives you the audit-friendly properties PCI wants — key rotation, access policies, and a CloudTrail record of every key use — and lets you cryptographically revoke access by disabling the key. Encrypting with a default AWS-managed key is better than nothing, but a fintech auditor will specifically look for customer-managed keys on anything touching sensitive data.
A Bedrock Guardrail is the control that most directly addresses the privacy and cardholder-data risk, and it works on both the input (the prompt) and the output (the completion). Configure sensitive-information detection to redact common PII (names, emails, phone numbers, addresses, national-ID numbers) so a customer pasting personal data into a support chat does not persist it, and add custom regex patterns for the identifiers your domain actually uses — primary account numbers, routing/account numbers, internal customer IDs. For cardholder data the safest posture is belt-and-suspenders: redact or tokenize the PAN before the prompt is constructed in your application code, and also run a Guardrail that blocks or masks anything that looks like a PAN, so a defense-in-depth layer catches what application logic misses. The Guardrail is also where you set denied topics (keep a support bot from giving regulated financial advice) and the prompt-attack filter (essential for anything customer-facing). See the amazon-bedrock-guardrails sibling for the full filter taxonomy.
A fintech generally does not want sensitive traffic traversing the public internet, and PCI Requirement 1 plus SOC 2 network-security criteria both reward keeping it private. VPC interface endpoints (powered by AWS PrivateLink) let your application reach Amazon Bedrock over the AWS private network without an internet gateway, NAT, or public IP — the request never leaves AWS's backbone. This both satisfies the "no public exposure" control and tightens your network diagram for the auditor. Pair it with security groups and endpoint policies that restrict which principals and which Bedrock actions are allowed, so the private path is also a least-privilege path.
Because Bedrock processes requests in the Region you call, residency is a deployment decision: pin the workload to the Region(s) your obligations require (e.g., an EU Region for GDPR data, a specific in-country Region where local rules demand it) and confirm the models you need are available there before committing. The nuance to manage is cross-region inference: Bedrock can route requests across Regions within a geography for capacity and resilience, which is excellent for availability but means you must understand the geographic boundary of that routing and confirm it stays within your allowed residency zone. If strict single-Region residency is required, configure accordingly and document it. See the amazon-bedrock-cross-region-inference sibling for the mechanics, and treat residency as something you assert, configure, and evidence — not assume.
Redact or tokenize the PAN in two places: in application code before the prompt is built, and in a Bedrock Guardrail that blocks/masks PAN-shaped strings on input and output. Then make sure your logging redacts too. Three independent layers means a single bug never turns into a cardholder-data exposure — and that is exactly the story a PCI assessor wants to hear.
The fintechs that ship GenAI successfully share a pattern: they apply it where it augments a human or automates a low-stakes task, and they keep it away from decisions that carry regulatory weight or that it cannot explain. Here are the four use cases that consistently clear compliance, and exactly where each one's boundary sits.
The highest-value, lowest-risk starting point: a support assistant or internal copilot that answers from your knowledge base — help-center articles, product docs, policies, internal runbooks. Built as retrieval-augmented generation (a Bedrock Knowledge Base, or Amazon Q Business for an internal employee assistant), it grounds answers in approved content and, with a Guardrail enforcing contextual grounding, refuses to make things up. The compliance boundary: keep account-specific actions (move money, change a limit) behind authenticated, audited APIs and human confirmation — the assistant explains and drafts, it does not silently execute privileged transactions. See the rag-on-aws and amazon-q-business siblings.
GenAI is genuinely useful around fraud and risk operations: summarizing a case file for an analyst, drafting a suspicious-activity narrative for human review, clustering and explaining alerts, turning a natural-language question into a query over transaction data. The hard line — and it is a line, not a preference — is that the generative model must not be the autonomous decision-maker that approves or declines a transaction, flags an account, or files a regulatory report without a human in the loop. Those decisions implicate fair-lending and model-risk obligations and demand explainability that a black-box LLM does not provide. Keep the deterministic fraud/risk scoring models as the system of record for decisions; use GenAI to make the humans operating them faster. The word "adjacent" is doing load-bearing work.
Fintech runs on documents: onboarding paperwork, KYC evidence, statements, contracts, dispute correspondence. GenAI (often paired with Amazon Textract for OCR, then Bedrock for understanding) extracts structured fields, classifies documents, summarizes long agreements, and flags missing items. This is high ROI and largely low risk — with two controls: run the extracted/intermediate data through PII redaction so sensitive fields are handled deliberately, and keep a human review step for anything that feeds a regulated decision (an extracted income figure that informs underwriting is not a place for unreviewed model output). Treat the model as a very fast first-pass analyst, not the final authority.
The lowest-regulatory-risk use case is internal: an AI coding assistant (Amazon Q Developer, or Claude on Bedrock wired into developer tooling) that helps your engineers ship faster. The fintech-specific care here is about your source code and secrets, not customer financial data — choose the enterprise tier with the data-handling terms your security team requires, ensure code and prompts are not used to train base models, and keep secrets out of prompts. It does not touch cardholder data or customer PII in production, so it sits well outside PCI scope while still delivering real velocity. See the amazon-q-developer and amazon-q-vs-github-copilot siblings.
| Use case | Typical AWS building blocks | Compliance risk | The hard boundary |
|---|---|---|---|
| Customer / internal support | Bedrock Knowledge Bases · Amazon Q Business · Guardrails | Low–medium | Explains + drafts; privileged actions stay behind authed, audited APIs + human confirm |
| Fraud / risk-adjacent | Bedrock (summarize/triage) over case + txn data | High if it decides | Never the autonomous approve/decline/file; human-in-the-loop, deterministic models decide |
| Document processing | Amazon Textract → Bedrock · Guardrails (PII) | Low–medium | Human review before any extracted value feeds a regulated decision |
| Code assistance | Amazon Q Developer · Claude on Bedrock | Low | Enterprise tier + no-training terms; no secrets in prompts; outside PCI scope |
SOC 2 and PCI-DSS are, in practice, evidence exercises. A control that worked but cannot be demonstrated is, to an auditor, a control that did not exist. The good news for a Bedrock workload is that the evidence is produced by the same AWS services you already use — you just have to turn it on, retain it, and protect it.
Start with CloudTrail: it records the management and (where enabled) data-plane API calls in your account, so you have an immutable record of who invoked which Bedrock action, when, and from where. That is the backbone of PCI Requirement 10 (track and monitor access) and the SOC 2 monitoring criteria. Enable Bedrock model-invocation logging to capture request/response metadata (and, if you choose, content) to S3 or CloudWatch — but here a fintech must be deliberate: if you log prompt/response content, that log is now a data store that can contain PII or, if you are not careful, cardholder data, so it must be encrypted with a CMK, access-restricted, retention-bounded, and run through the same redaction logic as everything else. Many fintechs deliberately log metadata-plus-redacted-content rather than raw content for exactly this reason.
Layer on the standard evidence sources: CloudWatch for metrics, alarms, and operational monitoring; AWS Config to record resource configuration state and detect drift (e.g., a Bedrock endpoint that lost its CMK or a bucket that became public); and GuardDuty / Security Hub for threat detection and a consolidated control posture. Together these answer the auditor's three recurring questions — was the control configured, did it stay configured, and were exceptions detected and handled — with timestamped, tamper-evident records rather than screenshots.
Finally, version and log the AI-specific artifacts. Your Guardrail should be versioned (publish numbered, immutable versions and reference a specific one at inference time) so you can show exactly which policy was in force on any date. Your prompts and prompt templates belong in source control with change review, and any model or model-version change should go through the same change-management process as a code deploy. This is what turns "we have a guardrail" into "here is the version history proving which safety policy governed every interaction in the audit period" — the difference between a finding and a clean control.
Turning on full prompt/response content logging without redaction quietly creates a new sensitive-data store — sometimes one that pulls the logging system into PCI scope. Log metadata plus redacted content, encrypt logs with a CMK, bound retention, restrict access, and run logs through the same redaction as the live path. Capture enough to evidence the control, not enough to create a new liability.
Financial regulators have decades of expectation around model risk (the SR 11-7 lineage), and examiners increasingly apply that lens to AI/ML, including generative systems. You do not need a quant-team apparatus to satisfy it, but you do need to treat the GenAI feature as a governed model, not a magic box someone wired in.
The core of model-risk management is simple to state: know your models, validate them, monitor them, and assign ownership. For a GenAI feature that translates to a short, real set of artifacts. Maintain a model inventory entry: which foundation model and version, where it runs (Bedrock, which Region), what it is used for, who owns it, and what data it touches. Write down the intended use and the limits — what the system is allowed to do and, explicitly, what it must not (the boundaries from the use-case section). Define evaluation: how you tested quality and safety before launch (your adversarial Guardrail test set, accuracy on a representative task set, grounding/hallucination checks) and how you re-test when the model or prompt changes.
Then monitor in production: track quality and safety signals, watch for drift (a model-version update can change behavior), keep a human escalation path for low-confidence or high-impact cases, and log enough to investigate an incident. Assign a named owner accountable for the model's behavior and a change-management process for model/prompt/guardrail updates. None of this is heavyweight for a single support assistant — it is a one-page model card, a test set, a dashboard, and an owner — but it is exactly what an examiner or a SOC 2 auditor looking at AI governance expects to find, and it is far cheaper to set up at the start than to reconstruct under audit.
The explainability point deserves emphasis because it determines what GenAI is allowed to do. Generative models are not inherently explainable in the way a regulated credit model must be (you cannot always produce the precise reason for a given output). That is fine for assistive use cases and disqualifying for autonomous regulated decisions — which is the governance-level reason the fraud/credit boundary in the use-case section is a hard line, not a stylistic choice. Govern the model honestly about what it can and cannot justify, and your use cases will naturally land on the right side of fair-lending and model-risk expectations.
Most failed or stalled fintech GenAI projects fail for a small set of repeated reasons. Knowing the anti-patterns up front is cheaper than discovering them in a pre-audit. Here are the ones that reliably cause trouble.
Pulling it together: here is a concrete, defensible reference architecture for a customer-facing or internal fintech GenAI feature (a support/RAG assistant is the canonical example) that satisfies SOC 2 and keeps the model out of PCI cardholder-data scope. Adapt the specifics, but the control shape generalizes.
The request path: a user interacts with your application; the application authenticates and authorizes the request (your existing identity layer), then — before constructing any prompt — runs the input through a tokenization/redaction step that strips PANs and other regulated identifiers. The application reaches Amazon Bedrock over a VPC interface endpoint (PrivateLink), never the public internet, scoped by IAM and an endpoint policy to exactly the Bedrock actions and models it needs. The call passes a versioned Bedrock Guardrail (PII redaction + custom regex for PANs/account numbers, denied topics for regulated advice, prompt-attack filter, and contextual grounding for RAG answers) that screens both input and output. For a knowledge assistant, retrieval comes from a Bedrock Knowledge Base over source content in an S3 bucket and a vector store, all encrypted with KMS customer-managed keys.
The control plane around it: KMS CMKs encrypt every data store in the path (source bucket, vector store, any retained transcripts); CloudTrail records every API call; Bedrock model-invocation logging captures metadata-plus-redacted-content to a CMK-encrypted, access-restricted, retention-bounded S3 location; CloudWatch alarms on anomalies; AWS Config detects drift (a store that lost encryption, an endpoint that went public); and GuardDuty/Security Hub provide threat detection and a consolidated posture. The whole stack is pinned to a deliberately chosen Region for residency, with cross-region inference either disabled or constrained to an approved geography. Privileged actions (move money, change limits) are not performed by the model — they remain behind your authenticated, audited transaction APIs with human confirmation.
The governance wrap: a model-inventory entry and one-page model card document the feature; the Guardrail and prompts are versioned in source control with change review; an adversarial test set gates launch and re-runs on every model/prompt change; a named owner is accountable; and the redaction, encryption, logging, and access controls are mapped to the specific SOC 2 criteria and PCI requirements they satisfy, so the evidence is pre-assembled for the auditor. This architecture is deliberately boring — that is the point. It reuses the controls a fintech already operates, adds the two GenAI-specific ones (Guardrail + model governance), and keeps regulated data and regulated decisions away from the model.
Redact before the prompt → reach Bedrock over PrivateLink under least-privilege IAM → screen with a versioned Guardrail (PII/PAN redaction + grounding) → retrieve from a KMS-encrypted Knowledge Base → log metadata-plus-redacted-content to a CMK-encrypted, retention-bounded store → monitor with CloudTrail/Config/CloudWatch/GuardDuty → pin the Region → keep privileged actions and regulated decisions off the model → govern it as an inventoried, owned, tested model. Boring, defensible, $0 to build with credits.
A scannable map from each AWS control in the reference architecture to the obligations it satisfies and what it actually does for a fintech GenAI workload. This is the spine of the evidence package an auditor will ask for — assemble it as you build, not after.
| Control | AWS service | SOC 2 (TSC) | PCI-DSS | What it does for the GenAI workload |
|---|---|---|---|---|
| No base-model training | Amazon Bedrock (default) | Confidentiality, Privacy | Supports Req 3 intent | Customer data never enters a model provider's training set |
| Encryption + key control | AWS KMS (CMK) | Security, Confidentiality | Req 3 (at rest), Req 4 (in transit) | All GenAI data stores + transit encrypted with keys you control + rotate |
| PII / PAN redaction | Bedrock Guardrails | Privacy, Processing integrity | Req 3 (keeps PAN out of scope) | Strips sensitive data from prompts, completions; blocks PAN-shaped strings |
| Private networking | PrivateLink / VPC endpoints | Security (network) | Req 1 (no public exposure) | Bedrock traffic stays on AWS backbone, never the public internet |
| Least-privilege access | IAM + endpoint policies | Security (logical access) | Req 7 (need-to-know access) | Only authorized principals call only the Bedrock actions/models needed |
| Immutable audit trail | CloudTrail + invocation logging | Security (monitoring) | Req 10 (track + monitor) | Who invoked what, when — tamper-evident, retention-bounded, CMK-encrypted |
| Config + drift + threat | AWS Config · GuardDuty · Security Hub | Security, Availability | Req 10/11 (monitor + test) | Proves controls stayed configured; detects exceptions + threats |
| Model governance | Process (inventory, versioned Guardrail/prompts) | Processing integrity, Security | Supports AI/model-risk expectations | Inventory, evaluation, owner, change control — the model-risk story |
Situation: The team wanted a customer-facing support assistant that answered from their help center and account-policy content, but they were already PCI-scoped (they handle card payments) and mid-way through a SOC 2 Type II window, so anything new had to clear both bars. Their specific fears: a customer pasting a card number or personal data into the chat and it landing in a prompt or a log; the assistant drifting into giving regulated financial advice; and an auditor asking "where does this data go and who governed this model?" with no good answer. They also did not want to spend Series-A runway on inference while still proving the unit economics.
What CloudRoute did: CloudRoute matched them in under 24 hours to an AWS partner with payments-fintech and Bedrock experience. The partner built the assistant on a Bedrock Knowledge Base over the approved content, reached Bedrock over a PrivateLink VPC endpoint under least-privilege IAM, and put a single versioned Guardrail in front of it: PII detection set to redact plus custom regex that blocks PAN- and account-number-shaped strings on both input and output, denied topics covering "investment advice" and "regulated financial advice," the prompt-attack filter on, and contextual grounding so unsupported answers were blocked. Application code tokenized any card-shaped input before prompt construction (defense in depth with the Guardrail), all stores used KMS customer-managed keys, model-invocation logging captured metadata-plus-redacted-content to a CMK-encrypted retention-bounded bucket, and CloudTrail/Config/GuardDuty covered monitoring. The partner pinned the workload to an EU Region for residency, wrote the one-page model card and inventory entry, mapped every control to the relevant SOC 2 criteria and PCI requirements, and filed a Bedrock POC credit application plus an Activate application to fund the build.
Outcome: Cardholder data never reached a prompt, a completion, or a log; the assistant answered only from grounded content and refused regulated-advice questions; and the team walked into their SOC 2 evidence review with a versioned policy artifact, a model-governance package, and a control-to-criteria map already assembled. Because the model never saw a PAN, it stayed out of cardholder-data scope, which kept the PCI conversation contained. Inference, the knowledge base, and Guardrails were fully covered by the approved AWS credits, so the build ran at $0 out of pocket. CloudRoute's commission was paid by the partner from AWS engagement funding, not by the customer.
guardrail: PII + PAN-block + denied-topics + grounding · PrivateLink + CMK + EU Region · model card + control map · credits: POC + Activate · out-of-pocket: $0
Whatever your SOC 2 / PCI-conscious GenAI workload would cost on Bedrock, AWS credits can cover it. CloudRoute routes you to the right credit pool (Activate up to $100K, Bedrock POC $10K–$50K, GenAI Accelerator up to $1M) and a vetted AWS partner who has built generative AI inside a fintech audit before — Guardrails with PII/PAN redaction, KMS encryption, PrivateLink, region-pinned residency, the logging and model-governance package. Customer pays $0.