ai translation on aws · the 2026 how-to

How to do AI translation on AWS (2026).

AWS gives you two translation engines, and most of the cost and quality outcome rides on picking the right one for each job. This is the full how-to: Amazon Translate (purpose-built neural machine translation — fast, cheap, 75+ languages) versus an LLM on Amazon Bedrock (Claude, Nova, Llama — context-aware, tone-controlled, document-aware), exactly when each one wins, the hybrid pattern that uses both, how to lock quality with custom terminology and glossaries and human review, how to translate millions of strings in batch for a fraction of the cost, the reference architecture end to end, and how to localize a real multilingual app.

translation engines
2
Amazon Translate languages
75+
batch discount
~50%
credits to fund it
up to $100K
TL;DR
  • AWS has two ways to translate. Amazon Translate is purpose-built neural machine translation: fast, cheap (~$15 per million characters), 75+ languages, real-time or async batch, with custom terminology and parallel-data customization. An LLM on Amazon Bedrock (Claude, Amazon Nova, Llama, Mistral) translates with context, tone, formality and glossary instructions in the prompt, can reason over a whole document, and handle markup — but costs more and is slower per unit.
  • The decision is per job, not per company. Use Amazon Translate for high-volume, latency-sensitive, cost-sensitive, plain-text translation (chat, tickets, UGC, bulk content, real-time). Use a Bedrock LLM when context, brand tone, formality, idiom, domain glossary adherence, or document-aware formatting decide the outcome (marketing copy, legal/medical nuance, structured documents, low-resource style control). Many production systems are a hybrid: Translate for the bulk, an LLM to refine or to handle the hard subset.
  • Quality is engineered, not assumed: enforce a glossary (Amazon Translate custom terminology, or a term list injected into the LLM prompt), customize on your own parallel data where you have it, batch the bulk for roughly half the cost, and put a human-in-the-loop review on the high-stakes slice. Translation and inference bills scale with volume fast; CloudRoute routes you to AWS credits (Activate Portfolio up to $100K, Bedrock/GenAI POC $10K–$50K, GenAI Accelerator up to $1M) and vetted ML partners who build the pipeline — you pay $0.
the core choice

ITwo engines for translation on AWS — and why the choice matters

Translation on AWS is not one service. There are two fundamentally different engines, and almost every cost overrun or quality complaint traces back to using the wrong one for the job at hand.

Amazon Translate is a purpose-built neural machine translation (NMT) service. It does exactly one thing — convert text from a source language to a target language — and it does it fast, cheaply, and at scale across 75+ languages and thousands of language pairs. You call a TranslateText API for real-time work, or submit an asynchronous batch job over a folder of documents in Amazon S3. It supports custom terminology (force specific terms to translate a fixed way), parallel-data customization (Active Custom Translation, which adapts output to your own example translations), automatic source-language detection, and formality and profanity-masking controls on supported pairs.

An LLM on Amazon Bedrock — Claude, Amazon Nova, Llama, Mistral and others — is a general foundation model that translates as one of many things it can do. You do not call a translate endpoint; you write a prompt: "Translate the following from English to German. Maintain a formal tone, keep the brand name unchanged, use this glossary, and preserve the Markdown formatting." Because the model reasons over the whole input, it can handle context across sentences, choose register and tone, follow a domain glossary expressed in natural language, resolve ambiguity from surrounding text, translate inside structured formats (HTML, Markdown, JSON, XLIFF) without breaking the markup, and explain or flag uncertain choices.

The two engines sit at different points on the same trade-off curve. Amazon Translate optimizes for throughput, latency and cost — it is the right tool when you have a lot of text, you need it now or cheaply, and the content is relatively plain. A Bedrock LLM optimizes for nuance and control — it is the right tool when the quality of a smaller, higher-stakes set of translations depends on context, tone, terminology adherence, or document structure that a sentence-level NMT engine does not see.

This is not Translate-versus-Bedrock as a company-wide religious choice. The mature pattern is to route each job to the engine that fits it, and often to combine them. The next section is the decision rule; the section after that is the hybrid pattern that uses both.

the one-sentence version

Use Amazon Translate when volume, speed and cost dominate and the text is plain; use a Bedrock LLM when context, tone, formality, glossary nuance or document structure decide the outcome; use a hybrid when you have both at once — Translate for the bulk, an LLM for the hard slice.

the decision rule

IIWhen to use Amazon Translate vs a Bedrock LLM

The choice is per job, driven by the content and the stakes — not by the company. Five questions settle almost every case: how much text, how fast, how cheap, how nuanced, and how structured.

Amazon Translate wins on the first three axes (volume, latency, cost); a Bedrock LLM wins on the last two (nuance, structure). Map your job against the axes and the engine usually picks itself.

Choose Amazon Translate when…

Volume is high and cost matters. Translating millions of support tickets, product reviews, user-generated content, or a large content catalogue is where Translate's per-character pricing and batch mode crush LLM token costs — often by an order of magnitude or more.

Latency matters. Real-time use cases — live chat translation, in-app translate buttons, instant ticket routing — need a fast, low-latency call. Amazon Translate returns in milliseconds; a frontier LLM is slower per request.

The text is relatively plain and self-contained. Sentence-level content where each segment can be translated without deep cross-document context (chat messages, short reviews, UI strings with adequate context, notifications) is squarely Translate's sweet spot — especially with a custom-terminology glossary applied.

You want a managed, deterministic service. Translate is a single-purpose API with predictable behaviour, simple scaling, and customization via parallel data — no prompt engineering, no model selection.

Choose a Bedrock LLM when…

Context across the document changes the right answer. Pronoun resolution, ambiguous terms, callbacks to earlier sentences, and consistency of a chosen translation across a long document are things an LLM sees and a sentence-by-sentence NMT engine can miss.

Tone, formality and brand voice matter. Marketing copy, brand messaging, and customer-facing content where register (formal vs informal "you", e.g. Sie/du, vous/tu), idiom, and house style are part of the deliverable — the LLM takes a natural-language style brief and a glossary in the same prompt.

The content is structured or mixed. Translating inside Markdown, HTML, JSON, XLIFF or code-adjacent text while leaving tags, placeholders ({{name}}, %s), URLs and code untouched is something an instructed LLM does well and a naive pipeline mangles.

Domain nuance and glossary fidelity are critical. Legal, medical, and financial text where a mistranslated term has real consequences benefits from an LLM that can be told the domain, handed a glossary, and asked to flag low-confidence passages for human review.

The language is low-resource or the task is more than translation. When you also need transcreation (adapt, don't literally translate), summarise-then-translate, or translate-plus-explain, the LLM does it in one pass.

the practical heuristic

Per job, ask: would a fluent bilingual clerk handle this fine (high-volume, plain, fast) → Amazon Translate; or does it need a bilingual copywriter / domain expert who reads the whole thing and cares about tone and terms (nuanced, structured, high-stakes) → Bedrock LLM. When a corpus contains both, split it and route each part.

using both

IIIThe hybrid pattern — Amazon Translate plus a Bedrock LLM

The highest-value production systems rarely pick one engine for everything. They use Amazon Translate for the economical bulk and a Bedrock LLM for the slice where nuance pays — getting most of the cost advantage of NMT and most of the quality advantage of an LLM.

There are three established ways to combine the engines. They are not mutually exclusive; a large localization system often uses all three on different content types.

Pattern 1 — Translate first, LLM to refine

Run the whole corpus through Amazon Translate for a fast, cheap first pass, then send the output (plus the source) to a Bedrock LLM with an instruction to polish: fix tone, enforce the glossary, smooth idiom, and adapt register. This is sometimes called MT post-editing done by a model instead of a human. You pay full Translate cost on everything but only LLM cost on a refinement pass — and because the LLM is editing rather than translating from scratch, you can often use a smaller, cheaper model and shorter outputs.

A variant routes only the segments most likely to need help to the LLM: marketing strings, long passages, or anything where a quality classifier or simple heuristic (length, presence of idiom, low Translate confidence) flags risk. The plain majority ships straight from Translate untouched.

Pattern 2 — Route by content type up front

Classify each piece of content before translating and send it to the engine that fits: support tickets, UGC, logs and chat → Amazon Translate; marketing pages, legal clauses, product names and brand copy → Bedrock LLM. The routing key can be as simple as the content's source system or a metadata tag, or as rich as an LLM/classifier deciding per item. This keeps the expensive engine off the 90% of volume that does not need it.

Pattern 3 — Translate for draft, LLM as quality judge

Use Amazon Translate to produce the translation and a Bedrock LLM as an automated reviewer: ask the model to score the translation for accuracy, fluency, terminology adherence and tone, and to flag or correct only where it falls short. The model becomes an LLM-as-a-judge over MT output, cheaply triaging which translations are good enough to ship and which need correction or human review — far cheaper than human-reviewing everything.

why hybrid usually wins

A pure-LLM pipeline over millions of strings is expensive and slow; a pure-Translate pipeline misses tone and nuance on the content where it matters most. The hybrid — Translate for the bulk, an LLM for the hard slice (refine, route, or judge) — captures ~90% of the cost advantage and ~90% of the quality advantage at once. Most serious localization on AWS lands here.

engineering quality

IVControlling quality — glossaries, custom terminology, and human review

Translation quality is not something you hope for; it is something you engineer. Four levers do most of the work: a glossary the engine must obey, customization on your own data, a confidence-aware human-in-the-loop, and an evaluation set you score on every change.

Glossaries and custom terminology

Every brand has terms that must translate a fixed way (or not at all): product names, feature names, legal terms, units, and house-style choices. On Amazon Translate this is custom terminology — you upload a CSV/TMX term list and Translate forces those mappings on every job, in real-time and batch. On a Bedrock LLM the same glossary goes into the prompt ("Use exactly these translations for these terms; never translate the brand name"), and because the model reads context it can apply terms more flexibly (right inflection, right case) than a literal find-replace. A shared, version-controlled glossary feeding both engines is what keeps a hybrid pipeline consistent.

Customizing on your own data

If you have past human translations, use them. Amazon Translate's Active Custom Translation (parallel-data customization) adapts output toward your example translations at request time without training a bespoke model — point it at parallel data in S3 and translations shift toward your style and terminology. With a Bedrock LLM, the analogue is few-shot prompting (include several of your best source→target examples in the prompt) or light fine-tuning for a consistent house style at scale. Either way, your existing translation memory is an asset — feed it in rather than starting cold.

Human-in-the-loop where it counts

No automated translation is perfect, and the right move is not to human-review everything — it is to review the slice where errors are expensive. Tag content by risk (a legal disclaimer is high-stakes; a forum post is not), auto-ship the low-risk majority, and route the high-risk minority to human reviewers. Amazon Augmented AI (A2I) provides a managed human-review workflow you can wire into the pipeline; a confidence signal (an LLM judge's score, Translate output heuristics, or back-translation agreement) decides what gets escalated. This is how you get near-human quality on the content that matters without paying for human review on everything.

Evaluation — measure, don't vibe

Build a fixed evaluation set: a few hundred representative source segments with reference translations (and ideally human ratings). Score every pipeline change so you know whether a new glossary, a model swap, or a chunking tweak actually helped. Use both automatic metrics (BLEU, chrF, COMET for reference-based scoring) and an LLM-as-a-judge on Bedrock for accuracy/fluency/terminology, and keep a small human spot-check because automatic metrics miss domain-specific errors. The discipline mirrors any ML system: a golden set, automated scoring, and a number that moves when you turn a knob.

the quality stack

In order of leverage: (1) a shared glossary both engines obey · (2) customization on your own parallel data / few-shot examples · (3) a confidence signal that routes the risky slice to human review (A2I) · (4) a fixed evaluation set scored on every change. Most "the translations are wrong" complaints are a missing glossary or no review routing — not the engine.

volume + the bill

VBulk translation, batch mode, and what it costs

Translation bills scale directly with volume, so two things decide the number: which engine you run and whether you run the bulk in batch. Get both right and large-scale localization is surprisingly affordable; get them wrong and an LLM pass over millions of strings is eye-watering.

For bulk work, asynchronous batch is the default, not an optimization. Amazon Translate offers an async batch API that translates a whole folder of documents in S3 in one job — ideal for back-catalogues, document sets, and nightly content syncs. Amazon Bedrock offers batch inference (submit a large job, get results back asynchronously) at roughly 50% of on-demand token price — the right way to run any LLM translation that does not need a real-time answer. Reserve real-time/on-demand calls for genuinely interactive translation.

The figures below are representative as of 2026 to show the shape of the bill, not a quote — always check the AWS pricing page (and per-model Bedrock pricing) for current rates. The headline: Amazon Translate is priced per character and is dramatically cheaper per unit of text; a Bedrock LLM is priced per token and buys you nuance at a higher unit cost. Batch and prompt caching are the main levers on the LLM side.

Levers that cut the bill

Route by need: send the plain majority to Amazon Translate and reserve the LLM for the slice that needs it — the single biggest saving. Batch everything that can wait: Translate async batch and Bedrock batch inference (~50% off). Cache: Bedrock prompt caching means the static system prompt and glossary are not re-billed on every call. Right-size the model: a smaller Nova/Claude tier handles refinement and judging cheaply; reserve a frontier model for genuinely hard passages. Don't re-translate: cache results and maintain a translation memory so unchanged strings are never paid for twice.

translation cost shape on aws · representative as of 2026 — check the AWS pricing page for current rates
Engine / modePriced byRelative unit costLatencyBest for
Amazon Translate — real-timePer character (~$15 / million chars)LowestMillisecondsLive chat, in-app translate, ticket routing
Amazon Translate — async batchPer character (same rate, bulk job)LowestMinutes–hours (job)Back-catalogues, document sets, bulk content
Amazon Translate — custom (ACT)Per character (higher tier) + parallel dataLow–moderateMillisecondsOn-brand bulk translation with your own examples
Bedrock LLM — on-demandPer input + output token, per modelHigherSecondsInteractive, nuanced, structured, high-stakes
Bedrock LLM — batch inferencePer token, ~50% of on-demandModerateAsync (job)Bulk nuanced translation that can wait
Rough rule of thumb: Amazon Translate is the order-of-magnitude-cheaper engine for plain bulk text; a Bedrock LLM costs more but buys context, tone and structure. For LLM translation, batch inference (~50% off) and prompt caching (stop re-paying for a static system prompt/glossary) are the biggest cost levers. The hybrid keeps the LLM off the bulk, which is where most of the savings come from.
end to end

VIA reference translation architecture on AWS

A production translation pipeline is a small set of managed services wired together. The same skeleton serves both real-time and bulk paths; you choose the engine per route and add the quality controls from section IV.

It helps to separate the real-time path (a user clicks "translate", a chat message arrives) from the bulk path (a folder of documents, a nightly content sync). They share glossary, evaluation, and storage; they differ in how work is triggered and which engine mode they call.

Real-time path

1. Request in. A client calls an Amazon API Gateway endpoint backed by AWS Lambda (or your existing service). The request carries the text, source/target languages, and content type.

2. Route. Lambda decides the engine from the content type and stakes: plain/short → Amazon Translate TranslateText (with custom terminology applied); nuanced/structured/high-stakes → a Bedrock LLM with the glossary and style brief in the prompt.

3. Glossary + post-process. Apply the shared glossary (custom terminology on Translate, or in-prompt on Bedrock), restore any placeholders/markup, and optionally run an LLM judge for a confidence score.

4. Escalate or return. Low-confidence high-stakes items go to an Amazon A2I human-review queue; everything else returns to the caller. Cache the result (e.g. DynamoDB / ElastiCache keyed on source+languages) so repeats are free.

Bulk path

1. Land in S3. Source documents/strings arrive in an S3 bucket (a content export, a CMS sync, an upload). Parse non-text formats (PDF/Word/HTML) to clean text first — Amazon Textract for scanned PDFs and tables.

2. Trigger a batch job. An S3 event (or a schedule via EventBridge) kicks off an Amazon Translate async batch job over the folder, or a Bedrock batch inference job for the nuanced subset, orchestrated with AWS Step Functions.

3. Glossary, review, store. Apply the glossary, route flagged segments to A2I, write translated output back to S3 (and into your translation memory / CMS). Score a sample against the golden set.

4. Publish. Push approved translations to the destination — a localized CMS, an app's string catalogue, or a data store — with each translation versioned and traceable to its source.

translation pipeline components on aws · representative as of 2026
ConcernReal-time pathBulk path
EntryAPI Gateway + LambdaS3 drop + EventBridge / Step Functions
Plain textAmazon Translate (TranslateText)Amazon Translate (async batch)
Nuanced / structuredBedrock LLM (on-demand)Bedrock LLM (batch inference, ~50% off)
GlossaryCustom terminology / in-promptCustom terminology / in-prompt
Document parsingAmazon Textract (scanned/structured)
Human reviewA2I on flagged itemsA2I on flagged segments
Storage / cacheDynamoDB / ElastiCache + S3S3 + translation memory / CMS
Both paths share one glossary, one evaluation set, and one translation memory. The only real branch is the trigger (synchronous request vs S3/batch) and the engine mode (real-time vs async). Start with one path, add the other when you need it.
the common use case

VIILocalizing a multilingual app — the end-to-end flow

The most common reason teams build translation on AWS is to localize a product into many languages without a manual translation agency for every release. Here is the practical flow, and where each engine fits.

App localization is mostly about strings with structure — UI labels, marketing pages, emails, and docs — full of placeholders, plurals, and formatting that must survive translation. That is why localization is a textbook hybrid: Amazon Translate for the high-volume, low-risk strings, and a Bedrock LLM for the brand-facing, structured, or nuance-heavy ones.

  • 1 — Externalize strings — Move all user-facing text out of code into resource files (JSON, XLIFF, .po, or a string catalogue) keyed by ID, with placeholders (`{{name}}`, `%s`) and plural rules intact. This is standard i18n hygiene and the precondition for automating anything.
  • 2 — Classify by risk and structure — Tag each string: plain UI labels and notifications → Amazon Translate; marketing headlines, legal text, onboarding copy, and anything with markup or strong brand voice → Bedrock LLM. The tag drives routing in the pipeline.
  • 3 — Translate with placeholders protected — Send strings through the routed engine with the glossary applied and an explicit instruction (LLM) or pre/post-processing (Translate) to leave placeholders, tags, and URLs untouched. Mangled `{{variables}}` are the classic localization bug — guard against them.
  • 4 — Handle plurals, length, and layout — Languages differ in plural forms and text length (German runs long, CJK runs short). Generate all plural variants, and let an LLM flag strings that will likely overflow a button or break a layout so design can adjust. This is where an LLM's awareness beats raw NMT.
  • 5 — Review the brand-facing slice — Auto-ship the low-risk majority; route marketing and legal strings to human reviewers via A2I or your localization team. Feed approved translations back into translation memory so the next release reuses them for free.
  • 6 — Continuous localization — Wire the pipeline into CI: when a new or changed string lands, it is auto-translated, flagged if risky, and merged — so every release ships fully localized without a manual round-trip. Only changed strings are translated; everything else comes from memory.
the localization shortcut

Don't translate your whole app with a frontier LLM — it is slow and expensive and most strings don't need it. Externalize strings, route the bulk through Amazon Translate with a glossary, send only the brand-facing and structured strings to a Bedrock LLM, protect placeholders, and keep a translation memory so each release only pays for what changed.

what goes wrong

VIIICommon pitfalls (and the fix for each)

Most translation projects on AWS fail in the same handful of ways. None is exotic; each has a concrete fix that the reference architecture above already accounts for.

  • Using an LLM for everything — Routing millions of plain strings through a frontier model is slow and expensive. Fix: route the bulk to Amazon Translate; reserve the LLM for the slice that needs context, tone, or structure.
  • Using only Amazon Translate for brand and legal copy — Sentence-level NMT misses tone, register, and cross-document consistency. Fix: send brand-facing, structured, and high-stakes content to a Bedrock LLM with a glossary and style brief.
  • No glossary — Product names get translated, terms drift, and output is inconsistent across runs. Fix: one shared, version-controlled glossary — custom terminology on Translate, in-prompt on Bedrock.
  • Broken placeholders and markup — Naive translation mangles `{{name}}`, `%s`, HTML tags, and URLs. Fix: protect placeholders (pre/post-process for Translate; explicit instruction for the LLM) and test on real strings.
  • Real-time calls for bulk work — Translating a back-catalogue with synchronous calls is slow and forfeits the batch discount. Fix: Amazon Translate async batch and Bedrock batch inference (~50% off) for anything that can wait.
  • Reviewing everything — or nothing — Human-reviewing all output is unaffordable; reviewing none ships errors in high-stakes content. Fix: a confidence signal routes only the risky slice to A2I/human review.
  • No evaluation set — "It looks fine" is not a quality bar and regressions go unnoticed. Fix: a fixed golden set scored (BLEU/chrF/COMET + an LLM judge) on every change.
  • Re-translating unchanged content — Paying to translate the same strings every release. Fix: a translation memory / cache so only new or changed text is translated.
the central decision, side by side

Amazon Translate vs a Bedrock LLM — which to use when

This is the comparison that decides each translation job. Read it as "default to Amazon Translate for plain bulk; reach for a Bedrock LLM when a row in the right column is what the job hinges on; combine them when both are true."

DimensionAmazon Translate (NMT)Bedrock LLM (Claude / Nova / Llama)
What it isPurpose-built neural machine translation APIGeneral foundation model, prompted to translate
Languages75+ languages, thousands of pairsBroad; varies by model (often strongest on high-resource)
CostPer character (~$15 / M chars) — lowestPer token, per model — higher (batch ~50% off)
LatencyMilliseconds — real-time friendlySeconds — slower per request
Context awarenessLargely sentence/segment levelWhole-document context, ambiguity resolution
Tone / formality / brand voiceLimited (some formality controls)Strong — natural-language style brief in the prompt
Glossary / terminologyCustom terminology (CSV/TMX)Glossary in-prompt, applied with context
Customization on your dataActive Custom Translation (parallel data)Few-shot examples or fine-tuning
Structured content (HTML/MD/JSON)Needs pre/post-processingHandles markup + placeholders when instructed
Beyond translation (transcreate, summarize+translate)NoYes, in one pass
Best forHigh-volume, fast, cheap, plain textNuanced, structured, high-stakes, brand-facing
Both run on AWS and can share one glossary and translation memory. The mature pattern is a hybrid: Amazon Translate for the bulk, a Bedrock LLM for the slice where context, tone, structure or stakes decide the outcome — often with the LLM refining or judging Translate's output.
building this for real?
Have a vetted AWS partner build your translation pipeline — and let AWS credits pay for it
Start in 3 minutes →
a recent match

A localized product + support stack — anonymized

inquiry · seed-stage b2c marketplace, localizing into 12 languages, EU
Seed-stage B2C marketplace, 16 people, ~40k UI strings + a marketing site + a high-volume multilingual support inbox, expanding across the EU with a data-residency requirement

Situation: Needed to launch the product, marketing site, and support flow in 12 languages on a tight timeline. A manual translation agency quote was far over budget and too slow for continuous releases, while a first in-house attempt that pushed everything through a single LLM was expensive, slow on the support volume, and kept mangling placeholders in the UI strings. Brand and legal copy also read flat and off-tone. The one engineer who could build a real pipeline was committed to the core product, and the projected Bedrock + Translate bill made the founder hesitate to start.

What CloudRoute did: Routed within 24 hours to an EU-region AWS partner with a GenAI/ML and localization track record. The partner built a hybrid pipeline in eu-central-1: Amazon Translate (async batch, with a shared custom-terminology glossary) for the 40k UI strings and the high-volume support inbox; a Bedrock LLM (Claude, batch inference, placeholder-protected, with a style brief and glossary in-prompt) for the marketing site and legal copy; Amazon A2I human review on the brand and legal slice; an LLM-as-a-judge confidence score routing what got escalated; a translation memory so each release only paid for changed strings; and a 300-segment golden set scored with COMET plus an LLM judge. The whole engagement was funded by AWS credits the partner filed for — Activate Portfolio plus a Bedrock POC allocation.

Outcome: All 12 languages live in under 6 weeks, with continuous localization wired into CI so new strings auto-translate on every release. Support-inbox translation ran in real time at a fraction of the all-LLM cost; brand and legal copy cleared the team's tone bar after human review; placeholder breakage went to zero. The build and the first months of translation/inference ran on AWS credits — the customer paid $0. CloudRoute's commission was paid by the partner from AWS engagement funding.

engagement window: ~6 weeks · founder time: ~7 hours · stack: Amazon Translate (batch + custom terminology) + Bedrock LLM (Claude, batch) + A2I + translation memory · cost to customer: $0

faq

Common questions

Should I use Amazon Translate or a Bedrock LLM for translation on AWS?
It depends on the job, not your company. Use Amazon Translate when volume, latency and cost dominate and the text is relatively plain — support tickets, user-generated content, chat, bulk catalogues, real-time translate buttons. It is purpose-built neural machine translation across 75+ languages, priced per character (~$15 per million), and far cheaper per unit than an LLM. Use a Bedrock LLM (Claude, Amazon Nova, Llama, Mistral) when context, tone, formality, brand voice, domain glossary nuance, or document structure (HTML/Markdown/JSON with placeholders) decide the outcome — marketing copy, legal/medical text, structured documents. Many production systems are a hybrid: Translate for the bulk, an LLM for the hard slice.
How much does AI translation on AWS cost?
Amazon Translate is priced per character — roughly $15 per million characters as a representative 2026 figure (custom/Active Custom Translation is a higher tier), the same rate for real-time or async batch. A Bedrock LLM is priced per input + output token, per model, which is materially higher per unit of text but buys context, tone and structure; batch inference is roughly 50% of on-demand price. The biggest cost levers are routing the plain bulk to Translate, batching anything that can wait, using Bedrock prompt caching for a static system prompt/glossary, right-sizing the model, and caching results in a translation memory so unchanged text is never paid for twice. Figures are representative as of 2026 — check the AWS pricing page (and per-model Bedrock pricing) for current rates.
Can I translate millions of documents or strings in bulk on AWS?
Yes — use asynchronous batch. Amazon Translate has an async batch API that translates a whole folder of documents in Amazon S3 in one job, ideal for back-catalogues, document sets, and nightly content syncs at the lowest per-character rate. For LLM translation that does not need a real-time answer, Amazon Bedrock batch inference runs large jobs asynchronously at roughly half the on-demand token price. Orchestrate with AWS Step Functions, trigger from S3 events or EventBridge schedules, and keep a translation memory so each run only translates new or changed content.
How do I make sure brand names and specific terms always translate correctly?
Use a glossary, and share it across both engines. On Amazon Translate, upload a custom terminology file (CSV or TMX) and it forces those mappings on every real-time and batch job. On a Bedrock LLM, put the glossary in the prompt ("use exactly these translations for these terms; never translate the brand name") — and because the model reads context, it applies the right inflection and case rather than a blind find-replace. Keep one version-controlled glossary feeding both so a hybrid pipeline stays consistent, and add the terms to your evaluation checks so regressions are caught.
How do I keep translation quality high without human-reviewing everything?
Review the slice that matters, not all of it. Tag content by risk (legal disclaimers high, forum posts low), auto-ship the low-risk majority, and route the high-risk minority to human reviewers — Amazon Augmented AI (A2I) gives you a managed human-review workflow. Decide what to escalate with a confidence signal: an LLM-as-a-judge score on Bedrock, Translate output heuristics, or back-translation agreement. Combine that with a shared glossary, customization on your own past translations, and a fixed evaluation set scored (BLEU/chrF/COMET plus an LLM judge) on every change. That gets near-human quality on the content that matters without paying to review everything.
Does a Bedrock LLM handle formatting and placeholders better than Amazon Translate?
For structured or mixed content, yes — when instructed. A Bedrock LLM can translate inside Markdown, HTML, JSON, or XLIFF and be told to leave tags, placeholders ({{name}}, %s), URLs, and code untouched, because it reasons over the whole input. Amazon Translate works at the segment level and needs pre/post-processing to protect placeholders and markup (it does offer an HTML/document mode, but complex structured strings are where an LLM shines). For app localization specifically — strings full of variables and plurals — routing the structured, brand-facing strings to an LLM and the plain ones to Translate is the reliable pattern. Broken {{placeholders}} are the classic localization bug, so test on real strings either way.
What is the hybrid translation pattern, and why use it?
A hybrid uses Amazon Translate for the economical bulk and a Bedrock LLM for the slice where nuance pays — capturing most of the cost advantage of NMT and most of the quality advantage of an LLM. Three common shapes: (1) Translate first, LLM to refine/post-edit the output; (2) route by content type up front (tickets/UGC → Translate, marketing/legal → LLM); (3) Translate for the draft, LLM as an automated quality judge that flags or corrects only weak translations. A pure-LLM pipeline over millions of strings is too slow and costly; a pure-Translate pipeline misses tone on the content that matters most. Most serious localization on AWS lands on a hybrid.
How long does it take to build a translation pipeline on AWS?
A basic real-time path — API Gateway + Lambda calling Amazon Translate with a glossary — can be standing in a day or two. A bulk batch path (S3 + Step Functions + Translate/Bedrock batch) is a few days more. Getting to genuinely production-ready — engine routing, placeholder protection, custom terminology, human-review escalation via A2I, a translation memory, continuous localization in CI, and an evaluation set — is typically 2–6 weeks depending on content complexity and how many languages and content types you support. The slowest part is usually preparing content and glossaries, not the AWS wiring. A specialist ML/localization partner compresses this materially, which is the engagement CloudRoute routes — funded by AWS credits, so the customer pays $0.

Build your AI translation pipeline on AWS — funded by AWS credits

CloudRoute routes you to a vetted AWS GenAI/ML partner who designs and ships the pipeline — Amazon Translate, a Bedrock LLM, or the hybrid; glossaries and custom terminology; batch for the bulk; human-in-the-loop where it counts; and continuous localization. AWS credits fund the build and the translation/inference. You pay $0.

matched within< 24h
credits to fund itup to $100K
cost to you$0
How to do AI translation on AWS (2026) — Translate vs Bedrock · CloudRoute