Amazon Nova is AWS's own family of foundation models on Amazon Bedrock, built to be the price-performance and low-latency choice rather than the absolute frontier. This is a complete, neutral reference: what Nova is, the full tier ladder (Micro, Lite, Pro, Premier for text and multimodal; Canvas for images; Reel for video; Act for agentic browser tasks), a capability and context-window table, how to call each model on Bedrock, an honest read on quality versus Claude and GPT, where Nova's cost advantage is real, the right tier for each use case — and how AWS credits make all of it $0 to build.
Amazon Nova is Amazon's own family of foundation models, delivered as managed models inside Amazon Bedrock. Where most of the headline models on Bedrock come from partner labs — Anthropic's Claude, Meta's Llama, Mistral, Cohere — Nova is the model family AWS itself builds and trains, and it is the centre of gravity for the cost-conscious end of the Bedrock catalogue.
The simplest way to place Nova: it is to Bedrock what a house brand is to a well-stocked store. The store still sells every premium label (you can call Claude or Llama through the same API), but the house brand is engineered to deliver most of the value at a meaningfully lower price — and, importantly, with the tight integration and predictability that come from AWS owning the whole stack. Nova models run on AWS's own silicon and infrastructure, which is part of how they hit their price and latency targets.
AWS's stated positioning for Nova is explicit and worth taking at face value: price-performance and low latency, not "we beat the frontier on every benchmark." Amazon's claim is that for a large share of real production tasks, a Nova tier delivers the quality you actually need at a fraction of the cost and latency of a top frontier model — and that when you genuinely need the frontier, you call Claude (or Nova Premier) through the very same Bedrock API. This is a deliberately different pitch from "the smartest model in the world," and it shapes where Nova fits.
Nova spans multiple modalities. The core text/multimodal tiers (Micro, Lite, Pro, Premier) handle understanding tasks — reading text, images, documents and video and producing text. Separate models cover generation: Nova Canvas creates images from text, Nova Reel creates short video, and Nova Act is an agentic model designed to take actions in a web browser. So "Amazon Nova" is not one model but a small family, each member tuned for a different job.
Crucially, Nova inherits the Bedrock guarantees that matter to companies: your prompts and data are not used to train the base models, data stays in your AWS account and region, and the usual enterprise controls (IAM, PrivateLink, KMS, CloudWatch, Guardrails) apply unchanged. Choosing Nova does not mean leaving the security and governance model of Bedrock — it is the same service, the same API, a different model ID.
One caveat, stated once and meant throughout: the model names, capabilities and especially the prices on this page are representative as of 2026. AWS iterates the Nova family quickly and foundation-model prices change as the market moves. Treat the figures here as a guide to the shape of the family and its relative cost — and confirm current model availability and pricing on the official AWS Bedrock and Nova pages (and see the amazon-nova-pricing sibling for the cost detail).
Amazon Nova is AWS's own foundation-model family on Bedrock — Micro/Lite/Pro/Premier for text and multimodal understanding, plus Canvas (image), Reel (video) and Act (agentic browser) — positioned as the price-performance and low-latency choice you reach for first, with frontier models like Claude one model-ID away when you need them.
The understanding tiers form a clean ladder from cheapest-and-fastest to most-capable. The trick to using Nova well is choosing the lowest tier that clears the quality bar for each specific task — and knowing that the creative and agentic models are separate, purpose-built members of the same family.
Read the text tiers as a price/capability ladder. As you move up — Micro → Lite → Pro → Premier — the model gets more capable, handles more modalities, and costs more per token. Move down and it gets cheaper and faster. Most production systems end up using two or three tiers at once: a small tier for the high-volume easy work and a larger one for the hard cases.
Modality: text in, text out (no image/video input). Personality: very low latency, very low cost, the highest-throughput option in the family. Best for: the high-volume, well-defined text work that does not need a big brain — classification, routing, intent detection, extraction, simple summarization, function/tool calling, and the cheap "first pass" in a tiered router. When a task is simple and you are doing it millions of times, Micro is usually the right answer. Model ID pattern: amazon.nova-micro-v1:0.
Modality: multimodal — accepts text, images, documents and video as input; produces text. Personality: still very fast and cheap, but able to see. Best for: high-volume multimodal understanding where cost matters — describing or classifying images, reading documents and screenshots, light video understanding, multimodal RAG, and any workload where you want vision without frontier-model prices. Lite is the workhorse for "cheap but multimodal." Model ID pattern: amazon.nova-lite-v1:0.
Modality: multimodal (text, image, document, video in; text out). Personality: the balanced middle — noticeably more capable than Lite on reasoning, instruction-following and complex multimodal tasks, while staying well below frontier-model cost. Best for: the "main" model for many production apps — agent orchestration, richer document and video analysis, complex structured output, and tasks where Lite is not quite reliable enough but a frontier model would be overkill. Pro is the default starting point if you are not sure which tier to pick. Model ID pattern: amazon.nova-pro-v1:0.
Modality: multimodal, with the strongest reasoning of the family. Personality: Nova's top tier — built for the hardest tasks and for complex multi-step workflows, and notably positioned as the teacher model for distillation (you can use Premier to create a smaller, cheaper custom model that mimics it for a narrow task). Best for: demanding reasoning, complex agentic planning, and as the distillation source when you want frontier-ish quality at a small-model price for a specific workload. Even so, on the very hardest reasoning and code, dedicated frontier models such as Claude often still lead — see §IV. Model ID pattern: amazon.nova-premier-v1:0.
Modality: text-to-image (and image editing). What it does: generates and edits images from prompts, with controls for things like style, layout and inpainting/outpainting, plus built-in safeguards (e.g. invisible watermarking for provenance). Best for: product imagery, marketing creative, design exploration and any app that needs on-demand image generation inside AWS. Billed per image, not per token (see the pricing sibling).
Modality: text-to-video (and image-to-video). What it does: generates short video clips from a text prompt or a starting image, with camera-motion controls. Best for: short marketing and social video, b-roll, and prototyping motion concepts without a production shoot. Billed per second of generated video.
Modality: agentic — a model (with an associated SDK) designed to take actions in a web browser reliably: navigating pages, clicking, typing, filling forms and completing multi-step tasks. What it is for: building agents that operate web UIs — research, form completion, workflow automation across sites that lack good APIs. It targets the long-standing weak spot of browser agents (reliability on real, multi-step tasks) and is the building block for "do this on the web for me" automation. Best treated as an emerging capability you prototype against rather than a finished, drop-in product.
A single scannable view of the family: what each model takes as input, the rough size of its context window, where it sits on cost, and the job it is built for. Use it to pick a starting tier; the per-token numbers live on the pricing page.
Context windows (how much text/content the model can consider at once) are large across the text tiers — comfortably into the hundreds of thousands of tokens — which makes Nova practical for long documents and big RAG contexts. The exact maximum and the supported input types evolve; the figures below are representative as of 2026.
| Model | Inputs → output | Context window (approx.) | Relative cost | Latency | Built for |
|---|---|---|---|---|---|
| Nova Micro | Text → text | ~128K tokens | Lowest | Fastest | High-volume simple text: classify, route, extract |
| Nova Lite | Text + image + doc + video → text | ~300K tokens | Very low | Very fast | Cheap multimodal understanding at scale |
| Nova Pro | Text + image + doc + video → text | ~300K tokens | Low–mid | Fast | Balanced main model: agents, rich analysis |
| Nova Premier | Text + image + doc + video → text | ~1M tokens (large) | Mid (cheap vs frontier) | Moderate | Hardest reasoning; distillation teacher |
| Nova Canvas | Text (+image) → image | n/a (image model) | Per image | Seconds | Image generation + editing |
| Nova Reel | Text (+image) → video | n/a (video model) | Per second | Async-ish | Short video generation |
| Nova Act | Goal + web page → browser actions | n/a (agentic) | Varies | Interactive | Reliable browser/agent automation |
The fair, hedged answer: Nova is genuinely strong for its price class and excellent on the broad middle of real tasks, but it is positioned as price-performance rather than frontier — and on the hardest reasoning, nuanced writing and complex code, dedicated frontier models such as Claude and GPT still tend to lead. Benchmarks are a rough guide, not gospel.
Start with the caveat that matters most: public benchmark numbers move constantly and rarely match your workload. Leaderboards (MMLU-style knowledge, math, coding suites, multimodal tests) are useful for a coarse ranking but routinely mislead on the specific task you care about. The only benchmark that truly counts is your own evaluation set on your own prompts. Treat everything below as direction, not a verdict, and run a quick eval (Bedrock has a model-evaluation feature) before committing.
With that said, here is the honest shape as of 2026. On the broad middle of production work — classification, extraction, retrieval-augmented answers, summarization, structured/JSON output, routine tool calling, straightforward multimodal understanding — the right Nova tier is typically good enough to ship, and it gets there far cheaper and faster than a frontier model. For a great many real features, you would struggle to tell a well-prompted Nova Pro answer from a frontier-model answer, and you would pay a fraction of the cost.
On the hard end — multi-step reasoning, subtle instruction-following, long-horizon agentic planning, nuanced or stylistically demanding writing, and especially difficult code generation and debugging — dedicated frontier models still tend to have an edge. Anthropic's Claude in particular is widely regarded as a leader for complex reasoning and coding, and is available on the very same Bedrock API; OpenAI's GPT models (accessed outside Bedrock) are the other common frontier reference point. Nova Premier narrows this gap and is the family's answer for hard tasks, but "narrows" is the honest word, not "erases."
The productive way to hold this is not "Nova vs Claude" as a one-time pick but as a portfolio. A tiered router sends the easy 70–90% of requests to a cheap Nova tier and escalates only the genuinely hard minority to Nova Premier or Claude. Because both live behind one Bedrock API, this routing is a software decision, not an integration project — and it tends to cut cost dramatically while keeping quality high where it matters. That is exactly the architecture Nova is designed to enable.
Nova is the value leader, not (by design) the frontier. Reach for a Nova tier first for cost-sensitive, high-volume and latency-sensitive work; reserve a frontier model (Claude, or Nova Premier) for the hard cases. Always validate on your eval set — benchmarks are a coarse guide, not a decision.
Every Nova model is reached through Amazon Bedrock, so if you can call any model on Bedrock you can call Nova by changing one identifier. There is no separate Nova endpoint, SDK or account to set up — it is the same managed service with a different model ID.
The workflow is the standard Bedrock workflow. First, in the Bedrock console under Model access, enable the Nova models you want in your region (a one-time toggle). Then call them with the AWS SDK (boto3, the AWS SDK for JavaScript, etc.) using either the Converse API — the unified, recommended interface for chat-style and multimodal requests, with a consistent message format across all models — or the lower-level InvokeModel call. Switching from, say, Claude to Nova Pro is changing the modelId string; the surrounding code stays the same.
The model IDs follow the pattern amazon.nova-<tier>-v1:0: amazon.nova-micro-v1:0, amazon.nova-lite-v1:0, amazon.nova-pro-v1:0, amazon.nova-premier-v1:0, with corresponding IDs for nova-canvas and nova-reel. For higher availability and to avoid single-region throttling, many teams call Nova through a cross-region inference profile (an "inference profile ID" that lets Bedrock serve the request from a pool of regions) — see the amazon-bedrock-cross-region-inference sibling.
Because Nova lives inside Bedrock, every surrounding Bedrock capability works with it unchanged: Knowledge Bases for managed RAG, Agents for tool-using workflows, Guardrails for safety/PII filtering, Prompt Management and Flows for orchestration, and fine-tuning / distillation to customise a model. You can also pick a pricing mode per workload — on-demand, Batch (~50% cheaper for async jobs), Provisioned Throughput, and prompt caching — exactly as you would for any Bedrock model (the amazon-bedrock-pricing sibling covers the modes).
Nova rewards the discipline of matching the tier to the task. Spend a frontier-class budget only where the task is genuinely hard; push everything else down the ladder. Here is a practical mapping of common workloads to the tier that usually fits.
These are starting points, not laws — always confirm with a quick eval. The recurring theme is that a great deal of production GenAI work is "easy" in model terms (well-defined, repetitive, schema-bound), and that work belongs on Micro or Lite, with Pro as the balanced default and Premier (or Claude) reserved for the hard minority.
A tiered router — Micro/Lite for the easy 70–90%, Pro for the middle, Premier or Claude for the hard minority — is the single highest-leverage pattern with Nova. Because every model sits behind one Bedrock API, routing is a config decision and routinely cuts cost many-fold with little quality loss.
Three "Amazon-adjacent" choices confuse people: Amazon Nova (the current price-performance family), Anthropic's Claude (the frontier reasoning/coding leader, also on Bedrock), and Amazon Titan (the earlier Amazon model family). Here is how to choose between them in 2026.
Nova vs Claude. This is the live, everyday decision, and the honest framing is "value vs frontier." Reach for Nova when cost and latency matter and the task sits in the broad middle — classification, extraction, RAG answers, summarization, structured output, high-volume agents, multimodal understanding. Reach for Claude when the task is genuinely hard — complex reasoning, nuanced or long-form writing, difficult code, subtle instruction-following — where its quality edge justifies the higher price. Because both are on Bedrock behind one API, the best answer is usually both via a router: Nova for the easy majority, Claude for the hard minority. See the claude-on-amazon-bedrock sibling for the Claude side.
Nova vs Titan. Amazon Titan was Amazon's first-generation foundation-model family on Bedrock. For new generative text and multimodal work in 2026, Nova is the successor and the better choice on quality, capability and price-performance — if you are starting fresh, default to Nova, not Titan, for generation. Titan's most enduring relevance is its embeddings models: Amazon Titan Text Embeddings remains a common, cheap, solid choice for the retrieval/vector half of a RAG or search system. So a very typical modern stack is "Nova for generation, Titan Text Embeddings for retrieval." See the amazon-titan sibling for detail.
A simple rule of thumb. Default to Nova for new generation work; escalate to Claude for the hard cases (ideally via a router on the same API); use Titan Text Embeddings for embeddings/RAG retrieval. You are not locked in to any of these — switching among them is a model-ID change on Bedrock, so the right move is to start with Nova, measure on your own eval set, and escalate only where the data says you must.
Getting from "interested in Nova" to a running, cost-tuned workload is short, because Nova is just Bedrock with a different model ID. The only meaningful decision beyond which tier is whether to pay for it yourself or have AWS credits cover the build — which, for most startups and many companies, they will.
The mechanical first steps: (1) enable the Nova models you want under Bedrock Model access; (2) prototype against Nova Pro as a sensible default with the Converse API; (3) build a small evaluation set from your real prompts and compare Pro against Lite/Micro (to push cost down) and against Premier or Claude (where you need more); (4) introduce a tiered router once you know which requests are easy vs hard; and (5) turn on the cost levers that fit — Batch for bulk jobs, prompt caching for repeated context, Provisioned Throughput only for steady high volume. That sequence gets most teams to a cheap, reliable production setup quickly.
The cost story is where CloudRoute comes in. Nova is the value choice on Bedrock, but at real scale GenAI still costs money — and AWS will frequently fund the build with credits. Nova inference, fine-tuning, distillation and the supporting services are all credit-eligible. The relevant pools: AWS Activate (general startup credits, commonly up to $100K for institutionally-funded startups), a dedicated Bedrock / Generative-AI POC pool ($10K–$50K) for proving out a use case, and the competitive Generative AI Accelerator (awards up to $1M for a small cohort of AI-first startups). Credits apply automatically against your AWS bill — including Nova usage — until exhausted.
Most of those pools are partner-filed through the AWS Partner Network (the ACE program), not a public self-serve form, which is why teams route through an AWS partner rather than applying alone. That is the gap CloudRoute fills: it matches you to the right credit pool for your stage and to a vetted AWS DevOps/ML partner who both files the credit application and helps build the Nova workload (the RAG pipeline, the agent, the tiered router, the distillation). The customer pays $0 — AWS funds the credit pool, AWS pays the partner through engagement-funding programs, and the partner pays CloudRoute a routing commission. You never see an invoice. For the credit mechanics specifically, see the cross-cluster pages on AWS credits for generative-AI startups and Bedrock POC funding.
A scannable view of the four model families a team on AWS actually weighs in 2026, on the dimensions that drive the choice. It is qualitative and directional (frontier capability and pricing both move quickly) — validate on your own eval set, and remember that Nova and Claude live behind the same Bedrock API, so "both via a router" is usually the real answer.
| Family | Maker / access | Sweet spot | Cost | Frontier reasoning | On Bedrock? |
|---|---|---|---|---|---|
| Amazon Nova | Amazon · Bedrock | Price-performance, low latency, high-volume & multimodal | Lowest of the four | Good→strong (Premier highest) | Yes (native) |
| Anthropic Claude | Anthropic · Bedrock | Hardest reasoning, nuanced writing, complex code | Higher (frontier) | Class-leading | Yes |
| OpenAI GPT | OpenAI · OpenAI/Azure (not Bedrock) | Frontier reasoning + broad ecosystem | Higher (frontier) | Class-leading | No |
| Amazon Titan | Amazon · Bedrock | Embeddings/RAG retrieval (gen superseded by Nova) | Low | Modest (older gen) | Yes |
Situation: The product ran every page — classification, field extraction, and a written summary — through a single frontier model on-demand. It worked, but the modeled inference bill was heading toward ~$9K/month at current volume and rising fast with growth, and almost all of that spend was on tasks (classify this page, pull these fields) that did not need a frontier model. They wanted the cost down without a visible drop in quality, and they did not want to spend runway on it.
What CloudRoute did: CloudRoute matched them in under 24 hours to an EU AWS partner with GenAI cost-engineering experience. The partner (1) moved the high-volume classification and extraction steps to <strong>Nova Lite</strong> (multimodal, reads the page images) and routing logic to <strong>Nova Micro</strong>; (2) kept the harder, customer-facing summary step on <strong>Nova Pro</strong>, escalating only ambiguous documents to a frontier model; (3) ran the nightly bulk re-processing via <strong>Batch</strong> and turned on <strong>prompt caching</strong> for the shared extraction instructions; and (4) filed a Bedrock POC credit application plus an Activate Portfolio application to fund the whole build and launch.
Outcome: Measured on the team's own eval set, quality held within tolerance while the modeled inference bill fell from ~$9K to ~$2.7K/month — roughly a 70% cut — driven almost entirely by tier-matching to Nova. Even that reduced spend was fully covered by the approved credits, so the team paid $0 during the build and early scale. CloudRoute's commission was paid by the partner from AWS engagement funding, not by the customer.
cost cut: ~$9K → ~$2.7K/mo modeled (~70%) · quality: held on eval set · credits: POC + Activate · out-of-pocket: $0
Nova is the cheapest way to run real GenAI on AWS — and AWS credits can make it cost nothing to build. CloudRoute routes you to the right credit pool (Activate up to $100K, Bedrock POC $10K–$50K, GenAI Accelerator up to $1M) and a vetted AWS partner who files the application and builds the workload — the tiered router, the RAG pipeline, the distillation. Customer pays $0.