Amazon Nova is not one model but a family, and choosing the right member is the difference between a cheap, fast, reliable feature and an over-paid one. This is a deep, neutral reference on the whole lineup: Nova Micro, Lite, Pro and Premier for text and multimodal understanding, plus Canvas for images, Reel for video and Act for agentic browser control — each broken down on capability, modality, context window, latency, price band and best-fit use case, with a full side-by-side comparison table, a concrete framework for picking a tier, how each Nova tier stacks up against Claude and Llama on Bedrock, and how AWS credits make the whole build $0.
Before comparing tiers it helps to see the shape of the family. "Amazon Nova models" splits cleanly into two groups: a ladder of understanding models that read content and produce text, and a set of purpose-built models for creating images, video and browser actions. They all live inside Amazon Bedrock and share its security model, but each is tuned for a different job.
The first group is the understanding ladder: Nova Micro, Nova Lite, Nova Pro and Nova Premier. These take text (and, from Lite up, images, documents and video) as input and produce text as output. They are ordered from cheapest-and-fastest to most-capable, and the whole point of having four rungs is that you can match the rung to the difficulty of the task instead of paying frontier prices for simple work.
The second group is purpose-built generation and action models: Nova Canvas generates and edits images, Nova Reel generates short video, and Nova Act is an agentic model (with an associated SDK) built to take reliable actions in a web browser. These are not "tiers" of one another — each does a distinct thing and is billed in its own unit (per image, per second of video, per agentic run rather than per text token).
A useful way to hold the whole family: the understanding tiers are the brains you reach for first and most often; Canvas and Reel are the creative tools you add when an app needs to make media rather than read it; and Act is the emerging building block for "do this on the web for me" automation. This page treats each member on its own terms, then pulls them together in one comparison table (§III) and a decision framework (§IV).
The standing caveat, stated once: the capabilities, context windows and especially the price bands on this page are representative as of 2026. AWS iterates the Nova family quickly and foundation-model pricing moves with the market. Use the figures here to understand the shape of the lineup and the relative position of each model; confirm current availability and exact numbers on the AWS Bedrock and Nova pages, and see the amazon-nova-pricing sibling for the per-tier rate tables.
Understanding ladder — Micro (text, cheapest) → Lite (cheap multimodal) → Pro (balanced multimodal) → Premier (most capable + distillation teacher); plus Canvas (image), Reel (video) and Act (agentic browser). All on one Bedrock API.
This is the core of the page: a deep read on every member of the family along the six dimensions that actually drive a model choice — capability level, what modalities it accepts, how big its context window is, how fast it responds, where it sits on price, and the use case it is built for. The understanding tiers come first as a ladder, then the three purpose-built models.
A note on how to read these. The price bands (lowest → highest within Nova) are relative to each other and remain far below frontier-model pricing even at the top of the family — Nova Premier is the priciest Nova but is still positioned well under a frontier model like Claude. Latency labels are directional: smaller models answer faster, which matters enormously for interactive and high-throughput workloads. Context windows are large across the text tiers, which is what makes Nova practical for long documents and big retrieval contexts.
Capability: solid for well-defined text tasks; not built for hard multi-step reasoning. Modality: text in, text out only (no image/video input). Context window: large for its class — roughly 128K tokens (representative 2026), enough for long inputs and sizeable RAG contexts. Latency: the fastest in the family, the highest-throughput option. Price band: the lowest in Nova — often a few cents per million tokens.
Best fit: high-volume, well-scoped text work that does not need a big brain — intent classification and routing, sentiment/topic tagging, entity and field extraction, short summaries, content-moderation pre-filtering, function/tool calling, and the cheap "first pass" in a tiered router. When a task is simple and you run it millions of times, Micro is usually the right answer because cost and latency dominate. Model ID: amazon.nova-micro-v1:0.
Capability: comparable reasoning to Micro on text, plus the ability to see. Modality: multimodal — accepts text, images, documents and video as input; produces text. Context window: large, comfortably into the hundreds of thousands of tokens (representative ~300K, 2026). Latency: still very fast. Price band: very low — slightly above Micro, far below Pro.
Best fit: high-volume multimodal understanding where cost matters — describing or classifying images, reading documents/screenshots/receipts/forms, multimodal RAG, light video understanding, product-catalog enrichment, and any "understand this image or document" feature that needs vision without frontier prices. Lite is the workhorse for "cheap but multimodal." Model ID: amazon.nova-lite-v1:0.
Capability: noticeably stronger than Lite on reasoning, instruction-following and complex multimodal tasks, while staying well below frontier cost. Modality: multimodal (text, image, document, video in; text out). Context window: large (representative ~300K, 2026). Latency: fast for its capability class. Price band: low-to-mid within Nova — the middle of the family.
Best fit: the "main" model for many production apps — agent orchestration and tool use, richer document and video analysis, complex structured/JSON output, customer-facing assistants where quality matters but cost still counts, and anything where Lite is not quite reliable enough but a frontier model would be overkill. Pro is the sensible place to start prototyping if you are unsure which tier to pick, then move work down to Lite/Micro where quality allows. Model ID: amazon.nova-pro-v1:0.
Capability: the strongest reasoning in Nova — built for the hardest tasks and complex multi-step workflows. Modality: multimodal, with the family's best handling of difficult, long or nuanced inputs. Context window: the largest of the family — representative ~1M tokens (2026), useful for very long documents and large multi-document contexts. Latency: moderate — the trade you accept for the capability. Price band: the highest within Nova, yet still positioned as cheaper than a frontier model.
Best fit: demanding multi-step reasoning, complex agentic planning, high-stakes analysis — and, distinctively, as the teacher model for distillation: you can use Premier to create a smaller, cheaper custom model that mimics it on a narrow, high-volume task, getting close-to-Premier quality at close-to-Micro cost for that task. Even so, on the very hardest reasoning and code, dedicated frontier models such as Claude often still lead (see §V). Model ID: amazon.nova-premier-v1:0.
Capability: text-to-image generation plus image editing. Modality: text (and optionally a reference image) in, image out. What it does: generates and edits images from prompts, with controls for style and layout and editing operations like inpainting and outpainting, plus built-in safeguards (for example invisible watermarking for provenance). Latency / billing unit: a few seconds per image; billed per image, not per token (see the pricing sibling). Best fit: product imagery, marketing creative, design exploration and any app that needs on-demand image generation running inside AWS with provenance safeguards.
Capability: text-to-video and image-to-video generation of short clips. Modality: text (and optionally a starting image) in, video out. What it does: generates short video clips from a prompt or a starting frame, with camera-motion controls. Latency / billing unit: generation is asynchronous-ish (a clip takes time to render); billed per second of generated video. Best fit: short marketing and social video, b-roll, and prototyping motion concepts without a production shoot.
Capability: an agentic model (with an associated SDK) designed to take reliable actions in a web browser — navigating pages, clicking, typing, filling forms and completing multi-step tasks. Modality: a goal plus the live web page in, browser actions out. What it is for: building agents that operate real web UIs — research, form completion and workflow automation across sites that lack good APIs. It targets the long-standing weak spot of browser agents (reliability on real, multi-step tasks). Best fit: "do this on the web for me" automation — best treated as an emerging capability you prototype against rather than a finished, drop-in product.
For understanding work, pick the lowest tier that clears your quality bar for each task, not the most capable model overall. Start a prototype on Nova Pro, push high-volume steps down to Lite/Micro where quality holds, and reserve Premier (or Claude) for the genuinely hard minority.
One scannable view of the lineup across the dimensions that drive the choice: what each model takes in and produces, its rough context window, where it sits on cost and latency, and the job it is built for. Use it to shortlist a starting model; the per-token / per-image / per-second numbers live on the pricing page.
Read it top to bottom as the price/capability ladder (Micro → Premier) for understanding work, then the three purpose-built models below. The cost column is qualitative and relative within Nova — even Premier, the priciest member, is positioned below frontier-model pricing. Figures are representative as of 2026.
| Model | Inputs → output | Context (approx.) | Relative cost | Latency | Best-fit use case | Model ID |
|---|---|---|---|---|---|---|
| Nova Micro | Text → text | ~128K tokens | Lowest | Fastest | High-volume simple text: classify, route, extract, tool-call | amazon.nova-micro-v1:0 |
| Nova Lite | Text + image + doc + video → text | ~300K tokens | Very low | Very fast | Cheap multimodal understanding at scale | amazon.nova-lite-v1:0 |
| Nova Pro | Text + image + doc + video → text | ~300K tokens | Low–mid | Fast | Balanced default: agents, rich analysis, structured output | amazon.nova-pro-v1:0 |
| Nova Premier | Text + image + doc + video → text | ~1M tokens (largest) | Mid (cheap vs frontier) | Moderate | Hardest reasoning, planning; distillation teacher | amazon.nova-premier-v1:0 |
| Nova Canvas | Text (+image) → image | n/a (image model) | Per image | Seconds | Image generation + editing (inpaint/outpaint) | amazon.nova-canvas-v1:0 |
| Nova Reel | Text (+image) → video | n/a (video model) | Per second | Async-ish | Short video generation (text/image-to-video) | amazon.nova-reel-v1:0 |
| Nova Act | Goal + web page → browser actions | n/a (agentic) | Varies | Interactive | Reliable browser/agent automation | Act SDK (agentic) |
The recurring mistake is picking one model for the whole app. The discipline that pays off is matching each task to the lowest tier that clears its quality bar, measured on your own data. Here is a concrete, repeatable way to choose — and to keep the choice honest as the app grows.
Work through these questions per task (not per app), because a single product usually contains tasks of very different difficulty — a "classify this page" step and a "write the customer-facing summary" step do not belong on the same model.
Build a tiered router: send the easy 70–90% of requests to Micro/Lite, the middle to Pro, and escalate only the genuinely hard minority to Premier or Claude. Because every model sits behind one Bedrock API, routing is a config decision — and it routinely cuts cost many-fold with little quality loss.
Nova does not exist in isolation — it sits in the Bedrock catalogue next to Anthropic's Claude (the frontier reasoning/coding leader) and Meta's Llama (the leading open-weight family). The useful question is not "which is best" but "which Nova tier maps to which alternative, for which task." Here is that mapping, tier by tier.
Against Claude. This is the everyday "value vs frontier" decision, and the cleanest way to see it is per Nova tier. Nova Micro/Lite have no real Claude equivalent at their price point — for high-volume classification, extraction and cheap multimodal understanding, Nova is simply the cost-and-latency winner and Claude would be overkill. Nova Pro overlaps with the smaller/faster Claude models (the Haiku-class tier) on balanced production work; Pro usually wins on cost, the Claude tier can win on nuance — measure on your eval set. Nova Premier is Nova's answer to the hard end, but on the very hardest reasoning, subtle writing and difficult code, the top Claude models (Sonnet/Opus-class) still tend to lead. Because both families are on one Bedrock API, the production answer is usually both via a router: Nova for the easy majority, Claude for the hard minority. See the claude-on-amazon-bedrock sibling for the Claude side.
Against Llama. Meta's Llama is the leading open-weight family on Bedrock, and the trade-off is different from Claude's. Llama's appeal is openness and portability — open weights, the option to self-host or fine-tune freely, and avoiding lock-in to a single provider. Nova's appeal against Llama is that, as AWS's own family running on AWS silicon, it is tuned for price-performance and latency on Bedrock and comes with first-party multimodal tiers and the distillation/Premier story. For many teams the choice comes down to philosophy: pick Llama if open weights and portability matter to you; pick Nova if you want the cheapest, lowest-latency managed option that is native to AWS. Both are a model-ID swap away, so you can benchmark them head-to-head on your own data with almost no integration cost.
The honest meta-point. Public benchmarks (knowledge, math, coding and multimodal suites) are a coarse ranking at best and routinely mislead on the specific task you care about — the only benchmark that counts is your own evaluation set on your own prompts. Treat the tier mappings above as direction, then validate. For a fuller catalogue-wide comparison across providers, see the amazon-bedrock-models sibling; for the cost detail behind these choices, see amazon-nova-pricing and amazon-bedrock-pricing.
Default to a Nova tier for new generation work (cheapest, lowest-latency, AWS-native); escalate to Claude for the genuinely hard reasoning/coding cases via a router on the same API; choose Llama when open weights and portability are the priority. All three are a model-ID change on Bedrock — start with Nova, measure, escalate only where the data says you must.
Picking a stock tier is the start; many teams then customize a Nova model to a narrow task to push quality up or cost down. Two mechanisms matter — fine-tuning and distillation — and both work through Bedrock's standard customization features, so they fit the same one-API workflow as everything else.
Fine-tuning adapts a Nova model to your domain by training it further on your labeled examples — useful when stock prompting cannot reliably hit your format, tone or domain accuracy. You fine-tune through Bedrock's custom-model workflow and then call the resulting custom model like any other (typically via Provisioned Throughput for serving). The amazon-bedrock-fine-tuning sibling covers the mechanics; the cost adds a one-off training charge plus custom-model hosting on top of inference.
Distillation is the higher-leverage move for high-volume tasks and is where Nova Premier earns its keep beyond raw capability. You use a strong teacher model (Premier, or another capable model) to generate high-quality outputs for your task, then train a smaller, cheaper student model (a smaller Nova tier) to mimic them. The result is close-to-teacher quality at close-to-small-model cost — for that specific task. This is exactly how teams get "Premier-ish answers at Micro-ish prices" on a narrow, repetitive workload, and it is a deliberate part of how the tier ladder is meant to be used.
The practical sequencing: start with stock tiers and a tiered router; reach for fine-tuning only when prompting cannot get a stock tier to your bar; reach for distillation when a high-volume task needs more quality than a cheap stock tier gives but you cannot afford to run the expensive tier on every call. Both customizations are AWS-credit-eligible alongside inference — which is the bridge to the next section.
Going from "which Nova model?" to a running, cost-tuned system is short, because every member is just Bedrock with a different model ID. The only meaningful decision beyond tier selection is whether to pay for it yourself or have AWS credits cover the build — which, for most startups and many companies, they will.
The mechanical first steps: (1) in the Bedrock console under Model access, enable the Nova models you want in your region (a one-time toggle); (2) prototype against Nova Pro with the unified Converse API as a sensible default; (3) build a small evaluation set from your real prompts and compare Pro against Lite/Micro (to push cost down) and against Premier or Claude (where you need more); (4) introduce a tiered router once you know which requests are easy versus hard; and (5) turn on the cost levers that fit — Batch (~50% off) for bulk jobs, prompt caching for repeated context, Provisioned Throughput only for steady high volume, and a cross-region inference profile for production resilience. That sequence gets most teams to a cheap, reliable production setup quickly.
The cost story is where CloudRoute comes in. Nova is the value family on Bedrock, but at real scale GenAI still costs money — and AWS will frequently fund the build with credits. Nova inference across every tier, plus fine-tuning and distillation and the supporting services, are all credit-eligible. The relevant pools: AWS Activate (general startup credits, commonly up to $100K for institutionally-funded startups), a dedicated Bedrock / Generative-AI POC pool ($10K–$50K) for proving out a use case, and the competitive Generative AI Accelerator (awards up to $1M for a small cohort of AI-first startups). Credits apply automatically against your AWS bill — including all Nova usage — until exhausted.
Most of those pools are partner-filed through the AWS Partner Network (the ACE program), not a public self-serve form, which is why teams route through an AWS partner rather than applying alone. That is the gap CloudRoute fills: it matches you to the right credit pool for your stage and to a vetted AWS DevOps/ML partner who both files the credit application and helps build the Nova workload — the tier selection, the router, the RAG pipeline, the agent, the fine-tune or distillation. The customer pays $0 — AWS funds the credit pool, AWS pays the partner through engagement-funding programs, and the partner pays CloudRoute a routing commission. You never see an invoice. For the credit mechanics specifically, see the cross-cluster pages on AWS credits for generative-AI startups and Bedrock POC funding.
A concrete, scannable way to see how tier selection plays out: take one realistic multimodal document-processing job and see which Nova model fits each step, why, and what you would escalate to. It is directional (capability and pricing both move quickly) — validate on your own eval set, and remember Nova and Claude share the one Bedrock API, so "both via a router" is usually the real answer.
| Step in the workload | Fits this Nova model | Why this tier | Escalate to | When to escalate |
|---|---|---|---|---|
| Route/triage the incoming request | Nova Micro | Text-only, trivial, runs on every request — cost & latency dominate | Nova Lite | If routing needs to read the document image |
| Classify the document type from its image | Nova Lite | Cheap multimodal; high volume; well-defined labels | Nova Pro | If categories are subtle/ambiguous |
| Extract structured fields (JSON) | Nova Lite → Pro | Lite if clean; Pro when layout is messy or schema is complex | Nova Premier | If extraction accuracy is still short on hard docs |
| Write the customer-facing summary | Nova Pro | Quality matters but cost still counts; balanced tier | Nova Premier / Claude | For nuanced tone or high-stakes accounts |
| Handle the genuinely hard, rare cases | Nova Premier | Strongest Nova reasoning + largest context | Claude (Sonnet/Opus-class) | For the very hardest reasoning/writing/code |
| Generate a product image for the report | Nova Canvas | Purpose-built image model, billed per image | — | — |
Situation: Every incoming message ran end-to-end through one mid-size model: intent detection, knowledge-base retrieval answer, and a drafted reply, all on a single tier. It worked, but the modeled inference bill was approaching ~$8K/month and climbing with volume, and most of that spend was on trivial steps — detecting intent, routing, tagging — that plainly did not need the same model as the customer-facing reply. The team wanted to match each step to the right Nova tier without a visible quality drop, and without burning runway on the rework.
What CloudRoute did: CloudRoute matched them in under 24 hours to an APAC AWS partner with GenAI cost-engineering experience. The partner (1) moved intent detection, routing and tagging to <strong>Nova Micro</strong> (text-only, cheapest, fastest); (2) moved screenshot/attachment understanding and document reads to <strong>Nova Lite</strong> (cheap multimodal); (3) kept the customer-facing reply drafting on <strong>Nova Pro</strong>, escalating only flagged high-value or ambiguous tickets to <strong>Nova Premier</strong> and, for the hardest few, to <strong>Claude</strong> on the same Bedrock API; (4) added prompt caching for the shared system/RAG context and ran nightly transcript analytics via Batch; and (5) filed a Bedrock POC credit application plus an Activate Portfolio application to fund the whole build and launch.
Outcome: Measured on the team's own eval set, answer quality held within tolerance while the modeled inference bill fell from ~$8K to ~$2.4K/month — roughly a 70% cut — driven almost entirely by tier-matching across the Nova lineup. Even that reduced spend was fully covered by the approved credits, so the team paid $0 through the build and early scale. CloudRoute's commission was paid by the partner from AWS engagement funding, not by the customer.
cost cut: ~$8K → ~$2.4K/mo modeled (~70%) · quality: held on eval set · credits: POC + Activate · out-of-pocket: $0
Matching each task to the right Nova tier is the cheapest way to run real GenAI on AWS — and AWS credits can make the whole build cost nothing. CloudRoute routes you to the right credit pool (Activate up to $100K, Bedrock POC $10K–$50K, GenAI Accelerator up to $1M) and a vetted AWS partner who files the application and builds the workload — the tier selection, the router, the RAG pipeline, the distillation. Customer pays $0.