A complete, neutral reference for what Amazon Nova actually costs on Amazon Bedrock in 2026: per-tier input/output token prices for Micro, Lite, Pro and Premier (shown per 1,000 and per 1,000,000 tokens), Nova Canvas per-image and Nova Reel per-second rates, how Nova compares on cost to Claude, Llama and Titan, where Batch (~50% off) and prompt caching apply, a worked monthly example, why Nova is the cost-optimization pick on Bedrock — and how AWS credits make all of it $0 to build.
Nova pricing is the standard Bedrock pricing model applied to Amazon's own family — so if you understand Bedrock billing, you already understand Nova billing. The only thing that makes Nova distinctive on cost is that its rates sit at the low end of the catalogue by design.
For the text tiers (Micro, Lite, Pro, Premier) the billing unit is the token — roughly ¾ of an English word, so 1,000 tokens ≈ 750 words. Every request is metered in two directions: input tokens (your prompt, the system instruction, conversation history, any retrieved/RAG context, and — for the multimodal tiers — the images, documents or video you send) and output tokens (everything the model writes back). You pay separately for each, at a published rate per 1,000 tokens. Many AWS and provider pages quote the same number per 1,000,000 tokens; it is simply the per-1K figure × 1,000, and this page shows both so the tables are easy to compare.
As with every model, output tokens cost more than input tokens — typically 3–5× — because generation is the expensive part. That shapes cost design: a Nova workload that reads a lot and writes a little (classification, extraction, routing) is extremely cheap, while one that writes long outputs from a short prompt is dominated by the output rate. Trimming what you send and capping what the model writes are the two simplest cost levers on any tier.
The non-text Nova models price on a different unit. Nova Canvas (image generation) is billed per generated image, with the price varying by output resolution and quality setting. Nova Reel (video generation) is billed per second of generated video. These are not token-priced, so you budget them by volume of images/seconds, not by prompt length. (Nova Act, the agentic browser model, is an emerging capability — treat its cost as evolving and check current terms.)
Two more things change the effective rate, both shared with the rest of Bedrock. First, the pricing mode: on-demand (the rates in the tables), Batch (~50% cheaper for asynchronous bulk jobs), Provisioned Throughput (a flat hourly charge for reserved capacity, for steady high volume and for serving custom/fine-tuned models), and prompt caching (a steep discount on repeated input context). Second, fine-tuning / distillation: a one-time training charge plus, if you host a custom model, an ongoing capacity cost. The amazon-bedrock-pricing and amazon-bedrock-prompt-caching siblings cover those mechanics in depth; this page focuses on the Nova numbers.
Caveat, stated once and meant throughout: every dollar figure on this page is representative as of 2026, included to show relative cost and the shape of a Nova bill. Foundation-model prices change frequently as providers compete, and they vary by region. Always confirm current rates on the official AWS Bedrock / Nova pricing page before budgeting, and use the amazon-bedrock-pricing-calculator sibling to model your own numbers.
Nova text tiers are billed per input and output token (per 1K or per 1M), output costing more than input; Canvas is per image and Reel is per second. Batch (~50% off), prompt caching and Provisioned Throughput all apply on top — and Nova's rates are among the lowest on Bedrock by design.
The table most people come for: representative 2026 on-demand prices for the four Nova text/multimodal tiers, shown per 1,000 and per 1,000,000 tokens for both input and output. Use it to rank the tiers by cost and to sanity-check a budget — not as an audited price sheet.
Read it as a ladder: Micro is the floor (cheapest, text-only), Lite adds multimodal at a still-tiny price, Pro is the balanced middle, and Premier is the most capable — yet even Premier, the top of the family, is priced well under frontier models like Claude Sonnet/Opus-class. The "per 1M" columns are the same rates as "per 1K" multiplied by 1,000, included because providers increasingly quote prices per million.
| Nova tier | Modality | Input / 1K | Output / 1K | Input / 1M | Output / 1M |
|---|---|---|---|---|---|
| Nova Micro | Text only | $0.000035 | $0.00014 | $0.035 | $0.14 |
| Nova Lite | Multimodal | $0.00006 | $0.00024 | $0.06 | $0.24 |
| Nova Pro | Multimodal | $0.0008 | $0.0032 | $0.80 | $3.20 |
| Nova Premier | Multimodal | $0.0025 | $0.0125 | $2.50 | $12.50 |
The creative Nova models are not token-priced, so they need their own mental model. Canvas is billed by the image and Reel by the second of video — both cheap per unit, but easy to scale into real money at volume, which makes Batch and sensible defaults worth setting up early.
Canvas is billed per image you generate, with the price depending on the output resolution and the quality setting (a standard image costs less than a high-resolution or "premium" one). Representative 2026 pricing is on the order of $0.04 per standard image and ~$0.06–$0.08 for larger/high-quality images — confirm current tiers on the AWS pricing page. Image editing operations (inpainting, outpainting, variations) are billed per output image as well. The cost lesson: at a few cents each, casual use is trivial, but an app generating tens of thousands of images a month should pick the lowest resolution that meets the need and avoid regenerating unnecessarily.
Reel is billed per second of generated video. Representative 2026 pricing is on the order of $0.05–$0.10 per second — so a six-second clip is on the order of a few tens of cents. Confirm the current rate on the AWS pricing page. Because cost scales linearly with duration, the obvious lever is clip length: generate the shortest clip that does the job, and prototype at short durations before committing to longer renders. Video generation is also a natural fit for asynchronous/batch-style processing rather than blocking a user request.
Budget Canvas by images × per-image rate (resolution/quality drives the rate) and Reel by seconds × per-second rate. Both are cheap per unit but scale with volume/duration — choose the lowest resolution and shortest clip that meet the requirement, and avoid needless regeneration.
Nova's whole pitch is price-performance, and the comparison table makes the "price" half concrete: against the frontier (Claude), the open-weight options (Llama) and Amazon's earlier family (Titan), the Nova tiers consistently sit at or near the bottom of the cost range on Bedrock.
The point of this table is not "cheaper is better" — model choice is about getting the quality you need at the lowest cost, and on the hardest tasks a frontier model can be worth its higher rate (see amazon-nova for the honest quality read). The point is to show the size of the cost gap, because that gap is exactly what a tiered router monetizes: send the easy majority of requests to a Nova tier and you pay Nova rates for most of your traffic.
| Model | Provider | Input / 1M | Output / 1M | Class | Notes |
|---|---|---|---|---|---|
| Nova Micro | Amazon | $0.035 | $0.14 | Value (text) | Cheapest text tier; high-volume simple tasks |
| Nova Lite | Amazon | $0.06 | $0.24 | Value (multimodal) | Cheap multimodal at scale |
| Claude Haiku | Anthropic | $0.25 | $1.25 | Fast frontier-family | Fast/cheap Claude tier |
| Llama (small ~8B) | Meta | $0.22 | $0.72 | Open-weight | Low-cost open model |
| Nova Pro | Amazon | $0.80 | $3.20 | Value (balanced) | Balanced multimodal default |
| Llama (large ~70B+) | Meta | $2.65 | $3.50 | Open-weight | Capable open model |
| Nova Premier | Amazon | $2.50 | $12.50 | Value (top) | Amazon's most capable; distillation teacher |
| Claude Sonnet | Anthropic | $3.00 | $15.00 | Frontier workhorse | Strong reasoning/coding |
| Claude Opus-class | Anthropic | $15.00 | $75.00 | Top frontier | Hardest tasks |
Nova's on-demand rates are already low, but the standard Bedrock cost levers apply on top — and on the right workload they compound. The two biggest for Nova are Batch (for anything not interactive) and prompt caching (for anything with repeated context).
Submit a large set of Nova requests as a single asynchronous job (typically a file in S3) and Bedrock processes them in the background, returning results when done. In exchange for giving up real-time responses you pay roughly half the on-demand rate. On Nova — already the cheap family — this makes high-volume work astonishingly inexpensive: bulk classification, extraction, summarization, corpus enrichment, and offline evaluation are all natural Batch candidates. Pairing a cheap tier (Micro/Lite) with Batch is the lowest-cost way to run large text workloads on AWS.
When many Nova requests share a large common prefix — a long system prompt, a fixed instruction set, a reference document, large tool definitions, or few-shot examples — prompt caching lets Bedrock cache that prefix so subsequent requests are not billed full input price for it again. Cached input tokens are billed at a steep discount versus normal input tokens (with a small charge to write the cache). On chatbots with a long fixed system prompt or RAG that reuses the same context, this can cut the input portion of a Nova bill substantially. It only helps where context actually repeats — see amazon-bedrock-prompt-caching for the mechanics.
For steady, high, predictable Nova volume — or to serve a fine-tuned / distilled custom Nova model — you can reserve dedicated capacity via Provisioned Throughput, a flat hourly charge independent of token count. Fine-tuning itself is a one-time training charge; the recurring cost is hosting the custom model on reserved capacity. Because Nova Premier is positioned as a distillation teacher, a common advanced pattern is: distill Premier into a small custom model for one narrow high-volume task, then serve that cheaply. Only do this when the volume justifies a standing hosting cost.
On Nova the cheapest setup is often: a small tier (Micro/Lite) + Batch for bulk async work + prompt caching for repeated context, with on-demand reserved for interactive traffic and Provisioned Throughput only where volume is steady. Each lever multiplies the already-low base rate.
Per-token rates are hard to feel until you put a workload through them. Here is a concrete, representative monthly estimate for a common Nova workload, plus the same workload priced on a frontier model so the cost gap is visible. Figures are illustrative — your mileage varies with prompt length and mode.
The workload — a multimodal support + extraction assistant. Say 200,000 requests/month. Each request reads a customer message plus a screenshot/document (≈ 2,000 input tokens once the image is tokenized) and writes a ≈ 400-token answer. That is 400M input tokens and 80M output tokens per month.
On Nova Lite (on-demand). At the representative rates ($0.06 / $0.24 per 1M), input ≈ 400 × $0.06 = $24 and output ≈ 80 × $0.24 = $19.20 → ≈ $43/month for 200,000 multimodal requests. Turn on prompt caching for the fixed system prompt and the input portion drops further; run any non-interactive portion via Batch and it roughly halves again.
On Nova Pro (on-demand), if you needed the extra capability: input ≈ 400 × $0.80 = $320 and output ≈ 80 × $3.20 = $256 → ≈ $576/month. Still modest for the volume — and the point of a router is that you only pay Pro rates for the requests that actually need Pro.
The same workload on a frontier model (Claude Sonnet, on-demand) would be input ≈ 400 × $3.00 = $1,200 and output ≈ 80 × $15.00 = $1,200 → ≈ $2,400/month. So the identical 200,000-request workload is roughly $43 on Nova Lite, ~$576 on Nova Pro, and ~$2,400 on Claude Sonnet — a ~55× spread from the cheapest Nova tier to the frontier. That spread is the entire argument for tier-matching: run the easy majority on Nova, escalate only the hard minority, and the blended bill lands far closer to the Nova number than the frontier one. And at these magnitudes, the whole thing fits comfortably inside an AWS credit pool — which is why so many teams pay $0 while they scale.
If the question is "how do I run GenAI on AWS for the least money without wrecking quality," the answer almost always involves Nova. Not because it wins every benchmark, but because of where it sits in the cost/capability landscape and how cleanly it slots into a routing strategy.
Three things make Nova the default cost-optimization lever. First, the rates are simply the lowest tier-for-tier on Bedrock — for the broad middle of production tasks (classification, extraction, RAG answers, summarization, structured output, high-volume agents, multimodal understanding) a Nova tier is good enough and costs a fraction of a frontier model. Second, it is multimodal cheaply — Lite and Pro read images, documents and video at prices that previously only bought text, which collapses the cost of "understand this screenshot/PDF" features. Third, it routes trivially — because Nova and every other model live behind one Bedrock API, sending the easy 70–90% of traffic to a Nova tier and escalating the rest is a config decision, not an integration project.
There is a fourth, more advanced lever unique to the top of the family: distillation from Nova Premier. For a narrow, high-volume task you can use Premier as a teacher to train a small custom model that approaches its quality at a small-model price, then serve that — turning a frontier-ish capability into a value-tier ongoing cost. Combined with Batch and prompt caching, this is how teams get high-volume GenAI workloads down to genuinely small monthly numbers.
The honest counterweight: Nova is the value pick, not the frontier pick. Where a task is genuinely hard — complex reasoning, nuanced writing, difficult code — paying for Claude (or Nova Premier) is the right call, and a good cost strategy budgets for that minority rather than pretending it away. Cost optimization with Nova is not "use the cheapest model for everything"; it is "use the cheapest model that clears the bar for each task, and reserve the expensive model for where it earns its rate."
Everything above prices what Nova costs if you pay AWS directly. For most startups and many companies the relevant number is different — because AWS will frequently fund the build with credits, and Nova spend draws those credits down before it ever touches your card. Nova being the cheap family just means the credits last even longer.
AWS runs several credit programs specifically to put generative-AI workloads on AWS, and Nova usage is fully credit-eligible — inference across every tier, Canvas and Reel generation, fine-tuning, distillation, and the supporting services. The relevant pools: AWS Activate (general startup credits, commonly up to $100K for institutionally-funded startups); a dedicated Bedrock / Generative-AI POC pool ($10K–$50K) aimed at proving out a GenAI use case; and the competitive Generative AI Accelerator (credit awards up to $1M for a small cohort of AI-first startups). Credits apply automatically against your AWS bill — including all Nova usage — until exhausted.
The practical wrinkle is that most of these pools are partner-filed: they are requested through the AWS Partner Network (the ACE program), not a public self-serve form. That is why teams typically route through an AWS partner rather than applying alone — and it is the gap CloudRoute fills. CloudRoute matches you to the right credit pool for your stage and to a vetted AWS DevOps/ML partner who both files the credit application and helps build the Nova workload (the tiered router, the RAG pipeline, the distillation, the Canvas/Reel pipeline). The customer pays $0 — AWS funds the credit pool, AWS pays the partner through engagement-funding programs, and the partner pays CloudRoute a routing commission. You never see an invoice.
Put together with the cost levers above, the math for a startup is compelling: Nova is already the cheapest way to run real GenAI on AWS, and a $25K–$100K credit pool stretches enormously far against Nova rates — often covering the entire build and early scale, so you pay real money only once usage (and ideally revenue) has grown past the credits. Related: see amazon-nova for the full model overview, amazon-bedrock-pricing for the cross-model cost picture, and the cross-cluster pages on AWS credits for generative-AI startups and Bedrock POC funding for the credit mechanics.
To make the cost gap unmissable, here is the §VI workload — 200,000 multimodal requests/month, 400M input + 80M output tokens — priced on each Nova tier and on a frontier model, on-demand. It shows why tier-matching, not model loyalty, is the cost strategy. Figures are representative 2026 illustrations, not quotes.
| Model | Input / 1M | Output / 1M | Input cost | Output cost | Est. monthly |
|---|---|---|---|---|---|
| Nova Micro* | $0.035 | $0.14 | $14.00 | $11.20 | ≈ $25 |
| Nova Lite | $0.06 | $0.24 | $24.00 | $19.20 | ≈ $43 |
| Nova Pro | $0.80 | $3.20 | $320.00 | $256.00 | ≈ $576 |
| Nova Premier | $2.50 | $12.50 | $1,000.00 | $1,000.00 | ≈ $2,000 |
| Claude Sonnet (frontier) | $3.00 | $15.00 | $1,200.00 | $1,200.00 | ≈ $2,400 |
Situation: The team had prototyped their feature on a frontier model and modeled the bill at roughly $6K/month at their growth target — most of it spent on routine "read this document/screenshot and extract or answer" requests that did not need a frontier model. On a seed budget they could not absorb that, and they wanted to know whether Nova could carry it without a visible quality drop, and whether AWS would fund the build.
What CloudRoute did: CloudRoute matched them in under 24 hours to a MENA-region AWS partner with GenAI cost-engineering experience. The partner (1) moved the high-volume read/extract path to <strong>Nova Lite</strong> and the routing/classification step to <strong>Nova Micro</strong>; (2) kept the small share of genuinely hard requests on <strong>Nova Pro</strong>, with a fallback to a frontier model for edge cases; (3) turned on <strong>prompt caching</strong> for the fixed extraction instructions and ran the nightly bulk re-processing via <strong>Batch</strong>; and (4) filed a Bedrock POC credit application plus an Activate application to fund the build and early scale.
Outcome: On the team's own eval set, quality held while the modeled monthly inference cost fell from ~$6K (frontier) to roughly $700 on the Nova-based router — and that ~$700 was fully covered by the approved credits, so the team paid $0 out of pocket. CloudRoute's commission was paid by the partner from AWS engagement funding, not by the customer.
cost: ~$6K (frontier) → ~$700/mo (Nova router), modeled · quality: held on eval set · credits: POC + Activate · out-of-pocket: $0
Nova is already the cheapest way to run real GenAI on AWS — and AWS credits can make it cost nothing to build. CloudRoute routes you to the right credit pool (Activate up to $100K, Bedrock POC $10K–$50K, GenAI Accelerator up to $1M) and a vetted AWS partner who files the application and builds the cost-tuned workload — the tiered router, the caching, the distillation. Customer pays $0.