Amazon Nova Act is Amazon's agentic model and SDK for building agents that reliably take actions in a web browser — navigating, clicking, typing, filling forms and completing multi-step tasks on real sites. It is a deliberate answer to the weakest spot of browser agents: reliability. Instead of asking one model to "do everything," Nova Act has you decompose a goal into small, atomic, testable steps and lets you mix natural-language actions with ordinary code. This is a complete, neutral reference: what Nova Act is, how the SDK's act() model works, the reliability approach, the real use cases (web automation, workflows, form-filling, QA), how it relates to Bedrock Agents, its current preview status and limits — and how AWS credits make building on it $0.
Amazon Nova Act is an agent — a model paired with an SDK — built to take reliable actions inside a web browser. Where most of the Nova family reads inputs and produces text, Nova Act produces actions: it looks at a rendered web page and decides where to click, what to type, which link to follow, and how to move through a multi-step task on a real website.
Put plainly, Nova Act is "hands for a model on the web." A normal language model can tell you, in prose, how to fill out a form; Nova Act actually fills it out — opening the page, locating the right fields, entering values, clicking submit, and checking the result. It operates a real browser the way a person would, which means it can work against the large share of the internet that exposes a UI but no usable API.
It sits in the Amazon Nova family alongside the understanding tiers (Micro, Lite, Pro, Premier) and the creative models (Canvas for images, Reel for video). Within that family Nova Act is the agentic member: its output is browser actions, not a paragraph or an image. It was introduced through Amazon's frontier/AGI research effort and shipped first as a research preview with a developer SDK — a way to start building and giving feedback, not a finished enterprise product. That framing matters and recurs throughout this page.
The reason a model like this exists is that browser agents have historically been unreliable. Demos that breeze through a happy path tend to fall apart on real, multi-step tasks: the model mis-clicks, loses track of state, hallucinates a button that is not there, or quietly does the wrong thing. Reliability — not raw intelligence — is the binding constraint on whether you can let an agent run unattended. Nova Act's explicit goal is to attack that constraint directly, which shapes its whole programming model (see §II–§III).
A useful mental model: Nova Act is to browsers what a test-automation framework is to UIs, but driven by natural language and a model instead of brittle, hand-written selectors. You describe what to do in small steps; the model figures out how to do it against the live page; and you wrap that in ordinary code so you can branch, loop, assert and recover. The result is meant to be scriptable and repeatable rather than a one-off chat.
One caveat, stated once and meant throughout: Nova Act is new and moving fast, and its exact status, availability, supported regions and capabilities are still evolving as of 2026. Treat the specifics here as a guide to the shape of the product and how to think about it, and confirm current availability, limits and pricing on the official Amazon Nova / Nova Act pages before you commit to it in production.
Amazon Nova Act is an agentic model + SDK for building agents that reliably take actions in a web browser — and its core idea is to win reliability by decomposing a task into small, atomic, testable steps you can interleave with ordinary code, rather than asking one model to "do everything" at once.
You do not "chat" with Nova Act so much as program with it. The SDK gives you a browser session and a single workhorse primitive — act() — that takes a short natural-language instruction and executes it against the current page. You compose many small act() calls with normal Python to build a reliable workflow.
The shape of a Nova Act program is: start a browser session pointed at a starting URL, then issue a sequence of act() calls, each describing one tightly-scoped action in plain English — act("search for a wireless keyboard"), act("add the first result to the cart"), act("go to checkout"). Each call hands control to the model, which inspects the rendered page and performs the action; control then returns to your code, so you decide what happens next. The agent is driven by your script, not left to free-wheel through the whole task.
The point of keeping each act() small is twofold. First, small actions are more reliable: a model asked to "click the blue Add-to-cart button for the first item" succeeds far more often than one asked to "buy me a keyboard, ship it, and email me the receipt" in a single shot. Second, small actions are testable: because each step is discrete, you can assert on its outcome, retry it, log it, and debug exactly where a flow broke — the difference between an agent you can trust and one you merely hope works.
Crucially, Nova Act lets you interleave natural-language steps with ordinary code. Between act() calls you can run normal Python: loop over a list of rows, branch on a condition, call your own REST API, read a value off the page and store it, do arithmetic, or set browser cookies to skip a login. This hybrid — model for the fuzzy "understand and operate the UI" parts, deterministic code for the structured "control flow and data" parts — is the heart of the design. You are not choosing between "all agent" and "all script"; you are weaving them.
The SDK also supports the practical machinery of real automation: extracting structured data from a page (so a step can return a typed value your code uses), waiting for elements/state, running multiple browser sessions in parallel for throughput, and capturing what happened for inspection. The recommended discipline is to give each act() one clear, bounded job, prefer assertions over assumptions, and treat a long flow as a tested pipeline of small steps rather than one monolithic prompt.
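To illustrate the "typed value" idea, here is a minimal sketch of the code-side half of extraction: turning raw text read off a page into a validated value before the rest of the script relies on it. The `Price` type, the `parse_price` helper and the symbol table are hypothetical names chosen for this illustration and are not part of the Nova Act SDK. How the raw text is read from the page is deliberately left out.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation

# Illustrative symbol table (an assumption); extend it for the sites you target.
CURRENCY_SYMBOLS = {"$": "USD", "€": "EUR", "£": "GBP"}


@dataclass(frozen=True)
class Price:
    amount: Decimal
    currency: str


def parse_price(raw: str) -> Price:
    """Turn text such as '$1,299.00' into a typed Price, or raise ValueError."""
    text = raw.strip()
    symbol = text[:1]
    if symbol not in CURRENCY_SYMBOLS:
        raise ValueError(f"no known currency symbol in {raw!r}")
    try:
        amount = Decimal(text[1:].replace(",", "").strip())
    except InvalidOperation as exc:
        raise ValueError(f"no numeric amount in {raw!r}") from exc
    if not amount.is_finite() or amount < 0:
        raise ValueError(f"implausible amount in {raw!r}")
    return Price(amount=amount, currency=CURRENCY_SYMBOLS[symbol])
```

Failing loudly on text like "Call for price" is the point: the script stops at the extraction step instead of carrying a bad value into a later comparison.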
A typical script reads almost like a test: start session → act("dismiss the cookie banner") → act("search for 'standing desk'") → for each of the first 5 results: act("open result N"), extract the price, act("go back") → assert the cheapest is under the budget → act("add the cheapest to cart"). Notice the pattern: the model handles the messy UI interactions, while the loop, the extraction and the assertion are plain code. That separation is what makes the whole thing repeatable and debuggable.
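The script described above can be sketched as follows. This is a shape illustration under stated assumptions, not the SDK's actual interface: `Session` is a hypothetical stand-in with an `act()` method and an `extract_price()` helper, and `FakeSession` is an in-memory fake so the example runs without a browser. Check the official SDK documentation for the real session class and extraction calls.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


class Session(Protocol):
    """Hypothetical stand-in for a browser session (not the real SDK type)."""

    def act(self, instruction: str) -> None: ...
    def extract_price(self) -> float: ...


@dataclass
class FakeSession:
    """In-memory fake: records instructions and serves canned prices."""

    prices: List[float]
    log: List[str] = field(default_factory=list)
    current: int = -1

    def act(self, instruction: str) -> None:
        self.log.append(instruction)
        if instruction.startswith("open result "):
            self.current = int(instruction.split()[-1]) - 1

    def extract_price(self) -> float:
        return self.prices[self.current]


def add_cheapest_to_cart(session: Session, budget: float, top_n: int = 5) -> int:
    """Return the 1-based index of the cheapest of the first top_n results."""
    session.act("dismiss the cookie banner")
    session.act("search for 'standing desk'")

    prices: List[float] = []
    for n in range(1, top_n + 1):        # the loop is plain code
        session.act(f"open result {n}")  # the UI work is one small step
        prices.append(session.extract_price())
        session.act("go back")

    best = min(range(top_n), key=prices.__getitem__)
    # Plain-code check before any cart action is taken.
    assert prices[best] < budget, f"cheapest price {prices[best]} is not under {budget}"
    session.act(f"add result {best + 1} to cart")
    return best + 1
```

Because the control flow lives in Python, the fake session is enough to test the loop and the budget check without touching a real site.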
The failure mode of naive browser agents is compounding error: a long autonomous task is a chain of dozens of decisions, and a small per-step error rate multiplies into a high chance of overall failure. Decomposing the task and pinning down the control flow in code shrinks each step's uncertainty and removes whole categories of error from the model's plate (you no longer rely on it to "remember" to loop, or to do the math). You trade a little authoring effort for a large gain in reliability — exactly the trade unattended automation requires.
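The compounding effect is easy to quantify with a simplified model. The sketch below assumes every step succeeds independently with the same probability, which is an assumption made for illustration (real steps are neither independent nor uniform), and the per-step rates are made-up inputs rather than measured Nova Act figures.

```python
def flow_success_probability(per_step_success: float, steps: int) -> float:
    """Probability that all steps succeed, assuming independent, identical steps."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


if __name__ == "__main__":
    for p in (0.95, 0.98, 0.995):
        print(
            f"per-step {p:.3f}: "
            f"10 steps -> {flow_success_probability(p, 10):.2f}, "
            f"50 steps -> {flow_success_probability(p, 50):.2f}"
        )
```

Under this model a 98% per-step rate yields roughly 0.36 over 50 steps, while 99.5% yields roughly 0.78, which is why raising per-step reliability and shortening the chain both matter.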
Think of act() as a single reliable UI verb and your Python as the program around it. Model for the fuzzy parts (operate the page), code for the structured parts (loop, branch, assert, fetch). Long flows are pipelines of small, tested steps — not one heroic prompt.
Reliability is the entire pitch, so it is worth being precise about how Nova Act tries to earn it. The approach is less "a smarter model" and more "an engineering discipline the SDK encourages": make every step small, make outcomes checkable, and keep the deterministic parts deterministic.
Atomic, bounded steps. The first lever is scope. A bounded instruction — one verb, one target — gives the model far less room to go wrong than an open-ended goal. Nova Act's design nudges you toward many small act() calls precisely because per-step reliability is high when the step is small, and a flow built from reliable steps is itself reliable. This is the opposite of the "agent, here is your objective, off you go" pattern that demos so well and breaks so often.
Checkable outcomes. The second lever is verification. Because steps are discrete and the SDK can extract structured values from the page, you can assert that a step did what it should before moving on — confirm the item is in the cart, confirm the form shows a success message, confirm the extracted total is a number. Assertions turn silent failures (the agent's most dangerous behaviour) into loud, catchable ones, and give you a natural place to retry or bail out.
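One way to express "verify before moving on" in code is a small wrapper that runs an action and then a check, raising if the check fails. `checked_step` and `StepFailed` are hypothetical names for this sketch. In a real flow the `action` would wrap a browser step and `verify` would read state back from the page.

```python
from typing import Callable


class StepFailed(RuntimeError):
    """Raised when a step ran but its outcome could not be verified."""


def checked_step(
    name: str,
    action: Callable[[], None],
    verify: Callable[[], bool],
) -> None:
    """Run the action, then confirm its outcome; fail loudly otherwise."""
    action()
    if not verify():
        raise StepFailed(f"step {name!r} ran but its outcome could not be verified")


if __name__ == "__main__":
    cart = []  # stand-in for state that would be read from the page
    checked_step(
        "add desk to cart",
        action=lambda: cart.append("desk"),
        verify=lambda: "desk" in cart,
    )
    print("verified:", cart)
```

An explicit exception is used rather than a bare `assert` because Python drops `assert` statements when run with `-O`, which would silently disable the check.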
Deterministic control flow. The third lever is to stop asking the model to do things code does better. Looping, counting, comparing, calling an API, handling a login via stored cookies — these are deterministic, so you write them as code, not as prose for the model to interpret. Every responsibility you move from the model to code is one fewer source of nondeterministic error. The model is reserved for what only it can do: understand and operate an arbitrary, unfamiliar UI.
Recovery and observability. Real sites are flaky — pop-ups, slow loads, A/B-tested layouts, intermittent errors. A robust Nova Act flow plans for that: wait for state rather than assuming it, wrap fragile steps with retries, and capture enough of what happened (logs, intermediate values, traces of each step) to diagnose failures after the fact. Treating the agent like production software — tested, observable, defensively written — is the difference between something you can schedule unattended and something you have to babysit.
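A retry wrapper with exponential backoff and per-attempt logging is one common way to implement the "wrap fragile steps with retries" advice. The sketch below is generic Python: `with_retries` is a hypothetical helper, not an SDK feature, and the delay values are arbitrary defaults.

```python
import logging
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
log = logging.getLogger("flow")


def with_retries(
    step: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run step(); on an exception, back off exponentially and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception as exc:  # in real code, catch only errors worth retrying
            last_error = exc
            log.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt < attempts:
                sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"step failed after {attempts} attempts") from last_error
```

Retries are only safe for steps that can be repeated without side effects. A step that submits or purchases should be verified first and retried only if the check shows it did not go through.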
None of this makes Nova Act infallible. Browser agents in 2026 are markedly better than a year or two earlier but still err on genuinely hard, long-horizon or visually ambiguous tasks. The honest claim is narrower and more useful: by decomposing tasks and leaning on code for structure, Nova Act makes the reliable envelope much larger, large enough that well-scoped, well-tested flows can run on their own.
Small steps (low per-step error) + assertions (catch failures loudly) + deterministic code for control flow (remove whole error classes) + retries/observability (survive flaky sites). Reliability here is an engineering pattern the SDK encourages, not a promise that the model never errs.
Nova Act earns its keep wherever a useful task lives behind a web UI that has no good API, or where an internal process means "a human clicking through several websites." Anywhere a person would otherwise do repetitive browser work, a well-scoped Nova Act flow is a candidate.
The unifying theme across these use cases is the same: a multi-step task, expressed on a graphical web interface, that you want to run repeatably and ideally unattended. The more the task can be decomposed into bounded steps with checkable outcomes, the better a fit it is.
Nova Act fits a task when: (1) the task lives on a web UI, (2) there is no clean API to use instead, (3) it is repetitive or multi-step, and (4) it can be decomposed into bounded, checkable steps. The more boxes a task ticks, the stronger the case.
People conflate "Nova Act" and "Bedrock Agents" because both have agentic in the name, but they operate at different layers and solve different problems. The clean way to hold it: Bedrock Agents orchestrate tools and APIs; Nova Act drives a browser. They compose rather than compete.
Amazon Bedrock Agents is a managed capability inside Amazon Bedrock for building agents that reason over a request and call tools to fulfil it. Those tools are typically APIs and functions — you define "action groups" (often backed by Lambda or an OpenAPI schema) and attach Knowledge Bases for retrieval, and the agent plans which tools to call and in what order. It is the orchestration brain for structured integrations: when the thing you need is reachable as an API, a database, or a documented function, Bedrock Agents is the natural fit. See the amazon-bedrock-agents sibling for the full picture.
Nova Act operates one layer down and to the side: it is the mechanism for the case where there is no API at all, only a graphical web interface meant for humans. It does not orchestrate your microservices; it clicks buttons and fills fields on a real page. Its "intelligence" is concentrated on the hard problem of reliably operating an arbitrary UI, and its interface is an SDK you script, not a managed console agent.
Because they live at different layers, the powerful pattern is composition. A Bedrock Agent can treat a Nova Act flow as just another tool — a "go do this on the website" action group. The agent handles high-level planning and the API-shaped steps; when a sub-task only exists as a website, it hands off to Nova Act, which performs the browser work and returns a result the agent continues from. You get API-level orchestration and UI-level execution in one system.
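The composition pattern can be sketched as a Lambda-style handler sitting behind a Bedrock Agents action group. The event and response field names below follow the function-details format for action group Lambda functions as best understood from the AWS documentation, so verify them against the current docs before relying on them. `run_portal_flow` and `FLOWS` are hypothetical placeholders; in practice the handler would more likely dispatch to wherever browser sessions actually run than drive a browser inside the function itself.

```python
import json
from typing import Any, Callable, Dict


def run_portal_flow(shipment_id: str) -> Dict[str, Any]:
    """Hypothetical placeholder for a browser flow; returns a canned result."""
    return {"shipment_id": shipment_id, "status": "submitted"}


# Maps the function name the agent calls to the flow that serves it.
FLOWS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "process_shipment": run_portal_flow,
}


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Route an action-group call to a flow and wrap the result for the agent."""
    function = event["function"]
    # Parameters arrive as a list of {"name", "type", "value"} entries.
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}

    flow = FLOWS.get(function)
    if flow is None:
        body = {"error": f"unknown function {function!r}"}
    else:
        body = flow(**params)

    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event["actionGroup"],
            "function": function,
            "functionResponse": {
                "responseBody": {"TEXT": {"body": json.dumps(body)}}
            },
        },
    }
```

From the agent's side the browser work is then just one more tool: it sends a function name and parameters and receives a text result to continue planning from.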
A simple decision rule: if the capability you need has a good API, reach for Bedrock Agents (or just call the API directly from your code/agent); if it only exists as a website, reach for Nova Act; and if your task needs both, use both — Bedrock Agents to orchestrate, Nova Act as the browser-using tool. They are different tools for different shapes of integration, not substitutes. (For agent foundations generally, see the build-an-ai-agent-on-aws sibling.)
| Dimension | Amazon Nova Act | Amazon Bedrock Agents |
|---|---|---|
| Core job | Take actions in a graphical web browser/UI | Orchestrate tools/APIs to fulfil a request |
| Works against | Websites/web apps with no good API | APIs, functions, databases (action groups) |
| Interface | Python SDK · act() steps you script | Managed Bedrock capability (console/API/IaC) |
| Retrieval / knowledge | Reads the live page (extract from UI) | Knowledge Bases (managed RAG) built in |
| Reliability lever | Decompose into small, asserted steps | Tool schemas + model planning + guardrails |
| Maturity (2026) | Newer · preview-grade | Generally available, productionized |
| Together | Can be a "use the website" tool… | …called by a Bedrock Agent as one of its tools |
Being straight about maturity is part of using Nova Act well. As of 2026 it is a new, fast-moving capability that began life as a research preview with an SDK: genuinely useful and worth prototyping against, but not a long-stable, drop-in enterprise product. Plan accordingly.
Maturity and stability. Nova Act shipped first as a research preview aimed at developers, with an SDK to start building and to gather feedback. That implies the usual preview realities: the API surface, behaviour, supported regions and limits can change between releases; expect to track updates and to re-test flows after upgrades. It is a strong base to build and learn on, but treat anything mission-critical as something to validate carefully and revisit as the product matures.
Reliability is much improved, not perfect. The decomposition-and-assert approach raises the reliable envelope substantially, but no browser agent in 2026 is flawless. Genuinely hard, long-horizon tasks, visually ambiguous pages, aggressive bot-detection, frequently-changing layouts and unusual widgets can still trip it up. The mitigation is engineering discipline (small steps, assertions, retries, human-in-the-loop for high-stakes actions), not an assumption that it will always get it right.
Scope and surface. As a web/browser agent it is, by definition, oriented to web UIs — it is not a general desktop-application automator, and its strongest support is web-first and English-first. Authentication, sites that forbid automation in their terms, CAPTCHAs and rate limits are practical constraints you have to design around. And because it operates real systems, you should scope its permissions tightly and gate any irreversible action (purchases, submissions, deletions) behind a confirmation or a human check.
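Gating irreversible actions can be as simple as requiring an explicit approval callback before the step runs. In this sketch `gated_act` is a hypothetical helper, `act` is any callable that performs a step, and `approve` stands in for whatever confirmation mechanism you use (a prompt, a ticket, a review queue). Marking a step as irreversible is left to the script author rather than inferred from the instruction text.

```python
from typing import Callable


def gated_act(
    act: Callable[[str], None],
    instruction: str,
    *,
    irreversible: bool,
    approve: Callable[[str], bool],
) -> bool:
    """Run the step unless it is irreversible and the approver declines.

    Returns True if the step was performed, False if it was blocked.
    """
    if irreversible and not approve(instruction):
        return False
    act(instruction)
    return True


if __name__ == "__main__":
    performed = []
    deny_all = lambda instruction: False  # stand-in approver that always declines
    gated_act(performed.append, "open the order page", irreversible=False, approve=deny_all)
    gated_act(performed.append, "click 'Place order'", irreversible=True, approve=deny_all)
    print(performed)
```

Returning a boolean instead of raising lets the calling code decide whether a blocked step should end the run or be queued for later review.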
Operational and governance considerations. An agent that can act on the web is powerful and therefore needs guardrails: least-privilege credentials, isolation (run sessions in a controlled/sandboxed environment), logging of every action for audit, and clear boundaries on what it may touch. Respect target sites' terms of service and robots/automation policies. These are not reasons to avoid Nova Act — they are the standard responsibilities of putting any capable automation into production.
Net: Nova Act is one of the more credible attempts yet at reliable browser automation, and the right posture in 2026 is to prototype real flows against it now, learn where it is solid, keep humans in the loop for risky steps, and harden as it matures — while confirming current status and limits on the official pages, since this is exactly the kind of fast-moving product where details shift.
Nova Act is newer and preview-grade in 2026 — a credible, genuinely useful agent for browser tasks, but not a long-stable enterprise product. Prototype against it, keep humans in the loop for irreversible actions, scope permissions tightly, and confirm current status/limits on the official Nova Act pages.
Getting from "interested in Nova Act" to a working agent is a developer task: get access, install the SDK, write a handful of small act() steps for a real flow, then harden it. The only decision beyond what to automate is whether you pay for it yourself or have AWS credits cover the build — which, for most startups and many companies, they will.
The mechanical first steps: (1) get access to Nova Act and obtain the credentials/API key the SDK needs, per the current official instructions; (2) install the Python SDK and run the minimal sample to confirm a browser session launches and a first act() executes; (3) pick one real, bounded workflow — a form you fill repeatedly, a report you pull, a UI flow you want to test — and write it as a sequence of small act() steps interleaved with the loops/assertions it needs; (4) add verification (assert each step's outcome, capture logs/traces) and retries so flaky pages do not break the run; and (5) only then scale up — parallel sessions for throughput, scheduling for unattended runs, and a human-in-the-loop gate on any irreversible action.
Treat the first build as a way to learn the envelope. Start with a task that is valuable but low-risk, measure how reliably it runs, and expand from there. Because each step is discrete and assertable, you will quickly see which parts of your flow are solid and which need tighter scoping or a code-side guardrail — that feedback is the fastest path to a flow you can trust unattended.
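For the scale-up step, one plausible shape is to run the same flow over many inputs with a thread pool, collecting each result or error separately so one bad row does not sink the batch. This sketch uses only the standard library. `run_in_parallel` is a hypothetical helper, and it assumes each call to `flow` opens and closes its own browser session rather than sharing one across threads.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, TypeVar, Union

R = TypeVar("R")


def run_in_parallel(
    flow: Callable[[str], R],
    items: Iterable[str],
    max_workers: int = 4,
) -> Dict[str, Union[R, Exception]]:
    """Run flow(item) for each unique item; map each item to its result or error."""
    results: Dict[str, Union[R, Exception]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(flow, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()
            except Exception as exc:  # keep the failure next to its input
                results[item] = exc
    return results
```

Keeping `max_workers` small at first is a sensible default, since target sites may rate-limit or block bursts of automated sessions.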
The cost story is where CloudRoute comes in. Building an agent — Nova Act flows, any Bedrock model calls that wrap them, the Bedrock Agent orchestration around them, the compute to run sessions — costs money at scale, and AWS will frequently fund the build with credits. The relevant pools: AWS Activate (general startup credits, commonly up to $100K for institutionally-funded startups), a dedicated Bedrock / Generative-AI POC pool ($10K–$50K) for proving out a use case, and the competitive Generative AI Accelerator (awards up to $1M for a small cohort of AI-first startups). Credits apply automatically against your AWS bill until exhausted.
Most of those pools are partner-filed through the AWS Partner Network (the ACE program), not a public self-serve form, which is why teams route through an AWS partner rather than applying alone. That is the gap CloudRoute fills: it matches you to the right credit pool for your stage and to a vetted AWS DevOps/ML partner who both files the credit application and helps build the agent — the Nova Act flows, the assertions and recovery logic, the Bedrock Agent orchestration, the deployment. The customer pays $0 — AWS funds the credit pool, AWS pays the partner through engagement-funding programs, and the partner pays CloudRoute a routing commission. You never see an invoice. For the credit mechanics specifically, see the cross-cluster pages on AWS credits for generative-AI startups and Bedrock POC funding.
When you need to automate something, the real choice is among a few approaches. The table below compares Amazon Nova Act with Bedrock Agents (API orchestration), traditional selector-based browser automation (RPA / Selenium / Playwright) and direct API integration, on the dimensions that drive the pick. It is directional as of 2026; validate against your own task.
| Approach | What it drives | How you build | Resilience to UI change | Reliability lever | Best when |
|---|---|---|---|---|---|
| Amazon Nova Act | A real web browser/UI | Python SDK · natural-language act() steps + code | Higher (describes intent, not selectors) | Decompose into small asserted steps | No API exists; multi-step web tasks; flaky UIs |
| Amazon Bedrock Agents | APIs / functions / data | Managed agent · action groups + Knowledge Bases | n/a (calls APIs, not UI) | Tool schemas + planning + guardrails | A clean API/function exists to call |
| Classic RPA / Selenium / Playwright | A real web browser/UI | Hand-written CSS/XPath selectors + scripts | Lower (selectors break on UI churn) | Deterministic but brittle scripts | Stable UI, fixed flow, no model needed |
| Direct API integration (no agent) | APIs directly | Plain code against the API | n/a | Fully deterministic | You control or have a documented API |
Situation: A core part of the product meant entering and reconciling shipment data across several carrier and customs web portals that exposed no API. This was work an ops person did by hand, several hundred times a day, clicking through the same screens. It did not scale, it was error-prone, and every new carrier was another portal to learn. The team had tried brittle selector-based scripts that broke whenever a portal tweaked its layout. They wanted reliable, unattended automation but did not want to burn seed runway building and operating it.
What CloudRoute did: CloudRoute matched them in under 24 hours to an APAC AWS partner with agent and automation experience. The partner (1) built the worst portal flows as Amazon Nova Act agents, decomposing each into small act() steps (log in via stored cookies, search the shipment, enter the fields, submit) interleaved with Python that looped over the day's rows and asserted the success state per row; (2) added retries, waits and full action logging so portal flakiness did not break runs, with a human-in-the-loop gate on any irreversible submission; (3) wrapped the whole thing behind a Bedrock Agent so the rest of the system could trigger "process these shipments" as one tool call; and (4) filed a Bedrock/GenAI POC credit application plus an Activate application to fund the build and early runtime.
Outcome: The bounded, asserted flows ran reliably enough to move the bulk of portal data-entry off humans, with exceptions surfaced for review rather than silently mis-entered — and because steps describe intent rather than selectors, a portal layout change no longer broke the run outright. The build and early scale were fully covered by the approved credits, so the team paid $0. CloudRoute's commission was paid by the partner from AWS engagement funding, not by the customer.
portal data-entry: mostly automated · resilience: intent-based steps survive layout churn · credits: POC + Activate · out-of-pocket: $0
Nova Act is one of the most credible ways yet to put reliable browser automation into production — and AWS credits can make it cost nothing to build. CloudRoute routes you to the right credit pool (Activate up to $100K, Bedrock POC $10K–$50K, GenAI Accelerator up to $1M) and a vetted AWS partner who files the application and builds the workload — the Nova Act flows, the assertions and recovery, the Bedrock Agent orchestration. Customer pays $0.