aws bedrock setup · the hands-on 2026 quickstart

Amazon Bedrock setup — from zero to your first Converse call.

A practical, copy-paste quickstart for developers: the prerequisites, installing the AWS CLI and boto3, wiring up credentials and a least-privilege IAM policy, enabling model access in the console, then making your first Bedrock Converse API call — with full code for handling the response and switching to streaming. No GPUs to provision, no model-serving stack to run. Roughly fifteen minutes from an empty AWS account to a working completion.

time to first call: ~15 min · servers to provision: 0 · SDK to install: boto3 · cost to start: per-token

TL;DR

Bedrock setup is four steps before you write code: have an AWS account, install and configure the AWS CLI, create least-privilege IAM credentials that allow bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream (the IAM actions that authorize Converse and ConverseStream as well as the lower-level InvokeModel API), and request model access per-model, per-Region in the Bedrock console (access is off by default). Then you install boto3 and call the bedrock-runtime client.
Your first call should use the Converse API, not the older InvokeModel — Converse gives one request/response schema across every chat model, so switching from Claude to Nova to Llama is a one-line modelId change. Streaming is the same call with converse_stream and a loop over the contentBlockDelta events.
Setup itself is free; you pay per token once you start calling models. Small models cost a fraction of a cent per 1K tokens, so a quickstart costs pennies — but real GenAI workloads scale fast. CloudRoute routes you to AWS credits (Bedrock/GenAI POC $10K–$50K, Activate Portfolio up to $100K, GenAI Accelerator up to $1M) and vetted partners to build it; you pay $0.

before you start

I. Prerequisites: what you need before the first call

Bedrock setup has a short list of prerequisites, and getting them straight up front saves the two most common stumbles: calling a Region where the model is not enabled, and using credentials that lack Bedrock permissions. None of this requires infrastructure — there are no GPUs, instances, or clusters to stand up.

At a high level you need five things, and the rest of this page walks through each in order. First, an AWS account with permission to manage IAM and Bedrock (your own account, or an admin who can grant you access). Second, the AWS CLI v2 installed locally, used to store credentials and sanity-check that they work. Third, a set of programmatic credentials — ideally a short-lived role via IAM Identity Center / SSO, or an IAM user with an access key for a pure quickstart — carrying a least-privilege Bedrock policy. Fourth, model access requested in the Bedrock console for the specific models you intend to call, in the Region you intend to call them; Bedrock ships with every model off by default. Fifth, a runtime — this guide uses Python with the AWS SDK, boto3, but the same flow maps directly to the JavaScript, Java, Go, and .NET SDKs.

A few decisions are worth making deliberately before you touch a keyboard. Choose a Region first. Frontier models frequently land in US Regions (us-east-1, us-west-2) before they appear in eu-central-1 or ap-southeast-1, and your prompts and completions are processed in the Region you call — so pick a Region that has the model you want and satisfies any data-residency requirement you have. Decide which model you are starting with. A sensible default for a quickstart is one cheap, fast chat model (for example a small Claude or Amazon Nova model) so your first iterations cost pennies; you can swap the model ID later without rewriting anything. Avoid root credentials. Never use your AWS account root user for day-to-day API calls — create an IAM identity scoped to exactly what you need, which the next sections cover.

An AWS account (with IAM + Bedrock admin reach) — Your own account or one where an admin can grant you IAM and Bedrock permissions. A brand-new account works fine.
AWS CLI v2 installed locally — Used to configure and verify credentials. Not strictly required to call Bedrock from code, but the fastest way to confirm your credentials resolve.
Programmatic credentials with a Bedrock policy — Short-lived SSO role (preferred) or an IAM user access key (simplest for a quickstart), carrying the least-privilege policy in section III.
Model access enabled in the Bedrock console — Requested per-model, per-Region. Off by default. Most models are granted in seconds; some require accepting the provider EULA.
Python 3.9+ and boto3 (or another AWS SDK) — This guide uses Python + boto3. The Converse flow is identical in the JS/Java/Go/.NET SDKs.

the most common first-call failure

If your very first call throws AccessDeniedException, it is almost always one of two things: (1) the IAM principal lacks bedrock:InvokeModel, the action that authorizes both Converse and InvokeModel calls, or (2) you have not requested model access for that model in that Region. Both are fixed in minutes; see sections III and IV. A ValidationException about the model usually means a wrong or stale modelId; copy the exact current ID from the console.

step 1

II. Install the AWS CLI and boto3

Two installs get you ready: the AWS CLI v2 (to hold and verify credentials) and the boto3 SDK (to make the calls from Python). Both are a single command on every major platform.

Install the AWS CLI v2 using the method for your OS, then confirm the version. On macOS the simplest route is Homebrew; on Linux the bundled installer; on Windows the MSI from the AWS docs. The CLI is what reads and writes the shared credentials file that boto3 (and every other AWS SDK) picks up automatically, so installing it now means you do not hard-code keys in your code later.

Then create an isolated Python environment and install boto3, the AWS SDK for Python. Bedrock is exposed through two clients in boto3: bedrock (the control plane — listing models, managing custom models, configuring features) and bedrock-runtime (the data plane — actually invoking models with Converse / InvokeModel). For making completions you want bedrock-runtime; this is the single most common point of confusion in setup, so internalize it now: you call models on the bedrock-runtime client, not the bedrock client.

Install and verify the AWS CLI

Pick the install command for your platform, then verify it resolves. A successful aws --version printing aws-cli/2.x confirms the CLI is on your PATH and ready to configure.

Install boto3 in a virtual environment

Keeping boto3 in a per-project virtual environment avoids version clashes and makes the project reproducible. boto3 pulls in botocore, which carries the Bedrock service models, so a recent boto3 is all you need for the Converse API.

install commands (copy-paste)

# macOS (Homebrew)
brew install awscli

# Linux (x86_64)
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o awscliv2.zip
unzip awscliv2.zip && sudo ./aws/install

# verify the CLI
aws --version # expect: aws-cli/2.x ...

# Python env + boto3
python3 -m venv .venv && source .venv/bin/activate
pip install --upgrade boto3
python -c "import boto3, botocore; print(boto3.__version__)"

step 2

III. Configure credentials and a least-privilege IAM policy

Bedrock is a standard AWS service governed by IAM. Setup here is two parts: give boto3 a way to find credentials, and make sure those credentials carry exactly the Bedrock permissions you need and nothing more. Do this once and every SDK on the machine inherits it.

There are two good ways to provide credentials, and one you should avoid. The avoid-it option is pasting an access key into your source code — it leaks. The simplest quickstart option is an IAM user with an access key, stored via aws configure, which writes a profile into the shared credentials file at ~/.aws/credentials; boto3 reads it automatically with no code. The production-grade option is short-lived credentials from IAM Identity Center (SSO) via aws configure sso or an assumed role, so nothing long-lived sits on disk. For a first call either works; for anything beyond a sandbox, prefer SSO/roles.

However you authenticate, the principal needs a Bedrock policy. The least-privilege starting point grants only the runtime actions you will use: bedrock:InvokeModel, which authorizes both InvokeModel and Converse calls, and bedrock:InvokeModelWithResponseStream, which authorizes their streaming counterparts (InvokeModelWithResponseStream and ConverseStream). Converse and ConverseStream are authorized through these two actions rather than through actions named after them. Optionally add the read-only control-plane action bedrock:ListFoundationModels so you can enumerate available models. You can scope Resource down to specific model ARNs to restrict exactly which models a service may call; the example below starts permissive on resource for a quickstart and notes where to tighten it.
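Scoping Resource to specific models might be sketched as below. This is an illustration rather than an official template: the helper name build_bedrock_policy and the model ID are placeholders, and it assumes the standard foundation-model ARN format (arn:aws:bedrock:REGION::foundation-model/MODEL_ID). It grants only bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream, the IAM actions that authorize Converse and ConverseStream calls.

```python
import json


def build_bedrock_policy(region: str, model_ids: list[str]) -> dict:
    """Return an IAM policy document that allows inference on the given models only."""
    resources = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for model_id in model_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": resources,
            }
        ],
    }


# "example.model-id-v1:0" is a placeholder; use the exact ID shown in the console.
policy = build_bedrock_policy("us-east-1", ["example.model-id-v1:0"])
print(json.dumps(policy, indent=2))
```

Calls routed through a cross-region inference profile likely need that profile's ARN in Resource as well, which this sketch does not cover.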

Every Bedrock call is recorded in CloudTrail, and you can additionally enable Bedrock model-invocation logging to capture full request and response payloads to S3 or CloudWatch — useful for debugging and audit, and worth turning on early. After you set credentials, verify they resolve before involving Bedrock at all: aws sts get-caller-identity should print your account and principal ARN. If that fails, fix credentials first; it has nothing to do with Bedrock yet.

Option A — IAM user access key (simplest quickstart)

Create an IAM user, attach the least-privilege Bedrock policy below, generate an access key, then run aws configure and paste the key, secret, default Region, and output format. boto3 will pick up the default profile automatically.

Option B — IAM Identity Center / SSO (preferred)

Run aws configure sso, authenticate in the browser, and select the account and permission set that carries the Bedrock policy. This issues short-lived credentials that refresh automatically — no long-lived secret on disk. Pass the profile to boto3 with a named session or the AWS_PROFILE environment variable.

least-privilege Bedrock IAM policy + credential check

# Least-privilege Bedrock policy (JSON)
# Resource is "*" for the quickstart; tighten it to specific model ARNs in production.
# bedrock:InvokeModel also authorizes Converse; bedrock:InvokeModelWithResponseStream also authorizes ConverseStream.
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "bedrock:InvokeModel",
      "bedrock:InvokeModelWithResponseStream"
    ],
    "Resource": "*"
  }]
}

# Configure + verify (Option A)
aws configure  # paste key, secret, region (e.g. us-east-1), json
aws sts get-caller-identity  # prints your account + principal ARN
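The same credential check can be done from Python, which is handy when you use a named SSO profile. The sketch below is one possible shape, not a prescribed one: describe_caller and sts_client_for_profile are hypothetical helper names, and the profile name in the usage note is a placeholder. It relies only on boto3.Session(profile_name=...) and the STS get_caller_identity call.

```python
def describe_caller(sts_client) -> str:
    """Return the account and principal ARN that the given STS client resolves to."""
    identity = sts_client.get_caller_identity()
    return f"account={identity['Account']} arn={identity['Arn']}"


def sts_client_for_profile(profile_name=None):
    """Build an STS client from a named profile (None means the default credential chain)."""
    import boto3  # imported here so describe_caller can be exercised without AWS access

    session = boto3.Session(profile_name=profile_name)
    return session.client("sts")
```

For example, print(describe_caller(sts_client_for_profile("my-sso-profile"))) should print your account and ARN once aws configure sso has been completed for that (placeholder) profile. If it raises instead, the problem is in credentials, not in Bedrock.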

step 3

IV. Enable model access in the Bedrock console

This is the step most setup guides skip and most first calls trip over. Bedrock does not give you any model the moment you open the console — access is opt-in, requested per model and per Region. Until you enable a model, every call to it returns AccessDeniedException, regardless of your IAM policy.

The reason access is gated is governance: AWS wants every model your organization can call to be a deliberate, auditable choice, and several models carry provider end-user license terms you must accept first. To enable a model, open the Bedrock console, switch to the Region you plan to use (top-right Region selector — this matters, access is per-Region), and go to Model access in the left navigation. Select the specific models you intend to call — for a quickstart, one small chat model is enough — and submit the request. For most models access is granted within seconds to a couple of minutes; models that require accepting a EULA prompt you to do so before granting.

Two facts are worth burning in. First, access is per-Region: enabling Claude in us-east-1 does not enable it in eu-west-1, so if you switch Regions later, request access again there. Second, availability varies by Region: a model you see in us-east-1 may not yet be offered in eu-central-1 or ap-southeast-1 at all. If a model you want is not listed in your chosen Region, either pick a Region that has it or use a cross-region inference profile (a related capability that lets one request be served from one of several Regions within a geography). You can check which models a Region offers from code with ListFoundationModels on the bedrock (control-plane) client; the Model access page remains the place to confirm that access has actually been granted.

Once the status for your chosen model reads Access granted, copy its exact model ID from the console — model IDs are specific strings (and some calls expect an inference-profile ID rather than a bare model ID). Using the precise current ID is what prevents the ValidationException that comes from a guessed or outdated identifier. With access granted and the ID in hand, you are ready to make a call.

1. Open the Bedrock console and select your target Region (top-right); access is granted per-Region.
2. Go to Model access in the left navigation.
3. Select the model(s) you want (start with one small, cheap chat model) and submit the request.
4. Accept the provider EULA if prompted; wait for status to read Access granted (usually seconds).
5. Copy the exact model ID (or inference-profile ID) shown in the console; you will paste it into modelId.
6. Optional: from code, call ListFoundationModels on the bedrock client to confirm what is available in that Region.
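The optional last step can be sketched as follows. It assumes a boto3 control-plane client (service name bedrock, not bedrock-runtime) and a principal allowed to call bedrock:ListFoundationModels; the helper name text_models_in_region is hypothetical. Note that the result shows what the Region offers, which is not by itself proof that access has been granted to your account.

```python
def text_models_in_region(bedrock_client) -> list[str]:
    """Return the sorted IDs of text-output foundation models offered in the client's Region."""
    response = bedrock_client.list_foundation_models(byOutputModality="TEXT")
    return sorted(summary["modelId"] for summary in response["modelSummaries"])
```

A typical call would be text_models_in_region(boto3.client("bedrock", region_name="us-east-1")). If the model you want is missing from the list, it is not offered in that Region and no amount of IAM tuning will help.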

step 4

V. Your first Converse API call

With credentials resolving and model access granted, a working completion is about ten lines of Python. Use the Converse API, not the older InvokeModel — Converse gives you one consistent request and response schema across every chat model, so the same code works for Claude, Nova, Llama, Mistral, and Cohere with only the modelId changed.

You make completions on the bedrock-runtime client. The converse call takes three things you care about at the start: a modelId (the exact string you copied from the console), a messages list (each message has a role of user or assistant and a content list of blocks — a text block is {"text": "..."}), and an inferenceConfig (generation settings such as maxTokens and temperature). You can also pass a system prompt as a top-level argument to steer behavior. That uniform shape is the whole point of Converse: there is no provider-specific request body to assemble.

The example below is a complete, runnable first call. It creates the runtime client bound to your Region, sends one user message, and prints the model's text. The only line you ever change to switch models is modelId — everything else stays identical, which is what makes model evaluation and cost-driven model routing trivial later. Region and credentials are picked up from the environment / shared config you set in section III, so there are no secrets in the code.

Multi-turn conversations

To continue a conversation, append the model's reply (the output.message object) to your messages list and add the next user message, then call converse again. Bedrock is stateless between calls — you resend the running message history each turn, which is exactly the pattern that makes prompt caching worthwhile once histories grow long.

first Converse call — complete, runnable (python / boto3)

import boto3

# Region and credentials come from the environment / shared config set up in section III.
brt = boto3.client("bedrock-runtime", region_name="us-east-1")

resp = brt.converse(
    # Placeholder ID: paste the EXACT model ID (or inference-profile ID) from the console.
    # This is the only argument you change to switch models.
    modelId="anthropic.claude-haiku",
    system=[{"text": "You are a concise assistant."}],
    messages=[
        {"role": "user", "content": [{"text": "Give me three uses for Amazon Bedrock."}]}
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.2},
)

# The reply text is the first content block of the assistant message.
print(resp["output"]["message"]["content"][0]["text"])

# The same call works for Nova, Llama, Mistral, Cohere; only modelId changes.
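The multi-turn pattern described above (append the reply, add the next user message, call converse again) could be sketched like this. The helper name continue_conversation is hypothetical, and the client is passed in so the same function works with any bedrock-runtime client you have already created.

```python
def continue_conversation(client, model_id: str, messages: list, user_text: str) -> str:
    """Send the next user turn with the full history and return the reply text.

    `messages` is mutated in place so it always holds the running conversation.
    """
    messages.append({"role": "user", "content": [{"text": user_text}]})
    response = client.converse(modelId=model_id, messages=messages)
    reply = response["output"]["message"]
    messages.append(reply)  # keep the assistant turn so the next call has context
    return reply["content"][0]["text"]
```

Start with history = [] and call continue_conversation(brt, model_id, history, "...") once per turn. Because Bedrock keeps no state between calls, the history list is the conversation, and its growth is what drives input-token cost on later turns.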

step 5

VI. Handling responses, streaming, and errors

A robust setup reads more than just the text: it checks why the model stopped, tracks token usage for cost, switches to streaming for responsive UIs, and handles the handful of errors Bedrock actually throws. All of this is in the same Converse response shape.

The converse response is a structured object. The text lives at output.message.content[0].text. The stopReason tells you why generation ended — end_turn (the model finished), max_tokens (you hit your limit, the answer may be truncated, so raise maxTokens), or tool_use (the model wants to call a tool you defined). Crucially for cost, usage reports inputTokens, outputTokens, and totalTokens for the call — log these from day one, because they are how you reason about and forecast spend. A response that ends with max_tokens is the most common "why is my output cut off?" surprise in early setup.
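Reading those fields in one place might look like the sketch below. The function name summarize_response and the shape of the returned dict are illustrative choices, not part of the Bedrock API; the keys it reads (output.message.content, stopReason, usage) are the ones described above.

```python
def summarize_response(response: dict) -> dict:
    """Pull the text, stop reason, and token counts out of a converse() response."""
    blocks = response["output"]["message"]["content"]
    # Join every text block; tool-use blocks carry no "text" key and are skipped.
    text = "".join(block.get("text", "") for block in blocks)
    usage = response.get("usage", {})
    stop_reason = response["stopReason"]
    return {
        "text": text,
        "stop_reason": stop_reason,
        "truncated": stop_reason == "max_tokens",  # answer was cut off by maxTokens
        "input_tokens": usage.get("inputTokens", 0),
        "output_tokens": usage.get("outputTokens", 0),
    }
```

Logging the returned dict on every call gives you a per-request record of spend drivers, and the truncated flag is a cheap signal that maxTokens needs raising.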

For anything user-facing, switch to streaming so tokens render as they are generated instead of after a multi-second wait. Streaming is the same request via converse_stream; the response carries a stream of events, and you iterate it, pulling text from the contentBlockDelta events as they arrive. The request arguments are the same as in the non-streaming call; only the method name and the way you read the response change, so adding streaming to a working first call is a tiny diff. (The IAM action that authorizes ConverseStream is bedrock:InvokeModelWithResponseStream, already in the section III policy.)

Finally, handle the small set of errors Bedrock surfaces so failures are legible rather than cryptic. The four you will meet most are AccessDeniedException (missing IAM permission or model access not enabled — see sections III/IV), ValidationException (a bad request, most often a wrong or stale modelId), ThrottlingException (you exceeded the model's request/token rate — back off and retry, or move steady high volume to Provisioned Throughput), and ModelTimeoutException / ModelErrorException (a transient model-side issue — retry with backoff). Wrapping calls in a try/except and retrying throttles and timeouts with exponential backoff is the difference between a demo and something that survives real traffic.
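A retry wrapper along those lines could be sketched as follows. Several things here are assumptions made for illustration: the helper name converse_with_retry, the attempt count and delays, and the decision to read the error code from the exception's response attribute (the shape botocore's ClientError uses) so the sketch needs no AWS imports. In real code you would more likely catch botocore.exceptions.ClientError directly.

```python
import random
import time

# Error codes the text above treats as transient and worth retrying.
RETRYABLE_CODES = {"ThrottlingException", "ModelTimeoutException", "ModelErrorException"}


def converse_with_retry(client, max_attempts=4, base_delay=0.5, sleep=time.sleep, **request):
    """Call client.converse(**request), retrying transient errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return client.converse(**request)
        except Exception as exc:
            # botocore's ClientError exposes the service error code here.
            code = getattr(exc, "response", {}).get("Error", {}).get("Code")
            if code not in RETRYABLE_CODES or attempt == max_attempts:
                raise  # permission and validation errors will not fix themselves
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))  # jitter avoids synchronized retries
```

AccessDeniedException and ValidationException are deliberately not retried, since they point at configuration rather than load. The sleep parameter exists only so the backoff can be replaced in tests.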

streaming + reading the response (python / boto3)

# Streaming: same request, iterate the event stream
stream = brt.converse_stream(
    modelId="anthropic.claude-haiku",  # placeholder; use the exact ID from the console
    messages=[{"role": "user", "content": [{"text": "Explain Bedrock in 5 lines."}]}],
    inferenceConfig={"maxTokens": 512},
)["stream"]

for event in stream:
    if "contentBlockDelta" in event:
        # Tool-use deltas carry no "text" key, so fall back to an empty string
        print(event["contentBlockDelta"]["delta"].get("text", ""), end="", flush=True)
print()  # end the streamed line

# Non-streaming: inspect stopReason + token usage for cost
# (resp is the converse() response from the first-call example in section V)
print(resp["stopReason"])  # end_turn | max_tokens | tool_use
print(resp["usage"])       # {inputTokens, outputTokens, totalTokens}
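A streamed call reports its stop reason and token usage too, in later events of the same stream. The sketch below collects them alongside the text; collect_stream is a hypothetical helper, and it assumes the ConverseStream event names messageStop (carrying stopReason) and metadata (carrying usage).

```python
def collect_stream(events) -> dict:
    """Accumulate text, stop reason, and usage from a converse_stream event iterator."""
    parts, stop_reason, usage = [], None, {}
    for event in events:
        if "contentBlockDelta" in event:
            # Text deltas arrive in order; non-text deltas contribute nothing.
            parts.append(event["contentBlockDelta"]["delta"].get("text", ""))
        elif "messageStop" in event:
            stop_reason = event["messageStop"]["stopReason"]
        elif "metadata" in event:
            usage = event["metadata"].get("usage", {})
    return {"text": "".join(parts), "stop_reason": stop_reason, "usage": usage}
```

Pass it the "stream" value returned by converse_stream. In a UI you would print each delta as it arrives and still keep the final stop reason and usage for logging.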

where to go next

VII. Next steps: from a first call to a real workload (and who pays for it)

A working Converse call is the foundation, not the finish line. The path from here splits into capability (what you build on top) and economics (what it costs and how to fund it). Both matter from the first week.

On capability, Bedrock turns raw inference into applications through managed features you configure rather than build. The most common next moves: add tool use (function calling) so the model can call your code, then graduate to Agents for multi-step task orchestration; point a Knowledge Base at documents in S3 for managed retrieval-augmented generation with citations; wrap the model in Guardrails to filter harmful content and redact PII; and use model evaluation to choose the right model on your own data instead of vendor benchmarks. Deep dives live at Bedrock Agents, Knowledge Bases, RAG on AWS, and Guardrails. For the full platform map, the flagship reference is Amazon Bedrock — the complete guide, with model-specific setup at Claude on Bedrock and access mechanics at how to access Amazon Bedrock.

On economics, Bedrock is cheap per call and expensive in aggregate. A quickstart costs pennies; a retrieval-augmented assistant that resends a large system prompt and retrieved context on every turn, serving thousands of users, can move into five or six figures a month faster than teams expect — especially if every call hits a frontier model. The levers are concrete: route cheap, high-volume calls to small models and escalate only hard steps to frontier models; run latency-tolerant work as batch (~50% cheaper); enable prompt caching so repeated context is not re-billed at full price every turn; and reserve Provisioned Throughput only once volume is high and steady. The API mechanics live at the Bedrock API reference and the numbers at the pricing breakdown in the main guide.
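The "cheap per call, expensive in aggregate" arithmetic can be made concrete with a small estimator. Everything numeric below is hypothetical: the per-1K-token prices, the token counts, and the traffic level are placeholders to show the calculation, not actual Bedrock pricing. The function names are illustrative as well.

```python
def estimate_call_cost(usage: dict, input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Dollar cost of one call, computed from the usage block of a converse() response."""
    input_cost = usage["inputTokens"] / 1000 * input_price_per_1k
    output_cost = usage["outputTokens"] / 1000 * output_price_per_1k
    return input_cost + output_cost


def estimate_monthly_cost(cost_per_call: float, calls_per_day: int, days: int = 30) -> float:
    """Scale a per-call cost to a month of steady traffic."""
    return cost_per_call * calls_per_day * days


# HYPOTHETICAL numbers: a RAG-style call that resends a large prompt on every turn.
usage = {"inputTokens": 6000, "outputTokens": 500, "totalTokens": 6500}
per_call = estimate_call_cost(usage, input_price_per_1k=0.001, output_price_per_1k=0.005)
print(f"per call: ${per_call:.4f}")
print(f"per month at 50,000 calls/day: ${estimate_monthly_cost(per_call, 50_000):,.2f}")
```

Swapping in the real prices for your model and the usage numbers you log shows quickly whether input tokens (a case for caching) or model choice (a case for routing) dominates the bill.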

The other lever is funding the bill with AWS's money rather than your own. AWS runs credit programs built precisely for teams standing up generative AI on Bedrock — dedicated Bedrock / GenAI proof-of-concept funding ($10K–$50K) for a defined build, Activate Portfolio (up to $100K) for institutionally-funded startups, and the competitive Generative AI Accelerator (up to $1M) for AI-first companies. These pools are largely partner-filed and invisible on the public Activate page. This is exactly what CloudRoute does: we route you to a vetted AWS partner who files the credit application and, if you want hands, builds the Bedrock workload with you — and because AWS funds both the credits and the partner engagement, you pay $0. See AWS credits for generative-AI startups, AWS PoC / Bedrock POC funding, and $100K AWS credits.

bedrock setup checklist · zero → first call → production

Phase	What you do	Tooling	Done when
Prereqs	Account, pick Region + starter model	AWS console	Region chosen, model decided
Install	AWS CLI v2 + boto3	CLI, pip, venv	aws --version + import boto3 work
Credentials	IAM creds + least-privilege policy	aws configure / SSO, IAM	aws sts get-caller-identity prints your ARN
Model access	Request access per-model, per-Region	Bedrock console	Status reads "Access granted"
First call	converse() on bedrock-runtime	boto3	You print a completion
Harden	Streaming, usage logging, error handling	boto3, CloudTrail	Streams, retries throttles, logs tokens
Scale	Routing, batch, caching, credits	Bedrock features + AWS credits	Cost controlled and/or credit-funded

Setup (prereqs → first call) is typically ~15 minutes. Hardening and scaling are ongoing. The single biggest early cost win is routing cheap calls to small models and turning on prompt caching before traffic grows.

which API for the first call

Converse vs InvokeModel — what to use when setting up

During setup the first real choice is which API to call. For a chat-style first call the answer is almost always Converse — it removes provider-specific request bodies and makes switching models a one-line change. This is the scannable rule.

Dimension	Converse API (recommended)	ConverseStream	InvokeModel (low-level)
Request/response shape	One schema across all chat models	One schema, streamed	Provider-specific JSON per model
Switching models	Change modelId only	Change modelId only	Rewrite the body per provider
Multi-turn + system prompt	First-class	First-class	You assemble it manually
Tool use (function calling)	Built-in	Built-in	Provider-specific
Streaming output	No (use ConverseStream)	Yes	Via InvokeModelWithResponseStream
Best for at setup	Your first chat call + most apps	Responsive / user-facing UIs	Image & embedding models, edge params
IAM action	bedrock:InvokeModel	bedrock:InvokeModelWithResponseStream	bedrock:InvokeModel

Default to Converse for setup and chat workloads. Use ConverseStream for anything a user watches render. Reach for InvokeModel only for non-conversational modalities (image, embeddings) or a provider-specific parameter Converse does not expose. On the IAM side the two API families share actions: bedrock:InvokeModel authorizes both Converse and InvokeModel, and bedrock:InvokeModelWithResponseStream authorizes both streaming variants.

setting up Bedrock for a real build?

Get AWS credits to fund your first Bedrock workload — and a vetted partner to build it. You pay $0.

a recent match

A Bedrock setup that became a funded build — anonymized

inquiry · seed-stage devtools startup, US

Seed-stage developer-tools startup, 7 people, two backend engineers, net-new to AWS, prototyping an AI code-review assistant

Situation: The team had a working Bedrock quickstart running on one engineer's laptop — a single Converse call against a small model — but no path from there to something they could ship. They needed proper IAM (not a personal access key), a Region decision, streaming for the IDE plugin, retrieval over a customer's repo, and a handle on cost before they pointed real traffic at it. With two backend engineers and no ML or AWS specialist, the gap between "first call works" and "production-ready" was the blocker, and they had no budget to burn on inference while they figured it out.

What CloudRoute did: Routed within 19 hours to a US-East AWS partner with a GenAI + developer-tools track record. The partner hardened the setup on Amazon Bedrock: a least-privilege IAM role scoped to specific model ARNs (replacing the laptop access key), ConverseStream wired into the IDE plugin, a Knowledge Base over the repo in S3 for grounded code review, Guardrails for secret/PII redaction, and model routing (a small model for triage, a frontier model only for the hard review steps) with prompt caching on the shared system prompt. In parallel the partner filed a Bedrock/GenAI proof-of-concept credit application and an Activate Portfolio application.

Outcome: GenAI POC credits ($25K) approved in under two weeks and Portfolio ($100K) shortly after — the first several months of inference were fully credit-funded while the product found traction. Streaming code-review assistant in private beta in 4 weeks, cost per review down roughly 60% from routing plus caching versus the all-frontier-model prototype. CloudRoute's commission was paid by the partner from AWS engagement funding; the customer paid $0.

time-to-match: < 24h · credits secured: $125K · cost/review: ~60% lower · cost to customer: $0

faq

Common questions

How do I set up Amazon Bedrock from scratch?

Four steps before code, then one call. (1) Have an AWS account and pick a Region that offers the model you want. (2) Install the AWS CLI v2 and configure credentials with aws configure (or SSO), attaching a least-privilege IAM policy with bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream. (3) In the Bedrock console, go to Model access and request the specific model in that Region; it is off by default. (4) Install boto3 and call converse() on the bedrock-runtime client with your modelId. From an empty account this is roughly 15 minutes.

What IAM permissions do I need to call Bedrock?

At minimum the runtime actions you use: bedrock:InvokeModel, which authorizes both Converse and InvokeModel, and bedrock:InvokeModelWithResponseStream, which authorizes ConverseStream and the streaming variant of InvokeModel. Add the read-only bedrock:ListFoundationModels if you want to enumerate models from code. Scope the policy's Resource down to specific model ARNs in production so a principal can call only the models it needs. Note that IAM permission is separate from model access; you need both.

Why do I get AccessDeniedException on my first Bedrock call?

Almost always one of two causes. Either your IAM principal is missing the Bedrock action (bedrock:InvokeModel, which also covers Converse), or, more commonly, you have not requested model access for that model in that Region. Bedrock ships with every model disabled; enable it under Model access in the console for the exact Region you are calling. Both fixes take a couple of minutes. If instead you see ValidationException about the model, the modelId is wrong or stale; copy the exact current ID from the console.

Do I need to install anything besides boto3?

Strictly, no — boto3 (which bundles botocore with the Bedrock service models) is enough to call the API from Python, and credentials can come from environment variables. In practice you also want the AWS CLI v2, because it is the easiest way to store credentials with aws configure and to verify them with aws sts get-caller-identity before involving Bedrock at all. For other languages, use the corresponding AWS SDK (JavaScript, Java, Go, .NET) — the Converse flow is identical.

Should I use the Converse API or InvokeModel for my first call?

Use Converse. It gives one consistent request and response schema across every chat model, with built-in multi-turn conversation, system prompts, tool use, and (via converse_stream) streaming — so switching models is just changing modelId. InvokeModel is the older low-level API with provider-specific request bodies; reach for it only for non-conversational modalities like image or embedding models, or for a provider-specific parameter Converse does not expose.

Which Region should I set up Bedrock in?

Pick a Region on three criteria: it offers the model you want, it meets any data-residency requirement (your prompts and completions are processed in the Region you call), and it is close to your users for latency. Frontier models often appear in US Regions (us-east-1, us-west-2) before eu-central-1 or ap-southeast-1. Remember model access is per-Region — enabling a model in one Region does not enable it elsewhere. If a model is missing in your Region, use a cross-region inference profile or choose a Region that has it.

How do I add streaming to a Bedrock call?

Use converse_stream instead of converse; the request arguments are identical. The response carries a stream of events; iterate it and pull text from each contentBlockDelta event as it arrives (event["contentBlockDelta"]["delta"]["text"]). This renders tokens live, which is what you want for any user-facing UI. The IAM action that authorizes it is bedrock:InvokeModelWithResponseStream, the same action used by the low-level InvokeModelWithResponseStream call.

How much does it cost to set up and test Bedrock?

Setup itself is free — there is no platform fee and nothing runs until you call a model. You then pay per token: small models cost a fraction of a cent per 1,000 tokens, so a quickstart of a few test calls costs pennies. Real workloads scale fast, which is why routing to small models, batch (~50% off), and prompt caching matter. AWS credits can fund the bill entirely — Bedrock/GenAI POC ($10K–$50K), Activate Portfolio (up to $100K), and the GenAI Accelerator (up to $1M); CloudRoute routes you to a partner who files them, and you pay $0.

Past the first Converse call? Let AWS credits fund the rest.

CloudRoute routes you to a vetted AWS partner who files your Bedrock/GenAI credit application (Bedrock/GenAI POC $10K–$50K, Activate Portfolio up to $100K, GenAI Accelerator up to $1M) and, if you need hands, builds the workload with you. AWS funds the credits and the engagement. You pay $0.

matched within: < 24h · GenAI credit ceiling: up to $1M · cost to you: $0