A complete, neutral build guide for generating images with AI on AWS in 2026, end to end through Amazon Bedrock: which model to pick (Stability AI Stable Image, Amazon Nova Canvas, Amazon Titan Image Generator), the InvokeModel image API and request body, prompting that actually works, image-to-image / variations / inpainting / outpainting, batch generation for volume, storing outputs in Amazon S3 and serving them over CloudFront, what each image really costs, content safety and invisible watermarking, and the marketing and e-commerce use cases that justify the build — plus how AWS credits make the whole thing $0.
There are two ways to generate images with AI on AWS: rent GPU instances and run an open image model yourself, or call a managed image model through Amazon Bedrock. For almost every product team the managed path wins, and understanding why frames the whole rest of this build.
The self-hosted path means provisioning GPU instances (the scarce, expensive accelerator families), installing and serving an open image model, building autoscaling so you are not paying for idle GPUs between bursts, and owning the operational tail — driver versions, model weights, cold starts, queueing. It buys you maximum control and can be cheaper at very high, very steady volume, but it is a real infrastructure project before you generate a single useful image.
The Amazon Bedrock path collapses all of that. Bedrock is AWS's fully-managed service for calling foundation models through one API, and image models are first-class citizens on it. There are no GPUs, servers, or clusters for you to run — AWS operates the inference fleet behind the API. You request access to an image model, call it with a prompt, and get an image back, billed per generated image. The same IAM, VPC (PrivateLink), KMS, and CloudTrail controls that govern the rest of your AWS account apply, and your prompts and generated images stay in your account and region and are not used to train the base models.
The second advantage is model choice without re-integration. Bedrock hosts three image-model families side by side — Stability AI (the Stable Diffusion / Stable Image line), Amazon Nova Canvas (Amazon's current image model in the Nova family), and the older Amazon Titan Image Generator. Because they all sit behind the same Bedrock invoke path, moving a request from one model to another is largely a change of model ID. You can prototype on the cheapest model, escalate hero images to the highest-quality one, and switch providers entirely if a new model ships — without rebuilding your pipeline.
The honest framing: choose self-hosted GPUs only if you have a specific open model, a niche fine-tune Bedrock does not offer, or sustained volume high enough that reserved GPU capacity clearly beats per-image pricing — and even then, weigh it against Bedrock Provisioned Throughput, which reserves dedicated image-model capacity without you operating GPUs. For everyone else, Bedrock is the default, and the rest of this page is the build on top of it.
An AI image generator on AWS = Amazon Bedrock (managed image models — Stability AI, Nova Canvas, Titan Image Generator) for generation, Amazon S3 for storage, Amazon CloudFront for delivery, optional AWS Lambda / batch for orchestration, and Bedrock Guardrails + invisible watermarking for safety — no GPUs to manage, billed per image, your data stays in your account.
The first and most consequential decision is which image model generates your pictures. Bedrock gives you three families, and the right answer is workload-specific — the practical discipline is to generate the same handful of representative prompts on each and compare quality, editing support, and per-image cost before committing.
A short orientation to the three families. Stability AI supplies the Stable Diffusion / Stable Image line: the SDXL-era Stable Diffusion model plus the newer Stable Image generation tiers — Stable Image Core (fast, low-cost), Stable Diffusion 3.x (strong general default with good prompt adherence and typography), and Stable Image Ultra (highest quality, for photorealism and hero assets). It is the deepest third-party image lineup on Bedrock and a strong choice when you want its particular aesthetic or its draft-to-final tiering. Amazon Nova Canvas is Amazon's current first-party image model, part of the Nova family, with built-in editing (including inpainting and outpainting) and built-in invisible watermarking for provenance. Amazon Titan Image Generator is Amazon's older first-party image model — still capable, with mature editing (inpaint, outpaint, background replacement, variation) and invisible watermarking — but for most new builds Nova Canvas supersedes it as the Amazon option.
The decision rarely comes down to one model for the whole product. The dominant cost-and-quality pattern mirrors text-model routing: use a cheap, fast tier for the high-volume majority — drafts, thumbnails, contact sheets, iteration — and reserve a high-quality tier for the small minority of hero and final assets where quality dominates and the higher per-image price is justified. A generate-many-cheap-drafts, finish-the-chosen-one-on-a-premium-model workflow is easy to build precisely because switching is a model-ID change.
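The draft/final split can be expressed as a small routing table. The sketch below is illustrative only: the environment variable names (`IMAGE_MODEL_DRAFT`, `IMAGE_MODEL_FINAL`) are hypothetical, and the placeholder values are not real model IDs. Read the current IDs from the Bedrock model catalog.

```python
import os

# Hypothetical defaults; the placeholder values are NOT real model IDs.
# Read the current IDs from the Bedrock model catalog and set them in config.
DEFAULT_MODEL_IDS = {
    "draft": "<cheap-tier-model-id>",
    "final": "<premium-tier-model-id>",
}


def model_id_for(purpose, env=None):
    """Return the model ID configured for an asset purpose ('draft' or 'final')."""
    env = os.environ if env is None else env
    if purpose not in DEFAULT_MODEL_IDS:
        raise ValueError(f"unknown purpose: {purpose!r}")
    # An environment override (assumed naming) wins over the default.
    return env.get(f"IMAGE_MODEL_{purpose.upper()}", DEFAULT_MODEL_IDS[purpose])
```

Keeping the mapping in configuration means that promoting a new model to the "final" tier is a deploy-time change rather than a code change.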
For the deep single-model references, see the sibling pages: Stable Diffusion on Amazon Bedrock for the Stability line, Amazon Nova (which includes Nova Canvas) for Amazon's current image model, and Amazon Titan for the older Titan Image Generator. The table below is the build-time summary.
| Model family | Provider | Editing (inpaint/outpaint) | Watermark | Relative cost | Reach for it when |
|---|---|---|---|---|---|
| Stable Image Core | Stability AI | Limited (gen-focused) | Provider-dependent | Lowest | High-volume drafts, thumbnails, fast iteration |
| Stable Diffusion 3.x | Stability AI | Via edit endpoints | Provider-dependent | Mid | Strong general default — prompt adherence, composition, text |
| Stable Image Ultra | Stability AI | Via edit endpoints | Provider-dependent | Highest (Stability) | Hero images, photorealism, final marketing assets |
| Amazon Nova Canvas | Amazon | Yes (built-in) | Built-in (invisible) | Low–mid | Amazon-native image gen + editing with provenance |
| Amazon Titan Image Generator | Amazon | Yes (mature: + background) | Built-in (invisible) | Low–mid | Create-then-edit pipelines; legacy Titan workloads |
Image generation on Bedrock uses the per-model InvokeModel action, not the conversational Converse API. The path from a fresh account to a first generated image is short: enable model access, attach a scoped IAM policy, get the current model ID, and POST a model-specific JSON body. Here are the mechanics, step by step.
The reason image generation uses InvokeModel rather than Converse is that Converse standardizes a chat schema across text/multimodal models, while image models take a model-specific request body (prompt plus generation parameters) and return an image payload. So image generation is the per-model invoke path: you choose the model with its model ID and send the body that model expects.
In the Bedrock console, open Model access and request the image models you intend to use (the Stability AI entries, Amazon Nova Canvas, and/or Titan Image Generator). For first-party Amazon models access is typically immediate; Stability models are usually granted quickly too. There is no charge for enabling access — you only pay when you generate an image. Access is per-region, so enable each model in every region you will call from, and confirm the model is actually offered in that region (image-model availability is narrower than the big text models).
The principal making the call (a Lambda function, ECS task, or backend role) needs permission for bedrock:InvokeModel on the specific image-model ARNs you use. Scope the policy to those model IDs rather than granting blanket Bedrock access. Every call is logged to CloudTrail, traffic can be kept off the public internet via VPC endpoints (PrivateLink), and you can encrypt with your own KMS keys — the same governance that covers the rest of your Bedrock usage applies to image generation, so prompts and outputs stay inside your account and region.
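A scoped policy of the kind described above might be generated like this. This is a minimal sketch: the `Sid` value and the function name are illustrative, and the model IDs passed in should be the ones you actually enabled.

```python
import json


def invoke_policy(region, model_ids):
    """Build an IAM policy document allowing InvokeModel on specific models only."""
    # Foundation-model ARNs leave the account field empty.
    resources = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for model_id in model_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeImageModelsOnly",  # illustrative statement ID
                "Effect": "Allow",
                "Action": "bedrock:InvokeModel",
                "Resource": resources,
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ID; substitute the current value from the model catalog.
    print(json.dumps(invoke_policy("us-east-1", ["<current-image-model-id>"]), indent=2))
```

Listing explicit ARNs rather than `*` keeps the role from invoking models you have not evaluated or budgeted for.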
Every model is invoked by a model-ID string identifying provider, model, and version — namespaced under the provider (for example stability.…, amazon.nova-canvas-…, or amazon.titan-image-generator-… with a version suffix). IDs advance as new model versions ship, so read the current value from the Bedrock model catalog in the console or list it via the API/CLI, and treat it as configuration rather than a literal in code. Switching models later is then a config change.
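One way to read the current IDs programmatically is the control-plane `list_foundation_models` call filtered by output modality. The helper below is a sketch; its name is illustrative, and it takes the client as a parameter so it can be exercised without AWS credentials.

```python
def image_model_ids(bedrock_client, provider=None):
    """Return the IDs of models with IMAGE output in the client's region."""
    kwargs = {"byOutputModality": "IMAGE"}
    if provider:
        kwargs["byProvider"] = provider
    response = bedrock_client.list_foundation_models(**kwargs)
    return sorted(summary["modelId"] for summary in response["modelSummaries"])


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   bedrock = boto3.client("bedrock", region_name="us-east-1")  # not "bedrock-runtime"
#   print(image_model_ids(bedrock))
```

Note that listing uses the `bedrock` client, while generation uses `bedrock-runtime`. A model appearing in the list does not by itself mean access has been granted in that region.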
Call InvokeModel with a JSON body containing the prompt and the model's generation parameters. The response carries the generated image, typically as a base64-encoded string, which you decode and store. The body schema differs by model: the SDXL-era Stability model exposes parameters like cfg_scale, steps, seed, style_preset, and dimensions; the newer Stable Image and the Amazon models take a simpler body centered on the prompt, an aspect ratio or dimensions, an optional negative prompt, a seed, the number of images, and an output format. The Amazon models nest these fields under a task type and a generation-config object rather than at the top level. Read the chosen model's parameter reference for the exact fields. The shape of a call (in Python with boto3) is below.
import base64
import json

import boto3

# bedrock-runtime is the data-plane client that exposes InvokeModel.
brt = boto3.client("bedrock-runtime", region_name="us-east-1")

# Request body in the newer Stable Image style; field names vary by model.
body = json.dumps({
    "prompt": "studio product photo of a ceramic coffee mug, soft daylight, white seamless background",
    "negative_prompt": "text, watermark, hands",
    "aspect_ratio": "1:1", "seed": 42, "output_format": "png",
})
resp = brt.invoke_model(modelId="<current-image-model-id>", body=body)  # read ID from the model catalog
payload = json.loads(resp["body"].read())
img_b64 = payload["images"][0]  # field name varies by model
# Decode the base64 string and write the file, closing the handle when done.
with open("out.png", "wb") as f:
    f.write(base64.b64decode(img_b64))
The exact request fields and the response field holding the image differ by model — copy the current model ID and parameter names from the Bedrock model catalog / parameter reference. The pattern (InvokeModel → JSON body → base64 image out) is the same across image models.
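To make the per-model differences concrete, the sketch below builds a text-to-image body for two families and pulls the image out of the response. The field names reflect the request and response shapes as commonly documented for these families, but treat them as assumptions and check the parameter reference for your model version. The family labels (`"stable-image"`, `"amazon"`) are hypothetical.

```python
import base64
import json


def build_body(family, prompt, negative="", seed=0):
    """Serialize a text-to-image request body for one model family (assumed schemas)."""
    if family == "stable-image":  # flat body, same fields as the call shown above
        body = {"prompt": prompt, "seed": seed, "aspect_ratio": "1:1", "output_format": "png"}
        if negative:
            body["negative_prompt"] = negative
    elif family == "amazon":  # Nova Canvas / Titan Image Generator style (nested)
        text_params = {"text": prompt}
        if negative:
            text_params["negativeText"] = negative
        body = {
            "taskType": "TEXT_IMAGE",
            "textToImageParams": text_params,
            "imageGenerationConfig": {
                "numberOfImages": 1, "width": 1024, "height": 1024, "seed": seed,
            },
        }
    else:
        raise ValueError(f"unknown family: {family!r}")
    return json.dumps(body)


def first_image_bytes(payload):
    """Decode the first image from a response payload, whichever field holds it."""
    if payload.get("images"):  # newer Stability and Amazon responses
        return base64.b64decode(payload["images"][0])
    if payload.get("artifacts"):  # SDXL-era Stability response
        return base64.b64decode(payload["artifacts"][0]["base64"])
    raise KeyError("no image found in response payload")
```

Isolating the body builder and the response parser behind two functions is what keeps a model switch close to a configuration change in the rest of the pipeline.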
Model choice sets the ceiling; prompting determines how close you get to it. The newer image models reward clear, structured natural-language prompts more than the keyword-soup style older Stable Diffusion communities relied on. A few durable practices get you most of the way, and they are the same regardless of which Bedrock model you call.
Write the prompt as a description of the finished image, in plain language, ordered from the most important element to the least: subject first, then the action or pose, then setting, then lighting, then style and medium, then camera or composition details. "A ceramic coffee mug on a marble counter, morning light from the left, shallow depth of field, product photography, 50mm" gives the model far more to work with than a pile of disconnected tags. The newer Stable Image and Nova/Titan models in particular are notably better at following multi-part prompts and rendering legible typography than older models, so spelling out the structure pays off.
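The ordering advice above can be captured in a small helper so every prompt is assembled the same way. This is an illustrative sketch; the `PromptSpec` class and its field names are hypothetical, not part of any Bedrock API.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Parts of a prompt, declared in priority order (most important first)."""
    subject: str
    action: str = ""
    setting: str = ""
    lighting: str = ""
    style: str = ""
    camera: str = ""

    def render(self):
        """Join the non-empty parts into one comma-separated description."""
        parts = [self.subject, self.action, self.setting,
                 self.lighting, self.style, self.camera]
        return ", ".join(part.strip() for part in parts if part.strip())


spec = PromptSpec(
    subject="A ceramic coffee mug",
    setting="on a marble counter",
    lighting="morning light from the left",
    style="product photography",
    camera="50mm",
)
print(spec.render())
```

Empty slots are skipped, so the same structure works for a quick draft with only a subject and for a fully specified final.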
Lean on the controls the models expose. A negative prompt tells the model what to exclude (extra fingers, text artifacts, a cluttered background) and is one of the highest-leverage knobs. A seed makes generation reproducible — fix the seed to iterate on a prompt while holding the composition steady, or vary it to explore alternatives. Aspect ratio / dimensions should match the asset's end use (square for catalog tiles, wide for banners). On the SDXL-era Stability model you also get cfg_scale (how strictly to follow the prompt — higher is more literal, lower is more creative), a step count (more steps = more refinement and more cost), and named style presets.
Treat prompting as an iteration loop, not a one-shot. Generate a small contact sheet of candidates per prompt (varying the seed), pick the best composition, then refine — tighten the prompt, adjust the negative prompt, or move to image-to-image to evolve the chosen image. For production, store your best prompts as reusable templates with slots (subject, setting, style) so non-experts on the team can generate on-brand imagery without re-discovering what works. A practical, model-agnostic prompting walkthrough lives in the cornerstone guide linked at the foot of this page.
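A contact sheet built from a reusable template might look like the following sketch. The template text and slot names are hypothetical, and the body fields follow the flat Stable Image style used earlier; adapt them to the model you call.

```python
from string import Template

# Hypothetical on-brand template; slot names are illustrative.
CATALOG_TEMPLATE = Template("studio product photo of $subject, $setting, $style")


def contact_sheet_bodies(template, slots, base_seed, count):
    """Return `count` request bodies that share a prompt and differ only in seed."""
    prompt = template.substitute(slots)  # raises KeyError if a slot is missing
    return [
        {"prompt": prompt, "seed": base_seed + offset,
         "aspect_ratio": "1:1", "output_format": "png"}
        for offset in range(count)
    ]


bodies = contact_sheet_bodies(
    CATALOG_TEMPLATE,
    {"subject": "a ceramic coffee mug", "setting": "white seamless background",
     "style": "soft daylight"},
    base_seed=42,
    count=4,
)
```

Because each candidate records its own seed, the chosen composition can be regenerated later or re-rendered on a higher tier with the same prompt.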
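A real image generator is rarely text-in, one-image-out. The capabilities that make it useful for production are the editing and transformation operations, which take an existing image and change, extend, or vary it: image-to-image (evolve a source image toward a prompt), variations (alternatives to a chosen image), inpainting (regenerate a masked region), outpainting (extend the canvas beyond the original borders), and, on some models, background replacement. Exactly which operations a model exposes varies, so confirm support for the model you pick.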
These operations compose into the real production loop: generate a base image, produce variations to choose from, inpaint to fix or swap details, outpaint to fit new formats, and use image-to-image to keep a set on-brand. Designing your pipeline around "generate then edit" rather than "generate only" is what turns a demo into something a marketing or catalog team will actually use day to day. Availability differs by model — some edits are separate endpoints or require a dedicated edit model — so verify the operation set for your chosen model on Bedrock before building UI around it.
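As one example of an edit operation, an inpainting request in the Amazon image-model style might be assembled as below. The `taskType` and parameter names are assumptions based on the documented Titan / Nova Canvas request shape; verify them against the current parameter reference, and note that Stability edits use a different schema.

```python
import base64
import json


def inpaint_body(image_bytes, mask_prompt, replacement, seed=0):
    """Build an inpainting request in the Amazon image-model style (assumed schema)."""
    return json.dumps({
        "taskType": "INPAINTING",
        "inPaintingParams": {
            # The source image travels as base64 text inside the JSON body.
            "image": base64.b64encode(image_bytes).decode("ascii"),
            "maskPrompt": mask_prompt,  # plain-language description of the region to change
            "text": replacement,        # what to render inside that region
        },
        "imageGenerationConfig": {"numberOfImages": 1, "seed": seed},
    })
```

The result is sent through the same InvokeModel call as a text-to-image request; only the body changes, which is why edit steps slot into the existing pipeline.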
Generating one image is an API call; running an image generator as a product means generating many, putting them somewhere durable, and delivering them fast. This is the AWS-native plumbing around the model — and it is where the build becomes a real pipeline.
For high-volume, latency-tolerant work — generating a catalog of product variants overnight, pre-rendering a library of marketing assets, enriching a dataset — do not fan out thousands of synchronous calls from a web request. Drive generation from a queue: enqueue prompts (for example to Amazon SQS), have AWS Lambda or AWS Batch workers pull jobs and call InvokeModel with controlled concurrency, and write results to S3. This decouples spikes from your app, lets you respect Bedrock throughput limits, and makes retries clean. (Note that Bedrock's named "Batch inference" feature is primarily a text/token concept; for images, "batch" means your own asynchronous, queue-driven generation pattern.) If sustained volume is high and steady, Provisioned Throughput can reserve dedicated image-model capacity for predictable cost and latency.
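The queue-driven worker can be sketched as an SQS-triggered Lambda handler. The example assumes a hypothetical `generate(job)` callable that invokes the model and stores the result, and it assumes partial batch responses (`ReportBatchItemFailures`) are enabled on the event source mapping so that only failed messages are retried.

```python
import json


def make_handler(generate):
    """Wrap a generate(job) callable as an SQS-triggered Lambda handler."""

    def handler(event, context):
        failures = []
        for record in event.get("Records", []):
            try:
                job = json.loads(record["body"])  # e.g. {"prompt": ..., "seed": ...}
                generate(job)
            except Exception:
                # Report only this message as failed; the rest of the batch is deleted.
                failures.append({"itemIdentifier": record["messageId"]})
        return {"batchItemFailures": failures}

    return handler
```

Concurrency against Bedrock is then governed outside the code, for example by the event source mapping's maximum concurrency setting, so a burst of queued jobs does not translate into a burst of throttled InvokeModel calls.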
The natural home for generated images is Amazon S3. Decode the base64 payload from InvokeModel and write the object to a bucket, with a key scheme that encodes what you need to find it later (for example generated/{tenant}/{prompt-hash}/{seed}.png). Store the generation metadata alongside it — prompt, negative prompt, model ID, seed, parameters — either as S3 object metadata/tags or in a small DynamoDB table, so any image is reproducible and auditable. Use S3 lifecycle rules to expire throwaway drafts and keep finals, enable default encryption (SSE-S3 or SSE-KMS), and keep the bucket private — generated images should not be world-readable by default.
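The key scheme and metadata write described above could be implemented along these lines. The function names and metadata keys are illustrative; the S3 client is passed in so the logic can be tested without AWS.

```python
import hashlib


def object_key(tenant, prompt, seed):
    """Build generated/{tenant}/{prompt-hash}/{seed}.png from the request."""
    prompt_hash = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:16]
    return f"generated/{tenant}/{prompt_hash}/{seed}.png"


def store_image(s3_client, bucket, tenant, prompt, seed, model_id, image_bytes):
    """Write the PNG to S3 with reproducibility metadata and return its key."""
    key = object_key(tenant, prompt, seed)
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=image_bytes,
        ContentType="image/png",
        ServerSideEncryption="aws:kms",  # or "AES256" for SSE-S3
        # User metadata values must be short strings, so store a hash here
        # and keep the full prompt in a DynamoDB item keyed the same way.
        Metadata={
            "model-id": model_id,
            "seed": str(seed),
            "prompt-sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        },
    )
    return key
```

Hashing the prompt keeps keys short and free of user-supplied characters, while the seed in the file name lets several candidates for the same prompt sit side by side.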
Serve from S3 through Amazon CloudFront (the CDN) for low-latency global delivery and caching, with the S3 bucket kept private behind an origin access control so objects are only reachable via the distribution. For private or per-user assets, generate S3 presigned URLs (or signed CloudFront URLs) that grant time-limited access to a specific object without making the bucket public. A common shape: backend writes the image to S3, records metadata, and returns a presigned or CloudFront URL to the client. If you need on-the-fly resizing or format conversion (WebP/AVIF for the web), add a transformation step between S3 and the viewer, such as Lambda@Edge or an image-optimization layer. CloudFront Functions can normalize the request URL or cache key for that step, but they do not transform the image bytes themselves.
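For the presigned-URL path, a minimal sketch using the S3 client's `generate_presigned_url` method is shown below. The 15-minute default is an arbitrary illustrative choice, not a recommendation from the source.

```python
def presigned_image_url(s3_client, bucket, key, ttl_seconds=900):
    """Return a time-limited GET URL for one private object."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=ttl_seconds,
    )


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   url = presigned_image_url(boto3.client("s3"), "my-bucket", "generated/acme/abc/42.png")
```

The URL is signed with the caller's credentials, so the role that generates it needs read access to the object; the bucket itself stays private.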
Request → queue (SQS) → worker (Lambda/Batch) → Bedrock InvokeModel → decode base64 → write to S3 (+ metadata in DynamoDB) → serve via CloudFront / presigned URL. Generation is per-image and serverless; storage is S3; delivery is CloudFront. This same skeleton scales from a prototype to a catalog-scale generator — you change concurrency and add Provisioned Throughput, not architecture.
Image generation on Bedrock is billed per generated image, not per token — a structural difference from the text models elsewhere in this cluster. There is no input/output token meter and no prompt caching; the unit of cost is one finished image at a given model and quality tier. Knowing the levers keeps the bill sane.
The cost of running the generator has two parts: generation (the per-image Bedrock charge) and the supporting services (S3 storage, CloudFront delivery, Lambda/queue compute, any DynamoDB metadata). Generation dominates at scale. For the SDXL-era Stability model, the per-image price scales with step count and resolution (more steps and pixels = more compute = higher price); the newer Stable Image tiers and the Amazon models use a flatter per-image price by model. The representative table below ranks the tiers so you can sanity-check a budget — it is for relative comparison, not an audited price sheet.
Three levers move the generation bill, in order of impact. First, model/tier choice — using a cheap tier for the high-volume majority and a premium tier only for finals can change the total severalfold. Second, resolution and step count — generate drafts small and cheap, render finals large. Third, and most often overlooked, how many candidate images you generate per prompt — asking for four variations costs four images, so the contact-sheet habit is a direct multiplier. The image analogue of text-model routing is "draft many cheap, finish few premium": generate a contact sheet on a low-cost model, then re-render only the chosen composition on a high-quality one.
| Model / tier | Billing basis | Representative price / image | Relative cost | Use in the pipeline |
|---|---|---|---|---|
| Stable Image Core | Flat per image | ~$0.04 | Lowest | Drafts, thumbnails, contact sheets |
| SDXL-era Stable Diffusion | Steps × resolution | ~$0.04 → ~$0.08 | Low–mid (tunable) | Controllable, cost-sensitive generation |
| Amazon Nova Canvas | Per image (by size/quality) | ~$0.04–$0.08 | Low–mid | Amazon-native gen + editing default |
| Amazon Titan Image Generator | Per image (by size/quality) | ~$0.01–$0.08 | Low–mid | Create-then-edit + background ops |
| Stable Diffusion 3.x | Flat per image | ~$0.06–$0.08 | Mid | Strong general default for finals |
| Stable Image Ultra | Flat per image | ~$0.12–$0.14 | Highest | Hero / photorealistic final assets |
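Using the representative prices in the table above, the effect of "draft many cheap, finish few premium" can be worked through in a few lines. The volumes are hypothetical, and the prices are the table's illustrative figures (about $0.04 for a draft tier, about $0.14 for the top tier), expressed in whole cents to avoid floating-point drift.

```python
def monthly_generation_cents(prompts, candidates_per_prompt, draft_cents,
                             final_cents, finals_per_prompt=1):
    """Estimate generation spend in cents for a draft-then-finish workflow."""
    drafts = prompts * candidates_per_prompt * draft_cents
    finals = prompts * finals_per_prompt * final_cents
    return drafts + finals


# Hypothetical month: 10,000 prompts, 4 draft candidates each, 1 premium final each.
tiered = monthly_generation_cents(10_000, 4, draft_cents=4, final_cents=14)
# Same contact-sheet habit, but every candidate rendered on the premium tier.
all_premium = 10_000 * 4 * 14

print(f"tiered: ${tiered / 100:,.2f}  all-premium: ${all_premium / 100:,.2f}")
# prints: tiered: $3,000.00  all-premium: $5,600.00
```

Under these assumptions the tiered workflow costs a little over half of the all-premium one, and the candidate count per prompt is visibly the largest multiplier in both totals.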
Shipping generated imagery in a product means thinking about what the model can be prompted to produce, how you mark AI-generated output, and the licensing of what you generate. On Bedrock these are addressable with built-in features rather than bolted-on services.
Content safety. A user-facing generator needs guardrails on both the prompt and the output. Apply input filtering to block disallowed prompts, and consider an output-moderation pass — Amazon Bedrock Guardrails provides a configurable content-safety layer (and can be combined with image moderation via Amazon Rekognition for detecting unsafe imagery) so you are not relying solely on the model's own refusals. Define your policy explicitly (what categories are blocked), log decisions to CloudTrail, and keep a human-review path for edge cases. See Amazon Bedrock Guardrails for the safety layer in depth.
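A two-stage gate of the kind described above, screening the prompt before generation and the image after it, might be sketched as follows. It assumes a guardrail already exists (the ID and version are yours to supply) and uses the Bedrock runtime `apply_guardrail` call and Rekognition's `detect_moderation_labels`; the blocked category names are hypothetical and depend on your policy and on the moderation label taxonomy in use.

```python
def prompt_allowed(runtime_client, guardrail_id, guardrail_version, prompt):
    """Screen a prompt with a Bedrock guardrail before spending on generation."""
    result = runtime_client.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source="INPUT",
        content=[{"text": {"text": prompt}}],
    )
    return result["action"] != "GUARDRAIL_INTERVENED"


def image_allowed(rekognition_client, image_bytes, blocked, min_confidence=60.0):
    """Reject an image if Rekognition returns any label in the blocked set."""
    result = rekognition_client.detect_moderation_labels(
        Image={"Bytes": image_bytes}, MinConfidence=min_confidence
    )
    for label in result["ModerationLabels"]:
        # Match on the label itself or on its parent category.
        if label["Name"] in blocked or label.get("ParentName") in blocked:
            return False
    return True
```

Running the prompt check first avoids paying for an image that would be discarded, and anything either check rejects can be routed to the human-review path rather than silently dropped.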
Watermarking and provenance. As AI-generated media becomes regulated, marking output matters. Amazon Nova Canvas and Amazon Titan Image Generator add a built-in invisible watermark to every image they generate, and Bedrock provides a way to detect it — useful for content authenticity, internal provenance, and compliance. (Industry provenance standards such as C2PA-style content credentials are increasingly relevant too.) If your use case has any regulatory or authenticity requirement, the built-in watermarking on the Amazon image models is a concrete reason to prefer them, or to add a watermarking step regardless of model.
Licensing and commercial use. Confirm the commercial-use terms for the specific model you generate with — the model providers set licensing terms for generated images, and they differ by provider and model. Because Bedrock runs the models inside AWS, your prompts and reference images are not used to train the base models and stay in your account and region, which resolves the data-governance side; the commercial-use side is a per-model license question to verify before you ship generated imagery in a paid product. When in doubt, check the model card and the provider's terms in the Bedrock console.
(1) Filter prompts and moderate outputs (Bedrock Guardrails + optionally Rekognition). (2) Watermark AI-generated images — built-in (invisible) on Nova Canvas and Titan Image Generator, with Bedrock detection. (3) Verify commercial-use licensing for the specific model. (4) Keep prompts and outputs in-region, encrypted, and logged (IAM + KMS + CloudTrail). Your data is never used to train the base models.
The build is justified by what it produces. Two use cases dominate the ROI for an AI image generator on AWS — marketing content production and e-commerce imagery — and both map cleanly onto the generate-then-edit pipeline above.
Marketing teams burn time and budget producing visual variations: ad creative in a dozen formats, social posts, blog headers, landing-page hero images, localized variants. An image generator turns that into a templated workflow: store on-brand prompt templates, generate a contact sheet of options on a cheap tier, pick and refine, then render finals on a high-quality tier, and use outpainting to adapt one approved composition into every ad format. The payoff is speed (hours not days), volume (test many creatives cheaply), and consistency (brand templates rather than ad-hoc prompts). The cost discipline from the pricing discussion above keeps a high-volume creative pipeline economical.
E-commerce is the other heavy hitter. Background replacement puts a product shot on a clean white catalog background or into a lifestyle scene; inpainting swaps a product color or removes a blemish; image-to-image generates on-brand variants of a hero shot across a catalog; outpainting adapts one shot into thumbnail, gallery, and banner formats. For large catalogs this is exactly the batch-generation pattern: drive thousands of product variants through a queue overnight, store each in S3 keyed by SKU, and serve over CloudFront. It replaces a slow, expensive photo-and-retouch pipeline for a large fraction of catalog imagery, with the human team focused on the hero shots that still warrant a real photographer.
Both use cases share the same shape: templated generation, cheap drafts, premium finals, editing for fit, S3 + CloudFront for storage and delivery, and watermarking/safety for compliance. That is the point of building on Bedrock: one pipeline, swappable models, serving multiple use cases, with no GPU fleet to operate. The remaining question is usually not feasibility but funding the generation bill at volume, which is where AWS credits come in.
Per-image generation is cheap individually and meaningful in aggregate: a catalog-scale or high-volume marketing generator can run into four or five figures a month once it is in production. The good news is that image generation on Bedrock is fully credit-eligible, and AWS runs credit programs designed for exactly this kind of GenAI build.
AWS funds generative-AI builds through several pools, most of them partner-filed and invisible on the public Activate page: Activate Portfolio (up to $100K) for institutionally-funded startups, dedicated Bedrock / GenAI proof-of-concept funding ($10K–$50K) for a defined GenAI build, and the competitive Generative AI Accelerator (up to $1M) for AI-first companies. Credits apply against your AWS bill — including per-image Bedrock generation, S3, CloudFront, and the rest of the pipeline — so a credit pool can cover the first several months of an image generator outright.
This is exactly what CloudRoute does. We route you to a vetted AWS partner who files the credit application via the AWS ACE program and, if you want hands, builds the image pipeline with you — model selection and routing, the InvokeModel integration, S3 storage and CloudFront delivery, batch generation, and content-safety/watermarking. Because AWS funds both the credits and the partner engagement, the customer pays $0 — there is no invoice from CloudRoute or the partner in your path. See AWS credits for generative-AI startups, AWS PoC / Bedrock POC funding, and $100K AWS credits for the mechanics.
The central build decision is which image model generates your pictures. This is the scannable map of the three families by relative cost, editing support, watermarking, and what each is for. Cost is relative ($ cheapest → $$$$ premium); exact per-image rates live on the AWS pricing page, and a head-to-head eval on your own prompts beats any table.
| Model | Provider | Relative cost | Editing strength | Built-in watermark | Reach for it when |
|---|---|---|---|---|---|
| Stable Image Core | Stability AI | $ | Generation-focused | Provider-dependent | High-volume drafts, thumbnails, iteration |
| Stable Diffusion 3.x | Stability AI | $$ | Via edit endpoints | Provider-dependent | Strong general default; prompt adherence + text |
| Stable Image Ultra | Stability AI | $$$$ | Via edit endpoints | Provider-dependent | Hero, photorealistic, final assets |
| Amazon Nova Canvas | Amazon | $$ | Built-in (inpaint/outpaint) | Yes (invisible) | Amazon-native gen + editing with provenance |
| Amazon Titan Image Generator | Amazon | $$ | Mature (+ background) | Yes (invisible) | Create-then-edit pipelines; background ops |
Situation: The team wanted to offer merchants on-demand product imagery — background replacement onto clean catalog and lifestyle scenes, color/variant swaps, and multi-format adaptation — generated at catalog scale. They had no ML infrastructure and no GPU budget, were worried about the per-image bill once thousands of merchants ran jobs, and needed AI-generated images watermarked for provenance and moderated so nothing unsafe shipped to a storefront. Self-hosting image GPUs was out of scope for a 13-person team.
What CloudRoute did: Routed within 20 hours to a US AWS partner with a GenAI + commerce track record. The partner built the pipeline entirely on Bedrock: a tiered model setup (a low-cost model for drafts and variants, a premium tier for hero shots), background replacement and inpainting for variant generation, a queue-driven batch path (SQS + Lambda workers) for catalog-scale jobs, outputs written to S3 keyed by SKU with metadata in DynamoDB, delivery over CloudFront with presigned URLs, Bedrock Guardrails plus Rekognition for moderation, and the Amazon image model's built-in invisible watermarking for provenance. In parallel the partner filed a Bedrock/GenAI proof-of-concept credit application and an Activate Portfolio application.
Outcome: GenAI POC credits ($25K) approved in under 2 weeks and Portfolio ($100K) shortly after — the first ~6 months of per-image generation, S3, and CloudFront were fully credit-funded. The merchant-facing generator was in production in 6 weeks, processing catalog jobs as overnight batch, all data resident in-region, with every generated image watermarked and moderation-gated. CloudRoute's commission was paid by the partner from AWS engagement funding; the customer paid $0.
time-to-match: < 24h · credits secured: $125K · generation cost: credit-funded · cost to customer: $0
CloudRoute routes you to a vetted AWS partner who files your Bedrock/GenAI credit application (Activate Portfolio up to $100K, GenAI POC $10K–$50K, GenAI Accelerator up to $1M) and, if you need hands, builds the image pipeline — model selection, S3 + CloudFront, batch, and watermarking/safety. AWS funds the credits and the engagement. You pay $0.