finops cornerstone · 2026 playbook

How to reduce your AWS bill — every lever, ranked by impact and effort (2026).

There are roughly nine real levers that move an AWS bill, and they are not equal. Some cut 20–40% with a weekend of work; others cut 5% and take a quarter of engineering time. This is the definitive, neutral, priority-ordered playbook: the mechanism behind each lever, its typical savings range, the order to pull them in, and the point at which a partner-led — often AWS-funded — audit pays for itself.

levers that matter: 9
typical first-pass cut: 25–45%
fastest win: 1 weekend
overspend in the avg account: ~32%
TL;DR
  • Pull the levers in impact-÷-effort order, not in the order vendors pitch them. The fastest large win is almost always commitment coverage (Savings Plans / Reserved Instances) on the steady-state baseline you already run — 20–40% off that compute with zero architecture change. Right-sizing and idle/zombie cleanup are close behind and require no commitment at all.
  • The two silent killers are data transfer and NAT Gateway processing charges. They rarely show up in a top-line glance because they’re spread across hundreds of line items, yet on a chatty microservice or cross-AZ architecture they can be 10–25% of the bill. They’re also the levers most teams have never deliberately touched.
  • A typical untouched account carries ~30% waste. A disciplined first pass — commitments + right-sizing + cleanup + the transfer/NAT traps — recovers most of it in weeks. Beyond that, a partner-led audit pays for itself when spend clears roughly $15K–$25K/month, especially because AWS funds much of that audit work through Well-Architected and optimization programs, so the customer often pays $0.
the framing

I. Before any tool: how to think about an AWS bill

The single most expensive mistake in cost work is pulling levers in the wrong order — spending a sprint shaving 4% off Lambda while a 30% commitment win sits untouched. Rank by impact divided by effort, and the order almost writes itself.

An AWS bill is not one number; it is a stack of largely independent cost drivers. Compute (EC2, ECS/Fargate, Lambda) is usually the biggest. Storage (EBS, S3, snapshots) is the slow accumulator. Databases (RDS, Aurora, DynamoDB) are often the most over-provisioned. And then there is the category most teams never look at directly: data movement — inter-AZ traffic, NAT Gateway processing, and egress to the internet. Each driver has its own lever, and the levers do not interfere with one another, which means you can sequence them.

The right mental model is impact ÷ effort. Impact is the dollar reduction; effort is the engineering and risk cost to capture it. A Savings Plan is pure financial paperwork — near-zero engineering, near-zero risk, large impact — so it ranks first despite touching nothing technical. Re-architecting a service onto event-driven Lambda might be the right long-term call, but as a cost lever it is high-effort and slow-paying, so it ranks last. Most teams invert this, because architecture is more interesting than commitment math.
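The impact ÷ effort ranking can be sketched in a few lines. This is a minimal illustration, not a sizing tool: the `Lever` class, the function name, and every dollar and effort figure below are hypothetical placeholders to be replaced with estimates from your own Cost Explorer breakdown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lever:
    name: str
    monthly_saving_usd: float  # impact: estimated dollar reduction per month
    effort_days: float         # effort: engineering days plus a risk allowance


def rank_by_impact_over_effort(levers):
    """Return levers sorted by saving per day of effort, best first."""
    return sorted(
        levers,
        key=lambda lever: lever.monthly_saving_usd / lever.effort_days,
        reverse=True,
    )


# Hypothetical figures for one account; not drawn from any real bill.
LEVERS = [
    Lever("Re-architect service onto Lambda", 1500.0, 60.0),
    Lever("Compute Savings Plan on baseline", 6000.0, 1.0),
    Lever("Right-size over-provisioned EC2", 3500.0, 5.0),
    Lever("Delete idle resources", 900.0, 0.5),
]

if __name__ == "__main__":
    for lever in rank_by_impact_over_effort(LEVERS):
        score = lever.monthly_saving_usd / lever.effort_days
        print(f"{lever.name:40s} {score:>8.0f} USD/month per effort-day")
```

The score only ranks levers by payoff per unit of effort; the order in which you execute them can still differ from the ranking.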

There is also a sequencing rule that saves real money: clean before you commit. Right-size and delete idle resources first, then buy commitments against the smaller, true baseline. Buy a three-year Savings Plan against an over-provisioned fleet and you have locked in the waste for three years. The correct order is cleanup → measure the steady-state floor → commit to that floor → then optimize the variable layer on top. The rest of this playbook follows that order.

One honest caveat up front: percentages in this guide are typical ranges drawn from common account patterns, not guarantees. Your mix of services determines which levers dominate. A GPU-heavy training shop and a CRUD SaaS app have completely different bills, and the same lever can be worth 2% to one and 35% to the other. Use the ranking as a sequence; use your own Cost Explorer breakdown to size each step.

lever 1 — highest impact, lowest effort

II. Commitments: Savings Plans and Reserved Instances

The largest fast win in almost every account. You are paying full on-demand rates for compute you run 24/7 anyway. Committing to that baseline trades a one- or three-year promise for a 20–40% discount, with zero architecture change and effectively zero downside if sized correctly.

AWS sells the same compute at radically different prices depending on whether you commit. On-demand is the default and the most expensive. Commit to a steady hourly dollar amount of usage for one or three years and AWS discounts it — typically 20–30% for a one-year Compute Savings Plan and up to ~40% for three-year, paid up front. The discount is purely financial: the instance behaves identically, the workload doesn’t change, and the saving applies automatically against matching usage.

There are two instruments, and the distinction matters. Savings Plans commit to a dollars-per-hour spend and are the modern default. Compute Savings Plans are the flexible kind — they float across EC2, Fargate, and Lambda, and across instance family, size, OS, tenancy, and Region, so they keep applying even as you change architecture. EC2 Instance Savings Plans are locked to a family in a Region and discount slightly deeper in exchange for that rigidity. Reserved Instances are the older mechanism; for most teams the only RIs still worth buying are for services where Savings Plans don’t apply — notably RDS, ElastiCache, Redshift, and OpenSearch.

The discipline that makes this safe is coverage targeting. Don’t commit to 100% of current usage; commit to the floor you are certain to run regardless of growth or churn. A common target is 70–85% commitment coverage on the steady-state baseline, leaving the top, spiky layer on on-demand or Spot. AWS Cost Explorer’s Savings Plans recommendations will compute this floor for you from your last 7, 30, or 60 days. Start with a one-year Compute Savings Plan at conservative coverage; you can layer additional commitments on top as confidence grows, but outside the short return window AWS allows on some small, recent purchases, you cannot unwind one early.
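Coverage targeting reduces to simple arithmetic once you have hourly spend data. The sketch below is one possible way to do it, not the method Cost Explorer uses: it takes a low percentile of hourly on-demand-equivalent spend as the floor, applies a coverage ratio, and converts the result to a post-discount hourly commitment. The function names, the 10th-percentile choice, and the 25% discount are assumptions.

```python
import statistics


def steady_state_floor(hourly_on_demand_usd, percentile=10):
    """Low percentile of hourly spend: a level the account exceeds in most hours."""
    cuts = statistics.quantiles(hourly_on_demand_usd, n=100, method="inclusive")
    return cuts[percentile - 1]


def commitment_per_hour(hourly_on_demand_usd, coverage=0.75, discount=0.25, percentile=10):
    """Hourly Savings Plan commitment covering `coverage` of the floor.

    A Savings Plan commitment is expressed in post-discount dollars per hour,
    so the covered on-demand amount is scaled by (1 - discount).
    The discount here is an assumed figure, not a quoted AWS rate.
    """
    floor = steady_state_floor(hourly_on_demand_usd, percentile)
    covered_on_demand = floor * coverage
    return covered_on_demand * (1.0 - discount)


if __name__ == "__main__":
    # Hypothetical day: a steady $10/hour baseline with a business-hours bump.
    sample = [10.0] * 16 + [18.0] * 8
    print(f"floor:      ${steady_state_floor(sample):.2f}/hour on-demand")
    print(f"commitment: ${commitment_per_hour(sample):.2f}/hour")
```

In practice you would feed in several weeks of hourly data rather than one day, and compare the result with the Cost Explorer recommendation before purchasing.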

The trap to avoid: buying long, rigid commitments against a bill you haven’t cleaned yet. If 20% of your fleet is over-provisioned, a three-year EC2 Instance Savings Plan against today’s usage cements that 20% waste for three years. This is exactly why commitments rank first on impact but are sequenced after a quick cleanup pass — the two are not in conflict, the order just matters.

typical savings

20–40% off committed compute. One-year Compute Savings Plan ≈ 20–30%; three-year, all-upfront ≈ up to ~40%. RDS/ElastiCache/Redshift Reserved Instances ≈ 30–55% off those services. Effort: financial, not technical — hours, not sprints. Risk: low if coverage is targeted to the certain baseline.

lever 2 — high impact, low–medium effort

III. Right-sizing: stop paying for capacity you never use

The average provisioned instance runs at a fraction of its capacity. Right-sizing matches instance type and size to actual utilization — and unlike commitments, it costs nothing and locks in nothing. It is the cleanup that should happen before you commit.

Provisioning is almost always done by guesswork, then never revisited. An engineer picks an m5.2xlarge "to be safe," the service ships, and two years later it’s still an m5.2xlarge averaging 9% CPU and 30% memory. Right-sizing is the systematic correction: look at real utilization over a representative window (2–4 weeks, including peaks), and move each workload to the smallest instance that comfortably serves its actual load. Dropping one size — 2xlarge to xlarge — halves that instance’s cost.
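As a rough illustration of the size-down decision, the sketch below steps an instance down one size only when its doubled peak utilization would still sit under a target. It is a simplified stand-in for what Compute Optimizer does with real metrics: the size ladder, the 70% target, and the function name are all assumptions.

```python
# Simplified ladder in which each step doubles capacity and price.
# Real instance families offer more sizes; this list is illustrative only.
SIZE_LADDER = ["large", "xlarge", "2xlarge", "4xlarge", "8xlarge"]


def suggest_instance(instance_type, peak_cpu_pct, peak_mem_pct, max_target_pct=70.0):
    """Suggest one size down if the projected peak stays under the target."""
    family, _, size = instance_type.partition(".")
    if size not in SIZE_LADDER:
        return instance_type  # unknown size: leave for manual review
    index = SIZE_LADDER.index(size)
    if index == 0:
        return instance_type  # already at the bottom of this ladder
    # Halving capacity roughly doubles utilization of the binding resource.
    projected_peak = 2.0 * max(peak_cpu_pct, peak_mem_pct)
    if projected_peak <= max_target_pct:
        return f"{family}.{SIZE_LADDER[index - 1]}"
    return instance_type


if __name__ == "__main__":
    print(suggest_instance("m5.2xlarge", peak_cpu_pct=22.0, peak_mem_pct=30.0))
    print(suggest_instance("m5.2xlarge", peak_cpu_pct=22.0, peak_mem_pct=45.0))
```

The function deliberately moves one size per pass and uses peak rather than average utilization, which keeps headroom for spikes at the cost of leaving some savings for a later pass.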

AWS Compute Optimizer is the native engine for this. It ingests CloudWatch metrics and emits per-resource recommendations for EC2, EBS volumes, Lambda memory, ECS-on-Fargate tasks, and Auto Scaling groups, each tagged under-provisioned / optimized / over-provisioned with a projected dollar delta. Enable it (free), give it ~14 days to gather data, and work the over-provisioned list top-down by savings. Memory-based recommendations need the CloudWatch agent installed, since EC2 memory isn’t visible to AWS by default — without it you’re right-sizing on CPU alone and may leave memory waste on the table.

Right-sizing also means changing shape, not just size. A workload pinned on CPU but light on memory belongs on a compute-optimized family (c-series); a caching layer that’s all memory belongs on r-series. Moving to the family that matches the bottleneck often beats simply shrinking within the wrong family. And right-sizing isn’t one-and-done — usage drifts, so the teams that hold their savings re-run Compute Optimizer quarterly rather than treating it as a single project.

Order matters here too: right-size first, then buy commitments against the smaller footprint. Compute Optimizer’s numbers feed directly into a more accurate Savings Plan floor, so these two top levers reinforce each other when sequenced correctly.

typical savings

15–30% off the compute line in an account that has never been right-sized. Effort: low–medium (read recommendations, validate, restart workloads on new sizes during a maintenance window). Risk: low — keep headroom for peaks and roll changes gradually. Cost to capture: $0.

levers 3 & 4 — high impact on the right workloads

IV. Spot and Graviton: cheaper compute for the same work

Two levers that reduce the unit price of compute itself rather than the quantity. Spot trades guaranteed availability for a steep discount on fault-tolerant work; Graviton trades a CPU architecture port for a structural price/performance gain. Both can be large where they apply.

Spot Instances — up to ~90% off interruptible workloads

Spot sells AWS’s spare capacity at a discount of typically 60–90% off on-demand, with one condition: AWS can reclaim the instance on two minutes’ notice when it needs the capacity back. That makes Spot ideal for anything fault-tolerant and stateless — CI/CD runners, batch and data processing, rendering, ML training with checkpointing, and stateless web tiers behind a load balancer.

The modern way to run Spot is not to bid on a single instance type but to use a diversified pool. EC2 Auto Scaling and Karpenter (on EKS) let you request "any of these N instance families" and let AWS place you on whatever spare capacity is cheapest and most stable, falling back gracefully and even blending Spot with on-demand in one fleet. The mistake to avoid is putting a stateful primary database or a single-instance service with no failover on Spot — an interruption there is an outage, not a saving.
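To size what a blended fleet is worth, a back-of-envelope calculation is usually enough. The sketch below assumes a fixed share of capacity on Spot at a fixed discount; both inputs are hypothetical, since real Spot prices vary by instance type, Availability Zone, and time.

```python
def blended_hourly_cost(on_demand_hourly_usd, spot_share, spot_discount):
    """Hourly cost of a fleet with `spot_share` of capacity on Spot.

    on_demand_hourly_usd: cost of the whole fleet at on-demand rates.
    spot_share:           fraction of capacity placed on Spot (0 to 1).
    spot_discount:        assumed average Spot discount versus on-demand (0 to 1).
    """
    if not 0.0 <= spot_share <= 1.0 or not 0.0 <= spot_discount <= 1.0:
        raise ValueError("spot_share and spot_discount must be between 0 and 1")
    on_demand_part = on_demand_hourly_usd * (1.0 - spot_share)
    spot_part = on_demand_hourly_usd * spot_share * (1.0 - spot_discount)
    return on_demand_part + spot_part


if __name__ == "__main__":
    # Hypothetical fleet: $100/hour on-demand, 60% on Spot at a 70% discount.
    print(f"${blended_hourly_cost(100.0, 0.6, 0.7):.2f}/hour")
```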

Graviton — ~20% better price/performance, structurally

Graviton is AWS’s own ARM-based processor line. For most general-purpose, compute, and memory workloads, the Graviton-backed instance families deliver materially better price/performance than the equivalent x86 family — AWS positions it at roughly 20% better, and on the right workload the real-world saving lands in a similar band, before any commitment discount stacks on top.

The catch is architecture: your code and dependencies must run on ARM64. For interpreted and managed runtimes (most Python, Node, Java, and .NET) and for managed services like RDS, ElastiCache, and OpenSearch on Graviton, this is often a config flip with no code change. Go and other compiled code must be rebuilt for ARM64, which for pure-Go services is usually a small build change. For anything with compiled native dependencies or container images, you rebuild for ARM64 (or multi-arch) and test. The migration is a real engineering task, but a one-time one, after which the price advantage is permanent and compounds with Spot and Savings Plans.

typical savings

Spot: 60–90% off the interruptible portion of compute (CI, batch, training, stateless tiers). Graviton: ~20% better price/performance on ported workloads, permanent and stackable with commitments. Effort: low for Spot on existing autoscaling; medium for Graviton (rebuild + test on ARM64).

lever 5 — steady, compounding savings

V. Storage tiering and cleanup: the slow accumulator

Storage rarely spikes, so it rarely gets attention — and that’s exactly why it bloats. The lever is matching each byte to the cheapest tier that meets its real access pattern, then deleting the bytes nobody reads at all.

S3 has a ladder of storage classes priced by access frequency. S3 Standard is the default and most expensive per GB. Below it sit Standard-Infrequent Access, One Zone-IA, the Glacier tiers for archival, and — the lever most teams should reach for first — S3 Intelligent-Tiering, which monitors access per object and automatically demotes cold objects to cheaper tiers and promotes them back on access, for a tiny monitoring fee. For any bucket with mixed or unpredictable access patterns, Intelligent-Tiering captures most of the available saving with zero ongoing management and no retrieval-fee risk on the data that does get touched.

EBS is the quieter line. The lever there is twofold: migrate older gp2 volumes to gp3, which is cheaper per GB and lets you provision IOPS and throughput independently of size (so you stop paying for a giant volume just to get its bundled IOPS), and delete the snapshots nobody needs. Snapshot sprawl is endemic — automated daily snapshots accumulate for years, and orphaned snapshots from long-deleted volumes keep billing silently. A lifecycle policy that ages snapshots out and a one-time sweep of orphans both pay off immediately.

The cleanup half of this lever is pure profit: there is no tradeoff in deleting storage that serves no purpose. The usual targets are unattached EBS volumes left behind by terminated instances, old AMIs and their backing snapshots, incomplete multipart uploads quietly accruing in S3, and forgotten log buckets with no expiration policy. An S3 Lifecycle policy that expires logs after N days and aborts stale multipart uploads turns a perpetually growing line into a flat one.
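A lifecycle rule of the kind described above might look like the following. This sketch only builds the configuration document, in the shape S3 lifecycle configurations use (`Rules`, `Filter`, `Expiration`, `AbortIncompleteMultipartUpload`). The `logs/` prefix, the 90-day and 7-day values, the rule ID, and the function name are assumptions to adapt to your own retention needs; applying the rule to a bucket is a separate step not shown here.

```python
import json


def log_bucket_lifecycle(prefix="logs/", expire_after_days=90, abort_uploads_after_days=7):
    """Build a lifecycle configuration that expires logs and aborts stale uploads."""
    return {
        "Rules": [
            {
                "ID": "expire-logs-and-abort-stale-uploads",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Delete objects under the prefix once they reach this age.
                "Expiration": {"Days": expire_after_days},
                # Clean up multipart uploads that were started but never completed.
                "AbortIncompleteMultipartUpload": {
                    "DaysAfterInitiation": abort_uploads_after_days
                },
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(log_bucket_lifecycle(), indent=2))
```

Check retention and compliance requirements before enabling expiration, since expired objects in an unversioned bucket are not recoverable.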

typical savings

20–60% off the storage line. Intelligent-Tiering or IA on cold S3 data ≈ 40–70% off those objects; gp2→gp3 ≈ ~20% off those volumes plus decoupled IOPS; snapshot/orphan cleanup is pure recovery. Effort: low (lifecycle policies + one cleanup sweep). Risk: low with correct lifecycle rules.

lever 6 — the silent killer

VI. Data transfer and the NAT Gateway trap

The lever almost nobody pulls deliberately, because the cost hides across thousands of tiny line items. On a chatty, multi-AZ, microservice architecture, data movement and NAT processing can quietly be 10–25% of the bill — and most of it is avoidable.

AWS charges for data movement in ways that are easy to design into a bill by accident. Traffic between Availability Zones is billed in both directions. Traffic out to the internet (egress) is billed per GB and is one of AWS’s higher-margin lines. Same-AZ traffic over private IPs is free, and inbound is free — so the cost is a direct function of how much your architecture moves data across zones and out to the world. A microservice mesh that scatters chatty services across AZs without thought can generate enormous inter-AZ charges for traffic that, with AZ-aware placement, would have been free.

The NAT Gateway is the single most common surprise on the transfer line. Resources in private subnets reach the internet through a NAT Gateway, which bills an hourly charge plus a per-GB processing charge on every byte that passes through it. The per-GB processing fee is the trap: route all your S3 reads, ECR image pulls, CloudWatch logs, and other AWS-bound traffic through the NAT and you pay a processing fee on traffic that never needed to leave AWS at all. The fix is VPC Gateway Endpoints for S3 and DynamoDB (free) and Interface Endpoints (PrivateLink) for other services, which route that traffic privately and bypass the NAT processing charge entirely. Interface Endpoints carry their own hourly and per-GB charges, so they pay off where the traffic volume is high enough to beat the NAT rate. A single S3 Gateway Endpoint on a data-heavy account can cut the NAT bill dramatically on its own.
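The size of the NAT trap is easy to estimate from the GB-processed figure on your bill. In the sketch below, the default rates (0.045 USD per GB and per hour) are illustrative placeholders rather than quoted prices; look up the current rate for your Region. The function names and the 730-hour month are also assumptions.

```python
def nat_monthly_cost(gb_processed, gateways=1, per_gb_usd=0.045, hourly_usd=0.045, hours=730):
    """Monthly NAT Gateway cost: per-GB processing plus hourly charge per gateway."""
    return gb_processed * per_gb_usd + gateways * hourly_usd * hours


def saving_from_gateway_endpoints(gb_processed, share_rerouted, per_gb_usd=0.045):
    """Processing charge avoided by moving a share of traffic to free Gateway Endpoints."""
    return gb_processed * share_rerouted * per_gb_usd


if __name__ == "__main__":
    # Hypothetical account: 50 TB/month through one NAT Gateway, 80% of it S3-bound.
    before = nat_monthly_cost(50_000)
    saved = saving_from_gateway_endpoints(50_000, 0.8)
    print(f"before: ${before:,.2f}/month, saved: ${saved:,.2f}/month")
```

The saving function covers only free Gateway Endpoints; traffic moved to Interface Endpoints would need their own charges subtracted.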

There are two more high-leverage moves on this line. First, put a CDN (CloudFront) in front of high-volume egress: CloudFront’s per-GB rates are generally lower than direct S3/EC2 egress, and transfer from AWS origins to CloudFront is not charged, so heavy public download or media workloads get cheaper and faster at once. Second, audit cross-AZ chatter and co-locate the chattiest service-to-service paths within an AZ where availability requirements allow. None of this changes what the application does; it changes the path the bytes take, and the bytes are what AWS meters.

typical savings

10–25% of the bill on transfer-heavy / multi-AZ / microservice architectures. VPC Gateway + Interface Endpoints can erase most NAT processing charges; CloudFront cuts high-volume egress; AZ-aware placement removes avoidable cross-AZ traffic. Effort: low–medium (mostly networking config). Often the most overlooked lever in the whole account.

levers 7 & 8 — fast, low-risk recovery

VII. Idle/zombie cleanup and scheduling non-prod

Two of the fastest wins in the playbook, both pure waste-elimination with effectively no architectural tradeoff. Cleanup removes resources nobody uses; scheduling switches off resources nobody uses at night.

Idle and zombie resources — delete what serves nothing

Every account accumulates resources that bill 24/7 while doing nothing. The usual suspects: unattached Elastic IPs (an idle EIP bills hourly), idle load balancers with no healthy targets, unattached EBS volumes and old snapshots, over-provisioned NAT Gateways in dev VPCs nobody uses, forgotten RDS instances from a finished project, and dormant non-prod environments left running after a launch. AWS Trusted Advisor and Cost Explorer’s resource-level views surface most of these; a quarterly sweep keeps them from re-accumulating.

The reason this lever is so attractive is that there is no tradeoff to weigh — you are deleting things that produce zero value and pure cost. The only discipline required is confidence that a resource is truly orphaned, which good tagging (below) makes trivial. On a neglected account, a single cleanup sweep routinely recovers several percent of the bill in an afternoon.

Scheduling non-prod — stop paying for nights and weekends

Development, staging, QA, and test environments are typically used during business hours, roughly 40–50 hours a week, yet they usually run all 168 hours. Scheduling them to stop outside working hours removes about 120 idle hours, or roughly 70% of their instance-hours. Storage attached to stopped instances and databases keeps billing, so the net cut to their compute and RDS cost lands nearer two-thirds. AWS Instance Scheduler (a supported solution) or simple tag-driven Lambda/EventBridge automations start and stop instances and RDS databases on a defined calendar.
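The scheduling arithmetic is worth making explicit, because stopped instances and databases still bill for their attached storage. A minimal sketch, with hypothetical function names and dollar figures:

```python
HOURS_PER_WEEK = 168


def idle_share(active_hours_per_week):
    """Fraction of the week an environment can be switched off."""
    return 1.0 - active_hours_per_week / HOURS_PER_WEEK


def scheduled_monthly_cost(compute_usd, storage_usd, active_hours_per_week):
    """Monthly cost after scheduling: compute scales with running hours, storage does not."""
    running_share = active_hours_per_week / HOURS_PER_WEEK
    return compute_usd * running_share + storage_usd


if __name__ == "__main__":
    # Hypothetical non-prod environment: $1,680 compute + $100 storage per month.
    print(f"idle share at 50 h/week: {idle_share(50):.0%}")
    print(f"cost after scheduling:   ${scheduled_monthly_cost(1680.0, 100.0, 50):,.2f}")
```

With these inputs, compute-hours fall by about 70%, while the total bill for the environment falls by a little less because the storage term is unchanged.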

This lever applies only to non-production — you obviously don’t schedule production down. But for organizations where non-prod is a meaningful share of total spend (often 20–40% in active engineering shops), shutting it off two-thirds of the time is one of the highest impact-÷-effort moves available, and it carries essentially no risk because the environments are non-critical and start back up on schedule.

typical savings

Cleanup: 3–10% of a neglected bill, recovered in hours, zero tradeoff. Scheduling non-prod: ~65–70% off the affected non-prod compute/RDS by running it ~50 hours a week instead of 168. Effort: low for both. Risk: near-zero (non-prod only / orphaned only).

lever 9 — makes every other lever durable

VIII. Governance: tagging, Budgets, and anomaly detection

The lever that doesn’t cut cost directly but stops every other saving from silently eroding. Without allocation, alerts, and ownership, an optimized account drifts back to bloat within a year. Governance is what makes FinOps a state rather than a one-off project.

It all starts with cost allocation tags. Until every resource is tagged by team, environment, product, and cost center, "the bill" is one undifferentiated number and no one owns any of it. With a consistent tagging scheme activated in the Billing console, Cost Explorer can slice spend by any dimension — so you can see that one team’s staging environment is 18% of the bill, or that an untagged "miscellaneous" bucket is quietly the third-largest line. Tagging produces no savings on its own; it produces the visibility and accountability that make every other lever findable and assignable. AWS Organizations tag policies and Service Control Policies can enforce tagging so resources can’t be created untagged.
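A tagging scheme is only useful if you can measure compliance with it. The sketch below checks an inventory of resources against a required tag set. The tag keys mirror the dimensions named above, but their exact spelling, the function names, and the sample inventory are assumptions; in practice the inventory would come from your own resource export.

```python
REQUIRED_TAGS = ("team", "environment", "product", "cost-center")  # assumed key names


def missing_tags(resource_tags, required=REQUIRED_TAGS):
    """Return required tag keys that are absent or empty on one resource."""
    present = {key for key, value in resource_tags.items() if value}
    return [key for key in required if key not in present]


def untagged_report(resources):
    """Map resource id to its missing tag keys, for non-compliant resources only."""
    report = {}
    for resource_id, tags in resources.items():
        gaps = missing_tags(tags)
        if gaps:
            report[resource_id] = gaps
    return report


if __name__ == "__main__":
    # Hypothetical inventory keyed by resource id.
    inventory = {
        "i-0aaa": {"team": "payments", "environment": "staging",
                   "product": "checkout", "cost-center": "cc-101"},
        "vol-0bbb": {"team": "payments", "environment": ""},
        "nat-0ccc": {},
    }
    for resource_id, gaps in untagged_report(inventory).items():
        print(f"{resource_id}: missing {', '.join(gaps)}")
```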

AWS Budgets closes the loop on intent. A budget with alert thresholds (say, notify at 80% and 100% of plan) turns runaway spend into an email or Slack alert before the invoice lands, not after. Budgets can track total spend, a specific service, a specific tag, or even Savings Plan and Reserved Instance coverage and utilization — so you can be alerted not just when spend is high but when your commitment coverage drops, which is the early warning that you’re leaking back to on-demand.

AWS Cost Anomaly Detection is the automated watchdog. It learns each service’s normal spend pattern with machine learning and alerts when a service deviates — a misconfigured cron hammering a NAT Gateway, a runaway data pipeline, a left-on GPU instance, a leaked credential spinning up crypto-mining instances. It catches the spikes a monthly budget review would miss until it’s already expensive, and it’s free to enable. Together, tagging + Budgets + Anomaly Detection convert cost optimization from a heroic quarterly cleanup into a continuously enforced baseline — which is the entire point of FinOps as a discipline.

why it ranks last but matters most

Governance produces ~0% direct savings — and protects 100% of the savings from every lever above it. An account that’s right-sized, committed, and cleaned but ungoverned drifts back toward ~30% waste within a year. Tagging makes waste findable; Budgets make intent enforceable; Anomaly Detection catches the spike before the invoice. Effort: low–medium, one-time setup. Payoff: durability.

when DIY stops being worth it

IX. When a partner-led (often AWS-funded) audit pays for itself

Everything above is doable in-house, and at small scale it should be. But there’s a crossover point where a specialist audit recovers far more than it costs — and a structural reason it frequently costs the customer nothing at all.

The in-house ceiling is real. The native tools (Cost Explorer, Compute Optimizer, Trusted Advisor, the CUR) surface the obvious levers, but the deeper savings (commitment-laddering across multiple plan types, Spot diversification on EKS, Graviton portability assessment, the full transfer/endpoint redesign) require both specialist knowledge and dedicated time that an engineering team rarely has to spare. A startup’s cloud lead who is 70% allocated to shipping product is not going to model a three-instrument commitment ladder. Past a certain bill size, the opportunity cost of DIY is the engineering hours plus the savings left uncaptured.

As a rough rule of thumb, a partner-led audit starts paying for itself around $15K–$25K/month in AWS spend. Below that, the native tools plus this playbook usually capture most of the available saving. Above it, the absolute dollars at stake — a 25–35% reduction on a $300K/year bill is $75K–$105K a year — dwarf the cost of expert help, and the optimization is no longer a one-afternoon job but an ongoing program worth resourcing properly.
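The crossover arithmetic can be reproduced for any bill size. A minimal sketch, using the 25–35% reduction range from the paragraph above as default inputs; the function name is hypothetical.

```python
def annual_saving_range(monthly_spend_usd, low_cut=0.25, high_cut=0.35):
    """Annual dollars at stake for a given monthly spend and reduction range."""
    annual_spend = monthly_spend_usd * 12
    return annual_spend * low_cut, annual_spend * high_cut


if __name__ == "__main__":
    for monthly in (5_000, 15_000, 25_000):
        low, high = annual_saving_range(monthly)
        print(f"${monthly:>6,}/month -> ${low:>9,.0f} to ${high:>9,.0f} per year")
```

At $25K/month the function returns the $75K to $105K range quoted for a $300K/year bill.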

The part most teams don’t know: AWS itself funds much of this work. AWS runs Well-Architected Framework reviews (Cost Optimization is one of the six pillars) and supports partner-led optimization engagements, including the Optimization and Licensing Assessment (OLA), and it offers funding and credits to certified partners to perform them, because a healthy, optimized, sticky customer is worth far more to AWS long-term than the short-term margin on waste. Routed through the right partner, a structured cost audit is frequently delivered at $0 to the customer, with AWS covering the partner’s engagement through these programs. That inverts the usual calculus: the question isn’t whether the savings justify the audit fee, it’s whether you’d rather keep paying the waste than accept a funded review.

The honest framing: a partner-led audit is not magic. It pulls the same nine levers in this playbook, just faster, deeper, and with the commitment math done by people who do it daily. What it changes is how quickly the savings arrive and who pays for the work. For a team under $15K/month with engineering slack, this guide and the native tools are enough. For a team above it, especially one that can get the audit AWS-funded, declining is usually the more expensive choice.

side by side

DIY first pass vs partner-led audit — when each makes sense

Both pull the same levers. The difference is depth, speed, the engineering hours you spend, and — often decisively — who pays. Below roughly $15K/month, DIY captures most of the win. Above it, a funded audit usually nets more even after the savings split.

Variable | DIY first pass | Partner-led audit (often AWS-funded)
Best fit | Spend < ~$15K/month, engineering slack | Spend > ~$15K–$25K/month, no slack
Levers covered | The obvious ones (commitments, right-sizing, cleanup) | All nine, incl. commitment laddering + transfer redesign
Time to first savings | Days to weeks (your hours) | 2–4 week structured assessment
Engineering cost | Your team’s time (the hidden cost) | Minimal; partner does the analysis
Commitment math | Cost Explorer recommendations | Multi-instrument ladder, modeled
Ongoing discipline | You build the governance | Partner sets up tagging/Budgets/anomaly + handoff
Cost to you | $0 (your time) | Frequently $0; AWS funds via Well-Architected / partner programs

The decision usually isn’t cost vs no cost — it’s waste vs no waste. When AWS funds the partner engagement, declining a structured audit on a large bill mostly means choosing to keep paying the overspend.
a recent match

A funded cost audit — anonymized

inquiry · seed-stage b2b saas, ~$22K/month AWS, EU
Seed-stage B2B SaaS, 12 engineers, ~$22K/month AWS spend, multi-AZ microservices on EKS

Situation: Bill had grown ~40% in two quarters with no clear cause. All compute on-demand. Never right-sized. Heavy cross-AZ chatter between services and all egress through a single NAT Gateway. No tagging, no budgets, no anomaly alerts. The lone platform engineer was fully allocated to product and had no time to model commitments or redesign the network path.

What CloudRoute did: Routed within 24 hours to an AWS partner with a FinOps + EKS track record who ran a Well-Architected Cost pillar review. Sequence: cleanup + right-sizing via Compute Optimizer first; then a one-year Compute Savings Plan at 75% coverage on the corrected baseline; S3 + DynamoDB Gateway Endpoints and PrivateLink to kill NAT processing charges; Karpenter Spot pools for stateless and CI workloads; gp2→gp3 and snapshot lifecycle; then tagging, Budgets, and Cost Anomaly Detection for durability. The Well-Architected engagement was AWS-funded.

Outcome: Steady-state bill down ~34% (≈$7.5K/month, ~$90K/year) within six weeks, with governance in place so it holds. NAT processing charges fell ~80% from the endpoints alone. CloudRoute’s commission was paid by the partner from AWS’s engagement funding — the customer paid $0 for the audit.

engagement window: 6 weeks · founder time: ~6 hours · run-rate cut: ~34% (~$90K/yr) · cost to customer: $0

faq

Common questions

What’s the single fastest way to reduce my AWS bill?
Commitment coverage on the compute you already run 24/7. A one-year Compute Savings Plan against your steady-state baseline is near-zero engineering and near-zero risk and typically cuts 20–30% off that compute (up to ~40% for three-year, all-upfront), with the discount applying automatically. The one rule: right-size and delete idle resources first, then commit to the smaller true baseline — never lock a commitment against an over-provisioned fleet.
How much can I realistically cut, and how much waste does a typical account carry?
A typical untouched account carries roughly 30% waste. A disciplined first pass — commitments + right-sizing + idle cleanup + fixing the data-transfer/NAT traps — recovers most of it, commonly a 25–45% run-rate reduction, within weeks. The exact figure depends entirely on your service mix: a GPU-training shop and a CRUD SaaS app have completely different bills, and the same lever can be worth 2% to one and 35% to the other.
What order should I pull the levers in?
Impact ÷ effort, with one sequencing rule: clean before you commit. Practically: (1) delete idle/zombie resources, (2) right-size with Compute Optimizer, (3) measure the resulting steady-state floor, (4) buy Savings Plans/RIs against that floor, then optimize the variable layer — (5) Spot for fault-tolerant work, (6) Graviton where it ports, (7) storage tiering, (8) the data-transfer/NAT redesign, (9) schedule non-prod — and run governance (tagging, Budgets, anomaly detection) underneath all of it so the savings hold.
Why is my data-transfer and NAT Gateway cost so high, and what fixes it?
Because AWS bills inter-AZ traffic both ways, bills internet egress per GB, and (the usual surprise) bills a per-GB processing charge on everything that passes through a NAT Gateway, including AWS-bound traffic that never needed to leave AWS. The fixes: add VPC Gateway Endpoints for S3 and DynamoDB (free) and Interface Endpoints/PrivateLink for other services to bypass NAT processing; put CloudFront in front of high-volume egress; and co-locate chatty services within an AZ where availability allows. On transfer-heavy architectures these charges alone can be 10–25% of the bill.
Are Savings Plans or Reserved Instances better in 2026?
For most teams, Compute Savings Plans — they float across EC2, Fargate, and Lambda and across family, size, Region, and OS, so they keep applying as your architecture changes. EC2 Instance Savings Plans discount slightly deeper but lock you to a family in a Region. Reserved Instances are now mainly worth buying for services Savings Plans don’t cover — RDS, ElastiCache, Redshift, and OpenSearch. A common approach is to layer a flexible Compute Savings Plan as the base and add targeted RIs for those database services.
Is moving to Graviton or Spot worth the engineering effort?
Spot is usually low-effort and high-reward for anything fault-tolerant — CI runners, batch, ML training with checkpointing, stateless tiers behind a load balancer — at 60–90% off, especially via diversified pools in Auto Scaling or Karpenter. Never put a stateful primary on Spot. Graviton is a one-time port (config flip for most managed runtimes; a rebuild + test for native/compiled dependencies) that yields a permanent ~20% price/performance gain which stacks on top of commitment discounts. Both are worth it where they apply; size them against your own usage.
When is a partner-led cost audit worth it instead of doing it myself?
Below roughly $15K/month with engineering slack, the native tools (Cost Explorer, Compute Optimizer, Trusted Advisor) plus a disciplined first pass capture most of the savings yourself. Above ~$15K–$25K/month, a 25–35% cut is tens of thousands of dollars a year — enough that specialist help nets more even after any savings split, and the work becomes an ongoing program rather than a one-off. The deciding factor is often that AWS funds much of this work (Well-Architected reviews, partner optimization programs), so a structured audit frequently costs the customer $0.
How is an AWS-funded audit free — what’s the catch?
There’s no catch for the customer. AWS runs Well-Architected Framework reviews and funds certified partners to perform cost-optimization and OLA engagements, because an optimized, sticky customer is worth far more to AWS long-term than the short-term margin on your waste. Routed through the right partner, the engagement is frequently delivered at $0 to you, with AWS covering the partner via these programs. The partner pulls the same nine levers in this playbook — just faster and deeper, with the commitment math done by people who do it daily.

Want a funded read on where your AWS bill is leaking?

CloudRoute routes you to a vetted AWS partner who runs the cost audit — commitments, right-sizing, Spot/Graviton, the transfer/NAT traps, governance. Often AWS-funded, so the customer pays $0. No procurement, no discovery theater.

typical run-rate cut: 25–45%
matched within: < 24h
cost to you: often $0