Most AWS bills carry 25–45% waste. This is the full lever list, biggest lever first: commitments, then right-sizing, Spot, Graviton, storage, data transfer, and governance, with the real savings range and effort for each, plus the order to actually run them in. At the end: a partner-led audit that's often AWS-funded, so you cut the bill for $0.
AWS cost optimization is not one trick; it's a stack of levers, each attacking a different layer of the bill. Before you pull any of them, it helps to know where the money leaks — because the leaks are remarkably consistent across companies.
AWS is consumption-billed and default-on. You provision for peak, traffic recedes, and nobody scales the resource back down. An engineer spins up an oversized instance "to be safe," ships, and moves on. A load balancer, a NAT Gateway, and three Elastic IPs outlive the service they were attached to. Snapshots accumulate. Logs stream to CloudWatch at full retention forever. None of this is a mistake anyone notices — it's the natural entropy of a system where provisioning is one API call and de-provisioning is nobody's job.
Across the engagements CloudRoute-matched partners run, the recurring pattern is that 25–45% of an un-optimized bill is recoverable without changing what the application does. The bill is not high because the workload is expensive; it's high because the workload is running on the wrong shape, at the wrong commitment, in the wrong storage tier, paying for data movement nobody designed for.
The other reason bills drift is that on-demand pricing — the default — is the most expensive way to buy almost everything on AWS. On-demand exists so you can start with zero commitment. The moment your usage has a predictable floor, continuing to pay on-demand for that floor is pure overpayment. Closing that gap is the single biggest lever, and it's covered first below.
The rest of this page is the lever list, starting with the biggest lever. Each lever gets the same treatment: how it works, the typical savings range, the effort and the tradeoff, and where it sits relative to the others. The sequencing section covers the order to run them in, and the comparison table at the end puts all of them side by side so you can sequence your own program.
If you do only one thing, do this. Commitments are the highest-impact, lowest-effort lever on AWS — they change what you pay, not what you run, and they apply to the spend you already have.
A commitment is a promise to AWS: "I'll spend at least $X/hour (Savings Plans) or run this instance family (Reserved Instances) for 1 or 3 years," and in exchange AWS discounts that usage by up to ~72% versus on-demand. The discount is automatic — it applies as a billing-time credit against matching usage. You don't migrate anything; the same instances simply bill at the committed rate.
There are two instruments, and picking the right mix matters:
Compute Savings Plans are the flexible option: you commit to a dollar-per-hour of compute and the discount floats across EC2 (any region, family, size, OS, tenancy), Fargate, and Lambda. Maximum flexibility, slightly smaller discount. This is the right default for most teams because it survives instance-family changes and a future Graviton migration without stranding the commitment.
EC2 Instance Savings Plans lock you to a specific instance family in a region (e.g. m6i in us-east-1) in exchange for a deeper discount. Use these only for a baseline you're confident won't change — and ideally only after you've right-sized and picked your architecture, so you're not committing to a shape you're about to abandon.
Both come in 1-year and 3-year terms, and No Upfront / Partial Upfront / All Upfront payment options. 3-year All Upfront is the deepest discount; 1-year No Upfront is the most flexible and the place most teams start. Check the AWS pricing pages / Cost Explorer for the exact current rates by family and region — they move.
EC2 has largely moved to Savings Plans, but Reserved Instances are still the commitment instrument for RDS, ElastiCache, Redshift, and OpenSearch. If you run a steady-state production database, an RI (or the equivalent reservation) on that engine is close to free money — same database, up to ~60%+ off, just for committing to a year.
The standard mistake is leaving managed-database spend entirely on-demand because "Savings Plans cover compute." They don't cover RDS/ElastiCache. Those need their own reservations, and they're frequently the second-largest line on the bill after raw compute.
Right-size before you commit, and commit only to the steady-state baseline you're confident you'll keep — not your current peak. A good target is to cover ~70–80% of your stable, always-on usage with commitments and leave the variable top layer on-demand or Spot. Over-committing to a shape you're about to right-size or migrate is how teams get locked into paying for capacity they no longer use.
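As a rough illustration of that sizing rule, the sketch below derives a commitment from a series of hourly on-demand spend. The function name, the use of the quietest hour as the "stable baseline", and the 30% discount are assumptions for illustration only; real numbers should come from Cost Explorer's Savings Plans recommendations.

```python
def commitment_target(hourly_on_demand, coverage=0.75, discount=0.30):
    """Return (covered_on_demand, hourly_commitment) in dollars per hour.

    hourly_on_demand: eligible on-demand spend per hour over a lookback window.
    coverage: share of the stable baseline to commit to (0.70-0.80 per the text).
    discount: assumed Savings Plan discount; hypothetical, check current rates.
    """
    if not hourly_on_demand:
        raise ValueError("need at least one hourly data point")
    # Assumption: the quietest hour in the window is the always-on baseline.
    baseline = min(hourly_on_demand)
    covered = baseline * coverage
    # A Savings Plan commitment is expressed at the discounted rate, so the
    # dollar figure you commit is smaller than the on-demand spend it covers.
    commitment = covered * (1 - discount)
    return round(covered, 2), round(commitment, 2)


# Hypothetical window: spend never drops below $8/hour and peaks at $20/hour.
covered, commitment = commitment_target([8.0, 9.5, 12.0, 20.0, 15.0, 8.4, 10.0])
print(covered, commitment)
```

Using the minimum is deliberately conservative. A low percentile would be less sensitive to a one-off outage hour, at the cost of a slightly higher risk of unused commitment.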
Commitments make the rate cheaper; right-sizing makes the quantity correct. Run them in that order conceptually, but do the right-sizing analysis first so you commit to the right baseline.
Right-sizing means matching instance and resource size to actual utilization. The signal comes from AWS Compute Optimizer, which reads CloudWatch metrics and flags over-provisioned EC2 instances, EBS volumes, Lambda memory settings, ECS services, and RDS instances, with a recommended target size and the projected savings. An instance sitting at 8% CPU and 20% memory for a month is a textbook downsize, often two sizes down. Each size step within a family roughly halves the price, so two steps cut the cost of that resource to about a quarter.
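To make the utilization test concrete, here is a minimal sketch of the kind of filter a right-sizing pass applies. It is not how Compute Optimizer works internally; the `InstanceStats` type and both thresholds are hypothetical. Note that EC2 does not publish memory metrics by default, so the memory figure assumes the CloudWatch agent is installed.

```python
from dataclasses import dataclass


@dataclass
class InstanceStats:
    instance_id: str
    avg_cpu_pct: float  # average CPU utilization over the lookback window
    avg_mem_pct: float  # average memory utilization over the same window


def downsize_candidates(stats, cpu_limit=20.0, mem_limit=40.0):
    """Return IDs of instances that are idle on both CPU and memory.

    Both thresholds are illustrative assumptions. An instance must be under
    both, because low CPU alone can hide a memory-bound workload.
    """
    return [
        s.instance_id
        for s in stats
        if s.avg_cpu_pct < cpu_limit and s.avg_mem_pct < mem_limit
    ]


fleet = [
    InstanceStats("i-aaa", avg_cpu_pct=8.0, avg_mem_pct=20.0),   # textbook downsize
    InstanceStats("i-bbb", avg_cpu_pct=12.0, avg_mem_pct=85.0),  # memory-bound, keep
    InstanceStats("i-ccc", avg_cpu_pct=55.0, avg_mem_pct=30.0),  # busy, keep
]
print(downsize_candidates(fleet))  # ['i-aaa']
```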
The adjacent, even-cheaper win is killing what shouldn't be running at all: idle and "zombie" resources. Dev instances left on over nights and weekends. Load balancers and NAT Gateways with no healthy targets. Unattached EBS volumes still billing after their instance was terminated. Old EBS snapshots and AMIs nobody will ever restore. Provisioned-but-idle RDS instances from a migration that finished months ago. These cost real money and deliver zero value — deleting them is the highest savings-per-minute work on AWS.
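One of those zombie checks, unattached EBS volumes, can be sketched as a filter over volume records. The input below mirrors the shape of entries in the EC2 DescribeVolumes response, where a `State` of `available` means the volume is not attached to anything. The sample data and the flat per-GB price are assumptions for illustration.

```python
def unattached_volumes(volumes, price_per_gb_month=0.08):
    """Return (volume_id, size_gib, est_monthly_cost) for unattached volumes.

    `volumes` is a list of dicts shaped like DescribeVolumes entries.
    The price is an assumed flat per-GB rate, not a quoted one.
    """
    found = []
    for vol in volumes:
        if vol["State"] == "available":
            cost = round(vol["Size"] * price_per_gb_month, 2)
            found.append((vol["VolumeId"], vol["Size"], cost))
    # Largest first, so the biggest recoveries get reviewed first.
    return sorted(found, key=lambda item: item[1], reverse=True)


# Hypothetical inventory.
sample = [
    {"VolumeId": "vol-001", "State": "in-use", "Size": 100},
    {"VolumeId": "vol-002", "State": "available", "Size": 50},
    {"VolumeId": "vol-003", "State": "available", "Size": 500},
]
for volume_id, size, cost in unattached_volumes(sample):
    print(volume_id, size, cost)
```

Treat the output as a review list rather than a delete list: an unattached volume is sometimes a deliberate backup, so snapshot or confirm ownership before removing it.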
A close cousin of right-sizing is scheduling and auto-scaling. Non-production environments rarely need to run 168 hours a week; an instance scheduler that stops dev/test outside business hours (say 50–60 running hours a week) cuts their compute cost by roughly 65–70% on its own, though attached EBS volumes keep billing while the instances are stopped. Production fronted by an Auto Scaling group means you pay for the capacity demand actually requires instead of a fixed peak-sized fleet.
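The 65–70% figure falls straight out of the hours. A small sketch, assuming a weekday-only schedule (the function name and the schedules shown are illustrative):

```python
HOURS_PER_WEEK = 168


def schedule_savings(hours_per_day, days_per_week):
    """Fraction of instance-hours saved versus running around the clock."""
    running = hours_per_day * days_per_week
    if not 0 <= running <= HOURS_PER_WEEK:
        raise ValueError("running hours must be between 0 and 168 per week")
    return 1 - running / HOURS_PER_WEEK


# Weekdays only, with 10, 11, or 12 running hours per day.
for hours in (10, 11, 12):
    print(hours, f"{schedule_savings(hours, 5):.1%}")
# 10 70.2%
# 11 67.3%
# 12 64.3%
```

This covers instance-hours only; storage attached to a stopped instance is billed regardless of the schedule.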
This lever is mostly low-risk and reversible — you can resize back up in minutes if you cut too far — which is exactly why it belongs in the quick-wins bucket alongside storage cleanup. The deeper read-through on the EC2 side is the dedicated ec2-right-sizing page in this cluster.
Right-sizing and commitments optimize the compute you run as-is. Spot and Graviton change the compute itself — bigger savings, more engineering, real tradeoffs.
Spot Instances sell AWS's spare capacity at up to ~90% off on-demand. The catch is honest and important: AWS can reclaim a Spot instance with a two-minute warning when it needs the capacity back. That makes Spot perfect for interruptible, stateless, or retry-tolerant work — batch jobs, CI/CD runners, data processing, rendering, and stateless services behind a queue or load balancer. It also makes Spot ideal for Kubernetes and container workloads: tools like Karpenter or EKS managed node groups blend Spot and on-demand automatically, draining Spot nodes gracefully on the reclaim signal. The rule of thumb: anything that can lose a node and recover belongs on Spot; anything stateful that can't tolerate a sudden interruption (a primary database, a stateful singleton) does not.
Graviton is AWS's ARM-based processor line. Migrating compute from x86 (Intel/AMD) to Graviton instance families typically delivers ~20–40% better price-performance for the same workload: more throughput per dollar. The effort depends on your stack. Interpreted and managed runtimes (Python, Node, Java, .NET), Go (which cross-compiles to arm64), and most managed services (RDS, ElastiCache, OpenSearch, Lambda) move to ARM with little or no code change; anything with compiled native dependencies, specific AMIs, or x86-only third-party agents needs a rebuild and a test pass. Because Graviton is a structural change, do it before you lock in EC2 Instance Savings Plans; otherwise you commit to an x86 family you're about to leave.
These two compose well. A mature compute setup often looks like: a Savings-Plan-covered Graviton baseline for steady-state services, Spot for everything interruptible, and on-demand only for the spiky top layer that isn't worth committing or risking on Spot. Each layer is the cheapest viable option for that slice of demand. The cluster has dedicated pages for ec2-spot-instances and aws-graviton-migration that go deeper on the migration mechanics.
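To see how the layers add up, here is a back-of-the-envelope sketch. Every number in it is hypothetical: the split of spend across layers and the discount per layer are placeholders, not quoted AWS rates.

```python
def blended_monthly_cost(layers):
    """Sum the discounted cost of each layer.

    layers: list of (name, on_demand_equivalent_usd, discount) tuples, where
    discount is the assumed fraction off on-demand for that layer.
    """
    return round(sum(cost * (1 - discount) for _, cost, discount in layers), 2)


# Hypothetical fleet that would cost $10,000/month entirely on-demand on x86.
layers = [
    ("savings-plan graviton baseline", 6000.0, 0.40),
    ("spot for interruptible work", 3000.0, 0.65),
    ("on-demand spiky top layer", 1000.0, 0.0),
]
all_on_demand = sum(cost for _, cost, _ in layers)
blended = blended_monthly_cost(layers)
print(blended, f"{1 - blended / all_on_demand:.1%}")  # 5650.0 43.5%
```

The point of the exercise is less the total than the shape: most of the saving comes from the two discounted layers, so the job is to move as much demand as safely possible out of the on-demand layer.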
Compute gets the attention, but storage and data transfer are where bills quietly balloon — and where a few configuration changes recover real money with almost no risk.
Storage optimization is mostly about putting data in the right tier and not paying for data you don't need. On S3, Intelligent-Tiering automatically moves objects between access tiers based on usage, so cold data drifts to cheaper storage without you writing lifecycle rules or risking retrieval surprises — for buckets with unknown or changing access patterns it's close to a default-on win. Where access patterns are known, explicit lifecycle policies (transition to Infrequent Access, then Glacier classes, then expire) do the same thing more aggressively.
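An explicit lifecycle policy of that kind can be sketched as the dictionary below. It follows the shape boto3's S3 client accepts as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`; the helper name, the `logs/` prefix, and the day counts are assumptions for illustration, and the sketch only builds the document rather than applying it.

```python
def lifecycle_rules(prefix, ia_days=30, glacier_days=90, expire_days=365):
    """Build a tier-then-expire lifecycle configuration for one key prefix."""
    # S3 requires objects to be at least 30 days old before moving to
    # STANDARD_IA, and the steps only make sense in increasing order.
    if not 30 <= ia_days < glacier_days < expire_days:
        raise ValueError("expected 30 <= ia_days < glacier_days < expire_days")
    return {
        "Rules": [
            {
                "ID": f"tier-and-expire-{prefix.strip('/') or 'all'}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_days},
            }
        ]
    }


config = lifecycle_rules("logs/")
print(config["Rules"][0]["ID"])  # tier-and-expire-logs
```

Keeping the rule scoped to a prefix is a deliberate choice here: a bucket-wide expiry is easy to write and hard to undo.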
On block storage, the cleanest single win is migrating EBS volumes from gp2 to gp3: gp3 is roughly 20% cheaper per GB and lets you provision IOPS and throughput independently, so you stop paying for capacity just to get performance. Combine that with deleting unattached volumes and pruning the snapshot backlog (snapshots are incremental but the backlog still adds up), and storage typically gives back 20–30% of its line with zero application change.
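The gp2 versus gp3 comparison is easy to check per volume. The prices below are assumptions that roughly match published us-east-1 list prices at the time of writing, and they vary by region, so treat them as placeholders and confirm against the current EBS pricing page.

```python
# Assumed per-month prices; verify against the EBS pricing page for your region.
GP2_PER_GB = 0.10
GP3_PER_GB = 0.08
GP3_INCLUDED_IOPS = 3000      # baseline IOPS included with every gp3 volume
GP3_INCLUDED_MBPS = 125       # baseline throughput included with every gp3 volume
GP3_PER_EXTRA_IOPS = 0.005
GP3_PER_EXTRA_MBPS = 0.04


def gp2_monthly(size_gb):
    """gp2 bills on size only; performance scales with the size you buy."""
    return round(size_gb * GP2_PER_GB, 2)


def gp3_monthly(size_gb, iops=GP3_INCLUDED_IOPS, throughput_mbps=GP3_INCLUDED_MBPS):
    """gp3 bills size, plus any IOPS or throughput above the included baseline."""
    extra_iops = max(0, iops - GP3_INCLUDED_IOPS)
    extra_mbps = max(0, throughput_mbps - GP3_INCLUDED_MBPS)
    total = (
        size_gb * GP3_PER_GB
        + extra_iops * GP3_PER_EXTRA_IOPS
        + extra_mbps * GP3_PER_EXTRA_MBPS
    )
    return round(total, 2)


print(gp2_monthly(500), gp3_monthly(500))  # 50.0 40.0
print(gp3_monthly(500, iops=6000))         # 55.0
```

The second line is the caveat worth knowing: gp3 is only cheaper across the board at baseline performance, so volumes that need extra provisioned IOPS should be priced individually before migrating.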
Data transfer is the silent killer — the charges nobody provisioned and everybody pays. The usual culprits: NAT Gateway data-processing charges (every GB through a NAT Gateway is billed on top of the hourly cost, and chatty private subnets rack this up fast), cross-AZ traffic (replication and service-to-service calls that hop Availability Zones are billed per GB in both directions), and internet egress (data leaving AWS to the public internet). The fixes are architectural: route AWS-service traffic through VPC endpoints (Gateway endpoints for S3 and DynamoDB are free and bypass the NAT entirely), keep chatty components in the same AZ, and put a CDN in front of egress-heavy traffic. This lever takes more thought than flipping a storage class, but on data-movement-heavy workloads it's one of the largest hidden recoveries on the bill. The aws-data-transfer-costs and nat-gateway-cost pages in this cluster break the mechanics down line by line.
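For the NAT Gateway piece, a rough estimate shows why Gateway endpoints are usually the first fix. The hourly and per-GB rates below are assumptions in line with published us-east-1 pricing at the time of writing; the traffic volume and the share bound for S3 and DynamoDB are hypothetical.

```python
# Assumed rates; check the VPC pricing page for your region.
NAT_HOURLY = 0.045
NAT_PER_GB = 0.045
HOURS_PER_MONTH = 730


def nat_monthly_cost(gb_processed, gateways=1):
    """Hourly charge per gateway plus the per-GB data-processing charge."""
    hourly = gateways * NAT_HOURLY * HOURS_PER_MONTH
    processing = gb_processed * NAT_PER_GB
    return round(hourly + processing, 2)


def gateway_endpoint_savings(gb_processed, share_to_s3_dynamodb):
    """Processing charges avoided by sending S3/DynamoDB traffic through
    Gateway endpoints instead of the NAT Gateway."""
    return round(gb_processed * share_to_s3_dynamodb * NAT_PER_GB, 2)


# Hypothetical: 20 TB/month through one NAT Gateway, 70% of it bound for S3.
print(nat_monthly_cost(20_000))               # 932.85
print(gateway_endpoint_savings(20_000, 0.7))  # 630.0
```

In this example the hourly charge is noise and the per-GB charge is the bill, which is why the fix is rerouting traffic rather than removing gateways.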
Every lever above is a one-time recovery. Governance is what stops the waste from coming back — and it's the difference between a cost-cutting project and a cost-optimized organization.
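You can't manage what you can't see, and you can't allocate what you don't tag. The visibility layer is the foundation everything else stands on: Cost Explorer for analysis, AWS Budgets for thresholds and alerts, Cost Anomaly Detection for unexpected spikes, and cost allocation tags for attribution.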
Tie those tools together with the FinOps operating model — the discipline of bringing engineering, finance, and product together to manage cloud spend as a continuous practice. It runs in three repeating phases. Inform: get visibility and allocation right (tagging, dashboards, showback) so every team can see what it spends. Optimize: act on that visibility — right-size, commit, re-architect, the levers above. Operate: make it continuous — anomaly alerts, regular commitment reviews, cost as a first-class metric in engineering decisions and a line in every architecture review. Showback (showing teams their spend) drives most of the behavior change; chargeback (actually billing it to their budget) drives the rest. The aws-finops page in this cluster is the full operating-model deep-dive; aws-budgets, aws-cost-explorer, aws-cost-anomaly-detection, and aws-cost-allocation-tags each cover one tool in depth.
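A simple health metric for the Inform phase is tag coverage: the share of spend that can be attributed to an owner. The sketch below assumes a hypothetical list of resources with a monthly cost and a tag dictionary, and a hypothetical policy that requires `team` and `env` tags; neither comes from an AWS API.

```python
def tag_coverage(resources, required_tags=("team", "env")):
    """Return the fraction of spend on resources carrying every required tag."""
    total = sum(r["monthly_cost"] for r in resources)
    if total == 0:
        return 1.0  # nothing is spent, so nothing is unallocated
    tagged = sum(
        r["monthly_cost"]
        for r in resources
        if all(tag in r.get("tags", {}) for tag in required_tags)
    )
    return tagged / total


# Hypothetical inventory.
inventory = [
    {"id": "db-1", "monthly_cost": 600.0, "tags": {"team": "payments", "env": "prod"}},
    {"id": "i-123", "monthly_cost": 300.0, "tags": {"team": "search"}},  # no env tag
    {"id": "nat-9", "monthly_cost": 100.0},                              # untagged
]
print(f"{tag_coverage(inventory):.0%}")  # 60%
```

Weighting by cost rather than by resource count keeps attention on the untagged resources that actually matter to showback.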
Knowing the levers isn't enough; sequence is what separates a 10% dent from a 40% reduction. Run them in a deliberate order so each step sets up the next.
Split the work into two buckets. Quick wins are cheap, reversible, and recover 10–20% in days: turn on Cost Explorer and a budget so you can measure, delete idle/zombie resources, migrate gp2→gp3, enable S3 Intelligent-Tiering, prune unattached volumes and stale snapshots, and schedule non-prod environments off outside business hours. None of this needs an architecture change or a meeting — it's pure cleanup, and it funds the appetite for the harder work.
Structural work recovers the next 20–40% but takes weeks and real engineering: build the commitment portfolio (Savings Plans + RIs sized to your post-right-sizing baseline), migrate eligible workloads to Graviton, move interruptible compute to Spot, and re-architect data transfer (VPC endpoints, AZ-aware placement, CDN). These pay back larger but demand testing, sequencing, and sometimes a rollback plan.
The order that works in practice: (1) measure — stand up visibility so you have a baseline and can prove savings; (2) clean up — kill idle resources and do the quick storage wins, because there's no point committing to or right-sizing waste; (3) right-size — get every resource to its correct shape so you know your true baseline; (4) re-architect compute — Graviton and Spot, so you're committing to the shape you'll actually keep; (5) commit — buy Savings Plans and RIs against that stable, right-sized, modernized baseline; (6) govern — turn on anomaly detection, tagging, and a review cadence so the savings don't erode. Commit last, not first: committing before you clean up and right-size locks you into paying for waste.
Imagine an instance running at 10% utilization costing $1,000/month on-demand. Commit first and you lock in ~$300/month (a 3-year Savings Plan at roughly 70% off) for an oversized box. Right-size it two sizes down to ~$250/month on-demand first, then commit, and you pay ~$75/month for the same work, without being contractually stuck paying for capacity you didn't need for the next three years. Same levers, same effort, and a bill roughly 4× lower ($925/month saved instead of $700), because the order was right.
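The arithmetic in that example is worth running once. The sketch below uses the figures from the text and assumes a flat 70% commitment discount, which is what the $1,000 to $300 step implies; the function name is illustrative.

```python
def committed_cost(on_demand_monthly, discount=0.70):
    """Monthly cost after applying an assumed commitment discount."""
    return round(on_demand_monthly * (1 - discount), 2)


original = 1000.0    # oversized instance, on-demand
rightsized = 250.0   # the same workload two sizes down, on-demand

commit_first = committed_cost(original)
rightsize_first = committed_cost(rightsized)

print(commit_first, rightsize_first)                        # 300.0 75.0
print(original - commit_first, original - rightsize_first)  # 700.0 925.0
print(commit_first / rightsize_first)                       # 4.0
```

The monthly saving grows from $700 to $925, and the resulting bill is a quarter of the commit-first bill. The larger difference is what is locked in: three years of $300 versus three years of $75.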
The levers are knowable; the hard part is finding which ones apply to your specific bill, sizing the opportunity, and actually doing the rework while the team ships product. That's the gap an audit closes.
You can find a lot yourself. Open Cost Explorer and sort spend by service to see where the money concentrates. Pull Compute Optimizer's right-sizing recommendations. List unattached EBS volumes and idle Elastic IPs. Check whether your steady-state compute is covered by commitments or bleeding on-demand. Look at the data-transfer line and trace it to NAT Gateways and cross-AZ traffic. This first pass alone usually surfaces the obvious 10–20%.
The deeper recovery — the structural 20–40% — is where most teams stall, not for lack of knowledge but for lack of time. Sizing a commitment portfolio correctly, planning a Graviton migration, re-architecting data transfer, and standing up a FinOps practice is a real project, and the engineers who'd do it are the same engineers shipping the roadmap. This is exactly what a vetted AWS partner does: a structured cost audit (frequently anchored on a Well-Architected Review of the Cost Optimization pillar), a prioritized findings list with dollar estimates, and then the hands-on rework.
Here's the part that surprises people. For qualifying, credit-eligible engagements, AWS funds the partner-led optimization work — the partner is paid through AWS partner-funding programs, and a Well-Architected Review can unlock remediation credits that offset the cost of fixing what it finds. The net result for the customer is that they cut their AWS bill for $0. Honest framing: AWS-funding applies to qualifying engagements, not every situation; where it doesn't apply, it's a vetted-partner referral that pays for itself many times over in the savings it surfaces. Either way, the customer doesn't pay CloudRoute — the partner does, as a routing commission.
That's the CloudRoute model in one line: you tell us your stack and roughly what you're spending, we route you to a partner who's done this for companies at your stage and scale, and they run the audit and the rework. The startup persona detail at /for/startup walks through what that looks like for an early-stage team, and the aws-bill-audit and aws-cost-optimization-tools pages in this cluster cover the audit scope and the tooling in depth. If you also want the credits angle, the $100K Activate path at /aws-credits/100k-aws-credits and the Well-Architected Review at /devops/aws-well-architected-review are the natural next reads.
The whole playbook on one screen. Savings ranges are representative as of 2026 (check Cost Explorer / the AWS pricing pages for current rates); "applies to" is the slice of the bill each lever moves. Run them roughly top-to-bottom, but always measure and clean up before you commit.
| Lever | How it works | Typical savings | Effort | Applies to |
|---|---|---|---|---|
| Savings Plans + RIs (commitments) | Commit 1–3 yr to spend/usage for a billing discount | up to ~72% off on-demand | Low (no rework) | Steady-state compute + managed DBs |
| Right-sizing + idle cleanup | Match size to utilization; delete zombies | 15–50% on affected resources | Low–Medium | Over-provisioned + idle resources |
| Spot Instances | Buy spare capacity; tolerate 2-min reclaim | up to ~90% off on-demand | Medium (needs fault-tolerance) | Interruptible / stateless / batch / k8s |
| Graviton migration | Move x86 → ARM instance families | ~20–40% better price-performance | Medium (rebuild + test) | Most compute + managed services |
| Storage tiering + EBS cleanup | S3 Intelligent-Tiering; gp2→gp3; prune | 20–30% of storage spend | Low | S3 + EBS + snapshots |
| Data-transfer re-architecture | VPC endpoints, AZ-aware placement, CDN | Large on transfer-heavy bills | Medium–High | NAT, cross-AZ, internet egress |
| Governance (FinOps) | Tagging, budgets, anomaly detection, reviews | No direct cut; keeps the savings above from eroding | Ongoing | The whole account, continuously |
Situation: Bill had roughly tripled in 14 months as the platform scaled, with no commitments in place and nobody owning cost. Everything ran on-demand; the EKS fleet was x86 and statically sized; the data-transfer line had crept past $6K/month with nobody able to explain it; RDS was on-demand across three production engines. The two engineers who could fix it were fully allocated to a launch.
What CloudRoute did: CloudRoute routed within 24 hours to a partner with EKS + FinOps depth who ran a Well-Architected Cost Optimization review. Quick wins first: deleted idle/zombie resources, gp2→gp3 across all volumes, S3 Intelligent-Tiering, non-prod scheduled off nights/weekends. Then structural: right-sized the EKS nodes and RDS instances, moved stateless services to Spot via Karpenter (Graviton where it ported cleanly), added S3/DynamoDB Gateway endpoints to gut the NAT charges, and sized a Compute Savings Plan + RDS reservations against the new, right-sized baseline. Tagging + Budgets + Cost Anomaly Detection stood up last.
Outcome: Steady-state bill dropped ~38% (from ~$47K to ~$29K/month — roughly $216K/year saved) with no change to what the product does. The engagement qualified for AWS partner funding plus Well-Architected remediation credits, so the audit and rework cost the customer $0; CloudRoute's commission was paid by the partner. FinOps review cadence now runs monthly with anomaly alerts wired to Slack.
engagement window: ~6 weeks · steady-state reduction: ~38% · annualized saving: ~$216K · cost to customer: $0
CloudRoute routes you to a vetted AWS partner who audits your bill, builds the prioritized plan, and does the rework. For qualifying engagements it's AWS-funded, so you cut the bill for $0. No procurement, no discovery theater.