Not a generic listicle. This is an ordered playbook: the fifteen things that actually move an AWS bill, each with a realistic savings range, an honest effort rating, and the exact how-to. Do the week-one moves first (idle resources, gp2→gp3, anomaly alerts — near-zero effort, often 10–20% off), then the structural ones (Savings Plans, Graviton, NAT/cross-AZ surgery). When the rework is bigger than your team has time for, a CloudRoute-matched partner runs the audit and does the cutting — often AWS-funded, so qualifying teams cut the bill for $0.
The reason most "cut your AWS bill" lists fail is that they're unordered. They put "negotiate an EDP" next to "delete an unattached EBS volume" as if those are the same kind of work. They are not. The right mental model is savings-per-unit-of-effort, sequenced so you bank the free wins before you touch anything structural.
Three numbers decide the order of every lever: how much it saves (as a percentage of the affected spend), how much effort it takes (engineer-hours plus risk), and how reversible it is. A move that saves 60% but locks you into a 3-year commitment is not a week-one move; a move that saves 8% with zero risk and ten minutes of work absolutely is. Rank by savings-per-effort, do the cheap-and-safe ones immediately, and only then spend political capital on the structural changes.
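To make the ordering rule concrete, here is a minimal sketch of savings-per-effort ranking. The `Lever` class, the `priority` function, and the 0.25 lock-in penalty are hypothetical illustrations, not a standard FinOps formula; tune the weights to your own risk appetite.

```python
from dataclasses import dataclass


@dataclass
class Lever:
    name: str
    savings_pct: float   # percent of the affected spend
    effort_hours: float  # engineer-hours, padded for risk
    reversible: bool


def priority(lever: Lever, lockin_penalty: float = 0.25) -> float:
    """Savings per hour of effort, marked down when the move is hard to undo."""
    score = lever.savings_pct / max(lever.effort_hours, 0.1)
    return score if lever.reversible else score * lockin_penalty


# The two examples from the paragraph above (hours are illustrative guesses).
levers = [
    Lever("3-year commitment", savings_pct=60, effort_hours=20, reversible=False),
    Lever("ten-minute cleanup", savings_pct=8, effort_hours=0.2, reversible=True),
]
ranked = sorted(levers, key=priority, reverse=True)
```

Under these assumptions the ten-minute cleanup ranks first even though its headline saving is far smaller, which is the point of sequencing by effort and reversibility.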
There's a FinOps framing worth knowing because it maps cleanly onto the sequence: inform → optimize → operate. Inform is visibility — you cannot cut what you cannot see, so Cost Explorer, tagging, and anomaly alerts come first. Optimize is the actual reduction — right-sizing, commitments, storage tiering, transfer surgery. Operate is making it stick — budgets, ownership, and showback so the savings don't silently regrow. That last phase matters because cloud spend grows back: cut 35% in a quarter and it creeps back over the next two if nobody owns the number, which is why the playbook below ends on governance deliberately.
Everything in this section is reversible, needs no purchasing decision, and can be done by one engineer in a few afternoons. This is where you should start, full stop. These five moves alone routinely take 10–20% off a neglected bill.
The defining feature of week-one work is that you're deleting waste, not changing how the product runs — nothing here can interrupt a workload or lock you into anything. If a stakeholder is nervous about cost work touching production, point them here first: it builds trust before you ask for the bigger changes.
What it is: Resources that are running and billing but doing no work: idle EC2 instances at near-0% CPU, dev boxes nobody turned off, load balancers with no targets, Elastic IPs left allocated but unassociated after an instance was terminated (AWS charges for idle EIPs), and old RDS instances kept "just in case."
How: Open AWS Compute Optimizer and Cost Explorer's resource-level view; filter EC2 by low CPU/network over the last 14–30 days. Cross-check the ELB console for load balancers with zero healthy targets and the EC2 console for unassociated Elastic IPs. Build a kill-list, confirm ownership, and terminate. For anything you're unsure about, stop the instance first (you still pay for EBS but not compute) and delete a week later if nobody screams. Why it's first: it's pure waste — compute that produces literally nothing, with no tradeoff to weigh.
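A kill-list can be drafted from exported data before anyone touches the console. The sketch below assumes you have already pulled daily average CPU per instance (for example from CloudWatch) and the address list in the shape the EC2 DescribeAddresses call returns; the function names and the 2% threshold are hypothetical choices, not AWS defaults.

```python
from statistics import mean


def idle_instances(cpu_by_instance, max_avg_cpu=2.0, min_datapoints=14):
    """Return instance IDs whose average daily CPU stayed under the threshold.

    cpu_by_instance maps instance ID -> list of daily average CPU percentages.
    Instances with too little history are skipped rather than flagged.
    """
    return sorted(
        instance_id
        for instance_id, samples in cpu_by_instance.items()
        if len(samples) >= min_datapoints and mean(samples) <= max_avg_cpu
    )


def unassociated_eips(addresses):
    """Return allocation IDs of Elastic IPs that have no AssociationId."""
    return [a["AllocationId"] for a in addresses if "AssociationId" not in a]


# Sample data standing in for real exports.
cpu = {
    "i-0aaa": [0.5] * 14,
    "i-0bbb": [35.0] * 14,
    "i-0ccc": [0.2] * 3,  # too little history to judge
}
addresses = [
    {"AllocationId": "eipalloc-1", "PublicIp": "203.0.113.10"},
    {"AllocationId": "eipalloc-2", "PublicIp": "203.0.113.11",
     "AssociationId": "eipassoc-9"},
]
kill_list = idle_instances(cpu)
idle_eips = unassociated_eips(addresses)
```

The output is only a candidate list. Confirming ownership and stopping before terminating still applies.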
What it is: gp3 is the newer general-purpose SSD volume type and is roughly 20% cheaper per GB than gp2, while letting you provision IOPS and throughput independently. For most workloads it is a strictly better deal — same or better performance, lower price.
How: A gp2→gp3 modification is an online change: no downtime, no detach. Inventory your gp2 volumes (Cost Explorer or a quick CLI/Config query), then modify each to gp3 via the console or aws ec2 modify-volume. gp3 includes a baseline 3,000 IOPS and 125 MB/s free; only provision extra if a volume genuinely needs it. Script it and you can convert hundreds of volumes in an afternoon. Tradeoff: essentially none for typical general-purpose volumes. The exception is large or throughput-heavy volumes: gp2 baseline IOPS scale at 3 per GiB, so a gp2 volume above 1,000 GiB already gets more than gp3's included 3,000 IOPS. Provision enough gp3 IOPS and throughput to match what the volume actually uses, but no more.
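If you script the conversion, the main decision per volume is whether the included gp3 baseline is enough. This is a minimal sketch that builds the arguments for an EC2 ModifyVolume call; `plan_gp3` is a hypothetical helper, it matches only the gp2 baseline IOPS (3 per GiB, minimum 100, maximum 16,000), and it leaves throughput at the included 125 MB/s, so check throughput-heavy volumes separately.

```python
GP3_BASELINE_IOPS = 3000  # included with every gp3 volume


def gp2_baseline_iops(size_gib: int) -> int:
    """gp2 baseline: 3 IOPS per GiB, floor of 100, cap of 16,000."""
    return max(100, min(3 * size_gib, 16000))


def plan_gp3(volume_id: str, size_gib: int) -> dict:
    """Build keyword arguments for a gp2 -> gp3 ModifyVolume request."""
    params = {"VolumeId": volume_id, "VolumeType": "gp3"}
    needed = gp2_baseline_iops(size_gib)
    if needed > GP3_BASELINE_IOPS:
        # Provisioned IOPS above the baseline are billed extra, so only
        # request them when the gp2 volume was already entitled to more.
        params["Iops"] = needed
    return params


small = plan_gp3("vol-small", 200)
large = plan_gp3("vol-large", 2000)

# With boto3 installed and credentials configured, each plan could be applied
# with: boto3.client("ec2").modify_volume(**small)
```

Matching the old gp2 entitlement is a conservative default; if CloudWatch shows the volume never approaches it, dropping the extra IOPS saves more.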
What it is: EBS snapshots accumulate forever unless something prunes them, and unattached ("available") volumes keep billing per GB after the instance they belonged to is gone. Both are silent, steady waste.
How: List volumes in the available state and confirm none are needed before deleting. For snapshots, identify ones older than your real recovery window (often 30–90 days) with no AMI dependency and delete them — or, better, automate retention with Amazon Data Lifecycle Manager so this never builds up again. Watch for snapshots underpinning AMIs you still use. Tradeoff: snapshots are your backups, so the fix is a sane retention policy, not zero retention.
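The retention rule (older than the recovery window, not backing an AMI) is easy to express as a filter. A minimal sketch, assuming snapshot and image records in the shape returned by the EC2 DescribeSnapshots and DescribeImages calls; the function names and the 90-day default are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def ami_snapshot_ids(images):
    """Collect snapshot IDs referenced by AMI block device mappings."""
    return {
        mapping["Ebs"]["SnapshotId"]
        for image in images
        for mapping in image.get("BlockDeviceMappings", [])
        if "SnapshotId" in mapping.get("Ebs", {})
    }


def prunable_snapshots(snapshots, images, retention_days=90, now=None):
    """Return IDs of snapshots past retention that no AMI depends on."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    in_use = ami_snapshot_ids(images)
    return [
        s["SnapshotId"]
        for s in snapshots
        if s["StartTime"] < cutoff and s["SnapshotId"] not in in_use
    ]


# Sample data with a fixed "now" so the result is reproducible.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
snapshots = [
    {"SnapshotId": "snap-old", "StartTime": now - timedelta(days=200)},
    {"SnapshotId": "snap-ami", "StartTime": now - timedelta(days=200)},
    {"SnapshotId": "snap-new", "StartTime": now - timedelta(days=5)},
]
images = [
    {"ImageId": "ami-1",
     "BlockDeviceMappings": [{"DeviceName": "/dev/xvda",
                              "Ebs": {"SnapshotId": "snap-ami"}}]},
]
to_delete = prunable_snapshots(snapshots, images, retention_days=90, now=now)
```

This is a one-off cleanup aid. For the ongoing policy, Amazon Data Lifecycle Manager remains the better home for the rule.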
What it is: AWS Cost Anomaly Detection uses ML to flag unusual spend and email you when something jumps; AWS Budgets sends threshold alerts. Together they stop the "surprise $40K month" before it compounds.
How: Enable Cost Anomaly Detection (it's free), create a monitor for the whole account or per-service, and wire alerts to email/Slack. Add a couple of AWS Budgets — one on total monthly spend, one on the noisiest service — with alerts at 80% and 100% of expected. Ten minutes of setup. Why now: it doesn't cut today's bill, but it protects every cut you're about to make and catches regressions automatically — the cheapest insurance in cloud cost management.
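The 80% and 100% alerts can be captured as code so every account gets the same budget. The sketch below builds a request in the shape of the AWS Budgets CreateBudget API; `budget_request`, the account ID, and the email address are placeholders for illustration.

```python
def budget_request(account_id, name, monthly_limit_usd, email,
                   thresholds=(80, 100)):
    """Build keyword arguments shaped for the AWS Budgets CreateBudget call."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": name,
            "BudgetLimit": {"Amount": str(monthly_limit_usd), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        # One notification per threshold, each based on actual spend.
        "NotificationsWithSubscribers": [
            {
                "Notification": {
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": float(pct),
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [
                    {"SubscriptionType": "EMAIL", "Address": email}
                ],
            }
            for pct in thresholds
        ],
    }


request = budget_request("123456789012", "total-monthly", 95000,
                         "finops@example.com")

# With boto3 installed and credentials configured, this could be sent with:
# boto3.client("budgets").create_budget(**request)
```

A second call with a different name and limit covers the noisiest service; add a cost filter to the `Budget` block for that case.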
What it is: Small recurring charges that add up — unattached Elastic IPs, idle NAT Gateways in forgotten VPCs, leftover test endpoints, and public IPv4 addresses (AWS now charges hourly for all public IPv4, including attached ones).
How: Release Elastic IPs you're not using, audit each VPC for NAT Gateways that no longer need outbound internet, and reduce public IPv4 addresses now that they carry an hourly charge. Tradeoff: none, as long as you confirm nothing is actively using the address or endpoint.
Week-one moves are individually small but collectively meaningful — 10–20% off a neglected bill is a common combined result, with zero risk and zero spend commitment. You bank real savings before touching anything structural, which buys credibility for the harder moves.
These moves take more effort and, in the case of commitments, a real purchasing decision — but they're where the largest savings live. This is the heart of the playbook.
Two things change in month-one work. First, you start touching how workloads run (right-sizing changes instance types — validate performance after). Second, you make commitments — Savings Plans and Reserved Instances trade flexibility for a big discount. Both are worth it, but they deserve more care than the week-one deletions.
What it is: Most instances are bigger than the workload needs because someone guessed high at launch and never revisited. Right-sizing matches the instance to actual utilization — dropping a size, switching families, or consolidating.
How: AWS Compute Optimizer analyzes recent CloudWatch metrics (the last 14 days by default) and recommends specific instance types per resource with projected savings; start with the highest-cost, lowest-utilization instances. It only sees memory utilization if the CloudWatch agent is publishing it, so check memory yourself before downsizing memory-bound workloads. For RDS, check CPU, connections, and freeable memory before dropping a class. For ECS/EKS, right-size task/pod requests, because over-provisioned container requests waste cluster capacity invisibly.
Tradeoff: Under-sizing hurts performance, so move one size at a time, validate after each change, and keep headroom for spikes.
What it is: The single biggest lever. Savings Plans give a large discount in exchange for committing to an hourly spend ($/hr) for 1 or 3 years. Compute Savings Plans are the flexible kind — they apply across EC2, Fargate, and Lambda regardless of region/family/OS — at a slightly smaller discount. EC2 Instance Savings Plans give a deeper discount but lock you to an instance family in a region.
How: Right-size first (#6) — never commit to an oversized baseline — then use Cost Explorer's Savings Plans recommendations to find your stable hourly floor. Cover the predictable baseline with commitments and leave the spiky top on-demand or Spot. Start conservative (~60–70% of baseline) on 1-year No-Upfront for flexibility, layer more as confidence grows, and reserve 3-year All-Upfront for the truly permanent core where the deeper discount is worth the lock-in.
Tradeoff (be honest): Commitments reduce flexibility — you pay the committed $/hr whether or not you use it. Over-commit and you're paying for capacity you don't need; under-commit and you leave discount on the table. The Compute SP's cross-service flexibility is usually worth giving up a few discount points for, especially for a fast-changing startup.
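The sizing logic can be sketched in a few lines. This assumes you have an hourly series of on-demand-equivalent compute spend (for example exported from Cost Explorer); the 10th-percentile floor, the 65% coverage, and the 30% discount are illustrative assumptions, not AWS recommendations. A Savings Plan commitment is expressed in discounted dollars per hour, so the covered on-demand amount is multiplied by (1 - discount).

```python
def hourly_floor(hourly_spend, percentile=0.10):
    """Approximate the stable floor as a low percentile of hourly spend."""
    ordered = sorted(hourly_spend)
    index = int(percentile * (len(ordered) - 1))
    return ordered[index]


def commitment_per_hour(hourly_spend, coverage=0.65, sp_discount=0.30):
    """Hourly commitment that covers `coverage` of the floor.

    sp_discount is an assumed blended Savings Plan discount; use the rate
    Cost Explorer shows for your actual usage mix.
    """
    covered_on_demand = hourly_floor(hourly_spend) * coverage
    return round(covered_on_demand * (1 - sp_discount), 2)


# One day of sample data: a $10/hr floor with a four-hour spike to $25/hr.
hourly_spend = [10.0] * 20 + [25.0] * 4
floor = hourly_floor(hourly_spend)
commit = commitment_per_hour(hourly_spend)
```

With these sample numbers the floor is $10/hr and the commitment comes out at $4.55/hr, leaving the spike and the remaining baseline on on-demand or Spot. Cost Explorer's own recommendations should be the cross-check before buying anything.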
What it is: EC2 has largely moved to Savings Plans, but Reserved Instances are still the discount mechanism for the managed data services — RDS/Aurora, ElastiCache, Redshift, and OpenSearch. Same idea: commit 1 or 3 years for a steep discount over on-demand.
How: Your databases are usually your most stable spend, which makes them ideal RI candidates. Right-size the instance class first, then buy RIs to match the steady baseline. Check Cost Explorer's RI recommendations per service. No/Partial/All-Upfront trades cash-now for a slightly bigger discount.
Tradeoff: Same lock-in logic as Savings Plans, but databases rarely change shape week to week, so the commitment risk is lower than on compute.
What it is: Most S3 buckets keep everything in Standard forever, even data nobody has touched in a year. Lifecycle policies and storage classes move cold data to far cheaper tiers automatically.
How: For data with predictable aging (logs, backups), set lifecycle rules to transition to Standard-IA, then Glacier Instant/Flexible/Deep Archive, then expire. For unpredictable access patterns, turn on S3 Intelligent-Tiering — it auto-moves objects between tiers based on access and removes the guesswork (with a tiny monitoring fee per object). Also expire incomplete multipart uploads and old object versions, which silently accumulate.
Tradeoff: Colder tiers have retrieval latency and/or retrieval fees, so don't archive hot data. Match the class to the access pattern — that's exactly what Intelligent-Tiering automates.
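For the predictable-aging case, a lifecycle rule is just a small configuration document. The sketch below builds one in the shape accepted by the S3 PutBucketLifecycleConfiguration API; the `logs/` prefix and the day counts are hypothetical, and the validation reflects S3's minimum-duration rules as we understand them (30 days before Standard-IA, and a further 30 before Glacier).

```python
def log_lifecycle_config(prefix="logs/", ia_after=30, archive_after=90,
                         expire_after=365):
    """Build a lifecycle configuration: Standard-IA, then Glacier, then expire."""
    if ia_after < 30:
        raise ValueError("Standard-IA transition needs at least 30 days")
    if archive_after < ia_after + 30:
        raise ValueError("archive transition should follow IA by 30+ days")
    if expire_after <= archive_after:
        raise ValueError("expiration must come after the last transition")
    return {
        "Rules": [
            {
                "ID": f"tier-and-expire-{prefix.strip('/')}",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": ia_after, "StorageClass": "STANDARD_IA"},
                    # "GLACIER" is the Glacier Flexible Retrieval class.
                    {"Days": archive_after, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after},
                # The two silent accumulators mentioned above.
                "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            }
        ]
    }


config = log_lifecycle_config()

# With boto3 installed, this could be applied with:
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=config)
```

Scoping the rule to a prefix keeps hot data elsewhere in the bucket untouched, which is the safer default when access patterns are mixed.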
What it is: Dev, staging, and QA environments rarely need to run nights and weekends, yet most run 24/7. A workday-only schedule (say 12 hours × 5 days = 60 of 168 hours) cuts those non-prod compute and database hours by roughly 65%.
How: Use AWS Instance Scheduler, native EventBridge rules, or a tag-driven Lambda to stop EC2/RDS in the evening and start them in the morning. Tag environments by purpose so the scheduler knows what's safe to stop. Auto Scaling groups for non-prod can scale to zero overnight. Stopped instances still bill for their EBS and database storage, so the saving lands on compute hours.
Tradeoff: Make sure nothing critical (nightly batch, a demo env, an on-call's sandbox) gets stopped. Tag-based exclusions handle the exceptions.
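The arithmetic behind the schedule, plus the check a tag-driven scheduler would run, can be sketched as follows. The 08:00 start and the function names are hypothetical; a real scheduler would also need time zones and the tag-based exclusions mentioned above.

```python
from datetime import datetime

HOURS_PER_WEEK = 24 * 7  # 168


def off_hours_reduction(hours_per_day=12, days_per_week=5):
    """Fraction of weekly running hours removed by a workday-only schedule."""
    running = hours_per_day * days_per_week
    return 1 - running / HOURS_PER_WEEK


def should_run(when, start_hour=8, hours_per_day=12, days_per_week=5):
    """True if a scheduled non-prod resource should be up at `when`."""
    is_workday = when.weekday() < days_per_week  # Monday is 0
    in_window = start_hour <= when.hour < start_hour + hours_per_day
    return is_workday and in_window


reduction = off_hours_reduction()  # 1 - 60/168
monday_morning = should_run(datetime(2024, 6, 3, 9))
saturday_morning = should_run(datetime(2024, 6, 1, 9))
```

For the 12 × 5 schedule the reduction works out to 1 - 60/168, about 64% of running hours.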
Commitments are the biggest single lever, but the rule is fixed: right-size first, then commit to the baseline, then let Spot absorb the spiky top. Buying Savings Plans or RIs on an over-provisioned fleet locks in waste at a discount — you want the discount on the correct baseline, not the bloated one.
These moves take real engineering work or organizational change, so they live on a quarter horizon. They're where you re-architect for cost — moving fault-tolerant workloads to Spot, migrating to Graviton, performing surgery on data-transfer charges no dashboard surfaces by default — and where the closing governance moves stop everything you just did from regrowing.
What it is: Spot Instances are spare EC2 capacity sold at up to ~90% off on-demand, with the catch that AWS can reclaim them on two minutes' notice. For interruption-tolerant work, that discount is enormous.
How: Ideal Spot workloads: CI/CD runners, batch and data processing, rendering, ML training with checkpointing, and stateless web tiers behind a load balancer. On EKS, run Spot node groups via Karpenter or managed node groups with capacity-type Spot; mix Spot with a smaller on-demand or Savings-Plan-covered base for resilience. Diversify across instance types and AZs so a single capacity reclaim doesn't take everything down.
Tradeoff (be honest): Spot can be interrupted, so never put stateful single-instance databases or anything that can't tolerate a sudden stop on pure Spot. Architect for interruption (checkpoints, graceful drain, on-demand fallback) and Spot is close to free money; ignore that and it bites.
What it is: AWS Graviton processors are ARM-based and deliver materially better price-performance than equivalent x86 instances across EC2, RDS, ElastiCache, Lambda, and more — commonly cited at ~20–40% depending on workload.
How: Start where the lift is smallest: managed services (RDS/ElastiCache/OpenSearch) often only need an instance-class switch to a Graviton class. For your own apps, modern runtimes (Java, Go, Node, Python, .NET) generally run on ARM with little to no code change — rebuild container images as multi-arch and test. Lambda functions can flip to arm64 with a config change. Benchmark a non-prod slice, then roll forward.
Tradeoff: Some native dependencies or older binaries need an ARM build, so test before fleet-wide rollout. The price-performance gain is durable once you're there.
What it is: The silent killer of AWS bills. NAT Gateways charge both an hourly fee and a per-GB data-processing fee on everything routed through them; cross-AZ traffic is billed in both directions; internet egress is billed per GB. These line items hide inside "EC2-Other" and surprise teams at scale.
How: Add free VPC Gateway Endpoints for S3 and DynamoDB so that traffic skips the NAT Gateway entirely (often the single biggest NAT saving), and Interface Endpoints (PrivateLink) for other AWS services that would otherwise egress via NAT. Interface Endpoints carry their own hourly and per-GB charges, so add them only where they cost less than the NAT traffic they replace. Audit chatty cross-AZ patterns (replicas, brokers, service-to-service calls), co-locate tightly-coupled components in one AZ where availability allows, and review whether every private subnet truly needs its own NAT Gateway.
Tradeoff: Collapsing services into one AZ trades a little resilience for lower transfer cost — make that call per workload based on its availability target, not blanket.
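A rough model helps decide whether NAT surgery is worth the effort. The sketch below adds the hourly fee and the per-GB processing fee; the $0.045 rates and 730 hours per month are illustrative placeholders, so substitute the current numbers for your region from the AWS pricing pages.

```python
def nat_monthly_cost(gb_processed, gateways=1, hourly_rate=0.045,
                     per_gb_rate=0.045, hours=730):
    """Monthly NAT Gateway cost: fixed hourly fee plus data processing."""
    return gateways * hourly_rate * hours + gb_processed * per_gb_rate


def gateway_endpoint_saving(total_gb, s3_dynamodb_gb, per_gb_rate=0.045):
    """Processing fee avoided when S3/DynamoDB traffic bypasses the NAT."""
    return min(s3_dynamodb_gb, total_gb) * per_gb_rate


# Hypothetical estate: 3 NAT Gateways, 50 TB/month, 30 TB of it to S3.
before = nat_monthly_cost(50_000, gateways=3)
saving = gateway_endpoint_saving(50_000, 30_000)
after = before - saving
```

In this hypothetical the processing fee dwarfs the hourly fee, which is why moving S3 and DynamoDB traffic onto gateway endpoints tends to matter more than removing a gateway.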
What it is: You can't do showback/chargeback, can't find an owner for waste, and can't hold teams accountable without consistent tags. Cost Allocation Tags break the bill down by team, environment, product, and cost-center in Cost Explorer and the Cost and Usage Report (CUR).
How: Define a small, enforced tag schema (e.g. owner, env, service, cost-center), activate them as Cost Allocation Tags in the Billing console, and enforce on creation with Tag Policies / SCPs and IaC defaults. Backfill existing resources. Then build per-team cost views so each team sees its own spend.
Tradeoff: Tagging is unglamorous and needs enforcement to stay clean, but without it cost work has no accountability and the savings quietly regrow.
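Backfilling starts with knowing which resources fail the schema. A minimal sketch, assuming tags arrive as the Key/Value lists most AWS describe calls return; the required keys are the example schema from above and the function names are hypothetical.

```python
REQUIRED_TAGS = ("owner", "env", "service", "cost-center")


def missing_tags(resource_tags, required=REQUIRED_TAGS):
    """Return required tag keys that are absent or have an empty value."""
    present = {t["Key"] for t in resource_tags if t.get("Value")}
    return [key for key in required if key not in present]


def untagged_report(resources):
    """Map resource ID -> missing tag keys, for resources failing the schema."""
    report = {}
    for resource_id, tags in resources.items():
        gaps = missing_tags(tags)
        if gaps:
            report[resource_id] = gaps
    return report


# Sample inventory standing in for a real export.
resources = {
    "i-0aaa": [
        {"Key": "owner", "Value": "platform"},
        {"Key": "env", "Value": "prod"},
        {"Key": "service", "Value": "api"},
        {"Key": "cost-center", "Value": "1001"},
    ],
    "i-0bbb": [
        {"Key": "owner", "Value": ""},
        {"Key": "env", "Value": "dev"},
    ],
}
report = untagged_report(resources)
```

A report like this is for the backfill only. Enforcement on creation still belongs in Tag Policies, SCPs, and IaC defaults.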
What it is: The difference between a one-time 30% cut and a permanently lower run-rate. This is the "operate" phase — budgets with owners, anomaly alerts wired to the right people, a monthly cost review, and showback so each team owns its number.
How: Put a recurring (monthly) cost review on the calendar with engineering and finance. Give every major cost center a budget and an owner. Make Cost Anomaly Detection findings route to the owning team, not a void. Track unit economics where you can (cost per customer / per request) so efficiency, not just absolute spend, is visible.
Tradeoff: It's ongoing effort rather than a one-shot fix — but it's the only thing that keeps the bill from creeping back up next quarter.
There's a clean line for when to do this in-house versus when to bring in a vetted AWS partner. It's not about capability — your team can do all fifteen moves. It's about capacity, scale, and whether AWS will fund the work.
Do it yourself when the bill is small enough that a few engineer-days clears most of the waste and the levers are the obvious ones — for a lot of early-stage teams the week-one and month-one sections above are genuinely enough. Bring in a partner when (a) the bill is large enough that a few percentage points is real money, (b) the savings live in the harder levers — commitment-portfolio modeling across Savings Plans and RIs, deep right-sizing, NAT/transfer surgery, a Graviton migration — and your team is already at capacity, or (c) you want a Well-Architected Review of the whole estate, not just a cost pass. The work is the same playbook; you're buying focused execution and a second set of expert eyes.
The part most teams don't realize: this is often AWS-funded. AWS funds partner-led cost-optimization and Well-Architected engagements through its partner programs — on qualifying engagements the partner is paid through AWS, not by you — and a Well-Architected Review can unlock remediation credits that offset the cost of the fixes. So on a qualifying, credit-eligible engagement you can cut your bill for $0. The honest caveat: where an engagement doesn't qualify for funding, it's a vetted-partner referral that pays for itself many times over in the savings it finds.
That's the CloudRoute role. You tell us your rough bill and where it hurts; we route you to a vetted AWS partner with a cost-optimization / Well-Architected track record matched to your stack and region. They run the audit, quantify the savings, and do the rework — and where the engagement qualifies, AWS funds it. If you want the credit angle alongside the optimization, the Well-Architected Review path and AWS Activate credits can stack with this.
A partner-led cost audit is often AWS-funded → you cut the bill for $0 on qualifying, credit-eligible engagements, and a Well-Architected Review can unlock remediation credits for the fixes. Where an engagement doesn't qualify, it's a vetted referral that pays for itself in the savings — we won't pretend every engagement is free, but for a lot of teams with a real bill, this one genuinely is.
Plenty of teams start cutting costs and stall, or worse, hurt reliability and conclude "cost work is dangerous." It's almost always one of these five mistakes — all avoidable.
The whole playbook in one table, ordered the way you should actually execute it — highest savings-per-effort and lowest risk at the top. Savings ranges are representative; your numbers depend on your architecture and current waste. Check AWS Cost Explorer and the AWS pricing pages for current rates.
| # | Move | Typical savings | Effort | Horizon |
|---|---|---|---|---|
| 1 | Kill idle / zombie resources | 5–15% | Low | This week |
| 2 | EBS gp2 → gp3 | ~20% on those volumes | Very low | This week |
| 3 | Delete old snapshots + unattached EBS | 2–8% | Low | This week |
| 4 | Cost Anomaly Detection + budget alerts | Prevents future spikes | Very low | This week |
| 5 | Release unattached EIPs / idle endpoints | 1–3% | Very low | This week |
| 6 | Right-size EC2 / RDS / containers | 15–30% | Medium | This month |
| 7 | Savings Plans on steady-state compute | up to ~70% | Medium | This month |
| 8 | Reserved Instances (RDS / cache / Redshift) | up to ~60% | Medium | This month |
| 9 | S3 lifecycle + Intelligent-Tiering | 20–60% on storage | Low–med | This month |
| 10 | Schedule non-prod off-hours | 60–70% on those resources | Low | This month |
| 11 | Spot for batch / stateless | up to ~90% | Med–high | This quarter |
| 12 | Graviton (ARM) migration | ~20–40% price-perf | Med–high | This quarter |
| 13 | Kill NAT / cross-AZ transfer waste | 5–20% | Medium | This quarter |
| 14 | Tagging + cost allocation tags | Enables accountability | Medium | This quarter |
| 15 | FinOps operating loop (budgets/showback) | Makes savings durable | Medium (ongoing) | This quarter |
Situation: Bill had crept from $40K to $95K/month over a year with no FinOps practice. Almost no commitment coverage (everything on-demand), EBS still mostly gp2, an EKS platform over-provisioned at the node and pod level, and a large unexplained "EC2-Other" line nobody could decompose. The platform team knew roughly what to do but was fully allocated to a product launch and couldn't take a multi-week cost project.
What CloudRoute did: Routed within 24 hours to an EU-based AWS Advanced partner with a Well-Architected + FinOps track record. The partner ran a Cost Optimization review: gp2→gp3 across the fleet, deep right-sizing via Compute Optimizer, a Compute Savings Plans portfolio sized to ~65% of the now-right-sized baseline, RIs on the RDS estate, Karpenter Spot node groups for stateless and CI workloads, S3 lifecycle rules, and — the big "EC2-Other" find — S3/DynamoDB gateway endpoints plus cross-AZ co-location that gutted NAT data-processing charges. Tagging and budgets landed last to keep it from regrowing.
Outcome: Run-rate down 38% (from ~$95K to ~$59K/month) within the engagement, with the commitment and transfer fixes contributing the most. The Well-Architected Review unlocked remediation credits that offset the fix work; the partner engagement qualified for AWS funding, so the customer paid $0 for the audit and rework. Anomaly detection + monthly reviews now keep the number flat.
engagement window: ~6 weeks · run-rate cut: 38% (~$36K/mo) · audit cost to customer: $0 (AWS-funded) · biggest levers: commitments + NAT/transfer
CloudRoute routes you to a vetted AWS partner with a cost-optimization / Well-Architected track record matched to your stack and region. On qualifying engagements AWS funds the work, so you cut the bill for $0. Otherwise it's a vetted referral that pays for itself in the savings.