Most migrations don't fail at cutover. They fail before it — in a discovery that missed half the dependencies, a 7-R disposition done by gut, a business case nobody believed, or a landing zone retrofitted under fire. This is the full playbook: how to inventory the estate, decide what happens to each application, build a TCO the CFO signs, lay the account foundation, plan the waves, run the actual cutover with a rollback you've tested, optimize the bill afterward, and bring the org along. The brochure tells you AWS will fund it. This tells you how to run it.
The mythology says cloud migrations fail at cutover — the dramatic weekend where the app doesn't come back up. In practice, cutover is where the earlier mistakes surface, not where they're made. A disciplined playbook front-loads the decisions so the cutover is boring.
Walk backward from any failed migration and you almost always land in the same three places. First: incomplete discovery. Someone migrated an application without knowing it called a database on a server nobody documented, or that a nightly batch job in another data center wrote to it. The app moved; the dependency didn't; the thing broke in a way that looked like a cutover failure but was really a discovery failure.
Second: disposition by gut. The team "lift-and-shifted everything" because it was the fastest-sounding option, then spent the next year paying for over-provisioned EC2 that mirrors hardware bought in 2019, and discovered too late that three of those apps should have been retired and two should have been replaced with SaaS. The wrong R was chosen, and the cost showed up after the migration was declared done.
Third: a business case nobody owned. The migration was justified with a hand-wavy "the cloud is cheaper," finance never saw a defensible TCO, and the moment the first month's AWS invoice came in higher than expected — because nothing had been right-sized yet — the project lost air cover. The migration was technically fine and politically dead.
A playbook fixes all three by sequencing them. You discover before you dispose. You dispose before you build. You build the business case from the disposition, not from a slogan. You migrate in waves so each one teaches the next. And you treat optimization as a named phase with a budget, not as something that "we'll get to." The rest of this piece is that sequence, in order, with the real mechanics at each step.
Inventory everything, decide what happens to each thing, prove the cost case, build the foundation once, move in waves with a tested rollback, then spend real effort making it cheap. Everything below is detail on those six moves.
Discovery is the phase teams most want to rush and least can afford to. You cannot dispose of, cost, or sequence an estate you haven't mapped. The output of discovery is a dependency-aware inventory: every workload, what it runs on, what it talks to, what it costs today, and how hard it is to move.
There are two layers to discover. The infrastructure layer — servers, VMs, storage volumes, databases, load balancers, network paths — and the application layer — which business capabilities those resources actually serve, who owns them, and how they depend on each other. Most teams have a partial view of the first and almost none of the second. The integration map is where the surprises live.
Tooling does the heavy lifting. AWS Application Discovery Service (agentless via a vCenter connector, or agent-based per host) collects server inventory, utilization, and — critically — network connection data so you can see which machines actually talk to which. AWS Migration Hub aggregates that into a single portfolio view and tracks status as you migrate. Third-party tools (Flexera, Device42, the data your CMDB should have) fill gaps. The goal is not a spreadsheet of hostnames; it's a graph of dependencies.
Run discovery long enough to catch the cycles your business actually has: at least two to four weeks, and long enough to span a month-end close. A two-day snapshot misses the month-end close job, the quarterly batch run, the weekend reporting pipeline. Utilization data over a full cycle is also what lets you right-size later instead of lifting over-provisioned capacity one-to-one. Peak and average both matter: size to a sane percentile, not to the single highest spike from a runaway query.
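To make "size to a sane percentile" concrete, here is a minimal sketch in Python. The 95th percentile, the 20% headroom factor, and the sample readings are all illustrative assumptions, not values from any AWS tool; pick your own from the workload's tolerance for contention.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def sizing_target(cpu_samples, pct=95, headroom=1.2):
    """CPU level to size for: the chosen percentile plus headroom (both assumed defaults)."""
    return percentile(cpu_samples, pct) * headroom

# Hypothetical CPU% readings: steady load, a busier band, and one runaway-query spike.
cpu = [20] * 90 + [35] * 9 + [100]

print("max:", max(cpu))
print("p95:", percentile(cpu, 95))
print("sizing target:", sizing_target(cpu))
```

Sizing to the maximum here would provision for the one-off spike at 100; the percentile lands on the busier band instead, which is the level the workload actually sustains.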
The deliverable that ends discovery is an application portfolio: each application as a row, with its servers and databases, its upstream/downstream dependencies, its current monthly cost, an owner's name, and a first-pass complexity rating. That portfolio is the input to the next phase. If a workload isn't on it, it doesn't get migrated — and finding the ones that aren't on it is exactly the point of doing discovery properly.
Inventory completeness: every host accounted for, including the forgotten ones — the jump box, the license server, the "temporary" VM from 2020 that turned out to run payroll.
Dependency edges: network-flow data showing who talks to whom and on what ports, so dependency clusters (the natural migration waves) are visible rather than guessed.
Utilization over a full cycle: CPU, memory, IOPS, and network at average and peak, captured across at least one month-end so right-sizing is grounded in data.
Ownership + business context: a named owner per application and a sense of its criticality, because the disposition and the wave order both depend on it.
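The dependency-edge item above can be sketched as a graph problem: treat each observed network flow as an edge and take the connected components as candidate clusters. This is a minimal illustration using only the standard library; the host names, ports, and the `(source, destination, port)` tuple shape are hypothetical and not the export format of any discovery tool.

```python
from collections import defaultdict

def dependency_clusters(flows):
    """Group hosts into clusters: connected components of the undirected flow graph."""
    graph = defaultdict(set)
    for src, dst, _port in flows:
        graph[src].add(dst)
        graph[dst].add(src)

    seen, clusters = set(), []
    for host in sorted(graph):
        if host in seen:
            continue
        # Depth-first walk collecting everything reachable from this host.
        stack, component = [host], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters

# Hypothetical flow records: (source host, destination host, destination port).
flows = [
    ("web-01", "app-01", 8080),
    ("app-01", "db-01", 5432),
    ("batch-01", "db-01", 5432),
    ("wiki-01", "wiki-db", 3306),
]
clusters = dependency_clusters(flows)
for cluster in clusters:
    print(cluster)
```

In a real estate, shared infrastructure such as DNS, directory services, and monitoring talks to nearly everything and would merge the whole graph into one component, so those hosts generally need to be filtered out before clustering.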
Once you have the portfolio, every application gets exactly one disposition. AWS's framework is the 7 Rs. The discipline is to make the call deliberately — with the cost, risk, and modernization payoff in view — rather than defaulting the whole estate to one R because it sounds fast.
The 7 Rs run from "do nothing" to "rebuild." Most real estates end up as a blend: a long tail of rehost for the undifferentiated workloads, a handful of replatform and refactor for the strategically important ones, some repurchase where SaaS is simply better, and — almost always more than teams expect — a meaningful chunk of retire. The single highest-ROI move in many migrations is discovering how much you can simply turn off.
Retire: Decommission it. Discovery routinely reveals 10–20% of an estate is dead weight: zombie servers, duplicate environments, applications whose users left years ago. Every retired workload is migration effort you don't spend and cloud cost you never incur. Hunt for these first; it's the cheapest win in the entire program.
Retain: Leave it where it is, for now. Some workloads aren't ready: a mainframe mid-contract, an app pending a separate replacement project, something with a hard data-residency or latency constraint. Retain is a legitimate decision, not a failure; the key is that it's a decision, with a revisit date, not an omission.
Rehost: Move it to EC2 essentially as-is, no code changes. This is the workhorse of most migrations: fastest path off the old data center, lowest project risk, and fully automatable with AWS MGN. The trade-off: you inherit yesterday's architecture and pay cloud prices for it until you optimize. Rehost is a great start, rarely the end state for important workloads.
Relocate: Move a whole VMware environment to AWS without converting the VMs; the container/hypervisor moves wholesale (e.g. VMware Cloud on AWS). Useful when you have a large VMware estate and want speed and operational continuity before deciding on per-app modernization. It's rehost's bulkier cousin: minimal change, maximum velocity.
Replatform: Move with targeted modernization that captures managed-service value without a rewrite. The canonical move: migrate a self-managed database onto Amazon RDS or Aurora, or put a stateless app behind a managed load balancer and auto-scaling. You get operational relief and often cost savings for a fraction of a refactor's effort. For many important-but-not-strategic apps, replatform is the sweet spot.
Repurchase: Drop the self-hosted app and move to a SaaS equivalent, such as a self-managed CRM to a SaaS CRM or a homegrown ticketing tool to a commercial one. You stop owning undifferentiated software entirely. The work shifts from migration to data export/import and change management, which is its own effort, but you delete a whole maintenance burden permanently.
Refactor: Re-architect for cloud-native. Break the monolith into services, go serverless, adopt containers on ECS/EKS, redesign the data layer. The highest cost and risk, and the highest ceiling on agility and long-run economics. Reserve it for the workloads where that ceiling actually matters; refactoring the whole estate up front is how migrations turn into multi-year rewrites that never finish.
A pragmatic order for most programs: retire the dead weight first, rehost/relocate the bulk to get off the data center fast, replatform the databases and stateless tiers for early wins, then refactor the few strategic workloads once you're live and the team has cloud reps. Migrate-then-modernize beats modernize-while-migrating for almost everyone.
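One way to keep disposition from being done by gut is to write the first-pass rules down so they can be argued with. The sketch below is a deliberately crude triage, not AWS's method: the `App` fields, the rule order, and the `rehost-then-refactor` label are all assumptions made for illustration, Relocate is left out because it is an estate-level choice rather than a per-app one, and every result should be reviewed with the application owner.

```python
from dataclasses import dataclass

@dataclass
class App:
    name: str
    active_users: int          # hypothetical field: users seen during discovery
    blocked: bool              # contract, data-residency, or latency constraint
    has_saas_equivalent: bool
    strategic: bool
    self_managed_db: bool

def first_pass_disposition(app: App) -> str:
    """Suggest a starting disposition; a human still makes the call."""
    if app.active_users == 0:
        return "retire"
    if app.blocked:
        return "retain"
    if app.has_saas_equivalent and not app.strategic:
        return "repurchase"
    if app.strategic:
        return "rehost-then-refactor"   # migrate first, modernize once live
    if app.self_managed_db:
        return "replatform"
    return "rehost"

# Hypothetical portfolio rows.
portfolio = [
    App("old-intranet", 0, False, False, False, False),
    App("mainframe-ledger", 400, True, False, True, False),
    App("helpdesk", 120, False, True, False, True),
    App("pricing-engine", 900, False, False, True, True),
    App("inventory-db-app", 60, False, False, False, True),
    App("file-share", 300, False, False, False, False),
]
suggestions = {app.name: first_pass_disposition(app) for app in portfolio}
for name, disposition in suggestions.items():
    print(f"{name}: {disposition}")
```

The value is less in the rules themselves than in having them explicit: when someone disagrees with a suggestion, the disagreement is about a named criterion rather than a feeling.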
A migration that finance doesn't believe in dies at the first surprising invoice. The disposition gives you the inputs for a defensible total-cost-of-ownership model — current-state versus future-state, including the costs migrations love to forget.
Current-state TCO is more than the hardware. Add it all up honestly: server and storage refresh cycles, data-center space/power/cooling, network and circuits, software and support licenses, and the loaded labor for the people who keep the lights on. On-prem hides cost in capital and in headcount; surfacing it fairly is what makes the comparison credible rather than a cloud sales pitch.
Future-state on AWS is built from the discovery utilization data, not from list prices on a calculator at default sizes. Right-size from real usage; assume Savings Plans or Reserved Instances for the steady-state baseline (1- and 3-year commitments cut compute meaningfully); tier storage (S3 standard vs. infrequent-access vs. Glacier); and account for data transfer. AWS Pricing Calculator and the Migration Evaluator (formerly TSO Logic) turn the inventory into a defensible projected run-rate.
Then model the migration itself as a project cost — partner/labor effort, dual-running (you pay for both environments during cutover), training, and tooling — against the benefits: lower run-rate, avoided refresh capex, retired-workload savings, and the harder-to-quantify-but-real agility and reliability gains. The output is a payback period and a multi-year cash curve, with a deliberately conservative case so the first real invoice lands inside the model, not outside it.
This is also where funding enters the math. AWS's Migration Acceleration Program can offset a large portion of both the assessment and the migration execution, which compresses payback dramatically, often turning a 12–18 month payback into a single-digit number of months. Funding gets its own section near the end of this playbook, but model it here: the business case is far stronger when the migration is substantially AWS-funded, and weaker if you pretend the funding isn't available.
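The payback arithmetic described above can be sketched in a few lines. Every figure here is hypothetical and chosen only to make the mechanics visible: a one-time project cost, a dual-running period where you pay for both environments, a monthly saving after cutover, and an optional funding offset. Real models would also handle ramped savings, avoided refresh capex, and discounting.

```python
def cash_curve(project_cost, monthly_saving, dual_run_months, dual_run_monthly,
               funding=0, horizon=60):
    """Cumulative cash position at the end of each month (negative = not yet paid back)."""
    position = funding - project_cost
    curve = []
    for month in range(1, horizon + 1):
        if month <= dual_run_months:
            position -= dual_run_monthly   # paying for both environments
        else:
            position += monthly_saving     # lower run-rate after cutover
        curve.append(position)
    return curve

def payback_month(curve):
    """First month where the cumulative position is non-negative, or None within the horizon."""
    for month, position in enumerate(curve, start=1):
        if position >= 0:
            return month
    return None

# Hypothetical inputs, in dollars.
unfunded = cash_curve(project_cost=300_000, monthly_saving=30_000,
                      dual_run_months=3, dual_run_monthly=20_000)
funded = cash_curve(project_cost=300_000, monthly_saving=30_000,
                    dual_run_months=3, dual_run_monthly=20_000, funding=200_000)

print("payback without funding (months):", payback_month(unfunded))
print("payback with funding (months):", payback_month(funded))
```

With these made-up inputs the unfunded case pays back in month 15 and the funded case in month 9. Running the same function with a pessimistic `monthly_saving` gives the conservative case the business case should lead with.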
The landing zone is the secure, multi-account AWS environment your workloads land in. Retrofitting governance after you've migrated is painful and risky; building it right before the first wave is one of the highest-leverage things you do. Done well, it's invisible — and that's the point.
A modern landing zone is multi-account by default. AWS Organizations with separate accounts for production, non-production, security/log-archive, and shared services gives you blast-radius isolation, clean billing boundaries, and per-environment guardrails. AWS Control Tower stands this up with sensible defaults — an account factory, mandatory guardrails (preventive SCPs and detective controls), centralized logging, and an identity baseline — so you don't hand-build it.
The foundation has a handful of pillars you want settled before any workload arrives. Identity: federate to your IdP via IAM Identity Center, no long-lived root or shared credentials. Network: a deliberate VPC and connectivity design — Transit Gateway for hub-and-spoke, Direct Connect or VPN back to on-prem for the cutover window, non-overlapping CIDR ranges planned up front. Security and logging: CloudTrail org-wide, centralized GuardDuty and Security Hub, a locked-down log-archive account. Cost governance: consolidated billing, tagging standards enforced from day one, and budgets/alerts wired before spend starts.
Tagging deserves a specific call-out because it's cheap now and expensive later. Agree the tags — owner, environment, cost-center, application — and enforce them as workloads land. The teams that skip this spend the optimization phase trying to figure out who owns the mystery instance costing $2,000 a month. The teams that do it can attribute every dollar on day one of go-live.
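Enforcement can start as a simple check that runs before anything is deployed. The sketch below validates tag dictionaries against the four tags named above; the resource IDs and tag values are hypothetical, and in practice the same rule would more likely live in an organization-level tag policy or an IaC lint step than in a standalone script.

```python
REQUIRED_TAGS = ("owner", "environment", "cost-center", "application")

def missing_tags(tags):
    """Return the required tag keys that are absent or blank on a resource."""
    return [key for key in REQUIRED_TAGS if not str(tags.get(key, "")).strip()]

# Hypothetical resources keyed by ID, each with its tag dictionary.
resources = {
    "i-0aaa": {"owner": "payments-team", "environment": "prod",
               "cost-center": "cc-104", "application": "billing"},
    "i-0bbb": {"owner": "", "environment": "dev"},
}

violations = {rid: gaps for rid, tags in resources.items() if (gaps := missing_tags(tags))}
for rid, gaps in violations.items():
    print(f"{rid} is missing: {', '.join(gaps)}")
```

Treating a blank value as missing matters: an `owner` tag that exists but is empty is exactly how the mystery instance ends up unattributable later.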
Build the landing zone as code. Infrastructure-as-code (Terraform or CloudFormation/CDK) makes the foundation reproducible, reviewable, and consistent across accounts — and it's what lets you scale from a pilot to dozens of accounts without the configuration drift that turns a clean environment into a snowflake within a quarter. The landing zone is the one part of the migration where "measure twice, cut once" pays the highest dividend.
You do not migrate an estate in a single cutover. You group applications into waves and move them in a deliberate order, learning and de-risking as you go. Wave planning is where the dependency graph from discovery turns into a schedule.
The first wave is a proving ground, chosen for low risk and high learning. Pick something real but tolerant — an internal tool, a non-critical app with a clean dependency profile — and use it to validate the landing zone, the cutover runbook, the tooling, and the rollback before anything customer-facing is on the line. The goal of wave one is not throughput; it's to find the gaps in your process while the stakes are low.
Sequence the rest by dependency cluster and risk. Applications that talk to each other should generally move together or back-to-back, so you're not running chatty traffic across the on-prem/cloud boundary for weeks (latency and egress cost both bite). Within that constraint, climb the risk ladder: internal before external, simple before complex, batch before real-time, tier-2 before tier-1. Each wave gets its own runbook, test plan, cutover window, and explicit rollback criteria.
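The risk ladder above translates naturally into a sort key over dependency clusters. This sketch assumes each cluster has already been scored on the four dimensions; the field names, the cluster names, and the choice to let earlier dimensions dominate later ones are illustrative assumptions, not a prescribed weighting.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    external: bool     # customer-facing rather than internal
    complex: bool      # many integrations or unusual components
    real_time: bool    # real-time rather than batch
    tier: int          # 1 = most critical

def risk_key(cluster):
    """Lower sorts first: internal, simple, batch, tier-2 workloads lead the schedule."""
    return (cluster.external, cluster.complex, cluster.real_time, cluster.tier == 1)

# Hypothetical clusters from the dependency graph.
candidates = [
    Cluster("billing", external=True, complex=True, real_time=True, tier=1),
    Cluster("wiki", external=False, complex=False, real_time=False, tier=2),
    Cluster("partner-api", external=True, complex=False, real_time=True, tier=2),
    Cluster("reporting", external=False, complex=True, real_time=False, tier=2),
]

wave_order = [c.name for c in sorted(candidates, key=risk_key)]
for number, name in enumerate(wave_order, start=1):
    print(f"wave {number}: {name}")
```

The first entry in the sorted list is the natural wave-one candidate. A real schedule would layer business constraints on top, such as freeze periods and owner availability, which a sort key alone cannot express.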
Plan for the hybrid interim. During the migration you'll run hybrid — some workloads on AWS, their dependencies still on-prem, or vice versa — and the connectivity, data-sync, and latency of that interim state need design, not improvisation. Database replication (DMS in ongoing-replication mode) and a solid network path are what keep a split application coherent across the boundary while you finish its wave.
Track everything in one place. Migration Hub (or an equivalent tracker) gives a portfolio-wide status board so stakeholders can see what's migrated, what's in flight, and what's queued. Migrations lose momentum when progress is invisible; a live board, a per-wave retro that feeds the next wave, and a steady cadence are what carry a multi-month program to the finish.
This is the part everyone fears and, with the right tooling and runbook, the part that should be the most boring. For servers, AWS Application Migration Service (MGN). For databases, AWS Database Migration Service (DMS). For each workload, the same four-beat loop — and a rollback you have actually tested.
AWS MGN is the primary rehost engine. You install a lightweight agent on each source server; MGN continuously replicates the entire machine — OS, apps, data — into a staging area in your AWS account with block-level replication, while the source keeps running. Nothing is disrupted during replication. When you're ready, MGN spins up the target instance from the replicated state. Because replication is continuous, the actual cutover downtime is short — typically minutes — rather than the long copy-and-pray windows of manual migration.
The decisive feature of MGN is the test launch. Before you touch production, you launch test instances from the replicated data into an isolated subnet and validate the application end-to-end — does it boot, does it connect to its dependencies, does it serve traffic, does it perform. You can test as many times as you need, fixing issues against a non-production target, while the source stays live and replication continues. By cutover time, you've already seen the migrated app work. That's what turns the cutover from a gamble into a confirmation.
AWS DMS handles the data tier. It migrates databases with the source online, supports homogeneous moves (Oracle→Oracle, MySQL→MySQL) and heterogeneous ones (Oracle→Aurora PostgreSQL, with the Schema Conversion Tool handling schema and code), and — crucially — runs in change-data-capture mode: do a full load, then continuously replicate ongoing changes so the target stays in sync until you flip. That ongoing replication is what shrinks database cutover downtime to the brief window where you stop writes to the source, let the last changes drain, and point the application at the target.
Two rules make this loop safe. First: never decommission the source until the migrated workload has run clean in production for an agreed soak period (days, sometimes a week or two for critical systems) — the intact source is your rollback path, and keeping it warm is the cheap insurance that makes aggressive cutovers reasonable. Second: write the rollback steps down and time-box the decision — "if we are not green by 02:00, we roll back" — so nobody improvises the abort call mid-incident. A cutover with a tested rollback is a calculated, reversible step; without one it is a bet.
1 · Test. Launch test instances (MGN) and a synced target database (DMS), in isolation. Run the full validation suite — functional, integration, performance, security. Iterate until it passes cleanly. The source is untouched throughout.
2 · Cutover. In the planned window: quiesce the source (stop new writes / drain connections), let replication catch up to zero lag, launch the production target from the latest replicated state, and switch traffic (DNS / load balancer / connection string). Minutes, not hours, when the prep is right.
3 · Validate. Immediately run post-cutover checks against production: health endpoints, key user journeys, data integrity spot-checks, dependency connectivity, error rates and latency dashboards. Have the owner confirm against pre-agreed success criteria, not a vibe.
4 · Rollback (if needed). If validation fails the criteria, execute the documented rollback — re-point traffic to the still-intact source. This is why you don't decommission the source at cutover. A rollback you've rehearsed turns a failed cutover into a deferred one instead of an incident.
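Beats two through four of the loop, plus the time-boxed abort rule, can be sketched as a small orchestration function. Nothing here calls MGN or DMS; the `cutover`, `rollback`, and check callables are hypothetical stand-ins for real runbook steps, and the injected `now` exists only so the deadline logic can be exercised without waiting for a real clock.

```python
from datetime import datetime

def run_cutover(cutover, checks, rollback, deadline, now=datetime.now):
    """Cut over, validate, and roll back if any check fails or the deadline has passed.

    checks maps a check name to a zero-argument callable that returns True when green.
    """
    cutover()
    failed = [name for name, check in checks.items() if not check()]
    if failed or now() >= deadline:
        rollback()
        return "rolled_back", failed
    return "live", failed

# Hypothetical stand-ins; they only record which steps ran.
log = []
status, failed = run_cutover(
    cutover=lambda: log.append("cutover"),
    checks={"health_endpoint": lambda: True, "row_counts_match": lambda: False},
    rollback=lambda: log.append("rollback"),
    deadline=datetime(2030, 1, 1, 2, 0),           # "if we are not green by 02:00"
    now=lambda: datetime(2030, 1, 1, 1, 30),
)
print(status, failed, log)
```

Encoding the success criteria as named checks is the point: the abort decision becomes a function of pre-agreed results and a clock, not a debate held mid-incident.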
Going live is the midpoint, not the finish. A pure lift-and-shift, un-optimized, frequently costs more than on-prem for the first few months, because you've mirrored over-provisioned hardware at cloud prices. The 30–50% savings the business case promised come from the optimization phase, and they have to be worked for.
Right-sizing is the first and biggest lever. The instances you migrated were sized for old hardware and peak-of-peaks; AWS Compute Optimizer and Cost Explorer's rightsizing recommendations, read against real post-migration utilization, routinely find double-digit-percent waste. Downsize the over-provisioned, modernize to current-generation instance families (better price/performance), and consider Graviton (Arm) where the workload supports it for a further step down in cost.
Pricing commitments are the second lever. Once you understand your steady-state baseline — a month or two of real running gives you this — cover it with Savings Plans or Reserved Instances. Compute Savings Plans are flexible across instance family and region; a 1- or 3-year commitment on the predictable baseline is one of the largest single line-item reductions available, often 30–50% off on-demand for the committed portion. Leave burst and experimental capacity on-demand or Spot.
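A quick calculation shows why the discount on the committed portion is not the discount on the whole bill. The split between baseline and burst spend and the 40% discount below are hypothetical inputs; actual Savings Plans and Reserved Instance rates depend on term, payment option, instance family, and region.

```python
def monthly_compute_cost(baseline_on_demand, burst_on_demand, commit_discount_pct):
    """Monthly cost when the baseline is covered by a commitment and burst stays on-demand."""
    if not 0 <= commit_discount_pct < 100:
        raise ValueError("discount must be a percentage in [0, 100)")
    return baseline_on_demand * (100 - commit_discount_pct) / 100 + burst_on_demand

# Hypothetical monthly figures at on-demand rates, in dollars.
baseline, burst = 8_000, 2_000
on_demand_total = baseline + burst

committed_total = monthly_compute_cost(baseline, burst, commit_discount_pct=40)
overall_saving_pct = (on_demand_total - committed_total) * 100 / on_demand_total

print("on-demand total:", on_demand_total)
print("with commitment:", committed_total)
print("overall saving %:", overall_saving_pct)
```

With these inputs a 40% discount on the committed baseline works out to a 32% reduction on the total compute line, because the burst portion still pays full price. That gap is worth stating explicitly in the business case so nobody reads the headline discount as the bill reduction.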
Storage and the long tail come next. Move infrequently-accessed data to cheaper S3 tiers or lifecycle it to Glacier; right-size and modernize EBS volumes (gp3 over gp2); delete orphaned snapshots and unattached volumes; tier database storage appropriately. Individually small, collectively large — and easy to automate with lifecycle policies once tagging (set up in the landing zone) tells you what's what.
Finally, deferred modernization. The strategic workloads you rehosted to move fast are now candidates for the replatform/refactor you postponed — managed databases, containers, serverless where it fits — to capture the operational and cost benefits a straight lift didn't. Build a FinOps habit here: a standing cost-review cadence, anomaly alerts, and ownership of the bill, so optimization is continuous rather than a one-time cleanup. The migration earns its business case in this phase or not at all.
The technology is the more predictable half. The half that quietly sinks migrations is organizational: skills, operating model, and the people whose daily work changes. A playbook that ignores the human layer ships a technically-correct migration that the organization can't actually run.
Skills come first because they gate everything else. The team that ran physical servers may not yet know IAM, VPC design, IaC, or cloud cost management. Plan for it deliberately — training, certifications, pairing engineers with an experienced partner during the early waves so knowledge transfers rather than evaporating when the engagement ends. The fastest migrations pair a partner's velocity with intentional upskilling so the in-house team can operate the estate on day one of go-live, not month six.
The operating model changes, not just the infrastructure. Cloud rewards a different shape of team — often a small platform/landing-zone group owning the foundation and guardrails, with application teams owning their own deployments on top. A Cloud Center of Excellence (even a lightweight one) to set standards, share patterns, and steward cost and security is a common and effective structure. Decide who owns the landing zone, who owns cost, and who owns security before go-live, or those responsibilities fall through the cracks the moment the project team disbands.
And manage the change for the people affected. Communicate the why and the timeline; involve application owners in their workload's disposition and cutover so it's done with them, not to them; set expectations about the hybrid interim and the post-cutover soak. Executive sponsorship matters more than any single technical decision — a migration with visible leadership backing survives the inevitable bumps; one without it stalls the first time a wave slips or an invoice surprises. The playbook isn't only servers and databases; it's the organization that will live with the result.
The reason this playbook is economically realistic for most companies is that AWS will co-fund a large share of it through the Migration Acceleration Program (MAP). MAP maps almost one-to-one onto the phases above — which is why it can offset the assessment, the foundation, and the per-workload migration work.
MAP runs in three phases that mirror this playbook. Assess funds the discovery and business-case work: the portfolio inventory, the TCO, and the readiness assessment (often delivered as a funded Migration Readiness Assessment workshop). Mobilize funds building the foundation and proving the approach: the landing zone, the first-wave pilot, the runbooks and skills ramp. Migrate provides credits scaled to the migrated workloads as you move them at production scale. The funding is structured to follow exactly the sequence a sound migration already follows.
The headline mechanics: MAP credits are typically calculated as a percentage of the AWS consumption the migrated workloads will generate, and stack with general Activate-style credits in many cases, so a substantial migration can net out a long way below sticker — frequently near cash-neutral once the assessment and migration credits are applied against the dual-running and execution costs. The exact tiers, percentages, and how the money actually lands are their own topic; this playbook is the execution arc that the funding wraps around.
The practical sequencing: file MAP early — during discovery, not after — because the Assess funding is meant to pay for discovery itself, and because the program is partner-filed through the AWS Partner Network rather than a form you submit alone. Get the assessment funded, use it to produce the disposition and the business case, then let Mobilize and Migrate funding carry the build and the waves. Run the migration on this playbook; let MAP pay for as much of it as the program allows.
This section is the funding overview. For the full money side — exact MAP tiers, percentages, how credits are calculated and where they land — see the two companion cornerstones: AWS MAP funding explained (the mechanics and the math) and how a MAP-funded migration actually works (the funded walkthrough, phase by phase). This playbook is the migration; those are the dollars.
The disposition decision is the highest-leverage call in the playbook. This is the practical shape of each R — how much effort it takes, how much project risk it carries, what it does to cost, and when it's the right move.
| Disposition | Effort | Project risk | Cost impact | Best for |
|---|---|---|---|---|
| Retire | Minimal | Very low | Pure savings | Dead/duplicate workloads (often 10–20% of the estate) |
| Retain | None (now) | None | No change | Not-yet-ready or constrained workloads, with a revisit date |
| Rehost | Low | Low | Neutral until optimized | The bulk — undifferentiated workloads, speed off the data center |
| Relocate | Low | Low | Neutral until optimized | Large VMware estates moving wholesale, fast |
| Replatform | Medium | Medium | Often lower (managed services) | Databases → RDS/Aurora; stateless tiers; early wins |
| Repurchase | Medium | Medium | Removes maintenance burden | Apps with a strong SaaS equivalent |
| Refactor | High | High | Highest long-run ceiling | The few strategic, differentiating workloads |
Situation: A hard deadline. The data-center lease ended in nine months, so "do nothing" wasn't an option. The team had a partial server inventory, no dependency map, and a board that had been told "cloud is cheaper" without a defensible number behind it. Internal ops staff knew VMware well and AWS barely, and there was no appetite to fund a year of consulting out of pocket.
What CloudRoute did: Routed to an AWS Advanced-tier migration partner who filed MAP and ran the playbook. Funded Assess produced a dependency-mapped portfolio and a TCO; disposition came back as 22% retire (zombie/duplicate VMs nobody had caught), ~60% rehost via MGN, 12% replatform (self-managed databases onto Aurora and RDS via DMS), and a handful retained. Mobilize stood up a Control Tower landing zone and a low-risk first wave (internal tooling) to prove the runbook and the rollback. Remaining workloads moved in five dependency-clustered waves, each on the test → cutover → validate → rollback loop, sources kept warm through a one-week soak.
Outcome: All workloads off the data center two weeks ahead of the lease deadline; zero production cutovers had to be rolled back (two were aborted in test and fixed before the window). Post-migration right-sizing plus Compute Savings Plans on the steady-state baseline landed the run-rate ~38% below the old all-in data-center cost. MAP Assess + Migrate credits offset the large majority of the execution and dual-running spend; the partner's fee was covered by AWS engagement funding. CloudRoute was paid by the partner; the customer paid $0 to be routed.
estate: ~180 VMs · waves: 6 · production cutovers rolled back: 0 · run-rate cut: ~38% · routing cost to customer: $0
CloudRoute routes you to a vetted AWS migration partner who files MAP, runs the discovery → cutover → optimize arc, and transfers the skills to your team. Customer pays $0 to be routed. No procurement theater.