Infrastructure as code (IaC) is how you stop clicking around the AWS console and define your VPC, IAM, ECS/EKS, RDS, and DNS as versioned, reviewable code. This is the practitioner reference: how Terraform/OpenTofu, CDK, CloudFormation, and Pulumi really compare, how state, modules, and CI/CD fit together, how to test it, and how to migrate off a hand-built account — plus how CloudRoute routes you to a vetted AWS partner who builds the foundation for you (often AWS-funded, so $0 if you qualify).
Infrastructure as code means describing your cloud resources in declarative files that live in Git, then letting a tool reconcile reality with that description. On AWS specifically, that's the difference between an account anyone can rebuild and an account that lives in one engineer's head.
The console is fine for learning and for poking at one resource, but it falls apart the moment you have more than one environment, more than one engineer, or a compliance requirement: nobody remembers which security-group rule was added at 2am during an incident, or why the staging VPC has a different CIDR than production. ClickOps — building infrastructure by hand in the console — produces snowflake environments that drift, can't be reviewed, and can't be rebuilt. IaC fixes this by making the desired state explicit: you write, for example, "this VPC has these three subnets, this ECS service runs four tasks behind this ALB, this RDS instance is Multi-AZ with seven-day backups." The tool diffs that against what exists, shows you a plan, and applies it — so the same files produce dev, staging, prod, and the DR region from one source of truth with a handful of variables changed.
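The "diff, plan, apply" loop described above can be sketched in a few lines. This is a toy illustration and not how any particular tool is implemented: the resource names (`vpc.main`, `ecs.api`, and so on) and the `make_plan` helper are hypothetical, and real tools compare far richer resource schemas.

```python
from typing import Any

Resources = dict[str, dict[str, Any]]


def make_plan(desired: Resources, actual: Resources) -> dict[str, list[str]]:
    """Diff the desired state (code) against the actual state (the account)."""
    plan: dict[str, list[str]] = {"create": [], "update": [], "delete": []}
    for name, config in desired.items():
        if name not in actual:
            plan["create"].append(name)   # in code, missing from the account
        elif actual[name] != config:
            plan["update"].append(name)   # exists, but has drifted from code
    for name in actual:
        if name not in desired:
            plan["delete"].append(name)   # in the account, no longer in code
    return {action: sorted(names) for action, names in plan.items()}


# Hypothetical desired state, mirroring the example in the text.
desired = {
    "vpc.main": {"cidr": "10.0.0.0/16", "subnets": 3},
    "ecs.api": {"tasks": 4},
    "rds.primary": {"multi_az": True, "backup_days": 7},
}
# Hypothetical current state of the account.
actual = {
    "vpc.main": {"cidr": "10.0.0.0/16", "subnets": 3},
    "ecs.api": {"tasks": 2},
    "nat.orphan": {},
}

plan = make_plan(desired, actual)
print(plan)
```

With these inputs the plan proposes creating the database, updating the ECS service, and deleting the orphaned NAT gateway. Diffing the desired state against itself yields an empty plan, which is the "code matches reality" condition.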
The payoffs that matter to a startup: repeatability (stand up an identical environment in a new region in hours, not weeks), review (infra changes go through pull requests, so a second person sees the blast radius before it ships), auditability (Git history is your change log: who changed what, when, and why, which auditors love for SOC 2 and ISO 27001), recoverability (if a region or account is lost, you re-apply the code), velocity with guardrails (engineers self-serve infrastructure through PRs while policy checks block the dangerous changes), and cost control (when every resource is in code you can see the whole footprint and delete the orphaned NAT gateways and forgotten load balancers that hand-built accounts accumulate).
IaC (Terraform, CDK, CloudFormation, Pulumi) provisions cloud resources — VPCs, IAM roles, managed services. Configuration management (Ansible, Chef, Puppet) configures what runs inside servers. On modern AWS you lean heavily on IaC and use little classic config management, because most "inside the box" work moves into container images and ECS/EKS task definitions — themselves defined in IaC.
There are four tools a serious team would actually choose between on AWS today. They split along two axes: declarative config language vs general-purpose programming language, and cloud-agnostic vs AWS-native. The honest one-line version: Terraform/OpenTofu are the default for ecosystem and multi-cloud reach; CDK is the pick if your team is app engineers who want real code and you're happy AWS-only; CloudFormation is the lowest-dependency native option (and the substrate CDK compiles to); Pulumi is the bet for code plus a managed state platform across clouds.
Terraform uses HCL, a declarative configuration language, and has a huge provider ecosystem (AWS, Cloudflare, Datadog, GitHub, and hundreds more), so one tool manages your whole stack, not just AWS. It also has the deepest hiring pool by far.
The 2026 licensing wrinkle: HashiCorp re-licensed Terraform under the Business Source License (BSL) in 2023. For nearly every normal user — companies running their own infrastructure — that changes nothing; the restriction targets vendors who would resell a competing Terraform-as-a-service. In response the community created OpenTofu, a fully open-source (MPL-2.0) fork now under the Linux Foundation. OpenTofu is largely drop-in compatible and the right default if you want a permissive license. Most teams pick one, standardize, and move on; the HCL and module ecosystem is shared.
The AWS Cloud Development Kit lets you define infrastructure in TypeScript, Python, Java, C#, or Go — with loops, conditionals, and IDE autocomplete — then synthesizes down to CloudFormation, so you inherit its drift detection and rollback. CDK shines when your team is application engineers rather than dedicated infra people: high-level "L2/L3" constructs encode AWS best practices (sensible defaults for a load-balanced Fargate service, say) so you write far less boilerplate than raw CloudFormation. The trade-offs: it's AWS-only, the abstraction can hide what's actually being created until you read the synthesized template, and version upgrades occasionally need care.
CloudFormation is AWS's own declarative IaC service (YAML or JSON templates). It's deeply integrated, needs no third-party state backend (AWS manages state as stacks), and supports change sets, drift detection, and StackSets for multi-account/multi-region rollout — with nothing extra to license or self-host. The downsides are ergonomic: raw templates are verbose, support for brand-new AWS features sometimes lags the API, and it's AWS-only. Many teams never write it by hand anymore and use it indirectly as the engine under CDK, but it remains a defensible choice for an all-in AWS shop that wants zero external dependencies.
Pulumi, like CDK, lets you write infrastructure in general-purpose languages (TypeScript, Python, Go, C#) — but unlike CDK it's multi-cloud and ships with Pulumi Cloud, a managed state and secrets backend, as the default, so you get programming-language power across AWS, Azure, GCP, and Kubernetes plus a managed control plane out of the box. The considerations: a smaller community and hiring pool than Terraform, and the default backend is a paid SaaS at scale (you can self-host state, but most don't). For an AWS-only startup the case is weaker than Terraform/OpenTofu or CDK; for a polyglot platform team it's compelling.
Every IaC tool keeps a record of what it has provisioned — an explicit state file with Terraform/OpenTofu and Pulumi, the stack AWS manages for you with CloudFormation/CDK. Mishandling it is the single most common way teams hurt themselves.
Terraform/OpenTofu state is a JSON document mapping your code to real resource IDs, and two rules are non-negotiable. First, never keep state on a laptop or in Git: it can contain secrets, and a local copy means two engineers will eventually clobber each other's changes. Store it in a remote backend instead (a versioned, encrypted S3 bucket). Second, turn on locking so two simultaneous applies can't corrupt state. As of 2025, the S3 backend supports native lockfile-based locking (the lock is an object written next to the state in the same bucket), so the old pattern of a separate DynamoDB lock table is no longer strictly required, though existing DynamoDB setups are fine.
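To make the locking rule concrete, here is a minimal local analogue of lockfile-based locking. It is a sketch under stated assumptions: it uses an exclusive file create on local disk in place of the backend's own mechanism, and the `state_lock` helper, the `StateLockedError` class, and the `state.lock` filename are all hypothetical names chosen for illustration.

```python
import os
import tempfile
from contextlib import contextmanager


class StateLockedError(RuntimeError):
    """Raised when another run already holds the lock."""


@contextmanager
def state_lock(lock_path: str, owner: str):
    try:
        # O_EXCL makes creation fail if the lockfile already exists,
        # so only one caller can ever hold the lock at a time.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise StateLockedError(f"state is locked: {lock_path}") from None
    with os.fdopen(fd, "w") as lockfile:
        lockfile.write(owner)  # record who holds the lock, for debugging
    try:
        yield
    finally:
        os.remove(lock_path)  # always release, even if the apply fails


def try_apply(lock_path: str, owner: str) -> bool:
    """Pretend to run an apply; return False if the state is locked."""
    try:
        with state_lock(lock_path, owner):
            return True
    except StateLockedError:
        return False


lock_path = os.path.join(tempfile.mkdtemp(), "state.lock")
with state_lock(lock_path, "alice"):
    second_apply_ok = try_apply(lock_path, "bob")  # alice still holds the lock
after_release_ok = try_apply(lock_path, "bob")     # lock has been released

print(second_apply_ok, after_release_ok)
```

The point of the design is that acquiring the lock is a single atomic "create only if absent" operation, so two simultaneous applies cannot both believe they own the state.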
Beyond storage, the operational disciplines: keep state files small and scoped (one giant state for the whole company is slow and has enormous blast radius), use terraform plan as a mandatory review artifact before every apply, treat state as sensitive (encrypt at rest, restrict bucket access), and prefer moved blocks and import over editing state by hand when restructuring.
CloudFormation and CDK sidestep the storage question — AWS holds state as the stack — but have their own failure mode (a stack stuck in UPDATE_ROLLBACK_FAILED, or a resource that drifted because someone changed it in the console); the equivalent hygiene is drift detection plus never touching managed resources outside the pipeline. Pulumi defaults to its managed backend (or self-hosted state), with the same "remote, locked, encrypted" principles.
Pick remote, locked, encrypted state on day one. Retrofitting state hygiene onto a year-old project that's been applying from three different laptops is genuinely painful — it's one of the most common things a partner has to untangle before anything else can improve.
A pile of .tf files in one folder works for a weekend project and nothing larger. Real IaC is organized into reusable modules and composed per environment, so dev/staging/prod share logic and differ only in parameters.
A module is a parameterized, reusable unit of infrastructure — "a standard VPC," "our ECS service pattern," "an RDS Postgres with our backup and monitoring defaults." You write the pattern once, then instantiate it for each environment with different inputs (instance size, replica count, CIDR), keeping dev and prod identical in shape while differing only where they should.
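As a rough analogy in Python, a module behaves like a parameterized type with shared defaults that each environment instantiates with its own inputs. The `PostgresModule` class and its fields are hypothetical and only mirror the "RDS Postgres with our defaults" example; a real module would be written in HCL or as a CDK/Pulumi construct.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PostgresModule:
    """Hypothetical 'RDS Postgres with our defaults' building block."""

    # Inputs every environment must choose.
    env: str
    instance_class: str
    replica_count: int = 0
    # Opinionated defaults shared by all environments unless overridden.
    multi_az: bool = True
    backup_retention_days: int = 7
    storage_encrypted: bool = True


# Same pattern, different inputs per environment.
dev = PostgresModule(env="dev", instance_class="db.t4g.micro", multi_az=False)
prod = PostgresModule(env="prod", instance_class="db.r6g.large", replica_count=2)

dev_cfg, prod_cfg = asdict(dev), asdict(prod)
differing = sorted(key for key in dev_cfg if dev_cfg[key] != prod_cfg[key])
print(differing)
```

Listing the differing keys is a useful review habit: the environments should differ only in the inputs you chose to vary, while settings such as encryption and backup retention stay identical.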
Most teams compose three layers: thin environment roots (one per dev/staging/prod, mostly wiring) that call shared internal modules (your opinionated building blocks), which may wrap vetted community/registry modules (e.g. the widely used terraform-aws-modules VPC and EKS modules). One common debate: for separating environments, most teams prefer separate directories over Terraform workspaces — it's more explicit and the blast radius is clearer.
IaC delivers most of its value only when it runs through a pipeline, not from laptops. The canonical flow: a pull request triggers a plan, humans review the plan, a merge (or a manual approval) triggers the apply.
The minimum viable pipeline runs on every PR: fmt and validate for hygiene, then plan with the output posted onto the pull request as a comment so reviewers see exactly which resources will be created, changed, or destroyed. Nothing applies to production without a human reading that plan. On merge, the pipeline runs apply — ideally gated behind a manual approval for production so a person consciously ships infra changes. You can build this on GitHub Actions, GitLab CI, or CodePipeline/CodeBuild, or adopt a purpose-built IaC automation platform as you scale.
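A sketch of the "post the plan on the PR" step: it assumes the plan has been exported to JSON (for Terraform/OpenTofu, by running `show -json` on a saved plan file) and reads the `resource_changes` list from that output. The `summarize_plan` helper and the sample resources are hypothetical, and posting the comment itself is left to your CI system.

```python
import json
from collections import Counter


def summarize_plan(plan_json: str) -> str:
    """Turn a JSON plan into a short summary suitable for a PR comment."""
    plan = json.loads(plan_json)
    counts: Counter = Counter()
    lines = []
    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        if actions in (["no-op"], ["read"]):
            continue  # nothing will change for this resource
        if "delete" in actions and "create" in actions:
            label = "replace"  # destroy-and-recreate deserves extra attention
        else:
            label = actions[0]
        counts[label] += 1
        lines.append(f"- {label}: {rc['address']}")
    header = "Plan: " + ", ".join(
        f"{counts[kind]} to {kind}"
        for kind in ("create", "update", "replace", "delete")
    )
    return "\n".join([header, *lines])


# Hypothetical plan output, trimmed to the fields this sketch reads.
sample_plan = json.dumps({
    "resource_changes": [
        {"address": "aws_ecs_service.api", "change": {"actions": ["update"]}},
        {"address": "aws_lb.public", "change": {"actions": ["create"]}},
        {"address": "aws_db_instance.primary", "change": {"actions": ["no-op"]}},
        {"address": "aws_nat_gateway.old", "change": {"actions": ["delete"]}},
    ]
})

comment = summarize_plan(sample_plan)
print(comment)
```

Calling out replacements separately from plain updates is a deliberate choice here, because a replace on a stateful resource is the change reviewers most need to notice.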
Policy-as-code is what turns the pipeline from "runs commands" into "enforces guardrails." A policy engine evaluates the plan and blocks changes that violate rules — no public S3 buckets, no security groups open to 0.0.0.0/0 on port 22, mandatory encryption, required tags, instance-type allow-lists. The common engines are Open Policy Agent / Conftest (Rego), HashiCorp Sentinel (HCP Terraform), and checkov's policy packs. With policy gates, junior engineers can self-serve infrastructure safely because the dangerous changes simply fail the pipeline, and a scheduled nightly plan flags drift if reality diverges from code.
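To show the shape of a policy gate, here is the "no SSH open to the world" rule written as plain Python over a JSON plan. This is a simplified stand-in, not a replacement for OPA, Sentinel, or checkov: it only inspects inline `ingress` rules on `aws_security_group` resources, ignores rules defined as separate resources and "all traffic" protocol rules, and the `open_ssh_violations` helper and sample resources are hypothetical.

```python
def open_ssh_violations(plan: dict) -> list[str]:
    """Return addresses of security groups that open port 22 to 0.0.0.0/0."""
    violations = []
    for rc in plan.get("resource_changes", []):
        if rc.get("type") != "aws_security_group":
            continue
        after = rc["change"].get("after") or {}  # None when being deleted
        for rule in after.get("ingress") or []:
            open_to_world = "0.0.0.0/0" in (rule.get("cidr_blocks") or [])
            covers_ssh = rule.get("from_port", 0) <= 22 <= rule.get("to_port", 0)
            if open_to_world and covers_ssh:
                violations.append(rc["address"])
                break  # one bad rule is enough to flag the group
    return violations


# Hypothetical plan, trimmed to the fields this sketch reads.
plan = {
    "resource_changes": [
        {
            "address": "aws_security_group.bastion",
            "type": "aws_security_group",
            "change": {"actions": ["create"], "after": {"ingress": [
                {"from_port": 22, "to_port": 22, "cidr_blocks": ["0.0.0.0/0"]},
            ]}},
        },
        {
            "address": "aws_security_group.web",
            "type": "aws_security_group",
            "change": {"actions": ["create"], "after": {"ingress": [
                {"from_port": 443, "to_port": 443, "cidr_blocks": ["0.0.0.0/0"]},
            ]}},
        },
    ]
}

violations = open_ssh_violations(plan)
print(violations)
```

In a pipeline, a non-empty result would fail the job before apply, which is what lets less experienced engineers self-serve without being able to ship the dangerous change.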
The pipeline should authenticate to AWS via OIDC-federated roles (GitHub Actions and GitLab both support this) — short-lived credentials, no long-lived access keys sitting in CI secrets, with the apply role scoped to what it actually needs per environment. This single change closes one of the most common audit findings on startup AWS accounts.
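For GitHub Actions, the role's trust policy is what restricts which workflow may assume it. The sketch below builds such a policy as a Python dict; the account ID, the `example-org/infra` repository, and the `github_oidc_trust_policy` helper are placeholders, and it assumes the GitHub OIDC identity provider has already been registered in the account.

```python
import json


def github_oidc_trust_policy(account_id: str, repo: str, branch: str) -> dict:
    """Trust policy letting one repo and branch assume a deploy role via OIDC."""
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # The token must be issued for AWS STS ...
                        f"{provider}:aud": "sts.amazonaws.com",
                        # ... and come from this repository and branch only.
                        f"{provider}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


# Placeholder account ID and a hypothetical repository name.
policy = github_oidc_trust_policy("123456789012", "example-org/infra", "main")
print(json.dumps(policy, indent=2))
```

Pinning the `sub` claim to a specific repository and branch is the important part: without it, any workflow that can obtain a token from the same provider could attempt to assume the role.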
Infrastructure code can and should be checked before it touches an account — from fast static checks that run in seconds to full integration tests that stand up real resources.
The fast, always-on layer is static analysis. tflint catches Terraform-specific mistakes, deprecated syntax, and provider-aware errors (e.g. an invalid instance type) before you ever plan. checkov and tfsec/Trivy scan IaC for security and compliance misconfigurations — unencrypted volumes, public buckets, permissive IAM, missing logging — and map findings to benchmarks like CIS. These run in the PR pipeline in seconds; there is no good reason to skip them.
The heavier layer is real testing. Terratest (a Go library) and Terraform's native terraform test framework provision infrastructure in a throwaway account, assert it behaves correctly (the load balancer returns 200, the database is reachable only from the app subnet), then tear it down. Most startups skip this early — a reasonable trade-off at first — but it's where serious platform teams catch regressions in shared modules before production. The pragmatic ladder: fmt + validate + tflint + checkov on every PR from day one, policy-as-code as the team grows, and integration tests for your most-reused modules once a broken module would hurt multiple teams.
| Tool | What it does | Speed | When to adopt |
|---|---|---|---|
| terraform fmt / validate | Formatting + syntax/config validity | Instant | Day one |
| tflint | Terraform-aware linting, provider rules, deprecations | Seconds | Day one |
| checkov / tfsec (Trivy) | Security + compliance misconfig scanning (CIS, etc.) | Seconds | Day one |
| OPA / Conftest / Sentinel | Policy-as-code gates on the plan | Seconds | As team grows |
| Terratest / terraform test | Integration tests against real, ephemeral resources | Minutes | For reused modules |
Two questions every growing AWS estate eventually faces: how do I separate environments safely, and how do I get my existing hand-built account under IaC without a big-bang rewrite?
The AWS-recommended pattern is multi-account: separate AWS accounts for prod, staging, dev, security/logging, and shared services, all under AWS Organizations with a landing zone (AWS Control Tower, or a Terraform landing zone like the terraform-aws-modules / AFT approach). Account boundaries are the strongest blast-radius and security boundary AWS offers: a mistake in dev cannot touch prod unless cross-account access has been explicitly granted, and Service Control Policies give you guardrails across the whole org. Your IaC then assumes roles to deploy the same modules into each account. (The sibling AWS landing zone page goes deep on this.)
Migrating off ClickOps is the other big one, and the key message is: you do not have to rebuild. IaC tools can import existing resources so the code adopts what's already running rather than recreating it. Terraform/OpenTofu have import blocks (and tools like Terraformer bulk-generate config from a live account); CloudFormation has resource import; CDK can wrap imported resources. The realistic playbook is incremental: import one bounded piece (the VPC, or one service's stack), get a clean plan that shows zero changes — meaning your code now matches reality — merge, and move on. Over a few weeks the whole account comes under code without a risky cutover.
A landing zone also wires up SSO via IAM Identity Center, so engineers get short-lived, role-based access into each account instead of long-lived IAM users — the human-access counterpart to the OIDC roles your pipeline uses. The honest caveat on migration: import is fiddly. Generated config usually needs hand-cleaning, and getting to "plan shows no changes" on a messy account takes patience. This is a common reason teams bring in help — getting it wrong (an apply that proposes to delete-and-recreate your production database because the imported config didn't match) is exactly the failure you want an experienced hand to prevent.
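The "clean plan" check after an import can be automated. The sketch below reads a JSON plan and reports whether anything would change, separately flagging any delete or replace on stateful resource types. The `PROTECTED_TYPES` set, the `review_import_plan` helper, and the sample resources are assumptions for illustration; adjust the set to whatever you consider irreplaceable.

```python
# Resource types where a delete or replace means data loss (an assumed list).
PROTECTED_TYPES = {"aws_db_instance", "aws_rds_cluster", "aws_s3_bucket"}


def review_import_plan(plan: dict) -> dict:
    """Report whether an import plan is clean and which changes are destructive."""
    changed, destructive = [], []
    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        if actions == ["no-op"]:
            continue  # code already matches the live resource
        changed.append(rc["address"])
        if "delete" in actions and rc.get("type") in PROTECTED_TYPES:
            destructive.append(rc["address"])
    return {"clean": not changed, "changed": changed, "destructive": destructive}


# Hypothetical plan after importing a VPC and a database.
plan = {
    "resource_changes": [
        {"address": "aws_vpc.main", "type": "aws_vpc",
         "change": {"actions": ["no-op"]}},
        {"address": "aws_db_instance.primary", "type": "aws_db_instance",
         "change": {"actions": ["delete", "create"]}},
    ]
}

report = review_import_plan(plan)
print(report)
```

Here the VPC import is clean but the database would be destroyed and recreated, which is exactly the case where the imported config needs fixing before anything is merged or applied.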
A single checklist to audit your setup against. None of these are exotic — the gap between teams that "use Terraform" and teams with a genuinely solid foundation is almost entirely about whether the unglamorous items below are actually in place.
Reading this page tells you what good looks like. Building it from scratch is several weeks of specialized work, and CloudRoute exists so you don't have to hire for that.
CloudRoute routes startups and companies to vetted AWS partners who build the foundation for you and hand you the keys (deliverables itemized below). The point is that you end up owning standard, hireable-against infrastructure — a clean Terraform/OpenTofu or CDK codebase your own team can drive — not a bespoke setup only one consultant understands.
The honest economics: for credit-eligible companies (typically institutionally funded startups), the partner engagement is often substantially AWS-funded — the partner is paid through AWS partner programs and your AWS consumption during the build is covered by credits — so the cost to you is $0 or low. CloudRoute is paid by the partner, not by you. For companies that aren't credit-eligible, it's a vetted-partner referral at the partner's rates: you still skip the hiring-and-vetting slog and the months it takes to recruit a senior platform engineer who may not exist in your hiring market.
If you also want AWS credits to fund the build and the first year of spend, the same routing covers it — see the $100K AWS credits path, and the startup page for how matching works.
A production-grade IaC repo you own · remote locked encrypted state · a CI/CD plan/apply pipeline with policy-as-code gates · a multi-account landing zone with SCP guardrails · existing resources imported under code · a short handover so your team can drive it — frequently at $0 to you via AWS funding for credit-eligible companies.
There is no universally correct answer — it depends on your team's skills, whether you're AWS-only, and how much you value ecosystem vs native integration. Here's the honest 2026 comparison.
| Dimension | Terraform / OpenTofu | AWS CDK | CloudFormation | Pulumi |
|---|---|---|---|---|
| Language | HCL (declarative) | TypeScript / Python / Java / C# / Go | YAML / JSON (declarative) | TypeScript / Python / Go / C# |
| Cloud scope | Multi-cloud (100s of providers) | AWS-only | AWS-only | Multi-cloud |
| State | Explicit file (S3 + locking) | Managed by AWS (as stacks) | Managed by AWS (as stacks) | Pulumi Cloud or self-hosted |
| License (2026) | TF: BSL · OpenTofu: MPL-2.0 (open) | Apache-2.0 (open) | AWS service | Apache-2.0 core; paid SaaS backend |
| Ecosystem / hiring pool | Largest by far | Growing, AWS-centric | Mature but native-only | Smaller |
| Best fit | Default; multi-cloud; biggest talent pool | App engineers who want real code, AWS-only | All-in AWS, zero external deps | Code-first platform teams, multi-cloud |
| Watch-outs | BSL licensing nuance → many choose OpenTofu | Abstraction hides generated CFN; AWS-only | Verbose; new features can lag the API | Smaller community; managed backend cost |
Situation: The whole AWS account had been clicked together by the founding CTO over 18 months — one account, no staging that matched prod, security groups nobody could explain, and a pending SOC 2 that needed an auditable change process. No one on the team had run Terraform in production. They were terrified of a rebuild touching the live database, and couldn't justify a $180K senior platform hire at their stage.
What CloudRoute did: CloudRoute routed them within a day to a vetted AWS partner with a Terraform + landing-zone track record. The partner stood up a multi-account layout (prod/staging/dev + logging) via Control Tower, an OpenTofu codebase with internal VPC/ECS/RDS modules, remote S3 state with locking, and a GitHub Actions plan/apply pipeline with checkov scanning, OPA policy gates, and OIDC-federated least-privilege roles. Existing production resources — including the database — were imported incrementally until plan showed zero drift; nothing was recreated.
Outcome: In 4 weeks the account went from snowflake to fully code-managed: staging now mirrors prod from the same modules, every infra change goes through a reviewed PR with a posted plan (which closed the SOC 2 change-management gap), and drift detection alerts on console edits. Because the company was credit-eligible, CloudRoute paired the build with an AWS credit application — the engagement and the AWS spend during the build were AWS-funded, so the cost to the customer was $0. The team now drives the codebase themselves.
engagement window: 4 weeks · founder time: ~12 hours · result: multi-account IaC foundation + SOC 2 change-control · cost to customer: $0 (credit-eligible)
CloudRoute routes you to a vetted AWS partner who builds your Terraform/OpenTofu or CDK foundation, CI/CD, and multi-account landing zone — and imports what you already run. For credit-eligible companies, often AWS-funded at $0 to you.