Infrastructure Disaster Recovery with Terraform

Terraform-Dr, State-Recovery, Infrastructure-Rebuild, Cross-Region-Patterns

Advanced

Terraform, Disaster-Recovery, State-Recovery, Blue-Green, Immutable-Infrastructure, Cross-Region, Backup, Runbook

Platform-Detection, Infrastructure-Scoping, Context-Analysis, Multi-Cloud-Operations, Agent-Workflow-Design

Infrastructure Disaster Recovery with Terraform

Application disaster recovery is well understood: replicate data, fail over traffic, restore from backups. Infrastructure disaster recovery is different: you are recovering the platform that applications run on. If your Terraform state is lost, your VPC is deleted, or an entire region goes down, how do you rebuild?

This article covers the DR patterns specific to Terraform-managed infrastructure: protecting state, recovering from state loss, designing infrastructure for regional failover, and the runbooks that agents and operators need when things go wrong.
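As one illustration of protecting state, here is a minimal sketch that assumes the state is stored in an S3 bucket on AWS; the excerpt does not say which backend is in use. The bucket name `example-terraform-state` and the resource labels are hypothetical. With versioning enabled, an overwritten or deleted state object can be restored from an earlier version.

```hcl
# Hypothetical bucket that holds Terraform state files.
resource "aws_s3_bucket" "tf_state" {
  bucket = "example-terraform-state" # assumption: replace with your own globally unique name

  lifecycle {
    prevent_destroy = true # reject any plan that would delete the state bucket
  }
}

# Keep every prior version of each state object so that a corrupted or
# deleted state file can be rolled back.
resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}
```

The bucket that stores state is usually managed in a separate, small configuration so that losing the main state does not also lose the definition of its own storage.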

Infrastructure Knowledge Scoping for Agents

February 22, 2026

Cloud-Services

Intermediate, Advanced

Agent-Patterns, Knowledge-Scoping, Multi-Cloud, Kubernetes, Aws, Azure, Gcp, Platform-Detection, Context-Awareness, Iam, Resource-Hierarchy, Infrastructure-as-Code

Kubectl, Aws-Cli, Gcloud, Az-Cli, Terraform, Helm

Infrastructure Knowledge Scoping for Agents

An agent working on infrastructure tasks needs to operate at the right level of specificity. Giving generic Kubernetes advice when the user runs EKS with IRSA is unhelpful – the agent misses the IAM integration that will make or break the deployment. Giving EKS-specific advice when the user runs minikube on a laptop is equally unhelpful – the agent references services and configurations that do not exist.

Multi-Account Cloud Architecture with Terraform: AWS Organizations, Azure Management Groups, and GCP Organizations

February 22, 2026

Multi-Account-Terraform, Landing-Zone-Design, Cross-Account-Patterns, Provider-Aliasing

Advanced

Terraform, Multi-Account, Aws-Organizations, Azure-Management-Groups, Gcp-Organizations, Landing-Zone, Cross-Account, Provider-Aliasing, State-Isolation

Multi-Account Cloud Architecture with Terraform

Single-account cloud deployments work for learning and prototypes. Production systems need multiple accounts (AWS), subscriptions (Azure), or projects (GCP) for isolation — security boundaries, blast radius control, billing separation, and compliance requirements.

Terraform manages multi-account architectures well, but the patterns differ significantly from single-account work. Provider configuration, state isolation, cross-account references, and IAM trust relationships all need explicit design.

Why Multiple Accounts

Reason	Single Account Problem	Multi-Account Solution
Blast radius	Misconfigured IAM affects everything	Damage limited to one account
Billing	Cannot attribute costs to teams	Per-account billing and budgets
Compliance	PCI data mixed with dev workloads	Separate accounts for regulated workloads
Service limits	Default quota of 5 VPCs per region is shared by every team	Each account has its own quotas
Access control	Complex IAM policies to isolate teams	Account boundary is the strongest isolation
Testing	Dev resources can affect production	Dev cannot touch prod resources unless cross-account access is explicitly granted

AWS Organizations

Organization Structure

Organization Root
├── Core OU
│   ├── Management Account (billing, org management)
│   ├── Security Account (GuardDuty, SecurityHub, audit logs)
│   └── Networking Account (Transit Gateway, shared VPCs)
├── Workload OU
│   ├── Production OU
│   │   ├── App-A Production Account
│   │   └── App-B Production Account
│   └── Non-Production OU
│       ├── App-A Development Account
│       └── App-A Staging Account
└── Sandbox OU
    └── Developer Sandbox Accounts

Terraform for AWS Organizations

resource "aws_organizations_organization" "main" {
  feature_set = "ALL"

  enabled_policy_types = [
    "SERVICE_CONTROL_POLICY",
    "TAG_POLICY",
  ]
}

resource "aws_organizations_organizational_unit" "core" {
  name      = "Core"
  parent_id = aws_organizations_organization.main.roots[0].id
}

resource "aws_organizations_organizational_unit" "workloads" {
  name      = "Workloads"
  parent_id = aws_organizations_organization.main.roots[0].id
}

resource "aws_organizations_organizational_unit" "production" {
  name      = "Production"
  parent_id = aws_organizations_organizational_unit.workloads.id
}

# Create a workload account
resource "aws_organizations_account" "app_production" {
  name      = "app-a-production"
  email     = "aws+app-a-prod@example.com"
  parent_id = aws_organizations_organizational_unit.production.id
  role_name = "OrganizationAccountAccessRole"  # cross-account admin role

  lifecycle {
    prevent_destroy = true  # accounts cannot be easily recreated
  }
}
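
Once the account exists, other configurations typically reach it through an aliased provider that assumes the `OrganizationAccountAccessRole` named above. The following is a sketch under assumptions: the variable `app_production_account_id`, the region, and the bucket are hypothetical, and the account ID is passed in as a variable rather than read from the resource so that the provider configuration is known at plan time.

```hcl
# Assumption: the new account's 12-digit ID is supplied from outside,
# for example from the output of the configuration that created it.
variable "app_production_account_id" {
  type        = string
  description = "Account ID of the app-a-production account."
}

# Aliased provider that assumes the admin role inside the workload account.
provider "aws" {
  alias  = "app_production"
  region = "us-east-1" # assumption: use the region the workload runs in

  assume_role {
    role_arn = "arn:aws:iam::${var.app_production_account_id}:role/OrganizationAccountAccessRole"
  }
}

# Hypothetical resource created in the workload account through the alias.
resource "aws_s3_bucket" "app_artifacts" {
  provider = aws.app_production
  bucket   = "example-app-a-prod-artifacts" # hypothetical name
}
```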

Service Control Policies (SCPs)

SCPs set permission boundaries for entire OUs:
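
A minimal sketch of one such policy, not necessarily the one the article itself uses: it denies member accounts the ability to leave the organization and attaches that rule to the Workloads OU defined earlier. The policy name and resource labels are hypothetical.

```hcl
# Policy document: deny the LeaveOrganization action everywhere.
data "aws_iam_policy_document" "deny_leave_org" {
  statement {
    effect    = "Deny"
    actions   = ["organizations:LeaveOrganization"]
    resources = ["*"]
  }
}

# Register the document as a service control policy.
resource "aws_organizations_policy" "deny_leave_org" {
  name    = "deny-leave-organization" # hypothetical name
  type    = "SERVICE_CONTROL_POLICY"
  content = data.aws_iam_policy_document.deny_leave_org.json
}

# Attach the SCP to the Workloads OU; it applies to every account beneath it.
resource "aws_organizations_policy_attachment" "workloads" {
  policy_id = aws_organizations_policy.deny_leave_org.id
  target_id = aws_organizations_organizational_unit.workloads.id
}
```

An SCP only limits what is allowed; it never grants permissions, so IAM policies inside each account are still required.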

Multi-Cloud Networking Patterns

February 22, 2026

Network-Architecture, Vpn-Configuration, Service-Mesh-Deployment, Dns-Management, Cloud-Networking

Intermediate, Advanced

Multi-Cloud, Vpn, Transit-Gateway, Service-Mesh, Dns-Routing, Cloud-Interconnect, Networking, Hybrid-Cloud

Terraform, Istio, Consul, Strongswan, Aws-Cli, Gcloud, Az

Multi-Cloud Networking Patterns

Multi-cloud networking connects workloads across two or more cloud providers into a coherent network. The motivations vary – vendor redundancy, best-of-breed service selection, regulatory requirements – but the challenges are the same: private connectivity between isolated networks, consistent service discovery, and traffic routing that handles failures.

VPN Tunnels Between Clouds

IPsec VPN tunnels are the simplest way to connect two cloud networks. Each provider offers managed VPN gateways that terminate IPsec tunnels, encrypting traffic between VPCs without exposing it to the public internet.
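
To make the shape of this concrete, here is a sketch of the AWS side only, assuming the other cloud's VPN gateway already has a public IP and that static routing is acceptable. All variable names and the ASN value are hypothetical; the peer side (GCP or Azure) needs its own matching gateway and tunnel resources, which are not shown.

```hcl
variable "aws_vpc_id" {
  type        = string
  description = "ID of the AWS VPC to connect."
}

variable "peer_gateway_ip" {
  type        = string
  description = "Public IP of the other cloud's VPN gateway."
}

variable "peer_cidr" {
  type        = string
  description = "CIDR range of the other cloud's network."
}

# AWS-managed VPN gateway attached to the VPC.
resource "aws_vpn_gateway" "to_peer" {
  vpc_id = var.aws_vpc_id
}

# Represents the remote end of the tunnel from AWS's point of view.
resource "aws_customer_gateway" "peer" {
  bgp_asn    = 65000 # placeholder private ASN; unused with static routes
  ip_address = var.peer_gateway_ip
  type       = "ipsec.1"
}

# The IPsec connection itself; AWS provisions two tunnels per connection.
resource "aws_vpn_connection" "to_peer" {
  vpn_gateway_id      = aws_vpn_gateway.to_peer.id
  customer_gateway_id = aws_customer_gateway.peer.id
  type                = "ipsec.1"
  static_routes_only  = true
}

# Tell AWS which remote range is reachable through the connection.
resource "aws_vpn_connection_route" "peer" {
  vpn_connection_id      = aws_vpn_connection.to_peer.id
  destination_cidr_block = var.peer_cidr
}
```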

Terraform Cloud Architecture Patterns: VPC/EKS/RDS on AWS, VNET/AKS on Azure, VPC/GKE on GCP

February 22, 2026

Multi-Cloud-Terraform, Cloud-Architecture, Eks-Setup, Aks-Setup, Gke-Setup

Terraform, Aws, Azure, Gcp, Eks, Aks, Gke, Rds, Cloud-Sql, Vpc, Vnet, Multi-Cloud, Architecture-Patterns

Terraform Cloud Architecture Patterns

The three-tier architecture — networking, managed Kubernetes, managed database — is the most common pattern for production deployments on any major cloud. The concepts are identical across AWS, Azure, and GCP. The Terraform code is not. Resource names differ, required arguments differ, default behaviors differ, and the gotchas that catch agents and humans are cloud-specific.

This article shows the real Terraform for each layer on each cloud, side by side, so agents can write correct infrastructure code for whichever cloud the user deploys to.

Terraform Cost Management: Writing Cost-Aware Infrastructure Code

February 22, 2026

Terraform-Cost-Awareness, Infracost-Integration, Resource-Sizing, Cost-Allocation

Terraform, Cost-Management, Infracost, Right-Sizing, Reserved-Instances, Tagging, Cloud-Costs, Finops

Terraform, Infracost, Aws-Cli, Az, Gcloud

Terraform Cost Management

The most expensive line in your cloud bill was written in a .tf file. A single instance_type choice, a forgotten NAT Gateway, or an over-provisioned RDS instance can cost thousands per month — and none of these show up in terraform plan. Plan shows what changes. It does not show what it costs.

This article covers how to write cost-aware Terraform and catch expensive decisions before they reach production.
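
One simple guardrail, shown here as a sketch and not as the article's own method, is a variable validation that rejects instance types outside an approved list. The allowed sizes and the default are hypothetical.

```hcl
variable "instance_type" {
  type        = string
  default     = "t3.small"
  description = "EC2 instance type for the application tier."

  # Fail at plan time if someone picks a size outside the approved list.
  validation {
    condition     = contains(["t3.small", "t3.medium", "t3.large"], var.instance_type)
    error_message = "The instance_type must be t3.small, t3.medium, or t3.large. Larger sizes need a cost review."
  }
}
```

This does not estimate cost; it only stops the most obvious oversizing before apply.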

Terraform Import and Brownfield Adoption: Bringing Existing Infrastructure Under Code

February 22, 2026

Terraform-Import, Brownfield-Adoption, State-Management, Infrastructure-Migration

Intermediate, Advanced

Terraform, Import, Brownfield, Migration, State, Adoption, Existing-Infrastructure, Terraform-1.5

Terraform Import and Brownfield Adoption

Most organizations do not start with Infrastructure as Code. They start with console clicks, CLI commands, and scripts. At some point they decide to adopt Terraform — and now they have hundreds of existing resources that need to be brought under management without disruption.

This is the brownfield problem: writing Terraform code that matches existing infrastructure exactly, importing the state so Terraform knows about the resources, and resolving the inevitable drift between what exists and what the code describes.
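
As a small illustration of the import step, the sketch below uses the `import` block available from Terraform 1.5. The bucket name and resource label are hypothetical stand-ins for an existing resource.

```hcl
# Requires Terraform 1.5 or later.
# Declares that an existing bucket should be adopted into state
# under the given resource address on the next apply.
import {
  to = aws_s3_bucket.legacy_logs
  id = "example-legacy-logs" # hypothetical: the existing bucket's name
}

# Configuration that is meant to match the existing bucket.
resource "aws_s3_bucket" "legacy_logs" {
  bucket = "example-legacy-logs"
}
```

Running `terraform plan` then shows the pending import, and any other proposed change in that plan is drift between the code and the real resource. If the resource block has not been written yet, `terraform plan -generate-config-out=generated.tf` can draft it as a starting point.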

Terraform Networking Patterns: VPC, Subnets, NAT, Peering, and Transit Gateway Across Clouds

February 22, 2026

Cloud-Networking-Terraform, Vpc-Design, Cidr-Planning, Transit-Gateway-Patterns, Dns-Configuration

Terraform, Networking, Vpc, Vnet, Subnets, Nat, Peering, Transit-Gateway, Dns, Cidr, Hub-Spoke

Terraform Networking Patterns

Networking is the first thing you build and the last thing you want to change. CIDR ranges, subnet allocation, and connectivity topology are difficult to modify after resources depend on them. Getting the network right in Terraform saves months of migration work later.

This article covers the networking patterns across AWS, Azure, and GCP — from basic VPC design to multi-region hub-spoke topologies.

CIDR Planning

Plan CIDR ranges before writing any Terraform. Once a VPC is created with a CIDR block, changing it requires recreating the VPC and everything in it.
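
A sketch of how a plan can be encoded so that subnets are derived from one VPC range instead of typed by hand. The `10.0.0.0/16` range and the offsets are hypothetical; `cidrsubnet` is the built-in Terraform function.

```hcl
locals {
  vpc_cidr = "10.0.0.0/16" # hypothetical range

  # Adding 8 bits to the /16 prefix yields /24 subnets, selected by index.
  public_subnets  = [for i in range(3) : cidrsubnet(local.vpc_cidr, 8, i)]
  private_subnets = [for i in range(3) : cidrsubnet(local.vpc_cidr, 8, i + 10)]
}

output "public_subnets" {
  value = local.public_subnets # 10.0.0.0/24, 10.0.1.0/24, 10.0.2.0/24
}

output "private_subnets" {
  value = local.private_subnets # 10.0.10.0/24, 10.0.11.0/24, 10.0.12.0/24
}
```

Leaving gaps between the index ranges keeps room to add subnets later without renumbering existing ones.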

Terraform Secrets and Sensitive Data: Patterns for Variables, State, Providers, and CI/CD

February 22, 2026