---
title: "Testing Infrastructure Code: The Validation Pyramid from Lint to Integration"
description: "A unified testing strategy for Terraform code — static analysis, plan-based testing, contract testing for modules, cost estimation, and integration testing. The testing pyramid applied to infrastructure: fast and cheap at the bottom, slow and expensive at the top, with clear guidance on what to test at which level."
url: https://agent-zone.ai/knowledge/infrastructure/terraform-testing-pyramid/
section: knowledge
date: 2026-02-22
categories: ["infrastructure"]
tags: ["terraform","testing","tflint","checkov","conftest","terratest","infracost","opa","validation","static-analysis","integration-testing","policy-as-code"]
skills: ["terraform-testing-strategy","static-analysis","plan-testing","integration-testing","cost-estimation"]
tools: ["terraform","tflint","checkov","conftest","opa","terratest","infracost"]
levels: ["intermediate"]
word_count: 1504
formats:
  json: https://agent-zone.ai/knowledge/infrastructure/terraform-testing-pyramid/index.json
  html: https://agent-zone.ai/knowledge/infrastructure/terraform-testing-pyramid/?format=html
  api: https://api.agent-zone.ai/api/v1/knowledge/search?q=Testing+Infrastructure+Code%3A+The+Validation+Pyramid+from+Lint+to+Integration
---


# Testing Infrastructure Code

Infrastructure code has a unique testing challenge: the thing you are testing is expensive to instantiate. You cannot spin up a VPC, an RDS instance, and an EKS cluster for every pull request and tear them down five minutes later without significant cost and time. But you also cannot ship untested infrastructure changes to production without risk.

The solution is the same as in software engineering: a testing pyramid. Fast, cheap tests at the bottom catch most errors. Slower, expensive tests at the top catch the rest. The key is knowing what to test at which level.

## The Infrastructure Testing Pyramid

```
                    ┌─────────────────┐
                    │   Integration   │  Real cloud resources
                    │   (Terratest)   │  Expensive, slow (10-30 min)
                    │   Run: nightly  │  Catches: actual API behavior
                   ┌┴─────────────────┴┐
                   │    Plan-Based     │  Real plan output, no apply
                   │  (Conftest/OPA)   │  Moderate (1-3 min)
                   │  Run: every PR    │  Catches: policy violations
                  ┌┴───────────────────┴┐
                  │    Cost Estimation   │  Plan output → cost analysis
                  │    (Infracost)       │  Moderate (1-2 min)
                  │    Run: every PR     │  Catches: budget overruns
                 ┌┴─────────────────────┴┐
                 │    Static Analysis     │  No cloud access needed
                 │  (tflint, checkov,     │  Fast (seconds)
                 │   terraform validate)  │  Catches: syntax, config errors
                 │  Run: every commit     │
                 └───────────────────────┘
```

Each level catches different classes of errors. Skipping a level means those errors reach the next level (which is slower and more expensive to run) or reach production.

## Level 1: Static Analysis (Seconds)

Static analysis checks code without executing it or connecting to any cloud API. It runs on every commit in pre-commit hooks or early in CI.

### terraform validate

Checks HCL syntax and basic resource configuration:

```bash
terraform init -backend=false    # initialize providers without backend
terraform validate               # check syntax and resource references
```

Catches: missing required arguments, invalid resource types, broken references, type mismatches. Does not catch: values that are syntactically valid but logically wrong.
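For instance, `validate` rejects the first problem below but happily accepts the second (resource names are hypothetical):

```hcl
resource "aws_subnet" "app" {
  vpc_id     = aws_vpc.main.id   # caught: error if aws_vpc.main is not declared anywhere
  cidr_block = "10.0.0.0/8"      # missed: valid syntax, but a /8 cannot fit inside a /16 VPC,
                                 # so this only fails at plan/apply time
}
```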

### tflint

Catches provider-specific errors that `validate` misses:

```bash
tflint --init          # download provider-specific rulesets
tflint --recursive     # lint all modules
```

```hcl
# .tflint.hcl
plugin "aws" {
  enabled = true
  version = "0.30.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

rule "terraform_naming_convention" {
  enabled = true
  format  = "snake_case"
}

rule "terraform_documented_variables" {
  enabled = true
}
```

Catches: invalid instance types (`t3.superxlarge` does not exist), deprecated resource arguments, naming convention violations, variables without descriptions.
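A minimal example of the kind of error the AWS ruleset adds on top of `validate` (the AMI ID is a placeholder):

```hcl
resource "aws_instance" "app" {
  ami           = "ami-12345678"      # placeholder value
  instance_type = "t3.superxlarge"    # passes terraform validate,
                                      # flagged by the tflint AWS ruleset
}
```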

### checkov

Scans for security misconfigurations and compliance issues:

```bash
checkov -d . --framework terraform
```

Catches: unencrypted S3 buckets, public security groups, missing logging, databases without backups, KMS keys without rotation. Checkov has 2,500+ built-in policies covering CIS benchmarks, SOC2, PCI-DSS, and HIPAA.

### terraform fmt

Not a test per se, but enforces consistent formatting:

```bash
terraform fmt -check -recursive -diff
```

Run this first in CI. If formatting fails, the PR has style issues that should be fixed before deeper analysis.

### Static Analysis Pipeline

```bash
#!/bin/bash
# pre-commit or CI script
set -e

echo "=== Format check ==="
terraform fmt -check -recursive -diff

echo "=== Validate ==="
terraform init -backend=false
terraform validate

echo "=== tflint ==="
tflint --init
tflint --recursive

echo "=== Checkov ==="
checkov -d . --framework terraform --quiet

echo "=== All static checks passed ==="
```

**Total runtime**: 5-30 seconds. No cloud credentials needed. No API calls.

## Level 2: Cost Estimation (1-2 Minutes)

Cost estimation runs `terraform plan` and analyzes the planned resources against pricing data. It catches budget surprises before they reach production.

### Infracost

```bash
# Generate plan
terraform plan -out=tfplan
terraform show -json tfplan > plan.json

# Estimate cost
infracost breakdown --path=plan.json --format=json --out-file=cost.json
infracost output --path=cost.json --format=table
```

Output example:

```
Project: infrastructure/compute

 Name                                     Monthly Qty  Unit   Monthly Cost
 ─────────────────────────────────────────────────────────────────────────
 aws_instance.app
 ├─ Instance usage (t3.large)                     730  hours        $60.74
 ├─ root_block_device
 │  └─ Storage (gp3, 50 GB)                        50  GB           $4.00
 └─ ebs_block_device[0]
    └─ Storage (gp3, 200 GB)                      200  GB           $16.00

 aws_rds_cluster.main
 ├─ Aurora capacity units                         730  ACU-hours   $87.60
 └─ Storage                                        50  GB           $5.00

 OVERALL TOTAL                                                    $173.34
```

### Cost Guardrails

Add policy checks for cost:

```bash
# Fail if monthly cost exceeds threshold
COST=$(jq '.totalMonthlyCost | tonumber' cost.json)
THRESHOLD=500
if (( $(echo "$COST > $THRESHOLD" | bc -l) )); then
  echo "ERROR: Estimated monthly cost \$$COST exceeds threshold \$$THRESHOLD"
  exit 1
fi
```

### What Cost Estimation Catches

| Issue | Example | Without Cost Check |
|---|---|---|
| Oversized instances | `r5.4xlarge` instead of `t3.large` | Discovered on first bill |
| Missing spot/reserved pricing | On-demand for always-on workloads | Overpaying by 40-70% |
| Storage accumulation | 500GB EBS per instance × 20 instances | $800/mo in EBS alone |
| NAT gateway surprise | NAT per AZ + high throughput | $100-500/mo unplanned |
| Data transfer | Cross-region replication, internet egress | Largest surprise cost |
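For pull requests, the cost *delta* is often more useful than the absolute total. A sketch using Infracost's diff mode against a saved baseline (file names and paths are assumptions):

```bash
# On the main branch: snapshot current costs as a baseline
infracost breakdown --path=. --format=json --out-file=baseline.json

# On the PR branch: report only what the change adds or removes
infracost diff --path=. --compare-to=baseline.json
```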

## Level 3: Plan-Based Testing (1-3 Minutes)

Plan-based testing runs `terraform plan`, converts the output to JSON, and evaluates it against policy rules. The plan is never applied — no resources are created.

### Conftest with OPA

```bash
# Generate plan JSON
terraform plan -out=tfplan
terraform show -json tfplan > plan.json

# Test against policies
conftest test plan.json --policy policies/
```

Policy examples:

```rego
# policies/tags.rego
package main

# Require standard tags on every created resource
required_tags := {"Environment", "ManagedBy"}

deny[msg] {
  resource := input.resource_changes[_]
  resource.change.actions[_] == "create"

  tag := required_tags[_]
  not resource.change.after.tags[tag]
  msg := sprintf("Resource %s missing '%s' tag", [resource.address, tag])
}
```

```rego
# policies/security.rego
package main

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_security_group_rule"
  resource.change.after.cidr_blocks[_] == "0.0.0.0/0"
  resource.change.after.type == "ingress"
  resource.change.after.from_port != 443
  resource.change.after.from_port != 80
  msg := sprintf(
    "Security group rule %s allows 0.0.0.0/0 on port %d (only 80 and 443 allowed)",
    [resource.address, resource.change.after.from_port]
  )
}
```

```rego
# policies/cost.rego
package main

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_instance"
  instance_type := resource.change.after.instance_type
  expensive := {"r5.4xlarge", "r5.8xlarge", "m5.8xlarge", "c5.9xlarge"}
  expensive[instance_type]
  msg := sprintf(
    "Instance %s uses expensive type %s — requires approval",
    [resource.address, instance_type]
  )
}
```

### What Plan-Based Testing Catches

| Category | Examples |
|---|---|
| **Missing tags** | Resources created without required tags |
| **Security violations** | Open security groups, unencrypted resources, public access |
| **Naming violations** | Resources not matching naming conventions |
| **Size constraints** | Instances larger than approved sizes |
| **Destructive changes** | Resources being replaced or destroyed (flag for review) |
| **Drift-related changes** | Resources changing that were not in the code diff |
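The destructive-change check from the table can be sketched as a Conftest `warn` rule (a warning rather than a hard deny, since some destroys are intentional). In plan JSON, a replacement shows up as a `"delete"` action paired with a `"create"`, so matching on `"delete"` covers both cases:

```rego
# policies/destroy.rego — illustrative sketch
package main

warn[msg] {
  resource := input.resource_changes[_]
  resource.change.actions[_] == "delete"
  msg := sprintf("Resource %s will be destroyed or replaced — review before apply", [resource.address])
}
```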

## Level 4: Integration Testing (10-30 Minutes)

Integration testing creates real infrastructure, validates it works, and tears it down. This is expensive in time and money — reserve it for nightly runs, pre-release validation, or module certification.

### Terratest

```go
package test

import (
    "testing"

    "github.com/gruntwork-io/terratest/modules/aws"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestNetworkingModule(t *testing.T) {
    t.Parallel()

    opts := &terraform.Options{
        TerraformDir: "../infrastructure/networking",
        Vars: map[string]interface{}{
            "environment": "test",
            "vpc_cidr":    "10.99.0.0/16",
        },
    }

    defer terraform.Destroy(t, opts)
    terraform.InitAndApply(t, opts)

    // Verify VPC was created
    vpcId := terraform.Output(t, opts, "vpc_id")
    assert.Contains(t, vpcId, "vpc-")

    // Verify the module reports two private subnets
    subnetIds := terraform.OutputList(t, opts, "private_subnet_ids")
    assert.Equal(t, 2, len(subnetIds))

    // Verify each reported subnet actually exists inside the VPC
    vpcSubnets := aws.GetSubnetsForVpc(t, vpcId, "us-east-1")
    vpcSubnetIds := make([]string, 0, len(vpcSubnets))
    for _, subnet := range vpcSubnets {
        vpcSubnetIds = append(vpcSubnetIds, subnet.Id)
    }
    for _, subnetId := range subnetIds {
        assert.Contains(t, vpcSubnetIds, subnetId)
    }

    // Verify DNS hostnames are enabled (assumes the module exposes an
    // enable_dns_hostnames output)
    assert.Equal(t, "true", terraform.Output(t, opts, "enable_dns_hostnames"))
}
```

### When to Run Integration Tests

| Trigger | What to Test | Why |
|---|---|---|
| Nightly scheduled run | All modules | Catch provider API changes, drift in AMI IDs, expired certificates |
| Before tagging a module release | The module being released | Verify it works against real APIs before consumers adopt it |
| After a major provider upgrade | All modules using that provider | Verify compatibility with new API behaviors |
| After a significant refactoring | The refactored module | Verify the refactoring did not break functionality |

### Integration Test Cost Management

- Run in a dedicated test account with billing alerts
- Use the smallest viable resource sizes (`t3.micro`, `db.t3.micro`)
- Always `defer terraform.Destroy()` immediately after building the options so cleanup runs even when assertions fail, and raise the Go test timeout to cover slow applies (`go test -timeout 45m`)
- Tag all test resources with `Environment = "test"` and a TTL tag
- Run a nightly sweeper that destroys any resources older than 24 hours in the test account
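The tagging guidance above can be enforced centrally with the AWS provider's `default_tags` block instead of tagging each resource individually (the TTL value and region are illustrative):

```hcl
provider "aws" {
  region = "us-east-1"

  default_tags {
    tags = {
      Environment = "test"
      TTL         = "24h"   # the nightly sweeper destroys anything older than this
    }
  }
}
```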

## Choosing What to Test Where

| What You Want to Verify | Test Level | Tool | Cost |
|---|---|---|---|
| Valid HCL syntax | Static | `terraform validate` | Free, instant |
| Provider-specific config errors | Static | `tflint` | Free, instant |
| Security misconfigurations | Static | `checkov` | Free, instant |
| Required tags present | Plan-based | `conftest` | Free, 1-3 min |
| No open security groups | Plan-based | `conftest` | Free, 1-3 min |
| No accidental destroys | Plan-based | `conftest` | Free, 1-3 min |
| Monthly cost within budget | Plan-based | `infracost` | Free tier, 1-2 min |
| Resources actually work | Integration | `terratest` | Cloud costs, 10-30 min |
| Cross-resource connectivity | Integration | `terratest` | Cloud costs, 10-30 min |
| Module output contracts | Integration | `terratest` | Cloud costs, 10-30 min |

**The 80/20 rule**: Static analysis and plan-based testing catch 80% of issues at 1% of the cost. Integration testing catches the remaining 20% at 99% of the cost. Invest heavily in levels 1-3 before spending on level 4.

## The Agent Testing Workflow

When an agent writes or modifies Terraform:

```
1. Write the changes
2. Run: terraform fmt (fix formatting)
3. Run: terraform validate (catch syntax errors)
4. Run: tflint (catch provider-specific issues)
5. Run: checkov (catch security issues)
   ─── Fix any errors found in steps 2-5 ───
6. Run: terraform plan -out=tfplan
7. Run: conftest test (policy checks on plan)
8. Run: infracost breakdown (cost estimate)
9. Present plan summary + cost estimate to human
10. WAIT for approval
11. On approval: terraform apply tfplan
```

Steps 2-5 are automated and self-correcting — the agent fixes issues it finds. Steps 6-8 produce information for the human. Step 9 is the safety gate. Steps 2-8 together take 2-5 minutes and catch the vast majority of issues before a human ever sees the plan.

