---
title: "Terraform Cost Management: Writing Cost-Aware Infrastructure Code"
description: "How to write Terraform that does not surprise you with cloud bills. Covers Infracost integration for pre-apply cost estimates, cost-aware resource sizing patterns, right-sizing for dev vs production, the most expensive resources per cloud provider, tagging for cost allocation, reserved capacity vs on-demand decisions, and agent patterns for cost-conscious infrastructure."
url: https://agent-zone.ai/knowledge/infrastructure/terraform-cost-management/
section: knowledge
date: 2026-02-22
categories: ["infrastructure"]
tags: ["terraform","cost-management","infracost","right-sizing","reserved-instances","tagging","cloud-costs","finops"]
skills: ["terraform-cost-awareness","infracost-integration","resource-sizing","cost-allocation"]
tools: ["terraform","infracost","aws-cli","az","gcloud"]
levels: ["intermediate"]
word_count: 1170
formats:
  json: https://agent-zone.ai/knowledge/infrastructure/terraform-cost-management/index.json
  html: https://agent-zone.ai/knowledge/infrastructure/terraform-cost-management/?format=html
  api: https://api.agent-zone.ai/api/v1/knowledge/search?q=Terraform+Cost+Management%3A+Writing+Cost-Aware+Infrastructure+Code
---


# Terraform Cost Management

The most expensive line in your cloud bill was written in a `.tf` file. A single `instance_type` choice, a forgotten NAT Gateway, or an over-provisioned RDS instance can cost thousands per month — and none of these show up in `terraform plan`. Plan shows what changes. It does not show what it costs.

This article covers how to write cost-aware Terraform and catch expensive decisions before they reach production.

## The Cost Visibility Gap

`terraform plan` output:

```
# aws_instance.app will be created
+ resource "aws_instance" "app" {
    + instance_type = "r6g.2xlarge"
    ...
  }

Plan: 1 to add, 0 to change, 0 to destroy.
```

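What the plan does not tell you: at us-east-1 on-demand Linux rates, `r6g.2xlarge` costs $0.4032/hr, or about $294/month at 730 hours. If the workload fits in 2 vCPUs and 4 GiB of memory, a `t3.medium` at $0.0416/hr (about $30/month) would handle it.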
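The arithmetic behind those figures is the hourly rate multiplied by roughly 730 hours per month. A minimal sketch in Terraform locals, using the two rates quoted above (the local and output names are illustrative and not part of any existing configuration):

```hcl
locals {
  # 8,760 hours per year / 12 months = 730 hours per month
  hours_per_month = 730

  # On-demand hourly rates in USD, copied from the text above
  hourly_usd = {
    "r6g.2xlarge" = 0.4032
    "t3.medium"   = 0.0416
  }

  # Monthly cost per instance type: hourly rate x hours per month
  monthly_usd = {
    for instance_type, rate in local.hourly_usd :
    instance_type => rate * local.hours_per_month
  }
}

output "monthly_usd" {
  value = local.monthly_usd
}
```

Evaluating this gives about $294 for `r6g.2xlarge` and about $30 for `t3.medium`, a difference of roughly $264 per month for a single instance.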

## Infracost: Cost Estimates in the Workflow

Infracost reads your Terraform code, or a plan exported as JSON, and estimates the monthly cost of every resource it has pricing data for.

### Setup

```bash
# Install
brew install infracost  # macOS
# or
curl -fsSL https://raw.githubusercontent.com/infracost/infracost/master/scripts/install.sh | sh

# Register for free API key
infracost auth login

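# Generate cost estimate from a plan
cd infrastructure/
terraform plan -out=tfplan
terraform show -json tfplan > plan.json
infracost breakdown --path=plan.json

# Or skip the plan and let Infracost parse the HCL directly
infracost breakdown --path=.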
```

### Example Output

```
Project: infrastructure

Name                                     Monthly Qty  Unit         Monthly Cost
aws_instance.app
├─ Instance usage (Linux/UNIX, on-demand, r6g.2xlarge)   730  hours              $294.34
├─ root_block_device
│  └─ Storage (general purpose SSD, gp3)                  50  GB                   $4.00
└─ ebs_block_device[0]
   └─ Storage (general purpose SSD, gp3)                 200  GB                  $16.00

aws_nat_gateway.main
├─ NAT gateway                                           730  hours               $32.85
└─ Data processed                                  Monthly cost depends on usage

aws_db_instance.main
├─ Database instance (on-demand, db.r6g.large)           730  hours              $175.20
└─ Storage (general purpose SSD, gp3)                    100  GB                  $11.50

OVERALL TOTAL                                                                    $533.89

──────────────────────────────────
12 cloud resources were detected:
∙ 3 were estimated, 9 were free or had no cost data.
```

### Infracost in CI/CD

```yaml
# GitHub Actions: post cost estimate as PR comment
# Assumes Infracost is installed on the runner and INFRACOST_API_KEY is set
- name: Infracost Breakdown
  run: |
    infracost breakdown --path=. \
      --format=json \
      --out-file=/tmp/infracost.json

- name: Infracost Comment
  run: |
    infracost comment github \
      --path=/tmp/infracost.json \
      --repo=${{ github.repository }} \
      --pull-request=${{ github.event.pull_request.number }} \
      --github-token=${{ secrets.GITHUB_TOKEN }} \
      --behavior=update
```

### Infracost Policy (Budget Guardrails)

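The Infracost config file (`infracost.yml`) describes projects and has no budget setting, so a simple threshold is usually enforced as a CI step against the JSON output. OPA policies and Infracost Cloud guardrails are the heavier-weight options.

```yaml
# GitHub Actions: fail the job if the estimate exceeds a threshold.
# Assumes the breakdown step above wrote /tmp/infracost.json
# and that jq and awk are available on the runner.
- name: Enforce monthly budget
  run: |
    total=$(jq -r '.totalMonthlyCost // "0"' /tmp/infracost.json)
    echo "Estimated monthly cost (USD): ${total}"
    # Exit non-zero (fail the step) if estimated cost > $1000/month
    awk -v t="$total" 'BEGIN { exit (t > 1000) }'
```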

## The Most Expensive Resources

Resources that cause the biggest bill surprises:

### AWS

| Resource | Common Mistake | Monthly Cost | Fix |
|---|---|---|---|
| NAT Gateway | One per AZ in dev | $32/mo each, idle | Use 1 in dev, per-AZ in prod only |
| RDS Multi-AZ | Enabled in dev | 2x instance cost | `multi_az = false` for dev |
| EBS volumes | gp3 200GB per instance | $16/mo each | Right-size, delete unattached |
| EKS cluster | Cluster fee exists even empty | $73/mo | Cannot avoid, factor into budget |
| Elastic IP | Allocated but unused | $3.65/mo each | Release unused addresses; since February 2024 every public IPv4 address is billed at this rate, attached or not |
| CloudWatch Logs | High-verbosity logging | $0.50/GB ingested | Reduce log levels in dev |
| Data transfer | Cross-AZ traffic | $0.01/GB in each direction | Place communicating resources in same AZ |

### Azure

| Resource | Common Mistake | Monthly Cost | Fix |
|---|---|---|---|
| AKS node pool | Standard_D4s_v5 in dev | $140/mo per node | Use Standard_B2s for dev |
| Azure Firewall | Always-on in dev | $912/mo | Use NSGs in dev, Firewall in prod only |
| Log Analytics | Ingesting everything | $2.76/GB | Configure data collection rules |
| App Gateway v2 | Running in dev | $175/mo | Use simple LB in dev |
| Premium SSD | P30 (1TB) for small workloads | $122/mo | Use Standard SSD or right-size |

### GCP

| Resource | Common Mistake | Monthly Cost | Fix |
|---|---|---|---|
| GKE cluster | Management fee | $73/mo | Factor into budget; the GKE free tier credit offsets the fee for one zonal or Autopilot cluster per billing account |
| Cloud NAT | Per-VM charge + data | $32/mo base | Limit in dev |
| Cloud SQL | db-custom-4-16384 in dev | $230/mo | Use db-f1-micro or shared-core for dev |
| Cloud Armor | Per-policy + per-request | $7/mo + usage | Dev does not need DDoS protection |
| Persistent Disk | SSD 500GB per node | $85/mo each | Right-size, use standard PD in dev |

## Right-Sizing Patterns

### Environment-Based Sizing

```hcl
variable "environment" {
  type = string
}

locals {
  sizing = {
    dev = {
      instance_type    = "t3.small"
      db_instance      = "db.t3.micro"
      node_count       = 1
      disk_size        = 20
      multi_az         = false
      nat_gateway_count = 1
    }
    staging = {
      instance_type    = "t3.medium"
      db_instance      = "db.t3.small"
      node_count       = 2
      disk_size        = 50
      multi_az         = false
      nat_gateway_count = 1
    }
    prod = {
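      # r6g is arm64 (Graviton) while the t3 types above are x86_64,
      # so the AMI must be selected per architecture
      instance_type    = "r6g.large"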
      db_instance      = "db.r6g.large"
      node_count       = 3
      disk_size        = 200
      multi_az         = true
      nat_gateway_count = 3  # one per AZ
    }
  }

  config = local.sizing[var.environment]
}
```
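A lookup like `local.sizing[var.environment]` fails with an invalid-index error if someone passes an unexpected value such as `production`. One way to surface that earlier, with a clearer message, is a `validation` block on the variable. This is a sketch that would replace the bare `variable "environment"` declaration above; the description and error wording are illustrative:

```hcl
variable "environment" {
  type        = string
  description = "Deployment environment; selects a sizing profile."

  validation {
    # Keep this list in sync with the keys of local.sizing.
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment must be one of: dev, staging, prod."
  }
}
```

Resources then read from the selected profile, for example `instance_type = local.config.instance_type`.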

### Conditional Expensive Resources

```hcl
# NAT Gateway: 1 in dev, per-AZ in prod
resource "aws_nat_gateway" "main" {
  count     = local.config.nat_gateway_count
  subnet_id = aws_subnet.public[count.index].id
  # ...
}

# WAF: prod only
resource "aws_wafv2_web_acl" "main" {
  count = var.environment == "prod" ? 1 : 0
  # ...
}

# Multi-AZ RDS: prod only
resource "aws_db_instance" "main" {
  instance_class = local.config.db_instance
  multi_az       = local.config.multi_az
  # ...
}
```
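Resources created with `count` become lists, so anything that refers to them has to handle the zero-instance and single-instance cases. The following sketch assumes a hypothetical `aws_route_table.private` (created with `count`, one per private subnet) and a hypothetical `aws_lb.app`; neither appears in the configuration above. It also assumes `nat_gateway_count` is at least 1.

```hcl
# Spread private route tables across however many NAT Gateways exist.
# With nat_gateway_count = 1 (dev), every route table uses index 0.
resource "aws_route" "private_nat" {
  count                  = length(aws_route_table.private)
  route_table_id         = aws_route_table.private[count.index].id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.main[count.index % local.config.nat_gateway_count].id
}

# Attach the WAF only where it exists. The association uses the same
# condition as the web ACL, so index [0] is safe whenever count is 1.
resource "aws_wafv2_web_acl_association" "app" {
  count        = var.environment == "prod" ? 1 : 0
  resource_arn = aws_lb.app.arn
  web_acl_arn  = aws_wafv2_web_acl.main[0].arn
}

# For outputs or optional arguments, one() returns null when the list is empty.
output "waf_acl_arn" {
  value = one(aws_wafv2_web_acl.main[*].arn)
}
```

Sharing one NAT Gateway in dev trades availability for cost: if its AZ has an outage, private subnets in the other AZs lose outbound access, which is usually acceptable outside production.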

## Tagging for Cost Allocation

Without tags, your cloud bill is a single number. With tags, you can attribute costs to teams, projects, and environments.

### Required Tags

```hcl
locals {
  required_tags = {
    Environment = var.environment
    Project     = var.project
    Team        = var.team
    ManagedBy   = "terraform"
    CostCenter  = var.cost_center
  }
}

# AWS — apply via provider default_tags
provider "aws" {
  default_tags {
    tags = local.required_tags
  }
}

# Azure: azurerm has no default_tags equivalent, so set tags on each resource
resource "azurerm_resource_group" "main" {
  name     = "${var.project}-${var.environment}-rg"
  location = var.location
  tags     = local.required_tags
}

# GCP: labels instead of tags; label keys and values must be lowercase
resource "google_compute_instance" "app" {
  labels = { for k, v in local.required_tags : lower(k) => lower(v) }
  # ...
}
```
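The tag map above relies on `var.project`, `var.team`, and `var.cost_center`, which are not declared in the snippet. A minimal sketch of those declarations follows, with a validation rule so an empty or malformed cost center fails at plan time. The cost center format is a hypothetical convention, and `azurerm_storage_account.logs` is an illustrative resource:

```hcl
variable "project" {
  type        = string
  description = "Project name used for cost allocation."
}

variable "team" {
  type        = string
  description = "Owning team."
}

variable "cost_center" {
  type        = string
  description = "Finance cost center code."

  validation {
    # Hypothetical format: one or more letters, digits, or dashes.
    condition     = can(regex("^[A-Za-z0-9-]+$", var.cost_center))
    error_message = "The cost_center value must be non-empty and contain only letters, digits, and dashes."
  }
}

# Resource-specific tags are added on top of the required set.
resource "azurerm_storage_account" "logs" {
  # ...
  tags = merge(local.required_tags, { Component = "logging" })
}
```

With the AWS provider, tags set directly on a resource are merged with `default_tags` automatically, so the explicit `merge()` is only needed where the provider has no default mechanism.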

### Enforcing Tags with Policy

```hcl
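# AWS: SCP that denies creating untagged resources
resource "aws_organizations_policy" "require_tags" {
  name = "require-cost-allocation-tags"
  type = "SERVICE_CONTROL_POLICY"
  content = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "DenyUntaggedInstances"
        Effect = "Deny"
        Action = "ec2:RunInstances"
        # Scope to instances: RunInstances is also authorized against AMIs,
        # subnets, and other resources that carry no request tags, so "*"
        # would deny every launch.
        Resource  = "arn:aws:ec2:*:*:instance/*"
        Condition = { "Null" = { "aws:RequestTag/CostCenter" = "true" } }
      },
      {
        Sid       = "DenyUntaggedDatabases"
        Effect    = "Deny"
        Action    = "rds:CreateDBInstance"
        Resource  = "*"
        Condition = { "Null" = { "aws:RequestTag/CostCenter" = "true" } }
      }
      # s3:CreateBucket is left out: confirm that bucket creation supports
      # aws:RequestTag in your setup before adding it, or every new bucket is denied.
    ]
  })
}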
```
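A service control policy has no effect until it is attached to an organization root, an organizational unit, or an account. A minimal sketch of the attachment, where `var.workloads_ou_id` is a hypothetical variable holding the target OU ID:

```hcl
variable "workloads_ou_id" {
  type        = string
  description = "ID of the organizational unit to enforce tagging on (hypothetical)."
}

resource "aws_organizations_policy_attachment" "require_tags" {
  policy_id = aws_organizations_policy.require_tags.id
  target_id = var.workloads_ou_id
}
```

Attaching to a single non-production OU first is a reasonable way to find out which pipelines create untagged resources before the deny applies everywhere. SCPs do not restrict the organization's management account.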

## Reserved Capacity Decisions

### When to Reserve

| Signal | Action |
|---|---|
| Resource running 24/7 for 6+ months | Consider 1-year reserved |
| Resource running 24/7 for 12+ months with stable sizing | Consider 3-year reserved |
| Workload is bursty or experimental | Stay on-demand |
| Planning to change instance size | Wait until sizing is stable |
| Workload tolerates interruption (batch, CI/CD) | Use spot instances, not reserved |

### Terraform and Reserved Instances

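Reserved instances (AWS RIs, Azure Reserved VM Instances, GCP CUDs) are billing constructs and are usually purchased outside Terraform. Terraform creates ordinary on-demand instances, and the cloud provider's billing system applies the discount automatically when a running instance matches an active reservation. For AWS, matching is on instance type, region or AZ, platform, and tenancy.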

```hcl
# This is an on-demand instance in Terraform
# If you have a matching RI, AWS applies the discount automatically
resource "aws_instance" "app" {
  instance_type = "r6g.large"  # matches your RI? discount applies
  # ...
}
```

**Do not** use `instance_market_options` for reserved instances; that block requests spot capacity. Reservations are typically purchased and tracked through the AWS/Azure/GCP billing console rather than in Terraform.
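For contrast, this is roughly what a spot request looks like on `aws_instance`. The resource name, `var.ami_id`, and `var.subnet_id` are hypothetical placeholders:

```hcl
variable "ami_id" {
  type = string
}

variable "subnet_id" {
  type = string
}

resource "aws_instance" "batch_worker" {
  ami           = var.ami_id
  instance_type = "t3.medium"
  subnet_id     = var.subnet_id

  # Requests spot capacity. Omit this block for on-demand instances,
  # which are what reservation discounts apply to.
  instance_market_options {
    market_type = "spot"

    spot_options {
      # Terminate on interruption; suits stateless batch work.
      instance_interruption_behavior = "terminate"
    }
  }
}
```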

## Agent Cost Awareness Workflow

When an agent writes Terraform that creates cloud resources:

1. **Choose the smallest viable size**: start with `t3.small` / `Standard_B2s` / `e2-small` for dev, and scale up based on load testing, not guessing.
2. **Check per-environment sizing**: dev should be significantly smaller than prod.
3. **Count the NAT Gateways**: one in total for dev, one per AZ for prod.
4. **Check for always-on expensive resources**: firewalls, application gateways, premium features.
5. **Add cost allocation tags**: every resource should be attributable to a team and project.
6. **Recommend Infracost**: if it is not already in the CI/CD pipeline, suggest adding it.
7. **Flag resources over $100/month**: call out expensive resources explicitly in plan summaries.
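Some of these checks can be encoded so that a plan fails when an agent (or a person) gets them wrong. A sketch using `precondition` blocks on a `terraform_data` resource, assuming the `local.config` sizing map from the right-sizing section and Terraform 1.4 or later; the resource name, rules, and messages are illustrative:

```hcl
resource "terraform_data" "cost_guardrails" {
  lifecycle {
    # Step 3: only prod may run more than one NAT Gateway.
    precondition {
      condition     = var.environment == "prod" || local.config.nat_gateway_count == 1
      error_message = "Non-prod environments should use exactly one NAT Gateway."
    }

    # Steps 2 and 4: Multi-AZ doubles database cost, so keep it out of dev and staging.
    precondition {
      condition     = var.environment == "prod" || !local.config.multi_az
      error_message = "Multi-AZ databases are reserved for prod."
    }
  }
}
```

Dollar thresholds such as the $100/month flag still need a pricing source like Infracost, since Terraform itself has no price data.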

