Terraform Project Structure: What Actually Works in Production

Lessons learned from structuring Terraform projects across multiple teams and environments - what works, what doesn't, and why.

Tags: Terraform, IaC, DevOps, Best Practices

After managing Terraform across multiple production environments, I've learned that project structure decisions made early have compounding effects. A poor structure leads to state conflicts, copy-paste drift between environments, and onboarding friction that slows down every new team member.

Here's what actually works.

The Structure I Use

For most projects, I use an environment-based structure:

terraform/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── terraform.tfvars
│   │   └── backend.tf
│   ├── staging/
│   └── prod/
├── modules/
│   ├── vpc/
│   ├── eks/
│   └── rds/
└── README.md

Why This Works:

  • Isolated state files - Each environment has its own state, so a dev apply can't modify resources tracked in prod's state.
  • Clear promotion path - Changes flow from dev → staging → prod with review gates.
  • Shared modules - Common patterns live in modules/, eliminating copy-paste drift.

I started with a flat structure on smaller projects but hit scaling issues quickly. The environment-based approach handles growth without major refactoring.
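
To make the layout concrete, here is a minimal sketch of what environments/dev/main.tf might look like when it consumes the shared modules. The input names (environment, cidr_block, vpc_id, subnet_ids) and the module outputs are assumptions for illustration, not an interface taken from a real project:

```hcl
# environments/dev/main.tf
# Input and output names below are illustrative; use whatever
# variables and outputs your modules actually declare.

provider "aws" {
  region = "us-east-1"
}

module "vpc" {
  source = "../../modules/vpc"

  environment = "dev"
  cidr_block  = "10.0.0.0/16"
}

module "eks" {
  source = "../../modules/eks"

  environment = "dev"
  vpc_id      = module.vpc.vpc_id             # assumes modules/vpc exports vpc_id
  subnet_ids  = module.vpc.private_subnet_ids # assumes modules/vpc exports private_subnet_ids
}
```

staging and prod would call the same modules with different values, so the environments differ only in their inputs rather than in copied resource definitions.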

State File Strategy

At minimum, use one state file per environment. For larger infrastructures, I split by component:

prod/networking.tfstate
prod/compute.tfstate
prod/database.tfstate

This reduces blast radius - a networking apply can't modify database resources, because they live in a different state. It also lets components be applied in parallel and gives teams ownership of specific components.
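
As a rough sketch, each component gets its own backend configuration pointing at its own key. The file path and the lock table name are assumptions; the bucket, key, and region reuse the values from this post's examples:

```hcl
# environments/prod/networking/backend.tf (hypothetical path for a component-level root module)
terraform {
  backend "s3" {
    bucket  = "mycompany-terraform-state"
    key     = "prod/networking.tfstate" # one key per component
    region  = "us-east-1"
    encrypt = true                      # server-side encryption for the state object

    # Assumption: a DynamoDB table with this name already exists for state locking.
    dynamodb_table = "terraform-locks"
  }
}
```

Backend blocks can't reference variables, which is one reason each environment or component keeps its own backend.tf.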

When components need to reference each other:

data "terraform_remote_state" "vpc" {
  backend = "s3"
  config = {
    bucket = "mycompany-terraform-state"
    key    = "prod/networking.tfstate"
    region = "us-east-1"
  }
}

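# Use outputs from the networking state. The networking configuration
# must declare an output named private_subnet_ids for this to resolve.
locals {
  private_subnet_ids = data.terraform_remote_state.vpc.outputs.private_subnet_ids
}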
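
The lookup only works if the producing side exports the value. A minimal sketch of the matching output in the networking configuration, assuming it wraps a module "vpc" that exposes private_subnet_ids (both names are assumptions):

```hcl
# In the networking root module (the one that writes prod/networking.tfstate).
# Assumes a module "vpc" block exists there and exports private_subnet_ids.
output "private_subnet_ids" {
  description = "IDs of the private subnets, consumed by other components via remote state"
  value       = module.vpc.private_subnet_ids
}
```

Only root-module outputs are visible through terraform_remote_state, so values from nested modules have to be re-exported like this.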

Variable Management

I use a consistent pattern across all projects:

environments/prod/
├── terraform.tfvars       # Non-sensitive defaults
└── backend.tf             # State configuration

Sensitive values never go in files. I pull them from SSM Parameter Store or pass them via environment variables in CI/CD:

data "aws_ssm_parameter" "db_password" {
  name            = "/myapp/prod/db_password"
  with_decryption = true
}

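This keeps secrets out of version control. Note that values read through a data source or passed in as variables are still written to the state file in plain text, so the state bucket needs encryption and tight access control.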
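
For the environment-variable route, one possible shape is a variable with no default that CI fills in through TF_VAR_db_password. The master_password input on the rds module is hypothetical:

```hcl
# Declared in the environment's root module. No default, so the value
# must come from outside, e.g. TF_VAR_db_password set by the CI system.
variable "db_password" {
  description = "Database master password, supplied via TF_VAR_db_password"
  type        = string
  sensitive   = true # redacts the value in plan and apply output
}

# Hypothetical wiring: assumes modules/rds accepts a master_password input.
module "rds" {
  source          = "../../modules/rds"
  master_password = var.db_password
}
```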

The .gitignore That Matters

*.tfstate
*.tfstate.*
.terraform/
*.tfvars
!example.tfvars
*.tfplan
crash.log

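Two notes: I commit .terraform.lock.hcl for reproducible builds, and I include an example.tfvars showing the expected variables without real values. Be aware that *.tfvars also matches each environment's terraform.tfvars; if you want those non-sensitive defaults versioned, add a negation such as !terraform.tfvars.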
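
A sketch of what such an example.tfvars could contain. The variable names are placeholders, not taken from a real project:

```hcl
# example.tfvars - copy to terraform.tfvars and fill in real values.
# Variable names here are placeholders for illustration.
environment       = "dev"
aws_region        = "us-east-1"
vpc_cidr          = "10.0.0.0/16"
db_instance_class = "db.t3.medium"

# Secrets are intentionally absent; they come from SSM Parameter Store
# or TF_VAR_* environment variables.
```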

CI/CD Integration

Every project gets automated plan/apply from day one. My typical GitHub Actions workflow:

  1. On PR - Run terraform plan, post output as PR comment
  2. On merge to main - Run terraform apply with manual approval for prod

The key insight: reviewing plan output in PRs catches issues before they reach any environment. This has stopped many changes that would otherwise have caused production incidents.

Lessons Learned

Start simple, split when needed. I've seen teams over-engineer structure upfront. Start with environment-based, split by component only when you hit actual pain points.

Document your conventions. A README explaining your structure saves hours for every new team member. Include why you made certain choices, not just what they are.

Enforce standards with automation. Use terraform fmt -check and terraform validate in CI. Manual code review catches logic issues; automation catches formatting and syntax.

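Lock your versions. Pin Terraform and provider versions. I've been burned by provider updates breaking production applies. Note that a constraint like ~> 5.0 accepts any 5.x release; the committed .terraform.lock.hcl is what fixes the exact provider version between runs.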

terraform {
  required_version = "= 1.6.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

Key Takeaways

  • Environment-based structure scales - Isolated state files prevent cross-environment accidents
  • Split state by component at scale - Reduces blast radius and enables team ownership
  • Never commit secrets - Use SSM Parameter Store or environment variables
  • Automate from day one - CI/CD with plan review catches issues before production
  • Document your decisions - Future team members need context, not just code
  • Pin your versions - Reproducible builds prevent surprise breakages

Written by Bar Tsveker

Senior CloudOps Engineer specializing in AWS, Terraform, and infrastructure automation.
