
Building a Production VPC with Terraform: My Standard Setup

The VPC architecture I deploy for every new AWS project - multi-AZ, public/private subnets, NAT gateways, and cost optimizations.

Terraform · AWS · VPC · Networking · Tutorial

Every AWS project starts with a VPC. Over time, I've developed a standard architecture that handles most production workloads. It's multi-AZ for high availability, properly segmented for security, and includes cost optimizations I've learned through experience.

Here's what I deploy and why.

The Architecture

┌─────────────────────────────────────────────────────┐
│                    VPC (10.0.0.0/16)                │
│  ┌─────────────────┐    ┌─────────────────┐        │
│  │   AZ us-east-1a │    │   AZ us-east-1b │        │
│  │  ┌───────────┐  │    │  ┌───────────┐  │        │
│  │  │  Public   │  │    │  │  Public   │  │        │
│  │  │ 10.0.1.0  │  │    │  │ 10.0.2.0  │  │        │
│  │  └───────────┘  │    │  └───────────┘  │        │
│  │  ┌───────────┐  │    │  ┌───────────┐  │        │
│  │  │  Private  │  │    │  │  Private  │  │        │
│  │  │ 10.0.11.0 │  │    │  │ 10.0.12.0 │  │        │
│  │  └───────────┘  │    │  └───────────┘  │        │
│  │  ┌───────────┐  │    │  ┌───────────┐  │        │
│  │  │ Database  │  │    │  │ Database  │  │        │
│  │  │ 10.0.21.0 │  │    │  │ 10.0.22.0 │  │        │
│  │  └───────────┘  │    │  └───────────┘  │        │
│  └─────────────────┘    └─────────────────┘        │
└─────────────────────────────────────────────────────┘

Three tiers across two availability zones:

  • Public subnets - Load balancers, NAT gateways, bastion hosts
  • Private subnets - Application servers, containers
  • Database subnets - RDS, ElastiCache, no internet access

Core VPC Setup

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name        = "${var.project}-${var.environment}-vpc"
    Environment = var.environment
  }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

I use /16 for the VPC CIDR. It provides 65,536 addresses - more than enough room to grow without re-architecting.
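
The snippets in this post reference a few input variables. A minimal sketch of how they might be declared - the names match the references below, while the descriptions and the region default are my assumptions:

variable "project" {
  description = "Project name used in resource names and tags"
  type        = string
}

variable "environment" {
  description = "Deployment environment, e.g. dev, staging, prod"
  type        = string
}

variable "region" {
  description = "AWS region, used later to build VPC endpoint service names"
  type        = string
  default     = "us-east-1"
}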

Subnet Strategy

I calculate CIDRs with cidrsubnet to avoid manual errors. With a /16 base and 8 additional bits, each index yields a /24: index 1 is 10.0.1.0/24, index 11 is 10.0.11.0/24, and so on, matching the diagram above:

locals {
  azs = ["us-east-1a", "us-east-1b"]

  public_cidrs   = [for i, az in local.azs : cidrsubnet("10.0.0.0/16", 8, i + 1)]
  private_cidrs  = [for i, az in local.azs : cidrsubnet("10.0.0.0/16", 8, i + 11)]
  database_cidrs = [for i, az in local.azs : cidrsubnet("10.0.0.0/16", 8, i + 21)]
}

resource "aws_subnet" "public" {
  count                   = length(local.azs)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = local.public_cidrs[count.index]
  availability_zone       = local.azs[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name = "${var.project}-public-${local.azs[count.index]}"
    Tier = "public"
  }
}

resource "aws_subnet" "private" {
  count             = length(local.azs)
  vpc_id            = aws_vpc.main.id
  cidr_block        = local.private_cidrs[count.index]
  availability_zone = local.azs[count.index]

  tags = {
    Name = "${var.project}-private-${local.azs[count.index]}"
    Tier = "private"
  }
}

resource "aws_subnet" "database" {
  count             = length(local.azs)
  vpc_id            = aws_vpc.main.id
  cidr_block        = local.database_cidrs[count.index]
  availability_zone = local.azs[count.index]

  tags = {
    Name = "${var.project}-database-${local.azs[count.index]}"
    Tier = "database"
  }
}

NAT Gateway Configuration

NAT gateways give instances in private subnets outbound internet access. They're also expensive - about $32/month each, plus data processing fees.

resource "aws_eip" "nat" {
  count  = var.single_nat_gateway ? 1 : length(local.azs)
  domain = "vpc"
}

resource "aws_nat_gateway" "main" {
  count         = var.single_nat_gateway ? 1 : length(local.azs)
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id

  depends_on = [aws_internet_gateway.main]
}

My approach: one NAT gateway per AZ in production for high availability, and a single shared NAT in dev/staging to save costs. The single_nat_gateway variable controls this.
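
A sketch of how that toggle might be declared and set per environment - the tfvars file names here are illustrative:

variable "single_nat_gateway" {
  description = "Use one shared NAT gateway instead of one per AZ"
  type        = bool
  default     = false
}

# dev.tfvars
single_nat_gateway = true

# prod.tfvars
single_nat_gateway = false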

Routing

Public subnets route to the internet gateway:

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

Private subnets route through NAT:

resource "aws_route_table" "private" {
  count  = length(local.azs)
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = var.single_nat_gateway ? aws_nat_gateway.main[0].id : aws_nat_gateway.main[count.index].id
  }
}

Database subnets get no internet route - they should never reach out directly.
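
Each subnet also needs a route table association, or it silently falls back to the VPC's main route table. A minimal sketch of the associations, plus a local-only table for the database tier - keeping those subnets on a dedicated empty table is one reasonable choice, not the only one:

resource "aws_route_table_association" "public" {
  count          = length(local.azs)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "private" {
  count          = length(local.azs)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}

# Local-only: no 0.0.0.0/0 route at all
resource "aws_route_table" "database" {
  vpc_id = aws_vpc.main.id
}

resource "aws_route_table_association" "database" {
  count          = length(local.azs)
  subnet_id      = aws_subnet.database[count.index].id
  route_table_id = aws_route_table.database.id
}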

VPC Endpoints for Cost Savings

This one is often overlooked. Traffic to S3 and DynamoDB routed through a NAT gateway incurs data processing charges. Gateway endpoints are free and keep that traffic on the AWS network.

resource "aws_vpc_endpoint" "s3" {
  vpc_id       = aws_vpc.main.id
  service_name = "com.amazonaws.${var.region}.s3"

  route_table_ids = concat(
    [aws_route_table.public.id],
    aws_route_table.private[*].id
  )
}

resource "aws_vpc_endpoint" "dynamodb" {
  vpc_id       = aws_vpc.main.id
  service_name = "com.amazonaws.${var.region}.dynamodb"

  route_table_ids = aws_route_table.private[*].id
}

I've seen accounts save hundreds per month just by adding these endpoints.

Security Group Baseline

I lock down the default security group to prevent accidental use:

resource "aws_default_security_group" "default" {
  vpc_id = aws_vpc.main.id
  # No rules - blocks all traffic
}

Then create explicit security groups for each tier with minimal required access.
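
As an illustration, a hypothetical app-tier group that only accepts traffic from a load balancer group - the port and the alb group name are assumptions, not part of the setup above:

# Assumes an aws_security_group.alb already exists for the load balancer
resource "aws_security_group" "app" {
  name_prefix = "${var.project}-app-"
  vpc_id      = aws_vpc.main.id

  # Only the load balancer tier can reach the application port
  ingress {
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}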

What I Output

output "vpc_id" {
  value = aws_vpc.main.id
}

output "public_subnet_ids" {
  value = aws_subnet.public[*].id
}

output "private_subnet_ids" {
  value = aws_subnet.private[*].id
}

output "database_subnet_ids" {
  value = aws_subnet.database[*].id
}

These outputs feed into other modules - EKS clusters, RDS instances, load balancers.
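
For example, a hypothetical RDS subnet group consuming the database subnet IDs, assuming this setup is wrapped as a module - the module path and the "myapp" names are placeholders:

module "vpc" {
  source      = "./modules/vpc"
  project     = "myapp"
  environment = "prod"
}

resource "aws_db_subnet_group" "main" {
  name       = "myapp-prod-db"
  subnet_ids = module.vpc.database_subnet_ids
}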

Cost Optimization

Component            Monthly Cost     How I Optimize
NAT Gateway          $32 + data       Single NAT for non-prod
Elastic IPs          $3.65 if unused  Delete when not attached
Interface Endpoints  $7.30 each       Only enable what's needed
Gateway Endpoints    Free             Always enable S3 + DynamoDB

For dev environments, I sometimes skip NAT entirely and use SSM Session Manager for access. No internet egress needed, no NAT cost.
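
For that pattern to work with no internet path at all, the instances need interface endpoints for Systems Manager. A sketch - the endpoints security group is an assumption and must allow HTTPS from the private subnets:

resource "aws_vpc_endpoint" "ssm" {
  for_each = toset(["ssm", "ssmmessages", "ec2messages"])

  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${var.region}.${each.key}"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = aws_subnet.private[*].id
  security_group_ids  = [aws_security_group.endpoints.id] # assumed SG allowing 443 from private subnets
  private_dns_enabled = true
}

Each of these interface endpoints carries the ~$7.30/month cost from the table above, which is still usually cheaper than a NAT gateway.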

Lessons Learned

Always deploy across multiple AZs. I've seen single-AZ deployments go down during AWS outages. The extra NAT gateway cost is insurance.

Database subnets should never have internet routes. If your database needs to reach the internet, something is wrong with your architecture.

Enable VPC Flow Logs for production. When security incidents happen, you'll want the network traffic logs. Store them in S3 with lifecycle policies to manage costs.
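
A minimal sketch of flow logs delivered to S3 - the bucket resource is assumed to exist elsewhere, with its own lifecycle rules:

resource "aws_flow_log" "main" {
  vpc_id               = aws_vpc.main.id
  traffic_type         = "ALL"
  log_destination_type = "s3"
  log_destination      = aws_s3_bucket.flow_logs.arn # assumed bucket
}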

Tag everything. Proper tagging enables cost allocation and makes troubleshooting easier. I tag with Project, Environment, and Tier at minimum.
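
One way to enforce the Project and Environment tags consistently is the provider's default_tags block; the per-subnet Tier tags still go on each resource, as above:

provider "aws" {
  region = var.region

  default_tags {
    tags = {
      Project     = var.project
      Environment = var.environment
    }
  }
}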

Key Takeaways

  • Three-tier architecture - Public, private, database subnets provide proper isolation
  • Multi-AZ is essential - Single AZ is a single point of failure
  • NAT gateways are expensive - Use single NAT for non-prod, VPC endpoints where possible
  • Gateway endpoints are free - Always enable S3 and DynamoDB endpoints
  • Lock down defaults - Restrict the default security group
  • Plan for growth - Use /16 CIDR to avoid future re-architecture

Written by Bar Tsveker

Senior CloudOps Engineer specializing in AWS, Terraform, and infrastructure automation.

Thanks for reading! Have questions or feedback?