Gsd-skill-creator infrastructure-as-code

Provides Infrastructure as Code best practices for Terraform, Pulumi, CloudFormation, and OpenTofu. Use when provisioning infrastructure, writing IaC modules, managing cloud resources, scanning for misconfigurations, or when user mentions 'terraform', 'pulumi', 'cloudformation', 'IaC', 'opentofu', 'infrastructure', 'tfsec', 'checkov', 'drift'.

install
source · Clone the upstream repo
git clone https://github.com/Tibsfox/gsd-skill-creator
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Tibsfox/gsd-skill-creator "$T" && mkdir -p ~/.claude/skills && cp -r "$T/examples/skills/patterns/infrastructure-as-code" ~/.claude/skills/tibsfox-gsd-skill-creator-infrastructure-as-code && rm -rf "$T"
manifest: examples/skills/patterns/infrastructure-as-code/SKILL.md
source content

Infrastructure as Code

Best practices for managing cloud infrastructure declaratively with Terraform, Pulumi, CloudFormation, and OpenTofu. Covers module composition, state management, security scanning, and drift prevention.

IaC Tool Comparison

Choose the right tool based on team skills, cloud strategy, and operational requirements.

ToolLanguageState ManagementMulti-CloudLearning CurveEcosystem
TerraformHCLRemote backend (S3, GCS, etc.)ExcellentMediumLargest provider ecosystem
OpenTofuHCLSame as TerraformExcellentMediumFork-compatible with Terraform
PulumiTypeScript, Python, Go, C#Pulumi Cloud or self-managedExcellentLow for developersGrowing, SDK-based
CloudFormationYAML/JSONAWS-managedAWS onlyMediumNative AWS integration
CDKTypeScript, Python, Java, GoAWS-managed (synths to CFN)AWS onlyLow for developersLeverages CFN resources
Decision FactorRecommendation
Multi-cloud requiredTerraform or Pulumi
AWS-only shopCloudFormation or CDK
Team knows TypeScriptPulumi or CDK
Need open-source licenseOpenTofu
Existing Terraform codebaseStay Terraform or migrate to OpenTofu
Complex logic and loopsPulumi (general-purpose language)

State Management Patterns

State is the source of truth for what IaC has provisioned. Mismanaging state causes orphaned resources, duplicate deployments, and data loss.

+------------------+       +------------------+       +------------------+
|  Developer CLI   | ----> |  Remote Backend  | <---- |   CI/CD Pipeline  |
+------------------+       +------------------+       +------------------+
                           |  - S3 + DynamoDB |
                           |  - GCS + Lock    |
                           |  - Terraform Cloud|
                           +------------------+
PatternWhen to UseImplementation
Single state fileSmall projects, <20 resourcesOne backend config
State per environmentSeparate dev/staging/prodWorkspace or directory per env
State per componentLarge infra, team boundariesSeparate root modules with data sources
Hierarchical stateComplex multi-team orgsLayers: network -> compute -> app

Terraform Remote State Configuration

# backend.tf -- Remote state with locking
terraform {
  backend "s3" {
    bucket         = "myorg-terraform-state"
    key            = "prod/network/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}

# State locking table -- provision FIRST with a bootstrap module
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}

# Cross-stack references via remote state data source
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "myorg-terraform-state"
    key    = "prod/network/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  subnet_id = data.terraform_remote_state.network.outputs.private_subnet_ids[0]
}

Module Composition and Versioning

Modules are the unit of reuse in Terraform and OpenTofu. Good module design follows the single-responsibility principle.

modules/
  vpc/
    main.tf          # Resources
    variables.tf     # Input variables with validation
    outputs.tf       # Output values for consumers
    versions.tf      # Required providers and versions
environments/
  dev/
    main.tf          # Calls modules with dev parameters
    backend.tf       # Dev state backend
    terraform.tfvars # Dev variable values
  prod/
    main.tf
    backend.tf
    terraform.tfvars

Terraform Module Example

# modules/vpc/variables.tf
variable "name" {
  description = "Name prefix for all VPC resources"
  type        = string
  validation {
    condition     = length(var.name) <= 24
    error_message = "Name must be 24 characters or fewer."
  }
}

variable "cidr_block" {
  description = "CIDR block for the VPC"
  type        = string
  default     = "10.0.0.0/16"
  validation {
    condition     = can(cidrhost(var.cidr_block, 0))
    error_message = "Must be a valid CIDR block."
  }
}

variable "availability_zones" {
  type = list(string)
}

variable "tags" {
  type    = map(string)
  default = {}
}

# modules/vpc/main.tf
resource "aws_vpc" "this" {
  cidr_block           = var.cidr_block
  enable_dns_hostnames = true
  enable_dns_support   = true
  tags = merge(var.tags, { Name = "${var.name}-vpc" })
}

resource "aws_subnet" "public" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.this.id
  cidr_block        = cidrsubnet(var.cidr_block, 8, count.index)
  availability_zone = var.availability_zones[count.index]
  tags = merge(var.tags, {
    Name = "${var.name}-public-${var.availability_zones[count.index]}"
    Tier = "public"
  })
}

resource "aws_subnet" "private" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.this.id
  cidr_block        = cidrsubnet(var.cidr_block, 8, count.index + length(var.availability_zones))
  availability_zone = var.availability_zones[count.index]
  tags = merge(var.tags, {
    Name = "${var.name}-private-${var.availability_zones[count.index]}"
    Tier = "private"
  })
}

# modules/vpc/outputs.tf
output "vpc_id" {
  value = aws_vpc.this.id
}
output "public_subnet_ids" {
  value = aws_subnet.public[*].id
}
output "private_subnet_ids" {
  value = aws_subnet.private[*].id
}

# environments/prod/main.tf -- Consuming the module
module "vpc" {
  source  = "git::https://github.com/myorg/terraform-modules.git//vpc?ref=v2.1.0"

  name               = "prod"
  cidr_block         = "10.0.0.0/16"
  availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
  tags = { Environment = "production", Team = "platform" }
}
Versioning StrategySource FormatWhen to Use
Git tag
git::https://...?ref=v2.1.0
Internal modules, full control
Terraform Registry
source = "hashicorp/consul/aws"
version =
"0.1.0"
Public modules
Local path
source = "../../modules/vpc"
Monorepo, development
S3/GCS archive
source = "s3::https://..."
Air-gapped environments

Pulumi TypeScript Example

Pulumi uses general-purpose languages, giving full IDE support, type checking, and testing capabilities.

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const config = new pulumi.Config();
const env = pulumi.getStack();
const vpcCidr = config.get("vpcCidr") || "10.0.0.0/16";

class VpcComponent extends pulumi.ComponentResource {
  public readonly vpcId: pulumi.Output<string>;
  public readonly publicSubnetIds: pulumi.Output<string>[];
  public readonly privateSubnetIds: pulumi.Output<string>[];

  constructor(name: string, args: {
    cidr: string; azCount: number; enableNat: boolean;
  }, opts?: pulumi.ComponentResourceOptions) {
    super("custom:network:Vpc", name, {}, opts);

    const vpc = new aws.ec2.Vpc(`${name}-vpc`, {
      cidrBlock: args.cidr,
      enableDnsHostnames: true,
      enableDnsSupport: true,
      tags: { Name: `${name}-vpc`, Environment: env },
    }, { parent: this });

    this.vpcId = vpc.id;
    this.publicSubnetIds = [];
    this.privateSubnetIds = [];
    const azs = aws.getAvailabilityZones({ state: "available" });

    for (let i = 0; i < args.azCount; i++) {
      const pub = new aws.ec2.Subnet(`${name}-public-${i}`, {
        vpcId: vpc.id,
        cidrBlock: `10.0.${i}.0/24`,
        availabilityZone: azs.then(az => az.names[i]),
        mapPublicIpOnLaunch: true,
        tags: { Name: `${name}-public-${i}`, Tier: "public" },
      }, { parent: this });

      const priv = new aws.ec2.Subnet(`${name}-private-${i}`, {
        vpcId: vpc.id,
        cidrBlock: `10.0.${i + args.azCount}.0/24`,
        availabilityZone: azs.then(az => az.names[i]),
        tags: { Name: `${name}-private-${i}`, Tier: "private" },
      }, { parent: this });

      this.publicSubnetIds.push(pub.id);
      this.privateSubnetIds.push(priv.id);
    }
    this.registerOutputs({ vpcId: this.vpcId });
  }
}

// Usage
const network = new VpcComponent("main", {
  cidr: vpcCidr, azCount: 3, enableNat: env === "prod",
});

export const vpcId = network.vpcId;

CloudFormation YAML Example

CloudFormation is AWS-native and requires no external state management. Ideal for AWS-only environments with strict compliance requirements.

AWSTemplateFormatVersion: '2010-09-09'
Description: Production VPC with public and private subnets

Parameters:
  EnvironmentName:
    Type: String
    Default: prod
    AllowedValues: [dev, staging, prod]
  VpcCIDR:
    Type: String
    Default: '10.0.0.0/16'

Conditions:
  IsProduction: !Equals [!Ref EnvironmentName, prod]

Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: !Ref VpcCIDR
      EnableDnsHostnames: true
      EnableDnsSupport: true
      Tags:
        - Key: Name
          Value: !Sub '${EnvironmentName}-vpc'

  PublicSubnetA:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: !Select [0, !Cidr [!Ref VpcCIDR, 6, 8]]
      AvailabilityZone: !Select [0, !GetAZs '']
      MapPublicIpOnLaunch: true

  PrivateSubnetA:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: !Select [3, !Cidr [!Ref VpcCIDR, 6, 8]]
      AvailabilityZone: !Select [0, !GetAZs '']

  NatGateway:
    Type: AWS::EC2::NatGateway
    Condition: IsProduction
    Properties:
      AllocationId: !GetAtt NatEIP.AllocationId
      SubnetId: !Ref PublicSubnetA

  NatEIP:
    Type: AWS::EC2::EIP
    Condition: IsProduction
    Properties:
      Domain: vpc

Outputs:
  VpcId:
    Value: !Ref VPC
    Export:
      Name: !Sub '${EnvironmentName}-VpcId'

Security Scanning with Checkov and tfsec

Scan IaC before apply to catch misconfigurations, policy violations, and security risks.

ToolFocusLanguageIntegration
CheckovMulti-framework (TF, CFN, K8s, Helm)PythonCLI, CI/CD, IDE
tfsecTerraform-specific, fastGoCLI, CI/CD, pre-commit
TerrascanPolicy-as-code (OPA/Rego)GoCLI, CI/CD
Snyk IaCCommercial, wide coverageSaaSCLI, CI/CD, IDE

Checkov Scanning Example

# Scan a Terraform directory
checkov -d ./environments/prod/ --framework terraform --compact

# Scan with specific checks only
checkov -d ./environments/prod/ --check CKV_AWS_18,CKV_AWS_19

# Inline suppression in .tf files:
# resource "aws_s3_bucket" "logs" {
#   # checkov:skip=CKV_AWS_18:Logging bucket does not need access logging
#   bucket = "my-logs-bucket"
# }

# Generate SARIF output for GitHub Security tab
checkov -d ./environments/prod/ --output sarif --output-file results.sarif
# .github/workflows/iac-security.yml
name: IaC Security Scan

on:
  pull_request:
    paths: ['**/*.tf', '**/*.tfvars']

jobs:
  checkov:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write
    steps:
      - uses: actions/checkout@v4
      - uses: bridgecrewio/checkov-action@v12
        with:
          directory: ./environments/
          framework: terraform
          output_format: sarif
          output_file_path: results.sarif
          soft_fail: false
      - uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: results.sarif

  tfsec:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aquasecurity/tfsec-action@v1.0.3
        with:
          working_directory: ./environments/

Drift Detection and Prevention

Drift occurs when actual infrastructure diverges from IaC state. Left unchecked, drift causes failed applies, security gaps, and configuration inconsistencies.

Drift CausePrevention
Manual console changesLock down console write access; read-only for debugging
Auto-scaling eventsUse
lifecycle { ignore_changes }
for dynamic attributes
External automationCoordinate with IaC or import resources
Incomplete IaC coverageImport existing resources before managing them

Drift Detection Strategy

# .github/workflows/drift-detection.yml
name: Drift Detection

on:
  schedule:
    - cron: '0 6 * * *'   # Daily at 6 AM UTC

jobs:
  detect-drift:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        environment: [dev, staging, prod]
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.7.0

      - name: Terraform Init
        working-directory: ./environments/${{ matrix.environment }}
        run: terraform init -input=false

      - name: Detect Drift
        id: plan
        working-directory: ./environments/${{ matrix.environment }}
        run: |
          terraform plan -detailed-exitcode -input=false 2>&1 | tee plan_output.txt
          echo "exit_code=${PIPESTATUS[0]}" >> "$GITHUB_OUTPUT"
          # Exit code 0 = no changes, 1 = error, 2 = drift detected
        continue-on-error: true

      - name: Alert on Drift
        if: steps.plan.outputs.exit_code == '2'
        run: echo "::warning::Drift detected in ${{ matrix.environment }}!"

Preventing Drift with Lifecycle Rules

resource "aws_autoscaling_group" "app" {
  lifecycle {
    ignore_changes = [desired_capacity]  # Managed by auto-scaling, not Terraform
  }
  min_size = 2
  max_size = 10
}

resource "aws_security_group" "critical" {
  lifecycle {
    prevent_destroy = true  # Prevent accidental deletion
  }
  name   = "critical-sg"
  vpc_id = module.vpc.vpc_id
}

Anti-Patterns

Anti-PatternProblemFix
Local state filesNo locking, no team access, easy to loseUse remote backend with locking (S3+DynamoDB, GCS)
Hardcoded provider credentialsSecrets in version controlUse environment variables, OIDC, or vault
Monolithic root moduleSlow plans, blast radius covers everythingSplit into composable modules per service or layer
No variable validationInvalid inputs cause cryptic errors at applyAdd
validation
blocks on all input variables
terraform apply
without plan
No review of changes before executionAlways
plan -out=plan.tfplan
then
apply plan.tfplan
Ignoring state driftManual changes accumulate, next apply failsSchedule daily drift detection; alert and remediate
No module versioningBreaking changes propagate immediatelyPin module versions with git tags or registry versions
Inline resources instead of modulesCopy-paste across environments, divergenceExtract reusable modules, parameterize with variables
Storing secrets in tfvarsCredentials committed to gitUse vault, SSM Parameter Store, or Secrets Manager
No import before manageTerraform tries to create existing resources
terraform import
existing resources first
Wildcard provider versionsUpgrades break without warningPin versions with
~>
constraints in
versions.tf
No
prevent_destroy
on stateful resources
Accidental deletion of databases, bucketsAdd
lifecycle { prevent_destroy = true }
Running apply from laptopsNo audit trail, credential exposureRun all applies through CI/CD with approval gates

Security and Compliance Checklist

  • Remote state backend with encryption at rest enabled
  • State locking configured (DynamoDB, GCS, or equivalent)
  • No credentials or secrets in
    .tf
    files or
    .tfvars
  • Provider versions pinned with constraints in
    versions.tf
  • Module versions pinned to specific tags or registry versions
  • Checkov or tfsec runs on every pull request
  • terraform plan
    output reviewed before every apply
  • Production applies gated behind CI/CD approval
  • prevent_destroy
    set on stateful resources (databases, storage)
  • Drift detection runs on a daily schedule
  • Least-privilege IAM roles for Terraform execution
  • State file access restricted to CI/CD service accounts
  • All resources tagged with
    Environment
    ,
    Team
    , and
    ManagedBy
  • Sensitive outputs marked with
    sensitive = true
  • No use of
    local-exec
    or
    remote-exec
    provisioners
  • Import existing resources before writing IaC to manage them