⚖️ When to Use Shell vs IaC

Understanding when to use shell scripting versus Infrastructure as Code tools is crucial for building maintainable, scalable automation workflows.

🎯 Decision Framework

Use Shell When:

✅ Imperative operations - Sequential steps that must happen in order ✅ One-time tasks - Provisioning scripts, migrations, data transformations ✅ Complex logic - Conditional branching, loops, error handling ✅ Existing tooling - Leveraging mature CLI tools and utilities ✅ Rapid prototyping - Quick proof-of-concepts and experimentation ✅ Local development - Developer environment setup and tooling

Use IaC When:

✅ Declarative state - Desired end-state configuration ✅ Drift detection - Automatic reconciliation of configuration drift ✅ Team collaboration - Shared, version-controlled infrastructure definitions ✅ Compliance requirements - Audit trails and policy enforcement ✅ Multi-cloud deployments - Consistent patterns across providers ✅ Large-scale infrastructure - Hundreds or thousands of resources

🔄 Hybrid Approaches

Shell-First, IaC-Later

#!/bin/bash
# bootstrap-cluster.sh - Shell-first approach

# 1. Initial setup with shell (rapid iteration)
setup_kubernetes_cluster() {
    echo "Setting up Kubernetes cluster..."

    # Download and install kubeadm
    curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubeadm"
    chmod +x kubeadm
    sudo mv kubeadm /usr/local/bin/

    # Initialize cluster
    sudo kubeadm init --pod-network-cidr=10.244.0.0/16

    # Setup kubeconfig
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

    echo "Cluster initialized successfully"
}

# 2. Transition to IaC for ongoing management
# terraform apply -auto-approve
# kubectl apply -f production-manifests/

setup_kubernetes_cluster
echo "Now run: terraform init && terraform apply"

IaC-First, Shell-Augmented

# main.tf - IaC-first approach with shell augmentation
resource "aws_instance" "web" {
  ami           = "ami-0c02fb55956c7d316"
  instance_type = "t3.micro"

  # Shell script for custom configuration
  user_data = <<-EOF
              #!/bin/bash
              # Install and configure web server
              yum update -y
              yum install -y httpd
              systemctl start httpd
              systemctl enable httpd

              # Custom configuration
              echo "<h1>Hello from $(hostname)</h1>" > /var/www/html/index.html
              EOF

  tags = {
    Name = "WebServer"
  }
}

# Shell script for post-deployment validation
resource "null_resource" "validate_web" {
  depends_on = [aws_instance.web]

  provisioner "local-exec" {
    command = <<EOT
      #!/bin/bash
      set -e
      echo "Validating web server..."

      # Wait for instance to be ready
      sleep 30

      # Check HTTP response
      INSTANCE_IP=${aws_instance.web.public_ip}
      if curl -sf "http://$INSTANCE_IP" | grep -q "Hello"; then
        echo "Web server validation passed"
      else
        echo "Web server validation failed" >&2
        exit 1
      fi
    EOT
  }
}

📊 Complexity Matrix

Scenario	Shell	IaC	Recommendation
Simple VM provisioning	✅ Quick	✅ Better	IaC
Complex multi-step deployment	✅ Flexible	⚠️ Limited	Hybrid
One-time data migration	✅ Perfect	❌ Overkill	Shell
Ongoing infrastructure management	❌ Brittle	✅ Ideal	IaC
Developer environment setup	✅ Great	⚠️ Heavy	Shell
Production infrastructure	⚠️ Risky	✅ Reliable	IaC
CI/CD pipeline orchestration	✅ Good	✅ Good	Either
Configuration drift remediation	❌ Manual	✅ Automatic	IaC

🛠️ Integration Patterns

1. Pre/Post Hooks

#!/bin/bash
# deploy-with-hooks.sh

# Pre-deployment validation
pre_deploy_check() {
    echo "Running pre-deployment checks..."

    # Validate environment
    if [ -z "$ENVIRONMENT" ]; then
        echo "Error: ENVIRONMENT not set" >&2
        exit 1
    fi

    # Check dependencies
    for tool in terraform kubectl helm; do
        if ! command -v $tool >/dev/null 2>&1; then
            echo "Error: $tool not found" >&2
            exit 1
        fi
    done

    # Validate configuration files
    if [ ! -f "config/$ENVIRONMENT.tfvars" ]; then
        echo "Error: Configuration file not found: config/$ENVIRONMENT.tfvars" >&2
        exit 1
    fi

    echo "Pre-deployment checks passed"
}

# Deployment (IaC)
run_deployment() {
    echo "Running Terraform deployment..."

    terraform init
    terraform plan -var-file="config/$ENVIRONMENT.tfvars"
    terraform apply -var-file="config/$ENVIRONMENT.tfvars" -auto-approve
}

# Post-deployment configuration
post_deploy_configure() {
    echo "Running post-deployment configuration..."

    # Get outputs from Terraform
    LB_DNS=$(terraform output -raw load_balancer_dns)

    # Update DNS records
    aws route53 change-resource-record-sets \
        --hosted-zone-id $HOSTED_ZONE_ID \
        --change-batch file://dns-change.json

    # Deploy applications
    kubectl apply -f k8s-manifests/

    # Run health checks
    ./health-check.sh $LB_DNS

    echo "Post-deployment configuration completed"
}

# Main workflow
main() {
    pre_deploy_check
    run_deployment
    post_deploy_configure

    echo "Deployment completed successfully"
}

main "$@"

2. Conditional Logic Bridge

# pulumi-dynamic-provider.py - Python bridge for complex logic
import pulumi
from pulumi import ResourceOptions
from pulumi.dynamic import ResourceProvider, CreateResult
import subprocess
import json

class ComplexLogicProvider(ResourceProvider):
    def create(self, inputs):
        # Complex shell-based logic
        script = f"""
        #!/bin/bash
        set -e

        # Multi-step complex operation
        echo "Step 1: Preparing environment..."
        mkdir -p /tmp/complex-operation
        cd /tmp/complex-operation

        echo "Step 2: Processing data..."
        # Complex data processing with multiple tools
        curl -s {inputs['data_url']} | \
        jq '.items[] | select(.status=="active")' | \
        python3 -c "
import sys, json
for line in sys.stdin:
    item = json.loads(line)
    print(f'Processing {{item[\"id\"]}}')
" > processing.log

        echo "Step 3: Generating outputs..."
        RESULT=$(wc -l processing.log | awk '{{print $1}}')
        echo "{{\\"processed_count\\": $RESULT}}"
        """

        # Execute complex shell script
        result = subprocess.run(
            ['bash', '-c', script],
            capture_output=True,
            text=True
        )

        if result.returncode != 0:
            raise Exception(f"Shell script failed: {result.stderr}")

        outputs = json.loads(result.stdout.strip())
        return CreateResult(id_="complex-logic-result", outs=outputs)

# Use the provider in Pulumi
complex_logic = ComplexLogicProvider("complex-logic", {
    "data_url": "https://api.example.com/data"
})

🎯 Real-World Examples

Example 1: Database Migration

#!/bin/bash
# migrate-database.sh - Shell is perfect for this

# IaC handles infrastructure
# terraform apply -target=aws_db_instance.production

# Shell handles complex migration logic
migrate_database() {
    local db_host="$1"
    local db_name="$2"

    echo "Starting database migration..."

    # Check current schema version
    local current_version
    current_version=$(psql -h "$db_host" -d "$db_name" -t -c "SELECT version FROM schema_migrations ORDER BY applied_at DESC LIMIT 1;" 2>/dev/null || echo "0")

    # Apply migrations sequentially
    for migration in migrations/*.sql; do
        local migration_version
        migration_version=$(basename "$migration" | cut -d'_' -f1)

        if [ "$migration_version" -gt "$current_version" ]; then
            echo "Applying migration $migration_version..."

            if ! psql -h "$db_host" -d "$db_name" -f "$migration"; then
                echo "Migration $migration_version failed" >&2
                return 1
            fi

            # Record migration
            psql -h "$db_host" -d "$db_name" -c "INSERT INTO schema_migrations (version, applied_at) VALUES ('$migration_version', NOW());"
        fi
    done

    echo "Database migration completed"
}

# Called from CI/CD pipeline
# migrate_database $DB_HOST $DB_NAME

Example 2: Multi-Cloud Deployment

# multi-cloud.tf - IaC for consistency
variable "region" {
  type = string
}

# AWS resources
resource "aws_s3_bucket" "data" {
  bucket = "myapp-${var.region}-data"
}

# GCP resources
resource "google_storage_bucket" "data" {
  name     = "myapp-${var.region}-data"
  location = upper(var.region)
}

# Azure resources
resource "azurerm_storage_account" "data" {
  name                     = "myapp${replace(var.region, "-", "")}data"
  resource_group_name      = azurerm_resource_group.main.name
  location                 = var.region
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

# Shell script for cross-cloud orchestration
resource "null_resource" "cross_cloud_sync" {
  provisioner "local-exec" {
    command = <<EOT
      #!/bin/bash
      set -e

      # Sync data between clouds using shell tools
      echo "Syncing data between clouds..."

      # AWS to GCP
      aws s3 cp s3://${aws_s3_bucket.data.bucket} gs://${google_storage_bucket.data.name} --recursive

      # GCP to Azure
      gsutil cp gs://${google_storage_bucket.data.name}/* az://${azurerm_storage_account.data.name}/ --recursive

      echo "Cross-cloud sync completed"
    EOT
  }
}

🧾 Decision Checklist

Ask Yourself:

Is this operation idempotent? → IaC preferred
Does it require complex conditional logic? → Shell preferred
Will others need to reproduce this exactly? → IaC preferred
Is this a one-time operation? → Shell acceptable
Do you need audit trails? → IaC preferred
Is performance critical? → Depends on implementation
Will this run in production repeatedly? → IaC preferred

🧾 Summary

✅ Use shell for: Complex logic, one-time tasks, rapid prototyping, local development ✅ Use IaC for: Declarative state, team collaboration, compliance, large-scale infrastructure ✅ Hybrid approach: Leverage both tools in complementary ways ✅ Integration patterns: Pre/post hooks, conditional bridges, orchestration layers