⚖️ When to Use Shell vs IaC
Understanding when to use shell scripting versus Infrastructure as Code tools is crucial for building maintainable, scalable automation workflows.
🎯 Decision Framework
Use Shell When:
✅ Imperative operations - Sequential steps that must happen in order
✅ One-time tasks - Provisioning scripts, migrations, data transformations
✅ Complex logic - Conditional branching, loops, error handling
✅ Existing tooling - Leveraging mature CLI tools and utilities
✅ Rapid prototyping - Quick proof-of-concepts and experimentation
✅ Local development - Developer environment setup and tooling
Use IaC When:
✅ Declarative state - Desired end-state configuration
✅ Drift detection - Comparing real infrastructure against its definition and reconciling the differences on the next apply
✅ Team collaboration - Shared, version-controlled infrastructure definitions
✅ Compliance requirements - Audit trails and policy enforcement
✅ Multi-cloud deployments - Consistent patterns across providers
✅ Large-scale infrastructure - Hundreds or thousands of resources
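To make the declarative idea concrete, the sketch below compares a desired-state list with an actual-state list and prints what would have to change, which is roughly what an IaC tool's plan step does. The function name `plan_changes` and the one-resource-name-per-line file format are assumptions made for illustration, not any real tool's format.

```shell
#!/usr/bin/env bash
# plan_changes DESIRED_FILE ACTUAL_FILE
# Each file lists one resource name per line (hypothetical format).
# Prints "+ name" for resources to create and "- name" for resources to remove.
plan_changes() {
    local desired_sorted actual_sorted
    desired_sorted=$(mktemp)
    actual_sorted=$(mktemp)

    # comm requires sorted input
    sort -u "$1" > "$desired_sorted"
    sort -u "$2" > "$actual_sorted"

    # -23 keeps lines found only in the first file, -13 only in the second
    comm -23 "$desired_sorted" "$actual_sorted" | sed 's/^/+ /'
    comm -13 "$desired_sorted" "$actual_sorted" | sed 's/^/- /'

    rm -f "$desired_sorted" "$actual_sorted"
}
```

Empty output means the two lists agree, that is, no drift. A real IaC tool also compares resource attributes, which this sketch does not attempt.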
🔄 Hybrid Approaches
Shell-First, IaC-Later
    #!/bin/bash
    # bootstrap-cluster.sh - Shell-first approach
    set -euo pipefail  # stop at the first failed step instead of reporting false success

    # 1. Initial setup with shell (rapid iteration)
    # Assumes a container runtime and kubelet are already installed on this host.
    setup_kubernetes_cluster() {
        echo "Setting up Kubernetes cluster..."

        # Download and install kubeadm
        local version
        version=$(curl -fsSL https://dl.k8s.io/release/stable.txt)
        curl -fsSLO "https://dl.k8s.io/release/${version}/bin/linux/amd64/kubeadm"
        chmod +x kubeadm
        sudo mv kubeadm /usr/local/bin/

        # Initialize cluster
        sudo kubeadm init --pod-network-cidr=10.244.0.0/16

        # Setup kubeconfig
        mkdir -p "$HOME/.kube"
        sudo cp -i /etc/kubernetes/admin.conf "$HOME/.kube/config"
        sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config"

        echo "Cluster initialized successfully"
    }

    # 2. Transition to IaC for ongoing management
    # terraform apply -auto-approve
    # kubectl apply -f production-manifests/

    setup_kubernetes_cluster
    echo "Now run: terraform init && terraform apply"
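A common weakness of shell-first bootstrapping is that the script is not safe to run twice. One way to soften that, sketched below, is a marker-file guard around each step. The helper name `run_once` and the marker paths are hypothetical; the script above does not define them.

```shell
#!/usr/bin/env bash
# run_once MARKER_FILE COMMAND [ARGS...]
# Runs COMMAND only if MARKER_FILE does not exist. The marker is created
# only after COMMAND succeeds, so a failed step is retried on the next run.
run_once() {
    local marker="$1"
    shift
    if [ -e "$marker" ]; then
        echo "Skipping (already done): $*"
        return 0
    fi
    "$@" || return $?
    mkdir -p "$(dirname "$marker")"
    touch "$marker"
}
```

A call might look like `run_once "$HOME/.bootstrap/kubeadm-init.done" sudo kubeadm init --pod-network-cidr=10.244.0.0/16`. A marker file only records that the step once succeeded; it does not verify that the cluster is still in that state, which is the gap IaC state tracking is meant to close.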
IaC-First, Shell-Augmented
    # main.tf - IaC-first approach with shell augmentation
    # Assumes the instance gets a public IP and a security group that allows inbound HTTP.
    resource "aws_instance" "web" {
      ami           = "ami-0c02fb55956c7d316" # AMI IDs are region-specific; replace as needed
      instance_type = "t3.micro"

      # Shell script for custom configuration
      user_data = <<-EOF
        #!/bin/bash
        # Install and configure web server
        yum update -y
        yum install -y httpd
        systemctl start httpd
        systemctl enable httpd
        # Custom configuration
        echo "<h1>Hello from $(hostname)</h1>" > /var/www/html/index.html
      EOF

      tags = {
        Name = "WebServer"
      }
    }

    # Shell script for post-deployment validation
    resource "null_resource" "validate_web" {
      # Re-run the validation whenever the instance is replaced
      triggers = {
        instance_id = aws_instance.web.id
      }

      provisioner "local-exec" {
        # local-exec does not honor a shebang line, so request bash explicitly
        interpreter = ["/bin/bash", "-c"]
        command     = <<-EOT
          set -e
          echo "Validating web server..."
          INSTANCE_IP="${aws_instance.web.public_ip}"
          # Poll instead of a fixed sleep: user_data can take a while to finish
          for attempt in $(seq 1 30); do
            if curl -sf "http://$INSTANCE_IP" | grep -q "Hello"; then
              echo "Web server validation passed"
              exit 0
            fi
            sleep 10
          done
          echo "Web server validation failed" >&2
          exit 1
        EOT
      }
    }
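Post-deployment checks like the one above usually have to wait for a service that is still starting. A fixed sleep is either too short or wastefully long, so a small retry helper is a common alternative. The sketch below is one minimal way to write it; the name `retry` and its argument order are assumptions for illustration.

```shell
#!/usr/bin/env bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    local max_attempts="$1"
    local delay="$2"
    local attempt=1
    shift 2
    while true; do
        "$@" && return 0
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

For example, `retry 30 10 curl -sf "http://$INSTANCE_IP"` would poll for up to about five minutes, assuming `INSTANCE_IP` is set by the caller.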
📊 Complexity Matrix
| Scenario | Shell | IaC | Recommendation |
|---|---|---|---|
| Simple VM provisioning | ✅ Quick | ✅ Better | IaC |
| Complex multi-step deployment | ✅ Flexible | ⚠️ Limited | Hybrid |
| One-time data migration | ✅ Perfect | ❌ Overkill | Shell |
| Ongoing infrastructure management | ❌ Brittle | ✅ Ideal | IaC |
| Developer environment setup | ✅ Great | ⚠️ Heavy | Shell |
| Production infrastructure | ⚠️ Risky | ✅ Reliable | IaC |
| CI/CD pipeline orchestration | ✅ Good | ✅ Good | Either |
| Configuration drift remediation | ❌ Manual | ✅ Automatic | IaC |
🛠️ Integration Patterns
1. Pre/Post Hooks
    #!/bin/bash
    # deploy-with-hooks.sh
    set -euo pipefail  # a failed plan must never be followed by an apply

    # Pre-deployment validation
    pre_deploy_check() {
        echo "Running pre-deployment checks..."

        # Validate environment
        local var
        for var in ENVIRONMENT HOSTED_ZONE_ID; do
            if [ -z "${!var:-}" ]; then
                echo "Error: $var not set" >&2
                exit 1
            fi
        done

        # Check dependencies
        local tool
        for tool in terraform kubectl helm aws; do
            if ! command -v "$tool" >/dev/null 2>&1; then
                echo "Error: $tool not found" >&2
                exit 1
            fi
        done

        # Validate configuration files
        if [ ! -f "config/$ENVIRONMENT.tfvars" ]; then
            echo "Error: Configuration file not found: config/$ENVIRONMENT.tfvars" >&2
            exit 1
        fi

        echo "Pre-deployment checks passed"
    }

    # Deployment (IaC)
    run_deployment() {
        echo "Running Terraform deployment..."
        terraform init -input=false
        # Save the plan and apply exactly that plan, so nothing changes in between
        terraform plan -input=false -var-file="config/$ENVIRONMENT.tfvars" -out=tfplan
        terraform apply -input=false tfplan
    }

    # Post-deployment configuration
    post_deploy_configure() {
        echo "Running post-deployment configuration..."

        # Get outputs from Terraform
        local lb_dns
        lb_dns=$(terraform output -raw load_balancer_dns)

        # Update DNS records (dns-change.json is assumed to exist next to this script)
        aws route53 change-resource-record-sets \
            --hosted-zone-id "$HOSTED_ZONE_ID" \
            --change-batch file://dns-change.json

        # Deploy applications
        kubectl apply -f k8s-manifests/

        # Run health checks
        ./health-check.sh "$lb_dns"

        echo "Post-deployment configuration completed"
    }

    # Main workflow
    main() {
        pre_deploy_check
        run_deployment
        post_deploy_configure
        echo "Deployment completed successfully"
    }

    main "$@"
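A wrapper script can also decide whether an apply is needed at all. `terraform plan -detailed-exitcode` exits with 0 when there are no changes, 1 on error, and 2 when changes are pending. The sketch below maps that status to a label a pipeline can branch on; the function name `describe_plan_exit` and the labels are assumptions for illustration.

```shell
#!/usr/bin/env bash
# describe_plan_exit EXIT_CODE
# Maps the exit status of `terraform plan -detailed-exitcode` to a label.
describe_plan_exit() {
    case "$1" in
        0) echo "no-changes" ;;
        2) echo "changes-pending" ;;
        *) echo "error"; return 1 ;;
    esac
}

# Possible use in a wrapper (not executed here):
#   rc=0
#   terraform plan -detailed-exitcode -out=tfplan || rc=$?
#   if [ "$(describe_plan_exit "$rc")" = "changes-pending" ]; then
#       terraform apply tfplan
#   fi
```

Capturing the status with `|| rc=$?` matters in scripts that use `set -e`, because exit code 2 would otherwise abort the script even though the plan succeeded.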
2. Conditional Logic Bridge
    # pulumi-dynamic-provider.py - Python bridge for complex logic
    import json
    import subprocess

    import pulumi
    from pulumi.dynamic import CreateResult, Resource, ResourceProvider


    class ComplexLogicProvider(ResourceProvider):
        def create(self, inputs):
            # Raw string: quotes, braces, and backslashes reach bash unchanged.
            # The URL arrives as "$1" instead of being interpolated into the
            # script, which avoids shell injection through the input value.
            script = r"""
                set -euo pipefail
                # Progress goes to stderr so stdout carries only the JSON result
                echo "Step 1: Preparing environment..." >&2
                workdir=$(mktemp -d)
                cd "$workdir"

                echo "Step 2: Processing data..." >&2
                curl -fsS "$1" \
                    | jq -r '.items[] | select(.status == "active") | "Processing \(.id)"' \
                    > processing.log

                echo "Step 3: Generating outputs..." >&2
                count=$(wc -l < processing.log)
                printf '{"processed_count": %d}\n' "$count"
            """
            # Execute the shell script; "bash" fills $0, the URL becomes $1
            result = subprocess.run(
                ["bash", "-c", script, "bash", inputs["data_url"]],
                capture_output=True,
                text=True,
            )
            if result.returncode != 0:
                raise Exception(f"Shell script failed: {result.stderr}")
            outputs = json.loads(result.stdout)
            return CreateResult(id_="complex-logic-result", outs={**inputs, **outputs})


    class ComplexLogic(Resource):
        """Dynamic resource that wraps the provider so Pulumi can manage it."""

        processed_count: pulumi.Output[int]

        def __init__(self, name, data_url, opts=None):
            # Outputs must be declared as props (set to None) to be readable later
            props = {"data_url": data_url, "processed_count": None}
            super().__init__(ComplexLogicProvider(), name, props, opts)


    # Use the provider in Pulumi
    complex_logic = ComplexLogic("complex-logic", data_url="https://api.example.com/data")
    pulumi.export("processed_count", complex_logic.processed_count)
🎯 Real-World Examples
Example 1: Database Migration
    #!/bin/bash
    # migrate-database.sh - A good fit for shell
    # IaC handles infrastructure
    # terraform apply -target=aws_db_instance.production

    # Shell handles complex migration logic
    # Assumes zero-padded numeric prefixes (001_init.sql) so the glob sorts in version order.
    migrate_database() {
        local db_host="$1"
        local db_name="$2"
        echo "Starting database migration..."

        # Check current schema version (-A drops padding; empty means no migrations yet)
        local current_version
        current_version=$(psql -h "$db_host" -d "$db_name" -tA -c "SELECT version FROM schema_migrations ORDER BY applied_at DESC LIMIT 1;" 2>/dev/null || true)
        current_version=${current_version:-0}

        # Apply migrations sequentially
        local migration migration_version
        for migration in migrations/*.sql; do
            [ -e "$migration" ] || continue  # no matching files
            migration_version=$(basename "$migration" | cut -d'_' -f1)
            if [ "$migration_version" -gt "$current_version" ]; then
                echo "Applying migration $migration_version..."
                # ON_ERROR_STOP makes psql exit non-zero on SQL errors;
                # --single-transaction applies the migration and records it atomically
                if ! psql -h "$db_host" -d "$db_name" -v ON_ERROR_STOP=1 --single-transaction \
                    -f "$migration" \
                    -c "INSERT INTO schema_migrations (version, applied_at) VALUES ('$migration_version', NOW());"; then
                    echo "Migration $migration_version failed" >&2
                    return 1
                fi
            fi
        done
        echo "Database migration completed"
    }

    # Called from CI/CD pipeline
    # migrate_database "$DB_HOST" "$DB_NAME"
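The version comparison in the loop above can be pulled into a function that touches no database, which makes it easy to dry-run before anything is applied. The sketch below is one way to do that; the name `pending_migrations` is hypothetical, and it assumes the same `<version>_<name>.sql` naming as the script above.

```shell
#!/usr/bin/env bash
# pending_migrations CURRENT_VERSION MIGRATIONS_DIR
# Prints, in ascending version order, the migration files whose numeric
# prefix (the part before the first "_") is greater than CURRENT_VERSION.
pending_migrations() {
    local current="$1"
    local dir="$2"
    local file version
    for file in "$dir"/*.sql; do
        # With no matches the glob stays literal, so skip non-existent paths
        [ -e "$file" ] || continue
        version=$(basename "$file" | cut -d'_' -f1)
        if [ "$version" -gt "$current" ]; then
            printf '%s %s\n' "$version" "$file"
        fi
    done | sort -n | cut -d' ' -f2-
}
```

Sorting numerically on the extracted prefix means the order does not depend on zero-padding in the file names. Running `pending_migrations 3 migrations` would list what a real run is about to apply.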
Example 2: Multi-Cloud Deployment
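    # multi-cloud.tf - IaC for consistency
    # Each cloud names its regions differently (for example us-east-1, US-EAST1, eastus),
    # so a single variable cannot serve as the location for all three providers.
    variable "region" {
      type        = string
      description = "Label used in resource names"
    }

    variable "gcp_location" {
      type = string
    }

    variable "azure_location" {
      type = string
    }

    # AWS resources
    resource "aws_s3_bucket" "data" {
      bucket = "myapp-${var.region}-data"
    }

    # GCP resources
    resource "google_storage_bucket" "data" {
      name     = "myapp-${var.region}-data"
      location = var.gcp_location
    }

    # Azure resources (azurerm_resource_group.main is assumed to be defined elsewhere)
    resource "azurerm_storage_account" "data" {
      name                     = "myapp${replace(var.region, "-", "")}data"
      resource_group_name      = azurerm_resource_group.main.name
      location                 = var.azure_location
      account_tier             = "Standard"
      account_replication_type = "LRS"
    }

    # Blob data lives in a container, so the sync needs one as its target
    resource "azurerm_storage_container" "data" {
      name                 = "data"
      storage_account_name = azurerm_storage_account.data.name
    }

    # Shell script for cross-cloud orchestration
    resource "null_resource" "cross_cloud_sync" {
      provisioner "local-exec" {
        interpreter = ["/bin/bash", "-c"]
        command     = <<-EOT
          set -euo pipefail
          # Sync data between clouds using shell tools
          echo "Syncing data between clouds..."
          # AWS to GCP: the AWS CLI cannot write to gs://, but gsutil can read s3://
          # (assumes AWS credentials are present in gsutil's boto configuration)
          gsutil -m rsync -r "s3://${aws_s3_bucket.data.bucket}" "gs://${google_storage_bucket.data.name}"
          # GCP to Azure: gsutil cannot write to Azure; AzCopy accepts a Cloud Storage source
          # (assumes GOOGLE_APPLICATION_CREDENTIALS is set and azcopy is already logged in)
          azcopy copy "https://storage.cloud.google.com/${google_storage_bucket.data.name}" \
            "https://${azurerm_storage_account.data.name}.blob.core.windows.net/${azurerm_storage_container.data.name}" \
            --recursive=true
          echo "Cross-cloud sync completed"
        EOT
      }
    }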
🧾 Decision Checklist
Ask Yourself:
- Is this operation idempotent? → IaC preferred
- Does it require complex conditional logic? → Shell preferred
- Will others need to reproduce this exactly? → IaC preferred
- Is this a one-time operation? → Shell acceptable
- Do you need audit trails? → IaC preferred
- Is performance critical? → Depends on implementation
- Will this run in production repeatedly? → IaC preferred
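The checklist can be read as a simple vote. The sketch below tallies the six questions that point clearly one way and reports a tie as a hybrid; the performance question is left out because its answer depends on the implementation. The function name `recommend_tool` and the equal weighting of every question are assumptions for illustration, not a rule from this guide.

```shell
#!/usr/bin/env bash
# recommend_tool IDEMPOTENT COMPLEX_LOGIC REPRODUCIBLE ONE_TIME AUDIT REPEATED_PROD
# Each argument is "yes" or "no". Prints "IaC", "Shell", or "Hybrid".
recommend_tool() {
    local iac_votes=0
    local shell_votes=0
    if [ "$1" = yes ]; then iac_votes=$((iac_votes + 1)); fi      # idempotent operation
    if [ "$2" = yes ]; then shell_votes=$((shell_votes + 1)); fi  # complex conditional logic
    if [ "$3" = yes ]; then iac_votes=$((iac_votes + 1)); fi      # must be reproduced exactly
    if [ "$4" = yes ]; then shell_votes=$((shell_votes + 1)); fi  # one-time operation
    if [ "$5" = yes ]; then iac_votes=$((iac_votes + 1)); fi      # audit trail needed
    if [ "$6" = yes ]; then iac_votes=$((iac_votes + 1)); fi      # runs in production repeatedly

    if [ "$iac_votes" -gt "$shell_votes" ]; then
        echo "IaC"
    elif [ "$shell_votes" -gt "$iac_votes" ]; then
        echo "Shell"
    else
        echo "Hybrid"
    fi
}
```

In practice a single answer, such as a hard compliance requirement, can outweigh all the others, so the tally is a starting point for discussion rather than a verdict.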
🧾 Summary
✅ Use shell for: Complex logic, one-time tasks, rapid prototyping, local development
✅ Use IaC for: Declarative state, team collaboration, compliance, large-scale infrastructure
✅ Hybrid approach: Leverage both tools in complementary ways
✅ Integration patterns: Pre/post hooks, conditional bridges, orchestration layers
🧾 See Also