🐳 Advanced Shell in Containers

🧠 Overview

Shell scripts inside containers behave differently than on a normal Linux host. Why? Because containers change:

  • PID hierarchy
  • signal delivery
  • process groups
  • environment initialization
  • filesystem layout
  • entrypoint semantics
  • zombie reaping
  • logging and stdout/stderr behavior

This module explains how to write robust, production‑grade shell scripts that run correctly inside Docker, Kubernetes, and containerized CI/CD environments.


🎓 Who this is for

  • DevOps/SRE building container images or entrypoints.
  • Engineers writing startup scripts, health checks, or lifecycle hooks.
  • Anyone debugging signal handling, zombie processes, or shutdown issues.
  • People deploying applications in Kubernetes, Nomad, ECS, or Docker Swarm.

🧩 Internals / Mechanics

🧩 PID 1 is special

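Inside a container, the first process started by the runtime (the image's ENTRYPOINT or CMD) runs as PID 1 of the container's PID namespace.

PID 1 has unique semantics:

  • the kernel applies no default action to its signals, so a signal such as SIGTERM is dropped unless PID 1 has installed a handler for it (SIGKILL sent from outside the container still works)
  • it inherits orphaned processes and must reap them itself, or they remain as zombies
  • it does not forward signals to its children unless it is written to do so
  • when it exits, the container stops, so it is responsible for clean shutdown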

This is the root cause of many container bugs.

🧩 Shell as PID 1 is dangerous

If your entrypoint is:

CMD ["sh", "-c", "run.sh"]

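Then sh runs as PID 1 (unless the shell happens to exec its last command, which some implementations do for sh -c) and:

  • may not reap orphaned processes that are reparented to it
  • drops SIGTERM unless the script installs a trap
  • does not forward signals to the app
  • may leave zombies
  • may cause slow or broken shutdowns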

🧩 Exec vs non‑exec entrypoints

Bad:

app "$@"

Good:

exec app "$@"

exec replaces the shell with the application → no extra shell process → the application receives signals directly.
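The effect of exec can be observed by comparing PIDs. The sketch below is illustrative only: the two helper names are made up for this example, and the inner sh -c stands in for a real application.

```shell
# Hypothetical helpers; the inner "sh -c" stands in for a real application.

# Without exec: the outer shell forks a child, so two different PIDs are printed.
# The trailing ":" keeps the shell from exec-ing its last command as an optimization.
without_exec() {
  sh -c 'echo $$; sh -c "echo \$\$"; :'
}

# With exec: the outer shell is replaced in place, so the same PID is printed twice.
with_exec() {
  sh -c 'echo $$; exec sh -c "echo \$\$"'
}

without_exec
with_exec
```

The first call prints two different numbers and the second prints the same number twice, which is the difference between "app is a child of the shell" and "app is the process the runtime signals".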

🧩 Environment initialization

Containers often run with:

  • missing environment variables
  • a minimal or unexpected PATH
  • no user dotfiles
  • minimal locale settings

Scripts should therefore check every variable and external command they depend on before using it.
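One way to make a script tolerant of a sparse environment is to set conservative defaults and verify its external commands up front. This is a minimal sketch: require_cmd is a hypothetical helper, and the default PATH and LANG values are assumptions that should be adapted to the base image.

```shell
# Hypothetical helper: fail early if a command the script depends on is missing.
require_cmd() {
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "required command not found: $cmd" >&2
      return 1
    fi
  done
}

# Fall back to conservative defaults when the image or runtime provides none.
# These default values are assumptions; adjust them to the base image in use.
: "${PATH:=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin}"
: "${LANG:=C}"
export PATH LANG

require_cmd sh date
```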


🔧 Techniques

🔧 Always use exec in entrypoints

#!/bin/sh
set -e
exec app "$@"
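Entrypoints often need a little argument handling before the final exec. The sketch below follows a common convention (arguments that look like options are passed to the default application); "app" remains a placeholder, and the script is written to a temporary file only so that it can be tried without building an image.

```shell
# Write a sample entrypoint to a temporary file so it can be tried without an image.
entrypoint=$(mktemp)
cat >"$entrypoint" <<'EOF'
#!/bin/sh
set -e

# If the first argument looks like an option (or there are no arguments),
# assume it is meant for the default application ("app" is a placeholder).
if [ "$#" -eq 0 ] || [ "${1#-}" != "$1" ]; then
  set -- app "$@"
fi

# One-time setup would go here, before the shell is replaced.

exec "$@"
EOF

# Passing an explicit command runs it in place of the default application.
sh "$entrypoint" echo "started via exec"
```

Because the last line is exec "$@", whatever command is selected takes over the shell's PID instead of running as its child.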

🔧 Use a minimal init system

Recommended:

  • tini
  • dumb-init

Example:

ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["run.sh"]

🔧 Validate environment variables before use

: "${CONFIG_PATH:?CONFIG_PATH must be set}"
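When several variables are required, it can be friendlier to report all missing ones at once rather than stopping at the first. A minimal sketch, assuming a hypothetical require_env helper; LOG_LEVEL is a made-up second variable used only for illustration.

```shell
# Hypothetical helper: report every missing variable, then fail once.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup; names come from the script itself, never from user input.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "required variable is not set: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example values; LOG_LEVEL is a made-up variable for this sketch.
CONFIG_PATH=/etc/app/config.yaml
LOG_LEVEL=info
require_env CONFIG_PATH LOG_LEVEL && echo "environment ok"
```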

🔧 Use predictable logging

log() { printf '[%s] %s\n' "$(date +%H:%M:%S)" "$*" >&2; }
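Log collectors usually benefit from a full timestamp and a severity field. The following sketch extends the same idea; the helper names log_at, log_info, and log_error are invented for this example.

```shell
# Hypothetical helpers built on the same idea as the log function in the text.
# UTC timestamps in ISO 8601 form sort correctly and need no timezone context.
log_at() {
  level=$1
  shift
  printf '%s %s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$*" >&2
}
log_info()  { log_at INFO "$@"; }
log_error() { log_at ERROR "$@"; }

log_info "starting up"
```

Writing to stderr keeps log lines out of any data the script emits on stdout, and the container runtime captures both streams.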

🔧 Use trap for graceful shutdown

# POSIX sh expects signal names without the SIG prefix (dash rejects SIGTERM)
trap 'cleanup; exit 0' TERM INT
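When the shell has to stay alive (for example to run cleanup after the application exits), the trap needs to forward the signal and the script needs to collect the real exit status. A sketch under those assumptions; run_supervised is a hypothetical helper, not a standard command.

```shell
# Hypothetical helper: run a command in the background, forward TERM/INT to it,
# and return its exit status.
run_supervised() {
  "$@" &
  child=$!
  got_signal=0

  # A trap only runs once the current foreground command returns, which is why
  # the command is backgrounded and the shell blocks in the interruptible "wait".
  trap 'got_signal=1; kill -TERM "$child" 2>/dev/null' TERM INT

  status=0
  wait "$child" || status=$?

  # If a signal interrupted the first wait, wait again for the real exit status.
  if [ "$got_signal" -eq 1 ]; then
    status=0
    wait "$child" || status=$?
  fi

  trap - TERM INT
  return "$status"
}

run_supervised sleep 1
echo "exit status: $?"
```

The second wait matters because a trapped signal makes the first wait return early, before the child has actually exited.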

⚠️ Pitfalls

⚠️ Shell as PID 1 not reaping zombies

cmd &
# no wait → zombie
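The counterpart to this pitfall is to record the PID of each background job and wait for it explicitly. A minimal sketch, with run_all as a hypothetical helper that takes command strings:

```shell
# Hypothetical helper: start each command string in the background,
# then wait for every PID so no child is left unreaped.
run_all() {
  pids=""
  for job in "$@"; do
    sh -c "$job" &
    pids="$pids $!"
  done

  failed=0
  for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))
  done
  echo "$failed"   # number of jobs that exited non-zero
}

run_all 'true' 'exit 2' 'true'   # prints 1
```

Waiting per PID both reaps each child and makes its exit status available, which a bare wait with no arguments does not.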

⚠️ SIGTERM not stopping the app

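Kubernetes sends SIGTERM → the shell, running as PID 1 with no trap, drops it → the pod hangs until the grace period (terminationGracePeriodSeconds, 30 seconds by default) expires and SIGKILL is sent.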

⚠️ Using sleep infinity in entrypoints

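A foreground sleep infinity blocks the shell, and a trap only runs after the foreground command returns, so SIGTERM is never acted on. If the script must idle, run the sleep in the background and wait on it.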

⚠️ Relying on interactive features

Container entrypoints normally run in a non-interactive, non-login shell, which does not load:

  • .bashrc
  • .profile
  • .zshrc

⚠️ Using tail -f as the main process

tail does not forward signals, so the real application running alongside it never receives SIGTERM.


🚨 Real‑World Failures

🚨 Failure: Kubernetes pod refuses to terminate

Entrypoint:

#!/bin/sh
app &
wait

SIGTERM sent → shell has no trap and drops it → pod stays in Terminating until the grace period expires and SIGKILL is sent.

Fix:

# A trap set before exec is discarded when exec replaces the shell,
# so run the app as PID 1 and let it receive SIGTERM directly.
exec app

🚨 Failure: Zombie processes accumulate in container

Long‑running script spawns children but never reaps them.

Fix:

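Use tini, or make the script wait for every child it starts. A bare wait inside a CHLD trap blocks until all background jobs have exited, so waiting on recorded PIDs is easier to reason about:

cmd & pids="${pids:-} $!"
wait $pids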

🚨 Failure: Application never receives SIGTERM

Entrypoint:

app "$@"

Shell remains PID 1 → app is child → SIGTERM goes to shell, not app.

Fix:

exec app "$@"

🛠️ Patterns

🛠️ Pattern: Use exec to replace the shell

The application takes over the shell's PID and receives signals directly, with no intermediate shell to drop them.

🛠️ Pattern: Use a real init system

tini solves:

  • zombie reaping
  • signal forwarding
  • predictable shutdown

🛠️ Pattern: Validate environment early

Fail fast if required variables are missing.

🛠️ Pattern: Use health checks that do not fork excessively

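Probes run repeatedly for the life of the container, so keep each one to a single short command and avoid loops that spawn a process per iteration.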


❌ Anti‑Patterns

❌ Anti‑pattern: Shell as a long‑running supervisor

Shell is not systemd: it has no restart policy, no dependency ordering, and no automatic signal forwarding.

❌ Anti‑pattern: Using sleep infinity

❌ Anti‑pattern: Using tail -f as PID 1

❌ Anti‑pattern: Ignoring SIGTERM

❌ Anti‑pattern: Running background jobs without cleanup


🔍 Debugging

🔍 Inspect PID tree inside container

ps -eo pid,ppid,stat,args   # -e lists every process, not only those on the current terminal

🔍 Debug signals

# attaching to PID 1 may need the SYS_PTRACE capability, depending on the runtime
strace -f -e trace=signal -p 1
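When strace is not available in the image, the signal masks in /proc/<pid>/status give a quick answer to "does this process handle SIGTERM at all?". The sketch below assumes a Linux /proc filesystem; both helper names are hypothetical.

```shell
# Hypothetical helpers. mask_has_signal HEXMASK SIGNUM succeeds when the bit
# for SIGNUM is set in a /proc/<pid>/status mask such as SigCgt (caught signals).
mask_has_signal() {
  [ $(( (0x$1 >> ($2 - 1)) & 1 )) -eq 1 ]
}

# catches_signal PID SIGNUM: does the process have a handler for the signal?
catches_signal() {
  mask=$(awk '/^SigCgt:/ { print $2 }' "/proc/$1/status" 2>/dev/null) || return 2
  [ -n "$mask" ] || return 2
  mask_has_signal "$mask" "$2"
}

# SIGTERM is signal 15 on Linux; check whether PID 1 handles it.
if catches_signal 1 15; then
  echo "PID 1 has a SIGTERM handler"
else
  echo "PID 1 has no SIGTERM handler (or /proc is unavailable)"
fi
```

A PID 1 without the SIGTERM bit in SigCgt will not react to a stop request, which matches the hanging-shutdown symptoms described in this module.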

🔍 Debug FD leaks

ls -l /proc/1/fd

🔍 Debug entrypoint behavior

Add:

set -x
echo "PID=$$"
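Tracing is noisy, so it can help to switch it on only when asked. A small sketch; ENTRYPOINT_DEBUG is a made-up variable name and enable_debug a hypothetical helper.

```shell
# ENTRYPOINT_DEBUG is a made-up variable name for this sketch.
enable_debug() {
  if [ "${ENTRYPOINT_DEBUG:-0}" = "1" ]; then
    PS4='+ [pid $$] '   # prefix each traced command with the shell's PID
    set -x
  fi
}

enable_debug
```

With this in place, tracing could be enabled per run by setting the variable in the container's environment, without editing the image.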

⚙️ Performance

⚙️ Avoid heavy loops in entrypoints

Use compiled tools for heavy work.

⚙️ Avoid unnecessary forks

Use builtins where possible.
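For example, POSIX parameter expansion can replace many basename, dirname, and sed calls without starting a process. The path below is just an example value.

```shell
path=/etc/app/config.yaml   # example value

file=${path##*/}   # like basename: config.yaml
dir=${path%/*}     # like dirname:  /etc/app
ext=${file##*.}    # extension:     yaml
stem=${file%.*}    # without ext:   config

echo "$dir $file $stem $ext"
```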

⚙️ Use streaming tools for large data

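awk and sed process input record by record, and jq processes one JSON value at a time, so none of them needs to load an entire stream into memory.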
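As an illustration, the following sketch counts matching lines in a single pass instead of looping over them in the shell. count_errors is a hypothetical helper and the log lines are invented sample data.

```shell
# Hypothetical helper: count lines containing ERROR while reading stdin as a stream.
count_errors() {
  awk '/ERROR/ { n++ } END { print n + 0 }'
}

printf '%s\n' 'INFO start' 'ERROR disk full' 'INFO retry' 'ERROR timeout' | count_errors   # prints 2
```

A shell while-read loop doing the same work would run far more slowly on large inputs, especially if it invoked an external command per line.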


🧵 Process Control

🧵 PID 1 must forward signals

Otherwise:

  • Kubernetes cannot stop pods
  • Docker cannot stop containers
  • CI jobs hang

🧵 PID 1 must reap children

Otherwise zombies accumulate.
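To check whether zombies are in fact accumulating, the process table can be filtered on the state column. A minimal sketch, assuming a ps that supports -e and -o (procps does; some minimal images ship a reduced ps); both helper names are hypothetical.

```shell
# Hypothetical helpers. filter_zombies keeps rows whose third column (STAT)
# starts with Z; list_zombies feeds it the output of ps.
filter_zombies() {
  awk '$3 ~ /^Z/ { print }'
}

list_zombies() {
  # The trailing "=" on each column suppresses the header line.
  ps -eo pid=,ppid=,stat=,args= | filter_zombies
}

list_zombies
```

The PPID column shows which process failed to reap each zombie; if it is 1, the container's init process is the one that needs fixing.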


🛰️ CI/CD

🛰️ Containers in CI must fail fast

Use (pipefail is available in bash and some other shells, but not in every /bin/sh):

set -euo pipefail
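For scripts that must also run under a /bin/sh without pipefail, the option can be probed in a subshell first. This is a sketch of one common idiom, not the only way to handle it.

```shell
set -eu

# pipefail is not supported by every /bin/sh, so probe for it in a subshell
# and enable it only when the probe succeeds.
if (set -o pipefail) 2>/dev/null; then
  set -o pipefail
fi

# With pipefail active, a failure on the left side of a pipe is no longer hidden.
if false | true; then
  echo "pipeline succeeded: pipefail is not active"
else
  echo "pipeline failed: pipefail is active"
fi
```

The probe runs in a subshell because an unsupported option passed to set can terminate the shell that issued it.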

🛰️ Validate environment before running commands

🛰️ Avoid background jobs unless necessary


🧠 Summary

Shell scripts inside containers must account for:

  • PID 1 semantics
  • signal forwarding
  • zombie reaping
  • environment validation
  • deterministic startup and shutdown
  • predictable logging
  • safe entrypoint design

Mastering these techniques ensures reliable, production‑grade container behavior.