🐳 Advanced Shell in Containers
🧠 Overview
Shell scripts inside containers behave differently than on a normal Linux host. Why? Because containers change:
- PID hierarchy
- signal delivery
- process groups
- environment initialization
- filesystem layout
- entrypoint semantics
- zombie reaping
- logging and stdout/stderr behavior
This module explains how to write robust, production‑grade shell scripts that run correctly inside Docker, Kubernetes, and containerized CI/CD environments.
🎓 Who this is for
- DevOps/SRE building container images or entrypoints.
- Engineers writing startup scripts, health checks, or lifecycle hooks.
- Anyone debugging signal handling, zombie processes, or shutdown issues.
- People deploying applications in Kubernetes, Nomad, ECS, or Docker Swarm.
🧩 Internals / Mechanics
🧩 PID 1 is special
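Inside a container, the entrypoint process runs as PID 1 of the container's PID namespace.
PID 1 has unique semantics:
- the kernel does not apply default signal actions to it, so SIGTERM and similar signals are ignored unless the process installs a handler
- orphaned processes are re-parented to it, and they stay zombies unless it reaps them
- it does not forward signals to its children unless it is written to do so
- it is responsible for clean shutdown, because the container exits when PID 1 exits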
This is the root cause of many container bugs.
🧩 Shell as PID 1 is dangerous
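If your entrypoint starts the application through a shell (for example a wrapper script, or a command line that Docker runs via sh -c), then sh becomes PID 1 and:
- does not reliably reap orphaned processes that are re-parented to it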
- may ignore SIGTERM
- may not forward signals to the app
- may leave zombies
- may cause slow or broken shutdowns
🧩 Exec vs non‑exec entrypoints
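Bad: the entrypoint script runs the application as an ordinary command, so the shell stays alive as PID 1 and the application runs as its child.
Good: the entrypoint script starts the application with exec as its final command.
exec replaces the shell with the application → no extra shell process → the application receives signals directly as PID 1.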
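The difference can be observed by comparing process IDs. The following is a minimal sketch, not a real entrypoint: the helper names pids_with_exec and pids_without_exec are invented for this example, and an inner sh stands in for the application.

```shell
# Each helper prints the PID of a wrapper shell, then the PID of the
# "application" (an inner sh stands in for the real program).

# With exec: the application replaces the wrapper, so both PIDs match.
pids_with_exec() {
  sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}

# Without exec: the wrapper stays alive and forks a child with a new PID.
# The trailing ":" keeps the child from being the last command, which some
# shells would otherwise exec implicitly as an optimization.
pids_without_exec() {
  sh -c 'echo "$$"; sh -c "echo \$\$"; :'
}

pids_with_exec      # prints the same number twice
pids_without_exec   # prints two different numbers
```

Inside a container the first number would be 1, so only the exec variant leaves the application itself running as PID 1.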
🧩 Environment initialization
Containers often run with:
- missing environment variables
- a minimal or unexpected PATH
- no user dotfiles
- minimal locale settings
Scripts must validate every variable and command they depend on rather than assume host defaults.
🔧 Techniques
🔧 Always use exec in entrypoints
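An entrypoint that does its setup and then hands over to the real command might look like the sketch below. It is an illustration under assumptions: the function name entrypoint, the APP_MODE variable, and its default value are all hypothetical.

```shell
#!/bin/sh
# Hypothetical entrypoint sketch: APP_MODE and its default are invented here.
entrypoint() {
  # Setup that must happen before the application starts.
  : "${APP_MODE:=production}"
  export APP_MODE

  # Hand the process over to the command passed as arguments (for example
  # the image's CMD). Nothing after exec runs if the command starts.
  exec "$@"
}

# A real entrypoint script would end with:  entrypoint "$@"
```

Because of exec, the command keeps the PID of the script, so signals sent to the container reach the application directly.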
🔧 Use a minimal init system
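Recommended:
- tini
- dumb-init
Example: make the init the image's entrypoint and pass the real command to it as arguments, so that the init runs as PID 1 and the application runs as its child.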
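One way to wire an init in from a shell entrypoint is sketched below. It assumes tini, if installed, is on PATH; the helper name with_init is hypothetical, and falling back to a plain exec is a design choice of this sketch rather than a recommendation from tini itself.

```shell
# Sketch: run the command under tini when it is installed, otherwise exec it
# directly. Assumes tini, if present, is on PATH.
with_init() {
  if command -v tini >/dev/null 2>&1; then
    # "tini -- cmd args" starts cmd as a child of tini; tini forwards
    # signals to it and reaps zombies.
    exec tini -- "$@"
  fi
  exec "$@"
}

# A real entrypoint script would end with:  with_init "$@"
```

With Docker, passing --init to docker run inserts a small init process as PID 1 without any change to the image.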
🔧 Validate environment variables before use
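A fail-fast check can be sketched as a small helper. The name require_env is hypothetical, as is DATABASE_URL in the usage comment; the helper reports every missing variable before failing instead of stopping at the first one.

```shell
# Sketch of a fail-fast check; the variable names are chosen by the caller.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup by name. eval is acceptable here only because the
    # names are hard-coded in the script, never taken from user input.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "error: required environment variable $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example use at the top of an entrypoint (DATABASE_URL is hypothetical):
#   require_env DATABASE_URL || exit 1
```

For a single variable, the standard parameter expansion ${VAR:?message} gives a similar result in one line: a non-interactive shell prints the message and exits when VAR is unset or empty.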
🔧 Use predictable logging
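A logging helper along the following lines keeps output uniform. This is one possible format, not a standard: the function name log, the field order, and the choice of stderr are assumptions of this sketch.

```shell
# Sketch: one line per event, UTC timestamp, level, message, on stderr.
# Writing to stderr keeps stdout free for the program's real output.
log() {
  level=$1
  shift
  printf '%s [%s] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$*" >&2
}

log INFO "starting application"
```

Container runtimes capture both stdout and stderr, so nothing needs to be written to log files inside the container.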
🔧 Use trap for graceful shutdown
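When the shell has to stay alive (for example to run cleanup after the application exits), it can forward signals itself. The sketch below shows one way to do that; the function name run_with_forwarding is hypothetical, and a very small race remains if a signal arrives at the exact moment the child exits.

```shell
# Sketch: start a command in the background, forward SIGTERM/SIGINT to it
# as SIGTERM, and return the command's exit status.
run_with_forwarding() {
  child=""
  got_signal=0
  trap 'got_signal=1; [ -n "$child" ] && kill -TERM "$child" 2>/dev/null' TERM INT

  "$@" &
  child=$!

  # wait returns early (status above 128) when a trapped signal arrives.
  status=0
  wait "$child" || status=$?

  # If a signal interrupted the first wait and the child has not been
  # reaped yet, wait again to collect its real exit status.
  if [ "$got_signal" -eq 1 ] && kill -0 "$child" 2>/dev/null; then
    status=0
    wait "$child" || status=$?
  fi

  trap - TERM INT
  return "$status"
}

# Example use in an entrypoint:  run_with_forwarding "$@"; exit $?
```

The command has to run in the background because a shell only runs its traps between commands; a foreground command would delay the trap until that command exits on its own.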
⚠️ Pitfalls
⚠️ Shell as PID 1 not reaping zombies
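Whether a shell running as PID 1 reaps processes that are re-parented to it depends on the shell and on what the script is doing at that moment, so exited processes can pile up as zombies.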
⚠️ SIGTERM not stopping the app
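Kubernetes sends SIGTERM → shell ignores it → pod hangs until the termination grace period (30 seconds by default) expires and the container is killed with SIGKILL.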
⚠️ Using sleep infinity in entrypoints
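sleep infinity installs no SIGTERM handler, so as PID 1 it ignores the signal and the container is only stopped by SIGKILL once the grace period runs out.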
⚠️ Relying on interactive features
Entrypoint scripts run in non-interactive, non-login shells, which do not load:
- .bashrc
- .profile
- .zshrc
⚠️ Using tail -f as the main process
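tail does not forward signals to the real application, and as PID 1 it ignores SIGTERM itself, so shutdown stalls until SIGKILL.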
🚨 Real‑World Failures
🚨 Failure: Kubernetes pod refuses to terminate
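Entrypoint: a shell script that starts the application as a regular child command.
SIGTERM sent → shell ignores it → pod stuck in Terminating until the grace period expires.
Fix: start the application with exec as the script's last command, or install a trap that forwards SIGTERM to it.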
🚨 Failure: Zombie processes accumulate in container
Long‑running script spawns children but never reaps them.
Fix:
Use tini, or have the script call wait on every background process it starts.
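For a script that starts its own background processes, reaping can be sketched as follows. The helper name run_and_reap is hypothetical, and each argument is treated as a command line for sh -c. This covers only the script's direct children; orphans that get re-parented to PID 1 still need an init such as tini.

```shell
# Sketch: start several background commands and reap every one of them.
# Each argument is a command line run via "sh -c".
run_and_reap() {
  pids=""
  failed=0
  for cmd in "$@"; do
    sh -c "$cmd" &
    pids="$pids $!"
  done
  # wait collects each child's exit status, which removes the child from
  # the process table instead of leaving a zombie behind.
  for pid in $pids; do
    wait "$pid" || failed=1
  done
  return "$failed"
}

# Example:  run_and_reap 'sleep 1' 'echo worker done'
```

The return status is 0 only if every command succeeded, so the caller can also use it to fail fast.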
🚨 Failure: Application never receives SIGTERM
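Entrypoint: a shell command line that runs the application without exec.
Shell remains PID 1 → app is child → SIGTERM goes to shell, not app.
Fix: prefix the application command with exec so that the application replaces the shell as PID 1.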
🛠️ Patterns
🛠️ Pattern: Use exec to replace the shell
Ensures correct signal handling.
🛠️ Pattern: Use a real init system
tini solves:
- zombie reaping
- signal forwarding
- predictable shutdown
🛠️ Pattern: Validate environment early
Fail fast if required variables are missing.
🛠️ Pattern: Use health checks that do not fork excessively
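Health checks run repeatedly for the whole life of the container, so avoid loops and pipelines that spawn many processes on every check.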
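A liveness check built only from shell builtins might look like the sketch below. It rests on an assumption that is not part of this module: the application writes its PID to a file whose path is passed in. The function name is_alive is hypothetical.

```shell
# Sketch of a cheap liveness check. The pidfile convention is an assumption:
# the application is expected to write its PID to the given file.
is_alive() {
  pidfile=$1
  pid=""
  [ -r "$pidfile" ] || return 1
  # read and kill are shell builtins, so the check itself forks nothing.
  read -r pid < "$pidfile" || true
  # kill -0 sends no signal; it only tests that the process exists and
  # that the caller is allowed to signal it.
  [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Example:  is_alive /run/app.pid || exit 1   (the path is hypothetical)
```

This confirms only that the process exists, not that it is serving requests, so it suits a liveness check better than a readiness check.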
❌ Anti‑Patterns
❌ Anti‑pattern: Shell as a long‑running supervisor
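A shell is not a process supervisor like systemd: it has no restart policy, no dependency ordering, and no service lifecycle management.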
❌ Anti‑pattern: Using sleep infinity
❌ Anti‑pattern: Using tail -f as PID 1
❌ Anti‑pattern: Ignoring SIGTERM
❌ Anti‑pattern: Running background jobs without cleanup
🔍 Debugging
🔍 Inspect PID tree inside container
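One way to inspect the process table is to list each process with its parent PID and state, then filter for zombies (state Z). The sketch below assumes a procps-style ps that accepts -eo; minimal images may ship a ps with different options. The helper name filter_zombies is hypothetical.

```shell
# Sketch: print PID, PPID and command of every zombie found in ps output.
# Expects the column layout of: ps -eo pid,ppid,stat,comm
filter_zombies() {
  awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2, $4 }'
}

# Inside the container (assumes a procps-style ps is installed):
#   ps -eo pid,ppid,stat,comm                   # full process table
#   ps -eo pid,ppid,stat,comm | filter_zombies  # zombies only
```

The first command also shows which program is actually running as PID 1, which is often the quickest way to spot a shell wrapper that should have used exec.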
🔍 Debug signals
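On Linux, /proc/PID/status reports which signals a process catches (SigCgt) and ignores (SigIgn) as hexadecimal bit masks, where bit N-1 stands for signal N. The helper below is a sketch for decoding such a mask; the name signal_in_mask is hypothetical, and it only handles signals 1 to 32.

```shell
# Sketch: test whether signal number $2 is set in a hex mask such as the
# SigCgt or SigIgn field of /proc/<pid>/status. Handles signals 1-32.
signal_in_mask() {
  mask=$1
  signo=$2
  # Keep the low 32 bits (last 8 hex digits); bit N-1 represents signal N.
  low=$(printf '%s' "$mask" | tail -c 8)
  [ $(( (0x$low >> (signo - 1)) & 1 )) -eq 1 ]
}

# Inside the container (SIGTERM is signal 15):
#   cgt=$(awk '/^SigCgt:/ { print $2 }' /proc/1/status)
#   signal_in_mask "$cgt" 15 && echo "PID 1 installs a SIGTERM handler"
```

If PID 1 neither catches nor ignores SIGTERM according to these masks, the signal is dropped by the kernel, which matches the hanging-shutdown symptom.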
🔍 Debug FD leaks
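On Linux, the open file descriptors of a process appear as entries under /proc/PID/fd. A simple counter, sketched here with the hypothetical name count_fds, can be sampled over time to spot growth.

```shell
# Sketch: count the open file descriptors of a process via /proc (Linux only).
# Defaults to the current shell when no PID is given; prints 0 if the
# process does not exist or cannot be inspected.
count_fds() {
  pid=${1:-$$}
  ls "/proc/$pid/fd" 2>/dev/null | wc -l
}

# Inside the container:
#   count_fds 1          # number of descriptors held by PID 1
#   ls -l /proc/1/fd     # what each descriptor points to
```

A count that keeps rising while the workload is steady suggests a leak; the ls -l output then shows which files, pipes, or sockets are accumulating.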
🔍 Debug entrypoint behavior
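Add set -x near the top of the script so that each command is printed to stderr before it runs, and log the script's PID to confirm whether it is running as PID 1.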
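Tracing is noisy, so it helps to make it switchable. The sketch below gates it behind an environment variable; ENTRYPOINT_DEBUG and the function name enable_trace_if_requested are hypothetical names chosen for this example.

```shell
# Sketch: turn on command tracing only when ENTRYPOINT_DEBUG=1.
# ENTRYPOINT_DEBUG is a hypothetical variable name chosen for this example.
enable_trace_if_requested() {
  if [ "${ENTRYPOINT_DEBUG:-0}" = "1" ]; then
    echo "entrypoint running as PID $$" >&2
    set -x   # print each command to stderr before running it
  fi
}

# Call it as the first step of the entrypoint:
#   enable_trace_if_requested
```

Tracing prints expanded variable values, so it should stay off by default wherever the environment carries secrets.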
⚙️ Performance
⚙️ Avoid heavy loops in entrypoints
Use compiled tools for heavy work.
⚙️ Avoid unnecessary forks
Use builtins where possible.
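As an illustration, POSIX parameter expansion can replace common calls to basename, dirname, and sed, each of which would start a separate process. The path below is a made-up example.

```shell
# Sketch: parameter expansion does the work of basename/dirname/sed
# without starting a process.
path=/var/log/app/server.log   # hypothetical path

file=${path##*/}    # strip the longest prefix ending in "/"
dir=${path%/*}      # strip the shortest suffix starting at the last "/"
stem=${file%.log}   # strip a known extension

echo "$file $dir $stem"
```

The saving is negligible for a single call but adds up inside loops and in health checks that run every few seconds.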
⚙️ Use streaming tools for large data
awk, sed, jq are optimized for streaming.
🧵 Process Control
🧵 PID 1 must forward signals
Otherwise:
- Kubernetes cannot stop pods
- Docker cannot stop containers
- CI jobs hang
🧵 PID 1 must reap children
Otherwise zombies accumulate.
🛰️ CI/CD
🛰️ Containers in CI must fail fast
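Use strict mode at the top of every script: set -e stops at the first failing command and set -u treats unset variables as errors (add set -o pipefail in shells that support it).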
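The effect of strict mode can be seen with a small simulated CI step. In the sketch below, the step names are invented and false stands in for a failing build command; the script runs in a child shell so that its exit status can be observed from outside.

```shell
# Sketch: a CI step script in strict mode, run in a child shell so the
# failure can be observed from outside.
ci_script='
set -eu
echo "step 1: build"
false                     # simulated failing command
echo "step 2: deploy"     # never reached
'

status=0
output=$(sh -c "$ci_script") || status=$?
echo "output: $output"
echo "exit status: $status"
```

Note that set -e is suspended for commands tested by if, while, && or ||, so a function called in one of those contexts does not stop at its first failing command.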
🛰️ Validate environment before running commands
🛰️ Avoid background jobs unless necessary
🧠 Summary
Shell scripts inside containers must account for:
- PID 1 semantics
- signal forwarding
- zombie reaping
- environment validation
- deterministic startup and shutdown
- predictable logging
- safe entrypoint design
Mastering these techniques ensures reliable, production‑grade container behavior.