🐳 Advanced Shell in Containers

🧠 Overview

Shell scripts inside containers behave differently than on a normal Linux host. Containers change:

PID hierarchy
signal delivery
process groups
environment initialization
filesystem layout
entrypoint semantics
zombie reaping
logging and stdout/stderr behavior

This module explains how to write robust, production‑grade shell scripts that run correctly inside Docker, Kubernetes, Nomad, ECS, Swarm, and containerized CI/CD environments.

🎓 Who this is for

DevOps/SRE building container images or entrypoints
Engineers writing startup scripts, health checks, or lifecycle hooks
Anyone debugging signal handling, zombie processes, or shutdown issues
People deploying applications in Kubernetes, Nomad, ECS, or Docker Swarm
Engineers writing Docker entrypoints, sidecars, init containers, or CI containers

🧩 Role in the Ecosystem

Container shell behavior interacts with the container runtime, the orchestrator, and any CI/CD system that starts the container.

Containers are where PID 1 semantics, signals, and zombies become very real.

1. Internals / Mechanics

1.1 PID 1 is special

Inside a container, the entrypoint becomes PID 1.

PID 1 has unique semantics:

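ignores any signal it has no handler for (the kernel skips default actions such as "terminate" for PID 1)
adopts orphaned processes and must reap them itself, or they linger as zombies
does not forward signals to its children unless explicitly implemented
is responsible for clean shutdown
is the ancestor of every process in the container; when it exits, the container stops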

This is the root cause of many container bugs:

pods stuck in Terminating
containers ignoring SIGTERM
zombie processes accumulating
CI jobs hanging forever
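
A script can at least detect that it is running as PID 1 and say so in its logs. The sketch below assumes a POSIX shell; `running_as_pid1` and `warn_if_pid1` are hypothetical helper names, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: true when the given PID (default: this shell) is PID 1.
running_as_pid1() {
  [ "${1:-$$}" -eq 1 ]
}

# Print a warning to stderr when this script is the container's init process.
warn_if_pid1() {
  if running_as_pid1; then
    echo "warning: this shell is PID 1; signals and zombie reaping need explicit handling" >&2
  fi
}

warn_if_pid1
```

The warning changes nothing by itself, but it makes the situation visible in logs when debugging a stuck shutdown.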

1.2 Shell as PID 1 is dangerous

If your entrypoint is:

CMD ["sh", "-c", "run.sh"]

Then sh becomes PID 1 and:

is not designed to reap orphaned processes it adopts
may ignore SIGTERM
may not forward signals to the app
may leave zombies
may cause slow or broken shutdowns
may swallow exit codes

This is especially bad when:

the shell spawns background jobs
the shell uses pipelines
the shell uses subshells
the shell does not use exec

1.3 Exec vs non‑exec entrypoints

Bad:

app "$@"

Good:

exec app "$@"

exec replaces the shell with the application:

no extra shell process
correct signal handling
no zombie shell
PID 1 is the actual app
simpler process tree

Without exec, you get:

PID 1 = shell
app is child of shell
SIGTERM goes to shell, not app
shell may ignore or mishandle signals
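
The difference can be observed without a container by comparing PIDs. This is a small demonstration, assuming `sh` is on the PATH: with exec, the wrapper and the child report the same PID; without it, the child is a separate process.

```shell
#!/bin/sh
# With exec: the wrapper prints its PID, then is replaced by the child,
# which prints the same PID.
with_exec=$(sh -c 'echo "$$"; exec sh -c "echo \$\$"')

# Without exec: the child is forked, so it reports a different PID.
# The trailing ':' stops shells that automatically exec the last command.
without_exec=$(sh -c 'echo "$$"; sh -c "echo \$\$"; :')

printf 'with exec:\n%s\nwithout exec:\n%s\n' "$with_exec" "$without_exec"
```

In a container, the first case corresponds to the app running as PID 1 and the second to the app running as a child of a shell that holds PID 1.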

1.4 Environment initialization in containers

Containers often run with:

missing environment variables
empty or minimal PATH
no user dotfiles
minimal locale settings
no login shell semantics

This means:

no .bashrc, .profile, .zshrc
no aliases
no functions
no PATH modifications
no locale configuration

Scripts must validate everything:

required variables
required tools
required directories
required files
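
The checks for tools, directories, and files can be sketched as small preflight helpers. The function names below are hypothetical; each one logs to stderr and returns non-zero on failure, so a script running under `set -e` stops at the first failed check.

```shell
#!/bin/sh
# Hypothetical preflight helpers; each returns non-zero and logs to stderr on failure.
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || { echo "missing required tool: $1" >&2; return 1; }
}

require_dir() {
  [ -d "$1" ] || { echo "missing required directory: $1" >&2; return 1; }
}

require_file() {
  [ -r "$1" ] || { echo "missing or unreadable file: $1" >&2; return 1; }
}

# Example checks against things that exist in virtually every image.
require_cmd sh
require_dir /
```

In a real entrypoint, the arguments would be the tools and paths the application actually needs, for example `require_file "$CONFIG_PATH"`.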

1.5 Filesystem and PID namespace

Containers run with:

isolated PID namespace
isolated mount namespace
overlayfs or unionfs
ephemeral writable layers

This affects:

process inspection (ps, /proc)
file descriptors (/proc/1/fd)
zombie visibility
performance of heavy I/O loops

2. Techniques

2.1 Always use `exec` in entrypoints

Your entrypoint should almost always end with exec:

#!/bin/sh
set -e

# optional: setup, logging, validation
: "${CONFIG_PATH:?CONFIG_PATH must be set}"

exec app "$@"

Benefits:

PID 1 = app
signals go directly to the app
no extra shell layer
no zombie shell
simpler debugging

2.2 Use a minimal init system

PID 1 must:

reap zombies
forward signals
exit cleanly

Shells are not init systems. Use:

tini
dumb-init

Example:

RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["run.sh"]

Then in run.sh:

#!/bin/sh
set -e
exec app "$@"

tini:

reaps zombies
forwards signals
handles PID 1 semantics correctly

2.3 Validate environment variables before use

: "${CONFIG_PATH:?CONFIG_PATH must be set}"
: "${ENVIRONMENT:?ENVIRONMENT must be set}"
: "${SERVICE:?SERVICE must be set}"

If a variable is unset or empty, the shell prints the message and exits immediately with a non-zero status, instead of failing later in confusing ways.
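
The `${VAR:?}` form stops at the first missing variable. When several variables are required, it can be friendlier to report all of them at once. The following is a sketch of that idea; `require_vars` is a hypothetical helper, and it uses `eval` on variable names that are assumed to come from the script author, never from user input.

```shell
#!/bin/sh
# Hypothetical helper: report every unset or empty variable in one message.
require_vars() {
  missing=""
  for name in "$@"; do
    # Indirect lookup of the variable called $name; names must be trusted.
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  if [ -n "$missing" ]; then
    echo "missing required variables:$missing" >&2
    return 1
  fi
}

# Usage in an entrypoint:
#   require_vars CONFIG_PATH ENVIRONMENT SERVICE
```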

2.4 Use predictable logging

log() { printf '[%s] %s\n' "$(date +%H:%M:%S)" "$*" >&2; }

In containers:

stdout/stderr are the primary logging channels
logs are collected by Docker, Kubernetes, ECS, etc.
structured logs are easier to parse
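
One lightweight way to get structured output from a shell script is a key=value line with a UTC timestamp. This is a sketch, not a full logging library: `log_kv` is a hypothetical name, and quotes inside the message are not escaped.

```shell
#!/bin/sh
# Hypothetical helper: write one key=value log line to stderr.
# Usage: log_kv LEVEL MESSAGE...
log_kv() {
  level=$1
  shift
  printf 'ts=%s level=%s msg="%s"\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$*" >&2
}

log_kv info "starting up"
```

A full ISO 8601 timestamp in UTC keeps lines sortable and unambiguous when logs from several containers are merged.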

2.5 Use `trap` for graceful shutdown

cleanup() {
  log "Cleaning up..."
  # stop children, flush buffers, remove temp files, etc.
}

trap 'cleanup; exit 0' TERM INT

This ensures:

graceful shutdown on docker stop
graceful shutdown on Kubernetes SIGTERM
graceful shutdown on CI timeouts (if signals are forwarded)
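
A trap only runs once the shell regains control, so it does not fire while a foreground command is still running. When the shell has to stay alive next to the app, a common approach is to start the app in the background, forward the signal, and `wait`. The sketch below assumes a POSIX shell; `forward_and_wait` is a hypothetical helper, and the demo uses a stand-in command instead of a real application.

```shell
#!/bin/sh
# Hypothetical helper: run a command in the background, forward TERM/INT to it,
# and return its exit status.
forward_and_wait() {
  got_signal=0
  "$@" &
  child=$!
  trap 'got_signal=1; kill -TERM "$child" 2>/dev/null' TERM INT

  status=0
  wait "$child" || status=$?    # returns early (status > 128) if a trapped signal arrives

  if [ "$got_signal" -eq 1 ]; then
    status=0
    wait "$child" || status=$?  # wait again to collect the child's real exit status
  fi
  trap - TERM INT
  return "$status"
}

# Demo with a stand-in command; a real entrypoint would pass the application here.
rc=0
forward_and_wait sh -c 'exit 3' || rc=$?
echo "child exited with status $rc"
```

This keeps the child's exit status intact, which matters when the orchestrator decides whether to restart the container. If no work is needed after the app exits, `exec` remains the simpler choice.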

3. Pitfalls

3.1 Shell as PID 1 not reaping zombies

Example:

#!/bin/sh
worker &
sleep infinity

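worker exits → becomes a zombie until PID 1 waits on it
the script never calls wait, so reaping depends on shell internals
in a long-running container, unreaped children accumulate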
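
For children the script starts itself, one fix is to record their PIDs and wait on each of them explicitly. This is a sketch with stand-in workers; `wait_all` is a hypothetical helper. It does not cover orphans adopted from elsewhere in the process tree, which still need a real init.

```shell
#!/bin/sh
# Hypothetical helper: wait on every PID given, counting non-zero exits in $failures.
wait_all() {
  failures=0
  for pid in "$@"; do
    wait "$pid" || failures=$((failures + 1))
  done
  [ "$failures" -eq 0 ]
}

# Stand-ins for real workers: one succeeds, one fails.
sh -c 'exit 0' & first=$!
sh -c 'exit 2' & second=$!

if wait_all "$first" "$second"; then
  echo "all workers exited cleanly"
else
  echo "$failures worker(s) failed" >&2
fi
```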

3.2 SIGTERM not stopping the app

Entrypoint:

#!/bin/sh
app &
wait

Kubernetes sends SIGTERM to PID 1 (shell):

shell may ignore SIGTERM
app keeps running
pod stuck in Terminating

3.3 Using `sleep infinity` in entrypoints

#!/bin/sh
setup
sleep infinity

container never exits
SIGTERM may be ignored
shutdown is broken
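
If a container really must idle, the idle loop can be made signal-aware. The sketch below runs sleep in the background and blocks in `wait`, which a trapped signal can interrupt. `idle_until_signal` is a hypothetical helper, and the one-hour sleep interval is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical replacement for a bare `sleep infinity`: idle in a loop that a
# trapped signal can interrupt, then exit cleanly.
idle_until_signal() {
  trap 'kill "$sleep_pid" 2>/dev/null; exit 0' TERM INT
  while :; do
    sleep 3600 &
    sleep_pid=$!
    # `wait` is interrupted by trapped signals; a trap does not run
    # until a foreground command finishes.
    wait "$sleep_pid" || true
  done
}

# Usage in an entrypoint (after any setup steps):
#   idle_until_signal
```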

3.4 Relying on interactive features

Containers do not load:

.bashrc
.profile
.zshrc

No:

aliases
prompts
interactive read
interactive sudo
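
A script that sometimes runs by hand and sometimes in a container can check whether stdin is a terminal before prompting. A minimal sketch, assuming a POSIX shell; `is_interactive` and `confirm` are hypothetical helpers.

```shell
#!/bin/sh
# Hypothetical helper: true only when stdin is a terminal.
is_interactive() {
  [ -t 0 ]
}

# Ask for confirmation only when a human can answer; otherwise use the
# default passed as the second argument (falls back to "n").
confirm() {
  if is_interactive; then
    printf '%s [y/N] ' "$1" >&2
    read -r answer
  else
    answer=${2:-n}
  fi
  [ "$answer" = y ]
}

# Usage:
#   confirm "Run migrations?" y && echo "running migrations"
```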

3.5 Using `tail -f` as the main process

tail -f /var/log/app.log

PID 1 = tail
app is separate process
signals go to tail, not app
shutdown is broken
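
One alternative, when the app can only write to a file path, is to point that path at stdout and then exec the app, so the app stays the main process and the runtime collects its logs. The sketch below shows only the linking step against a temporary directory. `link_log_to_stdout` is a hypothetical helper, and the approach assumes the app opens the log by path and is allowed to write to /dev/stdout.

```shell
#!/bin/sh
# Hypothetical helper: point a log file path at the container's stdout.
link_log_to_stdout() {
  ln -sf /dev/stdout "$1"
}

# Demo against a temporary directory rather than a real /var/log path.
demo_dir=$(mktemp -d)
link_log_to_stdout "$demo_dir/app.log"
ls -l "$demo_dir/app.log"
```

In an entrypoint, the link would be created for the real log path, followed by `exec app "$@"`.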

4. Real‑World Failures (Intro)

Below are concrete production incidents that stem from:

PID 1 semantics
a missing exec
missing zombie reaping
incorrect signal trapping
poor entrypoint design

Detailed cases are covered in part 2/3.

4.1 Kubernetes pod refuses to terminate

Entrypoint:

#!/bin/sh
app &
wait

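SIGTERM sent → shell has no handler, so as PID 1 it ignores the signal → pod stays in Terminating until the grace period expires and the container is killed with SIGKILL.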

4.2 Zombie processes accumulate in container

Long‑running script spawns children but never reaps them.

4.3 Application never receives SIGTERM

Entrypoint:

app "$@"

Shell remains PID 1 → app is child → SIGTERM goes to shell, not app.

5. Patterns (Intro)

Use exec to replace the shell
Use a real init system (tini, dumb-init)
Validate environment early
Use lightweight health checks
Keep entrypoints small and deterministic

Patterns, anti‑patterns, debugging, and performance are covered in full in parts 2/3 and 3/3.
