
🧵 Advanced Shell Process Control

🧠 Overview

Process control is where the shell stops being “a command runner” and becomes a process orchestrator. This includes:

  • process groups
  • job control
  • signal delivery
  • foreground/background execution
  • subshells
  • zombies and reaping
  • traps
  • PID 1 behavior in containers

This is one of the most misunderstood areas of shell behavior, and one of the most critical for production systems.


🎓 Who this is for

  • DevOps/SRE managing long-running scripts, daemons, or containers
  • Engineers writing orchestration logic or supervising child processes
  • Anyone debugging zombies, hanging pipelines, or broken signal handling
  • People who want deterministic, production-grade process behavior

🧩 Role in the Ecosystem

Process control is the backbone of reliable shell automation. Without an understanding of process groups, sessions, and signal propagation, shell scripts behave unpredictably under load, in pipelines, or inside containers.


🧩 Internals / Mechanics

🧩 The shell as a process controller

A shell manages:

  • processes (PIDs)
  • process groups (PGIDs)
  • sessions
  • terminal control
  • signal routing
  • job tables

When you run:

cmd1 | cmd2 &

the shell:

  1. creates pipes
  2. forks children
  3. assigns them to a process group
  4. optionally puts the group in the background
  5. tracks them in the job table

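🧩 Foreground vs background

  • The foreground job owns the terminal and receives keyboard-generated signals such as SIGINT (Ctrl-C), SIGQUIT (Ctrl-\), and SIGTSTP (Ctrl-Z).
  • A background job does NOT own the terminal, so keyboard signals never reach it; signal it explicitly with kill.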
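
Because keyboard signals never reach a background job, a script has to deliver the signal itself. The following is a minimal sketch that uses sleep as a stand-in for a real job; the variable names are illustrative only.

```shell
# Start a stand-in background job and remember its PID.
sleep 30 &
bg_pid=$!

# Keyboard signals will not reach it, so send SIGTERM explicitly.
kill -TERM "$bg_pid"

# wait reaps the job; a job killed by a signal reports 128 + signal number.
wait "$bg_pid"
status=$?
echo "background job exited with status $status"
```

SIGTERM is signal 15, so the reported status here is 143.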

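🧩 Process groups

A pipeline typically shares a process group:

cmd1 | cmd2 | cmd3

When the pipeline runs in the foreground, all three commands receive SIGINT when you press Ctrl-C, because the terminal signals the entire foreground process group.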


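🧩 Subshells and isolation

Subshells:

  • run in their own process with their own PID (although $$ still reports the parent shell's PID)
  • start with a copy of the parent's variables; changes made inside do NOT propagate back
  • inherit the environment
  • inherit file descriptors unless redirected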
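
The variable isolation described above can be sketched as follows. The variable names are made up for the illustration.

```shell
counter=1

# The parentheses start a subshell, which works on a copy of the variables.
(
  counter=2
  echo "inside subshell: counter=$counter"
)

# The parent's value is untouched.
echo "after subshell: counter=$counter"

# Command substitution is also a subshell; pass results back through its output.
result=$(counter=3; echo "$counter")
echo "captured from subshell: $result, parent still has counter=$counter"
```

If a value computed in a subshell is needed afterwards, returning it on stdout (as in the last two lines) is usually the simplest route.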

See also: Subshells & Environment


🔧 Techniques

🔧 Use wait to reap children

cmd &
pid=$!
wait "$pid"

wait collects the child's exit status, which lets the kernel release its process-table entry and prevents zombies.


🔧 Use trap for clean shutdown

trap 'cleanup; exit 0' SIGINT SIGTERM
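
To show the trap in context, here is a hedged sketch of a job that cleans up when it is told to stop. The scratch directory, the body of cleanup, and the use of sleep as the workload are assumptions for the demo. The sketch also exits with 128 + signal number rather than 0 so a caller can tell the run was interrupted; that is a design choice of this example, not something the line above prescribes.

```shell
# Scratch directory that must not outlive the job (illustrative).
workdir=$(mktemp -d)

(
  cleanup() {
    kill "$worker_pid" 2>/dev/null   # stop the child we started
    rm -rf "$workdir"                # remove scratch state
  }
  trap 'cleanup; exit 143' TERM      # 128 + 15
  trap 'cleanup; exit 130' INT       # 128 + 2

  # Run the workload in the background and wait on it: the shell defers a
  # trap until the current foreground command returns, but wait is interruptible.
  sleep 30 &
  worker_pid=$!
  wait "$worker_pid"
) &
job_pid=$!

sleep 1                  # give the subshell time to install its traps
kill -TERM "$job_pid"    # simulate an orchestrator stopping the job
wait "$job_pid"
status=$?
echo "job exited with status $status"
```

Running the workload with & followed by wait is what makes the shutdown prompt; with a plain foreground sleep 30, the trap would only run after the sleep had finished.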

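🔧 Use process substitution to avoid temporary files and pipeline subshells

diff <(sort a) <(sort b)

Process substitution (Bash, Zsh, ksh; not POSIX sh) still runs each inner command in its own process. What it avoids is temporary files, and it lets a consuming loop stay in the current shell instead of a pipeline subshell.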
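
A small Bash-only sketch of the difference between piping into a loop and feeding the loop through process substitution. The counters are illustrative, and the pipeline result assumes Bash's default settings (lastpipe disabled).

```shell
# Pipeline version: the while loop runs in a subshell, so its count is lost.
piped=0
printf 'a\nb\nc\n' | while read -r line; do piped=$((piped + 1)); done

# Process substitution: the loop runs in the current shell and keeps its count.
kept=0
while read -r line; do kept=$((kept + 1)); done < <(printf 'a\nb\nc\n')

echo "pipeline: $piped, process substitution: $kept"
```

Under those assumptions the first counter stays at 0 while the second reaches 3.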

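🔧 Use set -m (job control) mainly in interactive shells

Avoid enabling job control in scripts unless you specifically need each background job in its own process group; it changes how signals and the terminal are handled.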


⚠️ Pitfalls

⚠️ Zombie processes from unreaped children

cmd &
# no wait: once cmd exits, it can linger as a zombie until the parent reaps it or exits
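
Shells often reap background children on their own, so a zombie is easiest to observe when the parent is a program that never calls wait. The sketch below arranges that with exec; it assumes a ps that supports -o stat= and -p (procps and BSD ps do), and the temporary file is only a way to pass the child's PID out of the subshell.

```shell
pidfile=$(mktemp)

# exec replaces the subshell with sleep, which never calls wait, so the
# short-lived child started just before it has a parent that will not reap it.
( sleep 1 & echo $! > "$pidfile"; exec sleep 3 ) &
parent_pid=$!

sleep 2                                    # the short-lived child has exited by now
child_pid=$(cat "$pidfile")
state=$(ps -o stat= -p "$child_pid" | tr -d ' ')
echo "child $child_pid state: $state"      # a state starting with Z means zombie

# When the parent exits, the zombie is handed to init (or a subreaper).
wait "$parent_pid"
rm -f "$pidfile"
```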

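⚠️ Ctrl-C not stopping pipelines

Ctrl-C sends SIGINT to every process in the terminal's foreground process group. If some pipeline members end up in a different process group, only the members still in the foreground group receive SIGINT.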


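⚠️ Traps set in subshells do not apply to the parent

( trap 'echo hi' EXIT )

The trap fires when the subshell exits, not when the parent does. Conversely, traps set in the parent are reset to their defaults inside a subshell (ignored signals stay ignored).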
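
The ordering can be made visible by capturing the output. This is a minimal sketch; the messages are arbitrary.

```shell
log=$(
  (
    trap 'echo "EXIT trap ran inside the subshell"' EXIT
    echo "subshell body"
  )
  echo "parent continues"
)
printf '%s\n' "$log"
```

The trap message appears between the subshell's own output and the parent's next line, which shows that it ran when the subshell ended rather than at the end of the enclosing shell.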


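⚠️ Using kill -9 as a default

SIGKILL cannot be caught, so traps and cleanup handlers never run and state (temp files, locks, partial writes) can be left corrupted. Send SIGTERM first and escalate only if the process does not exit.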
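
One way to avoid reaching for SIGKILL first is a small escalation helper. The function name stop_gracefully, its arguments, and the default grace period are hypothetical; this is a sketch, not a standard command.

```shell
# stop_gracefully PID [SECONDS]: send SIGTERM, wait up to SECONDS, then SIGKILL.
stop_gracefully() {
  sg_pid=$1
  sg_grace=${2:-5}

  kill -TERM "$sg_pid" 2>/dev/null || return 0   # already gone

  while [ "$sg_grace" -gt 0 ]; do
    # kill -0 only checks that the PID still exists; it sends no signal.
    kill -0 "$sg_pid" 2>/dev/null || return 0
    sleep 1
    sg_grace=$((sg_grace - 1))
  done

  kill -KILL "$sg_pid" 2>/dev/null               # last resort
  return 0
}

# Demo with a stand-in process that honors SIGTERM.
sleep 30 &
target=$!
stop_gracefully "$target" 3
wait "$target"
status=$?
echo "target exited with status $status"
```

Because sleep exits on SIGTERM, the demo ends with status 143 and SIGKILL is not what terminates it.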


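🚨 Real-World Failures

🚨 Failure: Shell script used as PID 1 leaks zombies

In Docker:

CMD ["sh", "-c", "run-app.sh"]

The shell (or the script it execs) becomes PID 1. It is not written to act as an init: orphaned processes reparented to it are not reliably reaped, and signals are not forwarded to the application, so zombies accumulate.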

Fix:

  • use tini or dumb-init
  • or implement a SIGCHLD handler + wait

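🚨 Failure: CI job hangs due to orphaned background process

long_task &
exit 0

The background process keeps running after the script exits, typically still holding the job's stdout/stderr open, so the runner keeps waiting and the CI job never finishes.

Fix:

trap 'kill 0' EXIT

Note that kill 0 signals every process in the script's process group, including the script itself; killing the recorded PIDs is a narrower alternative.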
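
A narrower variant of this fix records the PID and stops only that process. In the sketch below the subshell plays the role of the CI script, sleep stands in for long_task, and the temporary file exists only so the demo can check the result afterwards; all of these are assumptions for illustration.

```shell
pidfile=$(mktemp)

(
  # Redirecting output keeps the task from holding the job's stdout/stderr open.
  sleep 30 >/dev/null 2>&1 &
  task_pid=$!
  echo "$task_pid" > "$pidfile"

  # On any exit path, stop only the process this script started, then reap it.
  trap 'kill "$task_pid" 2>/dev/null; wait "$task_pid" 2>/dev/null' EXIT

  exit 0
)

task_pid=$(cat "$pidfile")
if kill -0 "$task_pid" 2>/dev/null; then
  echo "task $task_pid is still running"
else
  echo "task $task_pid was cleaned up"
fi
rm -f "$pidfile"
```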

🚨 Failure: Ctrl-C doesn't stop a pipeline

cmd1 | cmd2

If the pipeline members do not share one process group, only the members in the foreground group receive SIGINT and the rest keep running.


🛠️ Patterns

🛠️ Pattern: Explicit signal handling

# Reset the traps first: kill 0 also signals this shell and would re-enter the handler
trap 'echo stopping; trap - INT TERM; kill 0; exit' INT TERM

🛠️ Pattern: Use wait for all children

pids=()
for x in {1..5}; do
  worker "$x" &
  pids+=("$!")
done

for pid in "${pids[@]}"; do
  wait "$pid"
done
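
Since wait returns each child's exit status, the same loop can also report failures instead of discarding them. In this sketch the worker function is a hypothetical stand-in that fails for one input, and the PIDs are kept in a plain string rather than a Bash array so the example also runs in POSIX sh.

```shell
# Hypothetical worker: succeeds for every input except 3.
worker() {
  sleep 1
  [ "$1" -ne 3 ]
}

pids=""
for x in 1 2 3 4 5; do
  worker "$x" &
  pids="$pids $!"
done

# Count the children that exited non-zero.
failed=0
for pid in $pids; do
  wait "$pid" || failed=$((failed + 1))
done

echo "$failed worker(s) failed"
```

A script could then exit non-zero when failed is greater than 0, so the caller sees that part of the batch did not succeed.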

🛠️ Pattern: Use a minimal init in containers

tini or dumb-init solves:

  • zombie reaping
  • signal forwarding
  • predictable shutdown

❌ Anti-Patterns

❌ Using shell as a process supervisor

Shell is not systemd. Avoid:

  • long-running loops
  • manual restarts
  • complex signal routing

❌ Ignoring SIGCHLD

A long-running parent that never waits for its exited children accumulates zombies.


❌ Running background jobs without cleanup

cmd &
exit

🔍 Debugging

🔍 Inspect process tree

ps f
pstree -p

🔍 Inspect process groups

ps -o pid,pgid,comm

🔍 Trace signals

strace -e trace=signal -f sh script.sh

⚙️ Performance

⚙️ Avoid excessive forking

Use builtins where possible.


⚙️ Avoid long-running background loops

They consume CPU and complicate shutdown.


⚙️ Use wait -n (Bash 4.3+) for efficient worker pools
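
A minimal sketch of such a pool, assuming Bash 4.3 or newer. The task function, the marker files, and the limit of two concurrent jobs are all made up for the illustration.

```shell
max_jobs=2
outdir=$(mktemp -d)

# Hypothetical task: pretend to work, then leave a marker file.
task() {
  sleep 1
  : > "$outdir/task-$1.done"
}

running=0
for i in 1 2 3 4 5; do
  # At the limit: block until any one background job finishes.
  if [ "$running" -ge "$max_jobs" ]; then
    wait -n
    running=$((running - 1))
  fi
  task "$i" &
  running=$((running + 1))
done

wait    # collect the jobs that are still running
completed=$(ls "$outdir" | wc -l)
echo "completed tasks: $completed"
rm -rf "$outdir"
```

Compared with waiting for a whole batch to finish, wait -n frees a slot as soon as any single job ends, which keeps the pool busy when task durations vary.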


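🐳 Containers

🐳 Shell as PID 1

PID 1 has special semantics:

  • on Linux, signals whose default action would terminate it (such as SIGTERM and SIGINT) are ignored unless it installs a handler
  • must reap children, including orphans reparented to it
  • must forward signals to the processes it starts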
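
When a shell script does end up as the entrypoint, the signal-forwarding duty can be approximated as below. The helper name run_and_forward is hypothetical, sleep stands in for the application, and the sketch only covers a single direct child: it does not reap orphaned grandchildren, which is one reason a dedicated init remains the more robust option.

```shell
# run_and_forward CMD [ARGS...]: start CMD, relay TERM/INT to it, return its status.
run_and_forward() {
  got_signal=
  "$@" &
  child=$!

  # Relay termination requests to the child instead of dying alone.
  trap 'got_signal=1; kill -TERM "$child" 2>/dev/null' TERM INT

  wait "$child"
  status=$?

  # A trapped signal interrupts wait early; wait again for the child's real status.
  if [ -n "$got_signal" ]; then
    wait "$child"
    status=$?
  fi
  return "$status"
}

# Demo: run the helper in a subshell and stop it the way a container runtime would.
( run_and_forward sleep 30 ) &
wrapper=$!
sleep 1
kill -TERM "$wrapper"
wait "$wrapper"
demo_status=$?
echo "wrapper exited with status $demo_status"
```

In the demo the child is terminated by the forwarded SIGTERM, so the wrapper reports 143 and no sleep process is left behind.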

🐳 Use an init wrapper

ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["run.sh"]

🛰️ CI/CD

🛰️ Ensure deterministic shutdown

CI runners typically stop jobs with SIGTERM, often followed by SIGKILL after a grace period, so scripts must handle SIGTERM.


🛰️ Avoid background jobs unless necessary

They often outlive the job and cause hangs.


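🧪 Testing

# Show process, process-group, and session IDs for the current terminal
ps -o pid,pgid,sid,tty,comm

# List any zombies (process state Z)
( sleep 1 & )
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/'

# Check that a TERM trap fires
trap 'echo TERM; exit' TERM
sleep 100 &
kill -TERM $$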

🧠 Summary

Process control is the backbone of reliable shell scripting. Mastering:

  • process groups
  • signals
  • job control
  • subshells
  • reaping
  • PID 1 behavior

...is essential for writing safe, predictable, production-grade automation.