🧵 Execve & Fork Internals

🧠 Overview

This module goes deep into how a POSIX‑style shell actually creates and replaces processes:

fork() / vfork() / clone() (conceptually)
execve() and the exec family
file descriptor inheritance and CLOEXEC
PATH lookup and execve failures
shebang handling (#!)
pipelines and process graphs
how this all behaves in containers and CI

The goal: when you see cmd1 | cmd2, you should be able to mentally draw the process tree and FD graph.

🎓 Who this is for

DevOps/SRE debugging stuck pipelines, zombie leaks, or weird FD behavior.
Engineers writing entrypoints or process supervisors in shell.
People integrating shell with other runtimes (agents, runners, task executors).
Anyone who wants to understand what really happens between bash and the kernel.

You should already be comfortable with:

basic shell scripting
processes and PIDs
exit codes
redirections and pipelines

🧩 Role in the ecosystem

Exec/fork internals underpin:

Advanced Shell Architecture
Advanced Process Control
Subshells & Environment
Advanced Pipelines
container entrypoints and PID 1 behavior

If you don’t understand how processes are created and replaced, you’re guessing when debugging:

“Why doesn’t this env var show up?”
“Why is this FD still open?”
“Why does this pipeline hang?”
“Why does this script behave differently in CI vs locally?”

🧩 Internals / Mechanics

🧩 Fork: cloning the shell process

Conceptually:

pid_t pid = fork();
if (pid == 0) {
    // child
} else {
    // parent
}

In the shell:

Parent: continues the main loop, tracks jobs, waits.
Child: inherits:
memory (copy‑on‑write)
environment
open file descriptors
current directory
signal dispositions (fork copies them as-is; a later execve() resets caught signals to their defaults, while ignored signals stay ignored)

The child then typically:

sets up redirections (dup2, close)
adjusts process group / session if needed
calls execve() to replace itself with the target program

If execve() fails, the child usually prints an error and exits with a non‑zero status.
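The child-side sequence described above (redirect, then exec) can be approximated from shell itself. This is a minimal sketch, assuming a POSIX sh and an external echo on PATH; the file name out.txt and the variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch: fork a child, redirect its stdout, then exec the target program.
workdir=$(mktemp -d)

(
  # Child (a forked copy of the shell): point FD 1 at a file, like dup2().
  exec >"$workdir/out.txt"
  # Replace the child with the target program; it inherits the redirected FD 1.
  exec echo "hello from child"
) &
child_pid=$!

# Parent: keeps running, then waits for the child and collects its status.
wait "$child_pid"
child_status=$?
child_output=$(cat "$workdir/out.txt")
rm -rf "$workdir"

echo "child exited with $child_status and wrote: $child_output"
```

The parent never sees the redirection: it was applied only inside the forked child.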

🧩 Execve: replacing the process image

Conceptually:

execve("/usr/bin/ls", argv, envp);
// if we get here, execve failed
perror("execve");
_exit(127);

Key properties:

Same PID: execve() does not create a new process; it replaces the current one.
New code, same process: memory, code, stack, heap are replaced.
Environment: passed explicitly as envp (or inherited if using execvp/execlp wrappers).
File descriptors: remain open unless marked CLOEXEC.

This is why:

a process can exec another binary and keep sockets/pipes open.
PID‑based supervision still works across exec boundaries.
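A quick way to check the "same PID" property is to print the PID, exec a second shell, and print it again. This is a small sketch assuming sh is on PATH; the variable names are illustrative.

```shell
#!/bin/sh
# Sketch: the PID printed before and after exec belongs to the same process.
# The inner \$\$ is passed literally, so the second shell expands it itself.
pids=$(sh -c 'echo "$$"; exec sh -c "echo \$\$"')

pid_before=$(printf '%s\n' "$pids" | sed -n 1p)
pid_after=$(printf '%s\n' "$pids" | sed -n 2p)

echo "before exec: $pid_before, after exec: $pid_after"
```

Both values should match, because the second shell is a new program image running inside the original process.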

🧩 Exec family and PATH lookup

Common exec variants:

execve(path, argv, envp) — no PATH lookup, raw syscall.
execvp(file, argv) — uses PATH to search for file.
execlp(file, arg0, ..., NULL) — same, but varargs.

Shell behavior:

When you run ls, the shell:
searches PATH for ls
builds argv (["ls", ...])
builds envp from current environment
calls execve("/bin/ls", argv, envp) (via execvp‑like logic)

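If PATH lookup fails:

the shell prints "command not found"
the exit code is 127

If the file is found but cannot be executed (for example, it lacks execute permission), the exit code is 126.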
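The two failure statuses can be reproduced in a scratch directory. This is a sketch only: the tool name no-such-tool-for-this-demo and the file name not-executable are hypothetical, and it assumes the temporary directory allows creating files.

```shell
#!/bin/sh
# Sketch: distinguish "not found" (127) from "found but not executable" (126).
workdir=$(mktemp -d)

# 1. A name that exists nowhere on PATH.
sh -c 'no-such-tool-for-this-demo' 2>/dev/null
not_found_status=$?

# 2. A file that exists but has no execute permission.
printf '#!/bin/sh\necho hi\n' >"$workdir/not-executable"
chmod 644 "$workdir/not-executable"
sh -c '"$1"' sh "$workdir/not-executable" 2>/dev/null
not_executable_status=$?

rm -rf "$workdir"
echo "not found: $not_found_status, not executable: $not_executable_status"
```

Checking for 126 versus 127 in a wrapper script tells you whether to fix PATH or fix permissions.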

🧩 Shebang (`#!`) handling

When you run a script file:

./script.sh

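The kernel:

Reads the first line.
If it starts with #!, e.g.:

#!/usr/bin/env bash

It runs:

/usr/bin/env bash ./script.sh

(On Linux, everything after the interpreter path is passed as one single argument, so a shebang such as #!/usr/bin/env bash -e makes env look for a program literally named "bash -e".)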

Implications:

The interpreter (e.g. bash) is what actually runs the script.
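With #!/usr/bin/env, the PATH of the invoking process decides which interpreter is used.
If the shebang is missing, execve() fails with ENOEXEC and most shells fall back to interpreting the file themselves; if it names an interpreter that does not exist, execve() fails with ENOENT.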
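One way to see that the kernel hands the script path to the interpreter is to use cat as the interpreter: the script then prints itself. This is a sketch assuming /bin/cat exists and the temporary directory permits executing files; the file name self-print is made up.

```shell
#!/bin/sh
# Sketch: a script whose interpreter is cat prints its own contents,
# because the kernel effectively runs: /bin/cat <path-to-script>
workdir=$(mktemp -d)
script="$workdir/self-print"

printf '#!/bin/cat\nthis line is data, not code\n' >"$script"
chmod +x "$script"

shebang_output=$("$script")
script_contents=$(cat "$script")
rm -rf "$workdir"

printf '%s\n' "$shebang_output"
```

The output includes the shebang line itself, which shows that the interpreter receives the whole file and decides on its own what to do with the first line.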

🧩 File descriptors and CLOEXEC

When the shell forks:

The child inherits all open FDs from the parent (stdin, stdout, stderr, pipes, sockets, logs, etc.).
Before execve(), the child may:
dup2() FDs to 0, 1, 2 for redirections.
close FDs that should not be visible to the child.

CLOEXEC (FD_CLOEXEC) flag:

If set on an FD, the kernel automatically closes it on execve().
This prevents leaking internal FDs (e.g. listening sockets, control pipes) into child processes.

Architecturally:

Without CLOEXEC: every exec can accidentally inherit internal FDs → hangs, resource leaks, security issues.
With CLOEXEC: only explicitly passed FDs survive.
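Inheritance without CLOEXEC is easy to observe from a shell. This sketch assumes a shell such as bash or dash, where an FD opened with exec is not marked close-on-exec (some other shells may behave differently); FD number 3 and the file name parent.log are arbitrary choices.

```shell
#!/bin/sh
# Sketch: an FD opened by the parent survives fork + exec into the child.
workdir=$(mktemp -d)

# Open FD 3 in this shell.
exec 3>"$workdir/parent.log"

# The child is a separate program (fork + execve), yet FD 3 is still open in it.
sh -c 'echo "written by the child via inherited FD 3" >&3'

# Close our copy and read back what the child wrote.
exec 3>&-
inherited_line=$(cat "$workdir/parent.log")
rm -rf "$workdir"

echo "$inherited_line"
```

The child was never told about FD 3; it simply found it open.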

🧩 Pipelines: process and FD graph

For:

cmd1 | cmd2 | cmd3

The shell typically:

Creates two pipes: p1 (between cmd1 and cmd2), p2 (between cmd2 and cmd3).
Forks three children.
In each child:
cmd1:
- dup2(p1_write, STDOUT_FILENO)
- closes unused FDs
- execve(cmd1, ...)
cmd2:
- dup2(p1_read, STDIN_FILENO)
- dup2(p2_write, STDOUT_FILENO)
- closes unused FDs
- execve(cmd2, ...)
cmd3:
- dup2(p2_read, STDIN_FILENO)
- closes unused FDs
- execve(cmd3, ...)
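The wiring the shell does for a pipeline can be made visible by building a two-stage pipeline by hand with a named pipe. This is a sketch, not how any shell implements pipelines internally; the FIFO name p1 and the sample data are illustrative.

```shell
#!/bin/sh
# Sketch: hand-wire "printf | sort" with a named pipe to expose the FD graph.
workdir=$(mktemp -d)
fifo="$workdir/p1"
mkfifo "$fifo"

# "cmd1": its stdout is the write end of the pipe.
printf 'banana\napple\n' >"$fifo" &
writer_pid=$!

# "cmd2": its stdin is the read end. sort only emits output once it sees EOF,
# which happens when every writer has closed the pipe.
sorted=$(sort <"$fifo")

wait "$writer_pid"
rm -rf "$workdir"
printf '%s\n' "$sorted"
```

Each stage is its own process, and the only thing connecting them is the pair of FDs on either end of the pipe.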

If any process, including the parent shell, keeps a pipe write end open:

readers may never see EOF → pipeline hangs.

This is a classic source of “mysterious” hangs in complex scripts.

🧩 Subshells vs exec

Subshell:

( some commands )

Implemented via fork() (new process).
Runs a copy of the shell with the same environment and state snapshot.
Changes to variables, cd, etc. do not affect the parent.

Exec in the current shell:

exec some-command

No new process is created.
The current shell process is replaced by some-command.
Useful in:
PID 1 entrypoints
final step of a script where you don’t need the shell anymore
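The difference between the two forms can be checked with a few lines of shell. This is a minimal sketch; the variable names and the exit status 7 are arbitrary.

```shell
#!/bin/sh
# Sketch: subshell changes stay in the child; exec replaces whichever shell runs it.
counter=1
start_dir=$PWD

# Subshell: a forked copy. Its variable and directory changes are discarded.
( counter=2; cd / )
counter_after_subshell=$counter
dir_after_subshell=$PWD

# exec inside a subshell replaces only that subshell, so this script survives
# and sees the replacement program's exit status.
( exec sh -c 'exit 7' )
exec_status=$?

echo "counter is still $counter_after_subshell"
echo "directory is still $dir_after_subshell"
echo "exec'd program exited with $exec_status"
```

Wrapping exec in parentheses is a handy way to experiment with it without losing the shell you are working in.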

🔧 Techniques

🔧 Use `exec` in PID 1 entrypoints

In containers:

#!/bin/sh
# bad: the shell stays as PID 1 and the app runs as its child
# run-app "$@"

# better: the shell is replaced, so the app itself becomes PID 1
exec run-app "$@"

Benefits:

The app becomes PID 1.
Signals go directly to the app.
No extra shell process to manage.

If you need the shell as a supervisor, that’s a different pattern (and you must handle SIGCHLD, wait, etc.).
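The signal benefit can be demonstrated outside a container. In this sketch, app.sh and entrypoint.sh are hypothetical scripts, READY_FILE and APP are made-up environment variables, and exit status 42 is an arbitrary marker for "the app's own handler ran".

```shell
#!/bin/sh
# Sketch: with exec, the PID that receives SIGTERM *is* the app,
# so the app's own TERM handler runs.
workdir=$(mktemp -d)

# Hypothetical app: handles SIGTERM and exits with a recognizable status.
cat >"$workdir/app.sh" <<'EOF'
#!/bin/sh
sleep 30 &
sleep_pid=$!
trap 'kill "$sleep_pid" 2>/dev/null; exit 42' TERM
: >"$READY_FILE"   # signal that the handler is installed
wait "$sleep_pid"
EOF

# Entrypoint: the shell replaces itself with the app.
cat >"$workdir/entrypoint.sh" <<'EOF'
#!/bin/sh
exec sh "$APP"
EOF

READY_FILE="$workdir/ready" APP="$workdir/app.sh" sh "$workdir/entrypoint.sh" &
entry_pid=$!

# Wait (up to about 10 seconds) until the app reports that its trap is in place.
tries=0
while [ ! -e "$workdir/ready" ] && [ "$tries" -lt 10 ]; do
  sleep 1
  tries=$((tries + 1))
done

kill -TERM "$entry_pid"
wait "$entry_pid"
app_status=$?

rm -rf "$workdir"
echo "entrypoint PID exited with status $app_status"
```

Without the exec line in entrypoint.sh, the signal would reach the wrapper shell instead, and the app's handler would not run.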

🔧 Use CLOEXEC for internal FDs

In languages like Python/Go/Rust, set CLOEXEC on:

internal control pipes
listening sockets
log pipes

So that when you exec tools from your process, they don’t inherit those FDs.

In shell, you can’t set CLOEXEC directly, but you can close an FD for a single command with a redirection such as 3>&-. Assume that tools you call might leak FDs if they don’t use CLOEXEC.
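A shell-level substitute for CLOEXEC is to close the FD on the command that should not see it. This is a sketch under the assumption that FD 3 is the internal FD; the file name internal.log is made up.

```shell
#!/bin/sh
# Sketch: hide an internal FD from one command with N>&-.
workdir=$(mktemp -d)
exec 3>"$workdir/internal.log"   # internal FD the tool should not see

# 3>&- closes FD 3 in the child only, so its write attempt fails.
if sh -c 'echo leak >&3' 3>&- 2>/dev/null; then
  visible_when_closed=yes
else
  visible_when_closed=no
fi

# The parent's FD 3 is still open and usable.
echo "parent can still write" >&3
exec 3>&-
log_lines=$(wc -l <"$workdir/internal.log" | tr -d ' ')
rm -rf "$workdir"

echo "child saw FD 3: $visible_when_closed, lines in log: $log_lines"
```

This has to be repeated on every command that should not inherit the FD, which is why setting CLOEXEC at open time is preferable where the language allows it.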

🔧 Debug PATH and exec failures

When cmd fails with “not found”:

Check echo "$PATH".
Use type cmd or command -v cmd.
Use strace -f -e trace=execve sh script.sh to see what the shell is actually trying to exec.

🔧 Visualize process trees

Use:

ps f
pstree -p

to see:

which process exec’d what
which PIDs are still shells
where your app actually lives in the tree

⚠️ Pitfalls

⚠️ Shell as a supervisor without understanding exec/fork

Using shell as a long‑running supervisor:

while true; do
  run-worker
  sleep 1
done

…without:

wait for children
proper signal handling
understanding FD inheritance

…leads to:

zombie accumulation
stuck FDs
broken shutdown
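If a shell loop really has to supervise a worker, it needs at least a trap and an explicit wait. The following is a rough sketch, not a production supervisor: supervise is a hypothetical function name, the worker command is passed as its arguments, and it deliberately ignores the worker's exit status.

```shell
#!/bin/sh
# Sketch: a restart loop that reaps its worker and forwards SIGTERM/SIGINT.
supervise() {
  stop_requested=0
  worker_pid=
  trap 'stop_requested=1; [ -n "$worker_pid" ] && kill -TERM "$worker_pid" 2>/dev/null' TERM INT

  while [ "$stop_requested" -eq 0 ]; do
    "$@" &
    worker_pid=$!
    # Cover the small window where the signal arrived before worker_pid was set.
    [ "$stop_requested" -eq 1 ] && kill -TERM "$worker_pid" 2>/dev/null

    # wait returns early when a trapped signal arrives, so keep waiting
    # until the worker has really exited and been reaped (no zombie left).
    while kill -0 "$worker_pid" 2>/dev/null; do
      wait "$worker_pid"
    done

    # Pause before restarting, unless a shutdown was requested.
    [ "$stop_requested" -eq 0 ] && sleep 1
  done
  return 0
}

# Usage (hypothetical worker command):
#   supervise run-worker
```

Running the worker in the background and blocking in wait is what lets the trap fire promptly; a foreground worker would delay the handler until the worker exits on its own.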

⚠️ Leaking FDs into children

If a parent process:

opens a socket or pipe
then execs tools without CLOEXEC

…those tools may:

keep FDs open
prevent EOF on pipes
keep ports bound
cause “address already in use” or hangs

⚠️ Misusing `exec` in the middle of scripts

echo "starting"
exec some-command
echo "this will never run"

After a successful exec, the shell is gone. Anything after it is dead code.

⚠️ PATH‑dependent behavior

Scripts that rely on:

PATH containing specific directories
env resolving to a specific binary
bash being at /bin/bash

…behave differently across:

distros
containers
CI runners

🚨 Real‑world failures

🚨 Failure: CI job hangs due to inherited FD

Scenario:

A test runner opens a pipe/socket.
It then spawns (fork + exec) a child process that runs tests.
The child inherits the write end of the pipe and never closes it.
The parent waits for EOF on the pipe; EOF never comes while a write end is still open, so the CI job hangs.

Root cause:

No CLOEXEC on internal FDs.
No explicit FD management before exec.
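A scaled-down version of this hang fits in a few lines: a background process that inherits the pipe delays EOF for as long as it lives. This is a sketch that assumes date +%s is available; the 3-second sleep stands in for a long-lived child.

```shell
#!/bin/sh
# Sketch: a background process that inherits a pipe's write end delays EOF.

# Leaky: the background sleep inherits stdout (the pipe), so cat cannot see
# EOF until sleep exits, about 3 seconds later.
start=$(date +%s)
sh -c 'sleep 3 & echo started' | cat >/dev/null
leaky_seconds=$(( $(date +%s) - start ))

# Fixed: the background sleep's stdout and stderr are redirected away from
# the pipe, so cat sees EOF as soon as the foreground shell exits.
start=$(date +%s)
sh -c 'sleep 3 >/dev/null 2>&1 & echo started' | cat >/dev/null
fixed_seconds=$(( $(date +%s) - start ))

echo "leaky: ${leaky_seconds}s, fixed: ${fixed_seconds}s"
```

In a real CI job the stand-in sleep is a daemon or watcher that never exits, which turns the delay into a permanent hang.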

🚨 Failure: Container doesn’t stop on SIGTERM

Scenario:

CMD ["sh", "-c", "run-app.sh"]

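sh is PID 1.
run-app.sh is a child.
sh doesn’t forward signals to its children.
docker stop sends SIGTERM to PID 1. The kernel delivers a signal to PID 1 only if PID 1 has installed a handler for it, so a shell without a TERM trap ignores it; the app never sees the signal and is killed with SIGKILL once the stop timeout expires.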

Fix:

Use exec in the entrypoint:

exec run-app.sh

Or use a minimal init (tini, dumb-init).

🚨 Failure: “Command not found” only in CI

Scenario:

Locally: PATH includes /usr/local/bin, CI: doesn’t.
Script calls my-tool assuming it’s globally available.
In CI, execvp can’t find it → command not found.

Fix:

Validate tools explicitly at the top:

command -v my-tool >/dev/null 2>&1 || {
  echo "my-tool is required" >&2
  exit 1
}

Or use absolute paths.
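The local-versus-CI difference can be reproduced by running the same script under two PATH values. In this sketch, my-tool is a stand-in script created in a scratch directory and job.sh is a hypothetical job script; it assumes no real my-tool is installed on the machine.

```shell
#!/bin/sh
# Sketch: the same script succeeds or fails depending only on PATH.
workdir=$(mktemp -d)
mkdir "$workdir/bin"

# Stand-in tool, installed in a directory that only some environments list.
printf '#!/bin/sh\necho "my-tool ran"\n' >"$workdir/bin/my-tool"
chmod +x "$workdir/bin/my-tool"

# The script under test: validate first, then call the tool.
cat >"$workdir/job.sh" <<'EOF'
command -v my-tool >/dev/null 2>&1 || {
  echo "my-tool is required" >&2
  exit 1
}
my-tool
EOF

# "Local" environment: PATH includes the tool's directory.
PATH="$workdir/bin:$PATH" sh "$workdir/job.sh" >/dev/null 2>&1
local_status=$?

# "CI" environment: PATH does not include it.
sh "$workdir/job.sh" >/dev/null 2>&1
ci_status=$?

rm -rf "$workdir"
echo "local: $local_status, ci: $ci_status"
```

Failing early with a clear message is easier to diagnose than a bare "command not found" halfway through a job.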

🛠️ Patterns

🛠️ Pattern: Final `exec` in entrypoints

#!/bin/sh
set -e

# setup, env, migrations, etc.
prepare_app

# replace shell with the app
exec "$@"

No extra shell process.
Clean signal behavior.
Predictable shutdown.
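The pattern can be exercised without a container by passing a command to the entrypoint. This is a sketch: entrypoint.sh is written to a scratch directory, and APP_MODE is a made-up variable standing in for real setup work.

```shell
#!/bin/sh
# Sketch: an entrypoint that prepares the environment, then becomes the command.
workdir=$(mktemp -d)

cat >"$workdir/entrypoint.sh" <<'EOF'
#!/bin/sh
set -e
# Stand-in for real setup work (migrations, config rendering, ...).
export APP_MODE=production
# Replace the shell with whatever command was passed as arguments.
exec "$@"
EOF

# The command sees the prepared environment, and its exit status is what
# the caller of the entrypoint observes.
entry_output=$(sh "$workdir/entrypoint.sh" sh -c 'echo "mode=$APP_MODE"; exit 5')
entry_status=$?

rm -rf "$workdir"
echo "output: $entry_output, status: $entry_status"
```

Using "$@" keeps the entrypoint generic: the image's CMD or the command-line arguments decide what finally runs.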

🛠️ Pattern: Explicit process graph thinking

When designing:

cmd1 | cmd2 | cmd3

ask:

How many processes?
Who owns which FDs?
Who closes which ends of which pipes?
What happens on SIGINT?

This prevents “mysterious” hangs and partial shutdowns.

🛠️ Pattern: Use `exec` in small wrappers

Instead of:

#!/bin/sh
my-real-binary "$@"

use:

#!/bin/sh
exec my-real-binary "$@"

So that:

there’s no extra shell layer
PID, signals, and exit codes map directly to the real binary

❌ Anti‑patterns

using shell as a complex, long‑running supervisor without understanding fork/exec
relying on PATH and shebangs without validation
ignoring FD inheritance and CLOEXEC
sprinkling exec randomly in the middle of scripts
assuming “PID 1 is just another process”

🔍 Debugging

🔍 Trace exec/fork with `strace`

strace -f -e trace=process sh script.sh

You’ll see:

fork() / clone() calls
execve() calls
which binaries are actually executed
which paths are tried (when the caller searches PATH by attempting execve() on each candidate)

🔍 Inspect open FDs

Inside a shell on Linux ($$ expands to the shell’s own PID; substitute another PID to inspect a different process):

ls -l /proc/$$/fd

You’ll see:

which FDs are open
which pipes/sockets/files are still alive

This is invaluable for debugging hangs and leaks.

🧠 Summary

Execve & fork internals are the mechanical heart of shell execution:

fork() clones the shell.
execve() replaces the child with the target program.
FDs are inherited unless CLOEXEC is used.
PATH and shebangs decide what actually runs.
Pipelines are just process graphs + FD wiring.

Once you can mentally simulate fork/exec and FD inheritance, you stop guessing and start designing process behavior—especially in containers, CI, and production automation.
