🔗 Extended Pipelines and Process Substitution

Pipelines chain commands. Process substitution treats command output as files. Together they enable powerful data flows.

🧭 Pipeline Fundamentals

command1 | command2 | command3

Data flows left to right:

  • stdout of command1 → stdin of command2
  • stdout of command2 → stdin of command3

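The pipeline's exit status is that of the last command. With set -o pipefail enabled, it is instead the status of the rightmost command that failed, or zero if every command succeeded.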
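As a rough illustration of the exit-status rule, the sketch below runs a failing pipeline with and without pipefail, and reads Bash's PIPESTATUS array for the per-stage codes. It assumes Bash; the variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Default: the pipeline's status is that of the last command (cat succeeds).
false | cat
default_status=$?

# PIPESTATUS (Bash-specific) holds the status of every stage of the last pipeline.
false | true
stage_statuses="${PIPESTATUS[*]}"

# With pipefail, a failure in any stage makes the whole pipeline fail.
set -o pipefail
false | cat
pipefail_status=$?
set +o pipefail

echo "default=$default_status stages=$stage_statuses pipefail=$pipefail_status"
```

Enabling pipefail near the top of a script is a common way to keep an early failure (a missing input file, for instance) from being masked by a later stage that succeeds.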


🧪 Advanced Pipeline Patterns

Tee: Branch Output

Save the stream to a file and pass it on to the rest of the pipeline:

command | tee output.log | further_processing

Append mode:

command | tee -a output.log | filter

Combining Stdout and Stderr

command 2>&1 | grep ERROR   # Both streams to grep
command |& grep ERROR       # Bash shorthand for 2>&1 |
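To see what merging the streams changes, the following sketch uses a hypothetical function, emit, that writes one line to each stream, and counts how many matching lines reach grep in each case.

```shell
#!/usr/bin/env bash
# emit is a stand-in for any command that writes to both streams.
emit() {
    echo "INFO: started"
    echo "ERROR: disk full" >&2
}

# Only stdout enters the pipe; stderr bypasses grep
# (it is discarded here to keep the demo quiet).
stdout_only=$( { emit | grep -c ERROR; } 2>/dev/null )

# 2>&1 before the pipe sends stderr into the pipe as well.
both=$(emit 2>&1 | grep -c ERROR)

echo "stdout only: $stdout_only match(es), both streams: $both match(es)"
```

The redirection has to come before the pipe symbol: it duplicates stderr onto stdout after stdout has already been connected to the pipe.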

Named Pipes (FIFOs)

Create a persistent communication channel:

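mkfifo mypipe

# Producer (runs in background). The redirection is on the whole loop so the
# FIFO is opened once; reopening it for every echo would give the reader
# end-of-file after the first line.
while true; do
    echo "data $(date)"
    sleep 1
done > mypipe &

# Consumer
while IFS= read -r line; do
    echo "Received: $line"
done < mypipe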

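A named pipe is a filesystem entry and persists until it is deleted (for example with rm). A reader sees end-of-file once every writer has closed its end.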
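A self-contained variant that terminates on its own might look like the sketch below. It assumes mktemp is available for a private working directory; the producer sends three lines and then closes its end, which gives the consumer end-of-file.

```shell
#!/usr/bin/env bash
# Create the FIFO in a private temporary directory and remove it on exit.
workdir=$(mktemp -d)
fifo="$workdir/mypipe"
mkfifo "$fifo"
trap 'rm -rf "$workdir"' EXIT

# Producer: the FIFO is opened once for the whole loop and closed when it ends.
for i in 1 2 3; do
    echo "data $i"
done > "$fifo" &

# Consumer: reads until every writer has closed the FIFO.
received=0
while IFS= read -r line; do
    echo "Received: $line"
    received=$((received + 1))
done < "$fifo"

wait    # reap the background producer
echo "Lines received: $received"
```

Opening a FIFO blocks until the other end is opened too, which is why the producer runs in the background here.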


🧠 Process Substitution

Syntax

command <(subcommand)
command >(subcommand)
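One way to see what the shell does with these two forms is to print the argument the outer command actually receives. The sketch below assumes Bash; the exact path is system-dependent (commonly an entry under /dev/fd).

```shell
#!/usr/bin/env bash
# <(...) expands to a path that the outer command can open for reading.
path=$(echo <(true))
echo "Substituted path: $path"

# The outer command reads the subcommand's output through that path.
line_count=$(wc -l < <(printf 'a\nb\nc\n'))

# >(...) expands to a path that the outer command can write to.
# Here tee copies its input to tr, which prints the uppercase text on stderr,
# while tee's own stdout is captured in the variable.
copy=$(printf 'hello\n' | tee >(tr 'a-z' 'A-Z' >&2))

echo "lines=$line_count copy=$copy"
```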

Practical Examples

Compare sorted outputs:

diff <(sort file1.txt) <(sort file2.txt)

Feed generated data:

while IFS= read -r line; do
    process "$line"
done < <(generate_data)
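The reason to prefer this form over piping into the loop is that, in Bash's default configuration, each stage of a pipeline runs in a subshell, so variables set inside a piped loop are lost when it ends. A minimal sketch, with generate_data as a hypothetical stand-in:

```shell
#!/usr/bin/env bash
# generate_data is a placeholder for any command that prints lines.
generate_data() { printf '%s\n' alpha beta gamma; }

# Piped loop: runs in a subshell, so the parent's counter is unchanged.
piped_count=0
generate_data | while IFS= read -r line; do
    piped_count=$((piped_count + 1))
done

# Process substitution: the loop runs in the current shell.
subst_count=0
while IFS= read -r line; do
    subst_count=$((subst_count + 1))
done < <(generate_data)

echo "piped=$piped_count substituted=$subst_count"
```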

Combine columns extracted by two concurrently running commands:

paste <(cut -d',' -f1 data.csv) <(cut -d',' -f2 data.csv)

🧪 Complex Pipeline Chains

Multi-Stage Processing

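grep "POST /api" access.log \
  | awk '{print $1, $4, $7}' \
  | sort \
  | uniq -c \
  | sort -nr \
  | head -10

Stages (the field numbers assume the Common or Combined Log Format):

  1. Filter POST requests to /api (grep reads the file directly, so cat is not needed)
  2. Extract IP, timestamp, endpoint
  3. Sort so that identical lines are adjacent, which uniq requires
  4. Count unique entries
  5. Sort by count, highest first
  6. Show the top 10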
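To try a chain like this without a real server log, the sketch below builds a small sample file and runs a variant of the pipeline over it. The log entries are invented for illustration and follow the Common Log Format. The variant leaves out the timestamp so that repeated requests from the same client collapse into one counted line.

```shell
#!/usr/bin/env bash
# Invented sample entries: IP, identity, user, [timestamp zone], "request", status, size.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "POST /api/login HTTP/1.1" 200 512
10.0.0.2 - - [01/Jan/2024:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 1024
10.0.0.1 - - [01/Jan/2024:10:00:02 +0000] "POST /api/login HTTP/1.1" 401 128
10.0.0.3 - - [01/Jan/2024:10:00:03 +0000] "POST /api/orders HTTP/1.1" 201 256
EOF

# Count POST /api requests per (IP, endpoint) pair and keep the most frequent one.
top=$(grep 'POST /api' "$sample_log" \
  | awk '{print $1, $7}' \
  | sort \
  | uniq -c \
  | sort -nr \
  | head -n 1)

rm -f "$sample_log"
echo "$top"
```

Building the chain one stage at a time and inspecting the intermediate output is usually the quickest way to debug it.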

Parallel Branches

{
    echo "Header"
    cat part1.txt
    cat part2.txt
} | sort | tee >(wc -l > count.txt) | gzip > archive.gz

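Explanation:

  • Merge the header and the two files into one stream
  • Sort the merged stream (the header line is sorted along with the data, so it does not necessarily stay first)
  • Branch 1: count lines → count.txt
  • Branch 2: compress → archive.gz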


🧠 Buffering Issues

Programs usually buffer their output when writing to a pipe, which affects when data reaches the next stage:

| Buffer type | Typical size | Flushed when |
| --- | --- | --- |
| Full | 4-8 KB | Buffer fills |
| Line | N/A | Newline encountered |
| None | 0 | Immediately |

Force line buffering:

stdbuf -oL command | while IFS= read -r line; do
    echo "Got: $line"
done

Or use unbuffer (from the expect package):

unbuffer command | process
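A small experiment can make the effect visible. The sketch below assumes GNU coreutils (for stdbuf) and uses a hypothetical slow_source function as the producer. With -oL, grep flushes after every line, so each "Got:" message appears as soon as the corresponding tick is produced rather than all at once when grep exits.

```shell
#!/usr/bin/env bash
# slow_source stands in for a long-running producer that prints a line now and then.
slow_source() {
    echo "tick 1"
    sleep 1
    echo "tick 2"
    sleep 1
    echo "tick 3"
}

# grep's stdout is a pipe, so it would normally be fully buffered;
# stdbuf -oL switches it to line buffering.
got=0
while IFS= read -r line; do
    echo "Got: $line"
    got=$((got + 1))
done < <(slow_source | stdbuf -oL grep tick)

echo "Lines seen: $got"
```

Removing stdbuf -oL from the last line of the loop should make all three messages arrive together after about two seconds. Note that stdbuf only affects programs that rely on the C library's default buffering.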

🧪 Performance Optimization

Minimize Pipeline Stages

Every stage of a pipeline runs in its own process:

# ❌ 3 processes
cat file | grep pattern | wc -l

# ✅ 1 process (faster)
grep -c pattern file

Use Built-ins When Possible

# ❌ Extra cat process, and the loop runs in a subshell
cat file | while read line; do echo "$line"; done

# ✅ Built-in redirection (no extra process, no subshell)
while IFS= read -r line; do echo "$line"; done < file

🧾 Portability Notes

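| Feature | POSIX sh | Bash | Zsh |
| --- | --- | --- | --- |
| Basic pipes (\|) | ✅ | ✅ | ✅ |
| Process substitution | ❌ | ✅ | ✅ |
| \|& (combined) | ❌ | ✅ (4.0+) | ✅ |
| Named pipes (mkfifo) | ✅ | ✅ | ✅ |
| stdbuf | ❌ | ✅ | ✅ |

stdbuf is an external command (part of GNU coreutils), not a shell feature. POSIX does not specify it, but it can be called from any shell on a system where it is installed.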

🧾 Summary

  • Pipelines chain commands efficiently.
  • Use tee to branch output.
  • Process substitution lets a command's input or output stand in for a file name.
  • Be aware of buffering; use stdbuf when needed.
  • Minimize subprocesses for performance.
  • Prefer built-ins over external commands.
