🧾 Extended Text Processing

Efficient text parsing is essential for log analysis, config management, and data transformation.

🧭 Core Tools

| Tool | Purpose | Portability |
|------|---------|-------------|
| grep | Pattern matching | ✅ POSIX |
| sed | Stream editor | ✅ POSIX |
| awk | Column-based processing | ✅ POSIX |
| cut | Extract columns | ✅ POSIX |
| sort | Sort lines | ✅ POSIX |
| uniq | Collapse adjacent duplicate lines | ✅ POSIX |
| tr | Character translation | ✅ POSIX |
| jq | JSON processing | ❌ External |

🧪 grep: Pattern Matching

Basic Usage

grep ERROR logfile.txt
grep -i warning messages.log      # Case-insensitive
grep -v DEBUG logfile.txt         # Invert match (exclude DEBUG)
grep -E "error|warning" log.txt   # Extended regex

Useful Flags

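| Flag | Purpose |
|------|---------|
| -i | Case-insensitive |
| -v | Invert match |
| -n | Show line numbers |
| -c | Count matching lines |
| -l | List files with matches |
| -r | Recursive search |
| -E | Extended regex |
| -o | Show only the matching part |
| -A N | Show N lines after each match |
| -B N | Show N lines before each match |
| -C N | Show N lines of context before and after |

The flags -r, -o, -A, -B, and -C are widely supported GNU/BSD extensions rather than part of the long-standing POSIX option set.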
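
To make the difference between a few of these flags concrete, here is a small sketch. The log lines and variable names are invented for illustration, and `-o` is assumed to be available (it is in GNU and BSD grep).

```shell
# Sample input; these log lines are made up for this illustration.
log='INFO start
ERROR disk full
error: retry failed
INFO done'

# -c counts matching lines (matching is case-sensitive by default).
count=$(printf '%s\n' "$log" | grep -c 'ERROR')

# -i ignores case, -n prefixes each match with its line number.
numbered=$(printf '%s\n' "$log" | grep -in 'error')

# -o prints only the matched text instead of the whole line.
only=$(printf '%s\n' "$log" | grep -io 'error')

printf '%s\n' "$count" "$numbered" "$only"
```

Here `-c` reports 1 because only one line contains the uppercase string, while the case-insensitive searches match two lines.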

Practical Examples

Count lines containing ERROR in each file:

grep -c ERROR *.log

Show context around matches:

grep -C 3 "Exception" app.log

Multiple patterns:

grep -E "ERROR|WARNING|CRITICAL" system.log

🧪 sed: Stream Editor

Basic Substitution

sed 's/old/new/' file.txt           # First occurrence per line
sed 's/old/new/g' file.txt          # All occurrences
sed 's/old/new/2' file.txt          # Second occurrence only
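
The three substitution forms are easiest to compare on a line that contains the pattern several times. A minimal sketch, using an invented input string:

```shell
line='old old old'   # made-up input with three occurrences

first=$(printf '%s\n' "$line" | sed 's/old/new/')     # replaces the first occurrence
all=$(printf '%s\n' "$line" | sed 's/old/new/g')      # replaces every occurrence
second=$(printf '%s\n' "$line" | sed 's/old/new/2')   # replaces only the second one

printf '%s\n' "$first" "$all" "$second"
# Prints:
#   new old old
#   new new new
#   old new old
```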

In-Place Editing

sed -i 's/old/new/g' file.txt       # Modify file directly (GNU sed; BSD/macOS sed needs -i '')
sed -i.bak 's/old/new/g' file.txt   # Keep a backup as file.txt.bak (works with GNU and BSD sed)
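
Because `-i` is not specified by POSIX, one portable alternative is to write the result to a temporary file and then move it over the original. The sketch below assumes `mktemp -d` is available (common, but also not POSIX); the file name `settings.conf` is hypothetical.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
file="$workdir/settings.conf"      # hypothetical file name
printf 'mode=old\nname=demo\n' > "$file"

# Write the edited copy to a temporary file, then replace the original
# only if sed succeeded.
tmp="$file.tmp"
if sed 's/old/new/g' "$file" > "$tmp"; then
    mv "$tmp" "$file"
else
    rm -f "$tmp"
fi

result=$(cat "$file")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Note that `mv` replaces the file with a new one, so ownership and permissions may differ from the original; copy the contents back instead if that matters.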

Delete Lines

sed '/pattern/d' file.txt           # Delete matching lines
sed '1,10d' file.txt                # Delete lines 1-10
sed '$d' file.txt                   # Delete last line

Insert and Append

sed '1i\Header line' file.txt       # Insert before line 1 (GNU one-line form)
sed '$a\Footer line' file.txt       # Append after last line (GNU one-line form)
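
For sed implementations that do not accept the text on the same line as the command, the POSIX form puts the text on its own line after `i\` or `a\`. A minimal sketch with invented input:

```shell
body=$(printf 'line one\nline two\n')   # made-up input

# POSIX form: a backslash and newline follow the command, then the text.
framed=$(printf '%s\n' "$body" | sed -e '1i\
Header line' -e '$a\
Footer line')

printf '%s\n' "$framed"
```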

🧪 awk: Column Processing

awk '{print $1, $3}' data.txt       # Print columns 1 and 3
awk -F',' '{print $2}' csv.csv      # Comma delimiter (does not handle quoted fields)

Filtering

awk -F',' '$3 > 100' sales.csv      # Rows where col 3 > 100 (comma-delimited input)
awk '/ERROR/ {print $0}' log.txt    # Lines containing ERROR
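
Real CSV files often start with a header row, which a plain numeric filter may handle in surprising ways. The sketch below skips the header with `NR > 1`; the column names and values are invented.

```shell
# Made-up CSV with a header row.
csv='region,rep,amount
north,ann,250
south,bob,80
east,cy,120'

# NR > 1 skips the header; $3 + 0 forces a numeric comparison.
big=$(printf '%s\n' "$csv" | awk -F',' 'NR > 1 && $3 + 0 > 100 {print $2, $3}')

printf '%s\n' "$big"
# Prints:
#   ann 250
#   cy 120
```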

Calculations

awk '{sum += $1} END {print "Total:", sum + 0}' numbers.txt   # sum + 0 prints 0 for empty input
awk 'END {print "Lines:", NR}' file.txt                       # NR holds the number of lines read
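
The same accumulate-then-print pattern extends to per-key totals with an awk array. A small sketch on invented data; the output is piped through `sort` because the order of `for (k in ...)` is unspecified.

```shell
# Made-up input: a key and an amount per line.
sales='north 250
south 80
north 50
south 20'

totals=$(printf '%s\n' "$sales" \
  | awk '{sum[$1] += $2} END {for (k in sum) print k, sum[k]}' \
  | sort)

printf '%s\n' "$totals"
# Prints:
#   north 300
#   south 100
```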

Formatting

awk '{printf "%-10s %5d\n", $1, $2}' data.txt
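
To see what the width specifiers do, the sketch below adds `|` markers around the fields so the padding is visible. The input rows are invented.

```shell
# Made-up input: a name and a number per line.
rows='alice 7
bob 1234'

# %-10s left-aligns in 10 columns; %5d right-aligns in 5 columns.
table=$(printf '%s\n' "$rows" | awk '{printf "%-10s|%5d|\n", $1, $2}')

printf '%s\n' "$table"
# Prints:
#   alice     |    7|
#   bob       | 1234|
```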

🧪 cut: Simple Column Extraction

cut -d',' -f1,3 data.csv            # Columns 1 and 3
cut -c1-10 file.txt                 # Characters 1-10
cut -f2 -d' ' names.txt             # Second field (space delimiter)
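
One caveat worth knowing: `cut` treats every delimiter as a field boundary, so repeated spaces produce empty fields, whereas awk's default splitting collapses runs of whitespace. A short sketch with an invented input line:

```shell
line='alice  smith'   # made-up input with two spaces between the names

with_cut=$(printf '%s\n' "$line" | cut -d' ' -f2)     # empty: field 2 sits between the two spaces
with_awk=$(printf '%s\n' "$line" | awk '{print $2}')  # smith

printf 'cut: [%s]\nawk: [%s]\n' "$with_cut" "$with_awk"
```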

🧠 Combining Tools

Classic Pipeline

grep "POST /api" access.log \
  | awk '{print $1}' \
  | sort \
  | uniq -c \
  | sort -nr \
  | head -n 10

This lists the 10 client IPs with the most POST /api requests, assuming the IP address is the first field of each log line.
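
The pipeline can be tried without a real log file. The sketch below feeds it a few invented lines in the common access-log layout and adds a final `awk` step (an addition for this example) to strip the padding that `uniq -c` puts before the count.

```shell
# Made-up access-log lines; the IP address is the first field.
log='10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "POST /api/login HTTP/1.1" 200 12
10.0.0.2 - - [01/Jan/2024:00:00:02 +0000] "GET /index.html HTTP/1.1" 200 512
10.0.0.1 - - [01/Jan/2024:00:00:03 +0000] "POST /api/items HTTP/1.1" 201 34
10.0.0.3 - - [01/Jan/2024:00:00:04 +0000] "POST /api/items HTTP/1.1" 500 0'

top=$(printf '%s\n' "$log" \
  | grep 'POST /api' \
  | awk '{print $1}' \
  | sort \
  | uniq -c \
  | sort -nr \
  | awk '{print $2, $1}')   # reorder to "ip count" and drop the padding

printf '%s\n' "$top"
# Prints:
#   10.0.0.1 2
#   10.0.0.3 1
```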

Log Analysis Example

# Count status codes (field 9 in the common/combined log format)
awk '{print $9}' access.log | sort | uniq -c | sort -nr

# Find slowest requests (assumes the last field is the request time in seconds)
awk '$NF > 1.0 {print $NF, $0}' access.log | sort -nr
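
Both ideas can be checked on a few invented lines. This sketch assumes a log layout where the status code is field 9, the request path is field 7, and a request time in seconds has been appended as the last field; real formats vary, so adjust the field numbers to match.

```shell
# Made-up log lines with the request time appended as the last field.
log='10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET /a HTTP/1.1" 200 12 0.250
10.0.0.2 - - [01/Jan/2024:00:00:02 +0000] "GET /b HTTP/1.1" 500 0 2.500
10.0.0.3 - - [01/Jan/2024:00:00:03 +0000] "GET /c HTTP/1.1" 200 99 1.750'

# Requests per status code, counted inside awk.
codes=$(printf '%s\n' "$log" \
  | awk '{n[$9]++} END {for (c in n) print c, n[c]}' \
  | sort)

# Requests slower than one second, slowest first (time, then path).
slow=$(printf '%s\n' "$log" | awk '$NF > 1.0 {print $NF, $7}' | sort -nr)

printf '%s\n' "$codes" "$slow"
```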

🧪 jq: JSON Processing

Basic Usage

echo '{"name":"Alice","age":30}' | jq '.name'
# Output: "Alice"

echo '{"users":[{"name":"Bob"},{"name":"Carol"}]}' | jq '.users[].name'
# Output: "Bob" and "Carol", each on its own line

Filtering

jq '.[] | select(.age > 18)' users.json
jq '.users[] | select(.active == true)' data.json
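
When the filtered values feed into other shell tools, the `-r` option is useful because it prints raw strings without the surrounding JSON quotes. A minimal sketch with an invented document; since jq is an external tool, the example first checks that it is installed.

```shell
# Made-up JSON document.
json='{"users":[{"name":"Bob","active":true},{"name":"Carol","active":false}]}'

if command -v jq >/dev/null 2>&1; then
    # select(.active) keeps users whose "active" field is true;
    # -r prints the name without quotes.
    active=$(printf '%s\n' "$json" | jq -r '.users[] | select(.active) | .name')
    printf '%s\n' "$active"
else
    echo "jq is not installed" >&2
fi
```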

Formatting

jq -c '.' messy.json               # Compact output
jq '{name, age}' user.json         # Select fields

🧾 Performance Tips

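| Pattern | Performance | Notes |
|---------|-------------|-------|
| grep \| sed \| awk | ⚠️ Slow | Spawns one process per stage |
| awk alone | ✅ Fast | Single process |
| grep -o | ✅ Fast | Built-in extraction, no second tool needed |
| sed -i | ⚠️ Slow | Writes a new copy of the whole file |
| jq | ✅ Fast | Optimized for JSON |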
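
As an illustration of the first two rows, the sketch below computes the same per-component error counts twice: once with a multi-stage pipeline and once with a single awk program. The log lines are invented, and the example shows only that the results match; it does not measure speed.

```shell
# Made-up log lines: level, component, message.
log='ERROR db timeout
INFO web ok
ERROR web refused
ERROR db timeout'

# Five processes: grep, awk, sort, uniq, awk.
piped=$(printf '%s\n' "$log" \
  | grep 'ERROR' | awk '{print $2}' | sort | uniq -c | awk '{print $2, $1}')

# One awk does the filtering and counting; sort only orders the output.
single=$(printf '%s\n' "$log" \
  | awk '/ERROR/ {n[$2]++} END {for (k in n) print k, n[k]}' | sort)

printf '%s\n' "$piped"
# Prints:
#   db 2
#   web 1
```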

🧾 Summary

  • Master grep, sed, awk, cut, sort, uniq.
  • Combine tools in pipelines for complex tasks.
  • Use awk when possible: one awk process is usually faster than a chain of several tools.
  • jq is essential for JSON processing.
  • Always quote variables to prevent word splitting.
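
The last point can be demonstrated with a value that contains a space. In this sketch the file name and the helper function `count_args` are hypothetical, and the shell is assumed to use the default `IFS`.

```shell
file='my report.txt'   # hypothetical name containing a space

# Helper that reports how many arguments it received.
count_args() { echo "$#"; }

unquoted=$(count_args $file)     # split into "my" and "report.txt"
quoted=$(count_args "$file")     # passed as a single argument

printf 'unquoted: %s\nquoted: %s\n' "$unquoted" "$quoted"
# Prints:
#   unquoted: 2
#   quoted: 1
```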

👉 Continue to: Portability Patterns