
🆚 bpftrace vs strace

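This page compares bpftrace and strace for system call tracing and debugging. Both tools are valuable, but they serve different purposes: strace follows a process through ptrace and prints each syscall it makes, while bpftrace attaches eBPF programs to kernel events and can observe the whole system.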


🎯 Overview Comparison

Aspect          strace                                    bpftrace
Purpose         System call tracing                       eBPF-based observability
Performance     High overhead (ptrace stops the tracee)   Low overhead (probes run in the kernel)
Scope           Single process, plus children with -f     System-wide
Safety          Safe userspace                            Kernel-verified
Flexibility     Limited                                   Highly flexible
Learning Curve  Low                                       Moderate

🚀 Performance Impact

strace Overhead

# strace has significant performance impact: every traced syscall stops the process
time ls  # Baseline, e.g. ~0.01s
time strace -o /dev/null ls  # With strace, e.g. ~0.1s (about 10x slower here; varies by workload)

# Following child processes with -f is even more expensive
strace -f -o trace.log ./complex_application  # Syscall-heavy programs can slow down by 100x or more

bpftrace Overhead

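# bpftrace runs its probes inside the kernel, so the traced process is never stopped.
# The probe is on openat because current libc versions implement open() with it.
# Note that the elapsed time below also includes bpftrace's own startup.
time ls  # Baseline, e.g. ~0.01s
time sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { }' -c 'ls'

# System-wide tracing with low overhead: start this in one terminal (Ctrl-C prints @count)
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { @count = count(); }'
# Run applications normally in another terminal; the per-event cost is small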
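The timings quoted above are illustrative and depend heavily on the workload, so it is worth measuring on your own system. The following is a minimal sketch of two helper functions for that; `elapsed_ns` and `slowdown` are hypothetical names introduced here, and the sketch assumes GNU `date` (for the `%N` nanosecond format).

```shell
#!/bin/sh
# elapsed_ns CMD [ARGS...]: run CMD once, discard its output, and print the
# wall-clock time it took in nanoseconds. Assumes GNU date (%N support).
elapsed_ns() (
  start=$(date +%s%N)
  "$@" >/dev/null 2>&1 || true   # the exit status of CMD is ignored on purpose
  end=$(date +%s%N)
  echo $((end - start))
)

# slowdown BASE_NS TRACED_NS: print TRACED_NS / BASE_NS with one decimal place.
slowdown() {
  LC_ALL=C awk -v base="$1" -v traced="$2" 'BEGIN { printf "%.1f\n", traced / base }'
}

# Example (not run here):
#   base=$(elapsed_ns ls)
#   traced=$(elapsed_ns strace -o /dev/null ls)
#   slowdown "$base" "$traced"
```

A single run is noisy for a command as short as `ls`; repeating the measurement several times and comparing medians gives a more trustworthy ratio.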

🔧 Usage Patterns

strace: Process-Centric Debugging

# Debug a specific process
strace -o trace.log ./my_application

# Trace only specific syscalls (current libc versions implement open() with openat)
strace -e trace=openat,read,write ./my_application

# Follow child processes
strace -f ./my_application

# Count syscalls
strace -c ./my_application

bpftrace: System-Wide Observability

# Monitor all processes opening files
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%-6d %-16s %s\n", pid, comm, str(args->filename)); }'

# Count syscalls system-wide for 10 seconds (maps are printed automatically on exit)
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_* { @calls[probe] = count(); } interval:s:10 { exit(); }'

# Custom aggregations and filtering
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat /pid == 1234/ { @files[str(args->filename)] = count(); } interval:s:30 { exit(); }'
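Which open-family tracepoints exist depends on the kernel and architecture, and on many current systems libc issues `openat` rather than `open`, so a probe on `sys_enter_open` alone may see few or no events. The sketch below lists the probes that are actually present. The function name `open_probes` is hypothetical, and the default tracefs path is an assumption (some systems mount it at `/sys/kernel/debug/tracing` instead, and reading it usually requires root).

```shell
#!/bin/sh
# open_probes [EVENTS_DIR]: print one bpftrace probe name for each open-family
# syscall tracepoint that exists under EVENTS_DIR.
# The default path is an assumption about where tracefs is mounted.
open_probes() (
  dir=${1:-/sys/kernel/tracing/events/syscalls}
  for name in sys_enter_open sys_enter_openat sys_enter_openat2; do
    if [ -d "$dir/$name" ]; then
      echo "tracepoint:syscalls:$name"
    fi
  done
)
```

bpftrace can also list matching probes itself with `sudo bpftrace -l 'tracepoint:syscalls:sys_enter_open*'`, which avoids any assumption about the tracefs mount point.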

📊 Feature Comparison

Filtering Capabilities

# strace filtering (simple, fixed criteria)
strace -p 1234  # By PID
strace -e trace=openat ./app  # By syscall

# bpftrace filtering (arbitrary predicates on process name, arguments, and more)
sudo bpftrace -e '
// 0x40 is O_CREAT on x86 and most other architectures
tracepoint:syscalls:sys_enter_openat
/comm == "nginx" && (args->flags & 0x40)/
{
  printf("NGINX creating file: %s\n", str(args->filename));
}'

Data Aggregation

# strace: aggregation is limited to the -c summary (calls, errors, and time per syscall)
strace -c ./app

# bpftrace: Rich aggregation (histograms and per-key sums)
sudo bpftrace -e '
tracepoint:syscalls:sys_exit_read
/args->ret > 0/
{
  @read_sizes = hist(args->ret);
  @read_by_process[comm] = sum(args->ret);
}
interval:s:30
{
  // Both maps are printed automatically when the program exits
  exit();
}'
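With strace, anything beyond the `-c` summary has to be aggregated after the fact from the trace text. As a rough sketch of that approach, the hypothetical helper `sum_read_bytes` below totals the bytes returned by successful `read()` calls in a saved trace. It assumes the default line format ending in `= N`, so a trace taken without `-T`.

```shell
#!/bin/sh
# sum_read_bytes [LOG]: sum the return values of successful read() calls found
# in strace output read from LOG (or from stdin when no file is given).
# Assumes each line ends in "= N", i.e. the trace was taken without -T.
sum_read_bytes() {
  awk '/(^|[^a-z0-9_])read\(.*\) = [0-9]+$/ { total += $NF }
       END { print total + 0 }' "$@"
}

# Example (not run here):
#   strace -f -e trace=read -o trace.log ./app
#   sum_read_bytes trace.log
```

Failed calls (`= -1 EAGAIN ...`) and other syscalls such as `pread64` are skipped by the pattern. This works, but it parses text after the run, whereas the bpftrace maps above are aggregated in the kernel while the workload is running.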

Custom Metrics

# strace: Fixed output format
strace -tt -T ./app  # Timestamps and durations

# bpftrace: Custom output and calculations
sudo bpftrace -e '
tracepoint:syscalls:sys_enter_openat
{
  @start[tid] = nsecs;
  @fname[tid] = args->filename;  // save the pointer; the exit probe only has the return value
}
tracepoint:syscalls:sys_exit_openat
/@start[tid]/
{
  $duration_us = (nsecs - @start[tid]) / 1000;
  if ($duration_us > 1000) {  // > 1ms
    // bpftrace has no floating point, so report whole microseconds
    printf("Slow open: %s (%d us)\n", str(@fname[tid]), $duration_us);
  }
  delete(@start[tid]);
  delete(@fname[tid]);
}'

🎯 Use Case Scenarios

When to Use strace

# 1. Quick debugging of a single process
strace ./failing_application

# 2. Detailed syscall sequence analysis
strace -tt -T -v ./application

# 3. Attaching to an already running process (quote the list: there may be several PIDs)
sudo strace -p "$(pidof nginx)"

# 4. Educational purposes (easy to understand output)
strace ls -la

When to Use bpftrace

# 1. Production performance monitoring
sudo bpftrace -e 'profile:hz:99 { @cpu[comm] = count(); } interval:s:60 { exit(); }'

# 2. System-wide anomaly detection (strcontains() requires a recent bpftrace release)
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { if (strcontains(str(args->filename), ".ssh/")) { printf("SSH access: %s by %s\n", str(args->filename), comm); } }'

# 3. Custom metrics collection (arg2 is the size argument of tcp_sendmsg)
sudo bpftrace -e 'kprobe:tcp_sendmsg { @network_bytes[comm] = sum(arg2); } interval:s:10 { print(@network_bytes); clear(@network_bytes); }'

# 4. Periodic snapshots for trend analysis (each hourly dump is preceded by a timestamp)
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_* { @hourly[probe] = count(); } interval:s:3600 { time("%Y-%m-%d %H:%M\n"); print(@hourly); clear(@hourly); }'

🔒 Security and Safety

strace Safety

# Safe for userspace tracing
strace ./my_app  # No special privileges needed to launch and trace your own program

# Attaching to a running process is subject to ptrace permission checks
sudo strace -p 1234  # Needs the CAP_SYS_PTRACE capability for processes you do not own

bpftrace Safety

# Requires root privileges
sudo bpftrace script.bt  # Loads eBPF programs into the kernel

# The kernel verifier checks each program before it is loaded:
# - No infinite loops
# - No unsafe memory access
# - Bounded execution time
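Whether `strace -p` can attach without root depends on the ptrace policy of the system. On kernels with the Yama security module, that policy is exposed in `/proc/sys/kernel/yama/ptrace_scope`. The sketch below translates its value into a short explanation; `ptrace_scope_hint` is a hypothetical helper name, and the optional file argument exists only so the function can be tried against a sample file.

```shell
#!/bin/sh
# ptrace_scope_hint [FILE]: explain the Yama ptrace_scope value stored in FILE.
# FILE defaults to the real procfs entry; passing another path is a testing aid.
ptrace_scope_hint() {
  scope_file=${1:-/proc/sys/kernel/yama/ptrace_scope}
  if [ ! -r "$scope_file" ]; then
    echo "Yama ptrace_scope not available; standard ptrace permission checks apply"
    return 0
  fi
  case "$(cat "$scope_file")" in
    0) echo "0: a process may attach to other processes running under the same uid" ;;
    1) echo "1: attaching is limited to descendants; strace -p on other processes needs sudo" ;;
    2) echo "2: only processes with CAP_SYS_PTRACE may attach" ;;
    3) echo "3: attaching with ptrace is disabled" ;;
    *) echo "unrecognized ptrace_scope value" ;;
  esac
}
```

Running `ptrace_scope_hint` with no argument reports the policy of the current machine. Many distributions default to 1, which is why attaching to an unrelated process of your own often still needs `sudo`.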

📈 Performance Comparison Examples

Benchmark: File Open Monitoring

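# Test setup: 1000 file operations
TEST_SCRIPT='for i in {1..1000}; do echo "test" > /tmp/test_$i; done'

# Baseline
time bash -c "$TEST_SCRIPT"

# strace approach
time strace -e trace=openat -o /dev/null bash -c "$TEST_SCRIPT"
# Reported result: ~2-3 seconds (100x slowdown; actual figures vary by system)

# bpftrace approach: start the probe in a second terminal ...
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { }'
# ... then time the workload again while the probe is attached
time bash -c "$TEST_SCRIPT"
# Reported result: ~0.1 seconds (minimal impact)

# Clean up the test files
rm -f /tmp/test_{1..1000}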

Benchmark: CPU Profiling

# strace -c summarizes time spent in syscalls; it is not a CPU profiler
strace -c -f ./cpu_intensive_app
# High overhead, and no view of where userspace CPU time goes

# bpftrace CPU profiling (efficient): sample each CPU 997 times per second
sudo bpftrace -e 'profile:hz:997 { @cpu[comm] = count(); } interval:s:10 { exit(); }'
# Low overhead; samples are attributed per process name

🛠️ Migration Guide

From strace to bpftrace

# strace: Count syscalls for a process
strace -c ./my_app

# bpftrace equivalent (system-wide)
sudo bpftrace -e '
tracepoint:syscalls:sys_enter_*
{
  @calls[probe] = count();
}
interval:s:5
{
  exit();  // @calls is printed automatically on exit
}'

# bpftrace equivalent (specific process)
sudo bpftrace -e '
tracepoint:syscalls:sys_enter_*
/pid == 1234/
{
  @calls[probe] = count();
}
interval:s:5
{
  exit();  // @calls is printed automatically on exit
}'
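Since the per-process variant differs from run to run only in the PID and the duration, it can be convenient to generate the program text from parameters. The following is a small sketch of that idea; `syscall_count_program` is a hypothetical helper name, and it only prints the bpftrace program without running anything.

```shell
#!/bin/sh
# syscall_count_program PID SECONDS: print a bpftrace program that counts the
# syscalls made by PID and exits after SECONDS seconds. Nothing is executed here.
# printf's %d rejects non-numeric arguments, which acts as light input validation.
syscall_count_program() {
  printf 'tracepoint:syscalls:sys_enter_* /pid == %d/ { @calls[probe] = count(); } interval:s:%d { exit(); }\n' "$1" "$2"
}

# Example (not run here):
#   sudo bpftrace -e "$(syscall_count_program 1234 5)"
```

Keeping the generator separate from the privileged `sudo bpftrace` call makes it easy to inspect the exact program before loading it into the kernel.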

Hybrid Approach

# Use strace for detailed single-process analysis
# (strace writes its trace to stderr, so use -o rather than a > redirect)
sudo strace -tt -T -v -p 1234 -o detailed_trace.log

# Use bpftrace for system-wide monitoring
sudo bpftrace -e '
tracepoint:syscalls:sys_enter_openat
{
  @file_opens[comm] = count();
}
interval:s:60
{
  print(@file_opens);
  clear(@file_opens);
}' > system_monitor.log

🧾 Summary

✅ Use strace for: quick debugging, single-process analysis, educational purposes
✅ Use bpftrace for: production monitoring, system-wide observability, custom metrics
✅ Overhead: bpftrace is much lower than strace, especially under load
✅ Flexibility: bpftrace is much higher than strace
✅ Safety: both are safe to use; bpftrace programs are additionally checked by the kernel verifier


🧾 See Also