bpftrace vs strace
Comparing bpftrace and strace for system call tracing and debugging. Both tools are valuable, but they serve different purposes and have distinct advantages.
Overview Comparison
| Aspect | strace | bpftrace |
| --- | --- | --- |
| Purpose | System call tracing | eBPF-based observability |
| Performance | High overhead (ptrace stop on every traced syscall) | Low overhead (in-kernel filtering and aggregation) |
| Scope | Single process (plus children with -f) | System-wide |
| Safety | Runs in userspace | Kernel-verified |
| Flexibility | Limited | Highly flexible |
| Learning Curve | Low | Moderate |
strace Overhead
| # strace has significant performance impact
time ls # Baseline: ~0.01s
time strace -o /dev/null ls # With strace: ~0.1s (10x slower)
# Tracing multiple processes is even more expensive
strace -f -o trace.log ./complex_application # Can slow down by 100x+
|
bpftrace Overhead
| # bpftrace adds little per-event overhead
time ls # Baseline: ~0.01s
time sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { }' -c 'ls' # Wall time includes bpftrace startup (compile and attach); the per-syscall cost is small
# System-wide tracing with low overhead
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { @count = count(); }' &
# Run applications normally - almost no impact
|
Usage Patterns
strace: Process-Centric Debugging
| # Debug a specific process
strace -o trace.log ./my_application
# Trace only specific syscalls (modern glibc opens files via openat)
strace -e trace=open,openat,read,write ./my_application
# Follow child processes
strace -f ./my_application
# Count syscalls
strace -c ./my_application
|
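The `strace -c` summary is plain text, so it can be post-processed with standard tools. The sketch below ranks syscalls by call count, assuming the usual column layout (`% time`, `seconds`, `usecs/call`, `calls`, `errors`, `syscall`). The sample rows are made up for illustration, and `top_syscalls` is a hypothetical helper name, not part of strace.

```shell
# top_syscalls: read a `strace -c` summary on stdin and print "calls syscall",
# sorted by call count (highest first).
# Assumes calls is column 4 and the syscall name is the last column;
# the errors column may be empty, so $NF is used instead of a fixed index.
top_syscalls() {
  awk '$1 ~ /^[0-9.]+$/ && $NF != "total" { print $4, $NF }' | sort -k1,1nr
}

# Illustrative (made-up) input in the shape strace -c prints.
sample='% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 60.00    0.000120          12        10           read
 30.00    0.000060          20         3         1 openat
 10.00    0.000020           4         5           close
------ ----------- ----------- --------- --------- ----------------
100.00    0.000200                    18         1 total'

printf '%s\n' "$sample" | top_syscalls
```

With a real run, something like `strace -c -o summary.txt ./my_application` followed by `top_syscalls < summary.txt` should work, since `-o` sends the summary to a file.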
bpftrace: System-Wide Observability
| # Monitor all processes opening files (openat is what modern glibc uses)
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%-6d %-16s %s\n", pid, comm, str(args->filename)); }'
# Count syscalls system-wide
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_* { @calls[probe] = count(); } interval:s:10 { print(@calls); exit(); }'
# Custom aggregations and filtering
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat /pid == 1234/ { @files[str(args->filename)] = count(); } interval:s:30 { print(@files); exit(); }'
|
Feature Comparison
Filtering Capabilities
| # strace filtering (limited)
strace -p 1234 # By PID
strace -e trace=openat -p 1234 # By syscall
# bpftrace filtering (powerful)
sudo bpftrace -e '
// 0x40 is O_CREAT on x86-64
tracepoint:syscalls:sys_enter_openat
/comm == "nginx" && (args->flags & 0x40) != 0/
{
printf("NGINX creating file: %s\n", str(args->filename));
}'
|
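The flag test in the bpftrace filter is an ordinary bitmask check, which can be tried out in plain shell arithmetic before putting it in a probe. This is a minimal sketch: `has_flag` is a hypothetical helper, and the constant values are assumptions that hold on x86-64 Linux (some other architectures use different values).

```shell
# has_flag VALUE MASK: succeed if every bit in MASK is set in VALUE.
has_flag() {
  [ $(( $1 & $2 )) -eq $(( $2 )) ]
}

O_CREAT=0x40   # assumption: x86-64 Linux value (octal 0100)

# 0x241 = O_WRONLY (0x1) | O_CREAT (0x40) | O_TRUNC (0x200), typical for `> file`
if has_flag 0x241 "$O_CREAT"; then
  echo "0x241: O_CREAT set"
fi

# 0x80000 = O_RDONLY | O_CLOEXEC, a plain read-only open
if has_flag 0x80000 "$O_CREAT"; then
  echo "0x80000: O_CREAT set"
else
  echo "0x80000: O_CREAT not set"
fi
```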
Data Aggregation
| # strace: Only a per-syscall summary table
strace -c ./app # Counts, errors, and time per syscall
# bpftrace: Rich aggregation
sudo bpftrace -e '
tracepoint:syscalls:sys_exit_read
/args->ret > 0/
{
@read_sizes = hist(args->ret);
@read_by_process[comm] = sum(args->ret);
}
interval:s:30
{
print(@read_sizes);
print(@read_by_process);
exit();
}'
|
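`hist()` groups values into power-of-two buckets, so a 100-byte read and a 120-byte read land in the same row. The sketch below shows which bucket a given value falls into; `hist_bucket` is a hypothetical helper for illustration, and its labels are simplified (bpftrace itself abbreviates large bounds, for example with a K suffix).

```shell
# hist_bucket N: print the power-of-two bucket "[lo, hi)" containing positive integer N.
# The function body runs in a subshell so lo and hi do not leak into the caller.
hist_bucket() (
  lo=1
  while [ $(( lo * 2 )) -le "$1" ]; do
    lo=$(( lo * 2 ))
  done
  hi=$(( lo * 2 ))
  echo "[$lo, $hi)"
)

hist_bucket 1      # [1, 2)
hist_bucket 100    # [64, 128)
hist_bucket 4096   # [4096, 8192)
```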
Custom Metrics
| # strace: Fixed output format
strace -tt -T ./app # Timestamps and durations
# bpftrace: Custom output and calculations
sudo bpftrace -e '
tracepoint:syscalls:sys_enter_openat
{
@start[tid] = nsecs;
@fname[tid] = args->filename; // save the pointer; the exit probe has no filename
}
tracepoint:syscalls:sys_exit_openat
/@start[tid]/
{
$duration_us = (nsecs - @start[tid]) / 1000; // integer microseconds
if ($duration_us > 1000) { // > 1 ms
printf("Slow open: %s (%d us)\n", str(@fname[tid]), $duration_us);
}
delete(@start[tid]);
delete(@fname[tid]);
}'
|
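A duration derived from `nsecs` is an integer count of nanoseconds. If a script prints the raw value, one hedged option is to convert it to milliseconds afterwards in the shell using integer arithmetic only. `ns_to_ms` is a hypothetical helper name, and it truncates instead of rounding.

```shell
# ns_to_ms NS: format integer nanoseconds as milliseconds with two decimals.
# Uses only integer division and remainder; the fractional part is truncated.
ns_to_ms() {
  printf '%d.%02d\n' $(( $1 / 1000000 )) $(( ($1 % 1000000) / 10000 ))
}

ns_to_ms 1234567   # 1.23
ns_to_ms 5000      # 0.00
ns_to_ms 2500000   # 2.50
```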
Use Case Scenarios
When to Use strace
| # 1. Quick debugging of a single process
strace ./failing_application
# 2. Detailed syscall sequence analysis
strace -tt -T -v ./application
# 3. Attaching to a running process
strace -p "$(pgrep -o nginx)" # -o selects the oldest match (typically the nginx master)
# 4. Educational purposes (easy to understand output)
strace ls -la
|
When to Use bpftrace
| # 1. Production performance monitoring
sudo bpftrace -e 'profile:hz:99 { @cpu[comm] = count(); } interval:s:60 { print(@cpu); exit(); }'
# 2. System-wide anomaly detection (strcontains needs a recent bpftrace release)
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { if (strcontains(str(args->filename), ".ssh/")) { printf("SSH access: %s by %s\n", str(args->filename), comm); } }'
# 3. Custom metrics collection (arg2 is the size argument of tcp_sendmsg)
sudo bpftrace -e 'kprobe:tcp_sendmsg { @network_bytes[comm] = sum(arg2); } interval:s:10 { print(@network_bytes); clear(@network_bytes); }'
# 4. Hourly syscall counts for trending (time() stamps each report)
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_* { @hourly[probe] = count(); } interval:s:3600 { time("%Y-%m-%d %H:%M\n"); print(@hourly); clear(@hourly); }'
|
Security and Safety
strace Safety
| # Runs entirely in userspace via ptrace
strace ./my_app # No special privileges needed to launch and trace your own program
# Attaching to a running process may need extra privileges
sudo strace -p 1234 # Root or CAP_SYS_PTRACE, depending on kernel.yama.ptrace_scope
|
bpftrace Safety
| # Requires root privileges
sudo bpftrace script.bt # Loads eBPF programs into kernel
# Kernel verifies program safety
# - No infinite loops
# - No unsafe memory access
# - Bounded execution time
|
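Because the two tools have different privilege requirements, a short preflight check can save a failed run. The sketch below only tests whether each tool is on `PATH` and whether the current user is root; `tool_status` and `sudo_prefix` are hypothetical helper names, and the check does not inspect capabilities or kernel settings.

```shell
# tool_status NAME: print "ok" if NAME is found on PATH, otherwise "missing".
tool_status() {
  if command -v "$1" >/dev/null 2>&1; then echo ok; else echo missing; fi
}

# sudo_prefix: print "sudo" when not running as root, otherwise print nothing.
sudo_prefix() {
  if [ "$(id -u)" -ne 0 ]; then echo sudo; fi
}

echo "strace:   $(tool_status strace)"
echo "bpftrace: $(tool_status bpftrace)"
echo "run as:   $(sudo_prefix) bpftrace script.bt"
```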
Benchmark: File Open Monitoring
| # Test setup: 1000 file operations
TEST_SCRIPT='for i in {1..1000}; do echo "test" > /tmp/test_$i; done'
# strace approach
time strace -e trace=openat -o /dev/null bash -c "$TEST_SCRIPT"
# Result: ~2-3 seconds (roughly 20-30x slower than the ~0.1 s run below)
# bpftrace approach
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { }' -c "bash -c '$TEST_SCRIPT'"
# Result: ~0.1 seconds (minimal impact)
|
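Timings like these vary a lot between machines, so it helps to measure the baseline and the traced run the same way and compute the ratio. The harness below is a sketch under stated assumptions: it relies on GNU `date` for nanosecond timestamps (`%N`), the helper names are hypothetical, and the two `sleep` commands are stand-ins for the untraced and traced workloads.

```shell
# now_ms: current time in milliseconds (assumes GNU date, which supports %N).
now_ms() { echo $(( $(date +%s%N) / 1000000 )); }

# elapsed_ms CMD...: run CMD once, discard its output, print elapsed wall-clock ms.
# The body runs in a subshell so start and end stay local.
elapsed_ms() (
  start=$(now_ms)
  "$@" >/dev/null 2>&1
  end=$(now_ms)
  echo $(( end - start ))
)

# slowdown BASE_MS TRACED_MS: print the slowdown factor with one decimal, integer math only.
slowdown() {
  printf '%d.%dx\n' $(( $2 / $1 )) $(( ($2 * 10 / $1) % 10 ))
}

base=$(elapsed_ms sleep 0.1)     # stand-in for the untraced workload
traced=$(elapsed_ms sleep 0.3)   # stand-in for the traced workload
echo "baseline=${base}ms traced=${traced}ms slowdown=$(slowdown "$base" "$traced")"
```

To compare the real tools, the stand-ins could be replaced with the untraced command and with the `strace -e trace=openat -o /dev/null ...` invocation. Timing a bpftrace run this way also includes its startup cost.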
Benchmark: CPU Profiling
| # strace is not a CPU profiler: -c only summarizes time spent in syscalls
strace -c -f ./cpu_intensive_app
# High overhead, and no visibility into userspace CPU time
# bpftrace CPU profiling (efficient)
sudo bpftrace -e 'profile:hz:997 { @cpu[comm] = count(); } interval:s:10 { print(@cpu); exit(); }'
# Low overhead, detailed insights
|
Migration Guide
From strace to bpftrace
| # strace: Count syscalls for a process
strace -c ./my_app
# bpftrace equivalent (system-wide)
sudo bpftrace -e '
tracepoint:syscalls:sys_enter_*
{
@calls[probe] = count();
}
interval:s:5
{
print(@calls);
exit();
}'
# bpftrace equivalent (specific process)
sudo bpftrace -e '
tracepoint:syscalls:sys_enter_*
/pid == 1234/
{
@calls[probe] = count();
}
interval:s:5
{
print(@calls);
exit();
}'
|
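Unlike `strace -c`, the bpftrace versions print raw map entries, not a sorted table. A small filter can turn that output into something closer to the strace summary. This sketch assumes entries look like `@calls[tracepoint:syscalls:sys_enter_read]: 42`. The sample data is made up, and `calls_table` is a hypothetical helper name.

```shell
# calls_table: read bpftrace map output on stdin and print "count syscall",
# highest count first. Lines that do not match the expected shape are ignored.
calls_table() {
  sed -n 's/^@calls\[tracepoint:syscalls:sys_enter_\(.*\)]: \([0-9][0-9]*\)$/\2 \1/p' |
    sort -k1,1nr
}

# Illustrative (made-up) sample in the shape print(@calls) produces.
sample='@calls[tracepoint:syscalls:sys_enter_close]: 7
@calls[tracepoint:syscalls:sys_enter_read]: 42
@calls[tracepoint:syscalls:sys_enter_openat]: 12'

printf '%s\n' "$sample" | calls_table
```

In practice the bpftrace output could be redirected to a file and fed to the same function, for example `calls_table < calls.log`.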
Hybrid Approach
| # Use strace for detailed single-process analysis
strace -tt -T -v -p 1234 -o detailed_trace.log # strace writes to stderr, so use -o rather than >
# Use bpftrace for system-wide monitoring
sudo bpftrace -e '
tracepoint:syscalls:sys_enter_openat
{
@file_opens[comm] = count();
}
interval:s:60
{
print(@file_opens);
clear(@file_opens);
}' > system_monitor.log
|
Summary
- Use strace for: Quick debugging, single-process analysis, educational purposes
- Use bpftrace for: Production monitoring, system-wide observability, custom metrics
- Overhead: bpftrace << strace (especially under load)
- Flexibility: bpftrace >> strace
- Safety: Both are safe, but bpftrace programs are additionally checked by the kernel verifier
See Also