bpftrace: Tracing in a Single Line

In Part I we wrote eBPF programs by hand in C: clang compiles them to bytecode, bpftool loads them, and the verifier checks them. That path is complete and deep, but it is a lot of steps for a quick question such as "which process is opening which file?". Part II uses bpftrace, where a single command line answers that question immediately. We open with bpftrace because it's the fastest way to see eBPF at work, before we write more complex tools ourselves in Part III.

bpftrace is still eBPF

One thing to be clear about right away: bpftrace isn't a different technology. It's a high-level tracing language whose scripts the bpftrace tool compiles into eBPF bytecode and loads into the kernel. That's the same flow as Part I, except we don't write the C by hand. As proof, count the loaded eBPF programs before, during, and after running a bpftrace one-liner:

sudo bpftool prog show | grep -c '^[0-9]*:'                 # 140 (Cilium only)
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { @ = count(); }' &   # run in background
sleep 2                                                     # give bpftrace time to compile and attach
sudo bpftool prog show | grep -c '^[0-9]*:'                 # 141 !
fg                                                          # bring bpftrace to the foreground, then press Ctrl-C
sudo bpftool prog show | grep -c '^[0-9]*:'                 # back to 140

140      # before
141      # while bpftrace runs: +1 program (type tracepoint)
140      # bpftrace exits: it detaches itself

An eBPF program of type tracepoint appears exactly when bpftrace runs and disappears when it exits. bpftrace takes our script, compiles it to eBPF, loads and attaches it, aggregates data through maps (Article 3), then cleans up when done, all automatically. We get concise syntax, and it's still real eBPF underneath.
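
The same check can be scripted. The following is a minimal sketch, not the author's method: count_progs and prog_count_demo are hypothetical helper names, the 2-second wait is a guess at how long bpftrace needs to compile and attach, and the demo function assumes root plus bpftool and bpftrace on the PATH.

```shell
#!/usr/bin/env bash
# count_progs: count program entries in `bpftool prog show` text output (stdin).
# Each program starts a line with "<id>:"; its detail lines are indented.
count_progs() {
    grep -c '^[0-9][0-9]*:' || true   # grep exits 1 on zero matches; keep the printed 0
}

# prog_count_demo (hypothetical): the before/during/after check in one call.
# Must run as root. Nothing here runs until the function is called.
prog_count_demo() {
    echo "before: $(bpftool prog show | count_progs)"

    # Start the one-liner in the background and remember its PID.
    bpftrace -e 'tracepoint:syscalls:sys_enter_openat { @ = count(); }' >/dev/null &
    local bt_pid=$!
    sleep 2   # assumption: enough time for bpftrace to compile and attach

    echo "during: $(bpftool prog show | count_progs)"

    # Stop bpftrace. Once the process is gone, its program is released too.
    kill "$bt_pid"
    wait "$bt_pid" 2>/dev/null || true

    echo "after:  $(bpftool prog show | count_progs)"
}
```

To try it, save the block to a file (progcount.sh is an arbitrary name) and run sudo bash -c 'source ./progcount.sh; prog_count_demo'. The absolute counts depend on what else is loaded on the machine; only the +1 while bpftrace runs should carry over.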

Syntax: probe / filter / action

A bpftrace program is one or more blocks following the shape:

probe /filter/ { action }

probe — where to attach (this is the hook/program type of Article 4, written as a string).
filter (predicate, optional) — the condition for the action to run.
action — what to do when the probe fires.
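
The optional filter is easiest to see by assembling a block from its three parts. A small sketch in shell, where bt_block is a hypothetical helper (not part of bpftrace) that only prints program text and leaves out the /filter/ part when none is given:

```shell
#!/usr/bin/env bash
# bt_block PROBE FILTER ACTION
# Print one bpftrace block. An empty FILTER means "no predicate".
bt_block() {
    local probe=$1 filter=$2 action=$3
    if [ -n "$filter" ]; then
        printf '%s /%s/ { %s }\n' "$probe" "$filter" "$action"
    else
        printf '%s { %s }\n' "$probe" "$action"
    fi
}

# Without a filter: the action runs every time the probe fires.
bt_block 'tracepoint:syscalls:sys_enter_openat' '' '@ = count();'

# With a filter: the action runs only when the condition holds.
bt_block 'tracepoint:syscalls:sys_enter_openat' 'pid == 1' '@ = count();'
```

The printed text can be handed straight to bpftrace, for example sudo bpftrace -e "$(bt_block 'tracepoint:syscalls:sys_enter_openat' 'pid == 1' '@ = count();')". The helper does no validation, so a malformed probe or action is only reported by bpftrace itself.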

A commonly used one-liner: which process opens which file.

sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%-16s %s\n", comm, str(args.filename)); }'

bash             /dev/null
cat              /etc/ld.so.cache
cat              /lib/x86_64-linux-gnu/libc.so.6
cat              /usr/lib/locale/C.UTF-8/LC_IDENTIFICATION

Read the command: the probe is tracepoint:syscalls:sys_enter_openat (Article 4 — we attached to exactly this tracepoint in C, now it's one line); the action prints comm (a built-in variable: the process name) and str(args.filename). args is the probe's context — for a syscall tracepoint, it's the parameters; args.filename is the pointer to the filename, and str() reads the string out safely. bpftrace provides many built-in variables: comm, pid, tid, args, nsecs, cpu, kstack...
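
Once a one-liner grows past a few built-ins, it reads better as a script file. The sketch below writes one to a temporary path and is only an illustration: the file name openat.bt, the column widths, and the header row printed from a BEGIN block are choices made here, not taken from the article's repository. It uses the args.filename spelling from the one-liner above; older bpftrace releases write this as args->filename.

```shell
#!/usr/bin/env bash
# Write the openat one-liner as a bpftrace script file.
# The path is an arbitrary choice for this example.
script="${TMPDIR:-/tmp}/openat.bt"

cat > "$script" <<'EOF'
#!/usr/bin/env bpftrace

// Runs once at startup: print a header row.
BEGIN
{
    printf("%-7s %-7s %-3s %-16s %s\n", "PID", "TID", "CPU", "COMM", "FILE");
}

// Same probe as the one-liner, printing more built-in variables.
tracepoint:syscalls:sys_enter_openat
{
    printf("%-7d %-7d %-3d %-16s %s\n", pid, tid, cpu, comm, str(args.filename));
}
EOF

echo "wrote $script"
```

Run the result with sudo bpftrace "$script" and stop it with Ctrl-C. To check which fields args offers for a probe before using them, bpftrace -lv 'tracepoint:syscalls:sys_enter_openat' lists the probe together with its arguments.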

The probe inventory: where you can attach

bpftrace can attach to a very large number of points. List them and count them by type on the node:

sudo bpftrace -l | wc -l
sudo bpftrace -l | sed 's/:.*//' | sort | uniq -c | sort -rn

122991                  # total attachable probes
  61331 kprobe          # every kernel function you can hook
  58834 kfunc           # kernel functions (BTF-typed, with typed args)
   1756 tracepoint      # stable tracepoints
   1026 rawtracepoint
     18 iter
     14 software
     12 hardware

More than 120 thousand points to observe. tracepoint is the stable interface that kernel developers commit to keeping; kprobe can be inserted into nearly every kernel function (more flexible, but function names can change between kernel builds). bpftrace -l 'pattern*' searches probes by name: for example, bpftrace -l 'tracepoint:syscalls:sys_enter_open*' returns sys_enter_openat and sys_enter_openat2, along with any other tracepoint whose name starts with sys_enter_open.
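
To see what each stage of the counting pipeline contributes, it can help to run it on a tiny hand-written list instead of the full output. A sketch, where count_by_type and sample_probes are hypothetical names and the six probe names are illustrative input, not output captured from the node:

```shell
#!/usr/bin/env bash
# count_by_type: read probe names on stdin, print "<count> <type>" rows,
# largest count first. Same pipeline as the command used on the node.
count_by_type() {
    sed 's/:.*//' |   # keep only the text before the first colon (the probe type)
        sort |        # put identical types next to each other
        uniq -c |     # collapse each run into "<count> <type>"
        sort -rn      # order by count, descending
}

# Illustrative sample input (hand-written).
sample_probes() {
    cat <<'EOF'
kprobe:vfs_read
tracepoint:syscalls:sys_enter_openat
kprobe:vfs_write
hardware:cache-misses:
tracepoint:syscalls:sys_enter_openat2
kprobe:do_sys_openat2
EOF
}

sample_probes | count_by_type
```

On the sample this prints three rows: kprobe with a count of 3, tracepoint with 2, and hardware with 1. On a real node the function takes the live list instead: sudo bpftrace -l | count_by_type.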

Filter: narrow down to what you need

Add a predicate so the action runs only when a condition holds. For example, to see only the files that the cat command opens:

sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat /comm == "cat"/ { printf("%s opened %s\n", comm, str(args.filename)); }'

cat opened /etc/ld.so.cache
cat opened /lib/x86_64-linux-gnu/libc.so.6
cat opened /usr/lib/locale/C.UTF-8/LC_IDENTIFICATION

/comm == "cat"/ is the filter: the program still fires on every openat system-wide, but the action only runs when the process is cat. Filtering inside the kernel like this is much cheaper than printing everything and grepping in userspace.
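
The process name does not have to be hard-coded. A sketch of a helper that builds the same program for any name follows; openat_by_comm is a hypothetical function, and the truncation reflects the fact that the kernel keeps at most 15 characters of a process name, so a longer name is assumed never to match comm as typed.

```shell
#!/usr/bin/env bash
# openat_by_comm NAME
# Print a bpftrace program that reports openat calls made by processes
# named NAME. NAME is cut to 15 characters, the most that comm can hold.
# No escaping is done: NAME must not contain double quotes.
openat_by_comm() {
    local name=${1:0:15}
    local probe='tracepoint:syscalls:sys_enter_openat'
    local action='printf("%s opened %s\n", comm, str(args.filename));'
    printf '%s /comm == "%s"/ { %s }\n' "$probe" "$name" "$action"
}

# Show the generated program text for two names.
openat_by_comm cat
openat_by_comm systemd-journald   # 16 characters, filtered as "systemd-journal"
```

Pass the generated text to bpftrace with sudo bpftrace -e "$(openat_by_comm cat)". Generating the predicate keeps the filtering in the kernel, whereas piping unfiltered output through grep would move it back to userspace.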

🧹 Cleanup

bpftrace detaches its program and frees its maps on exit (Ctrl-C), and bpftool prog show goes back to 140. There's nothing to clean up by hand. The commands in this post are at github.com/nghiadaulau/ebpf-from-scratch, directory 06-bpftrace-basics.

Wrap-up

bpftrace is a high-level tracing language: we write a short script, bpftrace compiles it to eBPF bytecode, loads and attaches it, aggregates data through maps, then cleans up on exit. bpftool proves it by showing the program appear (140→141) while bpftrace runs and disappear on exit. The syntax has three parts, probe /filter/ { action }: the probe is a hook (Article 4) written as a string, the filter is a condition, and the action uses built-in variables and functions (comm, pid, args, str()...). The kernel offers more than 120 thousand probes (kprobe hooks nearly every kernel function, tracepoint is the stable interface), and bpftrace -l lists them. A predicate filters right inside the kernel. All of it without C, clang, or writing a loader: the fast path to observing the system.

Printing line by line works when events are sparse; with dense events (every syscall, every packet) it floods the screen. Article 7 uses bpftrace's main capability: maps and aggregation — counting and building distribution charts (histograms) right inside the kernel, returning only the aggregated result.