libbpf and CO-RE: Writing an eBPF Tool Yourself

Kai · 4 min read

bpftrace (Part II) is great for quick questions, but it's an ad-hoc tool: type a line, look, exit. When you need a real tool — distributable, resident, integrated into other systems — you write the eBPF program in C with libbpf and CO-RE (Article 5). This is also how bcc-libbpf, Cilium, and Tetragon are written. Part III builds exactly that way, starting with a complete execsnoop.

The two sides of a libbpf application

An eBPF tool has two halves, compiled separately:

   exec.bpf.c  ──clang──► exec.bpf.o ──bpftool gen skeleton──► exec.skel.h
   (runs IN the kernel)                                              │ #include
                                                                     ▼
   exec.c  ────────────────clang + link libbpf───────────────►  execsnoop
   (runs IN userspace: load, attach, read events)
  • Kernel side (exec.bpf.c): the eBPF program attaches to the exec tracepoint, pushing events out to a ring buffer.
  • Skeleton (exec.skel.h): bpftool gen skeleton embeds exec.bpf.o into a C header with open/load/attach functions and pointers to each map/prog — userspace doesn't have to parse the ELF by hand.
  • Userspace side (exec.c): uses the skeleton + libbpf to load the program, attach it, then read the ring buffer and print.

Both share a common event struct in exec.h.
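exec.h itself is not shown in this article. As a minimal sketch of what such a shared header could look like: the field names come from the code in this article, but the array sizes (16 and 256) and the macro names EXEC_COMM_LEN and EXEC_FILENAME_LEN are assumptions for illustration, and the header in the repository may differ.

```c
/* exec.h (sketch): event record shared by exec.bpf.c and exec.c.
 * Sizes and macro names are assumptions, not taken from the repository. */
#ifndef EXEC_H
#define EXEC_H

#define EXEC_COMM_LEN     16    /* assumed: room for a 15-char comm plus NUL */
#define EXEC_FILENAME_LEN 256   /* assumed upper bound for the exec path */

struct event {
    unsigned int pid;                       /* tgid of the exec'ing process */
    unsigned int ppid;                      /* tgid of its parent */
    char comm[EXEC_COMM_LEN];               /* short task name */
    char filename[EXEC_FILENAME_LEN];       /* path passed to exec */
};

#endif /* EXEC_H */
```

Using plain C types (no __u32, no kernel headers) is a deliberate choice in this sketch: the same file then compiles unchanged on the BPF side and in the userspace loader, so both halves agree on the record layout.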

Kernel side: pushing events through a ring buffer

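// Headers this excerpt relies on (the usual CO-RE setup)
#include "vmlinux.h"            // kernel types: bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h
#include <bpf/bpf_helpers.h>    // SEC(), __uint(), helper declarations
#include <bpf/bpf_core_read.h>  // BPF_CORE_READ
#include "exec.h"               // struct event, shared with userspace

char LICENSE[] SEC("license") = "GPL";   // bpf_get_current_task and the probe-read helpers are GPL-only

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024);     // buffer size in bytes
} events SEC(".maps");

SEC("tracepoint/sched/sched_process_exec")
int handle_exec(struct trace_event_raw_sched_process_exec *ctx)
{
    struct event *e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);   // reserve space
    if (!e)
        return 0;                                                   // buffer full: drop this event
    e->pid = bpf_get_current_pid_tgid() >> 32;                      // upper 32 bits = tgid (userspace PID)
    struct task_struct *task = (struct task_struct *)bpf_get_current_task();
    e->ppid = BPF_CORE_READ(task, real_parent, tgid);               // CO-RE (Article 5)
    bpf_get_current_comm(&e->comm, sizeof(e->comm));
    unsigned off = ctx->__data_loc_filename & 0xFFFF;               // tracepoint's dynamic field: low 16 bits = offset
    bpf_probe_read_kernel_str(&e->filename, sizeof(e->filename), (void *)ctx + off);
    bpf_ringbuf_submit(e, 0);                                       // submit
    return 0;
}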

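The ring buffer (BPF_MAP_TYPE_RINGBUF) is the modern way to push events from the kernel to userspace: the program reserves a slot, writes the data, and submits it, and userspace consumes the records as a stream, in order. Unlike the counter map in Article 3, where userspace repeatedly reads a value, each event arrives as its own record; if the buffer is full, the reserve fails and the program simply drops that event. pid is the upper 32 bits of bpf_get_current_pid_tgid() (the tgid, which is what userspace calls the PID), ppid is read with BPF_CORE_READ (relocated against BTF, Article 5), and filename comes from the tracepoint's dynamic field, whose low 16 bits give the string's offset inside the record.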
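The two bit manipulations in the kernel code (>> 32 and & 0xFFFF) are easy to misread, so here is a small plain-C sketch of the packing they undo. The helper names are hypothetical and exist only for illustration; nothing here is part of libbpf or the kernel API.

```c
#include <stdint.h>

/* A tracepoint __data_loc field packs two values into 32 bits:
 * low 16 bits  = offset of the data from the start of the record,
 * high 16 bits = length of the data in bytes. */
static inline uint16_t data_loc_offset(uint32_t loc)
{
    return (uint16_t)(loc & 0xFFFF);
}

static inline uint16_t data_loc_length(uint32_t loc)
{
    return (uint16_t)(loc >> 16);
}

/* bpf_get_current_pid_tgid() returns (tgid << 32) | tid.
 * The tgid is what userspace tools display as the PID;
 * the low half is the id of the individual thread. */
static inline uint32_t tgid_of(uint64_t pid_tgid)
{
    return (uint32_t)(pid_tgid >> 32);
}

static inline uint32_t tid_of(uint64_t pid_tgid)
{
    return (uint32_t)pid_tgid;
}
```

For example, a __data_loc value of 0x00130028 decodes to offset 0x28 and length 0x13, which is why the kernel code adds only the masked low half to ctx.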

Userspace side: skeleton + libbpf

Generate the skeleton, then write the loader:

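#include <signal.h>
#include <stdio.h>
#include <bpf/libbpf.h>
#include "exec.h"                                           // struct event
#include "exec.skel.h"

static volatile sig_atomic_t stop;                          // set by the signal handler
static void on_signal(int sig) { (void)sig; stop = 1; }

static int on_event(void *ctx, void *data, size_t sz)
{
    const struct event *e = data;
    printf("%-16s pid=%-7u ppid=%-7u %s\n", e->comm, e->pid, e->ppid, e->filename);
    return 0;
}

int main(void)
{
    setvbuf(stdout, NULL, _IOLBF, 0);                       // (see the trap below)
    signal(SIGINT, on_signal);                              // Ctrl-C
    signal(SIGTERM, on_signal);                             // e.g. sent by timeout
    struct exec_bpf *skel = exec_bpf__open_and_load();      // load + verifier + CO-RE relocate
    if (!skel)
        return 1;
    struct ring_buffer *rb = NULL;
    if (exec_bpf__attach(skel) == 0)                        // attach to the tracepoint
        rb = ring_buffer__new(bpf_map__fd(skel->maps.events), on_event, NULL, NULL);
    if (rb)
        printf("%-16s %-11s %-12s %s\n", "COMM", "PID", "PPID", "FILENAME");
    while (rb && !stop)
        if (ring_buffer__poll(rb, 100) < 0)                 // wait + call on_event per event
            break;                                          // error, or interrupted by a signal
    ring_buffer__free(rb);                                  // safe to call with NULL
    exec_bpf__destroy(skel);                                // detach and unload
    return rb ? 0 : 1;
}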

exec_bpf__open_and_load() (generated by the skeleton) does all the work Articles 1–2 hinted at: load the bytecode, run CO-RE relocation against this kernel's BTF, push it through the verifier, JIT. ring_buffer__new registers the callback; each ring_buffer__poll call then waits up to the given timeout (100 ms here) and invokes the callback once for every event the kernel has submitted.

Building the whole chain

clang -O2 -g -target bpf -I. -c exec.bpf.c -o exec.bpf.o   # 1. C -> eBPF bytecode
bpftool gen skeleton exec.bpf.o > exec.skel.h              # 2. generate skeleton (40k lines)
clang exec.c -o execsnoop -lbpf -lelf -lz                  # 3. loader, link libbpf

Three steps: compile the kernel side to bytecode, generate the skeleton, compile the loader linking libbpf. Run it:

sudo ./execsnoop
COMM             PID         PPID         FILENAME
iptables         pid=367016  ppid=300744  /usr/sbin/iptables
ip6tables        pid=367017  ppid=300744  /usr/sbin/ip6tables
iptables         pid=367018  ppid=213711  /usr/sbin/iptables
iptables         pid=367019  ppid=213711  /usr/sbin/iptables

Every time any process on the machine execs, a line appears, delivered through the ring buffer as it happens rather than by periodically re-reading a map. On this cluster you immediately see cilium-agent (ppid 213711) constantly exec'ing iptables/ip6tables to update rules. This is a real tool: resident, packageable, not a one-liner.

A real trap: stdout buffering

The first run printed nothing at all. The reason: stdout was a pipe, and stdio fully buffers stdout whenever it is not a terminal, so printf only filled an in-memory buffer. When timeout killed the process, that buffer had not been flushed yet, and everything was lost. Fix it with setvbuf(stdout, NULL, _IOLBF, 0) (line-buffered) at the top of main, so that every newline flushes the line immediately. This is a common C bug, nothing to do with eBPF, but worth remembering when writing a long-running tracing tool.
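The effect is easy to reproduce without any eBPF. The sketch below (POSIX only; the function name bytes_visible_before_flush is hypothetical) writes one line into a pipe-backed stream and reports how many bytes a reader can see before anything is flushed or closed.

```c
#define _POSIX_C_SOURCE 200809L   /* for fdopen() under strict -std= modes */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write "hello\n" to a pipe through a FILE* that uses the given buffering
 * mode (_IOFBF or _IOLBF). Return the number of bytes the read end can see
 * before the stream is flushed or closed, or -1 if setup fails. */
long bytes_visible_before_flush(int mode)
{
    int fds[2];
    char buf[64];

    if (pipe(fds) != 0)
        return -1;

    /* Non-blocking read end: an empty pipe returns immediately with an
     * error instead of hanging. */
    if (fcntl(fds[0], F_SETFL, O_NONBLOCK) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    FILE *out = fdopen(fds[1], "w");
    if (!out) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    setvbuf(out, NULL, mode, 0);     /* must precede any other I/O on the stream */
    fprintf(out, "hello\n");         /* 6 bytes, ending in a newline */

    ssize_t n = read(fds[0], buf, sizeof(buf));

    fclose(out);                     /* also closes fds[1] */
    close(fds[0]);
    return n < 0 ? 0 : (long)n;      /* nothing readable yet counts as 0 */
}
```

Called with _IOLBF, the function sees all 6 bytes as soon as the newline is written; called with _IOFBF, it sees 0, which is the state execsnoop's output was in when the process was killed.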

🧹 Cleanup

execsnoop detaches the program on exit (exec_bpf__destroy); the node returns to 140 programs. The full source (exec.bpf.c, exec.c, exec.h, Makefile) is at github.com/nghiadaulau/ebpf-from-scratch, directory 09-libbpf-core.

Wrap-up

When you need a real eBPF tool instead of a one-liner, you write it in C with libbpf + CO-RE. An application has two halves. The kernel side (exec.bpf.c) attaches to the exec tracepoint and pushes events through a ring buffer (reserve/submit, an ordered stream, unlike polling a map in Article 3), using BPF_CORE_READ to get ppid (CO-RE, Article 5). The userspace side (exec.c) uses the skeleton generated by bpftool gen skeleton to open_and_load (which includes CO-RE relocation, the verifier, and JIT), then calls ring_buffer__poll to read events. The build chain: clang → .bpf.o → skeleton → loader linking libbpf. The result, execsnoop, streams every exec with pid/ppid/comm/filename; we saw cilium-agent constantly calling iptables. Plus a C lesson: make stdout line-buffered with setvbuf, or the output gets stuck in the buffer.

Article 10 rewrites this exact tool but loads it from Go with the cilium/ebpf library — how the Go ecosystem (including Cilium itself) builds eBPF applications, and the most common loader in the Kubernetes world.