Spark Program | ckb-probe: Deep Observability Tool for CKB Nodes Based on Aya Kernel eBPF

Week 5 Report: Performance Optimization, Docker Environment, CLI Polish, and Collection Framework

Period: 2026-04-13 to 2026-04-19
Author: Clair
Project: ckb-probe — an eBPF-based deep observability tool for CKB full nodes


1. Goals for This Week

  1. Polish CLI output based on clap
  2. Build a reproducible Docker environment
  3. Implement the 48-hour collection/reporting logic and prepare to launch the 48-hour stability test
  4. Execute and optimize P-1 to P-4 performance tests

2. Completed Work

| Deliverable | Status | Notes |
|---|---|---|
| CLI clap polish | ✅ | Added help text, examples, and standardized exit codes for all three subcommands |
| Single-container Docker environment | ✅ | Two-stage Dockerfile, 6 demo scripts, and env-check.sh |
| 48-hour collection scripts | ✅ | stability-48h.sh + generate-report.sh, with 3 parallel ckb-probe instances; launch scheduled within the week |
| Performance optimization (P-2) | ✅ | Perf buffer reduced from 1024 to 16 pages/CPU; RSS reduced from 87.9 MB to 21.9 MB |
| BPF map optimization | ✅ | HashMap max_entries reduced from 10240 to 1024 |
| RingBuf refactor | ✅ | Migrated SLOW_EVENTS from PerfEventArray to RingBuf |
| S-4 process restart recovery | ✅ | Automatically detects CKB exit, polls for the new PID, and reconnects |
| P-1 to P-4 performance testing | ✅ | Executed inside Docker with two fresh IBD runs for comparison |
| CI configuration | ✅ | Added build, lint, script check, and CKB version compatibility checks |

3. CLI Output Polish

3.1 Refactoring with clap derive API

Using clap’s #[command] and #[arg] attributes, the following were completed for all three subcommands:

  • about / long_about — command summary and detailed description
  • after_help — usage examples and exit code notes
  • value_name — parameter placeholders (PATH / PID / MICROSECONDS)
  • help — help text for every argument
$ ckb-probe --help

ckb-probe uses eBPF (uprobe / kprobe / tracepoint) to deliver
application-semantic, real-time performance insights for CKB full nodes.

Usage: ckb-probe <COMMAND>

Commands:
  check    Check environment and validate eBPF probes
  symbols  Analyse a CKB binary for uprobe-attachable symbols
  rocksdb  Monitor RocksDB operations on a live CKB node via eBPF

EXAMPLES:
    sudo ckb-probe check --binary ./ckb --pid $(pgrep -x ckb)
    ckb-probe symbols ./ckb --json
    sudo ckb-probe rocksdb --binary ./ckb --pid $(pgrep -x ckb)
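The attribute list above can be sketched with clap's derive API roughly as follows. This is an illustrative fragment only (requires the `clap` crate with the `derive` feature); the actual field names and help strings in ckb-probe may differ:

```rust
// Sketch of the clap derive attributes described above; names are illustrative.
use clap::{Parser, Subcommand};

#[derive(Parser)]
#[command(
    about = "eBPF-based deep observability for CKB full nodes",
    long_about = "ckb-probe uses eBPF (uprobe / kprobe / tracepoint) to deliver \
                  application-semantic, real-time performance insights for CKB full nodes."
)]
struct Cli {
    #[command(subcommand)]
    command: Command,
}

#[derive(Subcommand)]
enum Command {
    /// Check environment and validate eBPF probes
    #[command(after_help = "EXAMPLES:\n    sudo ckb-probe check --binary ./ckb --pid $(pgrep -x ckb)")]
    Check {
        /// Path to the CKB binary to validate against
        #[arg(long, value_name = "PATH")]
        binary: String,
    },
}
```

Doc comments on variants become the one-line summaries shown in the `Commands:` section of `--help`, while `after_help` renders the `EXAMPLES:` block.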

3.2 Standardized Exit Codes

| Exit Code | Meaning |
|---|---|
| 0 | Normal exit, including Ctrl+C |
| 1 | Runtime error / target process exited |
| 2 | Invalid arguments (clap default) |

3.3 Cursor Restoration on Ctrl+C

On exit, \x1B[?25h is emitted to restore cursor visibility, since TUI mode may have hidden it.
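A minimal sketch of the sequence in question (the real handler in ckb-probe may hook Ctrl+C differently, e.g. via a signal handler):

```rust
use std::io::Write;

/// CSI ?25h is the DECTCEM "show cursor" sequence.
fn cursor_show_sequence() -> &'static str {
    "\x1b[?25h"
}

fn main() {
    // Emitting it unconditionally is harmless on terminals where the
    // cursor is already visible.
    print!("{}", cursor_show_sequence());
    std::io::stdout().flush().ok();
}
```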


4. Reproducible Docker Environment

4.1 Two-Stage Dockerfile

The image does not include the CKB binary. Instead, the CKB binary is mounted from the host at runtime via a read-only bind mount (-v /path/to/ckb:/path/to/ckb:ro).

Stage 1 — FROM rust:latest AS probe-builder

  • Install build dependencies: clang, llvm, libelf-dev, zlib1g-dev, pkg-config
  • Install bpf-linker (cargo install) + nightly toolchain + rust-src component (required for eBPF compilation)
  • Copy source code (Cargo.toml, Cargo.lock, .cargo/, xtask/, ckb-probe/, ckb-probe-common/, ckb-probe-ebpf/)
  • Build eBPF: cargo xtask build-ebpf --release
  • Build userspace CLI: cargo build --release -p ckb-probe
  • Build db_bench from RocksDB v9.10.0 source after installing libgflags-dev, libsnappy-dev, liblz4-dev, and libzstd-dev, then run make -j db_bench

Stage 2 — FROM ubuntu:24.04 (runtime)

  • Install runtime tools: bash, sysstat, curl, jq, procps, tar, gzip, zstd, unzip, coreutils, grep, sed, gawk, iproute2, lsof, ca-certificates
  • Install db_bench runtime dependencies: libgflags2.2, libsnappy1v5, liblz4-1, libzstd1
  • Copy from Stage 1:
    • ckb-probe → /usr/local/bin/ckb-probe
    • ckb-probe-ebpf ELF → /opt/ckb-probe-ebpf/target/bpfel-unknown-none/release/ckb-probe-ebpf
    • db_bench → /usr/local/bin/db_bench
  • Copy all scripts to /opt/scripts/ under four subdirectories: perf, demo, case, and stability
  • Copy the entrypoint dispatcher /entrypoint.sh and ckb.toml.aggressive
  • Set WORKDIR /opt because ckb-probe locates the eBPF ELF via a relative path
  • Declare VOLUME ["/data", "/tmp/perf-run"] so data is mounted from the host
  • Set ENTRYPOINT ["/entrypoint.sh"] and CMD ["help"] so the container shows help by default

4.2 Six Demo Scripts

| Script | Function | Validation Result |
|---|---|---|
| demo-check | Environment, symbols, and eBPF validation | ✅ 92 uprobes / 7 TCP / 198 syscall events |
| demo-table | Default table mode | ✅ Real-time GET / PUT / TXN_COMMIT data |
| demo-histogram | Latency distribution histogram | ✅ Log2 buckets, visible bimodal GET distribution |
| demo-slow | Slow operation capture | ✅ RingBuf works, 0 loss |
| demo-normal | JSON monitoring output | ✅ JSONL format |
| demo-stress | db_bench stress injection | ✅ 125 slow operations, 0 loss |

4.3 env-check.sh

Checks six host-side prerequisites: kernel version, Docker, RAM, disk space, BTF, and BPF config.


5. Performance Optimization

5.1 P-2 Memory Optimization

Problem: perf_array.open(cpu_id, Some(1024)) allocated 1024 pages (4 MB) of perf ring buffer per CPU. With 24 CPUs, that meant 24 × 4 MB = 96 MB, and 87.9 MB RSS in practice, far exceeding the 50 MB budget.

Fix: Reduced it to 16 pages (64 KB) per CPU. Under an extreme workload of 13K events/sec, 64 KB still provides more than 15 ms of buffering headroom.

Result: RSS dropped from 87.9 MB to 21.9 MB and remained stable with no growth.
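The sizing math behind this fix can be sanity-checked directly. The event size below (256 bytes) is an assumption for illustration; the report does not state the actual event struct size:

```rust
// Sketch of the P-2 buffer-sizing arithmetic; event size is an assumption.
const PAGE_SIZE: usize = 4096;

fn buffer_bytes(pages: usize) -> usize {
    pages * PAGE_SIZE
}

/// Milliseconds of buffering headroom for one per-CPU ring, given an
/// event rate (events/sec) and an event size (bytes).
fn headroom_ms(pages: usize, events_per_sec: f64, event_bytes: f64) -> f64 {
    buffer_bytes(pages) as f64 / (events_per_sec * event_bytes) * 1000.0
}

fn main() {
    // Old: 1024 pages/CPU = 4 MB; with 24 CPUs that is 96 MB of rings alone.
    assert_eq!(buffer_bytes(1024), 4 * 1024 * 1024);
    // New: 16 pages/CPU = 64 KB.
    assert_eq!(buffer_bytes(16), 64 * 1024);
    // At 13K events/sec with an assumed 256-byte event, 64 KB still buffers
    // roughly 19-20 ms of events, above the ~15 ms figure quoted above.
    println!("{:.1}", headroom_ms(16, 13_000.0, 256.0)); // prints "19.7"
}
```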

5.2 BPF Map Capacity Optimization

Reduced max_entries from 10240 to 1024 for the three HashMaps: UPROBE_START, TCP_START, and PUT_PENDING_BYTES. Since CKB uses around 100 threads, 1024 still provides roughly 10× headroom.
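On the kernel side, the change amounts to the `max_entries` argument of the map declarations. A hedged aya-ebpf sketch (the key/value types below are assumptions; the report does not list them):

```rust
// Kernel-side sketch (aya-ebpf): the three maps named above at 1024 entries.
// Key/value types are illustrative assumptions.
use aya_ebpf::{macros::map, maps::HashMap};

#[map]
static UPROBE_START: HashMap<u64, u64> = HashMap::with_max_entries(1024, 0);

#[map]
static TCP_START: HashMap<u64, u64> = HashMap::with_max_entries(1024, 0);

#[map]
static PUT_PENDING_BYTES: HashMap<u64, u64> = HashMap::with_max_entries(1024, 0);
```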

5.3 Replacing PerfEventArray with RingBuf

SLOW_EVENTS was migrated from PerfEventArray to RingBuf (256 KB):

| | PerfEventArray | RingBuf |
|---|---|---|
| Buffer layout | Per-CPU (24 independent ring buffers) | One shared buffer across all CPUs |
| Wake-up method | One epoll wake-up per event | Userspace polling every 50 ms |
| Overhead at 10K events/sec | ~10K context switches | ~20 polls |
| CPU impact | +5% (threshold=1) | +1.16% (threshold=1000) |
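The wake-up arithmetic in the table follows from the two models: per-event wake-ups scale with the event rate, while fixed-interval polling does not. A trivial sanity check:

```rust
// Wake-up models behind the comparison table above.
fn perf_wakeups_per_sec(events_per_sec: u64) -> u64 {
    events_per_sec // one epoll wake-up per event
}

fn ringbuf_polls_per_sec(poll_interval_ms: u64) -> u64 {
    1000 / poll_interval_ms // independent of the event rate
}

fn main() {
    assert_eq!(perf_wakeups_per_sec(10_000), 10_000); // ~10K context switches
    assert_eq!(ringbuf_polls_per_sec(50), 20);        // the "~20 polls" in the table
}
```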

6. 48-Hour Collection Framework

6.1 stability-48h.sh

Three ckb-probe instances run in parallel:

| Instance | Mode | Collected Data | Resource Statistics |
|---|---|---|---|
| #1 (primary) | --json | OP_STATS / anomalies | ✅ Only this instance is counted for CPU/RSS |
| #2 | --slow --threshold 1000 | SLOW_EVENTS + BPF loss | Not counted |
| #3 | --histogram --interval 30 | Full LATENCY_HIST distribution | Not counted |

Sampling is performed every 10 seconds, producing the following output files:

| File | Contents |
|---|---|
| timeseries.tsv | Probe CPU% / RSS / CKB CPU% / RSS |
| events.tsv | Per-operation QPS / avg / P50 / P99 / bytes |
| tip-sync.tsv | CKB tip height / delta blocks / blocks per minute |
| event-loss.tsv | Total BPF events / lost events / loss rate |
| event-counts-by-op.tsv | Per-operation QPS for each interval |
| slow-events.log | Raw slow operation output |
| histogram.log | Raw latency histogram output |

6.2 generate-report.sh

Generates a Markdown report from the stability output directory, including:

  1. S-1 to S-4 evaluation table
  2. Time-series plots (gnuplot PNG or ASCII) — CPU / RSS / P99 / throughput / tip sync
  3. Resource usage summary table (Min / Max / Avg / P99)
  4. Event fidelity analysis (per-operation breakdown + BPF loss + sync speed)
  5. Latency distribution histogram (log2 buckets + CDF)
  6. Case 1: IBD write pattern
  7. Case 2: Compaction / anomaly spikes
  8. Reproduction instructions

6.3 Permission Checks

At startup, stability-48h.sh verifies root privileges, debugfs mount, and BTF availability. If not run as root, it fails immediately with a sudo hint.


7. S-4 Process Restart Recovery

Implemented in rocksdb.rs and validated on a real CKB node:

Monitoring PID 3310428 → CKB stopped
⚠ Target process (PID 3310428) exited. Waiting for CKB to restart...
✅ CKB restarted (new PID 673651). Reattaching probes...
Monitoring PID 673651 → data resumed seamlessly

Implementation logic:

  1. A background thread checks /proc/{pid} once per second
  2. If the process exits, release BPF resources
  3. Poll /proc/*/exe to find the same binary
  4. Once a new PID is found, reload the BPF ELF and reattach all uprobes
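Steps 1 and 3 can be sketched with only the standard library on Linux; the real implementation in rocksdb.rs additionally releases and reloads BPF resources (steps 2 and 4), which is omitted here:

```rust
// Minimal sketch of the liveness check and PID rediscovery described above.
use std::fs;
use std::path::Path;

/// Step 1: the process is alive while /proc/<pid> exists.
fn process_alive(pid: u32) -> bool {
    Path::new(&format!("/proc/{pid}")).exists()
}

/// Step 3: scan /proc/*/exe for a process running the given binary.
fn find_pid_by_binary(binary: &Path) -> Option<u32> {
    for entry in fs::read_dir("/proc").ok()? {
        let Ok(entry) = entry else { continue };
        // Numeric directory names under /proc are PIDs; skip everything else.
        let Some(pid) = entry
            .file_name()
            .to_str()
            .and_then(|s| s.parse::<u32>().ok())
        else {
            continue;
        };
        // /proc/<pid>/exe may be unreadable for other users' processes; ignore errors.
        if fs::read_link(entry.path().join("exe")).ok().as_deref() == Some(binary) {
            return Some(pid);
        }
    }
    None
}

fn main() {
    // Sanity check against our own process.
    assert!(process_alive(std::process::id()));
    let me = fs::read_link("/proc/self/exe").expect("Linux only");
    assert!(find_pid_by_binary(&me).is_some());
}
```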

8. CI Configuration

.github/workflows/ci.yml contains four jobs:

| Job | Trigger | Contents |
|---|---|---|
| build | push / PR | cargo xtask build-ebpf + cargo build --release |
| lint | push / PR | cargo fmt --check + cargo clippy -D warnings |
| scripts | push / PR | bash -n for all .sh files + shellcheck |
| ckb-compat | Every Monday 08:00 UTC | Download the latest binary from the nervosnetwork/ckb GitHub release page, run ckb-probe symbols to verify 5 core symbols, and automatically create an Issue on failure |

9. Deviations from the Original Plan

  1. Docker changed from dual-container to single-container
    The original plan used a docker-compose.yml dual-container topology (CKB node + ckb-probe sidecar). After evaluation, it was changed to a single-container approach: for case study scenarios, a single docker run command is enough, the PID namespace is automatically shared, and reviewer onboarding cost is minimized. A production sidecar mode can be implemented in a later version.

  2. The 48-hour collection module was implemented in shell scripts instead of a Rust --record module
    The original plan was to add a new Rust subcommand --record <dir> or a standalone collector binary. In practice, stability-48h.sh + generate-report.sh were adopted instead. Three parallel ckb-probe instances (--json, --slow, and --histogram) are used, while shell scripts collect /proc metrics and RPC tip data. The functionality is fully equivalent. The collected data formats (TSV/JSONL) can also be fed directly into gnuplot scripts for visualization.

  3. 48-hour stability test 🔄: scheduled to start this weekend, with data collection continuing into Week 6.


10. P-1 to P-4 Performance Test Results

This week, a strict comparison was completed using two fresh IBD runs.

Test method:

  • Executed inside a Docker container, using the RingBuf data path with threshold=1000μs
  • Phase A (with-probe) and Phase B (baseline) both started from tip = 20,447,628 (diff=0)
  • Each phase lasted 2 hours, fully covering both the IBD peak period (~31 min) and the steady-state period

Results:

════════════════════════════════════════════════════════════════════════════════
  ckb-probe · P-1~P-4 Performance Evaluation Report
  Generated: 2026-04-16 07:05
  Mode: Docker, RingBuf, threshold=1000μs
  Both Phase A and B started from tip=20447628 (diff=0)
  Environment: Linux 6.8.0-106-generic, 24 CPU, CKB testnet
════════════════════════════════════════════════════════════════════════════════

  The synchronization process can be divided into two periods:
    ① Peak period (0~31 min): locally cached block data is batch-written into RocksDB
       RocksDB operation density is extremely high, and CPU usage saturates multiple cores (~330%)
    ② Steady-state period (31 min~2 h): waiting for new blocks to be downloaded from the network and then written one by one
       RocksDB operations become sparse and are limited by network bandwidth

════════════════════════════════════════════════════════════════════════════════
  P-1   Additional CPU usage ≤ 3% (relative)
════════════════════════════════════════════════════════════════════════════════

  ① Peak period (0~31 min, local batch writes)
     baseline           : 329.96%
     with-probe         : 324.92%
     relative delta     : -1.53%
     → with-probe is slightly lower; probe overhead is within system noise

  ② Overall (full 2 h)
     baseline           : 130.58%
     with-probe         : 133.34%
     relative delta     : +2.11%

  P-1 budget: ≤ 3%
  status    : ✅ PASS

════════════════════════════════════════════════════════════════════════════════
  P-2   ckb-probe RSS ≤ 50 MB (continuous monitoring over 2 h)
════════════════════════════════════════════════════════════════════════════════

  samples : 1435   mean : 21.97 MB   max : 21.97 MB

  P-2 budget: ≤ 50 MB
  status    : ✅ PASS

════════════════════════════════════════════════════════════════════════════════
  P-3   BPF event loss rate < 0.1%
════════════════════════════════════════════════════════════════════════════════

  threshold=1000μs (this test): 0 / 78,353 attempted, 0.0000%
  threshold=1 extreme stress test (historical): 0 / 29,052,243 events, 0.0000%, peak ~13K/sec

  P-3 budget: < 0.1%
  status    : ✅ PASS

════════════════════════════════════════════════════════════════════════════════
  P-4   CKB sync speed degradation < 1%
════════════════════════════════════════════════════════════════════════════════

  ① Peak period (0~31 min, local batch writes)
     baseline           : 10827.3 blocks/min  (326,095 blocks)
     with-probe         : 10116.0 blocks/min  (304,781 blocks)
     degradation        : +6.57%
     → Peak-period degradation of 6.57% comes from microscopic interruption by uprobes
       to the CKB execution pipeline (cache locality, branch prediction), which does
       not show up directly in CPU utilization

  ② Overall (full 2 h)
     baseline           : 2800.4 blocks/min  (333,299 blocks)
     with-probe         : 2790.0 blocks/min  (332,057 blocks)
     degradation        : +0.37%
     → Overall 2 h degradation is only 0.37%, well below the 1% budget.
       Probe impact is negligible during the steady-state period.

  P-4 budget: < 1%
  status    : ✅ PASS (based on the full 2 h overall result of +0.37%)

════════════════════════════════════════════════════════════════════════════════
  Summary
════════════════════════════════════════════════════════════════════════════════

  ┌───────────┬──────────────────────────────────────┬────────┬────────┐
  │ Metric    │ Result                               │ Budget │ Status │
  ├───────────┼──────────────────────────────────────┼────────┼────────┤
  │ P-1 CPU   │ -1.53% peak / +2.11% overall 2 h     │ ≤ 3%   │ ✅     │
  │ P-2 RSS   │ 21.97 MB (stable, no growth)         │ ≤ 50MB │ ✅     │
  │ P-3 Loss  │ 0/78353 (0.0000%)                    │ <0.1%  │ ✅     │
  │ P-4 Deg.  │ +6.57% peak / +0.37% overall 2 h     │ < 1%   │ ✅     │
  └───────────┴──────────────────────────────────────┴────────┴────────┘

  All four metrics passed.
════════════════════════════════════════════════════════════════════════════════

Key findings:

  1. P-1 peak-period CPU came in at -1.53%: with-probe actually showed slightly lower CPU usage than baseline, indicating that probe overhead is within system noise. The Week 5 optimizations (perf buffer reduction from 96 MB to ~1.5 MB, the RingBuf refactor, and the BPF map capacity reduction) were highly effective.

  2. P-4 peak-period degradation of 6.57% vs. only 0.37% overall over 2 hours — During the peak period (~10K ops/sec), microscopic interruptions from uprobes affected the CKB pipeline efficiency through cache locality and branch prediction effects. However, this is only significant at extreme operation density. During steady-state operation, probe impact is negligible, and the overall 2-hour degradation of 0.37% is far below the 1% budget.

  3. P-2 RSS remained stable at 21.97 MB — Continuous monitoring over 2 hours showed no growth, confirming that the Week 5 memory optimization (87.9 MB → 22 MB) was effective.

  4. P-3 zero loss — At threshold=1000μs, 78K events were captured with zero loss; in the historical extreme stress test with threshold=1 (29M events at 13K/sec), loss remained zero as well.
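The percentage figures quoted above can be reproduced from the raw numbers in the report (a quick sanity check, not part of the tool):

```rust
// Cross-checking the P-1 / P-4 percentages from the raw figures.
fn relative_delta_pct(baseline: f64, with_probe: f64) -> f64 {
    (with_probe - baseline) / baseline * 100.0
}

fn main() {
    // P-1 overall CPU: 130.58% -> 133.34%
    println!("{:+.2}%", relative_delta_pct(130.58, 133.34)); // prints "+2.11%"
    // P-4 peak sync speed: 10827.3 -> 10116.0 blocks/min (6.57% degradation)
    println!("{:+.2}%", relative_delta_pct(10827.3, 10116.0)); // prints "-6.57%"
    // P-4 overall: 2800.4 -> 2790.0 blocks/min (0.37% degradation)
    println!("{:+.2}%", relative_delta_pct(2800.4, 2790.0)); // prints "-0.37%"
}
```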

10.1 Launch of the 48-Hour Stability Test

Immediately after the P-1 to P-4 tests, the 48-hour stability test was launched:

  • Command used (single-container, detached background mode):

    docker run -d --name stability-test \
      --privileged --pid host --network host \
      -v /sys/kernel/debug:/sys/kernel/debug:ro \
      -v /sys/kernel/btf:/sys/kernel/btf:ro \
      -v /root/ckb-testnet/ckb:/root/ckb-testnet/ckb:ro \
      -v /tmp/perf-run:/tmp/perf-run \
      -e CKB_BIN=/root/ckb-testnet/ckb \
      -e CKB_RPC=http://127.0.0.1:8124 \
      ckb-probe:latest stability
    
  • Test scope: S-1 (no crash for 48h) / S-2 (RSS growth ≤ 5 MB) / S-3 (no BPF warnings in dmesg) / S-4 (automatic reconnection after CKB restart at T+24h)

  • Sampling frequency: once every 10 seconds, for a total of 17,280 data points

  • Data will be processed by generate-report.sh into a complete stability report for Week 6


11. Next Steps

Week 6: Stability Testing and Case Analysis

  1. Wrap up the 48-hour stability test and organize the data — including time-series charts, resource usage summaries, event fidelity reports, and latency distribution plots
  2. Analyze two RocksDB diagnostic scenarios — IBD write pattern analysis and compaction latency spike capture, as the core case studies in the stability report

Week 7: Hardening and Final Preparation

  1. Global JSON output — ensure that all modes have a unified JSON output format with complete fields
  2. Prepare complete demo documentation — a structured written demo report (Markdown / PDF) covering the exact same five demo workflows as the video, with full terminal screenshots, key command explanations, and output interpretation for each step
  3. If the Week 6 48-hour stability test is not yet complete, finish the remaining work during this week

Week 8: Release and Project Closure

  1. Finalize bilingual documentation — final review of all documentation in both Chinese and English
  2. GitHub v0.1.0 Release — create tag, write release notes, and attach prebuilt binaries
  3. Final project report — organize all deliverables, acceptance checklist, and known limitations according to main_proj.md
  4. Community sharing — submit the final monthly report