Kernel Regression Bisection
Fastpath provides automated kernel regression bisection to identify which specific kernel commit introduced a performance regression. This uses Git bisection integrated with Fastpath’s benchmarking and analysis capabilities.
Overview
The bisection process:
Prepare - Identify good/bad kernels and create bisection context (fastpath bisect start)
Execute - Git bisect automatically builds, tests, and evaluates each commit (git bisect run)
Analyze - Git bisect identifies the first bad commit (git bisect log)
Prerequisites
A result store with benchmark results from both:
Good swprofile - Known working version with acceptable performance
Bad swprofile - Version showing the regression
Both swprofiles must have been tested with:
Same SUT (System Under Test)
Same benchmark
Identical configuration (cmdline, sysctl, bootscript)
Only difference should be the kernel git SHA
SSH access to the SUT for running tests
Kernel source repository for building commits
Prepare Context
Create a bisection context file with fastpath bisect start:
# Basic command structure
fastpath bisect start \
--host <hostname> --user <user> --port <port> --keyfile <keyfile> \
--sut <sut-id> \
--good-swprofile <good-id> --bad-swprofile <bad-id> \
--benchmark <suite/name> --resultclass <metric> \
--resultstore <url> --context <output.yaml>
Example
fastpath bisect start \
--host test-server --user root --keyfile ~/.ssh/id_rsa \
--sut "Ampere Altra Max" \
--good-swprofile "6.8.0-baseline" --bad-swprofile "6.9.0-regression" \
--benchmark "sysbench/thread" --resultclass "sysbenchthread-110" \
--context ./bisection_context.yaml
This validates that baseline results exist, that the two profiles match (except for kernel_git_sha), and that the SUT is single-node. It then creates a temporary result store containing copies of the baselines and generates bisection_context.yaml.
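The profile-match check can be sketched as comparing the two swprofiles with the kernel_git_sha field masked out. This is an illustrative sketch, not Fastpath's actual implementation; field names follow the context file shown below.

```python
# Illustrative sketch of the validation step: the good and bad swprofiles
# must be identical except for their kernel_git_sha field.
def profiles_match(good: dict, bad: dict) -> bool:
    mask = lambda profile: {k: v for k, v in profile.items()
                            if k != "kernel_git_sha"}
    return mask(good) == mask(bad)

good = {"cmdline": ["quiet"], "sysctl": [], "kernel_git_sha": "a1b2c3"}
bad  = {"cmdline": ["quiet"], "sysctl": [], "kernel_git_sha": "f6e5d4"}
print(profiles_match(good, bad))  # True: only the kernel SHA differs
```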
Note
Result Store Isolation: Original result store remains read-only. A temporary store holds baseline copies plus new bisection results.
To configure additional sampling when the initial 1/1/1 run is ambiguous, use
--warmups, --repeats, and --sessions. These are stored in the plan
but only used by bisect for the follow-up pass if needed.
Generated Context File:
The bisection_context.yaml contains:
Test plan (SUT connection, benchmark config, shared profile fields)
Baseline profile names and kernel git SHAs
Resultclass for performance evaluation
Result store paths (original and temporary)
Benchmark run counts for any follow-up collection
plan:
sut:
name: "Ampere Altra Max"
connection:
method: SSH
params: {...}
swprofiles:
- cmdline: [...]
sysctl: []
bootscript: []
benchmarks:
- suite: sysbench
name: thread
...
defaults:
benchmark:
warmups: 1 # used only if initial 1/1/1 run is ambiguous
repeats: 2
sessions: 2
good-swprofile: "6.8.0-baseline"
good_sha: "a1b2c3d4e5f6"
bad-swprofile: "6.9.0-regression"
bad_sha: "f6e5d4c3b2a1"
resultclass: "sysbenchthread-110"
resultstore: "mysql://..."
output-resultstore: "/tmp/bisect-resultstore-abc123/"
Swprofile Naming:
During bisection, swprofiles are named as follows:
Baseline swprofiles: Retain their original names from the resultstore (e.g., 6.8.0-baseline, 6.9.0-regression)
Test swprofiles: Initially named bisect-<sha>, where <sha> is the first 12 characters of the kernel git SHA being tested (e.g., bisect-a1b2c3d4e5f6). After evaluation, they are relabeled to the <iteration>-<verdict>-<sha> format (e.g., 01-b-a1b2c3d4 for iteration 1, bad verdict, 8-char SHA). Verdict codes: g (good), b (bad), s (skip), e (error).
This naming scheme makes it easy to identify bisection runs in the result store, track which kernel commit each test corresponds to, and see the outcome at a glance.
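The naming rules above can be sketched as two small helpers (a hedged illustration; the actual relabeling happens inside Fastpath, and these function names are not part of its API):

```python
# Sketch of the swprofile naming scheme described above.
VERDICT_CODES = {"good": "g", "bad": "b", "skip": "s", "error": "e"}

def initial_name(sha: str) -> str:
    # Test swprofiles start as bisect-<sha>, using the first 12 SHA chars.
    return f"bisect-{sha[:12]}"

def relabeled_name(iteration: int, verdict: str, sha: str) -> str:
    # After evaluation: <iteration>-<verdict>-<sha>, with an 8-char SHA.
    return f"{iteration:02d}-{VERDICT_CODES[verdict]}-{sha[:8]}"

print(initial_name("a1b2c3d4e5f6a7b8"))          # bisect-a1b2c3d4e5f6
print(relabeled_name(1, "bad", "a1b2c3d4e5f6"))  # 01-b-a1b2c3d4
```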
Run Bisection
Run automated bisection in your kernel source repository:
cd /path/to/kernel/source
# Extract SHAs and start git bisect
export GOOD_SHA=$(python3 -c "import yaml; print(yaml.safe_load(open('bisection_context.yaml'))['good_sha'])")
export BAD_SHA=$(python3 -c "import yaml; print(yaml.safe_load(open('bisection_context.yaml'))['bad_sha'])")
git bisect start
git bisect good $GOOD_SHA
git bisect bad $BAD_SHA
# Run automated bisection
git bisect run /path/to/fastpath/scripts/execute_bisection.sh \
/path/to/bisection_context.yaml
The bisection script will:
Build the kernel for each commit tested
Create a unique swprofile named bisect-<sha> (first 12 chars of the SHA)
Execute the benchmark on the SUT
Compare results against good/bad baselines using confidence intervals
Report to git bisect: GOOD (0), BAD (1), SKIP (125), or ERROR (128)
Git will automatically test commits until it identifies the first bad commit.
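Because git bisect halves the candidate range at each step, the number of build-and-test cycles grows only logarithmically with the number of commits between the good and bad SHAs. A rough cost estimate:

```python
# Rough estimate: git bisect needs about log2(N) test runs to pinpoint
# the first bad commit among N candidate commits.
import math

def bisect_steps(n_commits: int) -> int:
    return math.ceil(math.log2(n_commits)) if n_commits > 1 else 1

print(bisect_steps(1024))  # 10 test runs for ~1000 commits
```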
For more control over each bisection step:
# Manual bisection loop
cd /path/to/kernel/source
# Extract SHAs and start git bisect
export GOOD_SHA=$(python3 -c "import yaml; print(yaml.safe_load(open('bisection_context.yaml'))['good_sha'])")
export BAD_SHA=$(python3 -c "import yaml; print(yaml.safe_load(open('bisection_context.yaml'))['bad_sha'])")
git bisect start
git bisect good $GOOD_SHA
git bisect bad $BAD_SHA
# For each commit picked by git bisect, repeat until done:
# 1. Build the kernel
./scripts/build_local_kernel.sh
source ./scripts/.env
# 2. Test and evaluate
fastpath bisect run \
--context bisection_context.yaml \
--kernel $KERNEL_PATH \
--modules $MODULES_PATH \
--gitsha $GITSHA
# 3. Mark commit based on exit code (0=good, 1=bad, 125=skip)
git bisect good # if exit code is 0
git bisect bad # if exit code is 1
git bisect skip # if exit code is 125
# Git bisect picks next commit and repeats until first bad commit found
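The exit-code handling in step 3 can be expressed as a small lookup. This is a sketch under the exit-code convention documented above; any code outside 0/1/125 (such as 128) means a fatal environment error and the bisection should be aborted.

```python
# Map fastpath bisect run's exit code to the matching git bisect command:
# 0 -> good, 1 -> bad, 125 -> skip; anything else aborts the bisection.
ACTION = {0: "good", 1: "bad", 125: "skip"}

def git_bisect_command(exit_code: int) -> str:
    if exit_code not in ACTION:
        raise RuntimeError(f"fatal error (exit {exit_code}); abort bisection")
    return f"git bisect {ACTION[exit_code]}"

print(git_bisect_command(125))  # git bisect skip
```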
Bisection Output
During each bisection step:
Building kernel for current commit...
Executing plan.yaml...
Result: REGRESSION detected for resultclass 'sysbenchthread-110'
comparing 'bisect-a1b2c3d4e5f6' vs '6.8.0-baseline'.
fastpath bisect run exited with status 1
Final bisection result:
a1b2c3d4e5f6 is the first bad commit
commit a1b2c3d4e5f6
Author: Developer Name <dev@example.com>
Date: Mon Nov 1 10:00:00 2025 +0000
Subject line of the problematic commit
Understanding Results
Each commit is tested and classified by comparing its results against the good baseline:
GOOD (0): Performance matches or exceeds good baseline
BAD (1): Performance regression detected
SKIP (125): Overlapping confidence intervals
ERROR (128): Fatal error, abort bisection (environment/infrastructure failure)
Adaptive Testing: Tests with 1 sample initially (1/1/1). If inconclusive (gap <2× the 99% confidence interval width), runs additional sampling using the configured warmups, repeats, and sessions (defaults: 1/2/2, yielding 4 more samples for 5 total).
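The adaptive rule above can be sketched as a simple predicate. This is illustrative only; the exact statistics are internal to Fastpath, and the function name is not part of its API.

```python
# Illustrative sketch: the initial 1/1/1 run is conclusive only when the
# gap between the measurement and the baseline is at least twice the width
# of the 99% confidence interval; otherwise 4 more samples are collected.
def needs_more_samples(gap: float, ci_width_99: float) -> bool:
    return gap < 2 * ci_width_99

print(needs_more_samples(gap=5.0, ci_width_99=4.0))   # True: run follow-up pass
print(needs_more_samples(gap=10.0, ci_width_99=4.0))  # False: verdict stands
```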
Limitations: Single-node SUTs only.