# Benchmarks
Helix ships a JSON benchmark harness (`python -m benchmarks.api_benchmarks`), and CI continuously publishes its results. The goals are:
- Comparable runs: every payload is tagged with commit SHA, dataset provenance, CPU/threads, BLAS vendor, RNG seed, and per-case RSS stats (an illustrative payload shape follows this list).
- Drift detection: CI rejects PRs when a case slows down by more than 5% relative to `.bench/baseline.json`.
- Transparency: the latest measurements are summarized here, powered by the CSV in `docs/data/bench/history.csv` (populated automatically on `main`).
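The harness's JSON output is the canonical schema; purely to illustrate the tagging described above, a payload carries metadata along these lines (every field and case name here is an assumption, not the real schema):

```python
# Illustrative only: field names are assumptions based on the goals above,
# not the harness's actual JSON schema. Case names are made up.
example_payload = {
    "commit_sha": "0a1b2c3",
    "datasets": {"dna_fasta": "/data/dna.fa", "protein_fasta": "/data/protein.fa"},
    "machine": {"cpu": "x86_64", "threads": 8, "blas_vendor": "OpenBLAS"},
    "rng_seed": 1234,
    "cases": {
        "gc_content": {"mean_s": 0.182, "rss_peak_mb": 61.0},
        "translate": {"mean_s": 0.415, "rss_peak_mb": 74.5},
    },
}
```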
## Running locally
```bash
python -m benchmarks.api_benchmarks \
  --repeat 5 \
  --warmup 1 \
  --limit 0 \
  --out bench-results/api.json \
  --summary-md bench-results/api.md
```
- `--limit N` keeps only the first `N` nucleotides/aminos (`0` = entire dataset). Use this for quick inner-loop runs or to mimic CI’s 10k-sample sweep.
- Override datasets with `HELIX_BENCH_DNA_FASTA=/abs/path/...` and `HELIX_BENCH_PROTEIN_FASTA=/abs/path/...`. The harness records those paths in the JSON payload so dashboards can compare apples to apples.
- Pass `--baseline path/to/baseline.json` to compute Δ% vs. a stored run. `scripts/bench_check.py baseline current --threshold 5` is what CI uses to gate regressions; a minimal sketch of that comparison follows this list.
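`scripts/bench_check.py` is the source of truth for the gate; as a hedged sketch of the same idea, the check reduces to a per-case Δ% comparison (the payload layout is assumed to match the illustration above):

```python
import json
import sys

THRESHOLD_PCT = 5.0  # mirrors CI's --threshold 5

def check(baseline_path: str, current_path: str) -> int:
    """Return 1 if any case slowed down by more than THRESHOLD_PCT, else 0."""
    with open(baseline_path) as fh:
        baseline = json.load(fh)["cases"]
    with open(current_path) as fh:
        current = json.load(fh)["cases"]
    failed = False
    for name, cur in sorted(current.items()):
        base = baseline.get(name)
        if base is None:
            continue  # brand-new case: nothing to compare against yet
        delta_pct = 100.0 * (cur["mean_s"] - base["mean_s"]) / base["mean_s"]
        regressed = delta_pct > THRESHOLD_PCT
        print(f"{name}: {delta_pct:+.2f}%" + ("  <-- REGRESSION" if regressed else ""))
        failed = failed or regressed
    return 1 if failed else 0

if __name__ == "__main__":
    sys.exit(check(sys.argv[1], sys.argv[2]))
```

A non-zero exit code maps directly onto a failed CI step.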
## Heavy datasets via GitHub Actions
Trigger a manual heavy sweep from the Actions → CI → Run workflow button:
- Set `bench_heavy` to `true` (this bumps repeats to 10 and disables the 10k sampling limit).
- Optionally provide runner-accessible overrides for `dna_fasta`/`protein_fasta`. On hosted runners you typically leave these blank; on self-hosted boxes you can point at a mounted volume or fetcher script.
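If you would rather trigger the sweep from a script than the UI, GitHub's standard `workflow_dispatch` REST endpoint works; a sketch, where the repo slug and workflow filename are placeholders for your setup:

```python
import os
import requests

# Placeholders: point these at your fork/org and the actual workflow file.
REPO = "your-org/helix"
WORKFLOW = "ci.yml"

resp = requests.post(
    f"https://api.github.com/repos/{REPO}/actions/workflows/{WORKFLOW}/dispatches",
    headers={
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    },
    json={
        "ref": "main",
        # workflow_dispatch inputs are passed as strings.
        "inputs": {"bench_heavy": "true"},
    },
    timeout=30,
)
resp.raise_for_status()  # GitHub returns 204 No Content on success
```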
Each run publishes:
- `benchmarks/out/bench-<SHA>.json` — the full schema payload.
- `benchmarks/out/bench-<SHA>.md` — a Markdown table appended to the CI summary.
- `docs/data/bench/history.csv` (`main` branch only) — an append-only log that powers the chart below.
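CI owns the append step, but conceptually it just flattens each payload's per-case means into one CSV row. A rough sketch, reusing the assumed payload layout from above (the real column set may differ):

```python
import csv
import json
import os

def append_history(payload_path: str, csv_path: str) -> None:
    """Flatten one bench payload into a single history.csv row (sketch only)."""
    with open(payload_path) as fh:
        payload = json.load(fh)
    row = {"commit_sha": payload["commit_sha"]}
    for name, case in sorted(payload["cases"].items()):
        row[f"{name}.mean_s"] = case["mean_s"]
    new_file = not os.path.exists(csv_path)
    with open(csv_path, "a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(row))
        if new_file:
            writer.writeheader()
        writer.writerow(row)
```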
## Trend (mean seconds)
The gallery below visualizes every `*.mean_s` column recorded in `docs/data/bench/history.csv` and summarizes the latest run.
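To recompute or re-plot the trend offline, something along these lines works against the same CSV (a local convenience sketch, not the code behind the embedded chart):

```python
import pandas as pd
import matplotlib.pyplot as plt

history = pd.read_csv("docs/data/bench/history.csv")
mean_cols = [c for c in history.columns if c.endswith(".mean_s")]

# One line per benchmark case; rows are runs in append order.
history[mean_cols].plot(marker="o")
plt.ylabel("mean seconds")
plt.xlabel("run (append order)")
plt.tight_layout()
plt.savefig("bench-trend.png")
```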
CSV source: `docs/data/bench/history.csv`. Commit history contains the raw JSON artifacts under `benchmarks/out/` (uploaded by CI) if you need to recompute metrics offline.