# Benchmarks
Helix ships a JSON benchmark harness (`python -m benchmarks.api_benchmarks`), and CI continuously publishes its results. The goals are:
- Comparable runs: every payload is tagged with commit SHA, dataset provenance, CPU/threads, BLAS vendor, RNG seed, and per-case RSS stats (an illustrative payload shape follows this list).
- Drift detection: CI rejects PRs when a case slows down by more than 5% relative to `.bench/baseline.json`.
- Transparency: the latest measurements are summarized here, powered by the CSV in `docs/data/bench/history.csv` (populated automatically on `main`).
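The harness's JSON output is the canonical schema; purely to illustrate the tagging described above, a payload carries metadata along these lines (every field and case name here is an assumption, not the real schema):

```python
# Illustrative only: field names are assumptions based on the goals above,
# not the harness's actual JSON schema. Case names are made up.
example_payload = {
    "commit_sha": "0a1b2c3",
    "datasets": {"dna_fasta": "/data/dna.fa", "protein_fasta": "/data/protein.fa"},
    "machine": {"cpu": "x86_64", "threads": 8, "blas_vendor": "OpenBLAS"},
    "rng_seed": 1234,
    "cases": {
        "gc_content": {"mean_s": 0.182, "rss_peak_mb": 61.0},
        "translate": {"mean_s": 0.415, "rss_peak_mb": 74.5},
    },
}
```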
## Running locally
```bash
python -m benchmarks.api_benchmarks \
  --repeat 5 \
  --warmup 1 \
  --limit 0 \
  --out bench-results/api.json \
  --summary-md bench-results/api.md
```
- `--limit N` keeps only the first `N` nucleotides/aminos (`0` = entire dataset). Use this for quick inner-loop runs or to mimic CI’s 10k-sample sweep.
- Override datasets with `HELIX_BENCH_DNA_FASTA=/abs/path/...` and `HELIX_BENCH_PROTEIN_FASTA=/abs/path/...`. The harness records those paths in the JSON payload so dashboards can compare apples to apples.
- Pass `--baseline path/to/baseline.json` to compute Δ% vs. a stored run. `scripts/bench_check.py baseline current --threshold 5` is what CI uses to gate regressions; a minimal sketch of that comparison follows this list.
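`scripts/bench_check.py` is the source of truth for the gate; as a hedged sketch of the same idea, the check reduces to a per-case Δ% comparison (the payload layout is assumed to match the illustration above):

```python
import json
import sys

THRESHOLD_PCT = 5.0  # mirrors CI's --threshold 5

def check(baseline_path: str, current_path: str) -> int:
    """Return 1 if any case slowed down by more than THRESHOLD_PCT, else 0."""
    with open(baseline_path) as fh:
        baseline = json.load(fh)["cases"]
    with open(current_path) as fh:
        current = json.load(fh)["cases"]
    failed = False
    for name, cur in sorted(current.items()):
        base = baseline.get(name)
        if base is None:
            continue  # brand-new case: nothing to compare against yet
        delta_pct = 100.0 * (cur["mean_s"] - base["mean_s"]) / base["mean_s"]
        regressed = delta_pct > THRESHOLD_PCT
        print(f"{name}: {delta_pct:+.2f}%" + ("  <-- REGRESSION" if regressed else ""))
        failed = failed or regressed
    return 1 if failed else 0

if __name__ == "__main__":
    sys.exit(check(sys.argv[1], sys.argv[2]))
```

A non-zero exit code maps directly onto a failed CI step.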
## Heavy datasets via GitHub Actions
Trigger a manual heavy sweep from the Actions → CI → Run workflow button:
- Set `bench_heavy` to `true` (this bumps repeats to 10 and disables the 10k sampling limit).
- Optionally provide runner-accessible overrides for `dna_fasta`/`protein_fasta`. On hosted runners you typically leave these blank; on self-hosted boxes you can point at a mounted volume or fetcher script.
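If you would rather trigger the sweep from a script than the UI, GitHub's standard `workflow_dispatch` REST endpoint works; a sketch, where the repo slug and workflow filename are placeholders for your setup:

```python
import os
import requests

# Placeholders: point these at your fork/org and the actual workflow file.
REPO = "your-org/helix"
WORKFLOW = "ci.yml"

resp = requests.post(
    f"https://api.github.com/repos/{REPO}/actions/workflows/{WORKFLOW}/dispatches",
    headers={
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    },
    json={
        "ref": "main",
        # workflow_dispatch inputs are passed as strings.
        "inputs": {"bench_heavy": "true"},
    },
    timeout=30,
)
resp.raise_for_status()  # GitHub returns 204 No Content on success
```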
Each run publishes:
- `benchmarks/out/bench-<SHA>.json` — the full schema payload.
- `benchmarks/out/bench-<SHA>.md` — a Markdown table appended to the CI summary.
- `docs/data/bench/history.csv` (`main` branch only) — an append-only log that powers the chart below.
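CI owns the append step, but conceptually it just flattens each payload's per-case means into one CSV row. A rough sketch, reusing the assumed payload layout from above (the real column set may differ):

```python
import csv
import json
import os

def append_history(payload_path: str, csv_path: str) -> None:
    """Flatten one bench payload into a single history.csv row (sketch only)."""
    with open(payload_path) as fh:
        payload = json.load(fh)
    row = {"commit_sha": payload["commit_sha"]}
    for name, case in sorted(payload["cases"].items()):
        row[f"{name}.mean_s"] = case["mean_s"]
    new_file = not os.path.exists(csv_path)
    with open(csv_path, "a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(row))
        if new_file:
            writer.writeheader()
        writer.writerow(row)
```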
## Trend (mean seconds)
The gallery below visualizes every `*.mean_s` column recorded in `docs/data/bench/history.csv` and summarizes the latest run.
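To recompute or re-plot the trend offline, something along these lines works against the same CSV (a local convenience sketch, not the code behind the embedded chart):

```python
import pandas as pd
import matplotlib.pyplot as plt

history = pd.read_csv("docs/data/bench/history.csv")
mean_cols = [c for c in history.columns if c.endswith(".mean_s")]

# One line per benchmark case; rows are runs in append order.
history[mean_cols].plot(marker="o")
plt.ylabel("mean seconds")
plt.xlabel("run (append order)")
plt.tight_layout()
plt.savefig("bench-trend.png")
```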
CSV source: `docs/data/bench/history.csv`. Commit history contains the raw JSON artifacts under `benchmarks/out/` (uploaded by CI) if you need to recompute metrics offline.