← Docs
Helix CLI docs
Browse Helix CLI docs

CRISPR Off-target Panel Benchmark

This benchmark connects Helix's CRISPR off-target search + scoring engine to assay-like hit counts (e.g., CHANGE-seq / CIRCLE-seq).

  • Spec template: templates/crispr_offtarget_panel.helix.yml
  • Harness: benchmarks/crispr_offtarget_panel.py

Spec shape

The off-target panel config declares:

  • genome.fasta – digital DNA reference used for off-target search.
  • defaults – shared search parameters:
    • pam – name of a registered PAM (e.g. SpCas9-NGG).
    • max_mm, max_gap – mismatch / gap limits for enumeration.
  • guides[] – per-guide entries with:
    • id, sequence, pam override (optional).
    • search – optional per-guide search params.
    • assay – optional off-target results:
      • file – path to a TSV.
      • columns – mapping of column names for chrom, start, strand, read_count.

Running the benchmark

Example invocation:

python -m benchmarks.crispr_offtarget_panel \
  --config templates/crispr_offtarget_panel.helix.yml \
  --out bench-results/crispr_offtarget_panel.json

The harness:

  • Enumerates off-target candidates using helix.crispr.score.enumerate_off_targets over the panel's genome window.
  • Scores hits with helix.crispr.score.score_off_targets (default weight profile).
  • Loads assay hit counts, aggregates by (strand, start), and compares:
    • Per-guide metrics: predicted vs observed site counts and a Pearson correlation between predicted scores and log1p(read_count) over overlapping sites.
    • Panel-level summary: number of guides evaluated and mean Pearson correlation.

The JSON output (crispr_offtarget_panel schema) is designed for CI drift checks and for supporting methods-text statements about off-target risk calibration.