CRISPR Efficiency Panel Benchmark
This benchmark ties Helix's CRISPR on-target scoring model to observed edited fractions for a panel of endogenous-like sites.
- Spec template: `templates/crispr_efficiency_panel.helix.yml`
- Harness: `benchmarks/crispr_efficiency_panel.py`
Spec shape
The efficiency panel config declares:
- Global `cas` settings (Cas system type, PAM pattern, mismatch weights).
- A `targets` list where each entry defines:
  - `reference_sequence` – digital DNA window for the locus.
  - `guide` – guide `id`, `sequence`, and optional `strand`.
  - `measurements.edited_fraction` – either:
    - `value`: inline edited fraction for quick tests, or
    - `file` + `column`: path to a TSV with an edited fraction column.
- Optional `chromatin_features` (ATAC scores, histone marks, etc.) passed through to the JSON output.
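The two forms of `measurements.edited_fraction` can be illustrated with a small Python sketch. It is not the harness's actual loader: the helper name `resolve_edited_fraction`, the nesting of the example entry, the example sequences, and the choice to average multiple TSV rows are all assumptions made for illustration.

```python
import csv
from pathlib import Path
from statistics import fmean

# Hypothetical target entry mirroring the fields described above; the exact
# nesting and the example values are assumptions, not the shipped template.
EXAMPLE_TARGET = {
    "reference_sequence": "ACGTACGTACGTACGTACGTAGG",
    "guide": {"id": "g1", "sequence": "ACGTACGTACGTACGTACGT", "strand": "+"},
    "measurements": {"edited_fraction": {"value": 0.42}},
}


def resolve_edited_fraction(spec, base_dir="."):
    """Return the observed edited fraction for one target entry.

    `spec` is the mapping under `measurements.edited_fraction`; it carries
    either an inline `value` or a `file` + `column` pair pointing at a TSV.
    """
    if "value" in spec:
        return float(spec["value"])
    if "file" in spec and "column" in spec:
        path = Path(base_dir) / spec["file"]
        with path.open(newline="") as handle:
            rows = csv.DictReader(handle, delimiter="\t")
            values = [float(row[spec["column"]]) for row in rows]
        if not values:
            raise ValueError(f"no data rows found in {path}")
        # Assumption: multiple rows are treated as replicates and averaged.
        return fmean(values)
    raise ValueError("edited_fraction needs 'value' or 'file' + 'column'")
```

Calling `resolve_edited_fraction(EXAMPLE_TARGET["measurements"]["edited_fraction"])` returns the inline value, so quick tests need no files on disk.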
Running the benchmark
Example invocation:
python -m benchmarks.crispr_efficiency_panel \
--config templates/crispr_efficiency_panel.helix.yml \
--out bench-results/crispr_efficiency_panel.json
The harness:
- Builds a `CasSystem` + `GuideRNA` from the panel's `cas` and per-target `guide`.
- Uses `helix.crispr.simulator.predict_efficiency_for_targets` (the public CRISPR engine batch API) to obtain physics-based on-target scores for each reference window.
- Compares predicted scores to observed edited fractions and reports:
  - Per-target status and absolute error.
  - Panel-level mean absolute error (MAE) and Pearson correlation (when at least two observations are available).
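The comparison step can be sketched in plain Python as follows. This is a minimal illustration of the reported metrics, not the harness's own code: the function name `panel_metrics` and the keys of the returned dictionary are hypothetical, and returning `None` for the correlation when either series has zero variance is an assumption about how that edge case might be handled.

```python
from math import sqrt


def panel_metrics(predicted, observed):
    """Compute per-target absolute errors, panel MAE, and Pearson correlation."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    n = len(predicted)
    if n == 0:
        return {"abs_errors": [], "mae": None, "pearson_r": None}

    abs_errors = [abs(p - o) for p, o in zip(predicted, observed)]
    mae = sum(abs_errors) / n

    pearson_r = None
    if n >= 2:  # correlation is only defined for two or more observations
        mean_p = sum(predicted) / n
        mean_o = sum(observed) / n
        cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
        var_p = sum((p - mean_p) ** 2 for p in predicted)
        var_o = sum((o - mean_o) ** 2 for o in observed)
        # Assumption: leave the correlation unset when either series is constant.
        if var_p > 0 and var_o > 0:
            pearson_r = cov / sqrt(var_p * var_o)

    return {"abs_errors": abs_errors, "mae": mae, "pearson_r": pearson_r}
```

For example, `panel_metrics([0.1, 0.5, 0.9], [0.2, 0.5, 0.8])` yields a correlation close to 1 because the two series are linearly related, while a single-target panel reports an MAE but no correlation.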
The JSON output (`crispr_efficiency_panel` schema) is suitable for CI drift tracking or methods-text calibration summaries.