Helix validation
Scientific validation (planned)
What we plan to track for lab-facing accuracy checks:
- CRISPR outcome distributions vs small public datasets (frameshift %, del/ins mix).
- Prime editing predicted efficiencies / outcome profiles vs published assays.
- Basic metrics: correlation, KL divergence, frameshift accuracy, top-outcome match.
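The basic metrics listed above could be computed along these lines. This is a minimal sketch, not Helix's scoring code: it assumes outcome distributions are keyed by net indel length in bp (negative for deletions, positive for insertions, 0 for unedited), and it reads "frameshift accuracy" as the absolute error between observed and predicted frameshift fractions. All function names are hypothetical.

```python
import math
from typing import Dict

# ASSUMPTION: an outcome distribution maps net indel length (bp) to its
# frequency; negative = deletion, positive = insertion, 0 = unedited.
Dist = Dict[int, float]


def normalize(dist: Dist) -> Dist:
    """Scale frequencies so they sum to 1."""
    total = sum(dist.values())
    if total <= 0:
        raise ValueError("distribution must have positive total mass")
    return {k: v / total for k, v in dist.items()}


def kl_divergence(observed: Dist, predicted: Dist, eps: float = 1e-9) -> float:
    """KL(observed || predicted) in nats over the union of outcomes.

    Predicted probabilities are floored at eps so an outcome that was
    observed but never predicted yields a large finite penalty.
    """
    p, q = normalize(observed), normalize(predicted)
    total = 0.0
    for key in set(p) | set(q):
        pk = p.get(key, 0.0)
        if pk > 0:
            total += pk * math.log(pk / max(q.get(key, 0.0), eps))
    return total


def pearson(observed: Dist, predicted: Dist) -> float:
    """Pearson correlation of frequencies over the union of outcomes."""
    keys = sorted(set(observed) | set(predicted))
    x = [observed.get(k, 0.0) for k in keys]
    y = [predicted.get(k, 0.0) for k in keys]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")  # correlation is undefined for a flat profile
    return cov / math.sqrt(vx * vy)


def frameshift_fraction(dist: Dist) -> float:
    """Share of mass on indels whose length is not a multiple of 3."""
    return sum(v for k, v in normalize(dist).items() if k % 3 != 0)


def top_outcome_match(observed: Dist, predicted: Dist) -> bool:
    """True when both distributions share the same most frequent outcome."""
    return max(observed, key=observed.get) == max(predicted, key=predicted.get)


if __name__ == "__main__":
    # Made-up illustrative numbers, not measurements.
    observed = {0: 0.40, -1: 0.25, 1: 0.20, -3: 0.15}
    predicted = {0: 0.35, -1: 0.30, 1: 0.20, -3: 0.15}
    print("KL:", kl_divergence(observed, predicted))
    print("Pearson r:", pearson(observed, predicted))
    print("frameshift abs. error:",
          abs(frameshift_fraction(observed) - frameshift_fraction(predicted)))
    print("top outcome match:", top_outcome_match(observed, predicted))
```

Keying by indel length keeps the frameshift check a one-line modulo test; a real comparison would probably key on full allele descriptions and derive the length from them.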
Status (seed entries)
| Dataset | Modality | Metrics | Observed vs Predicted | Notes |
|---|---|---|---|---|
| BRCA1 exon demo (synthetic) | CRISPR | cut%, frameshift%, small_del vs small_ins | TODO: add numbers | Synthetic scaffold; replace with public figure once picked |
| peg RTT=GATTACA (synthetic) | PRIME | intended_prime%, no_cut%, frameshift% | TODO: add numbers | Synthetic; swap for a published peg when available |
Planned CLI (future):
helix-cli validate --dataset path/to/ground_truth.json --backend gpu → prints a metrics summary.
Notes:
- Helix scoring versions (CRISPR/Prime) are recorded in every run/report; use those versions when comparing results.
- Engine backend (GPU/CPU) is captured via EngineInfo for reproducibility.
Validation packs (shipped)
Helix also ships a pack runner that produces a single machine-parsable verdict artifact suitable for CI and archival.
- List packs:
  helix validate list
- Run a pack:
  helix validate pack ampseq_indel_profile_v1 --outdir out/validation --ci
- Sign the verdict:
  helix validate pack ampseq_indel_profile_v1 --outdir out/validation --sign-private-key publisher.ed25519
- Verdict schema:
  schemas/validation/helix_validation_verdict_v1.json
Shipped packs (Public Validation Pack v1):
| Pack | Demonstrates | Mode | Typical runtime | Key outputs (besides verdict.json) |
|---|---|---|---|---|
| ampseq_indel_profile_v1 | Paired-end FASTQ ➜ deterministic editing outcomes | Exact-golden | Seconds | artifacts/alleles.tsv, artifacts/top_alleles.tsv, artifacts/indel_hist.json, artifacts/summary.json |
| crisprresso_import_v1 | CRISPResso folder ➜ normalized artifacts | Exact-golden | Seconds | artifacts/editing_summary.tsv, artifacts/alleles.tsv, artifacts/metadata.json |
| bam_qc_coverage_v1 | BAM ➜ QC counts + coverage bins + per-region coverage | Exact-golden | Seconds | artifacts/summary.json, artifacts/coverage_hist.json, artifacts/region_coverage.tsv |
| vcf_qc_annot_v1 | VCF ➜ QC + lightweight annotation join | Exact-golden | Seconds | artifacts/qc_summary.json, artifacts/annotated_variants.tsv |
| signed_plugin_trust_chain_v1 | Signed plugin install/load + tamper control | Exact-golden | Seconds | artifacts/trust_chain.json |
Killer snippet (trust posture in 30 seconds):
helix validate pack signed_plugin_trust_chain_v1 --ci --explain
Negative controls (an intentional mismatch that still returns OK, with a crisp --explain message):
helix validate pack ampseq_indel_profile_v1 --case negative --ci --explain
Verify a signed verdict:
helix plugins keygen --out-dir out/keys --name publisher
helix validate pack ampseq_indel_profile_v1 --outdir out/validation --sign-private-key out/keys/publisher.ed25519
helix verify verdict out/validation/ampseq_indel_profile_v1/verdict.json --pubkey out/keys/publisher.pub
Verify a reference validation bundle (directory or .tar.gz):
helix verify bundle reference_validation_bundle/
helix verify bundle helix-reference-validation-bundle-v1.2.3-linux.tar.gz
Reference validation bundle contents:
- publisher.pub: public key for all verdict signatures in the bundle
- bundle.index.json: inventory of verdict files + sha256 + expected status
- verdicts/**/verdict.json: signed verdict artifacts (one per pack + case)
- verdicts/**/verdict.summary.txt: human-readable summaries
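To illustrate how the index relates to the verdict files, here is a rough sketch of an inventory audit. It is not a substitute for helix verify bundle and checks no signatures or digests. The JSON layout is an assumption: a top-level "verdicts" list whose entries carry "path" and "expected_status", and a top-level "status" field in each verdict.json. The real schema may differ, and audit_bundle_index is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import List


def audit_bundle_index(bundle_dir: Path) -> List[str]:
    """Cross-check bundle.index.json against the verdict files on disk.

    Returns a list of human-readable problems; an empty list means the
    inventory and the expected statuses are consistent.
    """
    index_path = bundle_dir / "bundle.index.json"
    index = json.loads(index_path.read_text(encoding="utf-8"))

    problems: List[str] = []
    listed = set()

    # ASSUMPTION: {"verdicts": [{"path": ..., "sha256": ..., "expected_status": ...}]}
    for entry in index["verdicts"]:
        rel = entry["path"]
        listed.add(rel)
        verdict_path = bundle_dir / rel
        if not verdict_path.is_file():
            problems.append(f"listed but missing: {rel}")
            continue
        verdict = json.loads(verdict_path.read_text(encoding="utf-8"))
        # ASSUMPTION: each verdict.json exposes a top-level "status" field.
        if verdict.get("status") != entry["expected_status"]:
            problems.append(f"status mismatch: {rel}")

    # Flag verdict files that exist on disk but are absent from the index.
    for path in sorted((bundle_dir / "verdicts").rglob("verdict.json")):
        rel = path.relative_to(bundle_dir).as_posix()
        if rel not in listed:
            problems.append(f"not in index: {rel}")

    return problems
```

Calling audit_bundle_index(Path("reference_validation_bundle")) on an extracted bundle would return an empty list when every inventoried verdict exists with its expected status and no stray verdict files are present.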
Machine summary output (useful for CI / Teams acceptance gates):
helix verify bundle helix-reference-validation-bundle-v1.2.3-linux.tar.gz --json-out verify_bundle.json
Pack fixtures are bundled under src/helix/datasets/validation_packs/ and each pack includes a manifest.json with per-file SHA-256.
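A per-file SHA-256 manifest can be checked independently with a few lines of standard-library Python. The sketch below assumes a manifest shaped like {"files": {"relative/path": "<hex digest>"}}; the actual manifest.json layout is not documented here, so treat the key names and the verify_manifest helper as hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large fixtures do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(pack_dir: Path) -> List[str]:
    """Compare every file listed in manifest.json with its recorded digest.

    Returns a list of problems; an empty list means every file matched.
    """
    manifest_path = pack_dir / "manifest.json"
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))

    problems: List[str] = []
    # ASSUMPTION: digests live under a "files" mapping of path -> hex sha256.
    for rel, expected in manifest["files"].items():
        target = pack_dir / rel
        if not target.is_file():
            problems.append(f"missing: {rel}")
        elif sha256_file(target) != expected.lower():
            problems.append(f"sha256 mismatch: {rel}")
    return problems
```

For example, verify_manifest(Path("src/helix/datasets/validation_packs/ampseq_indel_profile_v1")) would report any fixture whose bytes no longer match the recorded digest, provided the pack directory is named after the pack and uses the assumed layout.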
UI layout snapshots
Studio layout regressions are tracked via tools/snap_layout.py, which writes artifacts/layout/*.png at common resolutions. The CI workflow .github/workflows/ui-layout.yml runs this script on PRs that touch the Studio UI; reviewers can open the uploaded PNGs to sanity-check proportions and docking.