HelixSpec MVP v0.1
Goal
A typed, deterministic DSL for genome editing that compiles into a canonical IR
(helixspec_ir_v1) and a verifiable manifest (helixspec_manifest_v1).
Scope (v0.1)
- One file -> one IR document (no sweeps/templates yet).
- Supported edits: cut, prime, base.
- Explicit, typed fields only. No general expressions.
Grammar (high level)
    version 1
    genome <build>
    sample "name" { ploidy: <int> }
    locus NAME = chrN:start-end(+|-)
    cut edit NAME { nuclease ... guide ... target ... repair ... }
    prime edit NAME { nuclease ... pegRNA ... target ... }
    base edit NAME { editor ... guide ... edit ... target ... }
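As an illustration of the locus form chrN:start-end(+|-), here is a minimal Python sketch that splits one locus declaration into typed fields. The identifier rule for NAME, the character set allowed after "chr", the whitespace handling, and the names Locus and parse_locus are all assumptions made for this example; they are not taken from the HelixSpec implementation.

```python
import re
from dataclasses import dataclass

# Assumed concrete syntax: `locus NAME = chrN:start-end` followed by + or -.
_LOCUS_DECL = re.compile(
    r"locus\s+([A-Za-z_][A-Za-z0-9_]*)\s*=\s*"
    r"(chr[0-9A-Za-z]+):([0-9]+)-([0-9]+)([+-])"
)


@dataclass(frozen=True)
class Locus:
    name: str
    chrom: str
    start: int
    end: int
    strand: str


def parse_locus(line: str) -> Locus:
    """Parse one locus declaration; raise ValueError on a syntax mismatch."""
    m = _LOCUS_DECL.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"not a locus declaration: {line!r}")
    name, chrom, start, end, strand = m.groups()
    return Locus(name, chrom, int(start), int(end), strand)
```

For example, parse_locus("locus SITE_A = chr1:100-200+") yields a Locus with start 100, end 200, and strand "+". The sketch covers syntax only and does not apply any range checks.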
Type rules
- DNA sequences: A/C/G/T only (uppercased).
- PAM patterns: IUPAC set A/C/G/T/N/R/Y/S/W/K/M/B/D/H/V.
- Locus start/end are 1-based, start <= end.
- ref(locus, offset) resolves to pos_1based = locus.start + offset and must fall inside the locus.
- pegRNA constraints: PBS length 1-20, RTT length 1-80.
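The type rules above can be sketched as a few small Python checks. This is a hedged illustration, not the compiler's code: the function names are hypothetical, "uppercased" is read as "input is uppercased before validation" (and applied to PAM patterns too), and "inside the locus" is read as start <= pos_1based <= end with both ends inclusive.

```python
import re

_DNA = re.compile(r"[ACGT]+")
_PAM = re.compile(r"[ACGTNRYSWKMBDHV]+")  # IUPAC codes listed in the type rules


def normalize_dna(seq: str) -> str:
    """Uppercase a DNA literal (assumed behavior) and require A/C/G/T only."""
    seq = seq.upper()
    if not _DNA.fullmatch(seq):
        raise ValueError(f"invalid DNA sequence: {seq!r}")
    return seq


def normalize_pam(pattern: str) -> str:
    """Uppercase a PAM pattern (assumed behavior) and require IUPAC codes."""
    pattern = pattern.upper()
    if not _PAM.fullmatch(pattern):
        raise ValueError(f"invalid PAM pattern: {pattern!r}")
    return pattern


def resolve_ref(start: int, end: int, offset: int) -> int:
    """Resolve ref(locus, offset) to a 1-based position inside the locus."""
    if start < 1 or start > end:
        raise ValueError("locus must be 1-based with start <= end")
    pos_1based = start + offset
    # Assumption: both locus ends are inclusive.
    if not start <= pos_1based <= end:
        raise ValueError(f"position {pos_1based} is outside {start}-{end}")
    return pos_1based


def check_pegrna(pbs: str, rtt: str) -> None:
    """Enforce the PBS (1-20) and RTT (1-80) length limits."""
    if not 1 <= len(pbs) <= 20:
        raise ValueError("PBS length must be 1-20")
    if not 1 <= len(rtt) <= 80:
        raise ValueError("RTT length must be 1-80")
```

With a locus spanning 100-200, resolve_ref(100, 200, 5) returns 105, while an offset of 101 or -1 is rejected.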
Canonicalization
- All strings normalized (Unicode NFC + LF newlines).
- Lists are ordered deterministically (loci by name, edits by edit_id).
- Canonical JSON uses sorted keys and stable formatting.
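One way to realize these canonicalization rules is sketched below in Python. Several details are assumptions for illustration only: the top-level keys "loci" and "edits", the "name" field on loci, compact separators as the "stable formatting", UTF-8 output, and SHA-256 as the IR hash. Only edit_id and the rules themselves come from this document.

```python
import hashlib
import json
import unicodedata


def normalize_strings(value):
    """Recursively apply LF newlines and Unicode NFC to every string."""
    if isinstance(value, str):
        text = value.replace("\r\n", "\n").replace("\r", "\n")
        return unicodedata.normalize("NFC", text)
    if isinstance(value, list):
        return [normalize_strings(v) for v in value]
    if isinstance(value, dict):
        return {normalize_strings(k): normalize_strings(v) for k, v in value.items()}
    return value


def order_lists(ir: dict) -> dict:
    """Sort loci by name and edits by edit_id (key names are assumed)."""
    out = dict(ir)
    if "loci" in out:
        out["loci"] = sorted(out["loci"], key=lambda locus: locus["name"])
    if "edits" in out:
        out["edits"] = sorted(out["edits"], key=lambda edit: edit["edit_id"])
    return out


def canonical_json(ir: dict) -> bytes:
    """Serialize with sorted keys and fixed separators, encoded as UTF-8."""
    doc = order_lists(normalize_strings(ir))
    text = json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def ir_hash(ir: dict) -> str:
    """Hex SHA-256 of the canonical bytes (hash choice is an assumption)."""
    return hashlib.sha256(canonical_json(ir)).hexdigest()
```

Under these assumptions, two IR documents that differ only in key order, list order, newline style, or Unicode composition produce the same bytes and therefore the same hash.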
CLI
- helix compile spec.hxspec outdir
  - Writes ir.json, manifest.json, SHA256SUMS.txt.
  - --deterministic pins created_at_utc for reproducible outputs.
- helix verify <outdir|manifest.json>
  - Recomputes IR hash + manifest identity hash and checks SHA256SUMS.
- helix diff a_manifest b_manifest
  - Semantic IR diff (edit-level changes).
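The SHA256SUMS part of the verify step can be approximated outside the CLI. The Python sketch below assumes SHA256SUMS.txt uses the common sha256sum layout (hex digest, separator, file name relative to the output directory); the document does not state the file format, and verify_sums is a hypothetical helper, not part of helix. It does not recompute the IR hash or the manifest identity hash, since the manifest layout is not described here.

```python
import hashlib
from pathlib import Path


def verify_sums(outdir: str, sums_name: str = "SHA256SUMS.txt") -> list[str]:
    """Return the names of listed files that are missing or whose hash differs."""
    root = Path(outdir)
    failures = []
    for line in (root / sums_name).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Assumed line layout: "<hex digest>  <name>" or "<hex digest> *<name>".
        expected, _, name = line.partition(" ")
        name = name.lstrip(" *")
        path = root / name
        if not path.is_file():
            failures.append(name)
            continue
        actual = hashlib.sha256(path.read_bytes()).hexdigest()
        if actual != expected.lower():
            failures.append(name)
    return failures
```

An empty list means every listed file matched its recorded digest; any returned name points at a file that was changed or removed after compilation.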
Non-goals (v0.1)
- No execution (helix run) yet.
- No sweeps/templates.
- No reference sequence materialization.
- No GUI integration beyond DSL compilation.