CRISPR CTMC Tau-Leap (GPU-friendly)
Pure Gillespie (exact CTMC event-by-event) is accurate but divergence-heavy on GPUs: every cell takes variable time steps and branches differently. Helix’s tau-leap CTMC path uses fixed-Δt hazard stepping so the core update is SIMD-friendly and maps cleanly to CUDA.
What’s implemented
- CPU reference simulator: src/helix/crispr/ctmc_tau_leap.py
- Config schema (JSON Schema draft 2020-12): src/helix/schema/simulation/helix_ctmc_tau_leap_v1.json and schemas/simulation/helix_ctmc_tau_leap_v1.json
- CLI hook: helix crispr tau-leap --config <yaml/json> --json <out.json>
Core math (tau-leap hazard stepping)
For hazard rate λ over timestep Δt:
p(event in Δt) = 1 - exp(-λΔt)
For competing hazards {λ_i}:
Λ = Σ λ_i
- sample p(any) = 1 - exp(-ΛΔt)
- if any event happens, pick event i with probability λ_i / Λ
Rule of thumb: keep most ΛΔt ≲ 0.1. If not, clamp with sim.maxLambdaDt or reduce Δt.
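The stepping rule above can be sketched in a few lines of Python. This is a minimal illustration, not the code in ctmc_tau_leap.py: the function name, its arguments, and the reading of sim.maxLambdaDt as a cap on ΛΔt are all assumptions made for the example.

```python
import math


def tau_leap_step(hazards, dt, u_any, u_pick, max_lambda_dt=None):
    """Return the index of the event that fires in this step, or None.

    hazards: non-negative rates lambda_i.
    u_any, u_pick: independent uniforms in [0, 1).
    max_lambda_dt: optional cap on Lambda * dt (assumed meaning of
    sim.maxLambdaDt; the real option may behave differently).
    """
    total = sum(hazards)  # Lambda = sum of lambda_i
    if total <= 0.0:
        return None

    lam_dt = total * dt
    if max_lambda_dt is not None:
        lam_dt = min(lam_dt, max_lambda_dt)

    # p(any) = 1 - exp(-Lambda * dt); expm1 keeps precision for small arguments.
    p_any = -math.expm1(-lam_dt)
    if u_any >= p_any:
        return None

    # Pick event i with probability lambda_i / Lambda via a cumulative scan.
    threshold = u_pick * total
    acc = 0.0
    last_positive = None
    for i, lam in enumerate(hazards):
        if lam > 0.0:
            last_positive = i
        acc += lam
        if threshold < acc:
            return i
    return last_positive  # guard against floating-point round-off
```

Both uniforms are passed in rather than drawn inside the function, so the same routine can be driven by any RNG, including a counter-based one.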
Minimal mechanistic model (v1)
Per allele state (diploid, per locus):
- INTACT → can be cut
- DSB → can resect and/or repair
- REPAIRED_MUT → terminal mutated allele (no recut in v1)
Mechanistic gates:
- resection state 0 → 1 → 2 enables pathway availability
- competing repair hazards gated by resection:
  - NHEJ (rs==0)
  - alt-EJ (rs≥1)
  - HDR (rs==2 and donor>0)
  - SSA (rs==2 and repeat_context)
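As a rough illustration of the gating rules, the sketch below zeroes out the hazard of any pathway whose gate is closed. The function name and the keys of the rates dictionary are hypothetical; only the gate conditions come from the list above.

```python
def gated_repair_hazards(rs, donor, repeat_context, rates):
    """Return per-pathway hazards after applying the resection gates.

    rs: resection state (0, 1, or 2).
    donor: donor template amount; HDR needs donor > 0.
    repeat_context: True if the locus has flanking repeats (needed for SSA).
    rates: base hazards keyed by hypothetical names
           'nhej', 'alt_ej', 'hdr', 'ssa'.
    """
    return {
        "NHEJ": rates["nhej"] if rs == 0 else 0.0,
        "alt-EJ": rates["alt_ej"] if rs >= 1 else 0.0,
        "HDR": rates["hdr"] if (rs == 2 and donor > 0) else 0.0,
        "SSA": rates["ssa"] if (rs == 2 and repeat_context) else 0.0,
    }
```

Returning a zero hazard rather than removing the pathway keeps the hazard vector a fixed length, which suits a branch-light GPU kernel.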
Chromatin memory:
- cut → TRANSIENT
- repair → REFRACTORY
- relaxation hazard kRelax returns to OPEN
Outcomes:
- NHEJ: insertion vs deletion mixture + geometric length sampling
- alt-EJ: microhomology length + larger deletions
- HDR: precise with probability pPrecise, else fallback to NHEJ
- SSA: fixed-size deletion (v1)
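For the NHEJ outcome, one possible reading of "insertion vs deletion mixture + geometric length sampling" is sketched below. The parameter names p_insertion and p_len are hypothetical, and the choice of a geometric distribution supported on {1, 2, ...} is an assumption; the reference simulator may parameterize this differently.

```python
import math


def sample_geometric_length(u, p):
    """Inverse-CDF draw from a geometric distribution on {1, 2, ...}.

    u: uniform in [0, 1); p: per-base stop probability in (0, 1].
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if p == 1.0:
        return 1
    # P(length > k) = (1 - p)^k, so length = 1 + floor(ln(1-u) / ln(1-p)).
    return 1 + int(math.log1p(-u) / math.log1p(-p))


def sample_nhej_outcome(u_kind, u_len, p_insertion, p_len):
    """Pick insertion vs deletion, then draw an indel length."""
    kind = "insertion" if u_kind < p_insertion else "deletion"
    return kind, sample_geometric_length(u_len, p_len)
```

Using the inverse CDF costs exactly one uniform per length draw, which keeps the number of subdraws per event fixed.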
CUDA-shaped architecture (planned)
Use Structure-of-Arrays and a 1D flattening for (cell,locus,allele):
idx = (cell * L + locus) * 2 + allele
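Here L is taken to be the number of loci per cell. The Python sketch below shows the flattening, its inverse, and a toy Structure-of-Arrays container; the field names (allele_state, resection, chromatin) are illustrative placeholders rather than the layout used by the simulator.

```python
from array import array


def flat_index(cell, locus, allele, num_loci):
    """Map (cell, locus, allele) to a 1D index; allele is 0 or 1 (diploid)."""
    return (cell * num_loci + locus) * 2 + allele


def unflatten(idx, num_loci):
    """Inverse of flat_index: recover (cell, locus, allele)."""
    pair, allele = divmod(idx, 2)
    cell, locus = divmod(pair, num_loci)
    return cell, locus, allele


def make_state(num_cells, num_loci):
    """Allocate one zero-initialized byte array per field (SoA layout)."""
    n = num_cells * num_loci * 2
    return {
        "allele_state": array("B", bytes(n)),  # e.g. INTACT / DSB / REPAIRED_MUT
        "resection": array("B", bytes(n)),     # rs in {0, 1, 2}
        "chromatin": array("B", bytes(n)),     # OPEN / TRANSIENT / REFRACTORY
    }
```

With this layout the two alleles of a locus are adjacent in memory, and consecutive thread indices read consecutive bytes of each field.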
Kernel decomposition (matches the CPU reference flow):
- cell cycle update (optional)
- cut intact alleles
- DSB progress + competing repair + outcome sampling
- chromatin relaxation
- phenotype feedback (optional; currently biallelic KO → repair multipliers)
RNG (debuggable + reproducible)
Use a counter-based RNG so results are deterministic regardless of block size:
- key: (global_seed, replicate_id)
- counter tuple: (cell_id, locus_id, allele_id, step_id, subdraw_id)
CPU reference uses splitmix64-based hashing to generate uniforms; CUDA should use Philox with the same counter tuple structure.
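A minimal sketch of the idea in Python follows. The splitmix64 step uses the standard published constants, but the way the key and counter words are folded together here is an assumption for illustration and is not guaranteed to match the hashing in ctmc_tau_leap.py.

```python
MASK64 = (1 << 64) - 1


def splitmix64(x):
    """One splitmix64 step: add the golden-ratio increment, then mix."""
    z = (x + 0x9E3779B97F4A7C15) & MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)


def counter_uniform(key, counter):
    """Hash (key, counter) to a uniform in [0, 1).

    key: (global_seed, replicate_id)
    counter: (cell_id, locus_id, allele_id, step_id, subdraw_id)
    The chained-XOR folding below is a hypothetical scheme.
    """
    h = 0
    for word in (*key, *counter):
        h = splitmix64(h ^ (word & MASK64))
    # Keep the top 53 bits so the result is an exactly representable double.
    return (h >> 11) * (1.0 / (1 << 53))


# The draw depends only on its coordinates, not on evaluation order.
u = counter_uniform((1234, 0), (7, 2, 1, 100, 0))
```

Because each uniform is a pure function of its key and counter, any draw can be recomputed in isolation when debugging a single (cell, locus, allele, step).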
Validation strategies
- Unit invariants:
  - kCut=0 → all no_cut
  - deterministic replay with fixed seed
  - KO feedback triggers when outcomes force biallelic frameshift
- Statistical sanity:
  - sweep Δt and confirm convergence of allele spectrum as Δt → 0
  - check pathway fractions respond monotonically to hazard multipliers
- GPU parity (future):
  - match CPU reference bit-for-bit for the RNG stream + event selection, then validate distributions.
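To show what a Δt sweep is expected to reveal, the sketch below uses a deliberately reduced two-step chain (INTACT → DSB → REPAIRED) instead of the full model, and propagates state probabilities exactly rather than sampling. It assumes at most one transition per allele per step; the function names and rate values are made up for the example.

```python
import math


def repaired_prob_fixed_dt(k_cut, k_repair, t_end, dt):
    """P(REPAIRED at t_end) under fixed-dt hazard stepping."""
    p_cut = -math.expm1(-k_cut * dt)     # 1 - exp(-k_cut * dt)
    p_rep = -math.expm1(-k_repair * dt)  # 1 - exp(-k_repair * dt)
    p_intact, p_dsb, p_done = 1.0, 0.0, 0.0
    for _ in range(round(t_end / dt)):
        # An allele cut in this step cannot also repair in this step.
        p_intact, p_dsb, p_done = (
            p_intact * (1.0 - p_cut),
            p_intact * p_cut + p_dsb * (1.0 - p_rep),
            p_done + p_dsb * p_rep,
        )
    return p_done


def repaired_prob_exact(k_cut, k_repair, t_end):
    """Exact CTMC answer for two sequential steps (requires k_cut != k_repair)."""
    a, b = k_cut, k_repair
    return 1.0 - (b * math.exp(-a * t_end) - a * math.exp(-b * t_end)) / (b - a)


exact = repaired_prob_exact(1.0, 2.0, 2.0)
errors = [
    abs(repaired_prob_fixed_dt(1.0, 2.0, 2.0, dt) - exact)
    for dt in (0.5, 0.25, 0.1, 0.05, 0.01)
]
```

The gap to the exact result should shrink as Δt decreases; the full simulator can be checked the same way by comparing allele spectra across Δt values.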