
Helix Engine Architecture

This document captures the now-frozen boundary between Helix's Python shell and the native execution engines.

Public API Surface (Python)

  • helix.crispr.simulator provides dataclasses like EfficiencyTargetRequest and batch APIs such as predict_efficiency_for_targets. This layer is pure Python and must not grow new responsibilities when chasing performance.
  • helix.crispr.physics defines the CrisprPhysics protocol and helpers such as compute_on_target_score_encoded / score_pairs_encoded. These work exclusively with encoded np.ndarray buffers and simple numeric parameters.

All callers should treat these as the final, stable boundary. Switching between backend="cpu-reference", backend="native-cpu", or backend="gpu" happens inside create_crispr_physics with no API changes.
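
A minimal usage sketch of that boundary (import locations and exact signatures are assumptions; check helix.crispr.physics for the real ones):

import numpy as np
from helix.crispr.physics import create_crispr_physics, score_pairs_encoded

guides = np.zeros((4, 20), dtype=np.uint8)     # (G, L) encoded guides
windows = np.zeros((128, 20), dtype=np.uint8)  # (N, L) encoded windows

# Assumed call shape: encoded buffers in, float32 (G, N) scores out.
scores = score_pairs_encoded(guides, windows)

# Backend selection is internal to create_crispr_physics; the API is unchanged.
physics = create_crispr_physics(backend="native-cpu")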

Configuration / CLI Surface

  • CLI flag --engine-backend {cpu-reference,native-cpu,gpu} (or env HELIX_CRISPR_BACKEND) sets the preferred backend. Defaults to native-cpu when available, otherwise cpu-reference.

  • --engine-allow-fallback / HELIX_CRISPR_ALLOW_FALLBACK=1 lets Helix fall back to cpu-reference if the requested backend can’t load.

  • Studio and other Python callers inherit the same environment variables by default; no extra wiring is required beyond the batch APIs above.

  • helix engine info prints the resolved backend, fallback state, and whether the native module is available—handy for bug reports.

  • Helix Studio surfaces the active backend/fallback/native status in the control panel so users immediately see which engine is powering the visualization.

  • All CRISPR artifacts now carry crispr_scoring_version (currently 1.0.0) and crispr_engine_backend metadata. Any deliberate scoring change must bump the version, refresh the fixtures, and document the change (see docs/scoring_version.md).

  • Backend behavior matrix (requested backend vs. availability); a sketch of the resolution logic follows the table:

    | Requested | Native built? | CUDA available? | Fallback allowed? | Result backend | Behavior |
    | --- | --- | --- | --- | --- | --- |
    | gpu | yes | yes | n/a | gpu | CUDA kernels run; metadata reports gpu. |
    | gpu | yes/no | no | no | (error) | RuntimeError ("Helix CUDA backend is unavailable."). |
    | gpu | yes/no | no | yes | cpu-reference | Warning logged; metadata records cpu-reference. |
    | native-cpu | yes | n/a | n/a | native-cpu | Native extension handles scoring. |
    | native-cpu | no | n/a | no | (error) | RuntimeError explaining the native build is missing. |
    | native-cpu | no | n/a | yes | cpu-reference | Warning logged; metadata records cpu-reference. |
    | cpu-reference | n/a | n/a | n/a | cpu-reference | Pure Python reference implementation, always available. |

    use_gpu=True in helper APIs simply selects the gpu row above.
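
The matrix above reduces to a small resolution routine. A minimal sketch of that logic (function and argument names here are illustrative, not the actual implementation):

def resolve_backend(requested: str, native_built: bool,
                    cuda_available: bool, allow_fallback: bool) -> str:
    # Pure Python reference implementation: always available.
    if requested == "cpu-reference":
        return "cpu-reference"
    # Requested accelerator is present: use it.
    if requested == "gpu" and cuda_available:
        return "gpu"
    if requested == "native-cpu" and native_built:
        return "native-cpu"
    # Requested backend unavailable: degrade only if fallback is allowed.
    if allow_fallback:
        # A warning is logged and metadata records cpu-reference.
        return "cpu-reference"
    raise RuntimeError(f"Helix {requested} backend is unavailable.")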

GPU Backend (Experimental)

The CUDA path is available when you build helix_engine._native with HELIX_ENGINE_ENABLE_CUDA=ON. PyPI wheels remain CPU-only; build locally via:

./scripts/build_native_cuda.sh
HELIX_CRISPR_BACKEND=gpu helix engine benchmark --backends gpu --json gpu_benchmark.json

If CUDA is unavailable the CLI falls back to cpu-reference and records the fallback in both tables and JSON.

CRISPR Engine Throughput (Example Dev Machine)

Measured on a CUDA-enabled dev container (RTX A4000, Python 3.12, native build with -O3). Run helix engine benchmark (or python -m helix.cli engine benchmark) to collect the same metrics locally and compare against this table.

| Backend | G | N | L | MPairs/s |
| --- | --- | --- | --- | --- |
| cpu-reference | 1 | 512 | 20 | 0.06 |
| native-cpu | 1 | 512 | 20 | 0.90 |
| gpu | 1 | 512 | 20 | <0.01 |
| cpu-reference | 96 | 4096 | 20 | 0.07 |
| native-cpu | 96 | 4096 | 20 | 7.53 |
| gpu | 96 | 4096 | 20 | 3.31 |
| cpu-reference | 256 | 16384 | 20 | 0.09 |
| native-cpu | 256 | 16384 | 20 | 11.05 |
| gpu | 256 | 16384 | 20 | 12.17 |

Tiny GPU workloads (1×512×20) are dominated by kernel launch overhead and settle around 0.002 MPairs/s, which the table reports as <0.01. Larger batches amortize the launch cost, reaching parity with and then overtaking the native CPU backend.
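
MPairs/s is millions of guide-window pairs (G × N) scored per second. A timing sketch for reproducing one cell of the table (assuming score_pairs_encoded accepts the encoded buffers directly):

import time
import numpy as np
from helix.crispr.physics import score_pairs_encoded

G, N, L = 96, 4096, 20
rng = np.random.default_rng(1)
guides = rng.integers(0, 4, size=(G, L), dtype=np.uint8)   # encoded A/C/G/T
windows = rng.integers(0, 4, size=(N, L), dtype=np.uint8)

start = time.perf_counter()
scores = score_pairs_encoded(guides, windows)              # float32, (G, N)
elapsed = time.perf_counter() - start
print(f"{G * N / elapsed / 1e6:.2f} MPairs/s")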

Prime editing respects the same knobs: by default it mirrors the CRISPR backend choice, but you can override it via HELIX_PRIME_BACKEND and HELIX_PRIME_ALLOW_FALLBACK if needed. Prime physics scoring terminology is documented in docs/prime_physics.md.

Prime Engine Throughput (Example Dev Machine)

| Backend | Targets | Genome nt | Spacer | Predictions/sec |
| --- | --- | --- | --- | --- |
| cpu-reference | 32 | 2,000 | 20 | 5634.48 |
| cpu-reference | 64 | 4,000 | 20 | 3037.36 |

Benchmark JSON Schema (v1)

helix engine benchmark --json emits a structured payload so CI/doc tooling can track regressions. Schema v1 fields:

  • helix_version – package version string.
  • scoring_versions.crispr / .prime – scoring-version identifiers.
  • env.platform, env.python_version, env.cuda_available, env.gpu_name, env.native_backend_available, env.backends_built.
  • seed – RNG seed used for the synthetic data.
  • config.backends, config.crispr_shapes, config.prime_workloads.
  • benchmarks.crispr[] entries contain backend_requested, backend_used, shape (GxNxL), raw g/n/l, mpairs_per_s, and elapsed_seconds.
  • benchmarks.prime[] entries contain backend_requested, backend_used, workload (targetsxgenome_lenxspacer), predictions/predictions_per_s, and elapsed_seconds.

Example (trimmed) JSON:

{
  "helix_version": "1.0.2",
  "scoring_versions": {"crispr": "1.0.0", "prime": "1.0.0"},
  "env": {
    "platform": "Linux-6.6.87-x86_64",
    "python_version": "3.12.3",
    "cuda_available": false,
    "gpu_name": null,
    "native_backend_available": false,
    "backends_built": ["cpu-reference"]
  },
  "seed": 1,
  "config": {
    "backends": ["cpu-reference"],
    "crispr_shapes": ["1x512x20"],
    "prime_workloads": ["32x2000x20"]
  },
  "benchmarks": {
    "crispr": [
      {
        "backend_requested": "cpu-reference",
        "backend_used": "cpu-reference",
        "shape": "1x512x20",
        "mpairs_per_s": 0.06,
        "elapsed_seconds": 0.53
      }
    ],
    "prime": [
      {
        "workload": "32x2000x20",
        "predictions": 32,
        "predictions_per_s": 5634.48,
        "elapsed_seconds": 0.006
      }
    ]
  }
}

Future versions of the schema will only add fields; existing keys remain stable.
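
Because the schema only grows, CI tooling can key on the v1 fields directly. A minimal sketch of a regression gate (the file name and threshold are illustrative):

import json

with open("gpu_benchmark.json") as fh:  # file name from the GPU example above
    report = json.load(fh)

for entry in report["benchmarks"]["crispr"]:
    # Catch silent fallbacks: requested and used backends must match in CI.
    assert entry["backend_used"] == entry["backend_requested"], entry
    if entry["backend_used"] == "gpu" and entry["shape"] == "256x16384x20":
        assert entry["mpairs_per_s"] >= 10.0  # illustrative threshold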

Remote Engine Contract

Remote runners (OGN) reuse the same JSON contracts as the CLI:

  • Performance endpoints must emit the benchmark schema above.
  • Prime scoring endpoints must include the physics_score block defined in docs/prime_physics.md.

See docs/ogn_integration.md for the exact fields expected from remote benchmarks and scoring services.

Kernel Contract

Every backend (Python, native C++, CUDA) implements the same kernel semantics:

  • Guides are encoded as uint8 arrays with shape (G, L) (row-major).
  • Windows are encoded as uint8 arrays with shape (N, L).
  • Output scores are float32 arrays with shape (G, N).
  • No broadcasting is permitted; mismatched shapes raise errors.
  • Encodings follow the helix.crispr.physics._encode_sequence_to_uint8 convention: A/C/G/T → 0/1/2/3 with all other bases mapping to 4.
  • The shared helpers live in helix.engine.encoding, so CRISPR, Prime, and any other engines use the exact same mapping.

This is the exact contract exposed via the C++ header in src/helix_engine/include/helix/physics.hpp and mirrored by the experimental CUDA kernels.
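
A sketch of the shared encoding convention (the real helpers live in helix.engine.encoding; this lookup-table version only illustrates the mapping):

import numpy as np

_LUT = np.full(256, 4, dtype=np.uint8)  # every other base maps to 4
for i, base in enumerate(b"ACGT"):
    _LUT[base] = i                      # A/C/G/T -> 0/1/2/3

def encode_sequence_to_uint8(seq: str) -> np.ndarray:
    """Encode an ASCII base string into the shared uint8 convention."""
    return _LUT[np.frombuffer(seq.encode("ascii"), dtype=np.uint8)]

assert encode_sequence_to_uint8("ACGTN").tolist() == [0, 1, 2, 3, 4]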

Native Engine Layout

src/helix_engine/
├── include/helix/physics.hpp   # PhysicsParams + C++ kernel prototypes
├── src/physics.cpp             # Reference CPU implementations
├── src/binding.cpp             # pybind11 bindings (exposes _native module)
└── native.py                   # Python shim that calls into the extension

helix_engine.native exposes the same scoring functions to Python and only falls back to pure Python helpers when the extension is missing. When users explicitly request backend="native-cpu", Helix requires the compiled module and raises a clear error otherwise.
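
The shim pattern looks roughly like this (a sketch, not the actual module contents):

try:
    from helix_engine import _native              # compiled pybind11 extension
except ImportError:
    _native = None

NATIVE_AVAILABLE = _native is not None

def score_pairs_encoded(guides, windows):
    if _native is not None:
        return _native.score_pairs_encoded(guides, windows)
    # Extension missing: fall back to the pure Python reference helpers.
    # An explicit backend="native-cpu" request is rejected before this point.
    from helix.crispr import physics
    return physics.score_pairs_encoded(guides, windows)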

Testing Expectations

  • tests/test_crispr_simulator.py keeps exercising the high-level simulator and ensures encoded helpers behave consistently.
  • tests/test_crispr_native_engine.py (skipped unless _native is importable) asserts scalar/batch parity between the C++ backend and the Python reference.
  • Golden fixtures live in tests/data/engine_crispr. The test suite compares every backend (cpu-reference, native-cpu, later gpu) against those saved scores (tests/test_crispr_engine_fixtures.py). Any change to scoring must update the fixture data explicitly.
  • Prime fixtures live alongside them in tests/data/engine_prime, enforced by tests/test_prime_engine.py.
  • CI environments without the native build remain fully green; local developer environments with the extension compiled gain stronger guarantees.
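
The fixture comparison follows a parametrize-over-backends pattern. A sketch (fixture file name, loading details, and the scoring call are assumptions, not the actual test code):

import numpy as np
import pytest
from helix.crispr.physics import create_crispr_physics  # assumed import path

BACKENDS = ["cpu-reference", "native-cpu"]  # "gpu" joins once its kernels land

@pytest.mark.parametrize("backend", BACKENDS)
def test_scores_match_golden_fixtures(backend):
    if backend == "native-cpu":
        pytest.importorskip("helix_engine._native")  # keeps CI green without the build
    fixture = np.load("tests/data/engine_crispr/golden_scores.npz")  # hypothetical name
    physics = create_crispr_physics(backend=backend)
    scores = physics.score_pairs_encoded(fixture["guides"], fixture["windows"])
    np.testing.assert_allclose(scores, fixture["scores"], rtol=1e-6)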

Performance Harness

  • benchmarks/engine_crispr_perf.py generates random encoded arrays and measures throughput for score_pairs_encoded across user-specified (G, N, L) regimes and backends. Run it locally to track wins/regressions before landing CUDA or other optimizations. Example (reference backend on dev container):

    backend=cpu-reference shape=(1,512,20)   -> 0.05 MPairs/s
    backend=cpu-reference shape=(32,4096,20) -> 0.06 MPairs/s
    
  • For comparing multiple backends/configs, call the helpers in helix.crispr.physics (e.g. score_pairs_encoded_multi) at a higher layer rather than overloading the core kernel API; a comparison sketch follows this list. The kernel itself always assumes “one backend, many guides/windows.”

  • Prime editing follows the same pattern. benchmarks/engine_prime_perf.py offers a starter perf harness, and predict_prime_outcomes_for_targets provides the batch API until the physics engine matures further. Prime JSON outputs report prime_scoring_version (currently 1.0.0) and prime_engine_backend fields so consumers know which engine generated them. Example baseline (targets=64, genome_len=2000, spacer length 20) on this dev container: cpu-reference ≈ 5.4K predictions/sec.
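
A sketch of that higher-layer comparison (the per-backend scoring call is an assumption; score_pairs_encoded_multi exists but its exact signature may differ):

import time
from helix.crispr.physics import create_crispr_physics

def compare_backends(guides, windows, backends=("cpu-reference", "native-cpu")):
    """Score the same encoded buffers on each backend and time the runs."""
    results = {}
    for backend in backends:
        physics = create_crispr_physics(backend=backend)
        start = time.perf_counter()
        scores = physics.score_pairs_encoded(guides, windows)
        results[backend] = (scores, time.perf_counter() - start)
    return results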

With this structure in place, adding a device backend means implementing the same score_pairs_encoded kernel on the device and hooking it up to the existing protocol; no changes to simulator or benchmark code are required. The native CUDA backend works exactly this way and is available when the engine is built with -DHELIX_ENGINE_ENABLE_CUDA=ON; see docs/engine_cuda_plan.md for build notes and future milestones.