Helix CLI docs

Run Ingest Pipeline

Omnis/Helix handles runs from every instrument or cloud ecosystem the same way once a run has been translated into a canonical RunDescriptor and bundled with checksums.

RunDescriptor

A minimal example (see schemas/run/run_descriptor.schema.json for the full contract):

{
  "schema_version": "run_descriptor_v0.1",
  "instrument": {"vendor": "illumina", "model": "NextSeq", "vendor_run_id": "RUN123", "data_uri": "/runs/RUN123"},
  "assay": {"name": "PanelA", "read_structure": "151T8I151T", "read_lengths": [151, 8, 151]},
  "samples": [{"sample_id": "SampleA"}, {"sample_id": "SampleB"}],
  "data_products": {"fastq_sets": [{"sample_id": "SampleA", "lane": "L001", "read_pair": {"r1": "...", "r2": "..."}}]},
  "routing": {"pipeline": "generic_align", "reference_build": "hg38"}
}
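
The example carries some redundancy that can be checked before bundling: read_lengths should agree with read_structure, and every entry in fastq_sets should name a declared sample. The sketch below illustrates such a pre-flight check. The helper names and the rules themselves are assumptions made for illustration; the authoritative contract is the JSON schema referenced above.

```python
import re

# Hypothetical helpers: field names follow the example descriptor, but these
# consistency rules are assumptions, not the schema's actual contract.
_SEGMENT = re.compile(r"(\d+)([A-Z])")


def read_structure_lengths(read_structure: str) -> list[int]:
    """Split a read structure such as '151T8I151T' into its segment lengths."""
    segments = _SEGMENT.findall(read_structure)
    # Rebuild the string from the matches; any difference means stray characters.
    if "".join(count + op for count, op in segments) != read_structure:
        raise ValueError(f"unparseable read_structure: {read_structure!r}")
    return [int(count) for count, _ in segments]


def check_descriptor(descriptor: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    assay = descriptor.get("assay", {})
    if "read_structure" in assay and "read_lengths" in assay:
        if read_structure_lengths(assay["read_structure"]) != assay["read_lengths"]:
            problems.append("assay.read_lengths disagrees with assay.read_structure")
    known = {sample["sample_id"] for sample in descriptor.get("samples", [])}
    for fastq_set in descriptor.get("data_products", {}).get("fastq_sets", []):
        if fastq_set.get("sample_id") not in known:
            problems.append(
                f"fastq_set references unknown sample {fastq_set.get('sample_id')!r}"
            )
    return problems
```

Run against the minimal example, check_descriptor would report no problems, since "151T8I151T" splits into 151, 8, and 151, and SampleA is declared under samples.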

RunBundle layout

run-<vendor_run_id>-<timestamp>.tar.gz
  descriptor.json
  manifest.json            # file list with size + sha256 + bundle digest
  checksums.sha256         # convenience copy
  sample_sheet.raw.csv     # optional
  data/                    # optional embedded FASTQ/BAM/VCF
  reports/                 # optional vendor QC
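
As a rough illustration of what manifest.json records, the sketch below walks an unpacked bundle directory and collects size and sha256 for each file, then re-checks them. The JSON key names, the choice to leave manifest.json and checksums.sha256 out of the listing, and the way the bundle digest is derived are all assumptions; the real agent may do each of these differently.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large FASTQ/BAM files are not read into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(bundle_root: Path) -> dict:
    """Collect path, size, and sha256 for every file under bundle_root."""
    files = []
    for path in sorted(p for p in bundle_root.rglob("*") if p.is_file()):
        rel = path.relative_to(bundle_root).as_posix()
        # Assumption: the manifest does not list itself or its convenience copy.
        if rel in ("manifest.json", "checksums.sha256"):
            continue
        files.append(
            {"path": rel, "size": path.stat().st_size, "sha256": sha256_file(path)}
        )
    # Assumption: the bundle digest is a SHA-256 over "<sha256>  <path>" lines,
    # taken in the same order as the file list.
    lines = "".join(f"{entry['sha256']}  {entry['path']}\n" for entry in files)
    return {
        "files": files,
        "bundle_digest": hashlib.sha256(lines.encode("utf-8")).hexdigest(),
    }


def verify_manifest(bundle_root: Path, manifest: dict) -> list[str]:
    """Return the paths whose size or sha256 no longer match the manifest."""
    mismatched = []
    for entry in manifest["files"]:
        path = bundle_root / entry["path"]
        if (
            not path.is_file()
            or path.stat().st_size != entry["size"]
            or sha256_file(path) != entry["sha256"]
        ):
            mismatched.append(entry["path"])
    return mismatched
```

Checking the size before hashing is only a shortcut: a size mismatch short-circuits the comparison, so the file is not hashed needlessly.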

Edge → Ingest flow (quick start)

  1. Start ingest verifier/registry:
    python tools/mock_run_ingest_server.py --host 0.0.0.0 --port 8001 --store /tmp/ingest_store
    
  2. Start the Ion receiver if you are testing Ion Torrent payloads:
    python tools/ion_receiver_server.py --config edge_config.yaml --host 0.0.0.0 --port 8082
    
  3. Start edge agent watching an Illumina fixture:
    python tools/helix_edge_agent.py --config edge_config.yaml --loop
    
  4. Drop the Illumina fixture run into the watched folder. The agent bundles and uploads the run, and ingest verifies the bundle.

Results:

  • GET /runs on ingest shows the new run summary.
  • GET /runs/<id> returns the full descriptor + verification flag.
  • Helix Instrument Runs view can call the same endpoints through the RunRegistryClient.

Run states

  • received: bundle verified and stored by ingest.
  • queued: OGN accepted the run and created a job.
  • running: OGN job is executing.
  • finished: pipeline completed successfully.
  • failed: pipeline errored; check last_message for context.
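
The states above form a small state machine. The sketch below encodes one plausible transition table and checks a status history against it. The allowed transitions are an assumption (in particular, that a run can fail from either queued or running), and status_history is treated here as a plain list of state names, which may not match its real shape.

```python
from enum import Enum


class RunState(str, Enum):
    RECEIVED = "received"
    QUEUED = "queued"
    RUNNING = "running"
    FINISHED = "finished"
    FAILED = "failed"


# Assumed transition table; finished and failed are treated as terminal.
ALLOWED = {
    RunState.RECEIVED: {RunState.QUEUED},
    RunState.QUEUED: {RunState.RUNNING, RunState.FAILED},
    RunState.RUNNING: {RunState.FINISHED, RunState.FAILED},
    RunState.FINISHED: set(),
    RunState.FAILED: set(),
}


def is_valid_history(history: list[str]) -> bool:
    """True when history starts at 'received' and every step is an allowed one."""
    try:
        states = [RunState(name) for name in history]
    except ValueError:
        return False  # contains a name that is not a known state
    if not states or states[0] is not RunState.RECEIVED:
        return False
    return all(nxt in ALLOWED[cur] for cur, nxt in zip(states, states[1:]))
```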

Live demo walkthrough

  1. Start ingest verifier/registry (tools/mock_run_ingest_server.py).
  2. Start Ion receiver (optional) and edge agent watching an Illumina fixture.
  3. Start OGN scheduler with ingest callbacks (see helix/ogn/scheduler_status_hooks.py).
  4. Submit a fixture run via the edge agent.
  5. Watch /runs/<id>: status_history should walk received → queued → running → finished.
  6. Open Helix Instrument Runs tab:
    • Filter by “Active” to see queued and running runs turn blue.
    • Filter by “Failed” to show failed jobs; the tooltip displays last_message.
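
Steps 5 and 6 can also be scripted. The sketch below polls a run until it reaches a terminal state and mimics the two filters. It makes several assumptions: the caller supplies a fetch_status callable that returns the current state name (how that is extracted from /runs/<id> is left open), each run summary is a dict with a "status" key, and "Active" means queued or running, as the walkthrough suggests.

```python
import time
from typing import Callable

TERMINAL = {"finished", "failed"}
ACTIVE = {"queued", "running"}  # assumed meaning of the "Active" filter


def wait_for_terminal(
    fetch_status: Callable[[], str],
    interval: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Poll until the run finishes or fails; return the distinct states seen."""
    seen: list[str] = []
    for _ in range(max_polls):
        status = fetch_status()
        if not seen or seen[-1] != status:
            seen.append(status)  # record only changes, not every poll
        if status in TERMINAL:
            return seen
        sleep(interval)
    raise TimeoutError(f"run still {seen[-1] if seen else 'unknown'} after {max_polls} polls")


def filter_runs(runs: list[dict], view: str) -> list[dict]:
    """Mimic the 'Active' and 'Failed' filters over run summaries."""
    if view == "Active":
        return [run for run in runs if run.get("status") in ACTIVE]
    if view == "Failed":
        return [run for run in runs if run.get("status") == "failed"]
    raise ValueError(f"unknown view: {view!r}")
```

Passing sleep as a parameter keeps the loop easy to exercise in tests without real delays.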

Vendor mapping notes

  • Illumina run folder: parse RunInfo/RunParameters/SampleSheet; vendor_run_id comes from RunId or folder name.
  • BaseSpace: REST API provides Run.Id and FASTQ files; token refresh via BaseSpaceTokenManager.
  • Thermo Connect: treat the synced folder as the run root; read run_completed.json and glob for FASTQ files.
  • Ion Torrent: the plugin posts JSON to the Ion receiver; the files at the paths listed in fastq_sets are bundled and uploaded.
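
For the Illumina run-folder case, the fallback from run ID to folder name might look like the sketch below. It assumes the ID is taken from the Id attribute of the Run element in RunInfo.xml; the function name is hypothetical, and the real mapper may consult RunParameters instead or in addition.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def illumina_vendor_run_id(run_folder: Path) -> str:
    """Return the run ID from RunInfo.xml, or the folder name as a fallback."""
    run_info = run_folder / "RunInfo.xml"
    if run_info.is_file():
        try:
            # Assumption: the ID lives at <RunInfo><Run Id="..."> in RunInfo.xml.
            run = ET.parse(run_info).getroot().find("Run")
        except ET.ParseError:
            run = None  # unreadable XML: fall through to the folder name
        if run is not None and run.get("Id"):
            return run.get("Id")
    return run_folder.name
```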

For developer-facing details see schemas/run/README.md; this doc is the user-facing narrative.