Cross-Version Semantic Drift Checks
Goal: catch silent meaning changes across releases by running the same bundle through two Helix versions and comparing outputs.
Quick start (two installed versions)
# previous release in /tmp/helix-prev, current repo in /home/chris/helix
/tmp/helix-prev/bin/helix run repro/helix_repro_bundle_v1/inputs/case01.spec.json --out /tmp/out-prev --backends cpu-reference
helix run repro/helix_repro_bundle_v1/inputs/case01.spec.json --out /tmp/out-curr --backends cpu-reference
python tools/semantic_drift.py --prev /tmp/out-prev/cpu-reference/run.json --curr /tmp/out-curr/cpu-reference/run.json
The script normalizes known nondeterministic fields and reports PASS/FAIL with a focused diff.
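As a rough illustration of what a "focused diff" could look like, the sketch below walks two already-normalized JSON payloads and emits one line per differing leaf. The helper names `iter_diffs` and `report` are hypothetical and are not taken from tools/semantic_drift.py; the real script may report differences in another form.

```python
from typing import Any, Iterator


def iter_diffs(prev: Any, curr: Any, path: str = "$") -> Iterator[str]:
    """Yield one line per leaf-level difference between two JSON-like values."""
    if isinstance(prev, dict) and isinstance(curr, dict):
        # Visit keys in sorted order so the report is stable between runs.
        for key in sorted(set(prev) | set(curr)):
            child = f"{path}.{key}"
            if key not in prev:
                yield f"{child}: added {curr[key]!r}"
            elif key not in curr:
                yield f"{child}: removed {prev[key]!r}"
            else:
                yield from iter_diffs(prev[key], curr[key], child)
    elif isinstance(prev, list) and isinstance(curr, list):
        if len(prev) != len(curr):
            yield f"{path}: length {len(prev)} -> {len(curr)}"
        # Compare the overlapping prefix element by element.
        for i, (a, b) in enumerate(zip(prev, curr)):
            yield from iter_diffs(a, b, f"{path}[{i}]")
    elif prev != curr:
        yield f"{path}: {prev!r} -> {curr!r}"


def report(prev: Any, curr: Any) -> str:
    """Return 'PASS', or 'FAIL' followed by one line per difference."""
    diffs = list(iter_diffs(prev, curr))
    if not diffs:
        return "PASS"
    return "\n".join(["FAIL"] + diffs)
```

Reporting only the differing paths keeps the output short enough to read in a CI log, even when the full run.json payload is large.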
CI hook (optional)
- Keep the previous release wheel cached (e.g., pip install helix-governance==<last> into .cache/helix-prev).
- Run tools/semantic_drift.py --prev-cmd .cache/helix-prev/bin/helix --curr-cmd ./venv/bin/helix in CI.
- Fail the job if drift is detected; update expected tolerances or document the intentional change in release notes.
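The CI steps above could be wrapped in a small Python entry point, sketched below. The `--prev-cmd` and `--curr-cmd` flags come from the text; the function names are hypothetical, and the sketch assumes that tools/semantic_drift.py exits with a nonzero status when drift is detected, which this document does not state.

```python
import subprocess
import sys
from typing import List


def build_drift_command(prev_cmd: str, curr_cmd: str) -> List[str]:
    """Build the argv for the drift check using the flags named in the text."""
    return [
        sys.executable,
        "tools/semantic_drift.py",
        "--prev-cmd", prev_cmd,
        "--curr-cmd", curr_cmd,
    ]


def run_check(argv: List[str]) -> int:
    """Run the command and return its exit status without raising on failure."""
    completed = subprocess.run(argv, check=False)
    return completed.returncode


def main() -> int:
    # Assumption: a nonzero exit status from the script means drift was found.
    argv = build_drift_command(".cache/helix-prev/bin/helix", "./venv/bin/helix")
    return run_check(argv)
```

A CI step would then call `sys.exit(main())`, so that the job fails whenever the script reports a nonzero status.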
What is compared
- Canonical repro spec: repro/helix_repro_bundle_v1/inputs/case01.spec.json (extendable).
- Primary comparison: run.json payloads for D0; for D1 backends, compare within stored tolerances when available.
- Normalized out: env_fingerprint, helix_version, git_sha, and timestamps.
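The comparison rules above can be sketched as a normalization pass followed by a tolerance-aware equality check. This is a minimal sketch, not the script's actual logic: the three version fields are taken from the list above, while the timestamp key names, the function names, and the reading of D0 as "exact match" and D1 as "match within tolerance" are assumptions.

```python
import math
from typing import Any

# Field names from the list above.
VERSION_KEYS = {"env_fingerprint", "helix_version", "git_sha"}
# Hypothetical timestamp keys; the real run.json schema may use other names.
TIMESTAMP_KEYS = {"timestamp", "started_at", "finished_at"}
DROPPED_KEYS = VERSION_KEYS | TIMESTAMP_KEYS


def normalize(value: Any) -> Any:
    """Recursively drop fields that are expected to differ between runs."""
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items() if k not in DROPPED_KEYS}
    if isinstance(value, list):
        return [normalize(v) for v in value]
    return value


def equal_within(prev: Any, curr: Any, rel_tol: float = 0.0, abs_tol: float = 0.0) -> bool:
    """Compare JSON-like values; numbers may differ by the given tolerances."""
    # Booleans are ints in Python, so handle them before the numeric branch.
    if isinstance(prev, bool) or isinstance(curr, bool):
        return type(prev) is type(curr) and prev == curr
    if isinstance(prev, (int, float)) and isinstance(curr, (int, float)):
        return math.isclose(prev, curr, rel_tol=rel_tol, abs_tol=abs_tol)
    if isinstance(prev, dict) and isinstance(curr, dict):
        return prev.keys() == curr.keys() and all(
            equal_within(prev[k], curr[k], rel_tol, abs_tol) for k in prev
        )
    if isinstance(prev, list) and isinstance(curr, list):
        return len(prev) == len(curr) and all(
            equal_within(a, b, rel_tol, abs_tol) for a, b in zip(prev, curr)
        )
    return prev == curr
```

With both tolerances left at zero the check is an exact comparison; passing a stored tolerance such as `rel_tol=1e-6` relaxes only the numeric leaves, so strings and structure must still match exactly.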
When drift is acceptable
- Only when intentional and documented: include a drift_reason in release notes and update conformance packs/fixtures.
Extending coverage
- Add more specs to repro/ and reference them in tools/semantic_drift.py.
- Store prior-version expected outputs in artifacts/semantic_drift/<tag>/... if you need offline comparisons.
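For offline comparisons, one possible approach is to write the normalized output of a prior version to disk once and compare later runs against that file. The sketch below is illustrative only: `save_baseline` and `matches_baseline` are hypothetical helpers, and the file layout under artifacts/semantic_drift/<tag>/ is left to the caller because this document does not specify it.

```python
import json
from pathlib import Path
from typing import Any


def save_baseline(payload: Any, path: Path) -> None:
    """Write a normalized payload as stable, sorted JSON."""
    path.parent.mkdir(parents=True, exist_ok=True)
    text = json.dumps(payload, indent=2, sort_keys=True) + "\n"
    path.write_text(text, encoding="utf-8")


def matches_baseline(payload: Any, path: Path) -> bool:
    """Return True if the payload equals the stored baseline exactly."""
    expected = json.loads(path.read_text(encoding="utf-8"))
    return expected == payload
```

Sorting keys on write keeps the stored file stable under version control, so a change to the baseline shows up as a small, reviewable diff.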