OpenXR Option Scorer Model Version Binding Mismatch on Quest Build - Release Lane and Tuple Lock Fix
Your calibration packet says model M-2026.2.4 and expects option B to win for cluster MIT-12. On the Quest build candidate, rankings invert, policy filters disagree, or promotion labels do not match the signer packet. Nobody changed weights on purpose.
In 2026 mitigation and OpenXR release lanes, this is rarely "random XR behavior." It is almost always binding drift: more than one code path can supply scores, or the build tuple does not lock the scorer version your governance assumed.
Problem
Typical symptoms:
- Editor or dogfood build ranks options correctly; Quest store candidate does not
- CI gate passes on one agent; release owner reproduces different top score on device
- Telemetry shows model_version missing, null, or different across two logs from the same build id
- Policy filter accepts an option in replay but rejects it in the candidate you intended to ship
- Mitigation decisions made pre-build do not match post-install first-session scoring
If you see ranking motion without a logged model identity, treat every conclusion as suspect until binding is proven.
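A minimal sketch of that identity check, assuming telemetry is exported as JSON lines that carry a build_id field next to model_version. The log shape and the function name are illustrative assumptions, not part of any Quest or OpenXR tooling:

```python
import json
from collections import defaultdict


def find_unbound_builds(log_lines):
    """Map each build id to the identity problems found in its log rows."""
    versions = defaultdict(set)
    for line in log_lines:
        row = json.loads(line)
        # A missing key and an explicit null both count as "no identity".
        versions[row["build_id"]].add(row.get("model_version"))

    problems = {}
    for build_id, seen in versions.items():
        issues = []
        if None in seen:
            issues.append("missing_or_null")
        if len(seen - {None}) > 1:
            issues.append("divergent")
        if issues:
            problems[build_id] = issues
    return problems
```

An empty result means every build id in the sample logged exactly one non-null version. Any other result is a reason to stop interpreting rankings for that build.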
Root cause summary
- Multiple scorers: shadow/canary code accidentally left enabled; production path still reads legacy weights.
- Unpinned resources: scoring config loaded from StreamingAssets, remote JSON, or Addressables without version hash tied to the release tuple.
- Conditional compilation: #if branches load different weight files for Quest than for Editor.
- Stale cached config: warm start or persistent storage rehydrates an older model id after you thought you shipped a new one.
- Tuple skew: build number, git SHA, and scorer manifest disagree; humans reference one tuple while automation references another.
Fix strategy: one binding path, one source of truth per tuple, mandatory model_version in telemetry.
Fastest safe fix path
- Search the project for every load of scoring weights or calibration JSON. There must be one production resolver for Quest builds.
- Embed model_version (and ideally a short hash) in player settings or a generated ScorerManifest included only via a deterministic pre-build step.
- Log model_version at session start on device and fail closed in internal builds if it is missing.
- Lock the release tuple: commit id, build number, and scorer manifest hash in one row your signer packet cites.
- Re-run one replay pack on the exact Quest artifact and compare rankings to the calibration packet.
Step-by-step fix
Step 1: Inventory binding sites
- List every class or ScriptableObject that can provide weights or dimension definitions.
- Mark each as production, shadow, editor-only, or deprecated.
Success check: only one production path remains for player builds.
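The inventory can be kept as a small checked-in table and validated in CI. The sketch below assumes a plain mapping from class or asset name to role; the example names in the usage note are hypothetical:

```python
ALLOWED_ROLES = {"production", "shadow", "editor-only", "deprecated"}


def check_inventory(binding_sites):
    """Validate a name -> role inventory; return a list of error strings."""
    errors = [
        f"{name}: unknown role {role!r}"
        for name, role in binding_sites.items()
        if role not in ALLOWED_ROLES
    ]
    production = sorted(
        name for name, role in binding_sites.items() if role == "production"
    )
    if len(production) != 1:
        errors.append(
            f"expected exactly one production path, found {len(production)}: {production}"
        )
    return errors
```

For example, an inventory of {"ScorerLoader": "production", "LegacyWeights": "deprecated"} passes, while marking both entries as production returns one error.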
Step 2: Collapse duplicate initialization
Common bug: startup order runs an old initializer after your new loader.
- Ensure scorer init runs once, after config is available, before first option list is evaluated.
- If you use Addressables, confirm the label you load in Quest matches CI.
Success check: deterministic order in player log with single "scorer_bound" event.
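One way to automate this success check is to extract event names from the player log and assert the order. Only scorer_bound comes from this article; config_loaded and first_option_eval are assumed event names standing in for whatever your log actually emits:

```python
def check_bind_order(events):
    """events: event names in the order they appear in the player log.

    Returns None when the order is valid, otherwise a short failure reason.
    """
    binds = [i for i, name in enumerate(events) if name == "scorer_bound"]
    if len(binds) != 1:
        return f"expected one scorer_bound event, found {len(binds)}"
    before_bind = events[: binds[0]]
    if "config_loaded" not in before_bind:
        return "scorer bound before config was available"
    if "first_option_eval" in before_bind:
        return "options evaluated before scorer was bound"
    return None
```

Two scorer_bound events in one session is the signature of an old initializer running after the new loader.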
Step 3: Pin manifest to build
Generate a small manifest at build time:
- model_version
- weights_revision or file hash
- generated_at (UTC)
- git_sha or build_number
Embed it in:
- a Resources/StreamingAssets file only replaced by CI, or
- a PlayerSettings scripting define that maps to a checked-in manifest for that tag
Success check: manifest on device matches signer packet row.
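A build-time generator for such a manifest might look like the sketch below. It is an illustration under assumptions: it records both git_sha and build_number although either one would satisfy the list above, uses SHA-256 of the weights file as weights_revision, and leaves generated_at out of the manifest hash so that two builds of the same tuple hash identically:

```python
import hashlib
import json
from datetime import datetime, timezone


def build_manifest(model_version, weights_bytes, git_sha, build_number, now=None):
    """Assemble the scorer manifest for one release tuple."""
    now = now or datetime.now(timezone.utc)
    return {
        "model_version": model_version,
        "weights_revision": hashlib.sha256(weights_bytes).hexdigest(),
        "generated_at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "git_sha": git_sha,
        "build_number": build_number,
    }


def manifest_hash(manifest):
    """Hash the manifest in a canonical form, ignoring the timestamp."""
    stable = {k: v for k, v in manifest.items() if k != "generated_at"}
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

The value returned by manifest_hash is what the signer packet row would cite, and what the device log should echo back.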
Step 4: Align Editor vs Quest defines
Search for:
- different preprocessor symbols between Editor and Android
- missing OPENXR or headset-only branches that skip the new loader
Success check: Development and Release Quest builds both load same manifest for a given tag.
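A quick way to start that search is to list every symbol used in #if and #elif directives in the scorer's C# sources. The sketch below is a rough text scan, not a C# parser: it ignores nesting and only strips trailing // comments. LEGACY_SCORER in the usage note is a hypothetical symbol:

```python
import re

DIRECTIVE = re.compile(r"^\s*#\s*(?:if|elif)\b(.*)$")
SYMBOL = re.compile(r"[A-Za-z_]\w*")


def conditional_symbols(source):
    """Return {symbol: [line numbers]} for #if/#elif conditions in C# source text."""
    found = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = DIRECTIVE.match(line)
        if not match:
            continue
        condition = match.group(1).split("//")[0]  # drop a trailing comment
        for symbol in SYMBOL.findall(condition):
            if symbol not in ("true", "false"):
                found.setdefault(symbol, []).append(lineno)
    return found
```

Run it over each file from the Step 1 inventory. A symbol such as UNITY_EDITOR or LEGACY_SCORER guarding a weight load is a branch to review by hand.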
Step 5: Fix cache and persistence issues
If you cache scorer config:
- key cache by model_version
- clear cache on app upgrade when manifest hash changes
- never reuse cache across different build numbers without validation
Success check: cold install and upgrade install both report identical model_version on first frame where scoring is active.
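The cache rules above can be sketched with a dictionary standing in for persistent storage. The function names are illustrative, and the manifest hash comparison plays the role of the validation step:

```python
def store_config(cache, manifest_hash, model_version, config):
    """Cache a scorer config under its model_version, tagged with the manifest hash."""
    cache[model_version] = {"manifest_hash": manifest_hash, "config": config}


def load_config(cache, manifest_hash, model_version):
    """Return the cached config only if it was stored for this exact manifest."""
    entry = cache.get(model_version)
    if entry is None:
        return None
    if entry["manifest_hash"] != manifest_hash:
        # Stale entry from an earlier build: drop it so it cannot rehydrate.
        del cache[model_version]
        return None
    return entry["config"]
```

A None result forces a fresh load from the pinned manifest, which is the safe default after an upgrade.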
Step 6: Telemetry contract
Add fields to your existing OpenXR startup or mitigation telemetry (see related help on startup instrumentation):
- active_model_version
- scorer_manifest_hash
- config_load_source (resource path id, not full secrets)
Success check: every scoring decision log row joins to the same version as startup.
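The join in this success check can be verified offline. The sketch assumes one startup row and a list of decision rows, each a dictionary that carries the two version fields named above:

```python
VERSION_KEYS = ("active_model_version", "scorer_manifest_hash")


def unjoined_decisions(startup, decisions):
    """Return the decision rows whose version fields differ from the startup row."""
    return [
        row
        for row in decisions
        if any(row.get(key) != startup[key] for key in VERSION_KEYS)
    ]
```

A decision row that omits a field is reported as well, since row.get returns None for it.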
Verification checklist
- Quest artifact A and B (same tag) produce identical model_version logs
- Rankings for a frozen option set match calibration packet within expected float tolerance
- Policy filter outcomes match calibration table for the same inputs
- Removing network does not change local scorer version (unless you intentionally stream config; then block promotion when offline at first lock)
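The ranking comparison in this checklist can be sketched as follows. The absolute tolerance of 1e-4 is a placeholder assumption; use the band frozen in your calibration packet:

```python
import math


def compare_to_calibration(expected, observed, abs_tol=1e-4):
    """Compare option scores from the device against the calibration packet.

    expected and observed map option id -> score. Returns a list of mismatches.
    """
    if expected.keys() != observed.keys():
        return ["option sets differ"]
    mismatches = [
        f"score drift on {option}"
        for option in expected
        if not math.isclose(expected[option], observed[option], rel_tol=0.0, abs_tol=abs_tol)
    ]
    top_expected = max(expected, key=expected.get)
    top_observed = max(observed, key=observed.get)
    if top_expected != top_observed:
        mismatches.append(f"winner changed: expected {top_expected}, got {top_observed}")
    return mismatches
```

Small drift inside the tolerance band passes, while an inverted ranking reports both the score drift and the changed winner.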
Alternative fixes
- Feature flag service: if you must remote-switch models, gate by signed payload and log flag id beside model_version. Do not silently override local manifest without audit row.
- Split configs: keep Quest-only tuning in a separate file but still one loader with explicit merge rules documented in signer packet.
Prevention tips
- Treat scorer changes like code changes: review + CI + tuple lock.
- Never approve promotion without a device log snippet showing model_version.
- After wide rollout of a new model, run one "binding regression" test in your weekly cadence.
Related links
- OpenXR post-rollout verification packet missing scorer stamps on Quest - resume timing and window boundary fix — bind milestones, resume discipline, and UTC verification windows for evidence packets.
- OpenXR signer review deck shows stale contract revision after correction packet - query pack refresh fix — deterministic query-pack regeneration and footnote hash refresh after correction-state changes.
- OpenXR follow-up response packet uses wrong snapshot UTC after signer review - escalation routing fix — hold-state and escalation routing when post-review packets cite stale snapshots.
- OpenXR auto-remediation package applies without rollback gate on Quest - response lane fix — enforce rollback criteria schema and deterministic keep/tune/rollback decisions for trigger-driven interventions.
- OpenXR promotion-gate waiver not expiring and package still ships on Quest - fix — enforce waiver expiry, candidate-scoped approvals, and confidence revalidation before conditional promotions.
- OpenXR exception-budget override approved but post-window debt not reconciled on Quest - fix — close override windows with reconciliation classes, carryover penalties, and SLA-enforced debt closure before new approvals.
- OpenXR route closure reviewers disagree on confidence band - calibration dispute adjudication Quest fix — resolve confidence-band conflicts with deterministic trigger thresholds, criterion-level tie-break rules, and policy-coupled adjudication logging.
- OpenXR reason-code version migration mixed adjudication and policy drift on Quest - fix — versioned reason-code migration windows, close-time lineage, compatibility mapping, and reopen policy to prevent mixed-semantics policy drift.
- OpenXR reason-code compatibility map missing during migration on Quest - how to fix — require mapping-rule IDs, remap lineage, and deterministic recompute joins before migration-window expansion.
- OpenXR startup selection telemetry missing on Quest build - instrumentation route fix — ordering and fields for startup/scorer telemetry.
- OpenXR mitigation mode exits but next launch restores old fallback route - Quest fix — baseline and persistence discipline adjacent to config binding.
- Guide: Unity 6.6 LTS OpenXR Calibration-Change Rollout and Safe Model-Rollback Preflight
- Blog: Unity Quest OpenXR score model rollout - shadow, canary, and rollback playbook 2026
FAQ
Why do rankings differ slightly between Editor and Quest?
Floating-point order or platform math can cause ties to resolve differently. Freeze tolerance bands in your calibration packet and test on device for borderline cases.
Should the scorer live in native plugin code?
If it does, ensure the same version string is exposed to C# and logged. Split stacks often cause invisible drift.
Can Addressables serve scorer config safely?
Yes, if the address is pinned per release tuple, content hash is verified before bind, and offline behavior is defined.
Escalation criteria for release owners
Escalate to a hold or rollback discussion when:
- model_version cannot be confirmed on device for two consecutive candidates
- policy outcomes disagree with calibration packet on the same frozen fixture
- shadow and production paths both emit scores in one session (duplicate bind detected)
These are governance signals, not “wait for next patch” cosmetic issues.
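If you track candidates in a script or dashboard, the three criteria can be encoded directly. The record fields below (version_confirmed, policy_matches, duplicate_bind) are assumed names for illustration:

```python
def escalation_reasons(candidates):
    """candidates: per-candidate records, oldest first.

    Each record holds three booleans: version_confirmed, policy_matches,
    duplicate_bind. Returns the reasons to open a hold or rollback discussion.
    """
    if not candidates:
        return []
    reasons = []
    last_two = candidates[-2:]
    if len(last_two) == 2 and not any(c["version_confirmed"] for c in last_two):
        reasons.append("model_version unconfirmed on two consecutive candidates")
    latest = candidates[-1]
    if not latest["policy_matches"]:
        reasons.append("policy outcomes disagree with calibration packet")
    if latest["duplicate_bind"]:
        reasons.append("shadow and production paths both emitted scores")
    return reasons
```

Any non-empty result is a signal to escalate rather than wait for the next candidate.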