Lesson 32: Read-Only Analytics Warehouse Contracts and Service Accounts
Lesson 31 automated how rows arrive. This lesson defines where they live when spreadsheets stop being enough: a read-only analytics warehouse layer that stores the same closure and train facts you already trust, but with schemas, partitions, and accounts that scale to more builds and more stakeholders.
You are not building a full data mesh. You are adding one disciplined read layer so product, support, and production can query metrics without touching the ingest writer or raw webhook secrets.

What you will build
By the end of this lesson, you will have:
- A three-table contract (build dimension, closure facts, ingest audit) that maps directly onto Lesson 29 through Lesson 31 fields
- A read-only service account pattern for BI tools with explicit deny lists on write APIs
- A partitioning rule keyed on promotion_date so historical trains stay fast to scan
- A published view layer that hides webhook payload columns from casual consumers
Step 1 - Decide the graduation trigger
Move off pure Sheets when at least one of these is true:
- you have more than twelve active build_id values in the rolling window and scroll lag breaks standups
- two teams need the same numbers with different filters and you are tired of duplicating tabs
- you need queryable history for audits without granting edit rights to the ingest automation account
If none apply, stay on Lesson 31. Warehouse work has a carrying cost.
Step 2 - Freeze the logical model before picking a vendor
Start vendor-neutral. Map columns you already export:
| Logical entity | Source lessons | Minimum columns |
|---|---|---|
| dim_build | Lesson 29 closures, Lesson 30 columns | build_id, promotion_date, platform_targets, owner_lane |
| fact_stability_closure | Lesson 29 row at close | build_id, yellow_open_close, stability_signal, dialogue_signal, closure_utc |
| fact_release_train_snapshot | Lesson 30 grid | build_id, train_note, recurrence_flag, rolling_window_label |
| stg_ingest_batch | Lesson 31 | ingest_batch_id, source, ingest_utc, row_count, checksum |
Add one surrogate key only if your warehouse makes joins easier. Otherwise build_id plus ingest_batch_id stays the composite truth anchor, matching your Lesson 26 analytics confidence habit of naming the evidence object explicitly.
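The logical model above can be sketched as vendor-neutral DDL. This uses SQLite purely as a stand-in engine; the column types are assumptions, while the table and column names come straight from the contract table.

```python
import sqlite3

# Vendor-neutral sketch of the four-table contract, with SQLite standing
# in for whichever warehouse you eventually pick. Types are assumptions.
DDL = """
CREATE TABLE dim_build (
    build_id         TEXT PRIMARY KEY,
    promotion_date   TEXT NOT NULL,
    platform_targets TEXT,
    owner_lane       TEXT
);
CREATE TABLE fact_stability_closure (
    build_id          TEXT NOT NULL,
    ingest_batch_id   TEXT NOT NULL,
    yellow_open_close TEXT,
    stability_signal  TEXT,
    dialogue_signal   TEXT,
    closure_utc       TEXT,
    PRIMARY KEY (build_id, ingest_batch_id)  -- composite truth anchor
);
CREATE TABLE fact_release_train_snapshot (
    build_id             TEXT NOT NULL,
    train_note           TEXT,
    recurrence_flag      INTEGER,
    rolling_window_label TEXT
);
CREATE TABLE stg_ingest_batch (
    ingest_batch_id TEXT PRIMARY KEY,
    source          TEXT,
    ingest_utc      TEXT,
    row_count       INTEGER,
    checksum        TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
tables = sorted(r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
print(tables)
```

Keeping this DDL in a version-controlled .sql or .py file gives you a concrete artifact to diff when a column rename is proposed, before any vendor syntax enters the picture.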
Step 3 - Land data in staging, promote to facts
Never let webhooks or CSV loaders write straight into BI-facing views.
- Landing tables or buckets accept raw files exactly as Lesson 31 produces them.
- Staging applies type casts, rejects unknown headers, and quarantines bad rows with a reject_reason column.
- Facts receive idempotent merges keyed on (build_id, ingest_batch_id), the same way you described in Lesson 31.
If a row fails staging, the dashboard should still show the last good fact row for that build_id, not a blank cell. That is a player-trust issue dressed up as SQL.
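A minimal sketch of that staging-to-facts promotion, again with SQLite as the stand-in. The table shapes are trimmed for illustration; the key behaviors shown are the ones the step requires: quarantined rows stay behind, and re-running a batch is a no-op rather than a duplicate or a silent edit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stg_rows (
    build_id TEXT, ingest_batch_id TEXT, stability_signal TEXT,
    reject_reason TEXT                -- NULL means the row passed staging
);
CREATE TABLE fact_stability_closure (
    build_id TEXT, ingest_batch_id TEXT, stability_signal TEXT,
    PRIMARY KEY (build_id, ingest_batch_id)
);
""")

def promote(conn):
    """Idempotent merge: only clean staging rows reach facts."""
    conn.execute("""
        INSERT INTO fact_stability_closure
            (build_id, ingest_batch_id, stability_signal)
        SELECT build_id, ingest_batch_id, stability_signal
        FROM stg_rows
        WHERE reject_reason IS NULL
        ON CONFLICT (build_id, ingest_batch_id) DO NOTHING
    """)

conn.executemany(
    "INSERT INTO stg_rows VALUES (?, ?, ?, ?)",
    [("b101", "batch-7", "green", None),
     ("b102", "batch-7", "yellow", None),
     ("b103", "batch-7", None, "unknown header")],  # quarantined row
)
promote(conn)
promote(conn)  # second run changes nothing
count = conn.execute(
    "SELECT COUNT(*) FROM fact_stability_closure").fetchone()[0]
print(count)  # 2: the quarantined row never reached facts
```

Because the merge does nothing on conflict, the last good fact row for b103's build survives any failed batch, which is exactly the dashboard behavior described above.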
Step 4 - Create the read-only service account
Create two principals:
| Principal | Purpose | Permissions |
|---|---|---|
| svc_rpg_ingest_writer | Lesson 31 automation | insert into landing and staging only |
| svc_rpg_bi_reader | Looker Studio, Metabase, Hex | select on approved views, no insert, no delete |
Document a one-page deny list: the reader account must not call job APIs, bucket delete APIs, or secret managers. If your cloud console makes that annoying, that is a signal you are still on the default broad role. Narrow it.
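One cheap way to keep that one-page deny list honest is a scripted check against whatever permission export your cloud provides. The action names below are illustrative placeholders, not any specific cloud's API:

```python
# Hypothetical deny-list audit for svc_rpg_bi_reader. The action strings
# are made-up stand-ins; substitute your provider's real permission names.
DENY = {"jobs.create", "buckets.delete", "secrets.access"}

def violations(granted: set[str], deny: set[str] = DENY) -> set[str]:
    """Return any granted actions the reader account must never hold."""
    return granted & deny

# Example export: a broad default role leaking a write-side permission.
reader_grants = {"tables.select", "views.select", "jobs.create"}
print(sorted(violations(reader_grants)))  # ['jobs.create']
```

Run it in CI against a fresh export so a quietly re-broadened role fails a build instead of surfacing in an incident review.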
Step 5 - Publish analyst-safe views
Ship at least these views:
- vw_release_train_current_window mirrors Lesson 30's rolling window rule in SQL so every consumer uses the same predicate.
- vw_yellow_recurrence encodes the recurrence flag logic so exec summaries cannot drift from engineering definitions.
- vw_build_lookup joins dim_build to the latest successful ingest_batch_id per build_id.
Name columns the same way Lessons 28 and 29 name them in briefings so Lesson 28 patch-readiness packets still read familiar when someone deep-links from a slide deck.
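A sketch of vw_build_lookup in SQLite terms. This assumes the batch table also records build_id and a success status per batch; your Lesson 31 loader may keep that status elsewhere, in which case the join target changes but the "latest successful" predicate stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_build (build_id TEXT PRIMARY KEY, promotion_date TEXT);
CREATE TABLE stg_ingest_batch (
    ingest_batch_id TEXT, build_id TEXT, ingest_utc TEXT, status TEXT);

-- vw_build_lookup: each build joined to its latest successful batch.
CREATE VIEW vw_build_lookup AS
SELECT d.build_id, d.promotion_date, b.ingest_batch_id
FROM dim_build d
JOIN stg_ingest_batch b ON b.build_id = d.build_id
WHERE b.status = 'success'
  AND b.ingest_utc = (
      SELECT MAX(ingest_utc) FROM stg_ingest_batch
      WHERE build_id = d.build_id AND status = 'success');
""")

conn.execute("INSERT INTO dim_build VALUES ('b101', '2024-05-01')")
conn.executemany(
    "INSERT INTO stg_ingest_batch VALUES (?, ?, ?, ?)",
    [("batch-1", "b101", "2024-05-01T10:00Z", "success"),
     ("batch-2", "b101", "2024-05-02T10:00Z", "failed"),
     ("batch-3", "b101", "2024-05-03T10:00Z", "success")])
row = conn.execute("SELECT * FROM vw_build_lookup").fetchone()
print(row)  # ('b101', '2024-05-01', 'batch-3') - failed batch-2 skipped
```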
Step 6 - Partition and prune with intent
Partition fact tables on promotion_date month (or week if you ship very often). Keep cold partitions for legal or publisher audits, but set a documented retention cap for raw webhook JSON if it includes stack snippets. Align that cap with what you already decided in Lesson 31 about PII.
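If your engine lacks native partitioning (the SQLite/Postgres fallback in the FAQ), the same rule reduces to two small helpers: a deterministic mapping from promotion_date to a monthly table label, and a retention check against your documented cap. The 24-month cap here is an assumption; use whatever you agreed with legal.

```python
from datetime import date

def partition_label(promotion_date: str) -> str:
    """Map an ISO promotion_date to its monthly partition/table label."""
    d = date.fromisoformat(promotion_date)
    return f"fact_stability_closure_{d.year}_{d.month:02d}"

def expired(label: str, today: date, keep_months: int = 24) -> bool:
    """True when a partition falls outside the documented retention cap."""
    year, month = map(int, label.rsplit("_", 2)[-2:])
    age_months = (today.year - year) * 12 + (today.month - month)
    return age_months > keep_months

print(partition_label("2024-05-17"))  # fact_stability_closure_2024_05
```

A scheduled job that drops every label where expired(...) is true gives you pruning with the same auditability as a cloud warehouse's partition expiry setting.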
Pro tips
- Pro tip: Treat ingest_batch_id like a Git tag. If a batch is wrong, you roll forward with a corrective batch, not silent edits.
- Pro tip: Mirror Launch Lesson 24's scorecard rhythm by materializing a monthly summary table fed only from vw_release_train_current_window, so commercial and stability narratives share one clock.
- Pro tip: Add a cheap row-count diff alert: landing row count versus staging promoted count should match within your duplicate tolerance every run.
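The row-count diff alert needs almost no machinery. A sketch, assuming the duplicate tolerance is expressed as a fraction of the landing count (here a hypothetical 1 percent):

```python
def row_count_ok(landing: int, promoted: int,
                 dup_tolerance: float = 0.01) -> bool:
    """Landing vs promoted row counts must agree within tolerance."""
    if landing == 0:
        return promoted == 0
    return abs(landing - promoted) / landing <= dup_tolerance

# 5 missing rows out of 1000 is within a 1% tolerance; 20 is not.
print(row_count_ok(1000, 995), row_count_ok(1000, 980))  # True False
```

Wire the False case to the same alert channel your Lesson 31 ingest failures use, so one runbook covers both.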
Common mistakes
- Granting the BI tool the same OAuth app your writers use, then wondering why someone dragged a formula over production ingest credentials.
- Letting analysts create private extracts with stale build_id filters that contradict the Lesson 21 style control panel thresholds your launch track already published.
- Skipping stg_ingest_batch metadata, which makes it impossible to answer "which upload broke the train view?" in under five minutes.
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| Dashboard shows double trains | merge key too narrow | require (build_id, ingest_batch_id, source) uniqueness in staging |
| Warehouse costs spike | full table scans on JSON | move parsed columns into typed fact columns, partition by date |
| BI tool timeouts | view joins too wide | precompute vw_release_train_current_window on a schedule |
Mini challenge
In one document, write:
- The exact CREATE VIEW (or equivalent) definition for vw_release_train_current_window, including the rolling window predicate in plain English and SQL.
- The IAM or role binding list for svc_rpg_bi_reader.
- A rollback note: how you would freeze reads and revert to Lesson 31 CSV-only mode for one release week.
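One possible shape for the first deliverable, assuming the Lesson 30 rolling window means "builds promoted in the last 56 days"; substitute your actual rule. SQLite again stands in for your warehouse, so the date arithmetic syntax will differ elsewhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_build (build_id TEXT PRIMARY KEY, promotion_date TEXT);
CREATE TABLE fact_release_train_snapshot (
    build_id TEXT, train_note TEXT, recurrence_flag INTEGER);

-- Plain English: show snapshots only for builds promoted within the
-- last 56 days (an assumed window; use your Lesson 30 rule).
CREATE VIEW vw_release_train_current_window AS
SELECT f.build_id, f.train_note, f.recurrence_flag, d.promotion_date
FROM fact_release_train_snapshot f
JOIN dim_build d USING (build_id)
WHERE d.promotion_date >= date('now', '-56 days');
""")

conn.execute("INSERT INTO dim_build VALUES ('b_old', '2000-01-01')")
conn.execute("INSERT INTO dim_build VALUES ('b_new', date('now'))")
conn.executemany(
    "INSERT INTO fact_release_train_snapshot VALUES (?, ?, ?)",
    [("b_old", "legacy train", 0), ("b_new", "current train", 1)])
rows = [r[0] for r in conn.execute(
    "SELECT build_id FROM vw_release_train_current_window")]
print(rows)  # ['b_new']: the old build falls outside the window
```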
FAQ
Do I need dbt or Airflow on day one?
No. Scheduled SQL in your warehouse console plus version-controlled .sql files in Git is enough until you have three failing merges a month.
How do I keep warehouse schema in sync with Lesson 31 CSV headers?
Add a CI check that parses the header row of a golden sample CSV and compares it to a checked-in schema.json. Fail the build when someone renames recurrence_flag without updating the loader.
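That CI check fits in a few lines. The file contents are inlined here for illustration; in CI you would read the golden sample CSV and schema.json from the repo.

```python
import csv
import io
import json

# Golden sample header vs checked-in schema, inlined for the sketch.
golden_csv = "build_id,promotion_date,recurrence_flag\nb101,2024-05-01,1\n"
schema_json = '{"columns": ["build_id", "promotion_date", "recurrence_flag"]}'

header = next(csv.reader(io.StringIO(golden_csv)))  # first row only
expected = json.loads(schema_json)["columns"]
assert header == expected, f"schema drift: {header} != {expected}"
print("headers match")
```

Renaming recurrence_flag in either file without the other now trips the assertion and fails the build, which is the whole point.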
What if we use SQLite or Postgres instead of a cloud warehouse?
The same contracts apply. Use schemas, roles, and views. Swap partition syntax for table inheritance or monthly tables.
Lesson recap
You can now:
- describe a minimal star layout for RPG stability metrics without over-modeling
- separate writer and reader credentials for ingest versus BI
- expose only vetted views that match Lesson 29 through Lesson 31 language
- partition and retain data with the same seriousness you apply to player crash bundles in the crash triage tooling overview
Next lesson teaser
When dashboards multiply, add query budgets and billing guardrails so warehouse spend stays predictable: Lesson 33: Warehouse Query Budgets and Cost Guardrails for RPG Live-Ops Metrics. Until then, instrument warehouse spend per ingest_batch_id week and review it beside your Lesson 30 train notes.
Related learning
- Lesson 31: Rolling Stability Dashboard CSV Export and Webhook Ingest Automation
- Lesson 30: Multi-Patch Rolling Stability Dashboard
- Lesson 26: Analytics Confidence Review and Live-Ops Handoff
- Unity Cloud Save Conflict Resolution Overwrites Newer Data — different domain, same insistence on explicit merge keys and visible failure modes.
If this lesson stopped a silent schema drift from erasing trust in your yellow mitigation story, save it beside your Lesson 28 briefing template runbook.