Lesson 32: Read-Only Analytics Warehouse Contracts and Service Accounts

Lesson 31 automated how rows arrive. This lesson defines where they live when spreadsheets stop being enough: a read-only analytics warehouse layer that stores the same closure and train facts you already trust, but with schemas, partitions, and accounts that scale to more builds and more stakeholders.

You are not building a full data mesh. You are adding one disciplined read layer so product, support, and production can query metrics without touching the ingest writer or raw webhook secrets.


What you will build

By the end of this lesson, you will have:

  1. A three-table contract (build dimension, closure facts, ingest audit) that maps directly onto Lesson 29 through Lesson 31 fields
  2. A read-only service account pattern for BI tools with explicit deny lists on write APIs
  3. A partitioning rule keyed on promotion_date so historical trains stay fast to scan
  4. A published view layer that hides webhook payload columns from casual consumers

Step 1 - Decide the graduation trigger

Move off pure Sheets when at least one of these is true:

  • you have more than twelve active build_id values in the rolling window and scroll lag breaks standups
  • two teams need the same numbers with different filters and you are tired of duplicating tabs
  • you need queryable history for audits without granting edit rights to the ingest automation account

If none apply, stay on Lesson 31. Warehouse work has a carrying cost.

Step 2 - Freeze the logical model before picking a vendor

Start vendor-neutral. Map columns you already export:

Logical entity | Source lessons | Minimum columns
dim_build | Lesson 29 closures, Lesson 30 columns | build_id, promotion_date, platform_targets, owner_lane
fact_stability_closure | Lesson 29 row at close | build_id, yellow_open_close, stability_signal, dialogue_signal, closure_utc
fact_release_train_snapshot | Lesson 30 grid | build_id, train_note, recurrence_flag, rolling_window_label
stg_ingest_batch | Lesson 31 | ingest_batch_id, source, ingest_utc, row_count, checksum

Add a surrogate key only if it genuinely makes joins easier in your warehouse. Otherwise, build_id plus ingest_batch_id stays the composite truth anchor, matching your Lesson 26 analytics confidence habit of naming the evidence object explicitly.
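
To make the contract concrete, here is a minimal DDL sketch under the composite-key approach, assuming a Postgres-style warehouse; the column types are illustrative, and the composite primary key is the part that matters.

    -- Illustrative DDL for the Step 2 contract; types are sketches, adjust to your dialect.
    CREATE TABLE dim_build (
        build_id         text PRIMARY KEY,
        promotion_date   date NOT NULL,
        platform_targets text,
        owner_lane       text
    );

    CREATE TABLE fact_stability_closure (
        build_id          text NOT NULL REFERENCES dim_build (build_id),
        ingest_batch_id   text NOT NULL,
        yellow_open_close integer,
        stability_signal  text,
        dialogue_signal   text,
        closure_utc       timestamptz,
        PRIMARY KEY (build_id, ingest_batch_id)   -- composite truth anchor, no surrogate
    );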

Step 3 - Land data in staging, promote to facts

Never let webhooks or CSV loaders write straight into BI-facing views.

  1. Landing tables or buckets accept raw files exactly as Lesson 31 produces them.
  2. Staging applies type casts, rejects unknown headers, and quarantines bad rows with a reject_reason column.
  3. Facts receive idempotent merges keyed on (build_id, ingest_batch_id) the same way you described in Lesson 31.

If a row fails staging, the dashboard should still show the last good fact row for that build_id, not a blank cell. That is a player-trust issue dressed up as SQL.
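
A minimal sketch of step 3's idempotent promotion, using Postgres-style INSERT ... ON CONFLICT in place of a dialect-specific MERGE; the staging table name stg_stability_closure is hypothetical, and the reject_reason column follows the staging convention above.

    -- Promote clean staging rows into the fact; re-running the same batch updates in place.
    INSERT INTO fact_stability_closure (
        build_id, ingest_batch_id, yellow_open_close,
        stability_signal, dialogue_signal, closure_utc
    )
    SELECT build_id, ingest_batch_id, yellow_open_close,
           stability_signal, dialogue_signal, closure_utc
    FROM stg_stability_closure               -- hypothetical staging table name
    WHERE reject_reason IS NULL              -- quarantined rows never reach the fact
    ON CONFLICT (build_id, ingest_batch_id) DO UPDATE SET
        yellow_open_close = EXCLUDED.yellow_open_close,
        stability_signal  = EXCLUDED.stability_signal,
        dialogue_signal   = EXCLUDED.dialogue_signal,
        closure_utc       = EXCLUDED.closure_utc;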

Step 4 - Create the read-only service account

Create two principals:

Principal | Purpose | Permissions
svc_rpg_ingest_writer | Lesson 31 automation | insert into landing and staging only
svc_rpg_bi_reader | Looker Studio, Metabase, Hex | select on approved views; no insert, no delete

Document a one-page deny list: the reader account must not call job APIs, bucket delete APIs, or secret managers. If your cloud console makes that annoying, that is a signal you are still on the default broad role. Narrow it.
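
A sketch of the role bindings, assuming Postgres-style grants and that the landing, staging, and analytics schema names used here are illustrative:

    -- Writer: landing and staging only, nothing BI-facing.
    GRANT USAGE  ON SCHEMA landing, staging TO svc_rpg_ingest_writer;
    GRANT INSERT ON ALL TABLES IN SCHEMA landing TO svc_rpg_ingest_writer;
    GRANT INSERT ON ALL TABLES IN SCHEMA staging TO svc_rpg_ingest_writer;

    -- Reader: approved views only, read-only; no UPDATE, DELETE, or DDL anywhere.
    GRANT USAGE  ON SCHEMA analytics TO svc_rpg_bi_reader;
    GRANT SELECT ON analytics.vw_release_train_current_window,
                    analytics.vw_yellow_recurrence,
                    analytics.vw_build_lookup
        TO svc_rpg_bi_reader;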

Step 5 - Publish analyst-safe views

Ship at least these views:

  • vw_release_train_current_window mirrors Lesson 30’s rolling window rule in SQL so every consumer uses the same predicate.
  • vw_yellow_recurrence encodes the recurrence flag logic so exec summaries cannot drift from engineering definitions.
  • vw_build_lookup joins dim_build to the latest successful ingest_batch_id per build_id.

Name view columns the same way Lessons 28 and 29 name them in briefings, so Lesson 28 patch-readiness packets still read familiar when someone deep-links from a slide deck.
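
As one way to express vw_build_lookup, here is a sketch that treats the "latest successful batch" as the most recent batch that promoted at least one row for the build; both the linkage through fact_stability_closure and that success rule are assumptions to pin to your own Lesson 31 audit fields.

    CREATE VIEW analytics.vw_build_lookup AS
    SELECT d.build_id,
           d.promotion_date,
           d.platform_targets,
           d.owner_lane,
           b.ingest_batch_id AS latest_ingest_batch_id,
           b.ingest_utc      AS latest_ingest_utc
    FROM dim_build AS d
    LEFT JOIN LATERAL (
        SELECT s.ingest_batch_id, s.ingest_utc
        FROM fact_stability_closure AS f
        JOIN stg_ingest_batch AS s USING (ingest_batch_id)
        WHERE f.build_id = d.build_id
          AND s.row_count > 0          -- assumption: "successful" means the batch promoted rows
        ORDER BY s.ingest_utc DESC
        LIMIT 1
    ) AS b ON true;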

Step 6 - Partition and prune with intent

Partition fact tables by promotion_date month (or by week if you ship very often). Keep cold partitions for legal or publisher audits, but set a documented retention cap for raw webhook JSON if it includes stack snippets. Align that cap with what you already decided in Lesson 31 about PII.
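
If you partition, the Step 2 fact sketch gains a promotion_date column (denormalized from dim_build purely for pruning, which is an assumption) and the partition key joins the primary key; a Postgres-flavoured sketch:

    -- Revised fact DDL with monthly range partitions on promotion_date.
    CREATE TABLE fact_stability_closure (
        build_id          text NOT NULL,
        ingest_batch_id   text NOT NULL,
        promotion_date    date NOT NULL,
        yellow_open_close integer,
        stability_signal  text,
        dialogue_signal   text,
        closure_utc       timestamptz,
        PRIMARY KEY (build_id, ingest_batch_id, promotion_date)  -- Postgres needs the partition key in the PK
    ) PARTITION BY RANGE (promotion_date);

    -- One partition per month; cold months stay queryable for audits until the retention cap.
    CREATE TABLE fact_stability_closure_2025_01
        PARTITION OF fact_stability_closure
        FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');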

Pro tips

  • Pro tip: Treat ingest_batch_id like a Git tag. If a batch is wrong, you roll forward with a corrective batch, not silent edits.
  • Pro tip: Mirror Launch Lesson 24 scorecard rhythm by materializing a monthly summary table fed only from vw_release_train_current_window, so commercial and stability narratives share one clock.
  • Pro tip: Add a cheap row count diff alert: landing row count versus staging promoted count should match within your duplicate tolerance every run (a query sketch follows this list).
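
A sketch of that reconciliation query, assuming stg_ingest_batch.row_count records the landing count, the hypothetical stg_stability_closure table carries reject_reason, and a tolerance of 2 rows stands in for your own duplicate tolerance:

    -- Flag batches where promoted staging rows drift from the landing row count.
    SELECT l.ingest_batch_id,
           l.row_count                            AS landing_rows,
           COUNT(s.ingest_batch_id)               AS promoted_rows,
           l.row_count - COUNT(s.ingest_batch_id) AS diff
    FROM stg_ingest_batch AS l
    LEFT JOIN stg_stability_closure AS s
           ON s.ingest_batch_id = l.ingest_batch_id
          AND s.reject_reason IS NULL
    GROUP BY l.ingest_batch_id, l.row_count
    HAVING ABS(l.row_count - COUNT(s.ingest_batch_id)) > 2;  -- 2 = example duplicate tolerance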

Common mistakes

  • Granting the BI tool the same OAuth app your writers use, then wondering why someone dragged a formula over production ingest credentials.
  • Letting analysts create private extracts with stale build_id filters that contradict the Lesson 21-style control panel thresholds your launch track already published.
  • Skipping stg_ingest_batch metadata, which makes it impossible to answer “which upload broke the train view?” in under five minutes.

Troubleshooting

Symptom | Likely cause | Fix
Dashboard shows double trains | merge key too narrow | require (build_id, ingest_batch_id, source) uniqueness in staging
Warehouse costs spike | full table scans on JSON | move parsed columns into typed fact columns, partition by date
BI tool timeouts | view joins too wide | precompute vw_release_train_current_window on a schedule
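
If the double-train symptom shows up, a uniqueness constraint in staging is a cheap guardrail; a sketch, assuming the staging table is named stg_stability_closure and carries a source column:

    -- Reject duplicate loads of the same build/batch/source at write time.
    ALTER TABLE stg_stability_closure
        ADD CONSTRAINT uq_stg_closure_build_batch_source
        UNIQUE (build_id, ingest_batch_id, source);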

Mini challenge

In one document, write:

  1. The exact CREATE VIEW (or equivalent) definition for vw_release_train_current_window, including the rolling window predicate in plain English and SQL.
  2. The IAM or role binding list for svc_rpg_bi_reader.
  3. A rollback note: how you would freeze reads and revert to Lesson 31 CSV-only mode for one release week.

FAQ

Do I need dbt or Airflow on day one?

No. Scheduled SQL in your warehouse console plus version-controlled .sql files in Git is enough until you have three failing merges a month.

How do I keep warehouse schema in sync with Lesson 31 CSV headers?

Add a CI check that parses the header row of a golden sample CSV and compares it to a checked-in schema.json. Fail the build when someone renames recurrence_flag without updating the loader.

What if we use SQLite or Postgres instead of a cloud warehouse?

The same contracts apply. Use schemas, roles, and views. Swap partition syntax for table inheritance or monthly tables.

Lesson recap

You can now:

  • describe a minimal star layout for RPG stability metrics without over-modeling
  • separate writer and reader credentials for ingest versus BI
  • expose only vetted views that match Lesson 29 through Lesson 31 language
  • partition and retain data with the same seriousness you apply to player crash bundles in the crash triage tooling overview

Next lesson teaser

When dashboards multiply, add query budgets and billing guardrails so warehouse spend stays predictable: Lesson 33: Warehouse Query Budgets and Cost Guardrails for RPG Live-Ops Metrics. Until then, instrument warehouse spend per ingest_batch_id each week and review it beside your Lesson 30 train notes.

Related learning

If this lesson stopped a silent schema drift from erasing trust in your yellow mitigation story, save it beside your Lesson 28 briefing template runbook.