Lesson 180: AI-Assisted Governance Packet Red-Team Prompts with Human Sign-Off Gate (2026)
Direct answer: Lesson 179 kept regional reviewer comments honest. Lesson 180 lets you use an LLM to stress-test frozen governance packets only as a read-only critic—findings land in an append-only log, and humans alone promote text into signer-visible annex rows.

Why this matters now (mid-2026)
Mid-2026 partner questionnaires stopped asking whether you use AI and started asking what touches submission packets:
- Teams paste model output straight into FAQ-bound annex tables from Lesson 170, and reviewers flag unattested AI-generated annex risk.
- Engineers wire “helpful” auto-fix scripts that rewrite partner_reply_packet bodies from Lesson 176 after an LLM pass, so silent tuple drift meets forbid-silent-rewrite CI failures.
- Legal wants a blanket ban on LLMs; product still wants red-team coverage. The defensible middle path is frozen packet in → findings out → human promotion only.
This lesson is that middle path—not a vibe checklist.
Lesson objectives
You will implement:
- governance_packet_red_team_run bound to publish_tuple_hash and frozen packet bytes
- Versioned red_team_prompt_template with read-only packet scope
- Append-only red_team_finding rows (severity, section ref, model rationale not promoted until human ack)
- human_sign_off_promotion lane (who promoted, when, diff hash)
- forbid_auto_mutate_signer_fields guard on annex / reply tables
- Retention policy for model I/O logs aligned to WORM discipline from Lesson 175
- Publish-gate extension ai_red_team_human_signoff_pending
Prerequisites
- Lesson 171: active publish_tuple_hash freeze before any red-team run
- Lesson 176: reply packets versioned; red-team never overwrites byte_hash
- Lesson 175: append-only archive; red-team logs are new objects, not mutations
- Lesson 170: executive readback / FAQ annex field list defines signer-visible columns
- Optional: 15 Free LLM-Driven NPC Dialogue and Local Fallback Net Resources — moderation + human-gate patterns for game LLM lanes; this lesson applies the same discipline to governance packets
Signer-visible field contract
Maintain a CSV or JSON allow-list exported from Lesson 170 FAQ schema:
{
  "signer_visible_fields": [
    "annex.executive_summary",
    "annex.deficiency_table",
    "annex.faq_bound_rows",
    "partner_reply_packet.body_markdown"
  ],
  "model_may_write": [],
  "human_promotion_targets": [
    "red_team_finding.promoted_annex_patch"
  ]
}
Rule: model_may_write stays empty in production. CI fails if any automation writes a key listed in signer_visible_fields.
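The rule above can be sketched as a small CI check. This is a minimal illustration, not the lesson's actual pipeline: the function name check_allow_list and the automation_writes input (a list of field keys that some automation declares it will write) are assumptions introduced here.

```python
import json

# Allow-list copied from the contract above.
ALLOW_LIST_JSON = """
{
  "signer_visible_fields": [
    "annex.executive_summary",
    "annex.deficiency_table",
    "annex.faq_bound_rows",
    "partner_reply_packet.body_markdown"
  ],
  "model_may_write": [],
  "human_promotion_targets": ["red_team_finding.promoted_annex_patch"]
}
"""


def check_allow_list(allow_list: dict, automation_writes: list[str]) -> list[str]:
    """Return a list of violations; an empty list means the check passes.

    automation_writes is a hypothetical input: the field keys an automated
    job declares it will write.
    """
    violations = []
    if allow_list.get("model_may_write"):
        violations.append("model_may_write must be empty in production")
    signer_visible = set(allow_list.get("signer_visible_fields", []))
    for key in automation_writes:
        if key in signer_visible:
            violations.append(f"automation writes signer-visible field: {key}")
    return violations


allow_list = json.loads(ALLOW_LIST_JSON)
ok = check_allow_list(allow_list, ["red_team_finding.finding_text"])
bad = check_allow_list(allow_list, ["annex.faq_bound_rows"])
```

Here ok is an empty list, while bad carries one violation that a CI job could print before exiting non-zero.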
governance_packet_red_team_run
CREATE TABLE governance_packet_red_team_run (
  red_team_run_id         TEXT PRIMARY KEY,
  publish_tuple_hash      TEXT NOT NULL,
  packet_archive_id       TEXT REFERENCES packet_archive_pointer(archive_id),
  prompt_template_id      TEXT NOT NULL,
  prompt_template_version TEXT NOT NULL,
  frozen_packet_sha256    TEXT NOT NULL,
  model_provider          TEXT NOT NULL,
  model_id                TEXT NOT NULL,
  started_at_utc          TIMESTAMPTZ NOT NULL DEFAULT now(),
  completed_at_utc        TIMESTAMPTZ,
  run_status              TEXT NOT NULL DEFAULT 'running'
    CHECK (run_status IN ('running', 'completed', 'failed', 'aborted')),
  UNIQUE (publish_tuple_hash, prompt_template_id, prompt_template_version)
    DEFERRABLE INITIALLY DEFERRED
);
Invariant: frozen_packet_sha256 must match the Lesson 171 active tuple manifest at started_at_utc. Abort the run if the tuple is promoted mid-flight.
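A minimal sketch of that invariant, assuming the worker can read the frozen packet bytes and the manifest's expected digest. The names TupleDriftError and assert_frozen are hypothetical, and the manifest lookup itself is out of scope here.

```python
import hashlib


class TupleDriftError(RuntimeError):
    """Raised when the packet bytes no longer match the active manifest."""


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def assert_frozen(packet_bytes: bytes, manifest_sha256: str) -> str:
    """Return the digest if it matches the manifest, otherwise abort."""
    digest = sha256_hex(packet_bytes)
    if digest != manifest_sha256:
        raise TupleDriftError(
            f"frozen_packet_sha256 mismatch: got {digest}, manifest has {manifest_sha256}"
        )
    return digest


packet = b'{"annex": {"executive_summary": "synthetic"}}'
manifest_hash = sha256_hex(packet)  # stand-in for the Lesson 171 manifest value

# Check once at run start and store the result on the run row.
frozen_packet_sha256 = assert_frozen(packet, manifest_hash)
# Check again just before writing findings, so a mid-flight promotion aborts the run.
assert_frozen(packet, manifest_hash)
```

Calling the check twice is a design choice of this sketch: the second call is what turns a tuple promoted mid-flight into an aborted run rather than a set of findings against stale bytes.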
red_team_prompt_template
Store templates outside the packet:
CREATE TABLE red_team_prompt_template (
  prompt_template_id   TEXT NOT NULL,
  template_version     TEXT NOT NULL,
  system_preamble      TEXT NOT NULL,
  user_prompt_skeleton TEXT NOT NULL,
  max_output_tokens    INT NOT NULL CHECK (max_output_tokens BETWEEN 256 AND 8192),
  PRIMARY KEY (prompt_template_id, template_version)
);
Skeleton rules (embed in system_preamble):
- You receive read-only JSON of the frozen packet, with no instructions to edit it.
- Output only a JSON array of findings: { "section_ref", "severity", "finding_text", "suggested_human_action" }.
- Never emit replacement annex text, only critiques.
- Flag contradictions between annex totals and leadership rollup language (Lesson 166 epsilon vocabulary).
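The output contract in these rules can be enforced before anything reaches the findings table. The following is a sketch under assumptions: parse_findings is a hypothetical helper, and rejecting any key outside the four listed ones is one possible way to honor the "critiques only" rule.

```python
import json

REQUIRED_KEYS = {"section_ref", "severity", "finding_text", "suggested_human_action"}
SEVERITIES = {"blocker", "major", "minor", "info"}


def parse_findings(raw_model_output: str) -> list[dict]:
    """Validate raw model output and return rows ready for append-only insert."""
    data = json.loads(raw_model_output)  # invalid JSON raises a ValueError subclass
    if not isinstance(data, list):
        raise ValueError("model output must be a JSON array of findings")
    findings = []
    for i, item in enumerate(data):
        # Exact key match: an extra key such as replacement text is rejected.
        if not isinstance(item, dict) or set(item) != REQUIRED_KEYS:
            raise ValueError(f"finding {i}: keys must be exactly {sorted(REQUIRED_KEYS)}")
        if item["severity"] not in SEVERITIES:
            raise ValueError(f"finding {i}: unknown severity {item['severity']!r}")
        # Every parsed row starts unpromoted; nothing is promoted automatically.
        findings.append({**item, "promotion_status": "pending_human"})
    return findings


sample_output = json.dumps([
    {
        "section_ref": "annex.faq_bound_rows",
        "severity": "major",
        "finding_text": "FAQ total contradicts the deficiency table.",
        "suggested_human_action": "Reconcile the two totals.",
    }
])
findings = parse_findings(sample_output)
```

A parse failure would be a reason to mark the run failed and keep the raw log, rather than to retry with a looser parser.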
red_team_finding (append-only)
CREATE TABLE red_team_finding (
  finding_id             UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  red_team_run_id        TEXT NOT NULL REFERENCES governance_packet_red_team_run(red_team_run_id),
  section_ref            TEXT NOT NULL,
  severity               TEXT NOT NULL CHECK (severity IN ('blocker', 'major', 'minor', 'info')),
  finding_text           TEXT NOT NULL,
  suggested_human_action TEXT,
  model_rationale        TEXT,
  created_at_utc         TIMESTAMPTZ NOT NULL DEFAULT now(),
  promotion_status       TEXT NOT NULL DEFAULT 'pending_human'
    CHECK (promotion_status IN ('pending_human', 'promoted', 'rejected', 'deferred'))
);
No UPDATE on finding_text after insert—rejections append red_team_finding_event rows instead.
Human-only promotion lane
CREATE TABLE human_sign_off_promotion (
  promotion_id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  finding_id           UUID NOT NULL REFERENCES red_team_finding(finding_id),
  promoted_by_email    TEXT NOT NULL,
  promoted_at_utc      TIMESTAMPTZ NOT NULL DEFAULT now(),
  target_field         TEXT NOT NULL,
  promoted_text_sha256 TEXT NOT NULL,
  diff_against_frozen  TEXT NOT NULL,
  signer_ack_required  BOOLEAN NOT NULL DEFAULT true,
  -- Mirrors signer_visible_fields; keep in sync with the Lesson 170 allow-list
  -- and re-check in the app layer. (A comparison against an empty array is
  -- never true, so it would reject every insert.)
  CHECK (target_field IN (
    'annex.executive_summary',
    'annex.deficiency_table',
    'annex.faq_bound_rows',
    'partner_reply_packet.body_markdown'
  ))
);
Workflow:
- Model run completes → findings pending_human.
- Governance owner reviews in UI and copies acceptable phrasing manually into the annex editor.
- Insert human_sign_off_promotion with SHA-256 of exact promoted bytes.
- Route Lesson 174 signer ack if signer_ack_required.
- Only then set promotion_status = 'promoted'.
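The promotion row in this workflow could be assembled roughly as follows. This is a sketch, not the lesson's tooling: build_promotion_row is a hypothetical helper, UTF-8 is an assumed encoding for the "exact promoted bytes", and a unified diff is one plausible format for diff_against_frozen.

```python
import difflib
import hashlib
from datetime import datetime, timezone

SIGNER_VISIBLE_FIELDS = {
    "annex.executive_summary",
    "annex.deficiency_table",
    "annex.faq_bound_rows",
    "partner_reply_packet.body_markdown",
}


def build_promotion_row(finding_id: str, promoted_by_email: str, target_field: str,
                        frozen_text: str, promoted_text: str) -> dict:
    """Build a human_sign_off_promotion row for text a human typed into the annex editor."""
    if target_field not in SIGNER_VISIBLE_FIELDS:
        raise ValueError(f"target_field not on the signer-visible allow-list: {target_field}")
    diff_lines = difflib.unified_diff(
        frozen_text.splitlines(), promoted_text.splitlines(),
        fromfile="frozen", tofile="promoted", lineterm="",
    )
    return {
        "finding_id": finding_id,
        "promoted_by_email": promoted_by_email,
        "promoted_at_utc": datetime.now(timezone.utc).isoformat(),
        "target_field": target_field,
        # Hash of the exact promoted bytes (UTF-8 assumed).
        "promoted_text_sha256": hashlib.sha256(promoted_text.encode("utf-8")).hexdigest(),
        "diff_against_frozen": "\n".join(diff_lines),
        "signer_ack_required": True,
    }


row = build_promotion_row(
    finding_id="00000000-0000-0000-0000-000000000001",  # synthetic id
    promoted_by_email="owner@example.com",
    target_field="annex.faq_bound_rows",
    frozen_text="Total deficiencies: 4",
    promoted_text="Total deficiencies: 5",
)
```

The function takes the human-authored text as input and never sees the model's output, which keeps the model out of the write path even in the helper code.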
Forbid auto-mutate on signer tables
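CREATE OR REPLACE FUNCTION forbid_auto_mutate_signer_fields()
RETURNS TRIGGER AS $$
BEGIN
  -- The second argument (true) returns NULL instead of erroring when the setting is absent.
  IF current_setting('app.red_team_actor', true) = 'llm_pipeline' THEN
    RAISE EXCEPTION 'signer_field_auto_mutate_forbidden';
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Attach to each signer-visible table; partner_reply_packet is shown as one
-- example and the trigger name is illustrative.
CREATE TRIGGER trg_forbid_auto_mutate_partner_reply_packet
  BEFORE INSERT OR UPDATE ON partner_reply_packet
  FOR EACH ROW EXECUTE FUNCTION forbid_auto_mutate_signer_fields();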
Set app.red_team_actor = 'llm_pipeline' only inside the isolated worker role. Human editors use human_editor or default.
Log retention (partner questionnaire row)
| Log type | Retention | Storage class |
|---|---|---|
| Raw model request/response | 90 days hot | Encrypted object store, separate prefix red-team-logs/ |
| red_team_finding rows | Life of cert_window_id + 24 months | Database + export to WORM per Lesson 175 |
| human_sign_off_promotion | Same as findings | Tied to annex version semver |
Export monthly red_team_retention_manifest.json with object keys + SHA-256 for intake packets.
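One way to produce that manifest is sketched below. The JSON layout (manifest_month, objects, object_key, sha256) is an assumption made for illustration, since the lesson only specifies object keys plus SHA-256, and the in-memory dict stands in for a listing of the red-team-logs/ prefix.

```python
import hashlib
import json


def build_retention_manifest(objects: dict[str, bytes], month: str) -> str:
    """Return manifest JSON listing each log object key with its SHA-256."""
    entries = [
        {"object_key": key, "sha256": hashlib.sha256(body).hexdigest()}
        for key, body in sorted(objects.items())  # sorted for a stable, diffable export
    ]
    return json.dumps({"manifest_month": month, "objects": entries}, indent=2)


# Synthetic stand-in for objects under the red-team-logs/ prefix.
logs = {
    "red-team-logs/run-001/response.json": b"[]",
    "red-team-logs/run-001/request.json": b'{"prompt": "synthetic"}',
}
manifest_json = build_retention_manifest(logs, "2026-07")
```

Writing manifest_json to red_team_retention_manifest.json and archiving it alongside the logs gives intake reviewers a single file to verify against.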
Publish-gate coupling
Extend Lesson 171:
-- Policy query: returns the block reason, or NULL when this gate is clear.
SELECT CASE
  WHEN EXISTS (
    SELECT 1
    FROM red_team_finding f
    JOIN governance_packet_red_team_run r USING (red_team_run_id)
    WHERE r.publish_tuple_hash = :active_tuple
      AND f.severity IN ('blocker', 'major')
      AND f.promotion_status = 'pending_human'
  ) THEN 'ai_red_team_human_signoff_pending'
END AS block_reason;
Distinct from follow_the_sun_handoff_stale (Lesson 179) and mock_audit_carve_back_pending (Lesson 178).
Six-step red-team procedure
- Freeze tuple: confirm Lesson 171 gate green for promotion into red-team, not out.
- Snapshot bytes: hash frozen packet; store frozen_packet_sha256 on run row.
- Execute template: single model pass; save raw log to red-team-logs/ prefix.
- Parse findings: insert append-only rows; zero rows promoted automatically.
- Human triage: owner marks promoted/rejected/deferred with promotion table for accepts.
- Signer ack: Lesson 174 route for any promoted blocker; re-run publish gate.
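The last two steps can be rehearsed in memory before wiring them to the database. In this sketch the findings are plain dicts with the run's publish_tuple_hash denormalized onto each row, which is an assumption for brevity; publish_block_reason is a hypothetical name that mirrors the publish-gate policy from this lesson.

```python
from typing import Optional

BLOCKING_SEVERITIES = {"blocker", "major"}


def publish_block_reason(findings: list[dict], active_tuple: str) -> Optional[str]:
    """Return the block reason while any blocker/major finding awaits a human."""
    for f in findings:
        if (f["publish_tuple_hash"] == active_tuple
                and f["severity"] in BLOCKING_SEVERITIES
                and f["promotion_status"] == "pending_human"):
            return "ai_red_team_human_signoff_pending"
    return None


findings = [
    {"publish_tuple_hash": "tuple-a", "severity": "major", "promotion_status": "pending_human"},
    {"publish_tuple_hash": "tuple-a", "severity": "info", "promotion_status": "pending_human"},
]
before_triage = publish_block_reason(findings, "tuple-a")

# Human triage: the owner rejects the major finding; the info row never blocks.
findings[0]["promotion_status"] = "rejected"
after_triage = publish_block_reason(findings, "tuple-a")
```

before_triage holds the block reason and after_triage is None, which matches the expectation that the gate clears only through a human decision.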
Common mistakes
- Promoting model text without a human_sign_off_promotion row: fails 2026 annex AI attestation questions.
- Re-running red-team after tuple promotion without a new run id: findings reference a stale hash.
- Letting CI “fix” annex typos via LLM: bypasses the human lane; use spell-check locally on promoted text only.
- Deleting raw logs at 30 days when the partner asks for 90-day evidence.
- Using red-team to mutate partner_reply_packet.byte_hash: use a new semver row per Lesson 176.
Verification checklist
- [ ] Worker role cannot UPDATE partner_reply_packet.body_markdown (trigger fires).
- [ ] Blocker finding with pending_human blocks publish with distinct block_reason.
- [ ] Promoted text SHA-256 matches annex editor export.
- [ ] Raw log object exists for each red_team_run_id.
- [ ] Lesson 170 FAQ export lists count of AI-assisted findings promoted vs rejected.
Mini exercise (90 minutes)
Freeze a synthetic packet with one intentional FAQ contradiction. Run red-team template v1. Confirm model returns a major finding. Attempt scripted UPDATE on annex via llm_pipeline role—watch trigger fail. Manually promote corrected sentence, insert human_sign_off_promotion, clear gate.
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| Gate stuck after “fix” | Finding still pending_human | Reject or promote all blocker/major rows |
| Partner says no AI log | Retention manifest missing | Export red_team_retention_manifest.json |
| Duplicate findings | Re-run same template version | Bump template_version or abort duplicate run |
| Signer rejects promoted text | Promotion bypassed FAQ schema | Re-promote only allow-listed fields |
Pro tips
- Tip (template semver): bump minor when a partner questionnaire adds a new annex section.
- Tip (pair with resource list): game NPC LLM fallback discipline and governance red-team share the same human-before-signer rule.
- Tip (defer info findings): do not block publish on info; partners care about blockers/majors only.
Next lesson teaser
The next lesson (Lesson 181: Q1 2027 Cert Intake Rehearsal Calendar Export from Lesson 172 Rubric JSON (2026)) exports the 172 rubric into ICS + governance README so Q1 2027 tabletop rooms are not double-booked in October 2026 planning decks. Lesson 180 keeps AI out of signer-visible text; Lesson 181 keeps calendar chaos out of intake week.
Continuity
- Paired Unity guide chapter (next Guide-Create pass): Unity 6.6 LTS OpenXR governance red-team prompt exporter preflight — ScriptableObject template + frozen packet hash gate before any LLM call.
- Lesson 179: regional ingestion findings may feed red-team section_ref tags.
- Lesson 178: carve-back tickets should not be auto-closed by model suggestions.
- Lesson 177: dictionary columns in frozen packet must match migration phase.
- Lesson 176: promoted patches create new reply packet semver, never in-place hash rewrite.
- Lesson 175: red-team raw logs append to WORM prefix, not mutate archive rows.
- Lesson 174: signer ack on promoted blockers.
- Lesson 172: rubric JSON becomes Lesson 181 calendar source.
- Lesson 171: tuple freeze is the red-team entry gate.
- Lesson 170: FAQ field allow-list defines signer-visible boundary.
FAQ
Can we ban LLMs entirely and skip this lesson?
Yes for production annex authoring. You still need a written “no AI on signer fields” policy in intake packets—partners ask even when you do not run models.
Does red-team replace mock audit tabletop?
No. Lesson 172 tabletop stays human-scored; red-team is extra critique on frozen bytes before send.
Which model provider?
Any—log model_provider + model_id on the run row. Policy is process, not vendor.
Can deferred findings ship?
Yes if severity is minor/info only. Blocker/major pending_human keeps publish gate red.
Partners in 2026 do not fear LLMs—they fear unattested LLM edits on annexes they must sign. Red-team loudly, promote quietly, and keep every signer-visible byte behind a human thumbprint.