Lesson 22: Prompt Guardrails, Lore Consistency, and Failure Modes
You already reconciled the AI RPG syllabus and identified the first high-value gap. This lesson closes that gap with one practical outcome: a guardrail pattern that keeps generated dialogue inside your game world even when API responses are noisy or delayed.
Your goal is not perfect AI output. Your goal is predictable player-facing behavior under real production conditions.
What You Will Build
By the end of this lesson, you will have:
- A lore contract your prompts must follow
- A two-layer prompt structure (system + scene constraints)
- A response validator for format and lore drift
- A fallback ladder for timeout, refusal, and malformed output
- A quick QA matrix for pass/fail verification
Step 1 - Define the Lore Contract Before Prompting
Do not start with prompt wording. Start with world rules.
Create a lore_contract.json (or ScriptableObject equivalent) that includes:
- region names and faction vocabulary
- canonical timeline boundaries
- banned terms or out-of-setting references
- allowed tone bands (grim, neutral, playful)
- hard no-go content for your age rating target
If this contract is vague, no amount of prompt wording will keep dialogue consistent.
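As a minimal sketch of what such a contract could look like, the snippet below loads a small JSON document and rejects it if any top-level section is missing. The key names, the faction "Gate Wardens", the timeline years, and the banned terms are all placeholder assumptions for illustration; only "Ashfall Gate" and the three tone bands come from this lesson.

```python
import json

# Hypothetical contract contents. Every key name and value is a placeholder
# except "Ashfall Gate" and the tone bands, which appear in this lesson.
LORE_CONTRACT_JSON = """
{
  "regions": ["Ashfall Gate"],
  "factions": ["Gate Wardens"],
  "timeline": {"earliest_year": 0, "latest_year": 412},
  "banned_terms": ["smartphone", "internet"],
  "tone_bands": ["grim", "neutral", "playful"],
  "no_go_content": ["graphic gore"]
}
"""

# One key per bullet in the list above (names are assumptions).
REQUIRED_KEYS = {
    "regions", "factions", "timeline",
    "banned_terms", "tone_bands", "no_go_content",
}


def load_lore_contract(text: str) -> dict:
    """Parse the contract and fail loudly if a required section is absent."""
    contract = json.loads(text)
    missing = REQUIRED_KEYS - contract.keys()
    if missing:
        raise ValueError(f"lore contract missing keys: {sorted(missing)}")
    return contract


contract = load_lore_contract(LORE_CONTRACT_JSON)
```

Failing at load time is a deliberate choice in this sketch: an incomplete contract is caught before any prompt is built, not after a bad line reaches the player.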
Mini Task
Write 10 non-negotiable lore constraints for your current vertical slice zone and NPC cast.
Step 2 - Use a Two-Layer Prompt Template
Split your prompt into:
- Layer A: System guardrails (global, reusable)
- Layer B: Scene intent (quest state, NPC motivation, current player context)
This keeps your logic composable and easier to debug.
Example structure:
[SYSTEM]
You are a dialogue assistant for a fantasy RPG.
Follow lore_contract rules exactly.
Never reference modern real-world brands or technologies.
Output valid JSON only with keys: line, intent, flags.
[SCENE]
NPC: Quartermaster Elrin
Location: Ashfall Gate
Quest State: player missing item proof
Tone: firm but not hostile
Goal: provide clue, avoid revealing hidden faction spoiler
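One way to keep the two layers composable is to store Layer A as a constant and render Layer B from a small dictionary. The sketch below reuses the text of the example structure; the function names `build_scene_layer` and `build_prompt`, and the choice to join everything into one string, are assumptions for illustration and not tied to any specific model API.

```python
# Layer A: global guardrails, reused for every NPC (text from the example above).
SYSTEM_LAYER = "\n".join([
    "You are a dialogue assistant for a fantasy RPG.",
    "Follow lore_contract rules exactly.",
    "Never reference modern real-world brands or technologies.",
    "Output valid JSON only with keys: line, intent, flags.",
])

# Layer B fields, in the order they are rendered.
SCENE_FIELDS = ("NPC", "Location", "Quest State", "Tone", "Goal")


def build_scene_layer(scene: dict) -> str:
    """Render the scene layer, refusing to build it from incomplete context."""
    missing = [field for field in SCENE_FIELDS if not scene.get(field)]
    if missing:
        raise ValueError(f"scene is missing fields: {missing}")
    return "\n".join(f"{field}: {scene[field]}" for field in SCENE_FIELDS)


def build_prompt(scene: dict) -> str:
    """Combine the reusable system layer with one scene layer."""
    return f"[SYSTEM]\n{SYSTEM_LAYER}\n[SCENE]\n{build_scene_layer(scene)}"


scene = {
    "NPC": "Quartermaster Elrin",
    "Location": "Ashfall Gate",
    "Quest State": "player missing item proof",
    "Tone": "firm but not hostile",
    "Goal": "provide clue, avoid revealing hidden faction spoiler",
}
prompt = build_prompt(scene)
```

Because the system layer never changes per scene, a bad line can be debugged by diffing only the scene dictionary.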
Step 3 - Enforce Output Shape and Lore Checks
Add a validation pass before rendering text in UI.
Validation checklist:
- Output is parseable JSON
- Required keys exist (line, intent, flags)
- Line length and profanity filters pass
- Lore contract checks pass (terms, timeline, faction constraints)
If any check fails, route to fallback mode immediately.
Pro Tip
Log validation reason codes (bad_json, lore_violation, banned_term) so QA can cluster failures quickly.
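The checklist can be sketched as a single function that returns a reason code on failure and `None` on success. The codes `bad_json`, `lore_violation`, and `banned_term` come from the tip above; `missing_key`, `bad_length`, the 220-character limit, and the faction name "Ember Court" are assumptions added for illustration. The profanity filter is omitted here and the lore check is plain keyword matching, so treat this as a starting point only.

```python
import json
import re

REQUIRED_KEYS = ("line", "intent", "flags")
MAX_LINE_CHARS = 220  # hypothetical UI limit; tune to your dialogue bubble


def _mentions(text: str, terms: list) -> bool:
    """True if any term appears in text as a whole word, ignoring case."""
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE)
        for term in terms
    )


def validate_response(raw, banned_terms: list, off_limits_factions: list):
    """Return a reason code string, or None when the response may be rendered."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return "bad_json"
    if not isinstance(data, dict):
        return "bad_json"
    if any(key not in data for key in REQUIRED_KEYS):
        return "missing_key"
    line = data["line"]
    if not isinstance(line, str) or not line.strip() or len(line) > MAX_LINE_CHARS:
        return "bad_length"
    if _mentions(line, banned_terms):
        return "banned_term"
    if _mentions(line, off_limits_factions):
        return "lore_violation"
    return None


reason = validate_response(
    '{"line": "Show me the seal first.", "intent": "clue", "flags": []}',
    banned_terms=["internet"],
    off_limits_factions=["Ember Court"],
)
```

Returning the first failing code keeps logs easy to cluster; if QA needs every failure per response, the function could collect a list of codes.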
Step 4 - Implement a Failure Mode Ladder
Use one deterministic fallback chain:
- Retry once with a stricter, condensed prompt
- If the output is still invalid, use a pre-authored safe line
- If the API is unavailable, use a generic fallback keyed to the quest state
- Mark the session for telemetry review
Players should never see raw model errors or blank dialogue bubbles.
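A minimal sketch of this ladder follows, assuming the caller supplies a `generate(strict)` function that raises Python's built-in `TimeoutError` or `ConnectionError` when the API is unavailable, and a `validate(raw)` function that returns a `(line, reason)` pair. Those signatures, the `api_unavailable` code, and the sample lines are all hypothetical.

```python
def resolve_line(generate, validate, safe_line: str, generic_line: str,
                 telemetry: list) -> str:
    """Walk the fallback ladder and always return a displayable line.

    generate(strict) -> raw model text; may raise TimeoutError/ConnectionError.
    validate(raw)    -> (line, None) on success or (None, reason_code) on failure.
    telemetry        -> list that collects reason codes for later review.
    """
    for strict in (False, True):  # first attempt, then one stricter retry
        try:
            raw = generate(strict)
        except (TimeoutError, ConnectionError):
            telemetry.append("api_unavailable")
            return generic_line  # quest-state generic fallback
        line, reason = validate(raw)
        if reason is None:
            return line
        telemetry.append(reason)
    return safe_line  # both attempts were invalid: pre-authored safe line


# Usage with stand-in functions (no real API is called).
def fake_generate(strict: bool) -> str:
    return "OK:Bring me proof of the item." if strict else "garbled output"


def fake_validate(raw: str):
    return (raw[3:], None) if raw.startswith("OK:") else (None, "bad_json")


events = []
shown = resolve_line(fake_generate, fake_validate,
                     safe_line="Come back when you have proof.",
                     generic_line="The quartermaster is busy.",
                     telemetry=events)
```

Every exit path returns a string, so the UI never receives an exception or an empty value, and a non-empty `telemetry` list marks the session for review.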
Step 5 - Connect Guardrails to Quest State
Prompt quality is not enough if quest context is stale.
Before generation:
- verify current quest flags are loaded
- verify NPC relationship tier is current
- verify the spoiler guard matches current quest progression
Then inject only the minimum state the scene needs. Overloaded prompts increase drift and the risk of malformed output.
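The pre-generation checks and the minimal injection can be sketched as two small functions. The `QuestContext` fields, the version-counter approach to detecting staleness, the flag names, and the problem codes are all assumptions for illustration; your save system may expose freshness differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuestContext:
    flags: Optional[dict]             # None means quest flags were never loaded
    relationship_tier: Optional[int]  # None means the tier was never fetched
    snapshot_version: int             # version of this cached snapshot
    live_version: int                 # version reported by the save system
    quest_stage: int


def precheck(ctx: QuestContext) -> list:
    """Return a list of problem codes; an empty list means safe to generate."""
    problems = []
    if ctx.flags is None:
        problems.append("quest_flags_missing")
    if ctx.relationship_tier is None:
        problems.append("relationship_tier_missing")
    if ctx.snapshot_version != ctx.live_version:
        problems.append("stale_quest_state")
    return problems


def minimal_scene_state(ctx: QuestContext, needed_flags: list,
                        spoiler_unlock_stage: int) -> dict:
    """Pick only the flags this scene needs, plus a spoiler permission bit."""
    return {
        "flags": {name: ctx.flags[name] for name in needed_flags if name in ctx.flags},
        "relationship_tier": ctx.relationship_tier,
        "may_reveal_spoiler": ctx.quest_stage >= spoiler_unlock_stage,
    }


ctx = QuestContext(
    flags={"has_item_proof": False, "met_elrin": True, "side_quest_done": True},
    relationship_tier=2, snapshot_version=7, live_version=7, quest_stage=3,
)
state = (minimal_scene_state(ctx, ["has_item_proof"], spoiler_unlock_stage=5)
         if not precheck(ctx) else None)
```

Here two of the three loaded flags never reach the prompt, which is the point: the model sees only what the scene requires.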
Release-Week Hardening Tips
- Freeze lore contract keys 48 hours before candidate build tagging so QA validates one stable ruleset.
- Keep one emergency fallback pack per narrative zone to avoid copy-paste generic lines during live incidents.
- Run at least one degraded-mode playtest per build where API is intentionally unavailable and only fallback paths are exercised.
Common Mistakes
- Treating prompt text as your only safety system
- Allowing free-form output when UI expects structured fields
- Using one fallback line for every NPC and context
- Failing open when an API timeout occurs (showing unvalidated or empty output instead of a fallback line)
Troubleshooting
NPC suddenly breaks lore voice after hotfix
Re-check the prompt layer versioning and the cached system message payload. A version mismatch between the two often causes tone drift.
Output passes JSON check but still feels wrong
Your schema is valid but semantic checks are weak. Add lore keyword gates and quest-state contradiction checks.
Too many fallback lines triggering
The prompt may be too broad, the token budget too tight, or the output schema too strict. Reduce scene context to essentials first.
Mini Challenge
Create a DialogueGuardrailTest.md with 12 test cases:
- normal success
- malformed JSON
- refusal text
- timeout
- lore-breaking faction mention
- spoiler leak attempt
- banned modern term
- empty response
- over-length response
- wrong tone response
- stale quest state
- fallback telemetry logged
Mark each as pass/fail with reason code.
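If you prefer to generate the pass/fail table and paste it into the file, a small runner like the sketch below could do it. The `toy_classify` function is a stand-in for your real generation and validation pipeline, and the `ok`, `timeout`, and `empty_response` codes are hypothetical; only four of the twelve cases are shown.

```python
def run_matrix(cases: list, classify) -> list:
    """Return (name, 'pass' or 'fail', actual_code) for each test case."""
    rows = []
    for name, raw, expected in cases:
        actual = classify(raw)
        rows.append((name, "pass" if actual == expected else "fail", actual))
    return rows


def to_markdown(rows: list) -> str:
    """Render the rows as a Markdown table for DialogueGuardrailTest.md."""
    lines = ["| case | result | reason code |", "| --- | --- | --- |"]
    lines += [f"| {name} | {result} | {code} |" for name, result, code in rows]
    return "\n".join(lines)


def toy_classify(raw):
    """Stand-in classifier; replace with your real pipeline."""
    if raw is None:
        return "timeout"
    if raw == "":
        return "empty_response"
    if not raw.startswith("{"):
        return "bad_json"
    return "ok"


# (case name, simulated raw response, expected reason code)
CASES = [
    ("normal success", '{"line": "Halt."}', "ok"),
    ("malformed JSON", "Halt, traveler", "bad_json"),
    ("timeout", None, "timeout"),
    ("empty response", "", "empty_response"),
]

rows = run_matrix(CASES, toy_classify)
report = to_markdown(rows)
```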
FAQ
Should we keep retries enabled for every dialogue request?
No. Use retries for high-value interactions only. For ambient or low-stakes lines, fast fallback often protects pacing better than repeated waits.
What is the fastest way to catch lore drift after new quest content ships?
Run a small regression set using the same NPC scenes before and after the content merge, then diff reason codes and fallback frequency.
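The diff itself can be a few lines. This sketch assumes each regression run yields one reason code per scene, with a hypothetical `ok` code for lines that passed validation; the sample data is invented for illustration.

```python
from collections import Counter


def diff_reason_codes(before: list, after: list) -> dict:
    """Return {code: change in count} for every code whose count changed."""
    b, a = Counter(before), Counter(after)
    return {
        code: a[code] - b[code]
        for code in sorted(set(a) | set(b))
        if a[code] != b[code]
    }


def fallback_rate(codes: list) -> float:
    """Share of scenes that did not pass validation (anything except 'ok')."""
    if not codes:
        return 0.0
    return sum(1 for code in codes if code != "ok") / len(codes)


# Invented sample runs over the same four NPC scenes.
before_merge = ["ok", "ok", "ok", "bad_json"]
after_merge = ["ok", "ok", "lore_violation", "bad_json"]

changes = diff_reason_codes(before_merge, after_merge)
rate_before = fallback_rate(before_merge)
rate_after = fallback_rate(after_merge)
```

A new code appearing in `changes`, or a jump in the fallback rate, points at the merged content and not at the model, since the scenes and prompts were held constant.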
Lesson Recap
You now have a guardrail-first dialogue workflow:
- lore contract first
- two-layer prompting
- strict response validation
- deterministic fallback ladder
- QA-readable failure telemetry
This is the baseline for safe AI RPG dialogue in production, not just in demos.
Next Lesson Teaser
Next, you will implement API failure handling at runtime with user-safe messaging and retry budgets so dialogue quality remains stable during traffic spikes.
Related Learning
If this lesson saved you rework, bookmark it and reuse the validation ladder in every new AI dialogue feature.