Your First OBS Replay Buffer ffmpeg Concat Before Whisper Playtest Batch in One Evening - 2026 Beginner Pipeline

You dropped twelve OBS Replay Buffer MKV files into playtest-vod/inbox/. Each clip plays fine alone. You ran ffmpeg -f concat and got Non-monotonous DTS, a 0-byte output, or audio that drifts two seconds by minute four. You gave up and transcribed clips one-by-one—then wondered why your local Whisper pipeline batch script skipped half the folder.

In June–July 2026, facilitators are batching Replay Buffer saves after Discord playtests. The failure is almost never Whisper; it is merge discipline before ASR. This beginner-first tutorial covers the evening between capture and transcription: lock OBS fragment settings, ffprobe every fragment, normalize to one audio format, concat with proof, set concat_ok on playtest_vod_triage_receipt_v1.json, then hand off to Whisper.

Scope note: the OBS zero-duration audio help covers missing audio tracks; this tutorial covers timestamp and concat failures when audio exists. Whisper API 413 chunking is about cloud size limits, not local MKV merge. For the deep fix, see OBS MKV fragments ffmpeg concat help, which pairs with this tutorial.

Pair with 15 Free Local Whisper and ffmpeg tools, 18 playtest feedback tools, playtest isolation, and BUILD_RECEIPT for build_id on every batch.

Who this is for and what you get

Audience	You will be able to…
First-time playtest facilitator	Merge a session's Replay Buffer clips into one Whisper-ready file
Solo dev	Stop losing Tuesday clips to concat errors
Producer	Require `concat_ok: true` before triage standup

Time: one evening (~90 minutes first setup; 20 minutes per playtest session after).
Prerequisites: OBS Studio with Replay Buffer enabled, ffmpeg and ffprobe on PATH, empty playtest-vod/inbox/ folder convention from the Whisper pipeline blog.

Why this matters now (June–July 2026)

Replay Buffer default — Community playtest ops recommend Replay Buffer; facilitators produce many small MKVs, not one MP4.
Whisper batch scripts — The local VOD triage blog assumes one audio file per session; concat is the missing middle step.
DTS gaps — Mixed sample rates and non-monotonic timestamps explode naive concat—beginners blame Whisper.
Consent and cost — Merged local file avoids re-uploading twelve fragments to a cloud API.
October volume — Fixing merge in July prevents triage collapse when fest playtests multiply.

After normalize + concat, file ffprobe_concat_ok_fragment_v1 using the fields from the OBS ffprobe concat_ok receipt fields preflight, so that concat_ok and whisper_batch_allowed stay honest on BUILD_RECEIPT.

Direct answer: fragments/ → per-file ffprobe log → normalize to 48 kHz stereo WAV segments → concat to session_merged.wav → concat_ok in receipt → Whisper once.

Evening overview (four blocks)

Block	Minutes	Output
1 — OBS profile lock	20	`obs-replay-profile.md` with buffer seconds + tracks
2 — Fragment intake + O1–O2	25	Renamed clips + `ffprobe_table.csv`
3 — Normalize + concat O3–O5	35	`session_merged.wav` + concat log
4 — Receipt + Whisper handoff O6	10	`playtest_vod_triage_receipt_v1.json` with `concat_ok`

Mental model — three layers

Layer	Tool	Proves
Capture	OBS Replay Buffer	Last N seconds saved on hotkey
Merge	ffmpeg (this article)	One timeline-safe audio file
Understand	Whisper	Searchable text for triage

Skipping merge and running Whisper per clip works for three files; it fails operationally at twelve with no build_id session story.

Block 1 — OBS Replay Buffer profile lock

Document once in playtest-vod/obs-replay-profile.md:

Setting	Recommended	Why
Format	MKV	Default; supports separate tracks
Replay Buffer	120–180 s	Enough context; not huge files
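Audio tracks	Desktop + Mic (if used)	Wrong track selection produces zero-duration audio
Filename formatting	`%CCYY-%MM-%DD_%hh-%mm-%ss` plus Replay Buffer prefix `replay`	Sortable; `%buildid` is not an OBS token, so keep build_id in the session folder name (see naming below)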
Output path	`playtest-vod/inbox/`	Matches triage blog

Hotkey discipline: Facilitators save with build_id spoken aloud or typed in overlay—matches Thursday row review build_id parity.

Outbound: OBS Replay Buffer documentation (official KB).

Naming fragments (beginner rule)

playtest-vod/inbox/
  2026-05-25_session-rc4/
    001_replay_2026-05-25_19-02-11.mkv
    002_replay_2026-05-25_19-14-33.mkv
    ...

Rules:

One folder per playtest session (same build_id).
Three-digit prefix enforces sort order—never rely on filesystem mtime.
No spaces in filenames (Windows shell scripts thank you).
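
The naming rules above can be applied mechanically at intake. The following is a minimal sketch, assuming each OBS filename contains a `YYYY-MM-DD_HH-MM-SS` (or space-separated) timestamp; `plan_renames` is a hypothetical helper, not part of OBS or ffmpeg, and it only returns the rename plan without touching the disk.

```python
import re

# Assumed timestamp shape inside OBS filenames, e.g. 2026-05-25_19-02-11
STAMP = re.compile(r"\d{4}-\d{2}-\d{2}[ _]\d{2}-\d{2}-\d{2}")


def plan_renames(names):
    """Return (old, new) pairs that add a three-digit, capture-order prefix."""
    stamped = []
    for name in names:
        match = STAMP.search(name)
        if match is None:
            raise ValueError(f"no timestamp in filename: {name}")
        # Normalize the separator so space-separated stamps sort with "_" stamps
        stamped.append((match.group(0).replace(" ", "_"), name))
    stamped.sort()
    plan = []
    for index, (_, old) in enumerate(stamped, start=1):
        # Replace spaces so shell scripts never need extra quoting
        plan.append((old, f"{index:03d}_{old.replace(' ', '_')}"))
    return plan
```

Review the returned pairs first, then apply them with `Path.rename`. Run it only on files that do not yet carry a prefix, otherwise the prefix is added twice.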

Gates O1–O6 (concat pass)

Gate	Name	Pass criterion
O1	Audio present	Each fragment: `ffprobe` shows audio stream, duration greater than 0
O2	Sample rate known	`sample_rate` and `channels` logged per file
O3	Normalized segments	Each fragment converted to `norm_XXX.wav` same rate/channels
O4	Concat output	`session_merged.wav` exists and duration ≈ sum of inputs ± 2 s
O5	Listen smoke	No speed-up chipmunk, no long silence gaps mid-file
O6	Receipt	`concat_ok: true` in `playtest_vod_triage_receipt_v1.json`

O1–O5 block Whisper when RED. Fix merge before ASR spend.

Block 2 — ffprobe every fragment (O1–O2)

From session folder:

cd playtest-vod/inbox/2026-05-25_session-rc4
for f in *.mkv; do
  echo "=== $f ==="
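  # -show_format adds the container duration; MKV audio streams often report none of their own
  ffprobe -v error -show_streams -show_format -select_streams a:0 "$f"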
done > ffprobe_log.txt

Build ffprobe_table.csv:

file,audio_codec,sample_rate,channels,duration_sec,pass_o1
001_replay....mkv,aac,48000,2,125.4,yes
002_replay....mkv,aac,44100,2,118.2,yes

O1 fail signal	Likely cause	Fix pointer
No audio stream	OBS track matrix	Zero-duration audio help
duration=0	Corrupt save / disk full	Re-capture
Mixed 44100 and 48000	Different OBS sessions	Normalize in O3 (not a fail if O3 passes)
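
Filling ffprobe_table.csv by hand gets tedious past a few clips. Here is a minimal sketch of how one row could be derived from ffprobe's JSON output; `row_from_probe` and `table_csv` are hypothetical helper names, and the O1 rule encoded here (audio stream present, container duration greater than 0) is this article's gate, not an ffprobe feature.

```python
import csv
import io

FIELDS = ["file", "audio_codec", "sample_rate", "channels", "duration_sec", "pass_o1"]


def _to_float(value):
    """ffprobe may report a missing duration as absent or 'N/A'; treat both as 0."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return 0.0


def row_from_probe(filename, probe):
    """Build one table row from parsed `ffprobe -print_format json` output."""
    audio = [s for s in probe.get("streams", []) if s.get("codec_type") == "audio"]
    # Read the container duration, since MKV streams often omit their own
    duration = _to_float(probe.get("format", {}).get("duration"))
    first = audio[0] if audio else {}
    return {
        "file": filename,
        "audio_codec": first.get("codec_name", ""),
        "sample_rate": first.get("sample_rate", ""),
        "channels": first.get("channels", ""),
        "duration_sec": round(duration, 1),
        "pass_o1": "yes" if audio and duration > 0 else "no",
    }


def table_csv(rows):
    """Render rows as CSV text with the ffprobe_table.csv header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The `probe` dict would come from running `ffprobe -v error -print_format json -show_streams -show_format <file>` and passing its stdout through `json.loads`.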

Block 3 — Normalize then concat (O3–O5)

Do not concat raw MKVs when sample rates differ. Normalize first.

Step A — Normalize each fragment to WAV

mkdir -p norm
i=1
for f in *.mkv; do  # the glob already expands in sorted order and is safe with odd filenames
  out=$(printf "norm/norm_%03d.wav" "$i")
  ffmpeg -y -i "$f" -vn -ac 2 -ar 48000 -c:a pcm_s16le "$out"
  i=$((i+1))
done

Beginner checks:

-vn drops video—Whisper only needs audio.
48000 Hz stereo is a stable interchange format. Whisper works on 16 kHz mono audio, so the extract step in the triage blog can resample once, later.

Step B — Concat demuxer list file

cd norm
ls -1 norm_*.wav | sort | sed "s/^/file '/;s/$/'/" > concat_list.txt
ffmpeg -y -f concat -safe 0 -i concat_list.txt -c copy ../session_merged.wav

If -c copy fails with DTS errors, re-encode once:

ffmpeg -y -f concat -safe 0 -i concat_list.txt -ac 2 -ar 48000 -c:a pcm_s16le ../session_merged.wav
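
If you would rather generate the list file from Python than with sed, the sketch below shows one way. `concat_list_text` is a hypothetical helper. It writes one `file '...'` line per input in the order given, and it refuses an empty input list, since an empty list is a common route to a 0-byte output.

```python
def concat_list_text(filenames):
    """Return concat demuxer list text, one `file '...'` line per input."""
    if not filenames:
        raise ValueError("no inputs: an empty concat list produces no usable output")
    lines = []
    for name in filenames:
        # Inside single quotes everything is literal, so a quote in the name
        # is written as: close quote, backslash-escaped quote, reopen quote.
        escaped = name.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"
```

Pass the names already sorted, for example `sorted(p.name for p in Path("norm").glob("norm_*.wav"))`, and write the result to concat_list.txt in the same folder as the WAV files.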

Step C — Duration proof (O4)

# still inside norm/ from Step B, so the merged file is one level up
ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 ../session_merged.wav

Sum duration_sec from CSV; merged duration within ±2 seconds passes O4. Larger drift → missing fragment or double-count—re-run inventory.
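
The O4 arithmetic is simple enough to script. This is a minimal sketch, with `duration_proof` as a hypothetical helper name; the ±2 second tolerance is the one stated above.

```python
def duration_proof(fragment_durations, merged_duration, tolerance_sec=2.0):
    """Return (passed, drift) where drift = merged - sum(fragments), in seconds."""
    expected = sum(fragment_durations)
    drift = merged_duration - expected
    return abs(drift) <= tolerance_sec, round(drift, 3)
```

Feed it the duration_sec column for the fragments you actually merged, plus the ffprobe duration of session_merged.wav. A large negative drift usually means a missing fragment; a large positive one usually means a double-count.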

Step D — Listen smoke (O5)

With headphones, skim the start, middle, and end of session_merged.wav. Chipmunk voice → wrong sample rate assumed. Long dead air → the hotkey saved an idle menu; that is not a concat fail, but tag it low priority for triage.

Block 4 — Receipt and Whisper handoff (O6)

Extend triage receipt from Whisper pipeline blog:

{
  "schema": "playtest_vod_triage_receipt_v1",
  "build_id": "nextfest-oct-2026-rc4",
  "surface": "playtest_invite",
  "session_folder": "playtest-vod/inbox/2026-05-25_session-rc4",
  "fragment_count": 12,
  "concat_ok": true,
  "merged_audio": "playtest-vod/inbox/2026-05-25_session-rc4/session_merged.wav",
  "merged_duration_sec": 1420.5,
  "normalize_profile": "48000_stereo_pcm_s16le",
  "gates": {
    "O1_audio_present": "pass",
    "O2_sample_rate_logged": "pass",
    "O3_normalized": "pass",
    "O4_duration_proof": "pass",
    "O5_listen_smoke": "pass",
    "O6_receipt": "pass"
  },
  "whisper_next": "extract_16k_mono_then_batch",
  "notes": "Replay Buffer 120s; facilitator laptop Win11"
}

Only after concat_ok: true run Whisper extract + transcribe on session_merged.wav (or chunk merged file per API 413 if you chose cloud lane).
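
A batch script can enforce that rule before it spends any ASR time. The sketch below assumes the receipt fields shown in the JSON above; `whisper_allowed` and `REQUIRED_GATES` are hypothetical names, and the choice to also require `build_id` is an assumption drawn from the "do not skip" section that follows.

```python
REQUIRED_GATES = (
    "O1_audio_present", "O2_sample_rate_logged", "O3_normalized",
    "O4_duration_proof", "O5_listen_smoke", "O6_receipt",
)


def whisper_allowed(receipt):
    """Return (allowed, reasons); Whisper may run only when reasons is empty."""
    reasons = []
    if receipt.get("schema") != "playtest_vod_triage_receipt_v1":
        reasons.append("unexpected schema")
    if receipt.get("concat_ok") is not True:
        reasons.append("concat_ok is not true")
    if not receipt.get("build_id"):
        reasons.append("missing build_id")
    gates = receipt.get("gates", {})
    for gate in REQUIRED_GATES:
        if gates.get(gate) != "pass":
            reasons.append(f"{gate} is not pass")
    return not reasons, reasons
```

Load the receipt with `json.loads(path.read_text(encoding="utf-8"))` and print the reasons when the check fails, so the facilitator knows which gate to revisit.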

Surface and build_id (do not skip)

Field	Source
`build_id`	In-game `build_label` or BUILD_RECEIPT
`surface`	`playtest_invite` vs `fest_public` per isolation playbook

Wrong surface corrupts triage boards—not a concat bug, but receipt must be right before standup.

PowerShell variant (Windows facilitators)

$session = "playtest-vod\inbox\2026-05-25_session-rc4"
New-Item -ItemType Directory -Force -Path "$session\norm" | Out-Null
$i = 1
Get-ChildItem "$session\*.mkv" | Sort-Object Name | ForEach-Object {
  $out = "{0}\norm\norm_{1:D3}.wav" -f $session, $i
  ffmpeg -y -i $_.FullName -vn -ac 2 -ar 48000 -c:a pcm_s16le $out
  $i++
}

Concat list and merge commands mirror bash; keep paths quoted.

Common concat errors (troubleshooting)

Error / symptom	Cause	Fix
`Non-monotonous DTS`	Raw MKV concat	Normalize to WAV first (O3)
Output file 0 bytes	Empty concat list	Sort + `concat_list.txt` paths
Audio plays too fast (chipmunk)	Wrong `-ar` assumed for the input	Re-normalize from source MKV
Whisper timestamps jump	Per-clip transcribe merged without offsets	Transcribe merged file once
Fragment 7 silent	OBS saved before game audio	O1 fail; exclude from concat
Merged too short	Missing numbered file	Re-check 001–00N sequence
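
For the "Merged too short" row, the sequence re-check can be scripted. This is a minimal sketch that assumes the three-digit `001_` prefix convention from the naming section; `missing_prefixes` is a hypothetical helper.

```python
import re

PREFIX = re.compile(r"^(\d{3})_")


def missing_prefixes(filenames):
    """Return the three-digit prefixes absent between 001 and the highest prefix seen."""
    seen = set()
    for name in filenames:
        match = PREFIX.match(name)
        if match:
            seen.add(int(match.group(1)))
    if not seen:
        return []
    return [f"{n:03d}" for n in range(1, max(seen) + 1) if n not in seen]
```

It cannot notice a missing final clip, so also compare the file count against the fragment_count you expect for the session.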

When not to concat

Situation	Action
Fragments from different `build_id`	Separate session folders
One fragment is 2 h rest-of-stream	Transcribe alone; do not merge with 2 min clips
Legal requires per-clip deletion	Keep fragments; document policy
Cloud API only	Still normalize; upload segments under 25 MB each
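
For the cloud-only row, it helps to know how many seconds of uncompressed WAV fit under the 25 MB cap. The sketch below is plain arithmetic under two labeled assumptions: 25 MB is treated as 25,000,000 bytes (the conservative reading), and the WAV header is taken as the minimal 44 bytes, although real files can carry a slightly larger header. `max_chunk_seconds` is a hypothetical helper.

```python
def max_chunk_seconds(sample_rate, channels, bytes_per_sample=2,
                      limit_bytes=25_000_000, header_bytes=44):
    """Longest whole-second PCM WAV chunk that stays under limit_bytes."""
    bytes_per_second = sample_rate * channels * bytes_per_sample
    return (limit_bytes - header_bytes) // bytes_per_second


# 48 kHz stereo 16-bit (this article's normalize profile): 192,000 bytes per second
print(max_chunk_seconds(48000, 2))  # 130
# 16 kHz mono 16-bit (the triage blog's extract format): 32,000 bytes per second
print(max_chunk_seconds(16000, 1))  # 781
```

Resampling to 16 kHz mono before upload gives roughly six times more audio per segment than the 48 kHz stereo interchange file. Leave a few seconds of margin rather than cutting at the exact limit.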

When concat fails, use Lesson 207 (Whisper path decision tree). The ffmpeg concat failure decision tree expands lane choice; tonight, assume local concat + local Whisper.

Facilitator README snippet (paste)

## Replay Buffer → Whisper
1. Save clips into `playtest-vod/inbox/YYYY-MM-DD_session-<build_id>/` with 001_ prefix.
2. Run ffprobe table; fix zero-audio before merge.
3. Normalize to 48 kHz WAV → `session_merged.wav`.
4. Set `concat_ok` in playtest_vod_triage_receipt_v1.json.
5. Run Whisper batch only when concat_ok is true.

Link README from multi-channel facilitator contract when that post ships.

Integration with weekly ops

Day	Ritual	Uses merged audio?
After playtest	This concat pipeline	Creates `session_merged.wav`
Same night	Whisper triage	Yes
Wednesday	Demo smoke	No (binary)
Thursday	Row review	Receipt only

Worked example (twelve fragments)

Input: 12 MKVs, 11× ~120 s + 1× 45 s, mixed 44100/48000 from two OBS restarts.

Step	Result
O1	Fragment 9 fails and is excluded from the concat; 11 pass
O3	11 `norm_*.wav`
O4	Merged 1335 s vs expected 1338 s is a 3 s drift, 1 s outside the ±2 s O4 window; re-check the inventory before recording PASS
O5	No chipmunk
Whisper	One transcript; issues tagged with approximate timestamps

Lesson: Excluding bad fragment beat blind concat of all twelve.

Python batch helper (optional)

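from pathlib import Path
import json
import subprocess

session = Path("playtest-vod/inbox/2026-05-25_session-rc4")
fragments = sorted(session.glob("*.mkv"))
norm = session / "norm"
norm.mkdir(exist_ok=True)

# O3: normalize each fragment to 48 kHz stereo 16-bit PCM WAV
wavs = []
for i, mkv in enumerate(fragments, 1):
    out = norm / f"norm_{i:03d}.wav"
    subprocess.run([
        "ffmpeg", "-y", "-i", str(mkv), "-vn",
        "-ac", "2", "-ar", "48000", "-c:a", "pcm_s16le", str(out)
    ], check=True)
    wavs.append(out)

# Concat list sits next to the WAVs; relative names resolve against the list file's folder
concat_list = norm / "concat_list.txt"
concat_list.write_text(
    "".join(f"file '{w.name}'\n" for w in wavs), encoding="utf-8"
)
merged = session / "session_merged.wav"
result = subprocess.run([
    "ffmpeg", "-y", "-f", "concat", "-safe", "0",
    "-i", str(concat_list), "-c", "copy", str(merged)
])

# Record what actually happened; O4 duration proof and O5 listen smoke stay manual
receipt = {
    "concat_ok": result.returncode == 0 and merged.exists() and merged.stat().st_size > 0,
    "fragment_count": len(fragments),
    "merged_audio": str(merged),
}
(session / "playtest_vod_triage_receipt_v1.json").write_text(
    json.dumps(receipt, indent=2), encoding="utf-8"
)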

Automate after one manual GREEN evening—scripts should not hide O5 listen smoke.

Privacy and retention

Merged WAV still contains player voice—same consent as triage blog.
Delete norm/ intermediates after Whisper if disk tight; keep receipt + transcript.
Do not upload session_merged.wav to public issue trackers.

Outbound references

ffmpeg concat demuxer — official concat docs
Whisper GitHub — model sizes for batch after merge

Key takeaways

Replay Buffer produces many MKVs—merge before Whisper, not twelve separate ASR jobs without a plan.
Run O1–O6: ffprobe audio, log rates, normalize to 48 kHz WAV, concat, duration proof, listen smoke, receipt.
Set concat_ok: true on playtest_vod_triage_receipt_v1.json before batch transcription.
Normalize before concat fixes most Non-monotonous DTS errors beginners blame on Whisper.
Numbered filenames (001_, 002_) beat sorting by clock or mtime.
One session folder per build_id; never merge fragments across builds.
Pair with zero-duration audio help when O1 fails.
~90 minutes first evening; ~20 minutes per session once profile is locked.
Forward-fix depth: MKV concat help.
16-tool concat prep listicle bookmarks tools; this URL is the hands-on beginner pipeline.

FAQ

Why not concat MKV directly?

Different codecs, B-frames, and DTS timelines across hotkey saves break naive concat. WAV normalize is boring and reliable.

Does this replace the Whisper pipeline blog?

No. That blog owns extract → transcribe → triage. This blog owns OBS fragments → session_merged.wav.

What if only three clips exist?

You may skip concat and transcribe per clip, but still run O1 and use the same receipt schema with concat_ok: true and fragment_count: 3, and record per-file mode in notes.

Should I use MP4 instead of MKV in OBS?

MKV is fine if you normalize. MP4 does not remove the need for O3 when rates differ.

Cloud Whisper after concat?

Yes—chunk session_merged.wav under API limits. Local-first teams stay on local Whisper resources.

How does this relate to Thursday row review?

Row review diffs BUILD_RECEIPT rows; concat receipt proves triage inputs were merged correctly for that build_id.

What sample rate for Whisper?

This pipeline uses 48 kHz interchange; triage blog often resamples to 16 kHz mono at extract—one resample step, not three per fragment.

Batch folder layout (release evidence)

Archive proof beside BUILD_RECEIPT when partners ask how playtest feedback was captured:

release-evidence/
  06-playtest-vod/
    2026-05-25_session-rc4/
      ffprobe_table.csv
      concat_log.txt
      playtest_vod_triage_receipt_v1.json
      session_merged.wav          # optional archive; may delete after transcript
      transcript/
        session_merged.txt

Producer rule: Standup slide shows concat_ok, fragment_count, and top three issue titles from transcript—not raw MKV paths.

Silero VAD pre-check (optional O5b)

Before Whisper, run a ten-second VAD sanity check on session_merged.wav if facilitators were silent during menu captures:

# illustrative: your VAD tool prints speech spans
# fail O5b if zero speech spans but duration > 600s

This catches merged menus with no commentary—not a concat failure, but saves ASR minutes. Link Silero docs when you add tools listicle #2.
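
If no VAD tool is installed yet, a crude energy check can stand in for the O5b rule. To be clear, the sketch below is not Silero and not speech detection: it only measures loudness, so game audio counts as "active" too. It uses the standard-library `wave` module, expects the 16-bit PCM output of the normalize step, and the `rms_threshold` of 500 is an arbitrary illustrative value you would need to tune. `active_seconds` and `fails_o5b` are hypothetical names.

```python
import array
import math
import sys
import wave


def active_seconds(wav_path, window_sec=0.5, rms_threshold=500):
    """Return seconds of audio whose window RMS exceeds rms_threshold (16-bit PCM only)."""
    with wave.open(str(wav_path), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM, e.g. pcm_s16le from the normalize step")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames_per_window = max(1, int(rate * window_sec))
        active = 0.0
        while True:
            raw = wav.readframes(frames_per_window)
            if not raw:
                break
            samples = array.array("h")
            samples.frombytes(raw)
            if sys.byteorder == "big":
                samples.byteswap()  # WAV samples are little-endian
            rms = math.sqrt(sum(s * s for s in samples) / len(samples))
            if rms > rms_threshold:
                active += len(samples) / (rate * channels)
    return active


def fails_o5b(wav_path, duration_sec, min_duration_sec=600):
    """Mirror the rule above: fail when a long file contains no active audio at all."""
    return duration_sec > min_duration_sec and active_seconds(wav_path) == 0
```

A file that passes this check can still contain no commentary, so treat it as a cheap pre-filter ahead of a real VAD rather than a replacement for one.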

Compare lanes — concat merge vs per-clip ASR

Approach	Pros	Cons
Merged + one Whisper	One timeline; simpler issue titles	One bad fragment excluded manually
Per-clip Whisper	Isolates corrupt MKV	Twelve transcripts; offset math hell
Cloud API per clip	No local GPU	Cost; consent; 413 on long clips

Tonight picks merged + one Whisper for facilitators with 6–20 clips per session.
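
One cost of the merged lane is that a transcript timestamp no longer names a clip. The sketch below maps a time in session_merged.wav back to the fragment it came from, assuming you pass the durations of the fragments that were actually merged, in concat order. `locate` is a hypothetical helper.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(merged_time_sec, fragment_durations):
    """Map a time in the merged file to (fragment_number, offset_sec_in_fragment)."""
    if merged_time_sec < 0:
        raise ValueError("time must be non-negative")
    ends = list(accumulate(fragment_durations))  # cumulative end time of each fragment
    if not ends or merged_time_sec >= ends[-1]:
        raise ValueError("time is past the end of the merged file")
    index = bisect_right(ends, merged_time_sec)
    start = ends[index - 1] if index else 0.0
    return index + 1, round(merged_time_sec - start, 3)
```

Fragment numbers here count merged inputs only, so if a failed clip was excluded, number 9 in this list is not necessarily the file prefixed `009_`.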

Steam Playtest night checklist (facilitator)

Before session	During	After (this pipeline)
OBS profile saved	Hotkey saves with verbal `build_id`	ffprobe table GREEN
Disk 5 GB free	Note game mode + map	normalize + concat
README linked in Discord	No desktop audio-only tracks	`concat_ok` receipt
Isolation playbook read	Tag `surface` in overlay	Whisper triage blog steps

Engine-agnostic note

This pipeline is tooling, not Unity/Godot/Construct specific. Console capture may use different containers—still run O1–O6 on whatever files land in inbox/. Wednesday demo smoke remains the game binary gate; concat is the feedback audio gate.

Mistakes we see in Discord support threads

Quote	Reality
“Whisper broke”	Concat never produced merged WAV
“Only last clip transcribed”	concat_list.txt not sorted
“Chipmunk voice”	Sample rate relabeled instead of resampled (for example, 16 kHz audio played as 48 kHz)
“Merged 40 minutes”	Included AFK hour—split sessions
“API 413”	Merged file huge—chunk after concat per 413 help

Pointing beginners to O1–O6 before model size debates saves hours.