Procedural Audio Workflows with Riffusion and FMOD - Game Audio Tutorial

Procedural and AI-generated audio can speed up sound design and keep your game fresh without recording hundreds of one-off clips. Riffusion turns text prompts into short audio via spectrogram-based generation, and FMOD (or Wwise) gives you adaptive music and dynamic SFX in the engine. Together they form a practical pipeline for indie and small teams.

This tutorial walks you through a simple workflow: generate candidate sounds with Riffusion, clean and trim them, then bring them into FMOD and hook them up to game parameters so music and effects respond to gameplay.

What You Need

  • Riffusion: Run locally (Python + model) or use a hosted interface. You give it a text description and get a short audio clip (often a few seconds).
  • FMOD Studio: Free to use under FMOD’s indie license for projects below its revenue threshold (check the current terms). You build banks, events, and parameters; the game calls the FMOD API to play events and set parameters.
  • Game engine: Unity, Unreal, or another engine with an FMOD integration (official or community).

No deep music theory or DSP is required. You will work with prompts, waveforms, and FMOD events and parameters.

Step 1 - Generate Sounds with Riffusion

Riffusion generates audio from text by producing a spectrogram image with a diffusion model and then converting that image into a waveform. The quality and style depend heavily on the prompt.

Prompting tips

  • Be specific: "dark ambient drone, low rumble, 4 seconds" works better than "scary sound."
  • Mention duration if the tool allows it (e.g. "3 second clip").
  • For game SFX, try: "footstep on gravel," "magic spell whoosh," "UI click metallic," "engine idle loop."
  • For ambience: "wind through trees," "distant thunder," "crowded market chatter."

Workflow

  1. Install or open Riffusion (e.g. Riffusion repo or a Colab/hosted demo).
  2. Enter a prompt and generate. Export the clip as WAV (e.g. 44.1 kHz, mono or stereo as needed).
  3. Generate several variants per need (e.g. 3–5 variations for "footstep gravel") so you can pick the best or layer them.
  4. Listen to each clip and trim silence or unwanted tails in an audio editor (Audacity, Reaper, or your DAW of choice).

Pro Tip: Keep a small spreadsheet or doc of prompts that worked well so you can reuse and tweak them for future projects.
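
If you would rather keep that log next to the assets than in a spreadsheet app, a few lines of Python can append each generation to a CSV file. This is a minimal sketch: the column names (asset, prompt, seed, notes) and the function names are assumptions for illustration, not something Riffusion itself produces.

```python
import csv
from pathlib import Path

# Hypothetical column set; add whatever settings your Riffusion setup exposes.
FIELDS = ["asset", "prompt", "seed", "notes"]

def log_prompt(log_path, asset, prompt, seed, notes=""):
    """Append one generation record, writing the header if the file is new."""
    path = Path(log_path)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(
            {"asset": asset, "prompt": prompt, "seed": seed, "notes": notes}
        )

def load_prompts(log_path):
    """Read the log back as a list of dicts (all values come back as strings)."""
    with Path(log_path).open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Call log_prompt() once per exported clip, for example log_prompt("prompts.csv", "sfx_footstep_gravel_01.wav", "footstep on gravel", 42), and the CSV opens in any spreadsheet tool later.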

Step 2 - Edit and Prepare for FMOD

Riffusion output is often a single clip. For FMOD you usually want clean, loopable or one-shot assets.

Editing

  • Trim start and end so there is no long silence or click.
  • Normalize if levels are low (e.g. to a peak of -3 dBFS so you keep some headroom in FMOD).
  • Loop points: For ambience or music loops, set loop start/end in your editor so the loop is seamless, or export with a few seconds of material and let FMOD loop.
  • One-shots: For SFX (clicks, impacts, whooshes), export as one-shot; FMOD will play them on demand.
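
If you have many clips, the trim and normalize steps can be batched instead of done by hand. The sketch below uses only the Python standard library and assumes 16-bit mono PCM WAV files; the function names and the -50 dBFS silence threshold are illustrative choices, not part of Riffusion or FMOD.

```python
import array
import sys
import wave

def read_wav_mono16(path):
    """Read a 16-bit mono PCM WAV into floats in [-1.0, 1.0)."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2 or w.getnchannels() != 1:
            raise ValueError("this sketch handles 16-bit mono files only")
        rate = w.getframerate()
        pcm = array.array("h", w.readframes(w.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV sample data is little-endian
    return [s / 32768.0 for s in pcm], rate

def write_wav_mono16(path, samples, rate):
    """Write floats back out as 16-bit mono PCM, clamping to the valid range."""
    pcm = array.array(
        "h", (max(-32768, min(32767, round(s * 32767))) for s in samples)
    )
    if sys.byteorder == "big":
        pcm.byteswap()
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(pcm.tobytes())

def trim_silence(samples, threshold_db=-50.0):
    """Drop leading and trailing samples quieter than threshold_db (dBFS)."""
    limit = 10 ** (threshold_db / 20.0)
    loud = [i for i, s in enumerate(samples) if abs(s) >= limit]
    if not loud:
        return []
    return samples[loud[0]:loud[-1] + 1]

def normalize_peak(samples, target_db=-3.0):
    """Scale the clip so its highest absolute sample sits at target_db (dBFS)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    gain = 10 ** (target_db / 20.0) / peak
    return [s * gain for s in samples]
```

A hard cut at the threshold can itself click, so a short fade at each end (or a final pass in your editor) is still worth doing on anything that made it into the game.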

Organization

  • Name files clearly: ambience_forest_wind_01.wav, sfx_ui_click_02.wav.
  • Put Riffusion-generated assets in a dedicated folder (e.g. audio/riffusion/) so you can replace or regenerate them later without mixing them with recorded library sounds.
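
A naming convention only helps if it is followed, so a tiny check before import can save cleanup later. This sketch assumes a pattern inferred from the two example names above (category, description words, two-digit index); the category list and the function name are hypothetical and should be adjusted to your project.

```python
import re

# Assumed pattern: <category>_<word>[_<word>...]_<NN>.wav, all lowercase.
NAME_RE = re.compile(
    r"^(ambience|music|sfx)_[a-z0-9]+(?:_[a-z0-9]+)*_\d{2}\.wav$"
)

def badly_named(filenames):
    """Return the file names that do not follow the convention."""
    return [name for name in filenames if not NAME_RE.match(name)]
```

Running badly_named() over the contents of audio/riffusion/ gives you a list of files to rename before they reach the FMOD project.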

Step 3 - Build FMOD Projects and Events

In FMOD Studio you organize sounds into Banks and Events.

Banks

  • One bank per context (e.g. "Music," "SFX_UI," "SFX_Ambience") or one per level if you prefer. This keeps load times and memory predictable.
  • Import your Riffusion WAVs into the project. Banks are assigned per event, so each clip ends up in whichever bank its event belongs to.

Events

  • One-shot SFX: Create one event per sound (e.g. "UI/Click") that plays a single asset. Trigger it from the game when the action happens.
  • Ambience: Create an event that plays a loop (your Riffusion ambience). Use a loop region in the timeline or enable looping on the instrument. Optionally add a volume or filter parameter so the game can duck or filter the ambience based on game state.
  • Adaptive music: Create a multi-track event with layers (e.g. base, tension, action). Use parameters (e.g. "Intensity" 0–1) to crossfade or switch layers. Feed those parameters from the game (e.g. health, enemy count, phase).

Parameters

  • Define parameters in the event or globally (e.g. "TimeOfDay," "CombatLevel"). The game sets them via the FMOD API; FMOD uses them to choose variations, crossfade, or modulate volume/filter.
  • For Riffusion-driven content, you can use parameters to blend between multiple Riffusion clips (e.g. two ambiences) or to control intensity of a single loop.
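
To make the layer blending concrete, here is the kind of curve you might draw as volume automation against an "Intensity" parameter. It is a sketch in plain Python, not FMOD code: the three layer names come from the adaptive music example above, while the breakpoints (tension fades in over 0 to 0.5, action over 0.5 to 1) and the sine-shaped fade are assumptions you would tune by ear.

```python
import math

def _fade_in(x, start, end):
    """Sine-shaped fade from 0 at `start` to 1 at `end`, clamped outside."""
    t = min(1.0, max(0.0, (x - start) / (end - start)))
    return math.sin(t * math.pi / 2)

def layer_gains(intensity):
    """Map an Intensity value in [0, 1] to a linear gain per music layer."""
    x = min(1.0, max(0.0, intensity))
    return {
        "base": 1.0,                         # always playing
        "tension": _fade_in(x, 0.0, 0.5),    # enters in the lower half
        "action": _fade_in(x, 0.5, 1.0),     # enters in the upper half
    }
```

In FMOD Studio the same shape lives in the event itself as automation on each track’s volume, so the game only ever sends the single Intensity value.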

Step 4 - Integrate FMOD in Your Game

Unity and Unreal have official FMOD integrations; other engines can use a community integration or call the FMOD Studio API directly. High-level flow:

  1. Load banks at startup or when entering a level (e.g. Studio::System::loadBankFile() or the engine’s FMOD wrapper).
  2. Play events by path or reference when something happens (e.g. create an instance of event:/UI/Click and call EventInstance::start()).
  3. Set parameters each frame or when game state changes (e.g. EventInstance::setParameterByName("Intensity", value)).
  4. Stop or release events when no longer needed (e.g. call EventInstance::stop() when leaving a level, and EventInstance::release() right after starting a one-shot so it is cleaned up once it finishes).
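
The game-side half of step 3 is mostly arithmetic: turn game state into a number, smooth it, and hand it to FMOD. The sketch below shows that logic in engine-neutral Python. Everything in it is hypothetical: the send callback stands in for whatever your integration uses to reach EventInstance::setParameterByName, and the health and enemy-count weighting is just one possible mapping.

```python
class ParameterDriver:
    """Smooths a target value and forwards it only when it changes noticeably."""

    def __init__(self, send, name, smoothing=0.2, epsilon=0.01):
        self._send = send            # hypothetical callback: send(name, value)
        self._name = name
        self._smoothing = smoothing  # fraction of the gap closed per update
        self._epsilon = epsilon      # minimum change worth sending
        self._value = 0.0
        self._last_sent = None

    def update(self, target):
        """Call once per frame with the desired value in [0, 1]."""
        target = min(1.0, max(0.0, target))
        self._value += (target - self._value) * self._smoothing
        changed = (
            self._last_sent is None
            or abs(self._value - self._last_sent) >= self._epsilon
        )
        if changed:
            self._send(self._name, self._value)
            self._last_sent = self._value
        return self._value

def combat_intensity(health, enemy_count, max_enemies=5):
    """Hypothetical mapping: more enemies and lower health raise intensity."""
    threat = min(1.0, enemy_count / max_enemies)
    danger = 1.0 - min(1.0, max(0.0, health))
    return min(1.0, 0.7 * threat + 0.3 * danger)
```

Each frame you would call driver.update(combat_intensity(player_health, enemies_nearby)). FMOD Studio parameters also have a seek speed setting that can do the smoothing on the audio side, in which case the driver only needs the change threshold.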

Refer to the FMOD documentation and your engine’s FMOD plugin for exact API calls and best practices (e.g. avoiding too many simultaneous events, using snapshots for reverb).

Step 5 - Iterate and Expand

  • Replace placeholders: Start with a small set of Riffusion sounds; replace or add more as you lock design. Keep prompts and settings so you can regenerate in a consistent style.
  • Combine with library audio: Use Riffusion for unique or hard-to-find material (e.g. specific ambience, UI beeps) and keep using royalty-free or recorded library sounds for common SFX if that’s faster.
  • Tune in FMOD: Use FMOD’s mixer, effects, and ducking so Riffusion clips sit well with the rest of the mix. A little reverb and leveling go a long way.

Common Mistakes to Avoid

  • Vague prompts: "Cool sound" rarely gives usable results. Be concrete about style, duration, and context.
  • Skipping editing: Raw Riffusion output often has tails or level issues. Trim and normalize before importing into FMOD.
  • Overloading one bank: Put only what you need per level or context in each bank so load times and memory stay manageable.
  • Ignoring parameters: For adaptive music or dynamic ambience, drive FMOD parameters from the game so the audio responds to gameplay.
  • No backup of prompts: If you need to regenerate later, you will want the exact prompts and settings; document them.

Recap

  • Use Riffusion to generate short clips from text prompts; be specific and keep good prompts for reuse.
  • Edit clips (trim, normalize, set loop points) and organize them before bringing them into FMOD.
  • In FMOD, create banks and events (one-shots, loops, adaptive music) and use parameters so the game can control intensity, layers, and blending.
  • Integrate FMOD in your engine: load banks, start/stop events, set parameters from game code.
  • Iterate: Replace and add Riffusion content over time and combine it with library or recorded audio as needed.


Frequently Asked Questions

Is Riffusion free to use?

The original Riffusion code and model were released openly, and running them locally is free; some hosted or API-based services may have usage limits or fees. Code and model weights can carry different licenses, so check the current terms of each against your intended use (personal vs commercial).

Can I use Riffusion output in commercial games?

Usually yes, but verify the license of the Riffusion model and of any service you use. Open-source Riffusion typically allows commercial use, though the license may attach conditions; always confirm before shipping.

Do I need FMOD or can I use Wwise/Unity Audio?

You can use the same Riffusion workflow with any middleware or engine audio system. The steps are: generate with Riffusion, edit, then implement in your chosen tool (FMOD, Wwise, or engine-native events and parameters).

Why do my Riffusion clips sound noisy or weird?

Riffusion can produce artifacts or inconsistent quality. Try clearer prompts, different seeds, or multiple takes and pick the best. Light post-processing (EQ, noise reduction) in a DAW can help.
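
When you are comparing many takes, a quick numeric screen can flag the obviously broken ones before you listen. This sketch assumes the clip is already loaded as floats in the range -1 to 1; the function name and the 0.999 clipping level are illustrative. It cannot judge how a clip sounds, only whether it is very quiet or clipped.

```python
import math

def clip_stats(samples, clip_level=0.999):
    """Return peak and RMS level in dBFS plus a count of near-full-scale samples."""
    def to_db(x):
        return 20 * math.log10(x) if x > 0 else float("-inf")

    if not samples:
        return {"peak_db": float("-inf"), "rms_db": float("-inf"), "clipped": 0}
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    clipped = sum(1 for s in samples if abs(s) >= clip_level)
    return {"peak_db": to_db(peak), "rms_db": to_db(rms), "clipped": clipped}
```

Takes with a nonzero clipped count or a very low RMS level are good candidates to discard or regenerate first.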

How do I make adaptive music with Riffusion clips?

Export several short loops or layers from Riffusion (e.g. "calm layer," "tension layer," "action layer"). In FMOD, put them in one event with a parameter (e.g. "Intensity") and use automation or crossfades so the game can blend between layers by setting that parameter from code.