SCUMM v5 sound — SOUN resources and sound-gated waits

This note covers what MI1's SOUN resources contain and the one place where sound timing leaks into game logic: cutscenes and transitions pace themselves by polling for sound completion once per frame, so the interpreter has to know how long each sound plays. It documents the data and that behavior; how GrogVM times sounds to satisfy it is covered in engine/audio.md.

1. Sound-gated waits

The canonical pacing idiom (g#57, g#107, g#108, g#122, g#131, g#143, room-1 ENCD, …):

startSound N
breakHere
isSoundRunning g0 = sound N
equalZero g0 -> -10     ; g0 != 0 (still running) → loop back to breakHere
<loadRoom / startSound next / putActorInRoom / …>

equalZero jumps when its operand is non-zero (the branch is taken when the named test fails), so the loop reads as: yield each game frame, re-poll, and fall through once the sound ends. For the transition to be paced at all, the interpreter must report isSoundRunning truthfully for the sound's real length, which means knowing that length.
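The loop can be modelled in a few lines of Python. This is a minimal sketch, not GrogVM's implementation: FakeSoundClock, wait_for_sound, run, and the idea of measuring a sound's length in game frames are all hypothetical names and simplifications introduced here for illustration.

```python
class FakeSoundClock:
    """Hypothetical stand-in for the interpreter's sound bookkeeping."""

    def __init__(self):
        self.frame = 0       # current game frame
        self.ends_at = {}    # sound id -> frame at which it stops

    def start_sound(self, sound_id, length_frames):
        self.ends_at[sound_id] = self.frame + length_frames

    def is_sound_running(self, sound_id):
        return 1 if self.frame < self.ends_at.get(sound_id, 0) else 0


def wait_for_sound(clock, sound_id, length_frames):
    """Generator version of the startSound / breakHere / isSoundRunning idiom."""
    clock.start_sound(sound_id, length_frames)      # startSound N
    while True:
        yield                                       # breakHere: give up the frame
        g0 = clock.is_sound_running(sound_id)       # isSoundRunning g0 = sound N
        if g0 == 0:                                 # test holds: fall through
            return
        # g0 != 0: the equalZero branch is taken, back to breakHere


def run(length_frames):
    """Drive the script one frame per yield; return frames spent waiting."""
    clock = FakeSoundClock()
    for _ in wait_for_sound(clock, 28, length_frames):
        clock.frame += 1
    return clock.frame
```

With this model, run(5) spends 5 frames in the loop, and even a zero-length sound costs one frame because breakHere comes before the first poll. If is_sound_running returned 0 too early, the script would fall through early and the transition would not be paced.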

The two distinct loop shapes:

VAR_MUSIC_TIMER (14) is a separate clock — auto-incremented per jiffy, polled by the credits cutscene — unrelated to isSoundRunning. There is no wait-for-sound opcode; all sound waits are isSoundRunning polls.

2. The SOUN resource formats

SOUN blocks live in MONKEY.001, indexed by the DSOU lane ({room, offset} per sound id); resolve one exactly like a global script. The top-level SOUN block uses the inclusive size convention (header included); everything nested below it is payload-only (exclusive).
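The two size conventions differ only in whether the header is counted. The sketch below assumes the usual SCUMM v5 block header of a 4-byte ASCII tag followed by a 4-byte big-endian size, and assumes nested blocks share that header layout; read_block, HEADER_LEN, and the "DEMO" tag are hypothetical, and the bytes are synthetic rather than taken from MONKEY.001.

```python
import struct

HEADER_LEN = 8  # assumed: 4-byte tag + 4-byte big-endian size


def read_block(buf, pos, inclusive):
    """Return (tag, payload, next_pos) for the block starting at pos.

    inclusive=True  : size field counts the header (top-level SOUN)
    inclusive=False : size field counts the payload only (nested blocks)
    """
    tag = buf[pos:pos + 4].decode("ascii")
    (size,) = struct.unpack_from(">I", buf, pos + 4)
    payload_len = size - HEADER_LEN if inclusive else size
    start = pos + HEADER_LEN
    return tag, buf[start:start + payload_len], start + payload_len


# Synthetic example: a nested block with a 3-byte payload (exclusive size = 3)
inner = b"DEMO" + struct.pack(">I", 3) + b"abc"
# wrapped in a top-level block whose size includes its own 8-byte header.
outer = b"SOUN" + struct.pack(">I", HEADER_LEN + len(inner)) + inner

outer_tag, outer_payload, outer_end = read_block(outer, 0, inclusive=True)
inner_tag, inner_payload, inner_end = read_block(outer_payload, 0, inclusive=False)
```

Mixing the conventions up shifts every following offset by 8 bytes, so keeping the flag explicit at each call site is a cheap guard.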

MI1 (CD-DOS-VGA) has 105 SOUN blocks in three timing-relevant shapes.

SOU container → device renditions

A SOU block holds one or more renditions of the same sound for different hardware, the first listed being the primary one:

24-byte CD-audio trigger

Sound ids 100–129 are not SOU containers but 24-byte commands (0x18 …) that trigger a redbook CD track:

Fine print (MI1): one trigger uses the cue: #108 = track 17 from 01 23 30 (hex minute/second/frame bytes, i.e. 1 m 35 s 48 f ≈ 95.6 s at 75 frames per second; see §4). Bytes 8/9 look like a volume pair (0xc8 0xc8 everywhere except 0xff 0xff on the two track-17 triggers); bytes 21–23 are always zero (presumably an end position, unused, since playback runs to the track's end).
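The cue arithmetic, spelled out. This sketch assumes the three cue bytes are plain binary minute, second, and frame values (not BCD), which is what the 1 m 35 s 48 f reading implies, and uses the redbook rate of 75 frames per second; msf_to_seconds is a hypothetical helper name.

```python
REDBOOK_FRAMES_PER_SECOND = 75


def msf_to_seconds(minutes, seconds, frames):
    """Convert a minute/second/frame position to seconds."""
    return minutes * 60 + seconds + frames / REDBOOK_FRAMES_PER_SECOND


# The cue bytes of trigger #108 as quoted above, in hex.
cue = bytes.fromhex("01 23 30")          # -> 1, 35, 48 in decimal
cue_seconds = msf_to_seconds(*cue)       # 60 + 35 + 48/75
```

The result is 95.64 s, which rounds to the ≈ 95.6 s quoted above.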

The track audio is not in MONKEY.001 — it ships as separate TrackN.* files (the IT CD-DOS-VGA rip uses FLAC TrackN.fla, the EN rip MP3 TrackN.mp3; the original pressing had true redbook CD sessions). A trigger's playback length is therefore that track file's length. The IT and EN encodes agree to within a couple of frames (same music).

MI2 has no external track files — its sounds are all SOU containers (SBL/MIDI in MONKEY2.001), so the CD-trigger shape doesn't appear.

3. Which sounds gate

Every MI1 wait-gated sound is one of: an SBL effect (#28, ~2.7 s), a MIDI piece (#50, ~4.8 s), or a one-shot CD track (#104–107 = track 6, the ~12.5 s "Il Viaggio" voyage theme; #117 = track 7). No wait-gate ever polls a looping sound, so none can hang. Looping CD music (byte 17 = 0xff) plays until explicitly stopped and never gates a wait.
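A guard for the "never wait on a looping sound" rule can be sketched as follows. It assumes byte offsets in the 24-byte command are 0-based (consistent with bytes 21–23 being the last three) and treats any value other than 0xff at byte 17 as non-looping, which is an assumption; the function names are hypothetical and the sample commands are synthetic, not real MI1 data.

```python
CD_TRIGGER_LEN = 24       # from the note: CD triggers are 24-byte commands
LOOP_FLAG_OFFSET = 17     # assumed 0-based offset
LOOP_FLAG_VALUE = 0xFF    # from the note: byte 17 = 0xff marks looping music


def is_looping_cd_trigger(command):
    """True if a 24-byte CD trigger is marked as looping."""
    if len(command) != CD_TRIGGER_LEN:
        raise ValueError("expected a 24-byte CD trigger")
    return command[LOOP_FLAG_OFFSET] == LOOP_FLAG_VALUE


def safe_to_wait_on(command):
    """A wait loop on a looping sound would never fall through."""
    return not is_looping_cd_trigger(command)


# Synthetic commands: all-zero one-shot, and the same with byte 17 set.
one_shot = bytes(CD_TRIGGER_LEN)
looping = bytes(17) + b"\xff" + bytes(6)
```

An interpreter could use such a check as a debug assertion when a script polls isSoundRunning, since the note's claim is that no MI1 script ever trips it.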

4. A worked example — the title → lookout music handoff

The opening of MI1 shows how the pieces compose. CD track 17 holds two musical segments back to back: the title theme, then the lookout piece from ≈ 95.6 s in. Two different triggers play the two halves:

  1. The credits/title cutscene (global #152, room 10) starts #110 — track 17 from the top, one-shot — and stops it on both exits: the natural end waits on VAR_MUSIC_TIMER > 5700 (≈ 95 s, the length of the title segment), and the ESC override path runs the same stopSound 110; endCutScene. The theme never survives the title.
  2. Boot then proceeds to the lookout (room 38). Its ENCD starts the room's AdLib bed #98 only when the boot script is not running (getScriptRunning(1) gate) — i.e. on later revisits (room 37 starts #98 unconditionally). On the boot path the lookout cutscene (room-local #203, via local #200) instead starts #108 — track 17 again, cued at 01 23 30: the lookout segment.

So the 5700-jiffy gate and the #108 cue point to the same seam inside one CD track; the music "changes" at the lookout because playback jumps to the track's second half.
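The claim that both numbers mark the same seam can be checked with a little arithmetic. This sketch assumes a jiffy is 1/60 s, which matches the note's 5700 ≈ 95 s, and that the cue bytes are binary minute/second/frame values at 75 frames per second; the constant names are hypothetical.

```python
JIFFIES_PER_SECOND = 60          # assumed: consistent with 5700 jiffies ≈ 95 s
REDBOOK_FRAMES_PER_SECOND = 75

# Title cutscene exit gate: VAR_MUSIC_TIMER > 5700
gate_seconds = 5700 / JIFFIES_PER_SECOND

# Trigger #108 cue: hex 01 23 30 = 1 m 35 s 48 f
cue_seconds = 0x01 * 60 + 0x23 + 0x30 / REDBOOK_FRAMES_PER_SECOND

# Distance between the script's gate and the cue point inside track 17.
seam_gap_seconds = cue_seconds - gate_seconds
```

Under these assumptions the gate sits at 95.0 s and the cue at 95.64 s, about two thirds of a second apart.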

5. Beyond timing — the audio payload

Beyond the timing above, a SOUN also fully describes its audio: the SBL samples, the MIDI note stream for each device, and the iMUSE control data (soundKludge / VAR_SOUNDRESULT; MI1 has zero uses; they loud-halt). Synthesizing it is output-backend territory; see engine/audio.md §4.