Gapless looping MP3 tracks
This article has two objectives: to be a "how-to" guide for creating an MP3 track that can loop gaplessly, and to give the details of how it works. The first section is pragmatic: it simply tells you what you need and how to use it. If you are more interested in knowing how the ready-made software performs the trick, you will find the answers in the second part of this article.
The second section in particular assumes that you know a bit about digitized audio and that a console (or "DOS box") program does not scare you.
Downloads
- The MP3Loop utility for Microsoft Windows (725 KiB, version 1.4A), which serves as a proof-of-concept of the technique described in this article. The software comes as both a GUI utility and as a console ("DOS box") utility. Both versions require a WAV file as input.
- Blues-Loop.mp3 (203 KiB): a track that has been "prepared" according to the procedure in this article. The original clip is below.
- Blues-Loop.wv (423 KiB): a gapless looping clip, compressed in WavPack format. This is a local copy of the clip "blues.loop03.wav" by dobroide; it is available from the Freesound Project web site. Apart from packing the clip with WavPack (a lossless compression algorithm), the clip has not been modified or resampled.
- Two classical music loops (1.4 MiB): a ZIP file with the MP3 tracks together with their source "WAV" files in WavPack format. These clips come from the Loopology content for Adobe Audition.
Part 1 - Creating loops
The first step in creating an MP3 track that loops without gap or "plop" is to have a gapless looping clip as an uncompressed "WAV" file. This is a challenge in itself, and it takes a good audio editor to do it. When possible, cut off the audio at the "zero crossings", where the audio wave crosses the zero-amplitude line. I do not cover this step in this article (even though it is arguably the most important step), because the topic of this article is a technique to avoid gaps introduced by the MP3 compression format.
The WAV file may be mono or stereo, but it must use a 16-bit sample resolution; 8-bit files are not supported. All sample frequencies for MPEG 1 Layer 3 (32 kHz, 44.1 kHz, 48 kHz) are acceptable, but I advise sticking with the common 44.1 kHz sample frequency. The lower sample frequencies of MPEG 2 and MPEG 2.5 are not supported.
If the input WAV file has a sample frequency that differs from the standard MP3 sample frequency that you want to use, you should resample the original WAV file first, so that it already has the correct sample frequency. That is, you should avoid having the MP3 encoder resample the input file to a different sample frequency.
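The requirements above are easy to verify before running the encoder. The following is a minimal sketch in Python, using only the standard `wave` module; the function name `check_wav` and the wording of the messages are made up for this example and are not part of MP3Loop.

```python
import wave

# Sample frequencies of MPEG 1 Layer 3, as listed in the text above.
MPEG1_SAMPLE_RATES = (32000, 44100, 48000)


def check_wav(path):
    """Return a list of reasons why the WAV file is unsuitable (empty if fine)."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() not in (1, 2):
            problems.append("the file must be mono or stereo")
        if wav.getsampwidth() != 2:  # sample width in bytes; 2 bytes = 16 bits
            problems.append("the file must use 16-bit samples")
        if wav.getframerate() not in MPEG1_SAMPLE_RATES:
            problems.append("the sample frequency must be 32, 44.1 or 48 kHz")
    return problems
```

An empty list means that the file meets the three requirements; it says nothing about whether the clip actually loops without a gap.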
What you need
To convert the WAV file to an MP3 file, you use an encoder. There are many MP3 encoders (Fraunhofer, LAME, Blade, ...) and several may be suitable. This article focuses on LAME, because it is one of the best encoders and it conveniently stores its own configuration in a special "header frame" of the MP3 file (called the "Xing/Info tag" or the "LAME tag").
To make the MP3 clip loopable, you need an additional utility, MP3Loop, that prepares the WAV file prior to encoding and postprocesses the file afterwards. This second utility "folds around" the encoder: it does something to the input file before the encoder transforms it to an MP3 file and it does something to the result after the encoder has completed running.
The utility accompanying this article includes both the MP3Loop utility and the LAME encoder as a DLL, for Microsoft Windows.
Creating the MP3 track
After you have downloaded and installed the MP3Loop utility, you can create a loopable MP3 track from a WAV file by running the GUI version of the MP3Loop utility. First open a file, then click on the button "Create MP3 loop". (No options can be set with the GUI version of the utility; it always uses lame_enc, the LAME encoder.)
Alternatively, in a console window (or "DOS box"), you can use the following command:
mp3loop myloop.wav
If all works well, the output file is called "myloop.mp3" or "myloop.wav.mp3".
Depending on the encoder, you may need to add additional command line options. In the default configuration, MP3Loop launches LAME with the options --nores and -v; see below for details on these options. Since no encoder is specified on the command line, MP3Loop uses LAME. By preference it uses the DLL version of LAME, lame_enc.dll, and otherwise it looks for lame.exe somewhere in the search path. If LAME is present on your system, but it is not in the search path, use the command line option --encoder=... to specify the path to lame.exe.
If you are not using LAME, you have to add the option --encoder=... to the command line and add any options that this alternative encoder needs. One setting is mandatory: no bit reservoir; with the LAME encoder, the command line option is "--nores". I also advise using variable bitrate (VBR), since disabling the bit reservoir can cause audio quality degradation for constant bitrate (CBR) files. The option "-v" makes LAME create a VBR file. If the encoder lacks an option to disable the bit reservoir, it cannot be used with MP3Loop.
The Fraunhofer MP3Enc encoder has no explicit option to disable the bit reservoir, but the encoder does not use the bit reservoir when encoding at a bitrate of 320 kbps. The encoder delay for MP3Enc is 1159 samples. As an aside, note that you must run MP3Enc in Windows 98/ME compatibility emulation if you run it on Windows 2000 or Windows XP (and likely any later version of Microsoft Windows).
When using a different encoder (i.e. not LAME), you should also verify the "encoder delay"; if it differs from 576 (the value for LAME), you must set it with the option --encdelay when running MP3Loop.
Below is a list of the command line options that MP3Loop accepts. Any option that MP3Loop does not recognize is passed on to the encoder. However, if you use lame_enc.dll, the DLL version of LAME, only one option is taken into account: --quality.
Option | Description |
---|---|
--help | Shows a brief overview of the command line syntax and the options. |
-? | Same as --help . |
--encdelay=value | Sets the encoder delay; check the MP3 encoder for this value. The default is 576, which is the encoder delay for recent versions of LAME. |
--encoder=path | The path and filename of the MP3 encoder. The default is "lame.exe". When using another encoder than LAME, default LAME settings (--nores and -v ) are switched off. |
--leader=value | The number of frames to strip from the head of the MP3 track after encoding. You do not need to use it when encoding with LAME, as this value is read from the Xing/Info tag. |
--quality=value | Sets the quality to use for the encoding when using lame_enc.dll . The quality value must be between 1 (lowest) and 10 (highest); the default quality is 7. The higher the quality, the bigger the MP3 file. |
--trailer=value | The number of frames to strip from the tail of the MP3 track after encoding. As with --leader , you use it only when encoding with something other than LAME. |
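The --leader and --trailer options count whole MP3 frames. To illustrate what stripping frames involves, the Python sketch below walks the frame headers of an MPEG 1 Layer 3 stream and drops frames from both ends. It is not the code of MP3Loop: the function names are made up for this example, and the sketch assumes that the data starts directly at the first frame (no ID3v2 tag in front) and that the file is not in "free format".

```python
# Bitrates (kbit/s) and sample frequencies (Hz) by header index,
# for MPEG 1 Layer 3 only.
BITRATES_KBPS = (0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320)
SAMPLE_RATES = (44100, 48000, 32000)


def frame_length(header):
    """Return the size in bytes of the frame that starts with this 4-byte header."""
    if len(header) < 4 or header[0] != 0xFF or (header[1] & 0xFE) != 0xFA:
        raise ValueError("not an MPEG 1 Layer 3 frame header")
    bitrate_index = header[2] >> 4
    rate_index = (header[2] >> 2) & 0x03
    padding = (header[2] >> 1) & 0x01
    if bitrate_index in (0, 15) or rate_index == 3:
        raise ValueError("free-format or invalid frame header")
    bits_per_second = BITRATES_KBPS[bitrate_index] * 1000
    return 144 * bits_per_second // SAMPLE_RATES[rate_index] + padding


def split_frames(data):
    """Split a byte string into frames; stop at the first data that is not a frame."""
    frames = []
    offset = 0
    while offset + 4 <= len(data):
        try:
            size = frame_length(data[offset:offset + 4])
        except ValueError:
            break  # for example an ID3v1 tag at the end of the file
        frames.append(data[offset:offset + size])
        offset += size
    return frames


def strip_frames(data, leader, trailer):
    """Drop `leader` frames from the head and `trailer` frames from the tail."""
    frames = split_frames(data)
    return b"".join(frames[leader:len(frames) - trailer])
```

For a file encoded with LAME, part 2 of this article counts three frames to cut at the head and one at the tail, which corresponds to `strip_frames(data, 3, 1)`.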
Using the sound loops
When you have made a loopable MP3 track, there are several approaches to test it. Unfortunately, the simplest approach, "load the file on a PC in Winamp or Media Player, then play with repeat", does not work. These software players are file-based and they shut down the decoder at the end of the file before cycling back to the beginning. The net effect is that the player incurs the decoder delay at the start of every iteration. The decoder delay causes a short silence of 12 ms; see part 2 of this article for details.
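The figure of 12 ms follows from the decoder delay of 529 samples (see part 2) at a sample frequency of 44.1 kHz. A minimal Python sketch of the conversion, with an illustrative function name:

```python
def samples_to_ms(samples, sample_rate=44100):
    """Convert a number of samples to a duration in milliseconds."""
    return 1000.0 * samples / sample_rate


decoder_delay_ms = samples_to_ms(529)   # decoder delay, just under 12 ms
frame_ms = samples_to_ms(1152)          # one MPEG 1 Layer 3 frame, about 26 ms
```

At 32 kHz or 48 kHz the same number of samples gives a slightly longer or shorter silence.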
For proper looping, the MP3 decoder should not be shut down at the end of the track; instead, the data should wrap around from the end to the beginning. Below are a few ways to do this.
a) Using a hardware MP3 decoder
If you have a hardware MP3 player that supports looping, such as (hint, hint, ...) one of our programmable MP3-players, store the track on a memory card (CompactFlash, SD-card) under the filename "myloop.mp3", along with the executable code (the "autorun.amx" file) of the following script:
main() play "myloop.mp3", 255
If you call your track something other than "myloop.mp3", you can modify the script accordingly. The value at the end of the play command gives the repeat count (0 to 254); a value of 255 means infinite repeat. For information on programming the H0420 programmable MP3-player or the Starling audio controller, please see the references at the end of this article.
b) Using Adobe/Macromedia Flash
Although some people have reported success with gapless looping sounds in Adobe Flash, the Flash run-time appears to reset the decoder while playing a loop, and this causes the loop to be not exactly gapless. This will have to be investigated further.
Flash can play audio that is embedded in the SWF file or that it loads from a separate file. The latter case is especially convenient if the sound to play depends on selections made by the user and/or by the web site, because packing all possible sound files inside the SWF file would make it large and slow to load. To reduce the transfer time further, the dynamically loaded files are often in MP3 format.
The simplest way to play a loop in Flash is to load the MP3 track as non-streaming (the false parameter in function loadSound); obviously, you must use a loopable MP3 track. A snippet of ActionScript that illustrates this is below:
track = new Sound();
track.loadSound("myloop.mp3", false);
function StartPlayback()
{
    track.start(0,999);  // 0=start position in seconds; 999=loop count
}
track.onLoad = StartPlayback;
In the above snippet, the track gets loaded completely before it starts being played in a loop; i.e. it is non-streaming. Playing streaming tracks in a loop is more difficult, especially because the start command ignores the second parameter (the loop count) in streaming mode. Your ActionScript must therefore detect the end of the track and restart it itself. The simplest way would be to catch the onSoundComplete event, but the loop is unfortunately not gapless with this method. The following note and snippet are courtesy of Justin Gitlin (http://www.factorylabs.com).
Flash's internal MP3-playing capabilities aren't quite exact, so restarting the MP3 on the onSoundComplete event doesn't loop perfectly; there will be a very short gap in the sound. I found a resource online that explained a great solution to work around Flash's MP3-playing shortcomings. The following code will take this into consideration and loop seamlessly with a little tweaking of the variables:

// set up looping variables based on your
// mp3 encoding software
var leader:Number = 50;    // gap at the start of the mp3 (in ms)
var follower:Number = 50;  // gap at the end of the mp3 (in ms)
var placeToStop:Number;

// load mp3, start playing, and check the duration
var track:Sound = new Sound();
track.loadSound("test.mp3", false);  // false = non-streaming
track.onLoad = function(){
    track.start();
    placeToStop = track.duration - follower;
}

// run interval to constantly check the position of the mp3
// and restart after the initial mp3 gap.
var intervalVal:Number = setInterval( runMe , 1 );
function runMe():Void{
    if (track.position > placeToStop) {
        track.stop();
        track.start(leader/1000);
    }
}
c) Using your own "streaming MP3" program
The low-level codec APIs in DirectX and other libraries work in a similar way as that of hardware decoders: they expect buffers with data to be "pushed in" in a regular fashion. If you build a program using these APIs, you can create a gapless loop by wrapping around to the beginning of the file as soon as you get to the end of the file.
For purposes of illustration, I modified one of the sample programs of the FMOD library. This library supports streaming audio from disk, memory or HTTP. When streaming from disk, it is up to the programmer to implement the file access functions. In the sample program "STREAM" (that comes in the FMOD archive), the function that reads the samples is:
int F_CALLBACKAPI myread(void *buffer, int size, void *handle)
{
    return fread(buffer, 1, size, (FILE *)handle);
}
What we need to do, is to wrap back to the beginning of the file as soon as we skip past the end. The modified function that is below does exactly that. It is slightly more complex than needed, because it takes into account the special case of a very short MP3 clip: one that would fit in the buffer completely and leave room to spare.
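int F_CALLBACKAPI myread(void *buffer, int size, void *handle)
{
    int numread;
    int total = 0;
    int rewound = 0;    /* set right after a wrap-around */

    while (size > 0) {
        numread = (int)fread(buffer, 1, size, (FILE *)handle);
        /* nothing read directly after a wrap-around: the file is empty or
           unreadable, so give up instead of looping forever */
        if (numread == 0 && rewound)
            break;
        size -= numread;
        total += numread;
        buffer = (unsigned char*)buffer + numread;
        rewound = 0;
        /* if there is more to read, wrap around and continue */
        if (size > 0) {
            fseek((FILE *)handle, 0, SEEK_SET);
            rewound = 1;
        }
    } /* while */
    return total;
}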
With only this single modified function FMOD now plays a gapless loop.
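The wrap-around logic is not specific to FMOD or to C. As an illustration only, here is a sketch of the same idea in Python for any seekable binary stream. The function name `read_looping` is made up for this example, and the sketch includes a guard so that an empty stream cannot cause an endless loop.

```python
import io


def read_looping(stream, size):
    """Read exactly `size` bytes, wrapping to the start of the stream at its end."""
    chunks = []
    rewound = False
    while size > 0:
        chunk = stream.read(size)
        if not chunk and rewound:
            break  # nothing to read even after rewinding: empty stream
        chunks.append(chunk)
        size -= len(chunk)
        rewound = False
        if size > 0:
            # more data is wanted: wrap around and continue at the beginning
            stream.seek(0)
            rewound = True
    return b"".join(chunks)


# A clip shorter than the requested size is repeated as often as needed.
looped = read_looping(io.BytesIO(b"ab"), 5)
```

As in the C version, the inner loop matters only for clips that are shorter than the buffer; for longer files a single wrap-around per call is enough.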
d) Using concatenated files
If you lack a hardware MP3 player, Adobe/Macromedia Flash and the patience/skills to fiddle with FMOD, a simple way to test the gapless looping of the MP3 file is to concatenate a few copies of the track into a new track and play that one in a common file-based player. To create such a track chain under Microsoft Windows, type the following command in a DOS box (or "console window"):
copy /b myloop.mp3 + /b myloop.mp3 + /b myloop.mp3 testloops.mp3
The above command concatenates three copies of the file "myloop.mp3" into the file "testloops.mp3". You can add more copies if you wish. The /b switch sets binary mode for each source file. This is required, because otherwise copy assumes text files when concatenating: it treats a Ctrl+Z character (byte value 0x1A) as the end of a source file and appends such a character to the output file, which corrupts binary data. The concatenation mode is activated through the "+" characters between the source files.
Part 2 - Theory of operation
The properties of the MP3 file format make it difficult to play a sound clip in a loop without a gap (silence) or an audible "plop" at the junction point. The MP3 file format seems to insist on having silence at the start and the end of the MP3 track (with a duration in the order of 10 ms to 50 ms).
The common solution to gapless looping is to use a decoder that is able to skip a programmable amount of samples that it decodes. Skipping means not only not playing them but also not allotting time for the samples. The samples get decoded and immediately discarded. Obviously the decoder must work ahead by at least the number of samples that it plans to discard; in a typical case this could be a few frames. To instruct the decoder which samples to discard, there is a dummy frame in the MP3 file that contains this information. Several software players use this approach: Winamp, for example, can use the information that the LAME MP3 encoder stores in the MP3 file (the "Xing/Info tag").
The H0420 MP3 player uses an STA013 decoder chip, which does not support the Xing/Info tag or any other non-standard frames. Likewise, the VS1052 decoder chip in the Starling audio controller ignores any non-standard frames and tags. Both hardware players/controllers also do not have sufficient internal memory to work the requested number of frames ahead. For the purpose of gapless looping in the MP3 file, we need a solution in the MP3 file itself.
Figure: encoder and decoder delays and padding cause a "gap" in the loop when the track is played iteratively.
The common belief is that this is impossible (e.g. see the Hydrogenaudio Knowledgebase). The perceived problems are:
- MPEG 1 Layer 3 (MP3) stores a sound clip in frames of 1152 samples and all frames must be full. When you take an arbitrary clip, it is unlikely that it has a multiple of 1152 samples and encoding it into an MP3 file will therefore probably result in a final frame that has "silence" padding up to the next multiple of 1152 samples. Even if the original clip has an exact multiple of 1152 samples, many encoders append another frame of silence as "padding".
- The encoder adds a delay to the first frame. This cannot be avoided, because the internal filter bank needs to process a number of samples before working properly.
- The decoder has a similar delay (decoder delay) and it is unavoidable for the same reasons.
- The Modified Discrete Cosine Transform (MDCT) is an overlapping transform, and the decoding of a frame depends on the previous frame. This means that there is a problem at the loopback point because the decoding of the first frame should depend on the last frame.
- The "bit reservoir" makes the dependencies on earlier frames go back much further than the single frame of the MDCT. Dependencies that go back seven or eight frames are quite possible.
But in fact, the hurdles in the above list are easily fixed or circumvented. If the backward dependencies caused by the bit reservoir are a nuisance, those are gone if you shut the bit reservoir off. Similarly, if the encoder would need to know the last samples of the audio track in order to encode the first frame, why not just "initialize" the encoder with these final samples before presenting it with that first frame?
The MP3 file format uses a "modified discrete cosine transform" (MDCT), which is somewhat comparable to the well known Fourier transform. The encoder transforms "frames" individually with the MDCT; frames are chunks of audio of typically 26 ms. To avoid "edge transitions" due to consecutive frames being encoded with different parameters, the MDCT takes the samples of the previous frame into account when processing each frame.
It will probably not come as a surprise that some frames are more apt to compression than others. A musical piece has moments of slow progression and moments of flurry. Yet, the original MP3 format allocates an equal number of bits to all frames, and every frame must be made to fit in that amount of bits. Clearly, the audio reproduction quality would diverge widely between frames of varying "activity". The bit reservoir exists to make the audio quality more uniform across frames, by using "spare" bits in low-activity frames for "overflowing" bits of high-activity frames that are ahead. The bit reservoir is a way to store, essentially, frames of varying compressed sizes into a uniform "bitrate". A more direct approach for storing variable frame sizes was "invented" later: Variable Bit Rate MP3 files, or VBR.
The approach to gapless looping
In recipe form, here are the steps in creating a gapless looping MP3 track:
1. Make a looping clip as a WAV file
   The very first step is to extract or create a set of samples that loop without gap when played as an uncompressed "WAV" file. As stated in the introduction, this is the topic for a different article.
2. Expand/compress the audio to an exact multiple of 1152 samples
   A frame in an MPEG 1 layer 3 file contains 1152 samples (when encoding as version 2 or 2.5, use 576 samples, and also change the numbers in the remainder of the procedure). The clip obtained in the preceding step is unlikely to be a multiple of 1152 samples. So you need to expand or compress the clip, via resampling and interpolation, to an exact multiple of 1152 samples.
   Resampling the track to the nearest multiple of 1152 samples changes its pitch, but the longer the original track, the smaller the pitch change. For example, when creating a loop from a 1.3 second track, the deviation may be up to 1%, but for a 13 second track the deviation would be 0.1% maximum. With that in mind, when the original track is short, you may want to concatenate a few copies of it into a new track, so that the MP3Loop utility sees a longer track (and resampling has less deviation).
3. Insert the last 1152 samples in front of the start of the clip
   When the encoder processes the first frame of 1152 samples, the MDCT would assume "silence" for the predecessor frame (because there is no frame preceding the first frame). For purposes of looping, we need to tell the encoder to take the last 1152 samples as the predecessor for the first frame. The simplest way to do this (rather than writing a new MP3 encoder) is to copy the last 1152 samples in front of the clip before encoding it, and to strip them off later.
4. Fill up the encoder delay to a complete frame
   The encoder delay does not take a full frame. We make it so by inserting silence, with a number of samples that is equal to 1152 minus the encoder delay (576 samples in recent versions of LAME). At this point, the total number of samples in the clip is a multiple of 1152 minus the encoder delay.
5. Encode the clip, with the bit reservoir disabled
   With the bit reservoir disabled, the only interdependency between frames is the MDCT overlap, which we dealt with separately. After encoding, the first true sample falls at a frame boundary (due to the encoder delay and our silence padding at the front of the file) and the last sample falls on a frame boundary too. Actually, one would expect the very last frame of the MP3 file to be completely full, but LAME adds a full frame of silence if the calculation works out this way.
6. Cut bogus frames off the MP3 track
   LAME will write a frame with an extended Xing/Info tag at the beginning of the MP3 file. This is a full frame of silence as far as a hardware MP3 decoder is concerned. A frame with the encoder delay plus the (1152 - 576) samples of silence from our source WAV file follows. The prefixed frame, with the 1152 samples from the end of the clip, comes next. These three must be cut off, so that the track starts with the true initial samples of the clip. At the end of the MP3 track is a full frame of silence that the LAME encoder added. This frame must also be cut off.
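The preparation that happens before encoding (steps 2 to 4) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of MP3Loop: the clip is a mono list of integer samples, plain linear interpolation stands in for whatever resampling method the utility really uses, all function names are made up, and the modulo in `prepare` is an assumed generalization for encoder delays other than 576.

```python
FRAME = 1152  # samples per MPEG 1 Layer 3 frame


def nearest_frame_multiple(count):
    """Step 2: the multiple of FRAME nearest to `count` (at least one frame)."""
    return max(FRAME, ((count + FRAME // 2) // FRAME) * FRAME)


def resample_linear(samples, new_count):
    """Step 2: stretch or squeeze a looping clip to `new_count` samples."""
    old_count = len(samples)
    out = []
    for i in range(new_count):
        pos = i * old_count / new_count
        idx = int(pos)
        frac = pos - idx
        nxt = samples[(idx + 1) % old_count]  # the clip loops, so wrap around
        out.append(round(samples[idx] * (1 - frac) + nxt * frac))
    return out


def prepare(samples, encoder_delay=576):
    """Steps 2 to 4: return the sample list to hand to the encoder."""
    body = resample_linear(samples, nearest_frame_multiple(len(samples)))
    # Step 4: silence that fills the encoder delay up to a frame boundary
    # (assumption: for delays above one frame, fill up to the next boundary).
    silence = [0] * ((-encoder_delay) % FRAME)
    # Step 3: the last frame of the clip goes in front of the clip.
    return silence + body[-FRAME:] + body
```

For a clip of 57330 samples (1.3 seconds at 44.1 kHz), `nearest_frame_multiple` gives 57600 samples, a change in length of less than 0.5%, which is within the 1% bound mentioned in step 2.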
The result of the above stepwise procedure is an MP3 track where all frames are full of audio data (resampled to a multiple of 1152 samples) and where the samples of the last frame are the "initialization data" for the Modified Discrete Cosine Transform to encode the first frame. The only issue not specifically handled is the decoder delay, which is 529 samples. This decoder delay only occurs in the very first iteration, when the clip plays for the first time. In every subsequent iteration there is no delay, because the MP3 decoder sees the last frame as the predecessor of the first frame. If it is important that the clip starts at exactly the right sample in the first iteration, the entire clip has to be rotated by the number of samples of the decoder delay before encoding. The rotation should be done between steps 2 and 3 in the above procedure. The MP3Loop utility has no provisions for rotating the audio clip.
References & further reading
- Programmable MP3-player for exhibitions and kiosk applications
  A description of the MP3 player model H0420, which is programmable and supports looping audio (plus a lot more).
- Starling model H0440, with dual decoder
  A description of the Starling audio player/controller, model H0440, which (like the H0420) is programmable and supports looping audio.
- "Winamp Pro" by nullsoft
  Winamp Pro comes with a licensed version of the LAME encoder DLL (for which the royalties for the MP3 patents held by Thomson/Fraunhofer Gesellschaft have been paid).
- Gapless - Hydrogenaudio Knowledgebase
  A wiki page on gapless compressed audio and the technical issues, plus test clips. Unfortunately, the test clips are not of a very high "loopable" quality.
- platinumloops
  The platinumloops site offers good quality audio loops (in WAV format) that are free for personal use. You can also become a member and get access to more and longer loops.
- "Loopkit" by Loopheads
  Another site that offers good audio loops (in WAV format) that are free to experiment with.
- Loopology content for Adobe Audition
  Adobe makes a large set of looping, royalty-free clips available for its product "Audition" (formerly "Cool Edit Pro"). Two of the example clips on this page come from the classical_orchestral loop set.
- The STA013 MP3 decoder
  This page on a hardware MP3 decoder chip also contains some information on MP3 frames.
- FMOD audio library
  FMOD is a cross-platform audio library (for developers) that supports various audio data formats (MP3, MIDI, MOD, etc.).