Introduced in 1993, the SoundFont sample-based synthesis format became a standard with the proliferation of the Creative Technology SoundBlaster AWE32 sound card, which uses the EMU8000 synthesizer engine, and of its successors, the Live! and the Audigy.
SoundFonts, in a manner analogous to character fonts, enable the portable rendering of a musical composition with the actual timbres intended by the performer or composer. The SoundFont format is a portable, extensible, general interchange standard for sample-based synthesizer sounds and associated articulation data.
A SoundFont bank is a collection of sounds in the SoundFont format. Such a bank contains both the digital audio samples captured from a sound source and the instructions that tell the synthesizer how to articulate the sound according to the musical or sonic context expressed by MIDI.
For example, a trumpet could be a particular sound in a SoundFont bank. That sound might contain recordings of trumpets played at several different pitches, together with information telling the synthesizer to filter or mute the sound when notes are played softly, loop information allowing a short recording to be stretched into a sustained note, and instructions on how to apply vibrato or bend the pitch of the note in response to MIDI commands.
The trumpet sound in the example above is like the letter “a” in a type font. The different sounds produced by different keys and velocities of the trumpet in the SoundFont bank are analogous to the different renderings produced by different sizes of the letter “a” in the type font. Just as different monitors display the letter “a” in different sizes according to their resolution, memory and other hardware capabilities, different synthesizers play the trumpet according to their synthesis capabilities.
SoundFonts come in two flavours: Standard and Compressed. This page describes both types of SoundFonts and their use.
The Musical Instrument Digital Interface (MIDI) language has become a standard in the PC industry for the representation of musical scores.
However, as you probably know, MIDI files do not carry any sound; they are a collection of commands for sound-producing equipment (synthesizers). A typical command is: “play note C5 using a guitar sound”.
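As a rough illustration of what such a command looks like on the wire, the sketch below builds the two MIDI messages behind “play note C5 using a guitar sound”: a program change that selects the instrument, followed by a note on. The helper names are hypothetical. The sketch also assumes the General MIDI nylon guitar (program 25, stored as 24 when counted from zero) and the naming convention in which middle C is C4, so that C5 is key 72; other conventions place C5 at key 60.

```python
# Hypothetical helpers; only the status bytes (0xC0, 0x90) come from the MIDI protocol.
ACOUSTIC_GUITAR_NYLON = 24  # General MIDI program 25, counted from zero (assumption)
C5 = 72                     # assumes middle C (key 60) is named C4


def program_change(channel: int, program: int) -> bytes:
    """Select the instrument ("use a guitar sound") on a channel."""
    return bytes([0xC0 | channel, program])


def note_on(channel: int, key: int, velocity: int) -> bytes:
    """Start a note; velocity says how hard the key was struck (1-127)."""
    return bytes([0x90 | channel, key, velocity])


message = program_change(0, ACOUSTIC_GUITAR_NYLON) + note_on(0, C5, 100)
print(message.hex(" "))  # five bytes in total, none of which is audio data
```

The whole command fits in five bytes, which is why MIDI files are so small: the sound itself has to come from somewhere else.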
When it receives the command, the synthesizer must come up with the actual guitar sound. SoundFont files provide that kind of information. SoundFonts carry not only the actual instrument sounds, the so-called samples, but also the so-called articulation data, that is, instructions on how to play the sample data.
When the synthesizer receives a NOTE ON command like the one stated previously, it looks in the SoundFont for the sample corresponding to the desired sound and plays it.
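The lookup step can be pictured with a much simplified model. The sketch below is not the actual SoundFont file layout: the `Zone` class, the `bank` dictionary, the sample names and the key ranges are all hypothetical, chosen only to show how a program number and a key might be mapped to a stored sample.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Zone:
    """Hypothetical record: one stored sample and the key range it covers."""
    low_key: int
    high_key: int
    sample_name: str
    root_key: int  # the key at which the sample was recorded


# Hypothetical bank: program number -> zones. Program 24 stands for a guitar.
bank = {
    24: [
        Zone(0, 59, "guitar_E3", 52),
        Zone(60, 127, "guitar_C5", 72),
    ],
}


def find_zone(bank: dict, program: int, key: int) -> Optional[Zone]:
    """Return the zone whose key range contains `key`, or None if there is none."""
    for zone in bank.get(program, []):
        if zone.low_key <= key <= zone.high_key:
            return zone
    return None


zone = find_zone(bank, program=24, key=72)
print(zone.sample_name if zone else "no sample for this note")
```

A real bank would also select by velocity and carry the articulation data alongside each sample; the sketch leaves both out.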
Among the several synthesis methods in use today, this is the one that provides the most realism, as what you hear is actually the sound recorded from a real instrument. The drawback is that it takes a huge amount of data to do this.
Imagine that you are to build a SoundFont for a piano. If we assume that each note will sound for at most 10 seconds, you will need, for a 16-bit mono sample at the standard rate of 44,100 samples per second (the CD standard), 882,000 bytes of data for each note. Considering an 88-key piano, you would need 77,616,000 bytes for the entire set. Now consider that when you strike a piano key with different strengths (called velocity in MIDI parlance), the sound not only varies in strength but also has a different composition. So for more realism, you should multiply the previous number by the number of different velocities you sample. This is what you need for a single instrument; now imagine what you need for the 128 instruments that make up a General MIDI set.
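The arithmetic above can be checked with a few lines of Python. The figures for sample rate, sample width, note length and key count are those given in the text; the number of velocity layers is an assumption picked purely for illustration.

```python
SAMPLE_RATE = 44_100      # samples per second (the CD standard)
BYTES_PER_SAMPLE = 2      # 16-bit mono
SECONDS_PER_NOTE = 10
KEYS = 88

VELOCITY_LAYERS = 4       # assumption: illustrative value, not from the text

bytes_per_note = SAMPLE_RATE * BYTES_PER_SAMPLE * SECONDS_PER_NOTE
bytes_per_piano = bytes_per_note * KEYS
bytes_with_layers = bytes_per_piano * VELOCITY_LAYERS

print(f"one note:          {bytes_per_note:>13,} bytes")
print(f"88 keys:           {bytes_per_piano:>13,} bytes")
print(f"{VELOCITY_LAYERS} velocity layers: {bytes_with_layers:>13,} bytes")
```

Changing `VELOCITY_LAYERS` shows how quickly the total grows with each extra layer.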
One way to solve this problem is to resort to looping: Instead of using a sample for the entire duration of a note, use a smaller sample and repeat it as needed to fulfil the note length required. Unfortunately, by doing this, the sound quality is affected, and some artefacts are introduced, defeating to some extent the objective of sound realism.
On the other hand, the sound of an instrument does not have a steady amplitude for the duration of a note; it follows a pattern, called an envelope, which changes almost continuously. So, if you use loops, you must supply the envelope to modulate the sound. Other elements of the sound, like tremolo and vibrato, must also be simulated. This pushes the quality of the sound further away from the original.
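Looping and enveloping can be sketched together as follows. This is a minimal illustration, not how any particular synthesizer does it: the function names are hypothetical, the "recording" is a single synthetic sine cycle, and the envelope is a plain exponential decay rather than the multi-stage envelopes real instruments need.

```python
import math

SAMPLE_RATE = 44_100


def looped_note(sample, loop_start, loop_end, length):
    """Play `sample` from the start, then keep repeating sample[loop_start:loop_end]
    until `length` output samples have been produced."""
    out = []
    pos = 0
    for _ in range(length):
        out.append(sample[pos])
        pos += 1
        if pos >= loop_end:
            pos = loop_start
    return out


def decay_envelope(length, half_life_s):
    """Gain curve that halves every `half_life_s` seconds (a simplifying assumption)."""
    return [0.5 ** (n / (SAMPLE_RATE * half_life_s)) for n in range(length)]


# Stand-in for a short recording: one cycle of a 441 Hz sine (100 samples).
cycle = [math.sin(2 * math.pi * n / 100) for n in range(100)]

note = looped_note(cycle, loop_start=0, loop_end=100, length=SAMPLE_RATE)  # 1 second
envelope = decay_envelope(len(note), half_life_s=0.25)
shaped = [s * g for s, g in zip(note, envelope)]

print(len(cycle), "stored samples stretched to", len(shaped))
```

Here 100 stored samples stand in for 44,100, which is the saving; the price is that every repetition is identical and all the variation has to come from the envelope.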
Another way to reduce storage requirements for sample data is to use samples for some of the notes, not for all of them, for instance, use samples for every other note. The sound corresponding to the missing samples is generated by interpolation between the sounds of the two closest notes. This again reduces quality and introduces undesirable artefacts.
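One simple way to produce a note that has no sample of its own, shown here as a sketch, is to take the sample of a neighbouring note and play it back faster or slower. Note that this is pitch shifting of a single sample by resampling, a simpler technique than blending the two closest notes as described above. The function name is hypothetical, and linear interpolation between adjacent sample values is the simplest possible choice.

```python
def pitch_shift(sample, semitones):
    """Resample `sample` so it sounds `semitones` higher (or lower if negative).

    Each semitone changes the playback rate by a factor of 2 ** (1 / 12).
    Values between stored samples are estimated by linear interpolation.
    """
    ratio = 2 ** (semitones / 12)
    out = []
    pos = 0.0
    while pos < len(sample) - 1:
        i = int(pos)
        frac = pos - i
        out.append(sample[i] * (1 - frac) + sample[i + 1] * frac)
        pos += ratio
    return out


recorded = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]   # stand-in for a recorded note
octave_up = pitch_shift(recorded, 12)       # twice the rate, so half as long
one_semitone_up = pitch_shift(recorded, 1)  # the "missing" neighbouring note
print(len(recorded), len(octave_up), len(one_semitone_up))
```

Because the shifted note is also shorter or longer than the original, and its timbre is shifted along with its pitch, the further a note lies from a real sample, the less natural it sounds.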
Conclusion: good sound quality and realism can be obtained using standard SoundFonts, but at a high cost in storage.
As stated above, sample size can greatly affect the quality of the sound generated by a sample-based synthesis engine. However, standard SoundFonts impose limits on the size of a file and on the size of each sample in a file. Moreover, huge SoundFont files are awkward to handle: for instance, if a file does not fit on a single CD, it becomes difficult to carry from one system to another.
One way out of this is to compress the sample data. However, to maintain sound quality, the compression and decompression processes must be lossless, that is, after decompression, the sample data must be bit-by-bit identical to the original data.
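The lossless requirement can be demonstrated with a round trip. The sketch below uses Python's general-purpose `zlib` module purely as a stand-in: it is not the compression method used by compressed SoundFonts, and the test signal is a synthetic sine wave rather than a real recording.

```python
import math
import struct
import zlib

SAMPLE_RATE = 44_100

# One second of a 441 Hz sine as 16-bit signed samples (stand-in for real audio).
samples = [int(12_000 * math.sin(2 * math.pi * 441 * n / SAMPLE_RATE))
           for n in range(SAMPLE_RATE)]
raw = struct.pack(f"<{len(samples)}h", *samples)  # little-endian 16-bit PCM

packed = zlib.compress(raw, 9)      # stand-in codec, not the SoundFont one
restored = zlib.decompress(packed)

# Lossless means bit-by-bit identical after decompression.
assert restored == raw
print(len(raw), "bytes before,", len(packed), "bytes after")
```

A perfectly periodic test tone compresses far better than a real recording would, so the ratio printed here says nothing about what to expect in practice; the point is only the equality check.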
Compressed SoundFonts contain mostly the same information as Standard ones, but store it in a very efficient compressed form. This allows the creation of much bigger SoundFonts, which can lead to big improvements in sound quality.
Comments, suggestions and bug reports are welcome and should be sent to firstname.lastname@example.org
This page last modified 2003-03-20 - Copyright © 2000-2003 ACE