Three characteristics can determine the quality and size of a digital waveform: the frequency of the samples, the amount of information stored per sample, and the number of channels recorded.
Samples are taken at a fixed frequency, dividing the waveform into identically sized portions. The more portions (that is, the higher the sampling frequency), the higher the quality and the more disk storage required. A higher sampling frequency also records higher tones in the sound: a given rate can only capture tones below half its value, so 11.025 kHz sampling captures only tones lower than about 5.513 kHz.
Of course, a sampled waveform is only an approximation; inevitably, some information present in the original waveform is lost in the process. This is why the sampling frequency so directly affects quality: the more frequent the samples, the less information is lost in the approximation.
The three standard sampling frequencies are 44.1 kHz, 22.05 kHz, and 11.025 kHz.
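The relationship between sampling rate and the highest tone captured can be sketched in a few lines. This is a minimal illustration, assuming the usual half-the-sampling-rate limit; the function name `highest_captured_frequency` is hypothetical and chosen only for this example.

```python
# Standard sampling rates named in the text, in hertz.
STANDARD_RATES_HZ = [44_100, 22_050, 11_025]


def highest_captured_frequency(sampling_rate_hz: float) -> float:
    """Return the upper bound (in Hz) on tones a sampling rate can capture.

    Assumes the limit is half the sampling rate.
    """
    return sampling_rate_hz / 2


for rate in STANDARD_RATES_HZ:
    limit = highest_captured_frequency(rate)
    print(f"{rate / 1000:g} kHz sampling -> tones below {limit / 1000:g} kHz")
```

For 11.025 kHz sampling the function returns 5512.5 Hz, which rounds to the 5.513 kHz figure quoted earlier.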
The amount of information stored per sample specifies the precision with which each sample is measured. Information per sample is derived by dividing the vertical (amplitude) range of the waveform into equal units. An 8-bit sample divides that range into 256 equal units; a 16-bit sample divides it into 65,536 equal units.
The greater the number of vertical units used to describe the waveform characteristics in the sample, the more accurately the sample resembles the original analog waveform. Of course, more information also requires more storage.
The number of units between the baseline and the upper limit of the waveform is sometimes referred to as its dynamic range. For 8-bit samples to divide the waveform into 256 units, the waveform must have a dynamic range that covers all (or most) of the 256 units. If the waveform's dynamic range covers only 128 units, precision (and quality) is reduced, as though only 7 bits were used per sample.
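The arithmetic behind these figures is short enough to check directly. The sketch below is illustrative only; the helper names `quantization_levels` and `effective_bits` are hypothetical and not part of any audio API.

```python
import math


def quantization_levels(bits_per_sample: int) -> int:
    """Number of equal vertical units available at a given sample size."""
    return 2 ** bits_per_sample


def effective_bits(units_covered: int) -> float:
    """Bits of precision actually used when the waveform spans only
    `units_covered` of the available units."""
    return math.log2(units_covered)


print(quantization_levels(8))    # units available to an 8-bit sample
print(quantization_levels(16))   # units available to a 16-bit sample
print(effective_bits(128))       # precision when only 128 units are covered
```

An 8-bit sample gives 256 units and a 16-bit sample gives 65,536; a waveform that spans only 128 of the 256 units uses the equivalent of 7 bits.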
The number of sound channels specifies whether a recording produces one waveform (referred to as monaural or mono) or produces two waveforms (referred to as stereo). Stereo sound can offer a richer listening experience than mono, but also requires twice the amount of storage.
Digital sound files are large, no matter what quality you choose, but the lower sampling rates produce much smaller files than the higher sampling rates. Use this formula to estimate storage needs for one channel of audio, with the sampling rate in hertz (double the result for stereo):
(sampling rate * bits per sample)/8 = bytes/sec
For example, a one-minute monaural sound clip requires the following space:
One minute of monaural music
Bits per sample | Sampling rate | Storage required |
8 bits | 11.025 kHz | 0.66 MB/minute |
8 bits | 22.05 kHz | 1.32 MB/minute |
16 bits | 44.1 kHz | 5.292 MB/minute |
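The table values can be reproduced from the formula. The following is a minimal sketch, assuming 1 MB means 1,000,000 bytes (which matches the figures in the table); the function names and the `channels` parameter are hypothetical additions for illustration, with `channels` reflecting the earlier point that stereo doubles the storage.

```python
def bytes_per_second(sampling_rate_hz: int, bits_per_sample: int,
                     channels: int = 1) -> float:
    """(sampling rate * bits per sample) / 8, scaled by channel count.

    The `channels` factor is an assumption layered on the formula in the
    text, which covers a single (mono) channel.
    """
    return sampling_rate_hz * bits_per_sample * channels / 8


def megabytes_per_minute(sampling_rate_hz: int, bits_per_sample: int,
                         channels: int = 1) -> float:
    """Storage for one minute of audio, taking 1 MB as 1,000,000 bytes."""
    per_second = bytes_per_second(sampling_rate_hz, bits_per_sample, channels)
    return per_second * 60 / 1_000_000


# Rows of the table above: (bits per sample, sampling rate in Hz).
for bits, rate in [(8, 11_025), (8, 22_050), (16, 44_100)]:
    size = megabytes_per_minute(rate, bits)
    print(f"{bits} bits | {rate / 1000:g} kHz | {size:.3f} MB/minute")
```

Passing `channels=2` gives the corresponding stereo estimate, for example about 10.584 MB per minute at 16 bits and 44.1 kHz.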