DirectX SDK

Sound Data

[C++]

IDirectSound and IDirectSoundCapture work with waveform audio data, which consists of digital samples of the sound at a fixed frequency. The particular format of a sound can be described by a WAVEFORMATEX structure. This structure is documented in the Multimedia Structures section of the Platform SDK documentation, but is briefly described here for convenience:

typedef struct { 
    WORD  wFormatTag; 
    WORD  nChannels; 
    DWORD nSamplesPerSec; 
    DWORD nAvgBytesPerSec; 
    WORD  nBlockAlign; 
    WORD  wBitsPerSample; 
    WORD  cbSize; 
} WAVEFORMATEX; 
 

The wFormatTag member contains a unique identifier assigned by Microsoft Corporation. A complete list can be found in the Mmreg.h header file. The only tag valid with DirectSound is WAVE_FORMAT_PCM. This tag indicates Pulse Code Modulation (PCM), an uncompressed format in which each sample represents the amplitude of the signal at the time of sampling. DirectSoundCapture can capture data in other formats by using the Audio Compression Manager.

For information on using non-PCM data with DirectSound, see Compressed Wave Formats.

The nChannels member describes the number of channels, usually either one (mono) or two (stereo). For stereo data, the samples are interleaved. The nSamplesPerSec member describes the sampling rate, or frequency, in hertz. Typical values are 11,025, 22,050, and 44,100.
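As an illustration of interleaving, the following minimal sketch splits 16-bit stereo PCM data into separate left and right channels. It assumes the usual PCM layout in which the left sample of each pair comes first (L0, R0, L1, R1, and so on). StereoChannels and Deinterleave are hypothetical names invented for this example and are not part of DirectSound.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container (not part of DirectSound): the two channels
// of a stereo stream after they have been separated.
struct StereoChannels {
    std::vector<std::int16_t> left;
    std::vector<std::int16_t> right;
};

// Splits interleaved 16-bit stereo PCM into separate channels.
// Assumed layout: L0, R0, L1, R1, ... (left channel first).
StereoChannels Deinterleave(const std::vector<std::int16_t>& interleaved)
{
    StereoChannels out;
    // Each pair holds one left and one right sample; a trailing
    // unpaired sample, if any, is ignored.
    const std::size_t pairs = interleaved.size() / 2;
    out.left.reserve(pairs);
    out.right.reserve(pairs);
    for (std::size_t i = 0; i < pairs; ++i) {
        out.left.push_back(interleaved[2 * i]);
        out.right.push_back(interleaved[2 * i + 1]);
    }
    return out;
}
```

Mono data needs no such step, because every sample belongs to the single channel.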

The wBitsPerSample member gives the size of each sample, generally 8 or 16 bits. The value in nBlockAlign is the number of bytes required for each complete sample, that is, one sample for every channel, and for PCM formats is equal to (wBitsPerSample * nChannels / 8). The value in nAvgBytesPerSec is the product of nBlockAlign and nSamplesPerSec.
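The relationships among these members can be sketched as follows. So that the example compiles without the Windows headers, it uses a stand-in structure, WaveFormatSketch, that mirrors the members of WAVEFORMATEX, and a constant, kWaveFormatPcm, that holds the value of WAVE_FORMAT_PCM (1). Both names, and the MakePcmFormat helper, are assumptions made for this example. A real application would include the SDK headers and fill in a WAVEFORMATEX directly.

```cpp
#include <cstdint>

// Stand-in for WAVEFORMATEX, used only so this sketch is self-contained.
// A real DirectSound program uses the structure from the SDK headers.
struct WaveFormatSketch {
    std::uint16_t wFormatTag;
    std::uint16_t nChannels;
    std::uint32_t nSamplesPerSec;
    std::uint32_t nAvgBytesPerSec;
    std::uint16_t nBlockAlign;
    std::uint16_t wBitsPerSample;
    std::uint16_t cbSize;
};

// Stand-in for WAVE_FORMAT_PCM, which has the value 1.
const std::uint16_t kWaveFormatPcm = 1;

// Hypothetical helper: describes a PCM format and derives the
// dependent members from the three independent ones.
WaveFormatSketch MakePcmFormat(std::uint16_t channels,
                               std::uint32_t samplesPerSec,
                               std::uint16_t bitsPerSample)
{
    WaveFormatSketch wfx = {};
    wfx.wFormatTag = kWaveFormatPcm;
    wfx.nChannels = channels;
    wfx.nSamplesPerSec = samplesPerSec;
    wfx.wBitsPerSample = bitsPerSample;
    // Bytes for one sample on every channel.
    wfx.nBlockAlign =
        static_cast<std::uint16_t>(bitsPerSample * channels / 8);
    // Bytes consumed per second of audio.
    wfx.nAvgBytesPerSec = wfx.nBlockAlign * samplesPerSec;
    // No extra format information follows a PCM description.
    wfx.cbSize = 0;
    return wfx;
}
```

For example, MakePcmFormat(2, 44100, 16) yields an nBlockAlign of 4 and an nAvgBytesPerSec of 176,400.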

Finally, cbSize gives the size, in bytes, of any extra fields required to describe a specialized wave format. This member is always zero for PCM formats.

[Visual Basic]

DirectSound and DirectSoundCapture work with waveform audio data, which consists of digital samples of the sound at a fixed frequency. The particular format of a sound can be described by a WAVEFORMATEX type. This type is briefly described here for convenience:

Type WAVEFORMATEX
    lAvgBytesPerSec As Long
    lExtra As Long
    lSamplesPerSec As Long
    nBitsPerSample As Integer
    nBlockAlign As Integer
    nChannels As Integer
    nFormatTag As Integer
    nSize As Integer
End Type

The nFormatTag member contains a unique identifier assigned by Microsoft Corporation. A complete list can be found in the Mmreg.h header file. The only tag valid with DirectSound is WAVE_FORMAT_PCM. This tag indicates Pulse Code Modulation (PCM), an uncompressed format in which each sample represents the amplitude of the signal at the time of sampling. DirectSoundCapture can capture data in other formats by using the Audio Compression Manager.

For information on using non-PCM data with DirectSound, see Compressed Wave Formats.

The nChannels member describes the number of channels, usually either one (mono) or two (stereo). For stereo data, the samples are interleaved. The lSamplesPerSec member describes the sampling rate, or frequency, in hertz. Typical values are 11,025, 22,050, and 44,100.

The nBitsPerSample member gives the size of each sample, generally 8 or 16 bits. The value in nBlockAlign is the number of bytes required for each complete sample, that is, one sample for every channel, and for PCM formats is equal to (nBitsPerSample * nChannels / 8). The value in lAvgBytesPerSec is the product of nBlockAlign and lSamplesPerSec. For example, 16-bit stereo data sampled at 22,050 hertz has an nBlockAlign of 4 and an lAvgBytesPerSec of 88,200.

Finally, nSize gives the size, in bytes, of any extra fields required to describe a specialized wave format. This member is always zero for PCM formats.