June 1998

Exploring DirectX 5.0, Part II: DirectSound Gives Your Apps Advanced 3D Sound
Download Jun98DirectSoundcode.exe (1,217KB)
Jason Clark supports software core development for Microsoft. He believes that logic is pure science. He can be reached at jclark@microsoft.com.
Sound rounds out the gaming experience by providing feedback to the ears. Even in the days of endless beeps and whizzing noises, a game just wasn't the same without sound. The value of sound is a given; the question today is one of sophistication. Will your application generate yesterday's beeps and whizzing noises, or will it offer "three-dimensional" sound that is seamlessly integrated into your game's alternate reality? DirectX® 5.0 lets you add sophisticated sound effects to your Windows®-based applications without too much pain.

In my last article ("May the Force Feedback Be with You: Grappling with DirectX and DirectInput," MSJ, February 1998), I introduced DirectX 5.0 with a discussion of DirectInput®, including the new force feedback features. This time I will delve into the features offered by the DirectSound® component of DirectX 5.0. If you are unfamiliar with DirectX, take a moment to scan the February issue of MSJ and you will be ready to roll.
 Introducing DirectSound!
 
 Getting Started with DirectSound 
 
 Creating a DirectSound Object 
The first parameter represents the GUID of the sound card for which you want to create a DirectSound object. As mentioned before, passing a NULL value here requests the default sound card for the system. I will discuss how to find other GUIDs shortly.

The second parameter is the address of a pointer to an IDirectSound interface. This parameter is potentially confusing if you are new to COM. Remember that all COM objects are manipulated using interfaces. Your goal in calling DirectSoundCreate is to retrieve a pointer to an interface that you can use to manipulate the object. Your application should define a variable (possibly global) of type LPDIRECTSOUND and pass its address as the second parameter to DirectSoundCreate.

The last parameter is known as pUnkOuter and has to do with aggregation. DirectSound does not currently support aggregation, so you must pass NULL for this parameter. The return value is an HRESULT, which can be checked against the possible error values for this function. You can also apply one of the COM SUCCEEDED or FAILED macros to check the success of the call.

As you can see, it is reasonably simple to retrieve a pointer to an IDirectSound interface for the system's default sound card. Although uncommon, it is possible for a system to have more than one sound card installed. If this is the case, you may want to create a DirectSound object for a card other than the system default. This requires that you pass a GUID for the card as the first parameter to DirectSoundCreate. You can obtain this GUID by enumerating available sound cards on the system through the DirectSoundEnumerate function. DirectSoundEnumerate is defined as follows:
If you are familiar with other enumeration APIs in Windows, such as EnumWindows, then you will feel right at home with DirectSoundEnumerate. You simply pass a pointer to an application-defined callback function and an application-defined 32-bit value. The callback function will be given a GUID for each sound card on the system. Applications that enumerate sound devices commonly call DirectSoundCreate from within the callback function.

Last, I should mention that it is possible to create an instance of a DirectSound object using the standard COM function CoCreateInstance. Unlike DirectSoundCreate, this will return an IDirectSound interface to an object that has not been initialized and is not affiliated with any sound card on the system. Before the object can be used, you must call the Initialize member function of the interface and pass the GUID of a sound card or NULL for the default sound card. If you are familiar with COM or you are using other COM objects in your application, then you may be more comfortable with this approach to creating a DirectSound object. The choice is entirely up to you.

You now have an IDirectSound interface at your disposal, so what next?
 Using the IDirectSound Interface 
 
 Buffers and Waves 
Figure 1 Digitally Sampled Sound Data
At this point you should be familiar with the return value HRESULT, and as with the rest of DirectSound, pUnkOuter should be NULL. The second parameter, lpcDSBufferDesc, is the address of a DSBUFFERDESC structure that describes the buffer you want to create. The third parameter is the address of an LPDIRECTSOUNDBUFFER variable, which will receive a pointer to the new buffer's IDirectSoundBuffer interface.

The DSBUFFERDESC structure is fairly simple. The first member, dwSize, should be initialized to the size of the structure in bytes. The second member, dwFlags, is the most complicated; I'll come back to it in a moment. The dwBufferBytes member is the length in bytes of your sound data. The dwReserved member is currently unused, but must be initialized to zero. The last member of DSBUFFERDESC is a pointer to a WAVEFORMATEX structure, which contains format information for the sampled sound such as frequency and resolution. I will talk more about this structure when I discuss how to get sampled data out of a .WAV file.

The dwFlags member of DSBUFFERDESC tells DirectSound how you intend to use this sound buffer. If you want an interface for the primary buffer, include the DSBCAPS_PRIMARYBUFFER flag in the dwFlags member; otherwise, a secondary buffer is created. Static buffers are created if the DSBCAPS_STATIC flag is OR'd into the dwFlags value. Other flags determine whether the buffer is stored in the sound card's memory or in your system's memory, and whether your application can play its buffer when it doesn't have focus. Finally, there are flags that tell DirectSound which control features you want for the buffer. These include DSBCAPS_CTRL3D, which indicates that the buffer can participate in 3D sound, and DSBCAPS_CTRLPAN, which indicates that the buffer's output can be panned from one speaker to the next. It is important to include only the CTRL flags that are necessary for a particular buffer so that DirectSound can optimize performance for that buffer.
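The idea of requesting only the control flags a buffer needs can be sketched in portable C++ without touching DirectSound. Everything named here is an assumption made for illustration: the `kCap...` constants are local stand-ins for the DSBCAPS_* bits (a real program takes the constants from dsound.h), and `BufferNeeds`, `buildFlags`, and `hasControl` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for the DSBCAPS_* bits discussed in the text.
// The values are illustrative; real code uses the constants from dsound.h.
constexpr std::uint32_t kCapStatic     = 1u << 1;
constexpr std::uint32_t kCapCtrl3D     = 1u << 4;
constexpr std::uint32_t kCapCtrlPan    = 1u << 6;
constexpr std::uint32_t kCapCtrlVolume = 1u << 7;

// Hypothetical description of what one sound actually needs.
struct BufferNeeds {
    bool isStatic;
    bool positional3D;
    bool pan;
    bool volume;
};

// Build a flag word that contains only the requested features.
std::uint32_t buildFlags(const BufferNeeds& needs) {
    // A buffer meant for 3D sound should not also request pan control
    // (see the 3D sound discussion later in the article).
    assert(!(needs.positional3D && needs.pan));

    std::uint32_t flags = 0;
    if (needs.isStatic)     flags |= kCapStatic;
    if (needs.positional3D) flags |= kCapCtrl3D;
    if (needs.pan)          flags |= kCapCtrlPan;
    if (needs.volume)       flags |= kCapCtrlVolume;
    return flags;
}

// True when the flag word includes the given control bit.
bool hasControl(std::uint32_t flags, std::uint32_t control) {
    return (flags & control) != 0;
}
```

In a real application the resulting value would go into the dwFlags member of DSBUFFERDESC; a helper like `hasControl` is one way to decide up front whether a later control call can succeed.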
But don't neglect to include a CTRL flag if you need it. A call, for example, to the SetPan member function of the IDirectSoundBuffer interface will fail if the DSBCAPS_CTRLPAN flag is not included when the buffer is created.

If a call to CreateSoundBuffer is failing mysteriously, check two things before you begin pulling your hair out. First, be sure that the dwReserved member of DSBUFFERDESC is set to zero. Second, check all of your flags to make sure that you are not using two that are mutually exclusive; this is a common reason for a failed call to CreateSoundBuffer.

Now that you have a DirectSound buffer object, your next step is to copy sampled sound data into the buffer. This requires some understanding of .WAV files, as well as the Lock and Unlock members of the IDirectSoundBuffer interface. The Lock function returns pointers to buffer memory in your process's address space into which you copy sound data. Note that the Lock member function returns more than one pointer to a buffer. This is because sound buffers are circular, so that streaming buffers can be played while your application writes data to a different part of the buffer. Lock returns two pointers to memory, along with the length of each portion of the buffer. The second pointer represents the wrapped-around portion of the buffer. If this second pointer is returned as NULL, then the first pointer covers the entire locked region. When you call Unlock, the buffer is out of your hands and managed by the sound buffer object. It is important to Lock, write sound data to, and Unlock buffer memory as quickly as possible to allow DirectSound to maintain efficient control of its buffers.

Now that you know how to write the sound data to your buffer, let's discuss where to get sound data. True, you can easily write random or equation-generated data into a buffer and play the sound. But for the most part, you will want to play recorded, real-world sounds. So it's time to talk about .WAV files and Win32 multimedia functions.
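To make the two-pointer behavior of a circular buffer concrete, here is a small portable model. It is not the DirectSound API: `LockedRegions`, `lockRegion`, and `writeWrapped` are hypothetical names, and the buffer is an ordinary `std::vector`. The sketch only imitates the shape of the result described above, where a request that runs past the end of the buffer is split into a first region and a wrapped-around second region.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// The two regions a circular buffer hands back for one write request.
struct LockedRegions {
    unsigned char* ptr1;  // start of the first region
    std::size_t    len1;  // bytes available at ptr1
    unsigned char* ptr2;  // wrapped-around region, or nullptr if none
    std::size_t    len2;  // bytes available at ptr2
};

// Split a request of `bytes` starting at `offset` into at most two
// contiguous regions of `buffer`.
LockedRegions lockRegion(std::vector<unsigned char>& buffer,
                         std::size_t offset, std::size_t bytes) {
    assert(!buffer.empty() && offset < buffer.size() && bytes <= buffer.size());
    const std::size_t untilEnd = buffer.size() - offset;

    LockedRegions r{};
    r.ptr1 = buffer.data() + offset;
    if (bytes <= untilEnd) {
        // The request fits before the end: no second region.
        r.len1 = bytes;
        r.ptr2 = nullptr;
        r.len2 = 0;
    } else {
        // The request wraps: the remainder starts at the buffer's beginning.
        r.len1 = untilEnd;
        r.ptr2 = buffer.data();
        r.len2 = bytes - untilEnd;
    }
    return r;
}

// Copy `bytes` of source data across both regions, as a caller would do
// between locking and unlocking.
void writeWrapped(std::vector<unsigned char>& buffer, std::size_t offset,
                  const unsigned char* src, std::size_t bytes) {
    const LockedRegions r = lockRegion(buffer, offset, bytes);
    std::memcpy(r.ptr1, src, r.len1);
    if (r.ptr2 != nullptr) {
        std::memcpy(r.ptr2, src + r.len1, r.len2);
    }
}
```

The caller's copy loop stays the same whether or not the request wraps, which is the main reason to always handle both pointers.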
 Multimedia and the .WAV file
Notice the use of the mmioFOURCC macro. It takes the four identifying characters and combines them into a single 32-bit value for identifying the chunk. If the call to mmioDescend succeeds, the file is a .WAV file. Notice that the third parameter is a NULL value. This indicates that the WAVE chunk is not a subchunk. If you descended to a subchunk, then you would include a pointer to an MMCKINFO structure identifying the parent chunk. This is what you do next when you descend to the "fmt " subchunk. The "fmt " subchunk's data portion holds a WAVEFORMATEX structure, which is exactly what you need to create your secondary sound buffer. After descending to the "fmt " subchunk, you need to read this data into an instance of a WAVEFORMATEX structure:
Now that you have your WAVEFORMATEX structure, which holds the format information for the wave, all you need is the actual wave data. There are only a few more steps in the parsing process. Remember, your multimedia file pointer is currently in the "fmt " subchunk. The next step is to call mmioAscend to move the pointer back out a level so that you can call mmioDescend to descend to the "data" subchunk. This chunk contains the actual sampled sound data for the .WAV file. Once you have descended to the "data" subchunk, you can read the wave data in the same way you read the "fmt " subchunk. mmioRead reads the data directly into the buffer returned by the Lock member function of the IDirectSoundBuffer interface. Finally, you can Unlock your IDirectSoundBuffer and pass the multimedia file handle to mmioClose. That's it. You now have a secondary buffer with data from an existing .WAV file that is ready to be played.
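The chunk layout that mmioDescend and mmioAscend navigate can be illustrated with a portable sketch that walks a .WAV image already loaded into memory. This is a stand-in for the idea, not the Win32 multimedia API: `fourcc`, `readU32`, `Chunk`, and `findSubchunk` are hypothetical names. The sketch assumes the standard RIFF layout (a four-character identifier, a little-endian 32-bit size, then the data, padded to an even length).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack four characters into one 32-bit value, first character in the
// low byte, which matches how chunk identifiers are stored on disk.
constexpr std::uint32_t fourcc(char a, char b, char c, char d) {
    return  static_cast<std::uint32_t>(static_cast<unsigned char>(a))
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 16)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(d)) << 24);
}

// Read a little-endian 32-bit value at `pos`.
std::uint32_t readU32(const std::vector<unsigned char>& f, std::size_t pos) {
    return  static_cast<std::uint32_t>(f[pos])
         | (static_cast<std::uint32_t>(f[pos + 1]) << 8)
         | (static_cast<std::uint32_t>(f[pos + 2]) << 16)
         | (static_cast<std::uint32_t>(f[pos + 3]) << 24);
}

// Location of a subchunk's data inside the file image.
struct Chunk {
    bool          found;
    std::size_t   dataOffset;
    std::uint32_t size;
};

// Search the subchunks of a RIFF/WAVE image for the identifier `id`.
Chunk findSubchunk(const std::vector<unsigned char>& f, std::uint32_t id) {
    const Chunk none{false, 0, 0};
    // Header: "RIFF", 32-bit size, form type "WAVE".
    if (f.size() < 12 || readU32(f, 0) != fourcc('R', 'I', 'F', 'F')
                      || readU32(f, 8) != fourcc('W', 'A', 'V', 'E')) {
        return none;
    }
    std::size_t pos = 12;
    while (pos + 8 <= f.size()) {
        const std::uint32_t ckId   = readU32(f, pos);
        const std::uint32_t ckSize = readU32(f, pos + 4);
        if (ckSize > f.size() - (pos + 8)) return none;  // truncated chunk
        if (ckId == id) return Chunk{true, pos + 8, ckSize};
        // Skip header and data; chunk data is padded to an even length.
        pos += 8 + static_cast<std::size_t>(ckSize) + (ckSize & 1u);
    }
    return none;
}
```

Looking up `fourcc('f', 'm', 't', ' ')` and then `fourcc('d', 'a', 't', 'a')` mirrors the descend, read, ascend, descend sequence described above: the first lookup locates the format block and the second locates the samples to copy into the locked buffer.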
 IDirectSoundBuffer 
 
 You Win Some, You Lose Some
 
 Sound is 3D
Figure 2 Sound Cones
 
 Working With 3D Sound 
The first parameter to QueryInterface is the GUID for the desired interface; the second is a pointer to the interface pointer that you want filled in by QueryInterface. As you can see from the example, the GUID for the IDirectSound3DListener interface is IID_IDirectSound3DListener. A similar call to QueryInterface using the IID_IDirectSound3DBuffer GUID would be used to retrieve a pointer to an IDirectSound3DBuffer interface.

The IDirectSound3DListener interface provides methods to set the location and orientation of the listener in 3D space, as well as other settings that affect all 3D sounds. The IDirectSound3DBuffer interface contains methods that manipulate the 3D characteristics of a single sound buffer, including location, Doppler effects, and cone settings. I will cover these interfaces in more detail shortly.

There are a few special considerations when dealing with 3D sound. Some of the features of DirectSound are incompatible with or meaningless to 3D sound. For example, although DirectSound supports playback of stereo sound buffers, this concept has no meaning with 3D sound. The 3D features of DirectSound create stereo (or better) output from a composite of mono sound sources positioned in 3D space. If you create a secondary sound buffer that contains stereo sound data, DirectSound will be forced to convert the sound data into a mono format when using it in a three-dimensional manner. This uses CPU cycles and should be avoided.

Another related feature of DirectSound that is incompatible with 3D sound is the ability to pan sounds from one speaker to the next. If you are creating a secondary sound buffer and you want to use this buffer with 3D sound, then you should not include the DSBCAPS_CTRLPAN flag in your call to CreateSoundBuffer. This means you should not use the DSBCAPS_CTRLDEFAULT or the DSBCAPS_CTRLALL flags either, since they both include the DSBCAPS_CTRLPAN flag.
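One way to avoid the run-time stereo-to-mono conversion is to downmix the samples yourself before filling a 3D buffer. The sketch below is an assumption-laden illustration: `downmixToMono` is a hypothetical helper, the input is taken to be interleaved 16-bit samples (left, right, left, right), and averaging the two channels is simply a common choice, not necessarily what DirectSound does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Average interleaved 16-bit stereo frames (L, R, L, R, ...) into mono.
std::vector<std::int16_t> downmixToMono(const std::vector<std::int16_t>& stereo) {
    assert(stereo.size() % 2 == 0);  // one left and one right sample per frame

    std::vector<std::int16_t> mono;
    mono.reserve(stereo.size() / 2);
    for (std::size_t i = 0; i + 1 < stereo.size(); i += 2) {
        // Sum in a wider type so two full-scale samples cannot overflow.
        const int sum = static_cast<int>(stereo[i]) + static_cast<int>(stereo[i + 1]);
        mono.push_back(static_cast<std::int16_t>(sum / 2));
    }
    return mono;
}
```

If you downmix this way, the WAVEFORMATEX you pass when creating the buffer must describe one channel, with the block alignment and average byte rate adjusted to match.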
A more global consideration when using 3D sound is the actual 3D coordinate system, and how you will apply it to your application. Will it correlate directly to the coordinate system you are using in your screen output? Will it require translation or scaling? DirectSound provides a very flexible system that lets you fit the virtual 3D space to your application's needs.
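As a rough illustration of the translation and scaling question, the following sketch maps a 2D screen position into a 3D sound position. All of it is assumed for the example: `Vec3` and `screenToSoundSpace` are hypothetical, the target space is taken to be left-handed (x to the right, y up, z away from the viewer) and measured in meters, and the pixels-per-meter scale is whatever suits your game.

```cpp
#include <cassert>

struct Vec3 {
    float x;
    float y;
    float z;
};

// Map screen pixels to a left-handed 3D position in meters.
// (originX, originY) is the pixel that corresponds to the listener's
// x/y position; depthMeters is how far in front of the listener the
// sound sits.
Vec3 screenToSoundSpace(float pixelX, float pixelY,
                        float originX, float originY,
                        float pixelsPerMeter, float depthMeters) {
    assert(pixelsPerMeter > 0.0f);

    Vec3 p;
    p.x = (pixelX - originX) / pixelsPerMeter;  // translate, then scale
    p.y = (originY - pixelY) / pixelsPerMeter;  // screen y grows downward
    p.z = depthMeters;                          // positive z is in front
    return p;
}
```

Keeping the conversion in one function means a later change of scale or origin touches a single place rather than every sound position in the application.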
 Understanding 3D Coordinates and Distance
Figure 3 Left-handed Coordinate System
 
 The Listener
Figure 4 Listener Orientation
 
 3D Sound Buffers
 
 A Word on Quality 
 
  The Sample Program
 From the June 1998 issue of Microsoft Systems Journal. 