Tony Cox
Microsoft Corporation
November 1999
Summary: This article provides in-depth answers to frequently asked development questions regarding Microsoft DirectX, version 7.0, and includes code samples, resources, newsgroups, and SDK information for the DirectX developer. (26 printed pages)
Contents
DirectX Developer Resources
The DirectX SDK and Run Time
General DirectX Development Issues
DirectDraw
Direct3D (Immediate Mode)
DirectSound
DirectPlay
DirectMusic
A number of excellent resources are available to DirectX developers. The primary resource is, of course, the Microsoft® DirectX® Web site at www.microsoft.com/directx/. In addition, beta program members can obtain access to the Microsoft private newsgroups for DirectX.
The DirectX SDK can be obtained through Microsoft's DirectX Web site at http://msdn.microsoft.com/directx/downloads.asp. MSDN Universal subscribers automatically receive the latest SDK as part of their MSDN subscription. DirectX beta program members automatically receive both beta and final versions of the SDK, as they become available. The end user run-time portion of DirectX is always available for download from the Microsoft site.
You should send email to directx@microsoft.com. Make sure you include your name, company name, and a fax number and/or postal address. You will be faxed/mailed a nondisclosure agreement (NDA), which you must sign and return before being accepted onto the beta program. Once you're on the beta program, you will be sent beta releases of the SDK, together with an account and password for the Microsoft private DirectX newsgroups.
The general mailing address for all DirectX issues is directx@microsoft.com. Bug reports should be sent to the DirectX bug reporting address, dxbugs@microsoft.com. When you submit a bug report, always include the report produced by the DirectX bug reporting tool, which contains details of the hardware and driver versions of various components installed on your system. This information makes it much easier for Microsoft to track down and fix your bug.
DirectX version 3, plus Microsoft DirectPlay® version 5.2, is supported on Microsoft Windows NT® 4.0 with service pack 3. With service pack 4, DirectPlay is upgraded to version 6.0, with other components still remaining at version 3; and service pack 5 upgrades DirectPlay to version 6.1a.
Windows 2000 fully supports DirectX 7, including full Direct3D hardware acceleration.
They're still there, but they are now all rolled into the same cabinet (CAB) file. The redistributable is now compressed, so it takes up less space than before, despite having all the foreign language versions included.
You probably don't have your include path set correctly. Many compilers, including Microsoft Visual C++®, include an earlier version of the SDK. So if your include path searches the standard compiler include directories first, you'll get incorrect versions of the header files. To remedy this, make sure the include path and library paths are set to search the DirectX include and library paths first. See also the dxreadme.txt file in the SDK.
The various globally unique identifiers (GUIDs) you use should be defined once and only once. The definition for the GUID will be inserted if you #define the INITGUID symbol before including the DirectX header files. Therefore, you should make sure that this only occurs for one compilation unit. An alternative to this method is to link with the dxguid.lib library, which contains definitions for all of the DirectX GUIDs. If you use this method (which is recommended), then you should never #define the INITGUID symbol.
No. DirectX interfaces are Component Object Model (COM) interfaces. This means that there is no requirement for higher numbered interfaces to be derived from corresponding lower numbered ones. Therefore, the only safe way to obtain a different interface to a DirectX object is to use the QueryInterface() method of the interface. This method is part of the standard IUnknown interface, from which all COM interfaces must derive.
The return value will be the current reference count of the object. However, the COM spec says that you should not rely on this, and the value is generally only available for debugging purposes. The values you observe may be unexpected because various other system objects may be holding references to the DirectX objects you create. For this reason, you should not write code that repeatedly calls Release() until the reference count is zero, as the object may then be freed even though another component may still be referencing it.
It shouldn't matter, because COM interfaces are reference counted. However, there are some known bugs with the release order of interfaces in some versions of DirectX. For safety, you are advised to release interfaces in the reverse of creation order, where possible.
A smart pointer is a C++ template class designed to encapsulate pointer functionality. In particular, some standard smart pointer classes are designed to encapsulate COM interface pointers. These pointers automatically perform QueryInterface() instead of a cast, and handle AddRef() and Release() for you. Whether you should use them is largely a matter of taste. If your code contains lots of copying of interface pointers, with multiple AddRef() and Release() calls, then smart pointers can probably make your code neater and less error prone. Otherwise, you can do without them. Visual C++ includes a standard Microsoft COM smart pointer, defined in the comdef.h header file (look up _com_ptr_t in the help).
The most common problem with debugging DirectX applications is attempting to debug while a DirectDraw surface is locked. This situation can cause a "Win16 Lock" on Windows 9x systems, which prevents the debugger window from painting. Specifying the DDLOCK_NOSYSLOCK flag when locking the surface usually eliminates this. Windows 2000 does not suffer from this problem. When developing an application, it is also useful to be running with the debugging version of the DirectX run time (selected when you install the SDK), which performs some parameter validation and outputs useful messages to the debugger output.
Use the SUCCEEDED() and FAILED() macros. DirectX methods can return multiple success and failure codes, so a simple ==DD_OK test will not always suffice.
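A small portable sketch shows why an equality test falls short. The names HResult, Succeeded, Failed, kOk, kFalse, and kFail are stand-ins invented for illustration; the real HRESULT type and the SUCCEEDED()/FAILED() macros come from the Windows headers. The sketch assumes only that COM failure codes have the top bit set, so that any non-negative code counts as success.

```cpp
#include <cstdint>

// Stand-in for the Windows HRESULT type (a signed 32-bit value).
using HResult = std::int32_t;

// Illustrative values: 0 matches DD_OK/S_OK, 1 matches S_FALSE,
// and 0x80004005 matches E_FAIL.
constexpr HResult kOk    = 0;
constexpr HResult kFalse = 1;
constexpr HResult kFail  = static_cast<HResult>(0x80004005u);

// Mimic the macros: any non-negative code is a success.
constexpr bool Succeeded(HResult hr) { return hr >= 0; }
constexpr bool Failed(HResult hr)    { return hr < 0; }
```

Here kFalse is a success code, yet it would fail a comparison against kOk, which is the situation the macros are meant to handle.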
You are probably forgetting to correctly initialize a structure you're passing. In particular, ensure that the dwSize field is correctly filled out with the structure size. Also make sure that unused fields are zeroed. It's usually a good idea to do a ZeroMemory() or memset() to clear the structure before you use it. Another useful trick is to write code like:
DDSURFACEDESC2 desc = {sizeof(desc)};
This has the effect of zeroing the structure and setting the first member (dwSize is always the first member) to the size of the structure.
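The behaviour of the brace idiom can be checked with a stand-in structure. ExampleDesc and MakeDesc below are hypothetical names used only for illustration; the sketch assumes a descriptor whose first member is the size field, as with the DirectX structures.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in with the same shape as a DirectX descriptor:
// the size field comes first, followed by other members.
struct ExampleDesc
{
    std::uint32_t dwSize;
    std::uint32_t dwFlags;
    std::uint32_t dwHeight;
    std::uint32_t dwWidth;
};

// Returns a descriptor initialised with the brace idiom.
inline ExampleDesc MakeDesc()
{
    // The first member receives sizeof(desc); members without an
    // explicit initialiser are zero-initialised.
    ExampleDesc desc = { sizeof(desc) };
    assert(desc.dwSize == sizeof(ExampleDesc));
    return desc;
}
```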
Inside DirectX, published by Microsoft Press, covers most of the DirectX components, with the notable exception of Direct3D. A companion book, Inside Direct3D, is expected to be out soon.
Inside COM by Dale Rogerson, published by Microsoft Press, is an excellent introduction to COM.
Lots. However, the ones that are highly recommended are:
Programming Windows 95 by Charles Petzold and Paul Yao (Microsoft Press) and Advanced Windows by Jeffrey Richter (Microsoft Press)
Use the new D3DX utility library. This library takes care of all sorts of initialization problems, and provides a useful set of math functions and simple shapes (spheres, boxes, cylinders, etc.) as well as texture loading and manipulation functions.
DirectDraw is responsible for managing basic display functions; it also acts as a video memory manager and provides access to hardware blitting functionality.
No. If you know you just want to use the primary desktop display device, then it's safe to pass NULL when specifying the device for DirectDrawCreate. However, if you want to take advantage of multimonitor capabilities or use a secondary 3-D device that may be present, you must enumerate the available devices.
To determine the pixel format for the screen, you must have an interface to the primary display surface. You then call the GetPixelFormat() method to return the format description structure. From this, you can work out if the pixel values are indices into a palette, or actual packed colour values. If the pixels are actual colour values, the structure tells you how the bits are packed for each colour channel. Do not assume any particular pixel format (in particular, there are at least two popular ways of bitpacking 16-bit pixels).
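As a rough sketch of how the channel masks might be used once they are known: the helper below packs 8-bit red, green, and blue values into a pixel using three bit masks, such as the dwRBitMask, dwGBitMask, and dwBBitMask members of the pixel format structure. The function names are made up for illustration, and the sketch assumes each mask is non-zero and contiguous.

```cpp
#include <cassert>
#include <cstdint>

// Position of the lowest set bit in a channel mask.
inline int MaskShift(std::uint32_t mask)
{
    assert(mask != 0);
    int shift = 0;
    while ((mask & 1u) == 0) { mask >>= 1; ++shift; }
    return shift;
}

// Number of set bits in a channel mask.
inline int MaskBits(std::uint32_t mask)
{
    int bits = 0;
    for (; mask != 0; mask >>= 1) bits += static_cast<int>(mask & 1u);
    return bits;
}

// Scale an 8-bit channel value into the field described by mask.
inline std::uint32_t PackChannel(std::uint32_t value8, std::uint32_t mask)
{
    assert(value8 <= 255);
    const int bits = MaskBits(mask);
    const std::uint32_t scaled =
        (bits >= 8) ? (value8 << (bits - 8)) : (value8 >> (8 - bits));
    return (scaled << MaskShift(mask)) & mask;
}

// Build a packed pixel from 8-bit r, g, b using the channel masks.
inline std::uint32_t PackRGB(std::uint32_t r8, std::uint32_t g8, std::uint32_t b8,
                             std::uint32_t rMask, std::uint32_t gMask,
                             std::uint32_t bMask)
{
    return PackChannel(r8, rMask) | PackChannel(g8, gMask) | PackChannel(b8, bMask);
}
```

With masks 0xF800/0x07E0/0x001F this produces 5-6-5 pixels, and with 0x7C00/0x03E0/0x001F it produces 5-5-5 pixels, the two common 16-bit layouts.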
First, you must lock the surface, using the Lock() method. The surface description structure will be filled out with a pointer to, and the pitch of, the surface. The pitch of the surface tells you the increment, in bytes, between vertically adjacent pixels. Note that the pitch of a surface is not always equal to the width of the surface. The address of the desired pixel can then be computed. The data you write to the surface must be in the correct pixel format for the surface. The following code snippet shows writing a single pixel to a 16-bit surface:
void WritePixel16( IDirectDrawSurface7* surface ,
int x , int y , WORD colour_value )
{
DDSURFACEDESC2 desc;
memset(&desc,0,sizeof(desc)); desc.dwSize = sizeof(desc);
if (SUCCEEDED(surface->Lock(NULL,&desc,
DDLOCK_NOSYSLOCK|DDLOCK_WAIT,NULL)))
{
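// lpSurface is a void pointer, so it needs a cast in C++.
char* address = (char*)desc.lpSurface;
// The pitch member of DDSURFACEDESC2 is named lPitch.
address += (x*2) + (y*desc.lPitch);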
*((WORD*)address) = colour_value;
surface->Unlock(NULL);
}
}
In practice, of course, you would not lock and unlock the surface each time a pixel is drawn, because locking can be an expensive operation if the surface is in video memory. Rather, you would lock the surface once, draw all the required pixels, and then unlock.
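The addressing arithmetic can be tried out on an ordinary memory buffer. The sketch below is not DirectDraw code: FillRect16 is a hypothetical helper that takes the base pointer and pitch (the values a lock would hand back) and writes a block of 16-bit pixels in one pass, stepping between rows by the pitch rather than by the width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fill a rectangle of 16-bit pixels in a raw buffer whose rows are
// pitchBytes apart. No clipping is performed.
inline void FillRect16(void* base, std::ptrdiff_t pitchBytes,
                       int x, int y, int width, int height,
                       std::uint16_t colour)
{
    assert(base != nullptr && width >= 0 && height >= 0);
    unsigned char* row =
        static_cast<unsigned char*>(base) + y * pitchBytes + x * 2;
    for (int j = 0; j < height; ++j)
    {
        for (int i = 0; i < width; ++i)
        {
            // memcpy avoids alignment and aliasing assumptions.
            std::memcpy(row + i * 2, &colour, sizeof(colour));
        }
        row += pitchBytes;   // step by the pitch, not by width * 2
    }
}
```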
No. Although it is often the case that the address of a surface is unchanged, you should not rely on this behavior. The driver is technically free to return a different address each time a surface is locked, and indeed it may wish to do so for virtual memory management reasons.
No. With the exception of unpacking compressed textures, a DirectDraw blit simply copies, bit-for-bit, the data held in the surface. You cannot use DirectDraw to perform general format conversion. In particular, no palette remapping is performed by a DirectDraw blit. You can use the graphics device interface (GDI) to obtain this functionality. DirectDraw will unpack 'DXTn' compressed textures to any sensible RGB(A) formatted surface.
Yes. With the exception of rotation and mirroring, all blit functions will be emulated when there is no blitting hardware present. You can disable this emulation by specifying the DDCREATE_HARDWAREONLY flag when creating the DirectDraw object (in which case, blits may fail where no hardware blit is available). Alternatively, you can force all blits to be emulated by specifying the DDCREATE_EMULATIONONLY flag.
No. The only operation supported is a blit (although this can be used to draw coloured boxes). Higher level 2-D functionality is available via the GDI.
You are probably not releasing the device context properly after use. This is an easy mistake to make because, though the GetDC() method takes a pointer to an HDC, the ReleaseDC() method takes the HDC itself. For example:
// Get a DC for the surface.
HDC dc;
if (SUCCEEDED(surface->GetDC(&dc)))
{
// Do something with the DC.
TextOut( dc , 0 , 0 , "Hello" , 5 );
// Now release the DC...
surface->ReleaseDC(&dc); // Wrong!! Should be dc, not &dc.
}
Unfortunately, the above (incorrect) code will compile without error. To remedy this, #define the symbol STRICT before including Windows.h, which will enable the compiler to spot type errors like this.
DirectDraw does not currently support alpha blending in blit operations. If 3D hardware is available, then this can be used instead, by drawing alpha blended textured quads to simulate blits. This is also very likely to be the fastest method, and gives opportunity for additional effects like scaling, rotation, and filtering at little cost. If this approach is not suitable, perhaps because no 3-D hardware is available, the only alternative is to perform the alpha blend 'by hand,' locking the surfaces and modifying the data using the CPU. Most fast implementations use tables to speed up the blending operation, although where MMX instructions are available, these can be used to implement fast alpha blending for non-paletted surfaces. The following code snippet demonstrates how to use a lookup table to implement a fixed-weight alpha blend between two 8-bit paletted surfaces.
#include <limits.h> // for INT_MAX
BYTE g_BlendTable[256][256];
// Find nearest match for a given colour in the palette, using a
// squared distance error function.
int FindColour( LPPALETTEENTRY palette , int r , int g , int b )
{
int best = 0;
int best_error = INT_MAX;
for ( int i = 0 ; i < 256 ; i++ )
{
int er,eg,eb;
er = r - (int)palette[i].peRed; er *= er;
eg = g - (int)palette[i].peGreen; eg *= eg;
eb = b - (int)palette[i].peBlue; eb *= eb;
int error = er + eg + eb;
if ( error < best_error )
{
best_error = error;
best = i;
}
}
return best;
}
// Initialise the blend table, given the palette.
// The weight factor is from 0 to 256.
void InitialiseTable( LPPALETTEENTRY palette , int weight )
{
for ( int i = 0 ; i < 256 ; i++ )
{
for ( int j = 0 ; j < 256 ; j++ )
{
// Compute the colour we'd like the blend
// of colour indices i and j to be.
int r,g,b;
r = (palette[i].peRed * weight);
r += (palette[j].peRed * (256-weight));
r /= 256;
g = (palette[i].peGreen * weight);
g += (palette[j].peGreen * (256-weight));
g /= 256;
b = (palette[i].peBlue * weight);
b += (palette[j].peBlue * (256-weight));
b /= 256;
// Find nearest match in our palette.
int index = FindColour( palette , r , g , b );
// Store in table
g_BlendTable[i][j] = index;
}
}
}
// Perform an alpha-blend, with no clipping or stretching. Assumes
// that the table has been initialised, and that both surfaces are
// 8-bit paletted, with the same palette as was used to
// initialise the table. The surfaces can't be the same.
void AlphaBlend( IDirectDrawSurface7* dest ,
IDirectDrawSurface7* src ,
int dest_x , int dest_y ,
int src_x , int src_y ,
int width , int height )
{
DDSURFACEDESC2 descd , descs;
memset(&descd,0,sizeof(descd)); descd.dwSize = sizeof(descd);
memset(&descs,0,sizeof(descs)); descs.dwSize = sizeof(descs);
if (SUCCEEDED(dest->Lock(NULL,&descd,DDLOCK_WAIT,NULL)))
{
if (SUCCEEDED(src->Lock(NULL,&descs,DDLOCK_WAIT,NULL)))
{
BYTE* destptr = (BYTE*) descd.lpSurface;
BYTE* srcptr = (BYTE*) descs.lpSurface;
destptr += dest_x + (dest_y * descd.lPitch);
srcptr += src_x + (src_y * descs.lPitch);
while (--height>=0)
{
BYTE* dd = destptr;
BYTE* ds = srcptr;
int w = width;
while (--w>=0)
{
*dd = g_BlendTable[*dd][*ds];
dd++;
ds++;
}
destptr += descd.lPitch;
srcptr += descs.lPitch;
}
src->Unlock(NULL);
}
dest->Unlock(NULL);
}
}
The same can be done for 16-bit surfaces, by splitting the lookup table into two (the size would be prohibitive otherwise).
With paletted screen modes, a fade can be achieved via a simple manipulation of the palette. For non-paletted modes, the best method is to use the IDirectDrawGammaControl interface (queried from the primary surface) to adjust the colour ramp up or down. These methods also easily allow a fade to a colour other than black. Where gamma controls are not available, the fade needs to be done by manipulation of the pixels on the surface. The fastest mechanism is to use the alpha blending capabilities of 3-D hardware. Where this is not an option, the pixels must be manipulated 'by hand,' either by a table lookup (as with alpha blending) or by simple repeated decrement or division of the colour values.
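For the 'by hand' case, a minimal sketch of fading by division might look like the following. It assumes 16-bit pixels in the 5-6-5 layout (other layouts need different masks and shifts), and the function names are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Halve every channel of a 5-6-5 pixel in one step. The shift moves the
// lowest bit of red into the green field and the lowest bit of green into
// the blue field; the mask 0x7BEF clears those stray bits.
constexpr std::uint16_t HalveBrightness565(std::uint16_t pixel)
{
    return static_cast<std::uint16_t>((pixel >> 1) & 0x7BEF);
}

// Scale every channel of a 5-6-5 pixel by level/256, where level runs
// from 0 (black) to 256 (unchanged).
inline std::uint16_t Fade565(std::uint16_t pixel, unsigned level)
{
    assert(level <= 256);
    const unsigned r = ((pixel >> 11) & 0x1Fu) * level / 256;
    const unsigned g = ((pixel >> 5) & 0x3Fu) * level / 256;
    const unsigned b = (pixel & 0x1Fu) * level / 256;
    return static_cast<std::uint16_t>((r << 11) | (g << 5) | b);
}
```

Applying HalveBrightness565 to every pixel once per frame gives a quick fade to black in a handful of frames; Fade565 allows a finer-grained fade at the cost of more arithmetic per pixel.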
Shutting down DirectDraw incorrectly can cause problems. The most common mistake is to destroy the application window before shutting down DirectDraw. The window handle passed to SetCooperativeLevel() must remain valid until DirectDraw has been shut down. The safest place to shut down DirectDraw is in the processing of the WM_DESTROY message for your window.
Accessing video memory with the CPU is very slow, especially read operations. For this reason, when you are doing a significant amount of direct manipulation of the surface, it is often faster to use a back buffer in system memory and blit it to the front buffer each frame. This is particularly true of alpha blending, or similar operations that require reading from surfaces. On the other hand, blits between video memory surfaces will often be significantly faster than system memory blits due to hardware acceleration. In addition, hardware accelerated blits can occur in parallel with regular processing, again boosting performance. Therefore, you should keep surfaces in video memory whenever possible if you will not need to access them with the CPU.
You can use the GetAvailableVidMem() method to determine the available video memory. However, not all drivers implement this method; those that do not will return zero. Also, alignment restrictions, private data structures and other factors mean that you should never rely on the exact byte count returned. For example, if there are x*y bytes free, it does not necessarily imply that you will be able to create an x by y 8-bit surface. The returned values are best treated as a guideline only.
The alignment rules for surfaces are determined by the driver. A typical restriction is to pad surfaces to multiples of 8 bytes wide. However, the driver is free to impose any restriction it chooses. In particular, some legacy devices have rectangular allocation schemes. Therefore, it is important to determine the pitch of the surface by querying rather than computation.
The GetDeviceIdentifier() method, introduced in version 6.0, returns a structure containing information about the chipset and driver: a unique GUID for the device/driver pair, vendor IDs, and descriptive strings that can be presented to a user.
Direct3D is primarily responsible for providing access to 3-D acceleration hardware, although it does include software rasterization devices. Using Direct3D requires an understanding of DirectDraw, because DirectDraw is used to create and manage surfaces used by Direct3D (e.g., textures).
The 'standard' book on the subject is Computer Graphics: Principles and Practice by Foley, Van Dam et al. and is a valuable resource for anyone wanting to understand the mathematical foundations of geometry, rasterization and lighting techniques. The FAQ for the comp.graphics.algorithms Usenet group also contains useful material.
No. If you are using a hardware device, Direct3D will perform no emulation of missing functionality. You must determine the available functionality by checking capability bits and using the ValidateDevice() method.
Direct3D provides two software rasterization devices. The first is the regular RGB software rasterizer. This has been greatly improved in functionality from the 5.0 version, and now supports bilinear filtering and a full range of alpha blending operations. It also supports 2-stage multitexture with most common operations. It automatically takes advantage of MMX instructions, where available (previously the MMX rasterizer was enumerated as a separate device). The old ramp mode rasterizer is now obsolete and can only be accessed via the old (version 5.0 or earlier) interfaces. There is also a high-quality reference rasterizer. The reference rasterizer (known as refrast) is full featured, supporting eight-stage multitexture, all legal blending operations, anisotropic filtering, stencil buffer, and a wide range of texture formats. It is not high performance, but is a very useful reference to compare against the output of hardware devices when debugging.
DirectX 7 supports hardware accelerated transformation and lighting. In addition, the programming model has been greatly simplified. Lights, materials, and viewports are no longer distinct COM objects, but are set by directly calling methods of IDirect3DDevice7. Textures no longer have a special interface, but use the regular IDirectDrawSurface7 interface with the Load() method moved to IDirect3DDevice7.
Your 3dfx card is a separate DirectDraw device to your main 2-D graphics card. Therefore, to select it, you must enumerate the DirectDraw devices in your system and select the Voodoo. You can then enumerate the 3-D devices associated with that card. One of them will be the Hardware Abstraction Layer (HAL) for the Voodoo.
The following is a checklist of some of the common pitfalls when working with z-buffers:
The driver determines what range of values is expected for w by examining the projection matrix. For this reason the application should set a correct projection matrix when w-buffering, even if the application performs its own transformations.
With DirectX 7, lighting is performed based on the D3DRENDERSTATE_LIGHTING renderstate, rather than the flexible vertex format (FVF) you pass. Because lighting is on by default, you are probably getting all your polygons lit by Direct3D—which, if you didn't specify lights or materials, means everything comes out black. If you don't want Direct3D to perform lighting, you need to explicitly disable it.
The following are key areas to look at when optimising performance:
A vertex buffer is an object that encapsulates an array of vertices. It has similar semantics to a DirectDraw surface; a vertex buffer must be locked while accessing the contents. At creation time, the vertex buffer is placed in system or video memory, and the type of vertex contained is set. If the vertex buffer contains untransformed vertices, it can be transformed and the results placed in a target vertex buffer via the ProcessVertices() method. Using vertex buffers results in several potential benefits:
This procedure is only valid for DirectX 7. Also note that VB locks on DirectX 7 are superfast (~50 cycles), and unlocks are faster, so don't worry about the cost of steps 5 or 7.
Many meshes encountered in real data feature vertices that are shared by multiple polygons. To maximize performance, it is desirable to reduce the duplication in vertices transformed and sent across the bus to the rendering device. It is clear that using simple triangle lists achieves no vertex sharing, and so is the least optimal method. The choice then is between using strips and fans, which imply a specific connectivity relationship between polygons, and using indexed lists. Where the data naturally falls into strips and fans, this is the most appropriate choice, because they minimize the data sent to the driver. However, decomposing meshes into strips and fans often results in a large number of separate pieces, implying a large number of DrawPrimitive calls. For this reason, the most efficient method is usually to use a single DrawIndexedPrimitive call with a triangle list. An additional advantage of using an indexed list is that a benefit can be gained even when consecutive triangles only share a single vertex.
In summary: if your data naturally falls into large strips or fans, then use strips or fans; otherwise use indexed lists.
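To illustrate the vertex sharing that an indexed list provides, the sketch below converts a plain triangle list into a list of unique vertices plus 16-bit indices. It is a simplified illustration, not SDK code: Vertex, IndexedMesh, and BuildIndexedList are hypothetical names, vertices carry a position only, and two vertices are treated as shared only when their coordinates match exactly.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

struct Vertex { float x, y, z; };

struct IndexedMesh
{
    std::vector<Vertex> vertices;         // unique vertices
    std::vector<std::uint16_t> indices;   // three per triangle
};

// Collapse duplicate vertices in a triangle list into an indexed list.
inline IndexedMesh BuildIndexedList(const std::vector<Vertex>& triangleList)
{
    assert(triangleList.size() % 3 == 0);
    IndexedMesh mesh;
    std::map<std::tuple<float, float, float>, std::uint16_t> seen;
    for (const Vertex& v : triangleList)
    {
        const auto key = std::make_tuple(v.x, v.y, v.z);
        auto it = seen.find(key);
        if (it == seen.end())
        {
            // 16-bit indices are assumed, so at most 65536 vertices.
            assert(mesh.vertices.size() < 65536);
            const auto index = static_cast<std::uint16_t>(mesh.vertices.size());
            it = seen.emplace(key, index).first;
            mesh.vertices.push_back(v);
        }
        mesh.indices.push_back(it->second);
    }
    return mesh;
}
```

For a quad made of two triangles, six input vertices collapse to four, so only four vertices need to be transformed instead of six.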
Yes and no. In theory, you should have only one BeginScene()/EndScene() pair per render target per frame, and this rule certainly applies for scene capture cards like the PowerVR. However, for most conventional rendering devices, this restriction is unnecessary, and you will get the expected results from using multiple pairs. In most situations, multiple pairs are unnecessary, and for the sake of scene capture cards should be avoided.
The usual example of this situation is a cube where you want to use a different texture for each face. Unfortunately the answer is no—it is not currently possible to index the vertex components independently. This is sometimes used as an argument in favour of custom transformation code, rather than using the Direct3D pipeline. However, this argument is often spurious for the following reasons. First, the cube example is somewhat contrived, and in more realistic situations with larger polygon count meshes, it is far more common for a vertex shared between polygons to share all components. Second, an independent indexing mechanism might interfere with the smooth flow of data to the driver and/or card, and would likely have to be emulated by extra copy operations, negating a large amount of potential benefit.
Yes. The Direct3D geometry pipeline has several different code paths, depending on the processor type, and will utilize the special floating point operations provided by the 3DNow! or Pentium III SIMD instructions, where these are available.
You can filter out pixels with an alpha value above or below a given threshold. You control this behavior by using the renderstates ALPHATESTENABLE, ALPHAREF and ALPHAFUNC.
A stencil buffer is an additional buffer of per-pixel information, much like a z-buffer. In fact it 'lives' in some of the bits of a z-buffer. Common stencil/z-buffer formats are 15-bit z and 1-bit stencil, or 24-bit z and 8-bit stencil. It is possible to perform simple arithmetic operations on the contents of the stencil buffer on a per-pixel basis as polygons are rendered. For example, the stencil buffer can be incremented or decremented, or the pixel can be rejected if the stencil value fails a simple comparison test. This is useful for effects that involve marking out a region of the frame buffer, and then rendering only in the marked (or unmarked) region. Good examples are volumetric effects like shadow volumes.
The key to this, and other volumetric stencil buffer effects, is the interaction between the stencil buffer and the z-buffer. A scene with a shadow volume is rendered in three stages. First, the scene without the shadow is rendered as usual, using the z-buffer. Next, the shadow is marked out in the stencil buffer as follows. The front faces of the shadow volume are drawn using invisible polygons, with z-testing enabled but z-writes disabled, and the stencil buffer incremented at every pixel passing the z-test. The back faces of the shadow volume are rendered similarly, but decrementing the stencil value instead. Now, consider a single pixel. Assuming the camera is not in the shadow volume, there are four possibilities for the corresponding point in the scene. If the ray from the camera to the point does not intersect the shadow volume, then no shadow polygons will have been drawn there, and the stencil buffer is still zero. Otherwise, if the point lies in front of the shadow volume, the shadow polygons will be z-buffered out and the stencil again remains unchanged. If the point lies behind the shadow volume, then the same number of front shadow faces as back faces will have been rendered and the stencil will be zero, having been incremented as many times as decremented. The final possibility is that the point lies inside the shadow volume. In this case, the back face of the shadow volume will be z-buffered out, but not the front face, so the stencil buffer will be a non-zero value. The end result is that portions of the frame buffer lying in shadow have non-zero stencil value. Finally, to actually render the shadow, the whole scene is washed over with an alpha-blended polygon set to only affect pixels with non-zero stencil value. An example of this technique can be seen in the "Shadow Volume" sample that comes with the DirectX SDK.
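The per-pixel counting argument can be simulated on the CPU. The sketch below is only a model of the reasoning, not rendering code: ShadowFace and StencilAfterShadowPass are hypothetical names, and each shadow volume face is reduced to the depth at which it crosses a single pixel's ray, with smaller depths assumed to be nearer the camera.

```cpp
#include <cassert>
#include <vector>

// One face of a shadow volume as seen along a single pixel's ray.
struct ShadowFace
{
    float depth;        // distance from the camera along the ray
    bool  frontFacing;  // true for front faces, false for back faces
};

// Stencil value for one pixel after the shadow faces are drawn with
// z-testing enabled and z-writes disabled. sceneDepth is the depth
// already stored in the z-buffer by the first pass.
inline int StencilAfterShadowPass(float sceneDepth,
                                  const std::vector<ShadowFace>& faces)
{
    int stencil = 0;
    for (const ShadowFace& face : faces)
    {
        if (face.depth < sceneDepth)              // face passes the z-test
            stencil += face.frontFacing ? 1 : -1; // front: +1, back: -1
    }
    return stencil;
}
```

With a front face at depth 2 and a back face at depth 4, a scene point at depth 3 ends up with a stencil value of 1 (in shadow), while points at depth 1 or depth 5 end up with 0.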
You can defeat the automatic mipmapping performed by some drivers (notably for nVidia hardware) by explicitly specifying a mipmap chain with a depth of 1.
A common 'gotcha' when upgrading to the new interfaces is specifying the vertex type incorrectly. The old DX5 interfaces took a member of the D3DVERTEXTYPE enum, whereas the DX6 interfaces expect a flexible vertex format (FVF) specification. For example, instead of D3DVT_VERTEX, you now need to use D3DFVF_VERTEX.
Unfortunately, the lighting documentation has some errors. In particular, the documentation on how attenuation is computed is wrong. Attenuation is computed according to the following formula, shown in Figure 1:
Figure 1. Formula for computing attenuation
where D is the distance between the light and the vertex, in world units. Note that this distance is not normalized in any way, contrary to what the docs say.
Also note that the range of the light has no effect at all on the attenuation calculation; it is used only in determining whether to consider that light at all.
These changes were made to make the Direct3D lighting model the same as the OpenGL lighting model, which is useful because hardware needs to implement the lighting model.
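The figure itself is not reproduced in this text. Assuming the formula matches the OpenGL attenuation model, as the paragraph above indicates, it would read A = 1 / (a0 + a1*D + a2*D*D), where a0, a1, and a2 are the constant, linear, and quadratic attenuation factors of the light. The sketch below is written under that assumption; the function name is invented, and returning zero beyond the range is simply one way of modelling "the light is not considered at all."

```cpp
#include <cassert>

// Attenuation under the assumed OpenGL-style model:
//   A = 1 / (a0 + a1*D + a2*D*D)
// distance is in world units and is not normalized. The range takes no
// part in the formula; it only decides whether the light counts at all.
inline float Attenuation(float a0, float a1, float a2,
                         float distance, float range)
{
    assert(distance >= 0.0f);
    if (distance > range)
        return 0.0f;   // out of range: light ignored (modelling choice)
    return 1.0f / (a0 + a1 * distance + a2 * distance * distance);
}
```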
This is explained fully in the DirectX 7 documentation (under the article entitled "Directly Mapping Texels to Pixels"). However, the executive summary is that you should bias your texture coordinates by –0.5 of a texel in order to align properly with screen pixels. Most cards now conform properly to the texel alignment rules; however, some older cards or drivers do not. To handle these cases, the best thing to do is contact the hardware vendor in question and request updated drivers or their suggested workaround.
No. Texture coordinate generation and transformation functionality are only available when you are using the Direct3D transformation pipeline.
You probably installed the debug version of the DirectX 7 run time. The debug version of the run time initializes all newly allocated DirectSound buffers with static, in order to help developers catch bugs with uninitialized buffers. You should not assume what the contents of a newly allocated DirectSound buffer will be (in particular, the buffer is not guaranteed to be zeroed out).
This can occur if you fail to free DirectPlay properly on the termination of your application. When the program is finished, all the COM interfaces must be freed by calling the Release() method on each interface. Remember also to uninitialize COM properly by calling CoUninitialize(). This error can also occur if you break into a DirectPlay application in the debugger and then terminate the application without going through the shutdown code. This problem should be fixed with the next original equipment manufacturer (OEM) service release (OSR) of Windows 98.
The most common cause of packet delays is from overloading your bandwidth. A common mistake is to send data on every frame. Many applications can get away with sending data much less frequently than their display frame rate. For example, fast action client-server applications often send data only at a 10Hz rate.
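One simple way to decouple the send rate from the frame rate is a small timer check in the main loop. SendThrottle below is a hypothetical helper, not part of DirectPlay; it assumes the application can supply a millisecond clock value each frame and only decides when a send is due, leaving the actual send call to the caller.

```cpp
#include <cassert>
#include <cstdint>

// Decides when a network update is due, independent of the frame rate.
class SendThrottle
{
public:
    // intervalMs is the minimum time between sends, e.g. 100 for 10Hz.
    explicit SendThrottle(std::uint32_t intervalMs)
        : intervalMs_(intervalMs), nextSendMs_(0)
    {
        assert(intervalMs > 0);
    }

    // Call once per frame with the current time in milliseconds.
    // Returns true when an update should be sent on this frame.
    bool ShouldSend(std::uint32_t nowMs)
    {
        if (nowMs < nextSendMs_)
            return false;
        nextSendMs_ = nowMs + intervalMs_;
        return true;
    }

private:
    std::uint32_t intervalMs_;
    std::uint32_t nextSendMs_;
};
```

At 50 frames per second with a 100 ms interval, only one frame in five triggers a send.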
DirectPlay versions 6.0 and earlier have a known problem with host migration and reliable messaging. It is probably best to use reliable messaging only for setup and initialization. This problem has been fixed for version 6.1a.
The flag (DPCAPS_GUARANTEEDSUPPORTED) indicates the capability of the service provider if the DirectPlay protocol is not used. If you use the DirectPlay protocol, you get guaranteed messaging capability, even if the service provider does not support this functionality.
Certain machines will not see others unless you set the frame type and network number for IPX in the system. Set the frame type to 802.3 and set the network number to 2702 on all the systems you're using IPX on.
What can I do?
We also call these NATs, for Network Address Translation. There is a problem with applications working across Internet Connection Sharing (ICS). ICS gives its clients local IP addresses and hides these addresses from machines outside the ICS. DirectPlay embeds address information in its messages, and ICS does not currently manage this.
Is there anything wrong with sending network packets from multiple threads? We do most of our sending in our main thread. However, in some instances we would like to send packets in our separate receive thread as well. Are we going to see conflicts, deadlocks, increased instability, missed packets, etc. by doing this?
There shouldn't be any penalty for sending from multiple threads. The DirectPlay Protocol takes everything you throw at it and funnels it through a single thread of its own anyway.
DirectPlay should be thread safe, so the workaround shouldn't cause problems.
"Send Completion" means "send complete, on the wire," not "received by other DPlay" and certainly not "pulled out of the queue by the application on the other side" (IDirectPlay::Receive). The application has to do this itself.
Use EnumAddress. Then use a callback, like this:
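// Callback invoked by EnumAddress once for each chunk of address data.
BOOL FAR PASCAL EnumAddressCallback(REFGUID guidDataType,
                                    DWORD dwDataSize,
                                    LPCVOID lpData,
                                    LPVOID lpContext)
{
    // DPAID_INet identifies a chunk holding the IP address as an ANSI string.
    if (DPAID_INet == guidDataType)
    {
        // lpContext is the caller's buffer; it must be large enough for the string.
        lstrcpy((char *) lpContext, (const char *) lpData);
        return FALSE;   // Found the address; stop enumerating.
    }
    return TRUE;        // Not the IP address chunk; keep enumerating.
}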
The IP address ends up in the context passed into EnumAddress. Just pass in a buffer big enough to hold the IP address.
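Because the callback cannot tell how large the caller's buffer is, one defensive option is to pass a small structure as the context instead of a bare buffer. The following is a sketch only: AddressBuffer and CopyAddressString are hypothetical names, not DirectPlay APIs, and the code assumes the address data is an ANSI string of at most dataSize bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical context type: the destination buffer together with its size.
struct AddressBuffer {
    char*       text;      // Caller-owned destination buffer.
    std::size_t capacity;  // Size of the buffer in bytes, including the terminator.
};

// Copies the address string into out->text without overrunning it.
// The result is always null-terminated. Returns true if the whole string
// fit, or false if it had to be truncated.
bool CopyAddressString(const void* data, std::size_t dataSize, AddressBuffer* out) {
    assert(out != nullptr && out->text != nullptr && out->capacity > 0);
    const char* src = static_cast<const char*>(data);

    // Measure the source text without reading past dataSize bytes.
    std::size_t length = 0;
    while (length < dataSize && src[length] != '\0') {
        ++length;
    }

    const bool fits = length < out->capacity;
    const std::size_t toCopy = fits ? length : out->capacity - 1;
    if (toCopy > 0) {
        std::memcpy(out->text, src, toCopy);
    }
    out->text[toCopy] = '\0';
    return fits;
}
```

With this approach, the callback would cast its context pointer to AddressBuffer* and call CopyAddressString with the data pointer and data size it receives, rather than copying the string unconditionally.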
The proper way to prevent this close delay in DirectPlay 7 is to poll the message queue by calling GetMessageQueue( , DPMESSAGEQUEUE_SEND, ) in a loop between CancelMessage and DestroyPlayer until the queue count has actually reached 0. Then call Sleep(1000) to allow system messages to clear.
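The polling loop described above might be structured as in the following sketch. It is not DirectPlay code: WaitForSendQueueToDrain is a hypothetical helper, and its pendingSends parameter stands in for your own wrapper that calls GetMessageQueue with DPMESSAGEQUEUE_SEND and returns the message count. The timeout is an added assumption so that shutdown cannot hang if the queue never empties.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Polls `pendingSends` until it reports 0 or until `timeout` elapses.
// Returns true if the queue drained, or false if the timeout expired first.
// `pendingSends` is a hypothetical stand-in for a wrapper around
// GetMessageQueue(..., DPMESSAGEQUEUE_SEND, ...).
bool WaitForSendQueueToDrain(const std::function<unsigned long()>& pendingSends,
                             std::chrono::milliseconds timeout,
                             std::chrono::milliseconds pollInterval) {
    assert(pendingSends);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        if (pendingSends() == 0) {
            return true;   // Nothing left in the send queue.
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // Give up rather than block shutdown forever.
        }
        std::this_thread::sleep_for(pollInterval);
    }
}
```

You would call this after canceling outstanding messages and before destroying the player, and then allow the additional pause for system messages described above.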
Here is a checklist of suggestions/caveats/gotchas:
I think I did everything right, but I still don't hear anything
Example: If you have a segment that has notes on PChannels 1, 4, 5, 33, and 59, you'll need to at least do the following:
pPerf->AssignPChannelBlock(0, pPort, 1); // for channels 1, 4, and 5
pPerf->AssignPChannelBlock(2, pPort, 2); // for channel 33 (block 2 covers PChannels 32-47)
pPerf->AssignPChannelBlock(3, pPort, 3); // for channel 59
(Channel group is arbitrary, but must be 1 or greater, and not collide with other group assignments.)
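If you want to work out the required blocks programmatically, the following sketch shows one way to do it. It assumes that each block covers 16 consecutive zero-based PChannels (block 0 is PChannels 0 through 15, block 1 is 16 through 31, and so on). BlockAssignment and PlanBlockAssignments are hypothetical names, not DirectMusic APIs, and handing out consecutive channel groups is just one arbitrary choice.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Assumption: each PChannel block covers 16 consecutive zero-based PChannels.
const unsigned long kPChannelsPerBlock = 16;

// Hypothetical record of one AssignPChannelBlock call to make.
struct BlockAssignment {
    unsigned long block;  // Zero-based block number.
    unsigned long group;  // Channel group on the port (must be 1 or greater).
};

// Returns one assignment per distinct block used by `pchannels`, handing out
// consecutive channel groups starting at `firstGroup`.
std::vector<BlockAssignment> PlanBlockAssignments(
        const std::vector<unsigned long>& pchannels, unsigned long firstGroup) {
    assert(firstGroup >= 1);

    // A std::set removes duplicate blocks and keeps them in ascending order.
    std::set<unsigned long> blocks;
    for (unsigned long channel : pchannels) {
        blocks.insert(channel / kPChannelsPerBlock);
    }

    std::vector<BlockAssignment> plan;
    unsigned long group = firstGroup;
    for (unsigned long block : blocks) {
        plan.push_back(BlockAssignment{block, group});
        ++group;
    }
    return plan;
}
```

Each entry in the returned plan corresponds to one AssignPChannelBlock call, with the block number as the first argument and the channel group as the last.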
Here is the workaround:
HRESULT CacheDefaultGMCollection
(
    IDirectMusicLoader* pLoader
)
{
    HRESULT hr = E_FAIL;
    DMUS_OBJECTDESC desc;
    static IDirectMusicCollection* pCollection = NULL;
    if (NULL != pCollection)
    {
        pCollection->Release();
        pCollection = NULL;
    }
    //**********************************************************************
    // Setup DMUS_OBJECTDESC to represent Default GM Collection
    //**********************************************************************
    ZeroMemory(&desc, sizeof(desc));
    desc.dwSize = sizeof(DMUS_OBJECTDESC);
    desc.guidObject = GUID_DefaultGMCollection;
    desc.guidClass = CLSID_DirectMusicCollection;
    desc.dwValidData = (DMUS_OBJ_CLASS | DMUS_OBJ_OBJECT);
    hr = pLoader->GetObject(&desc, IID_IDirectMusicCollection, (void **)&pCollection);
    if ( FAILED(hr) )
    {
        DPF(0, "**** Failed to Load Object [collection]");
    }
    return hr;
}
Simply calling GetObject with the above DMUS_OBJECTDESC re-establishes the Loader's link to the Default GM Collection. Also, you do not need to release the Collection; it will be released when the Loader goes away.
Note: This is one possible implementation of this workaround. A more robust solution could wrap the check of the Collection pointer in a critical section to prevent synchronization problems.
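The synchronization idea can be sketched in portable C++ as follows. This is an illustration under stated assumptions, not DirectMusic code: CachedObjectSlot is a hypothetical class, std::mutex stands in for a Win32 critical section, and the load and release callbacks stand in for the GetObject and Release calls in the workaround.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical holder for a single cached object pointer. The check,
// release, and reload of the pointer all happen under one lock.
class CachedObjectSlot {
public:
    CachedObjectSlot(std::function<void*()> load,
                     std::function<void(void*)> release)
        : load_(std::move(load)), release_(std::move(release)), object_(nullptr) {
        assert(load_ && release_);
    }

    // Releases any previously cached object and then loads a fresh one.
    // The lock keeps two threads from interleaving the release and the load.
    // Returns true if the load produced a non-null object.
    bool Refresh() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (object_ != nullptr) {
            release_(object_);
            object_ = nullptr;
        }
        object_ = load_();
        return object_ != nullptr;
    }

    // No destructor release: in the workaround, the final reference is
    // left for the Loader to release when it goes away.

private:
    std::mutex mutex_;
    std::function<void*()> load_;
    std::function<void(void*)> release_;
    void* object_;
};
```

In the workaround, the static collection pointer and the code that checks, releases, and reloads it would sit where object_ and Refresh do in this sketch.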
Gotchas
[debug]
DMBAND=3
Table 1. CPU usage regarding reverb and sampling rate
Reverb Status | Sampling Rate | CPU Usage and Sound Quality |
Reverb off | 22 kHz | Least CPU |
Reverb on | 22 kHz | Better sounding |
Reverb off | 44.1 kHz | Probably not that useful; 22 kHz with reverb on usually sounds better, but use your own taste |
Reverb on | 44.1 kHz | Best sounding if you are using 44.1-kHz samples |
Or you can give the end user ultimate control via the audio control panel in your game. Of course, if all your samples are 22 kHz, you should run the synth no faster than 22 kHz.
[debug]
DMBAND=-1
This can be used to turn off debug statements and is necessary sometimes because debug statements can affect performance for some DLLs more than others.
If you're working on a typical game application, you probably don't want to use AutoDownload, as it can cause a performance hit when playing back Segments. Instead, manually download with Segment->SetParam( GUID_Download, pIPerformance) to tell the Segment to download the DLS instruments associated with the Segment. This should be called at a convenient time (like a scene change) or you can call it in a separate thread prior to playback. The Band should be placed in a Band Track in your Segment to ensure that this will work properly.
After playing the Segment, call Segment->SetParam( GUID_Unload, pIPerformance) when you're done with the Segment. For this to work, all collections must be referenced properly from within the Band. When you load the Segment, the Band Track reads the name, file name, and GUID for each referenced collection and asks the Loader to load those as well. The easiest way for the Loader to know where to find them is to rely on file names. If you store your data in a resource, then you should call SetObject on each resource chunk first so the Loader will know where to find it.
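One way to keep each manual download paired with its unload is a scope guard whose lifetime matches the scene or level that needs the instruments. The sketch below shows only that pairing. ScopedInstrumentDownload is a hypothetical class, and its two callbacks are placeholders for your own code that makes the SetParam calls with GUID_Download and GUID_Unload.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical scope guard: runs `download` when constructed and runs
// `unload` when destroyed, but only if the download succeeded.
class ScopedInstrumentDownload {
public:
    ScopedInstrumentDownload(std::function<bool()> download,
                             std::function<void()> unload)
        : unload_(std::move(unload)), downloaded_(false) {
        assert(download && unload_);
        downloaded_ = download();  // For example, at a scene change.
    }

    ~ScopedInstrumentDownload() {
        if (downloaded_) {
            unload_();             // Done with the Segment.
        }
    }

    // Non-copyable, so the unload runs exactly once.
    ScopedInstrumentDownload(const ScopedInstrumentDownload&) = delete;
    ScopedInstrumentDownload& operator=(const ScopedInstrumentDownload&) = delete;

    bool IsDownloaded() const { return downloaded_; }

private:
    std::function<void()> unload_;
    bool downloaded_;
};
```

Creating the guard when a scene loads and destroying it when the scene ends keeps the download cost away from the start of playback and avoids leaving instruments downloaded after the Segment is no longer needed.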
When using AutoDownload, if you are using only the instruments from the default collection (GM.DLS) in your primary Segment and the Band in your Secondary Segment references only instruments from a custom collection (replacing GM.DLS), then the instruments from the default collection should be returned automatically when the Secondary Segment playback stops, if the primary Segment is still playing.
If you write a basic Play Segment/MIDI file application, you can use AutoDownload so that you don't have to manage downloading the instruments. However, in a typical game situation, AutoDownload incurs a performance hit if you ever play a Segment more than once. It also causes the instruments to be downloaded right at the start of Segment playback, which produces a spike in CPU usage at that point and a potential delay in the performance. Downloading and unloading repeatedly (which AutoDownload may do) takes time and can degrade performance. If you are concerned about CPU performance in your application, consider turning AutoDownload off.
Relying on AutoDownload can cause other problems: You might also want to turn off AutoDownload if you have a Band in a Secondary Segment (Secondary Segments are played on top of a primary Segment). Otherwise the instruments in the Band may be downloaded automatically when the Secondary Segment starts, changing your instruments. If the Secondary Segment stops playing before the primary Segment stops, AutoDownload will then unload the Band. If this happens, you will not revert to the original instruments, as you may expect. Rather, you may lose sound output entirely because you now have no Bands loaded.
For more information on DirectMusic, please check out the DirectMusic FAQ.