DirectX Frequently Asked Questions

Microsoft Corporation

August 2005

Introduction

This is a collection of Frequently Asked Questions (FAQ) about Microsoft DirectX.

Contents:

General DirectX Development Issues
Direct3D Questions
DirectSound Questions
DirectX Extensions for Alias Maya
XInput Questions

General DirectX Development Issues

Should game developers still be publishing games for Windows 95, Windows 98 or Windows ME?

Not anymore for two reasons: performance and feature set.

If the minimum CPU speed required for your game is 1.2GHz or above (which is more common for high performance titles), then the vast majority of these machines will be running Windows XP. By the time machines with CPU speeds above 1.2GHz were being sold, Windows XP was installed as the default operating system by almost all manufacturers. This means that there are many features found in Windows XP that today's game developers should be taking advantage of including:

Improved multitasking - which results in a better, smoother experience for video, audio and gaming.
More stable video driver model - which allows easier debugging, smoother game play and better performance.
Easier configuration for networking - which allows easier access to multi-player games.
Support for DMA transfers by default from hard drives - which results in smoother, faster loading applications.
Windows error reporting - which results in a more stable OS, drivers and applications.
Unicode support - which greatly simplifies localization issues.
Better security and stability - which results in better consumer experiences.
Better support for modern hardware - most of which no longer uses Windows 98 drivers.
Improved memory management - which results in better stability and security.
Improved NTFS file system - which is more resistant to failure, and has better performance with security features.

Should game developers still be publishing games for Windows 2000?

Not anymore. In addition to the reasons listed in Should game developers still be publishing games for Windows 95, Windows 98 or Windows ME?, Windows 2000 does not have these features:

Windows XP supports advanced processor features such as Hyper-Threading, Multi-Core and x64.
Windows XP supports side-by-side components which significantly reduces application versioning conflicts.
Windows XP supports no-execute memory protection which helps prevent malicious programs and can aid debugging.
Windows XP has improved support for advanced AGP and PCI Express based video cards.
Windows XP supports fast user switching, remote desktop and remote assistance which can help lower product support costs.
Performance tools like Reference (in the DirectX Developer SDK) no longer support Windows 2000.

In short, Windows 2000 was never designed or marketed as a consumer operating system.

I think I have found a driver bug, what do I do?

First, ensure you have checked the results with the Reference Rasterizer. Then check the results with the latest WHQL certified version of the IHVs driver. You can programmatically check the WHQL status using the GetAdapterIdentifier() method on the IDirect3D9 interface passing the D3DENUM_WHQL_LEVEL flag. With a WHQL certified driver issue, send a description of the bug, the output from dxdiag and a repro case to directx@microsoft.com with a note in the subject line "WHQL Driver Bug".

Why do I get so many error messages when I try to compile the samples?

You probably don't have your include path set correctly. Many compilers, including Microsoft Visual C++, include an earlier version of the SDK, so if your include path searches the standard compiler include directories first, you'll get incorrect versions of the header files. To remedy this issue, make sure the include path and library paths are set to search the Microsoft DirectX include and library paths first. See also the dxreadme.txt file in the SDK. If you install the DirectX SDK and you are using Visual C++, the installer can optionally set up the include paths for you.

I get linker errors about multiple or missing symbols for globally unique identifiers (GUIDs), what do I do?

The various GUIDs you use should be defined once and only once. The definition for the GUID will be inserted if you #define the INITGUID symbol before including the DirectX header files. Therefore, you should make sure that this only occurs for one compilation unit. An alternative to this method is to link with the dxguid.lib library, which contains definitions for all of the DirectX GUIDs. If you use this method (which is recommended), then you should never #define the INITGUID symbol.

Can I cast a pointer to a DirectX interface to a lower version number?

No. DirectX interfaces are COM interfaces. This means that there is no requirement for higher numbered interfaces to be derived from corresponding lower numbered ones. Therefore, the only safe way to obtain a different interface to a DirectX object is to use the QueryInterface method of the interface. This method is part of the standard IUnknown interface, from which all COM interfaces must derive.

Can I mix the use of DirectX 9 components and DirectX 8 or earlier components within the same application?

You can freely mix different components of differing version; for example, you could use DirectInput 8 with Direct3D 9 in the same application. However, you generally cannot mix different versions of the same component within the same application; for example, you cannot mix DirectDraw 7 with Direct3D 9 (since these are effectively the same component as DirectDraw has been subsumed into Direct3D as of DirectX 8). There are exceptions, however, such as the use of Direct3D 9 and Direct3D 10 together in the same application, which is allowed.

Can I mix the use of Direct3D 9 and Direct3D 10 within the same application?

Yes, you may use these versions of Direct3D together in the same application.

What do the return values from the Release or AddRef methods mean?

The return value will be the current reference count of the object. However, the COM specification states that you should not rely on this and the value is generally only available for debugging purposes. The values you observe may be unexpected since various other system objects may be holding references to the DirectX objects you create. For this reason, you should not write code that repeatedly calls Release until the reference count is zero, as the object may then be freed even though another component may still be referencing it.

Does it matter in which order I release DirectX interfaces?

It shouldn't matter because COM interfaces are reference counted. However, there are some known bugs with the release order of interfaces in some versions of DirectX. For safety, you are advised to release interfaces in reverse creation order when possible.

What is a smart pointer and should I use it?

A smart pointer is a C++ template class designed to encapsulate pointer functionality. In particular, there are standard smart pointer classes designed to encapsulate COM interface pointers. These pointers automatically perform QueryInterface instead of a cast and they handle AddRef and Release for you. Whether you should use them is largely a matter of taste. If your code contains lots of copying of interface pointers, with multiple AddRefs and Releases, then smart pointers can probably make your code neater and less error prone. Otherwise, you can do without them. Visual C++ includes a standard Microsoft COM smart pointer, defined in the "comdef.h" header file (look up com_ptr_t in the help).

I have trouble debugging my DirectX application, any tips?

The most common problem with debugging DirectX applications is attempting to debug while a DirectDraw surface is locked. This situation can cause a "Win16 Lock" on Microsoft Windows 9x systems, which prevents the debugger window from painting. Specifying the D3DLOCK_NOSYSLOCK flag when locking the surface can usually eliminate this. Windows 2000 does not suffer from this problem. When developing an application, it is useful to be running with the debugging version of the DirectX runtime (selected when you install the SDK), which performs some parameter validation and outputs useful messages to the debugger output.

What's the correct way to check return codes?

Use the SUCCEEDED and FAILED macros. DirectX methods can return multiple success and failure codes, so a simple:

== D3D_OK

or similar test will not always suffice.

How do I disable ALT+TAB and other task switching?

You don't!

Is there a recommended book explaining COM?

Inside COM by Dale Rogerson, published by Microsoft Press, is an excellent introduction to COM. For a more detailed look at COM, the book Essential COM by Don Box, published by Longman, is also highly recommended.

What is managed code?

Managed code is code that has its execution managed by the .NET Framework Common Language Runtime (CLR). It refers to a contract of cooperation between natively executing code and the runtime. This contract specifies that at any point of execution, the runtime may stop an executing CPU and retrieve information specific to the current CPU instruction address. Information that must be query-able generally pertains to runtime state, such as register or stack memory contents.

Before the code is run, the IL is compiled into native executable code. And, since this compilation happens by the managed execution environment (or, more correctly, by a runtime-aware compiler that knows how to target the managed execution environment), the managed execution environment can make guarantees about what the code is going to do. It can insert traps and appropriate garbage collection hooks, exception handling, type safety, array bounds and index checking, and so forth. For example, such a compiler makes sure to lay out stack frames and everything just right so that the garbage collector can run in the background on a separate thread, constantly walking the active call stack, finding all the roots, chasing down all the live objects. In addition because the IL has a notion of type safety the execution engine will maintain the guarantee of type safety eliminating a whole class of programming mistakes that often lead to security holes.

In contrast this to the unmanaged world: Unmanaged executable files are basically a binary image, x86 code, loaded into memory. The program counter gets put there and that's the last the OS knows. There are protections in place around memory management and port I/O and so forth, but the system doesn't actually know what the application is doing. Therefore, it can't make any guarantees about what happens when the application runs.

What books are there about general Windows programming?

Lots. However, the two that are highly recommended are:

Programming Windows by Charles Petzold (Microsoft Press)
Programming Applications for Windows by Jeffrey Richter (Microsoft Press)

How do I debug using the Windows symbol files?

Microsoft publish stripped symbols for all system DLLs (plus a few others). To access them add the following to your symbol path in the project settings inside Visual Studio:

srv*http://msdl.microsoft.com/download/symbols

for caching symbols locally use the following syntax:

srv*c:\cache*http://msdl.microsoft.com/download/symbols

Where c:\cache is a local directory for caching symbol files.

General Direct3D Questions

Where can I find information about 3D graphics techniques?

The standard book on the subject is Computer Graphics: Principles and Practice by Foley, Van Dam et al. It is a valuable resource for anyone wanting to understand the mathematical foundations of geometry, rasterization and lighting techniques. The FAQ for the comp.graphics.algorithms Usenet group also contains useful material.

Does Direct3D emulate functionality not provided by hardware?

It depends. Direct3D has a fully featured software vertex-processing pipeline (including support for custom vertex shaders). However, no emulation is provided for pixel level operations; applications must check the appropriate caps bits and use the ValidateDevice API to determine support.

Is there a software rasterizer included with Direct3D?

Not for performance applications. A reference rasterizer is supplied for driver validation but the implementation is designed for accuracy and not performance. Direct3D does support plug-in software rasterizers.

How can I perform color keying with DirectX graphics?

Color keying is not directly supported, instead you will have to use alpha blending to emulate color keying. The D3DXCreateTextureFromFileEx() function can be used to facilitate this. This function accepts a key color parameter and will replace all pixels from the source image containing the specified color with transparent black pixels in the created texture.

Does the Direct3D geometry code utilize 3DNow! and/or Pentium III SIMD instructions?

Yes. The Direct3D geometry pipeline has several different code paths, depending on the processor type, and it will utilize the special floating-point operations provided by the 3DNow! or Pentium III SIMD instructions where these are available. This includes processing of custom vertex shaders.

How do I prevent transparent pixels being written to the z-buffer?

You can filter out pixels with an alpha value above or below a given threshold. You control this behavior by using the renderstates ALPHATESTENABLE, ALPHAREF and ALPHAFUNC.

What is a stencil buffer?

A stencil buffer is an additional buffer of per-pixel information, much like a z-buffer. In fact, it resides in some of the bits of a z-buffer. Common stencil/z-buffer formats are 15-bit z and 1-bit stencil, or 24-bit z and 8-bit stencil. It is possible to perform simple arithmetic operations on the contents of the stencil buffer on a per-pixel basis as polygons are rendered. For example, the stencil buffer can be incremented or decremented, or the pixel can be rejected if the stencil value fails a simple comparison test. This is useful for effects that involve marking out a region of the frame buffer and then performing rendering only the marked (or unmarked) region. Good examples are volumetric effects like shadow volumes.

How do I use a stencil buffer to render shadow volumes?

The key to this and other volumetric stencil buffer effects, is the interaction between the stencil buffer and the z-buffer. A scene with a shadow volume is rendered in three stages. First, the scene without the shadow is rendered as usual, using the z-buffer. Next, the shadow is marked out in the stencil buffer as follows. The front faces of the shadow volume are drawn using invisible polygons, with z-testing enabled but z-writes disabled and the stencil buffer incremented at every pixel passing the z-test. The back faces of the shadow volume are rendered similarly, but decrementing the stencil value instead.

Now, consider a single pixel. Assuming the camera is not in the shadow volume there are four possibilities for the corresponding point in the scene. If the ray from the camera to the point does not intersect the shadow volume, then no shadow polygons will have been drawn there and the stencil buffer is still zero. Otherwise, if the point lies in front of the shadow volume the shadow polygons will be z-buffered out and the stencil again remains unchanged. If the points lies behind the shadow volume then the same number of front shadow faces as back faces will have been rendered and the stencil will be zero, having been incremented as many times as decremented.

The final possibility is that the point lies inside the shadow volume. In this case the back face of the shadow volume will be z-buffered out, but not the front face, so the stencil buffer will be a non-zero value. The result is portions of the frame buffer lying in shadow have non-zero stencil value. Finally, to actually render the shadow, the whole scene is washed over with an alpha-blended polygon set to only affect pixels with non-zero stencil value. An example of this technique can been seen in the "Shadow Volume" sample that comes with the DirectX SDK.

What are the texel alignment rules? How do I get a one-to-one mapping?

This is explained fully in the Direct3D 9 documentation. However, the executive summary is that you should bias your screen coordinates by -0.5 of a pixel in order to align properly with texels. Most cards now conform properly to the texel alignment rules, however there are some older cards or drivers that do not. To handle these cases, the best advice is to contact the hardware vendor in question and request updated drivers or their suggested workaround. Note that in Direct3D 10, this rule no longer holds.

What is the purpose of the D3DCREATE_PUREDEVICE flag?

Use the D3DCREATE_PUREDEVICE flag during device creation to create a pure device. A pure device does not save the current state (during state changes), which often improves performance; this device also requires hardware vertex processing. A pure device is typically used when development and debugging are completed, and you want to achieve the best performance.

One drawback of a pure device is that it does not support all Get* API calls; this means you can not use a pure device to query the pipeline state. This makes it more difficult to debug while running an application. Below is a list of all the methods that are disabled by a pure device.

ID3D10Device9::GetClipPlane
ID3D10Device9::GetClipStatus
ID3D10Device9::GetLight
ID3D10Device9::GetLightEnable
ID3D10Device9::GetMaterial
ID3D10Device9::GetPixelShaderConstantF
ID3D10Device9::GetPixelShaderConstantI
ID3D10Device9::GetPixelShaderConstantB
ID3D10Device9::GetRenderState
ID3D10Device9::GetSamplerState
ID3D10Device9::GetTextureStageState
ID3D10Device9::GetTransform
ID3D10Device9::GetVertexShaderConstantF
ID3D10Device9::GetVertexShaderConstantI
ID3D10Device9::GetVertexShaderConstantB

A second drawback of a pure device is that it does not filter any redundant state changes. When using a pure device, your application should reduce the number of state changes in the render loop to a minimum; this may include filtering state changes to make sure that states do not get set more than once. This trade-off is application dependent; if you use more than a 1000 Set calls per frame, you should consider taking advantage of the redundancy filtering that is done automatically by a non-pure device.

As with all performance issues, the only way to know whether or not your application will perform better with a pure device is to compare your application's performance with a pure vs. non-pure device. A pure device has the potential to speed up an application by reducing the CPU overhead of the API. But be careful! For some scenarios, a pure device will slow down your application (due to the additional CPU work caused by redundant state changes). If you are not sure which type of device will work best for your application, and you do not filter redundant changes in the application, use a non-pure device.

How do I enumerate the display devices in a multi-monitor system?

Enumeration can be performed through a simple iteration by the application using methods of the IDirect3D9 interface. Call GetAdapterCount to determine the number of display adapters in the system. Call GetAdapterMonitor to determine which physical monitor an adapter is connected to (this method returns an HMONITOR, which you can then use in the Win32 API GetMonitorInfo to determine information about the physical monitor). Determining the characteristics of a particular display adapter or creating a Direct3D device on that adapter is as simple as passing the appropriate adapter number in place of D3DADAPTER_DEFAULT when calling GetDeviceCaps, CreateDevice, or other methods.

What happened to Fixed Function Bumpmapping in D3D9?

As of Direct3D 9 we tightened the validation on cards that could only support > 2 simultaneous textures. Certain older cards only have 3 texture stages available when you use a specific alpha modulate operation. The most common usage that people use the 3 stages for is emboss bumpmapping, and you can still do this with D3D9,

The height field has to be stored in the alpha channel and is used to modulate the lights contribution i.e.:

// Stage 0 is the base texture, with the height map in the alpha channel
m_pd3dDevice->SetTexture(0, m_pEmbossTexture );
m_pd3dDevice->SetTextureStageState(0, D3DTSS_TEXCOORDINDEX, 0 );
m_pd3dDevice->SetTextureStageState(0, D3DTSS_COLOROP,   D3DTOP_MODULATE );
m_pd3dDevice->SetTextureStageState(0, D3DTSS_COLORARG1, D3DTA_TEXTURE );
m_pd3dDevice->SetTextureStageState(0, D3DTSS_COLORARG2, D3DTA_DIFFUSE );
m_pd3dDevice->SetTextureStageState(0, D3DTSS_ALPHAOP,   D3DTOP_SELECTARG1 );
m_pd3dDevice->SetTextureStageState(0, D3DTSS_ALPHAARG1, D3DTA_TEXTURE );
if( m_bShowEmbossMethod )
{
 // Stage 1 passes through the RGB channels (SELECTARG2 = CURRENT), and 
 // does a signed add with the inverted alpha channel. 
 // The texture coords associated with Stage 1 are the shifted ones, so 
 // the result is:
 //    (height - shifted_height) * tex.RGB * diffuse.RGB
   m_pd3dDevice->SetTexture( 1, m_pEmbossTexture );
   m_pd3dDevice->SetTextureStageState( 1, D3DTSS_TEXCOORDINDEX, 1 );
   m_pd3dDevice->SetTextureStageState( 1, D3DTSS_COLOROP, D3DTOP_SELECTARG2 );
   m_pd3dDevice->SetTextureStageState( 1, D3DTSS_COLORARG1, D3DTA_TEXTURE );
   m_pd3dDevice->SetTextureStageState( 1, D3DTSS_COLORARG2, D3DTA_CURRENT );
   m_pd3dDevice->SetTextureStageState( 1, D3DTSS_ALPHAOP, D3DTOP_ADDSIGNED );
   m_pd3dDevice->SetTextureStageState( 1, D3DTSS_ALPHAARG1, D3DTA_TEXTURE|D3DTA_COMPLEMENT );
   m_pd3dDevice->SetTextureStageState( 1, D3DTSS_ALPHAARG2, D3DTA_CURRENT );

   // Set up the alpha blender to multiply the alpha channel 
   // (monochrome emboss) with the src color (lighted texture)
   m_pd3dDevice->SetRenderState( D3DRS_ALPHABLENDENABLE, TRUE );
   m_pd3dDevice->SetRenderState( D3DRS_SRCBLEND,  D3DBLEND_SRCALPHA );
   m_pd3dDevice->SetRenderState( D3DRS_DESTBLEND, D3DBLEND_ZERO );
}

This sample, along with other older samples, are no longer shipped in the current SDK release, and will not be shipped in future SDK releases.

Geometry (Vertex) Processing

Vertex streams confuse me how do they work?

Direct3D assembles each vertex that is fed into the processing portion of the pipeline from one or more vertex streams. Having only one vertex stream corresponds to the old pre-DirectX 8 model, in which vertices come from a single source. With DirectX 8, different vertex components can come from different sources; for example, one vertex buffer could hold positions and normals, while a second held color values and texture coordinates.

What is a vertex shader?

A vertex shader is a procedure for processing a single vertex. It is defined using a simple assembly-like language that is assembled by the D3DX utility library into a token stream that Direct3D accepts. The vertex shader takes as input a single vertex and a set of constant values; it outputs a vertex position (in clip-space) and optionally a set of colors and texture coordinates, which are used in rasterization. Notice that when you have a custom vertex shader, the vertex components no longer have any semantics applied to them by Direct3D and vertices are simply arbitrary data that is interpreted by the vertex shader you create.

Does a vertex shader perform perspective division or clipping?

No. The vertex shader outputs a homogenous coordinate in clip-space for the transformed vertex position. Perspective division and clipping is performed automatically post-shader.

Can I generate geometry with a vertex shader?

A vertex shader cannot create or destroy vertices; it operates on a single vertex at a time, taking one unprocessed vertex as input and outputting a single processed vertex. It can therefore be used to manipulate existing geometry (applying deformations, or performing skinning operations) but cannot actually generate new geometry per se.

Can I apply a custom vertex shader to the results of the fixed-function geometry pipeline (or vice-versa)?

No. You have to choose one or the other. If you are using a custom vertex shader, then you are responsible for performing the entire vertex transformation.

Can I use a custom vertex shader if my hardware does not support it?

Yes. The Direct3D software vertex-processing engine fully supports custom vertex shaders with a surprisingly high level of performance.

How do I determine if the hardware supports my custom vertex shader?

Devices capable of supporting vertex shaders in hardware are required to fill out the D3DCAPS9::VertexShaderVersion field to indicate the version level of vertex shader they support. Any device claiming to support a particular level of vertex shader must support all legal vertex shaders that meet the specification for that level or below.

How many constant registers are available for vertex shaders?

Devices supporting vs 1.0 vertex shaders are required to support a minimum of 96 constant registers. Devices may support more than this minimum number and can report this through the D3DCAPS9::MaxVertexShaderConst field.

Can I share position data between vertices with different texture coordinates?

The usual example of this situation is a cube in which you want to use a different texture for each face. Unfortunately the answer is no, it's not currently possible to index the vertex components independently. Even with multiple vertex streams, all streams are indexed together.

When I submit an indexed list of primitives, does Direct3D process all of the vertices in the buffer, or just the ones I indexed?

When using the software geometry pipeline, Direct3D first transforms all of the vertices in the range you submitted, rather than transforming them "on demand" as they are indexed. For densely packed data (that is, where most of the vertices are used) this is more efficient, particularly when SIMD instructions are available. If your data is sparsely packed (that is, many vertices are not used) then you may want to consider rearranging your data to avoid too many redundant transformations. When using the hardware geometry acceleration, vertices are typically transformed on demand as they are required.

What is an index buffer?

An index buffer is exactly analogous to a vertex buffer, but instead it contains indices for use in DrawIndexedPrimitive calls. It is highly recommended that you use index buffers rather than raw application-allocated memory when possible, for the same reasons as vertex buffers.

I notice that 32-bit indices are a supported type; can I use them on all devices?

No. You must check the D3DCAPS9::MaxVertexIndex field to determine the maximum index value that is supported by the device. This value must be greater than 2 to the 16th power -1 (0xffff) in order for index buffers of type D3DFMT_INDEX32 to be supported. In addition, note that some devices may support 32-bit indices but support a maximum index value less than 2 to the 32nd power -1 (0xffffffff); in this case the application must respect the limit reported by the device.

Does S/W vertex processing support 64 bit?

There is an optimized s/w vertex pipeline for x64, but it does not exist for IA64.

Performance Tuning

How can I improve the performance of my Direct3D application?

The following are key areas to look at when optimizing performance:

Batch size

Direct3D is optimized for large batches of primitives. The more polygons that can be sent in a single call, the better. A good rule of thumb is to aim to average 1000 vertices per primitive call. Below that level you're probably not getting optimal performance, above that and you're into diminishing returns and potential conflicts with concurrency considerations (see below).

State changes

Changing render state can be an expensive operation, particularly when changing texture. For this reason, it is important to minimize as much as possible the number of state changes made per frame. Also, try to minimize changes of vertex or index buffer.

Note As of DirectX 8, the cost of changing vertex buffer is no longer as expensive as it was with previous versions, but it is still good practice to avoid vertex buffer changes where possible.

Concurrency

If you can arrange to perform rendering concurrently with other processing, then you will be taking full advantage of system performance. This goal can conflict with the goal of reducing renderstate changes. You need to strike a balance between batching to reduce state changes and pushing data out to the driver early to help achieve concurrency. Using multiple vertex buffers in round-robin fashion can help with concurrency.

Texture uploads

Uploading textures to the device consumes bandwidth and causes a bandwidth competition with vertex data. Therefore, it is important to not to over commit texture memory, which would force your caching scheme to upload excessive quantities of textures each frame.

Vertex and index buffers

You should always use vertex and index buffers, rather than plain blocks of application allocated memory. At a minimum, the locking semantics for vertex and index buffers can avoid a redundant copy operation. With some drivers, the vertex or index buffer may be placed in more optimal memory (perhaps in video or AGP memory) for access by the hardware.

State macro blocks

These were introduced in DirectX 7.0. They provide a mechanism for recording a series of state changes (including lighting, material and matrix changes) into a macro, which can then be replayed by a single call. This has two advantages:

You reduce the call overhead by making one call instead of many.
An aware driver can pre-parse and pre-compile the state changes, making it much faster to submit to the graphics hardware.

State changes can still be expensive, but using state macros can help reduce at least some of the cost. Use only a single Direct3D device. If you need to render to multiple targets, use SetRenderTarget. If you are creating a windowed application with multiple 3D windows, use the CreateAdditionalSwapChain API. The runtime is optimized for a single device and there is a considerable speed penalty for using multiple devices.

Which primitive types (strips, fans, lists and so on) should I use?

Many meshes encountered in real data feature vertices that are shared by multiple polygons. To maximize performance it is desirable to reduce the duplication in vertices transformed and sent across the bus to the rendering device. It is clear that using simple triangle lists achieves no vertex sharing, making it the least optimal method. The choice is then between using strips and fans, which imply a specific connectivity relationship between polygons and using indexed lists. Where the data naturally falls into strips and fans, these are the most appropriate choice, since they minimize the data sent to the driver. However, decomposing meshes into strips and fans often results in a large number of separate pieces, implying a large number of DrawPrimitive calls. For this reason, the most efficient method is usually to use a single DrawIndexedPrimitive call with a triangle list. An additional advantage of using an indexed list is that a benefit can be gained even when consecutive triangles only share a single vertex. In summary, if your data naturally falls into large strips or fans, use strips or fans; otherwise use indexed lists.

How do you determine the total texture memory a card has, excluding AGP memory?

IDirect3DDevice9::GetAvailableTextureMem() returns the total available memory, including AGP. Allocating resources based on an assumption of how much video memory you have is not a great idea. For example, what if the card is running under a Unified Memory Architecture (UMA) or is able to compress the textures? There might be more space available than you might have thought. You should create resources and check for 'out of memory' errors, then scale back on the textures. For example, you could remove the top mip-levels of your textures.

What's a good usage pattern for vertex buffers if I'm generating dynamic data?

Create a vertex buffer using the D3DUSAGE_DYNAMIC and D3DUSAGE_WRITEONLY usage flags and the D3DPOOL_DEFAULT pool flag. (Also specify D3DUSAGE_SOFTWAREPROCESSING if you are using software vertex processing.)
I = 0.
Set state (textures, renderstates and so on).
Check if there is space in the buffer, that is, for example, I + M <= N? (Where M is the number of new vertices).
If yes, then Lock the VB with D3DLOCK_NOOVERWRITE. This tells Direct3D and the driver that you will be adding vertices and won't be modifying the ones that you previously batched. Therefore, if a DMA operation was in progress, it isn't interrupted. If no, goto 11.
Fill in the M vertices at I.
Unlock.
Call Draw[Indexed]Primitive. For non-indexed primitives use I as the StartVertex parameter. For indexed primitives, ensure the indices point to the correct portion of the vertex buffer (it may be easiest to use the BaseVertexIndex parameter of the SetIndices call to achieve this).
I += M.
Goto 3.
Ok, so we are out of space, so let us start with a new VB. We don't want to use the same one because there might be a DMA operation in progress. We communicate to this to Direct3D and the driver by locking the same VB with the D3DLOCK_DISCARD flag. What this means is "you can give me a new pointer because I am done with the old one and don't really care about the old contents any more."
I = 0.
Goto 4 (or 6).

Why do I have to specify more information in the D3DVERTEXELEMENT9 structure?

As of Direct3D 9, the vertex stream declaration is no longer just a DWORD array, it is now an array of D3DVERTEXELEMENT9 structures. The runtime makes use of the additional semantic and usage information to bind the contents of vertex streams to vertex shaders input registers/variables. For Direct3D 9, vertex declarations are decoupled from vertex shaders, which makes it easier to use shaders with geometries of different formats as the runtime only binds the data that the shader needs.

The new vertex declarations can be used with either the fixed function pipeline or with shaders. For the fixed function pipeline, there is no need to call SetVertexShader. If however, you want to switch to the fixed function pipeline and have previously used a vertex shader, call SetVertexShader(NULL). When this is done, you will still need to call SetFVF to declare the FVF code.

When using vertex shaders, call SetVertexShader with the vertex shader object. Additionally, call SetFVF to set up a vertex declaration. This uses the information implicit in the FVF. SetVertexDeclaration can be called in place of SetFVF because it supports vertex declarations that cannot be expressed with an FVF.

D3DX Utility Library

What file formats are supported by the D3DX image file loader functions?

The D3DX image file loader functions support BMP, TGA, JPG, DIB, PPM and DDS files.

The text rendering functions in D3DX don't seem to work, what am I doing wrong?

A common mistake when using the ID3DXFont::DrawText functions is to specify a zero alpha component for the color parameter; resulting in completely transparent (that is, invisible) text. For fully opaque text, ensure that the alpha component of the color parameter is fully saturated (255).

How can I save the contents of a surface or texture to a file?

The DirectX 8.1 SDK added two functions to the D3DX library specifically for this purpose: D3DXSaveSurfaceToFile() and D3DXSaveTextureToFile(). These functions support saving an image to file in either BMP or DDS format. In previous versions you will have to lock the surface and read the image data, then write it to a bitmap file. An article on writing a function to store bitmaps can be found at Windows GDI: Storing an Image.

Alternatively, GDI+ could be used to save the image in a wide variety of formats, though this requires additional support files to be distributed with your application.

How can I make use of the High Level Shader Language (HLSL) in my game?

There are three ways that the High Level Shading Language can be incorporated into your game engine:

Compile your shader source into vertex or pixel shading assembly (using the command line utility fxc.exe) and use D3DXAssembleShader() at run time. This way even a DirectX 8 game can even take advantage of the power of the HLSL.
Use D3DXCompileShader() to compile your shader source into token stream and constant table form. At run time load the token stream and constant table and call CreateVertexShader() or CreatePixelShader() on the device to create your shaders.
The easiest way to get up and running is to take advantage of the D3DX Effects system by calling D3DXCreateEffectFromFile() or D3DXCreateEffectFromResource() with your effect file.

What is the correct way to get shaders from an Effect?

Use D3DXCreateEffect to create an ID3DXEffect and then use GetPassDesc to retrieve a D3DXPASS_DESC. This structure contains pointers to vertex and pixel shaders.

Do not use ID3DXEffectCompiler::GetPassDesc. Vertex and pixel shader handles returned from this method are NULL.

What is the HLSL noise() intrinsic for?

The noise intrinsic function generates perlin noise as defined by Ken Perlin. The HLSL function can currently only be used to fill textures in texture shaders as current h/w does not support the method natively. Texture shaders are used in conjuction with the D3DXFill*Texture() functions which are useful helper functions to generate procedurally defined textures during load time.

How do I detect whether to use pixel shader model 2.0 or 2.a?

You can use the D3DXGetPixelShaderProfile() and D3DXGetPixelShaderProfile() functions which return a string determining what HLSL profile is best suited to the device being ran.

How do I access the Parameters in my Precompiled Effects Shaders?

Through the ID3DXConstantTable interface which is used to access the constant table. This table contains the variables that are used by high-level language shaders and effects.

Is there a way to add user data to an effect or other resource?

Yes, to set private data you call SetPrivateData (pReal is the D3D texture object, pSpoof is the wrapped texture object).

hr = pReal->SetPrivateData(IID_Spoof, &pSpoof, 
            sizeof(IDirect3DResource9*), 0)));

To look up the wrapped pointer:

    IDirect3DResource9* pSpoof;
    DWORD dwSize = sizeof(pSpoof);
    hr = pReal->GetPrivateData(IID_Spoof, (void*) &pSpoof, &dwSize);

Why does rendering of an ID3DXMesh object slow down significantly after I define subsets?

You probably have not optimized the mesh after defining the face attributes. If you specify attributes and then call ID3DXMesh::DrawSubset(), this method must perform a search of the mesh for all faces containing the requested attributes. In addition, the rendered faces are likely in a random access pattern, thus not utilizing vertex cache. After defining the face attributes for your subsets, call the ID3DXMesh::Optimize or ID3DXMesh::OptimizeInPlace methods and specifying an optimization method of D3DXMESHOPT_ATTRSORT or stronger. Note that for optimum performance you should optimize with the D3DXMESHOPT_VERTEXCACHE flag, which will also reorder vertices for optimum vertex cache utilization. The adjacency array generated for a D3DX Mesh has three entries per face, but some faces may not have adjacent faces on all three edges. How is this encoded? Entries where there are no adjacent faces are encoded as 0xffffffff.

I've heard a lot about Pre-computed Radiance Transfer (PRT), where can I learn more?

PRT is a new feature of D3DX added in the Summer 2003 SDK Update. It enables rendering of complex lighting scenarios such as global -llumination, soft shadowing and sub-surface scatter in real time. The SDK contains documentation and samples of how to integrate the technology into your game. The PRT Demo Sample and LocalDeformablePRT Sample samples demonstrate how to use the simulator for per vertex and per pixel lighting scenarios respectively. Further information about this and other topics can also be found at Peter Pike Sloan's Web page.

How can I render to a texture and make use of Anti Aliasing?

Create a multisampled render target using Direct3DDevice9::CreateRenderTarget. After rendering the scene to that render target, StretchRect from it to a render target texture. If you make any changed to the offscreen textre (such as blurring or blooming it), copy it back to the back buffer before you present().

DirectSound Questions

Why do I get a burst of static when my application starts up? I notice this problem with other applications too.

You probably installed the debug DirectX runtime. The debug version of the runtime fills buffers with static in order to help developers catch bugs with uninitialized buffers. You cannot guarantee the contents of a DirectSound buffer after creation; in particular, you cannot assume that a buffer with be zeroed out.

Why I am experiencing a delay in between changing an effects parameters and hearing the results?

Changes in effect parameters do not always take place immediately on DirectX 8. For efficiency, DirectSound processes 100 milliseconds of sound data in a buffer, starting at the play cursor, before the buffer is played. This preprocessing happens after all of the following calls:

IDirectSoundBuffer8::SetCurrentPosition
IDirectSoundBuffer8::SetFX
IDirectSoundBuffer8::Stop
IDirectSoundBuffer8::Unlock

As of DirectX 9, a new FX processing algorithm that processes effects just-in-time addresses this problem and has reduced the latency. The algorithm has been added to the IDirectSoundBuffer8::Play() call, along with an additional thread that processes effects just ahead of the write cursor. So you can set parameters at any time and they'll work as expected. However, note that on a playing buffer there'll be a small delay (usually 100ms) before you hear the parameter change, because the audio between the play and write cursors (and a bit more padding) has already been processed at that time.

Is it possible to have a hardware midi synth play back in 3D?

Unfortunately not as there are no DirectMusic hardware synths that support 3D positioning. There are also no DirectMusic hardware synths that support AudioPaths (which is how you get 3D). If you use hardware synths you are limited to DirectX 7-era DMusic functionality.

How do I detect if DSound is installed?

If you do not need to use DirectSoundEnumerate() to list the available DSound devices, don't link your application with dsound.lib and instead use it via COMs CoCreateInstance(CLSID_DirectSound...) then initialize the DSound object using Initialize(NULL). If you need to use DirectSoundEnumerate(), you can dynamically load dsound.dll using LoadLibrary("dsound.dll"); and access its methods using GetProcAddress("DirectSoundEnumerateA/W") and GetProcAddress("DirectSoundCreateA/W") and so on.

How do I create multichannel audio with WAVEFORMATEXTENSIBLE?

If you can't find an answer to your question in the DirectSound help files, there is a good article with more information available at Multiple Channel Audio Data and WAVE Files.

How can I use the DirectSound Voice Manager with property sets like EAX?

In DirectSound 9.0 when you duplicate a buffer it is now possible to get the IDirectSoundBuffer8 interface on the duplicate buffer, which will give you access to the AcquireResources method. This will allow you to associate a buffer with the DSBCAPS_LOCDEFER flag with a hardware resource. You can then set your EAX parameters on this buffer before having to call Play().

I am having problems with unreliable behavior when using cursor position notifications. How can I get more accurate information?

There are some subtle bugs in various versions of DirectSound, the core Windows audio stack, and audio drivers which make cursor positions notifications unreliable. Unless you're targeting a known HW/SW configuration on which you know that notifications are well-behaved, avoid cursor position notifications. For position tracking GetCurrentPosition() is a safer technique.

I am suffering from performance degradation when using GetCurrentPosition(). What can I do to improve performance?

Each GetCurrentPosition() call on each buffer causes a system call, and system calls should be minimized as they are a large component of DSound's CPU footprint. On NT (Win2K and XP) the cursors in SW buffers (and HW buffers on some devices) move in 10ms increments, so calling GetCurrentPosition() every 10ms is ideal. Calling it more often than every 5ms will cause some performance degradation.

My DirectSound application is taking up too much CPU time or is performing slowly. Is there anything I can do to optimize my code?

There are several things you can do to improve the performance of your audio code:

Don't call GetCurrentPosition too often. Each GetCurrentPosition() call on each buffer causes a system call, and system calls should be minimized as they are a large component of DSound's CPU footprint. On NT (Win2K and XP) the cursors in SW buffers (and HW buffers on some devices) move in 10ms increments, so calling GetCurrentPosition() every 10ms is ideal. Calling it more often than every 5ms will cause some perf degradation.
Utilize a separate, lower frame-rate for audio. Nowadays many Windows games can exceed 100 Frames per Second and it is not necessary in most cases to update your 3D audio parameters at the same frame rate. Processing your audio every second or third graphics frame, or every 30ms or so, can reduce the number of audio calls significantly throughout your application without reducing audio quality.
Use DS3D_DEFERRED for 3D objects. Most sound cards respond immediately to parameter changes and in a single frame much can change, especially if you change the position or orientation of the listener. This causes the soundcard / CPU to perform many unnecessary calculations, so another quick and universal optimization is to defer some parameter changes and commit them at the end of the frame.
or at least use SetAllParameters rather than individual Set3DParamX calls on buffers.

Similarly, you should use at least use SetAllParamenters calls on 3D buffers rather the individual Set3DParamX calls. Just try to minimize system calls whenever possible.
Don't make redundant calls; store and sort a list of play calls. Often, in one audio update frame, there are 2 requests to play new sounds. If the requests are processed as they arrive, then the first new sound could be started and then immediately replaced the second requested sound. This results in redundant calculations, an unnecessary play call, and an unnecessary stop call. It is better to store a list of requests for new sounds to be played, so that the list can be sorted, and only those voices that should start playing, are actually ever played.
Also, you should store local copies of the 3D and EAX parameters for each sound source. If a request is made to set a parameter to a particular value, you can check to see if the value is actually different from the last value set. If it isn't, the call does not need to be made.

Although the sound card driver will probably detect this scenario and not perform the (same) calculation again, the audio call will have to reach the audio driver (via a ring transition) and this is already a slow operation.

When I stream a buffer it tends to glitch and perform poorly. What's the best way to stream a buffer?

When streaming audio into a buffer there are two basic algorithms: After-Write-Cursor (AWC) and Before-Play-Cursor (BPC). AWC minimizes latency at the cost of glitching, whereas BPC is the opposite. Because there are usually no interactive changes to the streamed sound this sort of latency is rarely a problem for games and similar applications, so BPC is the more appropriate algorithm. In AWC, each time your streaming thread runs you "top up" the data in your looping buffers up to N ms beyond their write cursors (typically N=40 or so, to allow for Windows scheduling jitter). In BPC, you always write as much data to the buffers as possible, filling them right up to their play cursors (or perhaps 32 bytes before to allow for drivers that incorrectly report their play cursor progress).

Use BPC to mimimize glitching, and use buffers 100ms or larger even if your games doesn't glitch on your test hardware, it will glitch on some machine out there.

I am playing the same sounds over and over very often and very quickly and sometimes they don't play properly, or the Play() call takes a long time. What should I do?

Startup latency (which is different from streaming latency mentioned above) can be an issue in the case of some hardware (the Play() call just takes a long time sometimes on certain sound cards). If you really want to reduce this latency, for twitch sounds (gun shots, footsteps, and so on.) a handy trick is to keep some buffers always looping and playing silence. When you need to play a twitch sound, pick a free buffer, see where its write cursor is, and put the sound into the buffer just beyond the write cursor. Some soundcards fail QuerySupport for deferred properties that I know they support. Is there a workaround? You could just QuerySupport for the non-deferred versions of the properties and use deferred settings anyway. The most recent soundcard drivers may also fix this issue.

How do I encode WAV files into WMA?

Refer to the documentation on the Windows Media Encoder at: Windows Media Encoder 9 Series.

How do I decode MP3 files with DirectSound?

DirectSound does not natively support MP3 decoding. You can decode the files in advance yourself (using an ACM codec of a DirectShow filter), or else just use DirectShow itself, which can do the decode for you; you can then copy the resulting PCM audio data into your DirectSound buffers.

DirectX Extensions for Alias Maya

Why aren't my NURBS showing up?

NURBS are not supported. You can convert them to polygon meshes.

Why aren't my SUBDs showing up?

SUBDs are not supported. You can convert them to polygon meshes.

Why does my animation in the X file look different than the animation in the preview window?

The preview window is not animating in the strictest sense of the matter. It is not playing animation but instead is synchronizing to the most current state of Maya's scene. When animation is exported the matrices at each transform are decomposed into scale, rotation (quaternion), and translation components (often referred to as SRTs). SRTs are more desirable than matrices because they interpolate well, provide a more compact form of the data, and can be compressed independently. Not all matrices can break down into SRTs. If they cannot decompose, the resulting SRTs will be unknown, so small errors in animation may be detected. The two features in Maya that most often cause problems during decomposition are shears and off-center rotations or scales. If you are encountering this problem, because you are using off-center rotations or scales, consider adding additional transforms increasing your level of hierarchy.

Where D3DX animation supports SRTs, it looks like this:

[S]x[R]x[T]

Maya's matrices are much more complicated and require a significant amount of additional process, which looks like this:

[SpInv]x[S]x[Sh]x[Sp]x[St]x[RpInv]x[Ro]x[R]x[Rp]x[Rt]x[T]

I skinned my mesh with RigidSkin but the mesh (or portion) isn't moving. Why?

Maya's Rigid Skin is not supported at this time. Please use Smooth Skin.

Where has all of my IK gone in the X-file?

X-files do not support IK. Instead, the IK solutions are baked into the frames stored in the X-file.

Why do none of my materials colors show up except DirectXShaders?

The DirectX Extensions for Maya currently only support DirectXShader materials for preview and export. In a future version other materials may be supported.

XInput Questions

Can I use DirectInput to read the triggers?

Yes, but they act as the same axis. So you can not read the triggers independently with DirectInput. Using XInput, the triggers return seperate values.

For more information on why DirectInput interprets the triggers as one axis, see Using the Xbox 360 Controller with DirectInput.

How many controllers does XInput support?

XInput supports 4 controllers plugged in at a time.

Does XInput support non-common controllers?

No, it does not.

Are common controllers available through DirectInput?

Yes, you may access common controllers through DirectInput.

How do I get force feedback on the common controllers?

Use the XInputSetState function.

Why does my default audio device change?

When connecting the headset, the controller's headset acts as a standard USB audio device, so when it is connected, Windows automatically changes to use this USB audio device as the default. Since the user likely does not want all audio to go through the headset, they will need to manually adjust it back to the original setting.

How do I control the lights on the controller?

The lights on the controller are predetermined by the operating system and can't be changed.

How do I access the Xbox 360 button in my applications?

Sorry, this button is reserved for future use.

Where do I get drivers?

The drivers will be available via Windows Update, or through windowsgaming.com.

How is controller ID determined?

At XInput startup, the ID is determined non-deterministically by the XInput engine and the controllers that are plugged in. If controllers are plugged in while an XInput application is running, the system will assign the new controller the lowest available number. If a controller is disconnected, its number will be made available again.

How do I get the audio devices for the controller?

Use the XInputGetDSoundAudioDeviceGuids function. See the AudioController sample for details.

What should I do when a controller is unplugged?

If the controller was in use by a player, you should pause the game until the controller is reconnected and the player presses a button to signal that they are ready to unpause.