Mapping DirectX VA to IAMVideoAccelerator
Restricted Mode Profile and Configuration Establishment
Due to the variety of types of data that can be decoded by DirectX VA, and the multiple decoding configurations supported within DirectX VA for each of these types of data (e.g., using bitstream buffers vs. host residual difference decoding vs. accelerator-based IDCT with and without encryption of each relevant type of buffer, etc.), we believe it would be somewhat ungainly to simply specify a unique GUID for every unique data type and decoding configuration. This would create a large number of GUIDs (e.g., hypothetically, if there were 16 profiles of DirectX VA and 16 configurations possible for each, there would need to be 256 defined GUIDs, requiring 4 KB of memory just to hold them all). This issue is the most difficult part of determining how to map DirectX VA into IAMVideoAccelerator, with the remainder of the operational definition mostly being quite straightforward. As a result, we specify a unique GUID only for each type of data (i.e., for each restricted mode profile) and allow an additional GUID to be associated with each type of encryption. The decoding configuration is then established between the decoder and accelerator by a lower-level subordinate negotiation using probing and locking operations to establish configurations for each type of DirectX VA function.
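The arithmetic behind this design choice can be checked with a short sketch. The helper names are hypothetical, and the count of encryption types used in the comparison is an illustrative assumption rather than a figure from this specification; the only fixed fact relied on is that a GUID is a 128-bit (16-byte) value.

```cpp
#include <cassert>
#include <cstddef>

// A GUID is a 128-bit value, i.e. 16 bytes.
constexpr std::size_t kGuidBytes = 16;

// One GUID per (profile, configuration) pair: the approach the text rejects.
constexpr std::size_t guidsPerCombination(std::size_t profiles, std::size_t configs) {
    return profiles * configs;
}

// One GUID per profile plus one per encryption type: the approach adopted.
// The configuration itself is negotiated separately and needs no GUID.
constexpr std::size_t guidsPerProfile(std::size_t profiles, std::size_t encryptionTypes) {
    return profiles + encryptionTypes;
}

// Memory needed just to hold a list of GUIDs.
constexpr std::size_t guidTableBytes(std::size_t guidCount) {
    return guidCount * kGuidBytes;
}

// The hypothetical case from the text: 16 profiles x 16 configurations.
static_assert(guidTableBytes(guidsPerCombination(16, 16)) == 4096,
              "256 GUIDs occupy 4 KB");
```

With the same 16 profiles and, say, two encryption types (an assumed number), the adopted scheme needs 18 GUIDs, or 288 bytes.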
DirectX VA IAMVideoAccelerator Operational Specification
The precise mechanism of operation is as follows:
- Each restricted mode profile defined herein has an associated DirectX VA GUID which can be supported by a downstream input pin's IPin::QueryAccept and IPin::ReceiveConnection and listed in IAMVideoAccelerator::GetVideoAcceleratorGUIDs.
- Similarly, each encryption protocol type for use with DirectX VA shall have an associated encryption protocol type GUID which can be supported by a downstream input pin's IPin::QueryAccept and IPin::ReceiveConnection and listed in IAMVideoAccelerator::GetVideoAcceleratorGUIDs. The "no encryption" GUID DXVA_NoEncrypt shall not be sent in this list, as support for it is required and therefore implicit.
- After calling IPin::ReceiveConnection to attempt a connection to the downstream input pin, the decoder's IAMVideoAcceleratorNotify::GetCreateVideoAcceleratorData shall return a pointer to a DXVA_ConnectMode data structure containing the connection mode information for the connection. IAMVideoAccelerator::GetCompBufferInfo shall be called with *pdwNumTypesCompBuffers = 16 and shall return compressed buffer information based on the convention that the type number of each buffer as defined in Section 3.4 can be used directly as the zero-based index into the array of AMVACompBufferInfo data structures that is returned. This requires that for any buffer types that will not be used (including buffer type 0, since there is no defined use of that buffer type), the accelerator driver will provide AMVACompBufferInfo data structures with some form of "dummy" parameter values (e.g., dwNumCompBuffers=0, dwWidthToCreate=0, dwHeightToCreate=0, and dwBytesToAllocate=0).
- DXVA function indications and associated data buffers are sent using IAMVideoAccelerator::Execute. The DXVA function is indicated in the dwFunction parameter of the call. The only DXVA functions that are relevant for initialization are DXVA_ConfigQueryOrReplyFunc and DXVA_EncryptProtocolFunc.
- IAMVideoAccelerator::BeginFrame shall be called prior to sending any bDXVA_Func with compressed buffer parameters which cause writes to an uncompressed destination surface. The purpose of IAMVideoAccelerator::BeginFrame in DirectX VA is to associate destination surfaces with index values and to notify the video accelerator driver of the intent to initiate writes to a surface so that the driver can respond with an indication of whether the surface is ready to be overwritten. The AMVABeginFrameInfo structure passed in IAMVideoAccelerator::BeginFrame shall contain a pInputData pointer to a single WORD wBeginPictureIndex parameter matching the frame index passed into IAMVideoAccelerator::BeginFrame (and dwSizeInputData shall be 2). This is the index to be used in a compressed buffer to command a write to the surface (i.e., to be used as wDecodedPictureIndex, wDeblockedPictureIndex, wBlendedDestinationIndex, or wPicResampleDestPicIndex). Each call to IAMVideoAccelerator::BeginFrame shall be paired with a corresponding call to IAMVideoAccelerator::EndFrame as described below. For example, if a compressed picture is to be decoded and then alpha blended using front-end buffer-to-buffer blending with a graphic image, there would be a call to IAMVideoAccelerator::BeginFrame prior to decoding the compressed picture into a surface specified in wDecodedPictureIndex, then a call to IAMVideoAccelerator::EndFrame after passing all compressed buffers used to decode the picture, then a second call to IAMVideoAccelerator::BeginFrame prior to commanding alpha blending combination of the graphic source with the decoded picture into a surface specified in wBlendedDestinationIndex, and then a second call to IAMVideoAccelerator::EndFrame after the alpha blend combination operation. The pointer pOutputData in AMVABeginFrameInfo shall be NULL (and dwSizeOutputData shall be "0"). The HRESULT that is returned by IAMVideoAccelerator::BeginFrame shall be:
- S_OK if the uncompressed surface is available and ready for use.
- E_PENDING if the uncompressed surface is not yet available for use but will become available soon (i.e., if the uncompressed surface is being read for display and the reading/display of the surface has not yet been completed).
- E_FAIL or E_INVALIDARG or some other error indication only if a data format or protocol error is detected (such as an incorrect value of dwSizeInputData or a non-NULL pOutputData).
- DXVA function indications and associated data buffers are sent using IAMVideoAccelerator::Execute. More than one bDXVA_Func value may be indicated in the same call to IAMVideoAccelerator::Execute. The bDXVA_Func values shall be packed into the dwFunction parameter of the call, with the first function command in the eight MSBs, the next command in the next eight bits, etc., and with any remaining bits padded out with zeros. The value 0xFF for bDXVA_Func indicates that the bDXVA_Func is extended to two or four bytes. If the second byte is also 0xFF, this indicates that bDXVA_Func is extended to four bytes. If the upper four bits of the third byte are 0xF or 0x0, this indicates that bDXVA_Func contains a DXVA_ConfigQueryOrReplyFunc or DXVA_EncryptProtocolFunc. Multi-byte commands shall not indicate continuation past the end of dwFunction. Care must be taken by the decoder to ensure that no sequential dependencies are present between different bDXVA_Func values specified in the same call to IAMVideoAccelerator::Execute and that all potential race conditions (such as between picture decoding and sub-picture blending, between sub-picture loading and sub-picture blending, etc.) are prevented by appropriate calls to IAMVideoAccelerator::BeginFrame and IAMVideoAccelerator::QueryRenderStatus before subsequent calls to IAMVideoAccelerator::Execute.
- If dwFunction contains a DXVA_ConfigQueryOrReplyFunc, the lpPrivateInputData pointer for passing data to the accelerator in this call shall point to a configuration data structure, the lpPrivateOutputData pointer for receiving information from the accelerator shall point to an area where an alternative or duplicate configuration data structure can be placed, the pamvaBufferInfo pointer for an array of AMVABUFFERINFO shall be NULL, and dwNumBuffers shall be zero. The returned HRESULT contains the S_OK or S_FALSE indication in response to the query, or E_FAIL or E_INVALIDARG or some other error indication HRESULT in the event of a severe problem in protocol execution (such as an invalid configuration parameter). All calls to IAMVideoAccelerator::Execute for all uses of DXVA_ConfigQueryOrReplyFunc shall precede all other calls to IAMVideoAccelerator::Execute.
- If dwFunction contains a DXVA_EncryptProtocolFunc, the lpPrivateInputData pointer for passing data to the accelerator in this call shall point to an encryption protocol data structure that begins with DXVA_EncryptProtocolHeader, the lpPrivateOutputData pointer for receiving information from the accelerator shall point to an area where the data to be returned (such as a certificate) by the encryption protocol (which will begin with DXVA_EncryptProtocolHeader) can be placed, the pamvaBufferInfo pointer for an array of AMVABUFFERINFO shall be NULL, and dwNumBuffers shall be zero. The returned HRESULT contains S_OK as long as the encryption protocol is functioning normally and contains E_FAIL or E_INVALIDARG or some other error indication HRESULT in the event of a severe problem in protocol execution.
- If dwFunction does not contain a DXVA_ConfigQueryOrReplyFunc or DXVA_EncryptProtocolFunc, the lpPrivateInputData pointer for passing data to the accelerator shall point to a buffer description list. The first four entries in the buffer description list structure for each buffer (dwTypeIndex, dwBufferIndex, dwDataOffset, and dwDataSize) shall be equal to those in the AMVABUFFERINFO data structure for the same buffer. If a bDXVA_Func equal to "1" is specified within dwFunction and bPicReadbackRequests is "1", the lpPrivateOutputData pointer for receiving information from the accelerator shall point to an area of persistent memory (e.g., heap) to be filled in with read-back macroblock data from the accelerator (such data is not guaranteed to be present until IAMVideoAccelerator::QueryRenderStatus for writing to the same picture parameters buffer indicates S_OK, as described in the IAMVideoAccelerator::QueryRenderStatus item below). Otherwise, the lpPrivateOutputData pointer for receiving information from the accelerator shall point to a single DWORD to be set to one of the following indication values (particularly useful for reporting bitstream errors in off-host VLD operation).
| Value | Description |
| --- | --- |
| 0 | Execution OK. |
| 1 | Minor problem in data format encountered. |
| 2 | Significant problem in data format encountered. |
| 3 | Severe problem in data format encountered. |
| 4 | Other severe problem encountered. |
If either type of "severe" problem is indicated, the software decoder should cease to operate the function(s) unless corrective action can be taken. This data returned from the accelerator shall not be read by the host until after the buffer rendering for the picture has completed, as can be tested by IAMVideoAccelerator::QueryRenderStatus. The returned HRESULT contains S_OK as long as the interface operation is functioning normally and may return E_FAIL or E_INVALIDARG or some other error indication HRESULT in the event of a severe problem.
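As a minimal sketch of two conventions described for IAMVideoAccelerator::Execute, the code below packs single-byte bDXVA_Func values into dwFunction (first command in the eight MSBs, zero padding afterwards) and classifies the status DWORD returned through lpPrivateOutputData. All function and enumerator names are hypothetical. The sketch assumes that 0 is never a command value, since zero is used as padding, and it deliberately does not handle the 0xFF-extended multi-byte forms.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Pack one to four single-byte bDXVA_Func values into a 32-bit dwFunction.
// The first command goes in the eight MSBs, the next in the following eight
// bits, and any remaining bits stay zero.
// Assumption: 0x00 is padding only and 0xFF (the extension marker) is rejected.
std::uint32_t packDwFunction(const std::vector<std::uint8_t>& funcs) {
    if (funcs.empty() || funcs.size() > 4) {
        throw std::invalid_argument("expected 1 to 4 commands");
    }
    std::uint32_t dwFunction = 0;
    int shift = 24;
    for (std::uint8_t f : funcs) {
        if (f == 0x00 || f == 0xFF) {
            throw std::invalid_argument("only plain single-byte commands are handled");
        }
        dwFunction |= static_cast<std::uint32_t>(f) << shift;
        shift -= 8;
    }
    return dwFunction;
}

// Inverse of packDwFunction for values it produced: read bytes from the MSB
// down and stop at the first zero (padding) byte.
std::vector<std::uint8_t> unpackDwFunction(std::uint32_t dwFunction) {
    std::vector<std::uint8_t> funcs;
    for (int shift = 24; shift >= 0; shift -= 8) {
        std::uint8_t f = static_cast<std::uint8_t>((dwFunction >> shift) & 0xFFu);
        if (f == 0) break;
        funcs.push_back(f);
    }
    return funcs;
}

// Status values written to the single DWORD at lpPrivateOutputData,
// as listed in the table above.
enum class ExecuteStatus : std::uint32_t {
    Ok = 0,
    MinorFormatProblem = 1,
    SignificantFormatProblem = 2,
    SevereFormatProblem = 3,
    OtherSevereProblem = 4
};

// Either kind of "severe" problem means the decoder should stop using the
// function(s) unless corrective action can be taken.
bool isSevere(std::uint32_t status) {
    return status == static_cast<std::uint32_t>(ExecuteStatus::SevereFormatProblem) ||
           status == static_cast<std::uint32_t>(ExecuteStatus::OtherSevereProblem);
}
```

For instance, packing the two commands 1 and 3 in that order yields 0x01030000. The status DWORD should only be examined once IAMVideoAccelerator::QueryRenderStatus reports that rendering for the picture has completed.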
- The picture decoding parameters buffer shall be among the first buffers sent for the decoding of each picture when using IAMVideoAccelerator::Execute with bDXVA_Func equal to "1", and all the buffers for decoding a picture in a bitstream shall be sent before any buffers for decoding subsequent pictures. If a macroblock control command buffer is sent, a corresponding residual difference data buffer shall be sent (containing data for the same macroblocks) with the same IAMVideoAccelerator::Execute call.
- IAMVideoAccelerator::EndFrame shall be called after all compressed buffers have been sent that will cause the creation of the output content in a specified uncompressed surface (i.e., a result of operations specified for wDecodedPictureIndex, wDeblockedPictureIndex, wBlendedDestinationIndex, or wPicResampleDestPicIndex). The purpose of this call to IAMVideoAccelerator::EndFrame is to notify the video accelerator hardware that all data needed for the specified operation has been sent. The pointer to data to send downstream through IAMVideoAccelerator::EndFrame shall point to a single WORD wEndPictureIndex containing the index of the frame that is ending. This parameter shall match the wBeginPictureIndex value specified in the prior call to IAMVideoAccelerator::BeginFrame before the sending of the relevant compressed buffers. Subsequent to a call to IAMVideoAccelerator::EndFrame, the uncompressed surface with index wEndPictureIndex shall not be found in any picture's wDecodedPictureIndex, wDeblockedPictureIndex, wBlendedDestinationIndex, or wPicResampleDestPicIndex until after another call to IAMVideoAccelerator::BeginFrame is issued to announce that this will occur and an S_OK has been returned as a result. However, that destination surface index may occur in subsequent read access commands such as wForwardRefPictureIndex, wBackwardRefPictureIndex, wPicResampleSourcePicIndex, or bRefPicSelect[i]. The HRESULT returned by IAMVideoAccelerator::EndFrame shall be S_OK unless there is some kind of data format or protocol error, in which case it can be E_FAIL or E_INVALIDARG or some other error indication.
- In the case of field-based decoding (e.g., in MPEG-2 bitstreams), there will not be a one-to-one mapping of functional pictures in the bitstream to uncompressed surfaces in the accelerator interface. When decoding field pictures in an MPEG-2 bitstream, there will be two "pictures" decoded to produce one complete output uncompressed surface. In the DirectX VA interface definition, each frame corresponds to each use of wDecodedPictureIndex, wDeblockedPictureIndex, wBlendedDestinationIndex, or wPicResampleDestPicIndex. Thus two pairs of calls to IAMVideoAccelerator::BeginFrame and IAMVideoAccelerator::EndFrame are required for the decoding of field pictures into output uncompressed surfaces.
- A call to IAMVideoAccelerator::QueryRenderStatus with dwFlags equal to zero can be made sometime after a call to IAMVideoAccelerator::EndFrame with a particular wEndPictureIndex to check the status of a buffer that was sent containing that wEndPictureIndex in wDecodedPictureIndex, wDeblockedPictureIndex, wBlendedDestinationIndex, or wPicResampleDestPicIndex. Such a call will return S_OK if all of the operations to write the data to the uncompressed surface have completed and E_PENDING if they have not yet completed. E_FAIL or E_INVALIDARG or some other error indication may be returned in the event of a protocol error.
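The call ordering described in the items above can be sketched against a mock. MockAccelerator, writeSurface, and decodeFieldPair are hypothetical names that only imitate the BeginFrame / Execute / EndFrame / QueryRenderStatus sequence; they are not the real COM interface, and the HResult alias and constants are local stand-ins carrying the standard Windows values for S_OK, E_PENDING, and E_FAIL. A real decoder would normally wait between polls rather than spin.

```cpp
#include <cassert>
#include <cstdint>

using HResult = std::uint32_t;               // stand-in for the Windows HRESULT type
constexpr HResult kS_OK      = 0x00000000u;
constexpr HResult kE_PENDING = 0x8000000Au;
constexpr HResult kE_FAIL    = 0x80004005u;

// Hypothetical mock of the accelerator side, enforcing the pairing rules.
struct MockAccelerator {
    int busyPolls;     // BeginFrame calls that report E_PENDING before S_OK
    int renderPolls;   // QueryRenderStatus calls that report E_PENDING before S_OK
    bool frameOpen;
    std::uint16_t openIndex;
    int beginCalls;    // successful BeginFrame calls
    int endCalls;      // successful EndFrame calls

    MockAccelerator(int busy, int render)
        : busyPolls(busy), renderPolls(render), frameOpen(false),
          openIndex(0), beginCalls(0), endCalls(0) {}

    HResult BeginFrame(std::uint16_t wBeginPictureIndex) {
        if (frameOpen) return kE_FAIL;                    // unpaired BeginFrame
        if (busyPolls > 0) { --busyPolls; return kE_PENDING; }
        frameOpen = true;
        openIndex = wBeginPictureIndex;
        ++beginCalls;
        return kS_OK;
    }
    HResult Execute(std::uint16_t destinationIndex) {
        // Writes are only legal to the surface announced by BeginFrame.
        return (frameOpen && destinationIndex == openIndex) ? kS_OK : kE_FAIL;
    }
    HResult EndFrame(std::uint16_t wEndPictureIndex) {
        if (!frameOpen || wEndPictureIndex != openIndex) return kE_FAIL;
        frameOpen = false;
        ++endCalls;
        return kS_OK;
    }
    HResult QueryRenderStatus() {
        if (renderPolls > 0) { --renderPolls; return kE_PENDING; }
        return kS_OK;
    }
};

// One BeginFrame/EndFrame pair writes one destination surface.
HResult writeSurface(MockAccelerator& acc, std::uint16_t index, int maxPolls) {
    assert(maxPolls >= 0);
    HResult hr = acc.BeginFrame(index);
    for (int i = 0; hr == kE_PENDING && i < maxPolls; ++i) {
        hr = acc.BeginFrame(index);                       // surface still being displayed
    }
    if (hr != kS_OK) return hr;                           // never opened, so no EndFrame

    hr = acc.Execute(index);
    HResult hrEnd = acc.EndFrame(index);                  // always close the opened frame
    if (hr != kS_OK) return hr;
    if (hrEnd != kS_OK) return hrEnd;

    hr = acc.QueryRenderStatus();
    for (int i = 0; hr == kE_PENDING && i < maxPolls; ++i) {
        hr = acc.QueryRenderStatus();                     // writes not yet complete
    }
    return hr;
}

// Field pictures: two BeginFrame/EndFrame pairs target the same surface.
HResult decodeFieldPair(MockAccelerator& acc, std::uint16_t index, int maxPolls) {
    HResult hr = writeSurface(acc, index, maxPolls);
    if (hr != kS_OK) return hr;
    return writeSurface(acc, index, maxPolls);
}
```

The same wBeginPictureIndex value is passed to EndFrame as wEndPictureIndex, and output data is treated as valid only once the status query returns S_OK.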
Operational Correspondence with Motion Compensation Device Driver
This section contains a description of the Motion Compensation device driver side of the DirectX VA interface. (Reference: Windows 2000 DDK - Graphics Drivers - Design Guide - 3.0 DirectDraw DDI - 3.12 Motion Compensation. See the Windows DDK for documentation on the structures in boldface.)
The following items refer to entries accessed through the DD_MOTIONCOMPCALLBACKS structure:
- At the start of the relevant processing, the device driver's DdMoCompCreate is used to notify the driver that the software decoder will start using a video acceleration object.
- GUIDs received from IAMVideoAccelerator::GetVideoAcceleratorGUIDs originate from the device driver's DdMoCompGetGUIDs.
- A call to the downstream input pin's IAMVideoAccelerator::GetUncompFormatsSupported returns data from the device driver's DdMoCompGetFormats.
- The DXVA_ConnectMode data structure from the decoder's IAMVideoAcceleratorNotify::GetCreateVideoAcceleratorData is passed to the device driver's DdMoCompCreate.
- Data returned from IAMVideoAccelerator::GetCompBufferInfo originates from the device driver's DdMoCompGetBuffInfo.
- Buffers sent using IAMVideoAccelerator::Execute are received by the device driver's DdMoCompRender.
- Use of IAMVideoAccelerator::QueryRenderStatus invokes the device driver's DdMoCompQueryStatus. A return code of DDERR_WASSTILLDRAWING from DdMoCompQueryStatus will be seen by the host decoder as a return code of E_PENDING from IAMVideoAccelerator::QueryRenderStatus.
- Data sent to IAMVideoAccelerator::BeginFrame are received by the device driver's DdMoCompBeginFrame. A return code of E_PENDING is needed from DdMoCompBeginFrame in order for E_PENDING to be seen by the host decoder in response to IAMVideoAccelerator::BeginFrame.
- Data sent to IAMVideoAccelerator::EndFrame are received by the device driver's DdMoCompEndFrame.
- At the end of the relevant processing, the device driver's DdMoCompDestroy is used to notify the driver that the current video acceleration object will no longer be used, so that the driver can perform any necessary cleanup.
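On the driver side, the data that DdMoCompGetBuffInfo supplies for IAMVideoAccelerator::GetCompBufferInfo has to follow the indexing convention described earlier: 16 entries, with the buffer type number used directly as the zero-based array index and zero-filled "dummy" entries for unused types (including type 0). A minimal sketch of building such a table follows. CompBufferInfo is a hypothetical stand-in holding only the four fields named in this document, not the full AMVACompBufferInfo structure, and the helper names are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for AMVACompBufferInfo, limited to the four fields
// this document names.
struct CompBufferInfo {
    std::uint32_t dwNumCompBuffers;
    std::uint32_t dwWidthToCreate;
    std::uint32_t dwHeightToCreate;
    std::uint32_t dwBytesToAllocate;
};

// GetCompBufferInfo is called with *pdwNumTypesCompBuffers = 16.
constexpr std::size_t kNumTypesCompBuffers = 16;
using CompBufferTable = std::array<CompBufferInfo, kNumTypesCompBuffers>;

// Start with every entry as a dummy: all four fields are zero.
CompBufferTable makeEmptyTable() {
    return CompBufferTable{};   // value-initialization zero-fills each entry
}

// Fill in the entry for one buffer type, using the type number as the index.
// Type 0 has no defined use, so it must stay a dummy entry.
bool setBufferType(CompBufferTable& table, std::size_t typeIndex,
                   const CompBufferInfo& info) {
    if (typeIndex == 0 || typeIndex >= kNumTypesCompBuffers) return false;
    table[typeIndex] = info;
    return true;
}

// A dummy entry advertises no buffers of that type.
bool isUnused(const CompBufferInfo& info) {
    return info.dwNumCompBuffers == 0;
}
```

A caller can then look up the entry for a given buffer type directly by its type number, without searching the array.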