Microsoft DirectX 8.1 (C++)
The Microsoft AVI file format is a RIFF file specification used with applications that capture, edit, and play back audio-video sequences. In general, AVI files contain multiple streams of different types of data. Most AVI sequences use both audio and video streams. A simple variation for an AVI sequence uses video data and does not require an audio stream.
Modifications to the original AVI file specification made in the OpenDML AVI File Format Extensions are not discussed in this section. For further information on these extensions, see version 1.02 of the OpenDML AVI File Format Extensions published by the OpenDML AVI M-JPEG File Format Subcommittee, February 28, 1996.
AVI files use the AVI RIFF form. The AVI RIFF form is identified by the FOURCC (four-character code) 'AVI '. All AVI files include two mandatory LIST chunks. These chunks define the format of the streams and the stream data, respectively. AVI files might also include an index chunk. This optional chunk specifies the location of data chunks within the file. An AVI file with these components has the following form:
RIFF ('AVI '
      LIST ('hdrl'
            .
            .
            .
           )
      LIST ('movi'
            .
            .
            .
           )
      ['idx1'<AVI Index>]
     )
The LIST chunks and the index chunk are subchunks of the RIFF 'AVI ' chunk. The 'AVI ' chunk identifies the file as an AVI RIFF file. The LIST 'hdrl' chunk defines the format of the data and is the first required LIST chunk. The LIST 'movi' chunk contains the data for the AVI sequence and is the second required LIST chunk. The 'idx1' chunk is the index chunk. AVI files must keep these three components in the proper sequence.
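The layout above can be checked with a short walk over the subchunks of the RIFF 'AVI ' chunk. The following is a minimal sketch, not a definitive parser. It assumes the general RIFF conventions, which this section does not spell out: each chunk starts with a 4-byte FOURCC and a 4-byte little-endian size, a LIST chunk stores its list type in the first four bytes of its data, and chunk data is padded to an even length. The names ChunkInfo, readLE32, and listTopLevelChunks are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One subchunk found directly inside the RIFF 'AVI ' chunk.
struct ChunkInfo {
    std::string id;        // chunk FOURCC, for example "LIST" or "idx1"
    std::string listType;  // "hdrl" or "movi" for LIST chunks, otherwise empty
    std::uint32_t size;    // data size in bytes, excluding the 8-byte chunk header
};

// Reads a little-endian 32-bit value (assumed RIFF byte order).
static std::uint32_t readLE32(const std::uint8_t* p) {
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Lists the subchunks of the RIFF 'AVI ' chunk in file order.
// Returns an empty vector if the buffer does not begin with RIFF ... 'AVI '.
std::vector<ChunkInfo> listTopLevelChunks(const std::vector<std::uint8_t>& file) {
    std::vector<ChunkInfo> out;
    if (file.size() < 12) return out;
    const std::uint8_t* d = file.data();
    if (std::string(d, d + 4) != "RIFF" || std::string(d + 8, d + 12) != "AVI ")
        return out;

    // The RIFF size field counts everything after the first 8 bytes.
    const std::size_t end =
        std::min<std::size_t>(file.size(), 8 + std::size_t(readLE32(d + 4)));
    std::size_t pos = 12;  // first subchunk follows the 'AVI ' form type
    while (pos + 8 <= end) {
        ChunkInfo c;
        c.id.assign(d + pos, d + pos + 4);
        c.size = readLE32(d + pos + 4);
        if (c.size > end - pos - 8) break;  // truncated chunk: stop walking
        if (c.id == "LIST" && c.size >= 4)
            c.listType.assign(d + pos + 8, d + pos + 12);
        out.push_back(c);
        pos += 8 + std::size_t(c.size) + (c.size & 1);  // skip data and pad byte
    }
    return out;
}
```

For a well-formed file, the result should list LIST 'hdrl' first, LIST 'movi' second, and 'idx1' last if it is present.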
The LIST 'hdrl' and LIST 'movi' chunks use subchunks for their data. The following example shows the AVI RIFF form expanded with the chunks needed to complete the LIST 'hdrl' and LIST 'movi' chunks:
RIFF ('AVI '
      LIST ('hdrl'
            'avih'(<Main AVI Header>)
            LIST ('strl'
                  'strh'(<Stream header>)
                  'strf'(<Stream format>)
                  'strd'(<additional header data>)
                  'strn'(<Stream name>)
                  ...
                 )
            .
            .
            .
           )
      LIST ('movi'
            {SubChunk | LIST ('rec '
                              SubChunk1
                              SubChunk2
                              .
                              .
                              .
                             )
            .
            .
            .
            }
            .
            .
            .
           )
      ['idx1'<AVI Index>]
     )
This and following sections describe the chunks contained in the LIST 'hdrl' and LIST 'movi' chunks. The 'idx1' chunk is not described in this document. For more information on the 'idx1' chunk and indexes in AVI files, see version 1.02 of the OpenDML AVI File Format Extensions published by the OpenDML AVI M-JPEG File Format Subcommittee, February 28, 1996.
The file begins with the main header. In the AVI file, this header is identified by the 'avih' FOURCC (four-character code). The header contains global information for the entire AVI file, such as the number of streams within the file and the width and height of the AVI sequence. The AVI main header structure is defined as follows:
typedef struct {
    DWORD  dwMicroSecPerFrame;
    DWORD  dwMaxBytesPerSec;
    DWORD  dwReserved1;
    DWORD  dwFlags;
    DWORD  dwTotalFrames;
    DWORD  dwInitialFrames;
    DWORD  dwStreams;
    DWORD  dwSuggestedBufferSize;
    DWORD  dwWidth;
    DWORD  dwHeight;
    DWORD  dwReserved[4];
} MainAVIHeader;
dwMicroSecPerFrame
Specifies the number of microseconds between frames. This value indicates the overall timing for the file.
dwMaxBytesPerSec
Specifies the approximate maximum data rate of the file. This value indicates the number of bytes per second the system must handle to present an AVI sequence as specified by the other parameters contained in the main header and stream header chunks.
dwReserved1
Reserved. Set this to zero.
dwFlags
Contains any flags for the file. The following flags are defined.
Value | Description |
AVIF_HASINDEX | Indicates the AVI file has an 'idx1' chunk containing an index at the end of the file. For good performance, all AVI files should contain an index. |
AVIF_MUSTUSEINDEX | Indicates that the index, rather than the physical ordering of the chunks in the file, should be used to determine the order of presentation of the data. For example, you could use this to create a list of frames for editing. |
AVIF_ISINTERLEAVED | Indicates the AVI file is interleaved. |
AVIF_WASCAPTUREFILE | Indicates the AVI file is a specially allocated file used for capturing real-time video. Applications should warn the user before writing over a file with this flag set because the user probably defragmented this file. |
AVIF_COPYRIGHTED | Indicates the AVI file contains copyrighted data and software. When this flag is used, software should not permit the data to be duplicated. |
dwTotalFrames
Specifies the total number of frames of data in the file.
dwInitialFrames
Specifies the initial frame for interleaved files. Noninterleaved files should specify zero. If you are creating interleaved files, specify the number of frames in the file prior to the initial frame of the AVI sequence in this member. For more information about the contents of this member, see "Special Information for Interleaved Files" in the Video for Windows Programmer's Guide.
dwStreams
Specifies the number of streams in the file. For example, a file with audio and video has two streams.
dwSuggestedBufferSize
Specifies the suggested buffer size for reading the file. Generally, this size should be large enough to contain the largest chunk in the file. If set to zero, or if it is too small, the playback software will have to reallocate memory during playback, which will reduce performance. For an interleaved file, the buffer size should be large enough to read an entire record, and not just a chunk.
dwWidth
Specifies the width of the AVI file in pixels.
dwHeight
Specifies the height of the AVI file in pixels.
dwReserved
Reserved. Set this array to zero.
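As an illustration of how these members might be used, the following sketch derives the frame rate and an approximate duration from the main header and tests the flag bits. It is a sketch under stated assumptions: DWORD is taken to be an unsigned 32-bit integer, the flag values are those found in the Windows SDK headers (they are not listed in this section), and the names MainHeader, framesPerSecond, durationSeconds, and hasFlag are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Mirror of the MainAVIHeader layout, with fixed-width types in place of DWORD.
struct MainHeader {
    std::uint32_t dwMicroSecPerFrame;
    std::uint32_t dwMaxBytesPerSec;
    std::uint32_t dwReserved1;
    std::uint32_t dwFlags;
    std::uint32_t dwTotalFrames;
    std::uint32_t dwInitialFrames;
    std::uint32_t dwStreams;
    std::uint32_t dwSuggestedBufferSize;
    std::uint32_t dwWidth;
    std::uint32_t dwHeight;
    std::uint32_t dwReserved[4];
};
static_assert(sizeof(MainHeader) == 56, "fourteen 32-bit members expected");

// Flag values as defined in the Windows SDK headers (assumption: not stated in the text).
constexpr std::uint32_t kAvifHasIndex       = 0x00000010;
constexpr std::uint32_t kAvifMustUseIndex   = 0x00000020;
constexpr std::uint32_t kAvifIsInterleaved  = 0x00000100;
constexpr std::uint32_t kAvifWasCaptureFile = 0x00010000;
constexpr std::uint32_t kAvifCopyrighted    = 0x00020000;

// Frames per second implied by dwMicroSecPerFrame, or 0 if the member is zero.
double framesPerSecond(const MainHeader& h) {
    return h.dwMicroSecPerFrame ? 1e6 / h.dwMicroSecPerFrame : 0.0;
}

// Approximate duration in seconds: total frames times the frame period.
double durationSeconds(const MainHeader& h) {
    return double(h.dwTotalFrames) * h.dwMicroSecPerFrame / 1e6;
}

// True if every bit of the given flag is set in dwFlags.
bool hasFlag(const MainHeader& h, std::uint32_t flag) {
    return (h.dwFlags & flag) == flag;
}
```

For example, a header with dwMicroSecPerFrame set to 40000 and dwTotalFrames set to 250 describes a 25 frames-per-second sequence that lasts about 10 seconds.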
One or more 'strl' chunks follow the main header. (A 'strl' chunk is required for each data stream.) These chunks contain information about the streams in the file. Each 'strl' chunk must contain a stream header and stream format chunk. Stream header chunks are identified by the FOURCC (four-character code) 'strh' and the stream format chunks are identified by the FOURCC 'strf'. In addition to the stream header and stream format chunks, the 'strl' chunk might also contain a stream-header data chunk and a stream name chunk. Stream-header data chunks are identified by the FOURCC 'strd'. Stream name chunks are identified by the FOURCC 'strn'.
The stream header structure contains header information for a single stream of a file.
typedef struct {
    FOURCC  fccType;
    FOURCC  fccHandler;
    DWORD   dwFlags;
    DWORD   dwPriority;
    DWORD   dwInitialFrames;
    DWORD   dwScale;
    DWORD   dwRate;
    DWORD   dwStart;
    DWORD   dwLength;
    DWORD   dwSuggestedBufferSize;
    DWORD   dwQuality;
    DWORD   dwSampleSize;
    RECT    rcFrame;
} AVIStreamHeader;
The stream header specifies the type of data the stream contains, such as audio or video, by means of a FOURCC.
fccType
Contains a FOURCC that specifies the type of the data contained in the stream. The following standard AVI values for video, audio, and text are defined.
Value | Description |
'vids' | Indicates the stream contains video data. The stream format chunk contains a BITMAPINFO structure that can include palette information. |
'auds' | Indicates the stream contains audio data. The stream format chunk contains a WAVEFORMATEX structure. |
'txts' | Indicates the stream contains text data. |
fccHandler
Optionally, contains a FOURCC that identifies a specific data handler. The data handler is the preferred handler for the stream. For audio and video streams, this specifies the installable compressor or decompressor.
dwFlags
Contains any flags for the data stream. The bits in the high-order word of these flags are specific to the type of data contained in the stream. The following standard flags are defined.
Value | Description |
AVISF_DISABLED | Indicates this stream should not be enabled by default. |
AVISF_VIDEO_PALCHANGES | Indicates this video stream contains palette changes. This flag warns the playback software that it will need to animate the palette. |
dwPriority
Specifies priority of a stream type. For example, in a file with multiple audio streams, the one with the highest priority might be the default stream.
dwInitialFrames
Specifies how far audio data is skewed ahead of the video frames in interleaved files. Typically, this is about 0.75 seconds. If you are creating interleaved files, specify the number of frames in the file prior to the initial frame of the AVI sequence in this member. For more information about the contents of this member, see "Special Information for Interleaved Files" in the Video for Windows Programmer's Guide.
dwScale
Used with dwRate to specify the time scale that this stream will use. Dividing dwRate by dwScale gives the number of samples per second. For video streams, this rate should be the frame rate. For audio streams, this rate should correspond to the time needed for nBlockAlign bytes of audio, which for PCM audio simply reduces to the sample rate.
dwRate
See dwScale.
dwStart
Specifies the starting time of this stream. The units are defined by the dwRate and dwScale members of the stream header. Usually, this is zero, but it can specify a delay time for a stream that does not start concurrently with the file.
dwLength
Specifies the length of this stream. The units are defined by the dwRate and dwScale members of the stream's header.
dwSuggestedBufferSize
Specifies how large a buffer should be used to read this stream. Typically, this contains a value corresponding to the largest chunk present in the stream. Using the correct buffer size makes playback more efficient. Use zero if you do not know the correct buffer size.
dwQuality
Specifies an indicator of the quality of the data in the stream. Quality is represented as a number between 0 and 10,000. For compressed data, this typically represents the value of the quality parameter passed to the compression software. If set to -1, drivers use the default quality value.
dwSampleSize
Specifies the size of a single sample of data. This is set to zero if the samples can vary in size. If this number is nonzero, then multiple samples of data can be grouped into a single chunk within the file. If it is zero, each sample of data (such as a video frame) must be in a separate chunk. For video streams, this number is typically zero, although it can be nonzero if all video frames are the same size. For audio streams, this number should be the same as the nBlockAlign member of the WAVEFORMATEX structure describing the audio.
rcFrame
Specifies the destination rectangle for a text or video stream within the movie rectangle specified by the dwWidth and dwHeight members of the AVI main header structure. The rcFrame member is typically used in support of multiple video streams. Set this rectangle to the coordinates corresponding to the movie rectangle to update the whole movie rectangle. Units for this member are pixels. The upper-left corner of the destination rectangle is relative to the upper-left corner of the movie rectangle.
The members from dwScale through dwSampleSize describe the playback characteristics of the stream. These factors include the playback rate (dwScale and dwRate), the starting time of the sequence (dwStart), the length of the sequence (dwLength), the size of the playback buffer (dwSuggestedBufferSize), an indicator of the data quality (dwQuality), and the sample size (dwSampleSize).
Some of the members in the stream header structure are also present in the main header structure. The data in the main header applies to the whole file, while the data in the stream header structure applies only to a stream.
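The relationship between dwRate, dwScale, dwStart, and dwLength can be sketched as follows. This is an illustrative sketch only: it assumes that one time unit of a stream lasts dwScale / dwRate seconds, which follows from the statement that dwRate divided by dwScale gives the number of samples per second. The StreamTiming structure and both function names are hypothetical and carry only the timing members of the stream header.

```cpp
#include <cassert>
#include <cstdint>

// Only the timing members of the stream header (hypothetical helper type).
struct StreamTiming {
    std::uint32_t dwScale;
    std::uint32_t dwRate;
    std::uint32_t dwStart;
    std::uint32_t dwLength;
};

// Samples (or frames) per second: dwRate divided by dwScale.
double samplesPerSecond(const StreamTiming& t) {
    return t.dwScale ? double(t.dwRate) / t.dwScale : 0.0;
}

// Converts a count of stream time units (such as dwStart or dwLength)
// to seconds, assuming one unit lasts dwScale / dwRate seconds.
double unitsToSeconds(const StreamTiming& t, std::uint32_t units) {
    return t.dwRate ? double(units) * t.dwScale / t.dwRate : 0.0;
}
```

For a video stream with dwRate 30000 and dwScale 1001, samplesPerSecond gives roughly 29.97 frames per second. For a PCM audio stream with dwRate 44100 and dwScale 1, a dwLength of 88200 corresponds to 2 seconds.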
A stream format ('strf') chunk must follow a stream header ('strh') chunk. The stream format chunk describes the format of the data in the stream. For video streams, the information in this chunk is a BITMAPINFO structure (including palette information if appropriate). For audio streams, the information in this chunk is a WAVEFORMATEX structure.
The 'strl' chunk might also contain an additional stream-header data ('strd') chunk. If used, this chunk follows the stream format chunk. The format and content of this chunk is defined by installable compression or decompression drivers. Typically, drivers use this information for configuration. Applications that read and write RIFF files do not need to decode this information. They transfer this data to and from a driver as a memory block.
The optional 'strn' stream name chunk provides a zero-terminated text string describing the stream. (The AVI file functions can use this chunk to let applications identify the streams they want to access by their names.)
An AVI player associates the stream headers in the LIST 'hdrl' chunk with the stream data in the LIST 'movi' chunk by using the order of the 'strl' chunks. The first 'strl' chunk applies to stream 0, the second applies to stream 1, and so forth.
For example, if the first 'strl' chunk describes the wave audio data, the wave audio data is contained in stream 0. Similarly, if the second 'strl' chunk describes video data, then the video data is contained in stream 1.
Following the header information is a LIST 'movi' chunk that contains chunks of the actual data in the streams, that is, the pictures and sounds themselves. The data chunks can reside directly in the LIST 'movi' chunk, or they can be grouped into 'rec ' chunks. The 'rec ' grouping implies that the grouped chunks should be read from disk all at once. This is used only for files specifically interleaved to play from CD-ROM.
Like any RIFF chunk, the data chunks contain a FOURCC (four-character code) to identify the chunk type. A FOURCC is a 32-bit quantity represented as a sequence of one to four ASCII alphanumeric characters, padded on the right with blank characters. The FOURCC that identifies each data chunk consists of a two-digit stream number followed by a two-character code that defines the type of information encapsulated in the chunk. For example, a waveform chunk is identified by a two-character code of 'wb'. If a waveform chunk corresponded to the second 'strl' stream description in the LIST 'hdrl' chunk (stream 1), it would have a FOURCC of '01wb'.
Note: While two-character codes are a convenient way to describe a stream, do not expect them to be recognized by other applications. Use FOURCCs when creating a stream or transferring the information to other applications.
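The construction of a data chunk FOURCC from a stream number and a two-character code can be sketched as follows. This is a minimal sketch with assumptions labeled: the stream number is written as two uppercase hexadecimal digits, as the chunk-ID macros in the Video for Windows headers do (for stream numbers below 10 this is identical to decimal), and the 32-bit packing places the first character in the low-order byte, as the mmioFOURCC macro does. The function names makeChunkId and packFourCC are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Builds a data chunk ID such as "01wb" from a stream number and a
// two-character code. Assumes two uppercase hexadecimal digits.
std::string makeChunkId(unsigned stream, const std::string& twoCharCode) {
    assert(stream < 256 && twoCharCode.size() == 2);
    static const char digits[] = "0123456789ABCDEF";
    std::string id;
    id += digits[(stream >> 4) & 0xF];
    id += digits[stream & 0xF];
    id += twoCharCode;
    return id;
}

// Packs one to four characters into a 32-bit FOURCC, padding on the right
// with blanks. Assumes the first character occupies the low-order byte.
std::uint32_t packFourCC(std::string s) {
    assert(!s.empty() && s.size() <= 4);
    s.resize(4, ' ');
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < 4; ++i)
        v |= std::uint32_t(static_cast<unsigned char>(s[i])) << (8 * i);
    return v;
}
```

With these helpers, makeChunkId(1, "wb") yields the ID '01wb' used in the example above, and packFourCC("AVI") pads the three characters to 'AVI ' before packing.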
Because all the format information is in the header, the audio data contained in these data chunks does not contain any information about its format. An audio data chunk has the following format (the ## in the format represents the stream identifier):
WAVE Bytes '##wb'
    BYTE abBytes[];
Video data can be compressed or uncompressed DIBs. An uncompressed DIB has BI_RGB specified for the biCompression member in its associated BITMAPINFO structure. A compressed DIB has a value other than BI_RGB specified in the biCompression member. For more information about compression formats, see the description of the BITMAPINFOHEADER data structure in the Microsoft Windows Programmer's Reference.
A data chunk for an uncompressed DIB contains RGB video data. These chunks are identified by a two-character code of 'db' (db is an abbreviation for DIB bits). Data chunks for a compressed DIB are identified by a two-character code of 'dc' (dc is an abbreviation for DIB compressed). Neither data chunk will contain any header information about the DIBs. The data chunk for an uncompressed DIB has the following form:
DIB Bits '##db'
    BYTE abBits[];
The data chunk for a compressed DIB has the following form:
Compressed DIB Bits '##dc'
    BYTE abBits[];
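A reader of the LIST 'movi' chunk needs to go the other way and recover the stream number and data type from each chunk ID. The following sketch shows one way to do that for the 'wb', 'db', and 'dc' codes described above. It assumes that the two leading characters are hexadecimal digits (an assumption, as this section does not state the radix), and the names DataKind, DataChunkId, and parseDataChunkId are hypothetical.

```cpp
#include <cassert>
#include <string>

enum class DataKind { Audio, UncompressedVideo, CompressedVideo, Other };

struct DataChunkId {
    int stream;     // stream number, or -1 if the ID has no stream prefix
    DataKind kind;  // type implied by the two-character code
};

// Value of one hexadecimal digit, or -1 if c is not a hexadecimal digit.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Splits an ID such as "01wb" into a stream number and a data kind.
// IDs that do not begin with two hex digits (for example "LIST") get stream -1.
DataChunkId parseDataChunkId(const std::string& id) {
    DataChunkId r{-1, DataKind::Other};
    if (id.size() != 4) return r;
    const int hi = hexValue(id[0]);
    const int lo = hexValue(id[1]);
    if (hi < 0 || lo < 0) return r;
    r.stream = hi * 16 + lo;
    const std::string code = id.substr(2);
    if (code == "wb") r.kind = DataKind::Audio;
    else if (code == "db") r.kind = DataKind::UncompressedVideo;
    else if (code == "dc") r.kind = DataKind::CompressedVideo;
    return r;
}
```

Codes other than 'wb', 'db', and 'dc' are reported as Other rather than rejected, because streams such as text streams can use arbitrary two-character codes.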
Video data chunks can also define new palette entries used to update the palette during an AVI sequence.
Text streams can use arbitrary two-character codes.