This article provides background for developers interested in adding external device control and timecode support to Microsoft® DirectShow® applications. It also discusses how timecode is used in the production environment and lists some typical applications that rely on external devices. Finally, it describes how external device control is implemented and provides links to the interfaces available to build VCR control- and timecode-enabled filters in DirectShow.
You can control an external device in DirectShow by implementing device control filters. These filters control devices or streams of data that are entirely external to the computer and expose interfaces such as IAMExtDevice, IAMExtTransport, IAMTimecodeGenerator, and IAMTimecodeReader. Generally, external device control filters do not need to expose pins. One exception is a filter that represents a source of data, such as a VCR. In that case, a pin-to-pin connection representing the data flowing from the VCR to the capture board enables the device control filter and the video capture filter to communicate with each other and negotiate data types, even though they do not use the standard transport and no data flows between the filters themselves other than control information. Applications can instantiate and directly control an external device filter, but it is strongly recommended that such filters always be instantiated within the context of a filter graph, even if they are the only filter in the graph.
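The following sketch shows how an application might instantiate a device control filter inside a filter graph and query it for the external device interfaces. CLSID_MyVcrControl is a placeholder for the CLSID of whatever device control filter is installed on your system; the remaining calls are standard COM and DirectShow.

    #include <dshow.h>

    HRESULT ControlVcr()
    {
        IGraphBuilder   *pGraph = NULL;
        IBaseFilter     *pVcr = NULL;
        IAMExtDevice    *pDevice = NULL;
        IAMExtTransport *pTransport = NULL;

        HRESULT hr = CoCreateInstance(CLSID_FilterGraph, NULL,
                         CLSCTX_INPROC_SERVER, IID_IGraphBuilder,
                         (void**)&pGraph);
        if (FAILED(hr)) return hr;

        // CLSID_MyVcrControl is hypothetical; substitute the CLSID of your
        // device control filter. Add it to a graph even if it will be the
        // only filter in the graph.
        hr = CoCreateInstance(CLSID_MyVcrControl, NULL,
                 CLSCTX_INPROC_SERVER, IID_IBaseFilter, (void**)&pVcr);
        if (SUCCEEDED(hr))
            hr = pGraph->AddFilter(pVcr, L"VCR Control");

        // Query for the external device control interfaces.
        if (SUCCEEDED(hr))
            hr = pVcr->QueryInterface(IID_IAMExtDevice, (void**)&pDevice);
        if (SUCCEEDED(hr))
            hr = pVcr->QueryInterface(IID_IAMExtTransport, (void**)&pTransport);

        // Start the transport rolling.
        if (SUCCEEDED(hr))
            hr = pTransport->put_Mode(ED_MODE_PLAY);

        if (pTransport) pTransport->Release();
        if (pDevice) pDevice->Release();
        if (pVcr) pVcr->Release();
        pGraph->Release();
        return hr;
    }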
External devices can include VCRs, video editing stations, audio tape recorders (ATRs), mixers, or any other device used in the video capture and editing process. Capture and editing require precise control and tight audio and video synchronization, which DirectShow external device control filters provide. You can accomplish synchronization of audio and video during playback, editing, and capture with external clocks or with Society of Motion Picture and Television Engineers (SMPTE) timecode. Understanding timecode is the key to understanding external device control.
SMPTE timecode is the glue that holds the post-production process together. It identifies video and audio sources, makes automatic track synchronization possible, and provides a container for ancillary data related to the production. Whether you are producing media, developing tools, or designing systems, you will need to understand this data stream and how it is used.
SMPTE timecode, more properly known as SMPTE time and control code, is a series of digital frame address values, flags and additional data applied to a video or audio stream, and is defined in ANSI/SMPTE 12-1986. Its purpose is to provide a machine-readable address for video and audio.
The most common form of an SMPTE timecode data structure is an 80-bit frame that contains the following information:

- The frame address (hours, minutes, seconds, and frame number), stored as binary-coded decimal digits
- Flag bits, including the drop-frame flag
- 32 bits of user data (the "user bits"), available for ancillary information
- A 16-bit synchronization word
The DirectShow TIMECODE_SAMPLE structure is an example of a timecode data structure that contains timecode information for video or audio data.
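For reference, the TIMECODE and TIMECODE_SAMPLE declarations look roughly like the following (paraphrased; consult strmif.h in your SDK for the authoritative definitions):

    typedef struct tagTIMECODE {
        WORD  wFrameRate;    // nominal frame rate
        WORD  wFrameFract;   // fractional (subframe) value
        DWORD dwFrames;      // the timecode value itself
    } TIMECODE;

    typedef struct tagTIMECODE_SAMPLE {
        LONGLONG qwTick;     // reference time at which the timecode applies
        TIMECODE timecode;   // the timecode value
        DWORD    dwUser;     // SMPTE user bit data
        DWORD    dwFlags;    // flags describing the sample
    } TIMECODE_SAMPLE;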
SMPTE timecode comes in one of two types. Timecode recorded on an analog audio track as a bi-phase mark encoded signal is known as LTC, or Linear TimeCode (formerly known as Longitudinal TimeCode). Each timecode frame is one video frame time in duration. The other common type of timecode is known as VITC, or Vertical Interval TimeCode. VITC is usually stored on two lines of a video signal's vertical blanking interval, somewhere between lines 10 and 20.
LTC timecode is easy to add to a prerecorded tape, because it is encoded in a separate audio signal. However, it cannot be read when the tape is paused, moving very slowly, or moving very quickly. In addition, it consumes one audio channel on nonprofessional VCRs.
VITC timecode, on the other hand, can be read at tape speeds from zero (paused) to about 15 times normal speed. It can contain field-dependent data and can be read by video capture cards. However, it is not easily added to a prerecorded tape and often requires expensive hardware.
SMPTE timecode also comes in one of two modes, nondrop frame and drop frame. Nondrop frame is timecode that is consistently increasing and sequential. It can act as a real-time clock and works fine for monochrome video that runs at a frame rate of exactly 30 frames per second.
However, NTSC color video actually runs at a frame rate of 29.97 Hz (frames per second) because of compatibility requirements with monochrome television. This causes a problem with nondrop frame timecode, because the code falls out of step with real time at the rate of 108 frames (or 3.6 seconds) per hour. After 1 hour of playback, the timecode would read 00:59:56:12, assuming a start point of 00:00:00:00. This causes problems when calculating program durations or using time-of-day referencing.
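The size of the error follows directly from the two rates: nondrop code labels 30 frames per second, but NTSC delivers only 29.97, so over one hour the code falls behind real time by

    30 × 3,600 − 29.97 × 3,600 = 108,000 − 107,892 = 108 frames = 3.6 seconds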
A solution to this problem is to skip some frame numbers in the count every so often, so the error is reduced to something tolerable. This compensation method is called "drop frame" and is implemented by skipping the first two frame numbers at the start of each minute, except minutes 00, 10, 20, 30, 40, and 50. (Only numbers are dropped, never actual frames of video.) The net result is an error of less than 1 frame per hour, or about 3 frames per 24-hour period. The rule lends itself to a small conversion routine, as shown below.
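The following sketch illustrates the counting rule; it is not DirectShow code, and FramesToDropFrameTC is a hypothetical helper name. It converts a zero-based NTSC frame count into a drop-frame timecode label by adding back the frame numbers the rule skips:

    #include <stdio.h>

    // Convert a zero-based NTSC frame count into a drop-frame timecode label.
    // Two frame numbers are skipped each minute except minutes 00, 10, 20, ...
    void FramesToDropFrameTC(long n, char *szOut)
    {
        const long framesPer10Min = 17982;  // 10 * 60 * 30 - 9 * 2 dropped
        const long framesPerMin   = 1798;   // 60 * 30 - 2 dropped
        long d = n / framesPer10Min;        // complete 10-minute blocks
        long m = n % framesPer10Min;

        // Add back the frame numbers skipped by the drop-frame rule. The
        // first minute of each 10-minute block keeps all 1800 numbers.
        long dropped = 18 * d;
        if (m >= 1800)
            dropped += 2 * ((m - 1800) / framesPerMin + 1);
        n += dropped;

        sprintf(szOut, "%02ld:%02ld:%02ld;%02ld",   // ';' marks drop frame
                n / 108000, (n / 1800) % 60, (n / 30) % 60, n % 30);
    }

For example, frame 1800 (one nominal minute into the stream) formats as 00:01:00;02, showing the two dropped frame numbers at the minute boundary.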
Drop frame is used more commonly in today's productions, although any implementation should support mixing both modes.
Applications that provide video capture and editing functionality will typically require control of external devices. These applications need to identify and index video and audio frames through references to SMPTE timecode. Linear editing system computers generally control three or more tape machines, as well as a video switcher and possibly a digital disk recorder. The controlling computer must execute commands at precise times and therefore must get videotapes cued to specific places at specific points in time.
Applications use timecode in a number of different ways, including, but not limited to, the following:

- Identifying and indexing video and audio frames
- Cueing tapes to specific frames so that edits execute at precise points in time
- Synchronizing audio tracks to video
- Logging tape contents and calculating program durations
It quickly becomes apparent that timecode makes many things possible when properly handled. Unfortunately, there is also a lot that can go wrong, either because of poor technique or hardware malfunctions. Some things to look out for on timecoded tapes are:

- Breaks or discontinuities, where the code jumps forward or backward instead of increasing steadily
- Repeated or duplicate timecode values on the same tape
- Sections of tape with missing or unreadable code
- Drop frame and nondrop frame code mixed on the same tape
Timecode can be generated by an external timecode generator, by a capture card capable of generating timecode, by the device control filter itself, or by an external device such as a VCR that has a built-in timecode reader. An RS-422 connection is generally necessary when an external device sends the timecode to the host.
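Once a filter exposes IAMTimecodeReader, an application can poll the current timecode. A minimal sketch, assuming pVcr is the device control filter from the earlier example:

    IAMTimecodeReader *pReader = NULL;
    HRESULT hr = pVcr->QueryInterface(IID_IAMTimecodeReader,
                                      (void**)&pReader);
    if (SUCCEEDED(hr))
    {
        TIMECODE_SAMPLE ts;
        ZeroMemory(&ts, sizeof(ts));
        hr = pReader->GetTimecode(&ts);   // read the current timecode
        if (SUCCEEDED(hr))
        {
            // ts.timecode and ts.dwUser now hold the device's current
            // timecode value and user bits.
        }
        pReader->Release();
    }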
After timecode is generated, it needs to be captured either in tabular or stream format concurrently with the video or audio, so that it can later be accessed during editing. This is handled in one of two ways:
1. Build a table that lists the timecode discontinuities indexed to frame position within the stream, and write the table to the end of the file after capture is complete. The list might be an array of structures that look like this (note that the following structure is a simplification of the DirectShow TIMECODE_SAMPLE structure and is intended as an example only):
    struct {
        DWORD dwOffset;    // offset into stream, in frames
        char  szTC[12];    // timecode value at that offset, as "hh:mm:ss:ff"
                           // (nondrop) or "hh:mm:ss;ff" (drop frame)
    } TIMECODE;
For example, given a captured video stream with one timecode break in it, the list might look like this:
{0, 02:00:00:02}, {16305, 15:21:13:29} // timecode jumps at frame 16305
Using this table, any frame's timecode can be easily calculated, as the sketch following this list shows.
2. Treat the data as a stream and write it to the file just as video and audio are written. This is useful for rapidly changing data or even non-timecode data in the vertical blanking interval (VBI) such as closed-captioning data.
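As a sketch of the table-based calculation from option 1: find the last entry at or before the requested frame, convert its timecode string to a flat frame count, and add the distance from that entry. TimecodeForFrame, ParseTC, and FormatTC are hypothetical helpers, and the arithmetic assumes nondrop 30-fps code for simplicity:

    #include <windows.h>
    #include <stdio.h>

    struct TIMECODE_ENTRY {
        DWORD dwOffset;    // offset into stream, in frames
        char  szTC[12];    // timecode value at that offset
    };

    // "hh:mm:ss:ff" -> flat frame count (nondrop, 30 fps).
    static DWORD ParseTC(const char *szTC)
    {
        int hh, mm, ss, ff;
        sscanf(szTC, "%2d:%2d:%2d:%2d", &hh, &mm, &ss, &ff);
        return ((hh * 60UL + mm) * 60UL + ss) * 30UL + ff;
    }

    // Flat frame count -> "hh:mm:ss:ff".
    static void FormatTC(DWORD f, char *szOut)
    {
        sprintf(szOut, "%02lu:%02lu:%02lu:%02lu",
                f / 108000, (f / 1800) % 60, (f / 30) % 60, f % 30);
    }

    // Look up the timecode of an arbitrary frame in the stream.
    void TimecodeForFrame(const TIMECODE_ENTRY *table, int count,
                          DWORD dwFrame, char *szOut)
    {
        int i = count - 1;
        while (i > 0 && table[i].dwOffset > dwFrame)
            i--;                              // last entry at or before dwFrame
        DWORD base = ParseTC(table[i].szTC);
        FormatTC(base + (dwFrame - table[i].dwOffset), szOut);
    }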
After the timecode data is properly stored with its associated frame data, applications that edit, composite, synchronize, or trigger can access and use a familiar and standard indexing system.
To understand external device control, it is necessary to understand timecode. The key things to remember about timecode are:

- It provides a machine-readable address for every video and audio frame.
- It comes in two types: LTC, carried on an audio track, and VITC, carried in the vertical blanking interval.
- It comes in two modes, drop frame and nondrop frame, and both can appear on the same tape.
- It can contain breaks and discontinuities, which capture and editing software must handle.
Given this background, two fundamental problems exist with device control. First, hundreds of different communication protocols exist for the various devices from the various manufacturers. Although some devices are more widely used than others, such as VCRs and laser disc players, almost all have a different remote control interface. As more sophisticated professional video and audio applications move to the desktop, this problem gets worse. Because of these myriad protocols, you must implement a separate DirectShow filter for each external device you want to control.
Second, in the design of professional video and audio systems, events must occur at precise points in time. Taking a systems view of this issue, consider the following timing diagram:
The horizontal axis denotes time in video fields, or roughly 1/60 of a second for NTSC video. The key point here is that all signals line up in time; that is, timecode starts at the beginning of a frame (System Frame Pulse). External devices, such as tape machines, are aligned with the system reference, as well as digital video playback such as an AVI file run from an AVI-enabled application.
Conformance to this timing requirement is achieved by various means, the most common of which is a master reference signal distributed to all components in the system. This reference is known as blackburst in the video world, so named because it is a composite video signal containing no active video above black level. The "burst" portion of the name refers to the color burst portion of the video signal. Each device connected to the reference must maintain its own synchronization. This means, for example, that:

- A digital video player must switch frames during the vertical blanking interval.
- A tape machine must switch into record mode during the vertical blanking interval.
- Commands sent to external devices through a serial port must be timed to the frame pulse.
- All of these and other synchronized events must occur when the SMPTE timecode hits a predetermined value.

Failure to conform to these rules results in tearing of the video image or edits occurring at the wrong point in time.
Accomplishing all this in the professional video world is relatively straightforward, but in the hybrid world of desktop video, it is very difficult.
Building on the concepts presented so far, the two design examples in the following diagrams illustrate a potential configuration of external devices.
The block diagrams show that it is relatively simple to distribute the reference signal to all of the boxes. To deal with synchronization that takes place within the computer, for example, between the timecode reader and digital video player, it is recommended that either a "vertical drive" hardware interrupt, specialized operating system services, or some other custom solution be used.
Finally, if you intend to write a filter that controls an external device, you should implement the IAMExtDevice and IAMExtTransport interfaces. If your filter reads or generates timecode, you should also implement the IAMTimecodeReader, IAMTimecodeGenerator, and IAMTimecodeDisplay interfaces provided by DirectShow. Additionally, if you need to move binary messages to and from an external device (for example, to download executable code for the external device's microprocessor to execute), you should do this by implementing the COM IDataObject interface, which has a complete set of methods for handling binary data transfers. Use this interface for whatever custom data transfer purposes your filter needs.
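As a starting point, a device control filter built on the DirectShow base-class library might expose these interfaces as follows. This is a minimal skeleton under the assumption that the filter needs no pins; CVcrControlFilter and CLSID_MyVcrControl are placeholders, and all IAMExtDevice and IAMExtTransport method bodies are omitted:

    #include <streams.h>   // DirectShow base classes

    class CVcrControlFilter : public CBaseFilter,
                              public IAMExtDevice,
                              public IAMExtTransport
    {
    public:
        DECLARE_IUNKNOWN

        CVcrControlFilter(LPUNKNOWN pUnk, HRESULT *phr)
            : CBaseFilter(NAME("VCR Control"), pUnk, &m_Lock,
                          CLSID_MyVcrControl)   // placeholder GUID
        {}

        // Hand out the device control interfaces.
        STDMETHODIMP NonDelegatingQueryInterface(REFIID riid, void **ppv)
        {
            if (riid == IID_IAMExtDevice)
                return GetInterface((IAMExtDevice *)this, ppv);
            if (riid == IID_IAMExtTransport)
                return GetInterface((IAMExtTransport *)this, ppv);
            return CBaseFilter::NonDelegatingQueryInterface(riid, ppv);
        }

        // A pure device control filter exposes no pins.
        int GetPinCount() { return 0; }
        CBasePin *GetPin(int n) { return NULL; }

        // IAMExtDevice and IAMExtTransport method implementations, which
        // talk to the device (for example, over a serial port), go here.

    private:
        CCritSec m_Lock;   // filter lock required by CBaseFilter
    };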
For sample code that demonstrates how to implement an external device control filter, see the Samples\Multimedia\DShow\Src\Vcrctrl folder in the DirectX Media SDK.
For additional information on SMPTE timecode and external device control, refer to the SMPTE documentation, including the ANSI/SMPTE 12-1986 time and control code standard.
SMPTE standards and reprints are available from SMPTE by calling (914) 761-1100.