2.3.4 Media Access

The Telephony SPI line and phone device classes provide only control operations for these devices. However, phone devices and calls on line devices can carry media streams, i.e., information streams (voice, data, video, etc.). The Windows Telephony SPI does not provide access to these media streams. Instead, the TAPI DLL must use other Windows APIs (such as the waveform API or the higher-level MCI interface) to provide this access or otherwise manage these media streams. The Telephony SPI is used to establish calls independently of the call's media mode (e.g., voice, data, etc.).

For example, for line devices, the TAPI DLL first uses the Telephony SPI to establish a connection to another station. Once a call is established, the TAPI DLL or its client can then use the waveform API (or the MCI waveaudio API) on the associated wave device (or MCI waveaudio device) to play back (send) and record (receive) audio data over the connection. Similarly, if the connection media stream is from a fax or data modem, the TAPI DLL's client would use a fax API or data modem API (as appropriate) to deal with the media stream.

To provide both Telephony API functions and media stream access to a phone or to a call on a line device, the service provider for the phone or line must implement both the Telephony SPI and the appropriate media stream SPI. The physical device then simultaneously supports multiple device classes, e.g., the line device class and the wave device class. Because these APIs are orthogonal, there is little coupling at the API level between applications' use of them. Multiple applications that share calls and media streams in nontrivial ways will likely need to coordinate their activities at the application level to prevent conflicting use of the Telephony API and the media stream API.

A typical media-oriented client thus uses Telephony API functions to control the call, followed by media stream API functions to manipulate the media stream, followed by further Telephony API functions to shut down the call. A service provider that supports this usage implements both orthogonal APIs simultaneously, as outlined above. Orthogonal device-related APIs under Windows also have orthogonal "name spaces" for identifying devices. The client application therefore needs to relate the Telephony device identification to the corresponding device identification within the appropriate media device class. The TSPI includes functions to retrieve device-class-specific device identification for a given line, address, call, or phone device. The client uses these functions to obtain, for example, a device identification that can be opened under a "fax" API to transfer a fax over a given call.
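The mapping between name spaces described above can be pictured as a per-line lookup table. The following is only a minimal sketch of that idea, not the TSPI's actual interface: the types `ClassMapping` and `LineDevice`, the function `line_get_class_id`, and the class-name strings are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: the ID one physical device has within another
 * device class's name space. Class names here are illustrative only. */
typedef struct {
    const char *device_class;  /* e.g. "wave/in", "wave/out" */
    unsigned    device_id;     /* meaningful only to that class's API */
} ClassMapping;

/* Hypothetical record a service provider might keep per line device. */
typedef struct {
    unsigned            line_id;        /* ID in the telephony name space */
    const ClassMapping *mappings;       /* IDs in other name spaces */
    size_t              mapping_count;
} LineDevice;

/* Look up the ID that `line` has within `device_class`.
 * Returns 0 and stores the ID in *out_id on success; returns -1 if the
 * line does not support that device class (*out_id is left unchanged). */
int line_get_class_id(const LineDevice *line, const char *device_class,
                      unsigned *out_id)
{
    assert(line != NULL && device_class != NULL && out_id != NULL);
    for (size_t i = 0; i < line->mapping_count; i++) {
        if (strcmp(line->mappings[i].device_class, device_class) == 0) {
            *out_id = line->mappings[i].device_id;
            return 0;
        }
    }
    return -1;
}
```

A client would pass the returned ID to the open function of the corresponding media API. A failed lookup indicates that the device offers no media access of that class.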

The Telephony SPI reports media stream type changes to the TAPI DLL when requested. This process is sometimes referred to as call classification. The mechanism used to determine the type of a media stream (e.g., voice, fax, data modem, etc.) is service provider specific. A service provider may filter the media stream for energy or tones that characterize the media type. Alternatively, the determination may be based on information exchanged in messages over the network, on distinctive ringing, on knowledge of the caller ID or called ID, etc.
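The "when requested" behavior can be illustrated with a small sketch: the provider tracks the current media type of a call and reports a change only if the new type is among those the TAPI DLL asked to be told about. The `MEDIA_*` constants, the `CallMediaState` type, and `call_media_detected` are hypothetical names with arbitrary values, not the TSPI's own definitions. How the provider arrives at `detected_mode` (tone filtering, network messages, and so on) is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical media type flags, one bit per type. */
enum {
    MEDIA_UNKNOWN   = 1u << 0,
    MEDIA_VOICE     = 1u << 1,
    MEDIA_FAX       = 1u << 2,
    MEDIA_DATAMODEM = 1u << 3
};

/* Hypothetical per-call classification state kept by the provider. */
typedef struct {
    unsigned current_mode;     /* exactly one MEDIA_* bit */
    unsigned monitored_modes;  /* OR of the MEDIA_* bits to be reported */
} CallMediaState;

/* Called by the provider's detection logic once it has classified the
 * stream. Always records the new mode. Returns the mode to report to the
 * TAPI DLL, or 0 if the mode did not change or was not requested. */
unsigned call_media_detected(CallMediaState *call, unsigned detected_mode)
{
    assert(call != NULL);
    if (detected_mode == call->current_mode)
        return 0;                          /* no change, nothing to report */
    call->current_mode = detected_mode;
    if ((call->monitored_modes & detected_mode) == 0)
        return 0;                          /* change was not requested */
    return detected_mode;
}
```

Under these assumptions, a call monitored for fax and data modem would record a transition to voice without reporting it, and would report a later transition to fax exactly once.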

The Telephony SPI also provides limited support for controlling the media stream that may exist on a call. This support is intended to avoid latency problems that may arise in client/server configurations where the application would otherwise be forced to go through the stream's media API. The TAPI DLL can request actions on a call's media stream that are triggered by events normally reported via the Telephony SPI, such as the detection of a tone or DTMF digit, the transition of the call to a specified call state, etc. For example, the TAPI DLL can request that a call's media stream be suspended when a # DTMF digit is detected on the call, and that the media stream be resumed when a * DTMF digit is detected on the call. Note that some implementations or configurations will be unable to provide any media control functions or media access to the phone or line. Providing media control is optional for the service provider; it should provide the largest performance benefit for client/server implementations. Because it is optional and provides only limited control, its use is generally discouraged. If at all possible, applications should use the stream's media API instead.
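The # and * example can be sketched as a trigger table that the provider consults whenever it detects a digit, so the action is taken locally without a round trip to the application. This is an illustrative model only: `MediaAction`, `DigitTrigger`, `CallMediaControl`, and `media_control_on_digit` are hypothetical names, and the real media stream is represented here by a simple `paused` flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical actions a provider can apply to a call's media stream. */
typedef enum { ACTION_NONE, ACTION_PAUSE, ACTION_RESUME } MediaAction;

/* Hypothetical trigger: perform `action` when `digit` is detected. */
typedef struct {
    char        digit;
    MediaAction action;
} DigitTrigger;

/* Hypothetical per-call media control state. */
typedef struct {
    const DigitTrigger *triggers;       /* table requested by the TAPI DLL */
    size_t              trigger_count;
    int                 paused;         /* 1 while the stream is suspended */
} CallMediaControl;

/* Apply the first trigger registered for `digit`, if any.
 * Returns the action taken, or ACTION_NONE if no trigger matched. */
MediaAction media_control_on_digit(CallMediaControl *call, char digit)
{
    assert(call != NULL);
    for (size_t i = 0; i < call->trigger_count; i++) {
        if (call->triggers[i].digit != digit)
            continue;
        switch (call->triggers[i].action) {
        case ACTION_PAUSE:  call->paused = 1; break;
        case ACTION_RESUME: call->paused = 0; break;
        case ACTION_NONE:   break;
        }
        return call->triggers[i].action;
    }
    return ACTION_NONE;
}
```

With a table of `{'#', ACTION_PAUSE}` and `{'*', ACTION_RESUME}`, detecting # suspends the stream, detecting * resumes it, and any other digit leaves it untouched.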