Windows NT 5.0 Brings You New Telephony Development Features with TAPI 3.0--MSJ, November 1998

Download Nov98Tapi3.exe (13KB)

Michelle Quinton is the development lead for TAPI 3.0. She often can be seen walking her two Dalmatians around Microsoft’s campus.

TAPI 3.0, the next version of Microsoft's telephony API, is scheduled to be released with Windows NT® 5.0. It differs from TAPI 2.1 in several ways. First, it is a set of COM interfaces, rather than a procedural C API. This allows developers to write TAPI 3.0 programs in Visual Basic® and Java, as well as C/C++. Second, it adds media control to the API, so you can handle the recording and playback of voice messages. Finally, TAPI 3.0 has added support for IP, which is becoming increasingly important in the telephony world. This article will describe these new features, and includes a short TAPI 3.0 sample program.
      To try TAPI 3.0 out, download the beta of Windows NT 5.0 from http://www.microsoft.com/NTServer/Basics/Future/WindowsNT5/default.asp. As with all betas, TAPI 3.0 is subject to change until it ships with Windows NT 5.0.
TAPI 3.0 Architecture
       Figure 1 shows the TAPI 3.0 architecture in detail. The boxes in green—Tapi32.dll, Tapisrv.exe, and Unimodem—represent telephony components that already exist in Windows® 98 and Windows NT 4.0.
      Tapi32.dll exports all the TAPI 2.1 functions, and is loaded by any application using TAPI 2.x. Tapisrv.exe, the core of Windows telephony, is an executable on Windows 98 and a service process on Windows NT. Tapi32.dll has a private RPC interface to Tapisrv.exe for handling all telephony requests. Tapisrv.exe processes requests and forwards them to TAPI service providers (TSPs) if necessary.
      TSPs are analogous to device drivers; they handle hardware-specific communication. Unimodem is a TSP provided by Microsoft to support any type of modem. Telephony hardware vendors can write their own TSPs that plug in under Tapisrv.exe to support their hardware.
      The yellow boxes in Figure 1 represent new telephony components planned for Windows NT 5.0. These are Tapi3.dll, three sets of COM interfaces (Call Control, Media Control, and Directory Control), two new TSPs to support IP telephony (H.323 and IP Multicast), three Media Stream Providers (MSPs), the Terminal Manager, and DirectShow™.
      Tapi3.dll, the core of TAPI 3.0, communicates with Tapisrv.exe through the same private RPC interface used by Tapi32.dll. In fact, Tapisrv.exe does not know the difference between Tapi32.dll and Tapi3.dll. TAPI 3.0 ensures backward compatibility with existing TAPI 2.x service providers by using the same abstraction in the interface to Tapisrv.exe.
      MSPs provide media streaming on calls handled through TAPI. TAPI 3.0 provides a uniform API for call control and media control. It then forwards all call control requests to the TSP and all media control requests to the MSP. The MSP architecture is similar to the TAPI 2.x service provider UI architecture: it is an extension to a TSP that is loaded in the application's process. The TSP tells TAPI, through a new Telephony Service Provider Interface (TSPI) function, that it has a corresponding MSP. Tapi3.dll instantiates the specified MSP. TAPI 3.0 also provides a way for the MSP and TSP to communicate with each other throughout the telephony session.
      The Terminal Manager (Termmgr.dll), another new component planned for TAPI 3.0, is a helper component for MSPs. It uses DirectShow to find all multimedia devices present on a computer, then creates TAPI 3.0 Terminal objects that correspond to these devices.
New Features of TAPI 3.0
      TAPI traditionally had a first-party call model. This means that a call in TAPI represents the endpoints of a connection. For example, if person A called person B, this would be represented in TAPI as two calls—the endpoint of the call at person A and the endpoint of the call at person B. Taking this further, a conference with n participants would be represented by (at least) n calls. In the TAPI 3.0 object model, the Call object represents this first-party view.
      TAPI 3.0 adds a third-party call model alongside the first-party call model. In a third-party call model, a telephony application gets an overview of all the endpoints involved in a connection. The CallHub object represents this view in the TAPI 3.0 object model. If there is a call from person A to person B, there would be one CallHub object representing that connection, as well as two Call objects, one for each endpoint. In a conference with n participants, there would be one CallHub object representing the conference and n Call objects representing the endpoints.
      Address types represent the types of dialable strings that the TSP can accept when dialing a call. In previous versions of TAPI, all dialable strings were assumed to be phone numbers. With IP telephony, however, TSPs may support other types of dialable formats such as an IP address, email name, or machine name. TAPI 3.0 defines five address types, shown in Figure 2.
      You can query a TSP for the address types that it supports. Also, since a TSP can support multiple address types, an application must specify the address type of the dialable string that it is using. The CreateCall method on the ITAddress interface, for example, takes both:
HRESULT CreateCall( BSTR pDestAddress, long lAddressType, ITBasicCallControl ** ppCreatedCall );
When the application calls this method, it supplies both the destination address to call and the address type that describes the format of the destination address. Any method in TAPI 3.0 that takes a destination address string as a parameter also takes an address type.
      Applications need to know what protocols a device supports, and the API provides a method for discovering this. Each TAPI line device supports a single protocol. Figure 2 shows the protocols that are currently defined for TAPI 3.0: PSTN (Public Switched Telephone Network), H.323, and multicast conferencing. If an application is only interested in H.323, it can easily find the device to use.
      TAPI 3.0 allows a call to support multiple media modes simultaneously. In previous versions of TAPI, a call could only have one media mode, and both TSPs and applications used the LINEMEDIAMODE_XXX constants. In TAPI 3.0, applications use the new TAPIMEDIAMODE_XXX constants, and a TSP can specify that a call has more than one media mode. For example, if a call has both audio and video on it, the TSP can report both LINEMEDIAMODE_INTERACTIVEVOICE and LINEMEDIAMODE_VIDEO in the LINECALLINFO structure.
      Call center support was introduced with TAPI 2.0, and has been greatly enhanced in version 3 with standard call center features such as agents, sessions, groups, and queues. I won't be discussing the call center features in this article. Please refer to the TAPI 3.0 documentation in MSDN™ for further information.
Media Control
      One of the major enhancements in TAPI 3.0 is the addition of media control, which provides a way for TSPs to give applications control over the media stream. In previous versions of TAPI, a TSP that supported media control would usually implement a wave device that allowed access to its media stream. TAPI provided a generic way for the application to retrieve the wave device ID, but after that it was up to the application to handle the media control using the multimedia API.
      Although this worked, the application had to know that the TSP had a wave device. And if a new media streaming model came along, TSPs couldn't use it until applications had been updated to work with the new model. Of course, applications would never get updated until some TSP supported this new model. This architecture led to applications working well with one or a very few TSPs, which defeats the purpose of having an API to abstract functionality.
      For these reasons, Microsoft decided to incorporate media streaming into TAPI 3.0 via the Terminal object. Microsoft abstracted common telephony media streaming tasks and created Terminal interfaces for these tasks. As COM interfaces, these represent a contract between the media streaming implementation and the application, so media control in applications will work across different TSPs. The abstract layer remains the same, even if the media streaming tasks are implemented differently under the surface.
      Usually TSPs and MSPs are matched pairs, but what about all those TSPs that include wave devices? Those will work with TAPI 3.0 as well. During TAPI 3.0 initialization, it queries the TSP for a wave device. If one exists, TAPI 3.0 will use a special MSP called the WaveMSP for that TSP. The WaveMSP wraps the wave device to fit into the TAPI 3.0 object model.
The COM Interfaces
      TAPI 3.0 is the first version of TAPI to use COM. Previous versions of TAPI passed information through variable-length structures, which were problematic for programmers using languages such as Visual Basic. A COM interface makes TAPI available to Visual Basic, VBScript, Java, and JScript®, as well as C and C++. As you will see, TAPI 3.0 was designed for maximum automation compatibility.
      First, TAPI 3.0 uses automation-compatible variables. You will see types like VARIANT_BOOLs and BSTRs, but no structures. The idea that TAPI no longer has structures will seem rather odd to anyone who has written a TAPI 2.x application. To see how this was achieved, take a look at the ITAddressCapabilities interface.
      Another way that TAPI 3.0 ensures automation compatibility is through parallel methods—one method for C/C++ and one method for scripting languages. This was necessary in a few places, most commonly with enumerations. Any place where something is enumerated, there is a parallel method for retrieving a Collection object, which anyone who develops in Visual Basic knows and loves.
      Finally, TAPI 3.0 ensures automation compatibility by using the COM connection point callback scheme. All TAPI events are handled through a single callback interface. Microsoft is considering an option in future versions of TAPI that will let applications wait on events rather than use callbacks.
TAPI 3.0 Object Model
       Figure 3 shows the core TAPI 3.0 object model. There are five core objects: TAPI, Address, Call, Terminal, and CallHub. There are also additional objects for the call center API. The TAPI object is the entry point into TAPI 3.0. It is created through CoCreateInstance. From the TAPI object, an application can enumerate Address objects. The Address object basically corresponds to the TAPI 2.x line device, which is what you use to make and receive calls. The application can find out the capabilities of each Address object to decide which ones it wants to use. For example, an application may only be interested in Address objects that support the H.323 protocol. In this case, the application would enumerate the Address objects from the TAPI object, then query each Address object for its protocol.

Figure 3 TAPI 3.0 Core Objects

      Address objects own Call objects. Call objects correspond directly to call handles in previous versions of TAPI. The Call object also is the first-party view of a connection, as described earlier. Call objects are either created by the application or generated by the TSP. Generally, an application creates outgoing calls and the TSP creates incoming calls.
      Address objects also own Terminal objects. Terminals let the application select what media and media devices to use on a call. On a computer with a sound card, there would be Terminal objects that correspond to the sound card's microphone and speakers. The application can select those terminals on a call to indicate that it wants those devices to be the source and sink of media on that call.
      Note that while Terminal objects are similar in concept to terminals in previous versions of TAPI, they are not derived from those terminals. So if a TSP supports the TAPI 2.x concept of a terminal, this is not exposed in TAPI 3.0 as a Terminal object.
TAPI Object Interfaces
      The TAPI object's default interface is the ITTAPI interface. The Initialize and Shutdown methods on this interface, as the names suggest, initialize and shut down the TAPI session. Most TAPI 3.0 applications will begin by calling CoCreateInstance to create the TAPI object, then immediately call ITTAPI::Initialize to start the TAPI session.
      No other TAPI 3.0 methods can be called before Initialize, which does the equivalent of the TAPI 2.x lineInitializeEx. It also creates all the Address objects, and discovers and creates all the available Terminal objects. As the TAPI object owns Address and CallHub objects, the ITTAPI interface has methods to enumerate these objects.
      The Shutdown method tells TAPI that the application is completely finished using TAPI. Before calling Shutdown, the application must make sure it has released references to all other TAPI objects. After Shutdown, no other TAPI methods may be called and an application should release its reference to the TAPI object.
      ITTAPI also has methods to register for assisted telephony and application priority. These features are similar to those found in previous versions of TAPI, so I won't discuss them here.
      An application uses the TAPI object to register for events. Events in TAPI 3.0 are fired through the standard COM connection point mechanism. TAPI 3.0 defines a single interface, ITTAPIEventNotification, that the application must implement and register to receive TAPI events. The TAPI object supports the IConnectionPointContainer interface, and the application follows the standard connection point registration mechanism to register the ITTAPIEventNotification interface.
      Once the interface is registered, the application will receive general TAPI 3.0 events, such as addresses being created and removed and addresses going in and out of service. Additionally, the application can register with TAPI 3.0 to receive events related to calls. ITTAPI::RegisterCallNotifications is used to register for call events.
HRESULT RegisterCallNotifications(
    [in] ITAddress * pAddress,
    [in] VARIANT_BOOL fMonitor,
    [in] VARIANT_BOOL fOwner,
    [in] long lMediaTypes,
    [in] long lCallbackInstance,
    [out, retval] long * plRegister
    );
RegisterCallNotifications is similar to lineOpen in TAPI 2.x. In this method, pAddress specifies the Address object on which the application wants call-related events reported. Remember that Address objects own Call objects, so applications can't be informed of a call unless they are listening on an address. If the application wants to receive call events on more than one address, it must call this method for each address.
      fMonitor and fOwner specify whether the application wants to monitor or own incoming calls. Even if these are both VARIANT_FALSE, the application will receive call events about outgoing calls it makes on that address. If fMonitor is VARIANT_TRUE, the application will receive call events about all calls on that address, but will not own any of the calls. If fOwner is VARIANT_TRUE, the application will receive call events only about calls that it owns. This also indicates to TAPI that it wants to own incoming calls. Both fMonitor and fOwner may be VARIANT_TRUE at the same time. This tells TAPI that the application wants to own incoming calls, and it wants to see events about any call on that address, whether or not it owns the call.
      lMediaTypes specifies the media modes of calls in which the application is interested. This parameter is only relevant when fOwner is VARIANT_TRUE. Basically, it tells TAPI that the application is only interested in owning calls of the specified media type. The TAPI 3.0 media types are listed in Figure 2.
      lCallbackInstance is an application-defined value that is returned to the application on any event that is fired as a result of this call to RegisterCallNotifications. An application can call RegisterCallNotifications multiple times for the same Address object, and use this value to distinguish between events.
      TAPI 3.0 returns a unique registration value in plRegister. The application uses this value to stop receiving call events by passing it as the parameter to the ITTAPI::UnregisterNotification method.
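      The fMonitor and fOwner rules described above can be summarized as a small decision function. The following is a portable sketch that models those rules in plain C++; it is not TAPI code. CallView and receivesCallEvents are hypothetical names, and the sketch assumes that any call the application created is also owned by it.

```cpp
#include <cassert>

// Hypothetical model of one call as seen by one application.
struct CallView {
    bool ownedByApp;    // the application owns this call
    bool createdByApp;  // outgoing call that the application made itself
};

// Returns true if the application would see call events for this call,
// following the fMonitor/fOwner rules described in the text.
bool receivesCallEvents(bool fMonitor, bool fOwner, const CallView& call) {
    // Assumption of this sketch: a call the application created is owned by it.
    assert(!call.createdByApp || call.ownedByApp);

    if (call.createdByApp) return true;  // its own outgoing calls are always reported
    if (fMonitor) return true;           // monitor: every call on the address
    if (fOwner) return call.ownedByApp;  // owner only: just the calls it owns
    return false;                        // neither flag: nothing else is reported
}
```

      With both flags set, the monitor rule wins and the application sees every call on the address, which matches the behavior described for registering as both monitor and owner.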
Address Object Interfaces
      The default interface for the Address object is the ITAddress interface. From this interface, the application can perform the most common operations on the Address object. It can enumerate existing calls owned by the Address; retrieve the current state, name, and parent TAPI object of the address; and obtain the name of the TSP that owns this address. Most importantly, an application can create an outgoing call with the CreateCall method on this interface:
HRESULT CreateCall( [in] BSTR pDestAddress, [in] long lAddressType, [out, retval] ITBasicCallControl ** ppCall );
pDestAddress is the destination address string, such as a phone number or an email address. lAddressType is the address type of pDestAddress. So if pDestAddress is in the format of a phone number, lAddressType will be LINEADDRESSTYPE_PHONENUMBER. The new Call object is returned in ppCall as a pointer to its ITBasicCallControl interface.
      The CreateCall method simply creates a Call object. After the call is created, the application still needs to set up and connect the call before a connection is actually made. I will discuss this further in the Call object overview.
      The Address object also supports the ITAddressCapabilities interface, which is used to obtain detailed information on the capabilities of the address. The two main methods on the ITAddressCapabilities interface are get_AddressCapability and get_AddressCapabilityString.
HRESULT get_AddressCapability(
    [in] ADDRESS_CAPABILITY AddressCap,
    [out, retval] long * plCapability
    );
HRESULT get_AddressCapabilityString(
    [in] ADDRESS_CAPABILITY_STRING AddressCapString,
    [out, retval] BSTR * ppCapabilityString
    );
      ADDRESS_CAPABILITY and ADDRESS_CAPABILITY_STRING are enums that specify which capabilities the application is interested in querying. An example of an ADDRESS_CAPABILITY is AC_ADDRESSTYPES, which requests the address types supported by the Address object. An example of an ADDRESS_CAPABILITY_STRING is ACS_PROTOCOL, which requests the protocol that the Address object supports. The protocol is a GUID, but it is passed in the interface in string format. Many capabilities can be queried through these two methods.
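      To illustrate the "one enum key in, one value out" style that replaces the TAPI 2.x structures, here is a minimal sketch in plain C++ that filters addresses by protocol string. It only models the idea. StringCap, AddressModel, and addressesWithProtocol are hypothetical names, and the protocol strings used with it are placeholders rather than real GUIDs.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for ADDRESS_CAPABILITY_STRING values.
enum class StringCap { Protocol, ProviderName };

// Hypothetical model of an Address object and its string capabilities.
struct AddressModel {
    std::string name;
    std::map<StringCap, std::string> stringCaps;

    // Mirrors the shape of get_AddressCapabilityString: one key in, one string out.
    // Returns false when the address does not report that capability.
    bool getCapabilityString(StringCap key, std::string& out) const {
        auto it = stringCaps.find(key);
        if (it == stringCaps.end()) return false;
        out = it->second;
        return true;
    }
};

// Returns the names of all addresses whose protocol string matches exactly.
std::vector<std::string> addressesWithProtocol(const std::vector<AddressModel>& addrs,
                                               const std::string& protocolGuid) {
    assert(!protocolGuid.empty());
    std::vector<std::string> names;
    for (const auto& a : addrs) {
        std::string proto;
        if (a.getCapabilityString(StringCap::Protocol, proto) && proto == protocolGuid)
            names.push_back(a.name);
    }
    return names;
}
```

      The comparison here is an exact string match. An application comparing real GUID strings may need to normalize case or formatting first.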
      ITMediaSupport, another Address object interface, is used to describe the media supported by the address. The application can obtain the TAPIMEDIAMODEs supported, and learn whether the Address object supports media streaming through DirectShow.
      Finally, the Address object has the ITTerminalSupport interface, which lets applications find out which terminals can be used on calls that are owned by this address.
Terminal Object Interfaces
      Terminals represent the source or sink of media at the endpoint of a call. In an interactive voice call, common terminals might be a handset or a microphone and speakers. You select terminals on calls to tell TAPI how to handle the media on the call.
      There are two types of terminals. A static terminal represents a hardware device that is present on the computer such as a microphone or speakers. A static terminal only has one instance, and normally there will be device contention if the application tries to use the same static terminal on two calls.
      The second type of terminal is a dynamic terminal. This type of terminal can be instantiated, and there can be multiple instances at one time. Examples of this are file terminals, DTMF (Dual Tone Multi-Frequency) detection and generation, and video window terminals.
      TAPI 3.0 also defines a set of terminal classes that helps the application determine what the terminal actually represents. Examples of terminal classes are a handset, headset, microphone, or video capture device. Figure 2 has the complete list of defined terminal classes.
      TAPI 3.0 creates Terminal objects in several ways. If an address has an MSP, TAPI asks the MSP for the terminals that it supports. This is where the Terminal Manager comes into play. If the MSP supports streaming through DirectShow, it can use the Terminal objects created by the Terminal Manager rather than create its own terminals. The MSP can create its own terminals if it wants, as long as they conform to the terminal interfaces defined by TAPI 3.0. If the address has a corresponding TAPI 2.x phone device, TAPI 3.0 will create Terminal objects that represent the capabilities of the phone device.
      If there is no corresponding MSP, TAPI will create a dummy terminal that represents whatever media is available on the address. An address with one of these terminals usually is simply used for call control; no media control is available programmatically. The dummy terminal is created to complete the object model and ensure that there is at least one terminal that can be selected on a call.
      TAPI 3.0 creates some terminals that represent media control available in the TSP. While TAPI 2.x is primarily call control, there is some support for media control. The primary example of this is DTMF detection and generation. If the TSP supports DTMF detection or generation, TAPI 3.0 will create terminals that expose those capabilities.
      It is important to note that some terminals are instantiated across more than one Address object. An example is a TSP with two TAPI line devices that correspond to two TAPI 3.0 Address objects. In this scenario, the TSP has a corresponding MSP that is DirectShow-based and can stream to and from the computer's sound card. The sound card corresponds to two terminal devices: a microphone and speakers. But in this case there would be an instance of the microphone terminal for each of the two addresses and an instance of the speakers terminal for each of the two addresses. The two instances of the terminals would refer to the same physical device, but they appear as separate Terminal objects to TAPI and the application.
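      The two-line example above can be modeled in a few lines of plain C++. This is only a sketch of the relationship between addresses, Terminal objects, and physical devices. TerminalModel, buildTerminals, and sharesDevice are hypothetical names, not part of TAPI.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a Terminal object: owned by one address,
// backed by one physical device.
struct TerminalModel {
    std::string address;  // owning Address object
    std::string device;   // underlying physical device
};

// Builds one terminal per (address, device) pair, as in the example in the text.
std::vector<TerminalModel> buildTerminals(const std::vector<std::string>& addresses,
                                          const std::vector<std::string>& devices) {
    std::vector<TerminalModel> terminals;
    for (const auto& a : addresses)
        for (const auto& d : devices)
            terminals.push_back({a, d});
    return terminals;
}

// Two distinct Terminal objects refer to the same hardware when their devices match.
bool sharesDevice(const TerminalModel& x, const TerminalModel& y) {
    assert(&x != &y);  // the question only makes sense for two different objects
    return x.device == y.device;
}
```

      Two addresses and two devices yield four Terminal objects. The two microphone terminals are separate objects but share one device, which is where the contention described for static terminals would arise.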
      The default interface on the Terminal object is the ITTerminal interface. This interface is used to discover the basic properties of the terminal: the terminal type and class; the media type, direction, and current state; and the terminal's displayable name.
      Like the Address object, the Terminal object supports the ITMediaSupport interface. The application uses it to find out which TAPIMEDIAMODEs the terminal supports and whether the terminal uses DirectShow for its media streaming.
      There are several other interfaces currently defined for terminals. The interfaces supported depend on the class of terminal. For example, TAPI 3.0 defines the ITBasicAudioTerminal that must be implemented by any terminal that supports audio. Any terminal that supports video must implement the IBasicVideo interface, which is defined by DirectShow. These interfaces provide a standard way for applications to manipulate the media stream on a call. They abstract the actual management of the stream from the application, so an application that controls media will be much easier to write.
      There are many more terminal interfaces planned for TAPI. These include file, text-to-speech, and speech recognition terminals. These additional terminal interfaces may not be available when TAPI 3.0 ships in Windows NT 5.0, and will most likely appear in a future release.
      The Terminal Manager implements another dynamic terminal called the Media Streaming terminal. Basically, this terminal lets an application read and write directly to the media stream. The application can access the actual buffers that the MSP is using for streaming. This terminal should be used by applications with media streaming needs that are not met by the standard terminals defined by TAPI 3.0. For more information on the Media Streaming terminal, take a look at the TAPI 3.0 sample answering machine program in MSDN.
Call Object Interfaces
      The Call object represents the first-party view of a call, or the endpoint of a call or connection. The Call object corresponds directly with a call handle in previous versions of TAPI, and is used by applications for almost all control-related functions.
      The default interface for the Call object is ITBasicCallControl. As the name suggests, this interface performs most of the call control. It has methods to connect, disconnect, answer, hold, conference, transfer, and more. It is also used to select and unselect terminals on the call.
      When making or answering any call, the application must select at least one terminal on the Call object before the call can be connected. Selecting a terminal is important because it tells TAPI and the TSP how the application wants the media stream on the call to be set up. For an outgoing call, the simplest sequence of events would be:
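// Get the default terminal for the desired media type and direction.
pAddressTerminalSupport->GetDefaultTerminal( lMediaType, dir, &pTerminal );
// Create the Call object; nothing is dialed yet.
pAddress->CreateCall( pDestAddress, lAddressType, &pCall );
// Select the terminal so TAPI knows how to set up the media stream.
pCall->SelectTerminal( pTerminal );
// Start connecting; VARIANT_FALSE returns without waiting for the connection.
pCall->Connect( VARIANT_FALSE );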
      The Call object also supports the ITCallInfo interface. As the Call object corresponds directly to a TAPI 2.x call handle, this interface provides methods to access fields in the related TAPI 2.x LINECALLINFO structure. It also lets the application set the fields in the related TAPI 2.x LINECALLPARAMS structure, which is used when setting up an outgoing call. For example, this interface has a method called put_BearerMode that lets the application set the desired LINEBEARERMODE_XXX before making a call. The method get_BearerMode retrieves the LINEBEARERMODE_XXX being used on the call. There are about 50 methods on ITCallInfo, so I won't cover all of them here. But if you are familiar with TAPI 2.x and are looking for something that you found previously in LINECALLINFO or LINECALLPARAMS, ITCallInfo is the place to look.
      To place an outgoing call you create a call, select terminals, then call Connect. The Connect method on ITBasicCallControl occasionally causes confusion because of the parameter it uses. The method looks like this:
HRESULT Connect( [in] VARIANT_BOOL fSync );
fSync tells TAPI 3.0 when the application wants the method to return. It either can return directly after the call request is made, or it can wait until the call is in the CS_CONNECTED state. If fSync is VARIANT_TRUE, Connect will return when the call is connected, disconnected, or times out. Setting fSync to VARIANT_TRUE should only be done in very simple applications that don't want to register a callback to wait for a state change. Almost all applications should set fSync to VARIANT_FALSE and monitor the call state of the call through the event mechanism.
CallHub Object Interfaces
      The CallHub object represents the third-party view of a call—an overview of all the connections. In TAPI 3.0, it is viewed as a collection of Call objects, or the "hub" of a bunch of calls. While TAPI's native view of calls is still the Call object, which is first-party-based, Microsoft is working on adding more third-party call support in TAPI; CallHub is the first step.
      The only interface on the CallHub object is ITCallHub. This interface can be used to enumerate the Call objects in the CallHub, to get the current state of the CallHub, or to clear the CallHub. Clearing the CallHub corresponds to disconnecting all the calls in the CallHub.
      As you can see, an application can't do a lot with the CallHub object. It's mainly useful for tracking. An application that wants to know all the parties in a conference or wants to follow a connection around through various conferences and transfers would use the CallHub object.
      In the CallHub implementation, the TSP has a handle (a first-party view TAPI 2.x call handle) for every call on a line device that it owns. It is the TSP's responsibility to keep track of calls and information about them. It does this by means of a LINECALLINFO structure that is associated with each call. When an application requests a LINECALLINFO structure, TAPI passes that request on to the TSP, and the TSP fills in most of the fields of that structure.
      One of the fields in LINECALLINFO is dwCallID. TAPI uses the dwCallID field to create CallHubs. If the dwCallID field is not zero, TAPI will use that ID as a CallHub identifier. Any calls from the same TSP with the same dwCallID field will be put into the same CallHub. So there is little work that the TSP has to do to support this, and no handle swapping between TAPI and the TSP is necessary.
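      The grouping rule just described can be sketched in portable C++. The code below models the idea only and is not how TAPI itself is implemented. CallModel, HubMap, and groupIntoHubs are hypothetical names, and the sketch simply skips calls whose ID is zero because the text does not say how those are handled.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one first-party call reported by a TSP.
struct CallModel {
    std::string provider;  // the TSP that owns the call
    std::uint32_t callId;  // models the dwCallID field of LINECALLINFO
    std::string party;     // label for the endpoint
};

// Key: (provider, nonzero call ID). Value: the parties grouped into that hub.
using HubMap = std::map<std::pair<std::string, std::uint32_t>, std::vector<std::string>>;

// Puts calls from the same provider with the same nonzero ID into the same hub.
HubMap groupIntoHubs(const std::vector<CallModel>& calls) {
    HubMap hubs;
    for (const auto& c : calls) {
        assert(!c.provider.empty());  // every call belongs to some provider
        if (c.callId == 0) continue;  // no ID to group by; skipped in this sketch
        hubs[{c.provider, c.callId}].push_back(c.party);
    }
    return hubs;
}
```

      Because the provider is part of the key, two TSPs that happen to report the same ID end up in different hubs, matching the "same TSP, same dwCallID" rule.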
A Sample TAPI 3.0 Program
      INCOMING is a sample TAPI 3.0 application that demonstrates how to wait for and answer incoming calls (see Figure 4). As a sample, it makes some assumptions so the code is easier to understand. For example, it assumes that there only will be one call at a time. More than one call will confuse the application, though it would be fairly simple to make it more robust. INCOMING does not let the user pick which terminals to use on a call; it uses default terminals. But it does illustrate most of the major features in TAPI 3.0.
      The InitializeTapi function, as you can guess, initializes TAPI. It calls CoCreateInstance to create a TAPI object, and then it calls Initialize on the returned object. It registers the event callback in the RegisterTapiEventInterface function and starts listening for calls in the ListenOnAddresses function.
      Let's take a closer look at RegisterTapiEventInterface. First it creates a CTAPIEventNotification object. The CTAPIEventNotification object is defined in callnot.h and implemented in callnot.cpp. CTAPIEventNotification implements the ITTAPIEventNotification interface, which is defined as follows:
interface ITTAPIEventNotification : IUnknown
{
    [id(1), helpstring("method Event")]
    HRESULT Event(
        [in] TAPI_EVENT TapiEvent,
        [in] IDispatch * pEvent
        );
}
This single Event method is used to fire all TAPI 3.0 events to the application. callnot.cpp implements Event very simply: it posts a message to the application's UI thread to handle the event. A multithreaded apartment model application should do as little as possible on the thread in which Event is called and should not call back into TAPI 3.0, since this can cause a deadlock. Also note that Event calls AddRef on the event object so that it is not deleted when Event returns.
      OnTapiEvent is the function that eventually gets called to handle TAPI events. The TAPI_EVENT enum defines the events that can be fired. For each TAPI_EVENT, an event interface is defined. The Event object, pEvent, which is passed in Event, supports this corresponding interface. For example, the event TE_CALLNOTIFICATION supports the ITCallNotificationEvent interface.
      In OnTapiEvent, the application only handles the TE_CALLNOTIFICATION and TE_CALLSTATE events. All other events are ignored. Also, notice the call to Release at the end of the function. This corresponds to the call to AddRef made when posting the event to the UI thread.
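      The AddRef/Release pairing between the callback and the UI thread can be shown with a small, single-threaded model. This sketch does not use COM or Windows messages: EventModel, gPending, onEventCallback, and drainPending are hypothetical stand-ins for the event object, the message queue, the Event method, and OnTapiEvent.

```cpp
#include <cassert>
#include <queue>

// Minimal reference-counted object standing in for the COM event object.
class EventModel {
public:
    void AddRef() { ++refs_; }
    void Release() {
        assert(refs_ > 0);
        --refs_;  // a real COM object would delete itself when this reaches zero
    }
    int refs() const { return refs_; }
private:
    int refs_ = 1;  // the caller of the callback holds one reference
};

// Stands in for the UI thread's message queue.
std::queue<EventModel*> gPending;

// Models the Event callback: keep the object alive, queue it, return quickly.
void onEventCallback(EventModel* ev) {
    ev->AddRef();
    gPending.push(ev);
}

// Models OnTapiEvent on the UI thread: handle each event, then drop the
// reference that was taken when the event was queued.
void drainPending() {
    while (!gPending.empty()) {
        EventModel* ev = gPending.front();
        gPending.pop();
        // ... dispatch on the event type here ...
        ev->Release();
    }
}
```

      The extra reference taken in the callback is what keeps the object valid between the moment the caller releases its own reference and the moment the queued event is finally handled.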
      Going back to RegisterTapiEventInterface, you can see that after the CTAPIEventNotification object is created, the application finds the ITTAPIEventNotification connection point and registers the callback object. After this registration, the application calls ListenOnAddresses.
      The ListenOnAddresses function starts by calling gpTapi->EnumerateAddresses. TAPI returns an enumerator of all Address objects present on the system. The application then loops through all the addresses by calling pEnumAddress->Next, and checks to see if the address supports TAPIMEDIAMODE_AUDIO. If it does, the application calls another function, ListenOnThisAddress, to start listening for calls on that Address object.
      Of course, there are many other capabilities that an application may want to query before using the Address object. As discussed previously, the ITAddressCapabilities interface provides lots of information about the address.
      The function ListenOnThisAddress first queries for the ITMediaSupport interface, then obtains all the TAPIMEDIAMODEs supported by the address. ITMediaSupport::get_MediaTypes returns a long, which is actually a bit field of the supported TAPIMEDIAMODEs. That long is then used in RegisterCallNotifications to tell TAPI which media the application would like to listen for.
      I already know that the address supports audio because I checked this in ListenOnAddresses. For this application, I also want to listen for video, if available. I could have specifically checked for video, then called RegisterCallNotifications with TAPIMEDIAMODE_AUDIO|TAPIMEDIAMODE_VIDEO, if it was supported. The application does it this way to demonstrate both ways of determining how to find the supported media modes.
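      The two ways of arriving at the media modes to register can be compared in a small sketch. The bit values below are placeholders chosen for illustration (the real TAPIMEDIAMODE constants come from the TAPI 3.0 headers), and the function names are hypothetical.

```cpp
#include <cassert>

// Placeholder bit values for illustration only; not the real TAPIMEDIAMODE constants.
const long kModeAudio = 0x1;
const long kModeVideo = 0x2;

// True when every bit of `mode` is set in the `supported` bit field.
bool SupportsMode(long supported, long mode) {
    return (supported & mode) == mode;
}

// Approach 1: register for everything the address reports.
long ModesToRegisterAll(long supported) {
    return supported;
}

// Approach 2: start from audio and add video only if the address supports it.
long ModesToRegisterExplicit(long supported) {
    assert(SupportsMode(supported, kModeAudio));  // audio was checked earlier
    long modes = kModeAudio;
    if (SupportsMode(supported, kModeVideo)) {
        modes |= kModeVideo;
    }
    return modes;
}
```

      The first approach is shorter but also registers for any media the application does not know how to handle; the second keeps the registered set limited to modes the application explicitly asked for.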
      Also, notice that the application keeps a global array of registration instances from RegisterCallNotifications in gplRegistrationInstances. As discussed previously, this value is used to stop listening for calls. The implementation in this application is slightly awkward. It keeps an array with no way to map the registration instance back to an address, so there is no way for the application to selectively unregister for notifications. Usually an application would keep this value associated with an address in case it decided to stop listening on that address.
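      One way to keep each registration instance associated with its address is a small lookup table. The sketch below is an assumption about how an application might do this, not part of the INCOMING sample: RegistrationTable and its methods are hypothetical, and addresses are identified by a plain string rather than an Address object.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical bookkeeping: address name -> registration instance.
class RegistrationTable {
public:
    // Records the instance obtained when registering for notifications on an address.
    void Remember(const std::string& address, long instance) {
        byAddress_[address] = instance;
    }

    // Removes the entry for one address and hands back its instance so the
    // caller can unregister it. Returns false if the address was not registered.
    bool Forget(const std::string& address, long* instanceOut) {
        assert(instanceOut != nullptr);
        auto it = byAddress_.find(address);
        if (it == byAddress_.end()) {
            return false;
        }
        *instanceOut = it->second;
        byAddress_.erase(it);
        return true;
    }

    std::size_t Count() const { return byAddress_.size(); }

private:
    std::map<std::string, long> byAddress_;
};
```

      With a table like this, the application can stop listening on a single address and leave the other registrations untouched.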
      After it starts to listen for calls, the application is finished with its initialization and simply waits for something to happen. The user interface (see Figure 5) is very simple. It lets the user answer calls, disconnect calls, and exit. Let's see what happens when a call comes in.

Figure 5 INCOMING UI

      First, TAPI will fire a TE_CALLNOTIFICATION event. As described previously, this event is given to the application in the ITTAPIEventNotification::Event method, and the application immediately posts this event to its UI thread. The OnTapiEvent function eventually gets called, and the application then dispatches this event to the HandleCallNotificationEvent function.
      HandleCallNotificationEvent first queries the Event object, pEvent, for the ITCallNotificationEvent interface. The ITCallNotificationEvent interface is defined as:
interface ITCallNotificationEvent : IDispatch
{
    [propget, id(1), helpstring("property Call")]
    HRESULT Call(
        [out, retval] ITCallInfo ** ppCall
        );

    [propget, id(2), helpstring("property Event")]
    HRESULT Event(
        [out, retval] CALL_NOTIFICATION_EVENT * pCallNotificationEvent
        );

    [propget, id(3), helpstring("property CallbackInstance")]
    HRESULT CallbackInstance(
        [out, retval] long * plCallbackInstance
        );
}
From this interface, the application can obtain the call about which it's being notified, the CALL_NOTIFICATION_EVENT (which tells the application if it's the owner or just a monitor of this call), and the callback instance that was given to TAPI in the call to RegisterCallNotifications.
      The INCOMING application checks the CALL_NOTIFICATION_EVENT to make sure that it is the owner of the call. If it isn't the owner, it ignores the call and returns. It is important to note that at this point, there is no reference to this call. The application has nothing else to clean up related to the call it's ignoring. If the application is the owner of the call, it retrieves the call, saves it in its global pointer, and returns. Actually, using a global variable in this way is bad; if there were already a call there, it would be overwritten. To simplify the demonstration, this application assumes one call at a time.
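      The owner check, and one way around the single global call pointer, can be sketched as follows. Everything here is a hypothetical stand-in: NotificationKind models the owner/monitor distinction carried by CALL_NOTIFICATION_EVENT, FakeCall models a Call object, and the vector replaces the sample's single global so that a second call does not overwrite the first.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the owner/monitor distinction in CALL_NOTIFICATION_EVENT.
enum NotificationKind { kOwner, kMonitor };

// Hypothetical stand-in for a Call object.
struct FakeCall {
    int id;
};

// Holds every call the application owns, instead of one global pointer.
std::vector<FakeCall> g_ownedCalls;

// Returns true if the call was kept, false if the notification was ignored.
bool HandleNotification(NotificationKind kind, const FakeCall& call) {
    if (kind != kOwner) {
        return false;  // a monitor has nothing to keep and nothing to clean up
    }
    g_ownedCalls.push_back(call);
    return true;
}
```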
      At this point, the Answer button in the application has not been enabled because the application hasn't yet received an event that indicates an offering state. This is similar to the LINE_APPNEWCALL message in TAPI 2.x. The notification event lets the application know about the existence of a call, but the application shouldn't do anything with it until it gets a call state message. A notification is always followed immediately by a call state message.
      Next, the application will receive a TE_CALLSTATE event. This is eventually handled in HandleCallStateEvent, which queries the event for the ITCallStateEvent interface and looks like this:
interface ITCallStateEvent : IDispatch
{
    [propget, id(1), helpstring("property Call")]
    HRESULT Call(
        [out, retval] ITCallInfo ** ppCallInfo
        );

    [propget, id(2), helpstring("property State")]
    HRESULT State(
        [out, retval] CALL_STATE * pCallState
        );

    [propget, id(3), helpstring("property Cause")]
    HRESULT Cause(
        [out, retval] CALL_STATE_EVENT_CAUSE * pCEC
        );

    [propget, id(4), helpstring("property CallbackInstance")]
    HRESULT CallbackInstance(
        [out, retval] long * plCallbackInstance
        );
}
The ITCallStateEvent interface gives the application the call, the new call state, the cause for the call state change, and the callback instance. HandleCallStateEvent first checks the new call state, and if it's not CS_OFFERING, CS_DISCONNECTED, or CS_CONNECTED, it ignores it.
      The Answer button finally gets enabled when CS_OFFERING is handled, although the application waits for the user to press the button before actually answering the call. When CS_CONNECTED is handled, the MakeWindowsVisible function is called. By default, video windows are hidden in TAPI 3.0. This gives the application time to place the windows and set their properties before showing them. The best time to set these properties is when the call is connected.
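      The call state handling described above amounts to a small dispatch on the new state. The sketch below models it with hypothetical names (CallState, UiState, HandleCallState); the enum values are stand-ins for the CS_ states, and resetting the UI on disconnect is an assumption, since the cleanup performed by the sample is not detailed here.

```cpp
#include <cassert>

// Hypothetical stand-ins for CALL_STATE values; kInProgress represents any state
// the application does not care about.
enum CallState { kInProgress, kOffering, kConnected, kDisconnected };

// Hypothetical model of the two pieces of UI the text mentions.
struct UiState {
    bool answerEnabled = false;
    bool videoVisible = false;
};

// Returns true if the state was handled, false if it was ignored.
bool HandleCallState(CallState state, UiState& ui) {
    switch (state) {
    case kOffering:
        ui.answerEnabled = true;   // the user may now press Answer
        return true;
    case kConnected:
        ui.videoVisible = true;    // plays the role of MakeWindowsVisible
        return true;
    case kDisconnected:
        ui = UiState();            // assumption: reset the UI on disconnect
        return true;
    default:
        return false;              // all other states are ignored
    }
}
```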
      The user can now press the Answer button. When the button is pressed, the dialog procedure will call AnswerTheCall. This function finds and selects terminals on the call, and then calls the Answer method. As I mentioned while discussing the Call object, terminals must be selected on the call before the call can be answered or connected. So let's look at how the program discovers the terminals to be used.
      The CreateTerminals function takes an Address object and returns an array of terminals to use on the call. First, it tries to find a Terminal object that supports audio rendering by calling GetDefaultTerminal to obtain the default audio render terminal. It then checks the actual direction of the terminal. Some terminals can both render and capture the stream. When asked for a specific direction, TAPI can return a terminal that supports both directions. If the terminal does support both, then there is no need to get the capture terminal as well.
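      The direction check can be illustrated with a short sketch. The names are hypothetical (Direction, FakeTerminal, PickAudioTerminals), and the default terminals are passed in as plain values rather than being obtained from GetDefaultTerminal.

```cpp
#include <cassert>
#include <vector>

// Hypothetical direction flags; a terminal that does both has both bits set.
enum Direction { kCapture = 1, kRender = 2, kBoth = 3 };

// Hypothetical stand-in for a Terminal object.
struct FakeTerminal {
    const char* name;
    int direction;
};

// Chooses the audio terminals to select on a call, given the terminals that
// were returned as the default render and default capture terminals.
std::vector<FakeTerminal> PickAudioTerminals(const FakeTerminal& defaultRender,
                                             const FakeTerminal& defaultCapture) {
    std::vector<FakeTerminal> picked;
    picked.push_back(defaultRender);
    // If the render terminal also captures, a separate capture terminal is not needed.
    if ((defaultRender.direction & kCapture) == 0) {
        picked.push_back(defaultCapture);
    }
    return picked;
}
```

      A speaker and microphone pair yields two terminals, while a single bidirectional device yields one.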
      Next, the CreateTerminals function determines if the address also supports video. If so, it obtains a video render terminal. This is always a video window, which is a dynamic terminal. The function GetDefaultTerminal only returns static terminals, so it will fail for a video render terminal. Instead, CreateTerminals calls GetVideoRenderTerminal, which is a wrapper around the ITTerminalSupport::CreateTerminal method. The only trick with CreateTerminal is that the terminal class being requested, in this case CLSID_VideoWindowTerm, must be passed in as a BSTR, not a GUID. The function converts the CLSID to a BSTR, calls CreateTerminal, and then frees any allocated memory.
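      To make the GUID-to-string step concrete, here is a portable sketch of the formatting involved. It is not the sample's code: on Windows this conversion is commonly done with StringFromCLSID followed by SysAllocString, whereas the Guid struct and GuidToString function below are hypothetical and only reproduce the familiar braced text form. The GUID value used in testing is made up and is not CLSID_VideoWindowTerm.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical portable layout matching the four fields of a GUID.
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t data4[8];
};

// Formats a GUID in the braced registry style: {XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}.
std::string GuidToString(const Guid& g) {
    char buf[40];  // 38 characters plus the terminating null, with one byte to spare
    std::snprintf(buf, sizeof buf,
                  "{%08X-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X}",
                  static_cast<unsigned>(g.data1),
                  static_cast<unsigned>(g.data2),
                  static_cast<unsigned>(g.data3),
                  static_cast<unsigned>(g.data4[0]),
                  static_cast<unsigned>(g.data4[1]),
                  static_cast<unsigned>(g.data4[2]),
                  static_cast<unsigned>(g.data4[3]),
                  static_cast<unsigned>(g.data4[4]),
                  static_cast<unsigned>(g.data4[5]),
                  static_cast<unsigned>(g.data4[6]),
                  static_cast<unsigned>(g.data4[7]));
    return std::string(buf);
}
```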
      Finally, CreateTerminals uses GetDefaultTerminal to obtain the video capture terminal. If the video capture terminal exists, it also enables the preview window on this terminal, so the application will display a preview of what it is sending. Note that it's possible for an address to support video even though no video capture terminal exists. The capability of the address is a separate issue from whether a video capture device is present on the computer. In contrast, video rendering is always available if the address supports video because this involves simply creating a window.
      The CreateTerminals function uses a simple method to find terminals for a call, relying solely on GetDefaultTerminal to retrieve the Terminal object. Typically, an application will default to GetDefaultTerminal, but give the user the option of choosing which terminals to use on a call.
      When execution returns to AnswerTheCall, the application has the terminals to select on the call. It loops through the array of returned terminals and selects each terminal on the call. Finally, the application calls ITBasicCallControl::Answer on the Call object.
      That about covers the application. Disconnecting the call is very simple—just call ITBasicCallControl::Disconnect. There is also some cleanup after a call is disconnected, and when the application shuts down.
Conclusion
      You've taken a first look at the new features in TAPI 3.0, and how to write TAPI 3.0 applications. You should now have a good idea of how the TAPI 3.0 objects work together, and how to make and receive calls. You can obtain additional information about TAPI 3.0 from the Microsoft Web site at http://www.microsoft.com/communications/tapilearn30.htm.
From the November 1998 issue of Microsoft Systems Journal.