TAPI 3.0 Connection and Media Services

TAPI 3.0 is an evolutionary API providing convergence of both traditional PSTN telephony and IP telephony. IP telephony is an emerging set of technologies that enables voice, data, and video collaboration over existing LANs, WANs, and the Internet. TAPI 3.0 enables IP telephony on the Microsoft Windows operating system by providing simple and generic methods for making connections between two or more computers and accessing any media streams involved in the connection.

To provide connection and media services, TAPI uses Telephony Service Providers (TSPs) and Media Service Providers (MSPs), collectively called Service Providers (SPs). This paper, intended for developers, reviews SPs as they exist in TAPI 2.x, and discusses the enhancements provided in TAPI 3.0.


IP telephony is an emerging set of technologies that enables voice, data, and video collaboration over existing IP-based LANs, WANs, and the Internet.

Specifically, IP telephony uses open IETF and ITU standards to move multimedia traffic over any network that uses IP, offering users both flexibility in physical media (for example, POTS lines, ADSL, ISDN, leased lines, coaxial cable, satellite, and twisted pair) and flexibility of physical location. As a result, the same ubiquitous networks that carry Web, e-mail, and data traffic can be used to connect to individuals, businesses, schools, and governments worldwide.

The Microsoft Windows Telephony Application Programming Interface (TAPI 3.0) is an evolutionary API that supports convergence of both traditional PSTN telephony and telephony over IP networks. Intended for developers, this paper describes service providers (SPs) as they exist in TAPI 2.x and the enhancements that are in TAPI 3.0. A service provider can either be a Telephony Service Provider (TSP) or a Media Service Provider (MSP). TSPs are responsible for resolving the protocol-independent call model of TAPI into protocol-specific call-control mechanisms. MSPs implement Microsoft DirectShow interfaces for a particular TSP and are required for any telephony service that makes use of DirectShow streaming.

This paper assumes a familiarity with TAPI 2.x and the Component Object Model (COM).

The TAPI 3.0 Architecture

The Microsoft Windows 2000 TAPI 3.0 architecture is shown in Figure 1.

Figure 1: Windows 2000 TAPI 3.0 architecture

The left side of the drawing shows the components shipping today in TAPI 2.x. The right side shows the new components that are part of TAPI 3.0. The TAPI 3.0 COM API allows object-oriented, language-neutral software development. It is built on top of the TAPI server. The C API (to the left) provides call functionality. The new TAPI 3.0 COM API provides not only call functionality but also media access, directory access, and terminal access. (Terminals are the devices on a computer that capture and render audio and video. A sound card and a video camera are examples of devices that can be terminals.)

At the application level, there are COM interfaces for call control, media control, and directory control. There are also COM interfaces at the SP level to support services such as IP telephony and media control (the latter is exposed as part of the MSP). Therefore, in TAPI 3.0, COM objects are a part of the SP. Some of these service provider objects and interfaces will be aggregated through TAPI objects while the MSPs will directly provide others to the applications.

The architecture diagram demonstrates that TAPI 3.0 incorporates the features of TAPI 2.x, and enhances them. This means that existing TSPs will work with TAPI 3.0 and that developers can build on knowledge they already have to either expand existing SPs or build new ones.

A Review of TAPI 2.x

As a prelude to discussing TAPI 3.0 SPs, this paper first reviews the basics of TAPI 2.x SPs, pointing out differences and parallels with TAPI 3.0 as they occur, beginning with a review of the TAPI 2.x distributed architecture.

TAPI 2.x Remote Service Provision

Figure 2 illustrates the distributed architecture that was first introduced in TAPI 2.1.

Figure 2: Distributed architecture introduced in TAPI 2.1

Service providers operate in a distributed environment. The distribution is provided transparently by remotesp, which means the SP neither knows nor cares whether it is operating on the same computer as the client application. This mattered in TAPI 2.x for the user interface (UI) component, and it still matters in TAPI 3.0 when providing media services.

The UI DLL in TAPI 2.x uses exactly the same model as the MSP in TAPI 3.0. It can run remotely and it runs in the client application context rather than in the server context. For setting up media calls, it has a pipe that runs from it, through TAPI, to the TSP, to exchange context about the connections and the application.

Notice that the UI DLL resides on the client side while the TSP runs in the TAPI server context on the server side. This means that any registry writing done by the UI DLL is sent back to the TSP so that the TSP writes to the server, which is the logical place, rather than to an arbitrary client that may not be available the next time the application runs.

One issue that must still be considered in TAPI 3.0 is how security should be applied in a distributed environment. The TAPI 3.0 security scheme, like that of TAPI 2.x, is line-based rather than address-based. This should be taken into account when modeling both hardware and services, and when deciding how to split them up among addresses, lines, and phone devices. For example, if you decide to have one line with many addresses, security can be applied only to the line and there is no way to distinguish among the addresses. On the other hand, if you decide to have a separate line for each address, then security can be applied in a much more granular fashion.

Another point of similarity between TAPI 3.0 and TAPI 2.x is that, on Windows 2000 servers, TAPI3.DLL in TAPI 3.0 is a peer to the TAPI 2.x TAPI32.DLL. A TAPI application can load either TAPI32.DLL or TAPI3.DLL. Finally, TAPI 2.x TSPs will work with TAPI 3.0. TAPI 3.0 enables a TAPI 3.0 application that is connection-only to transparently pick up the TAPI 2.x TSP.

A Review of TAPI 2.x SP Objects

The TAPI 2.x objects that are of interest to an SP are, hierarchically:

  • Line Devices

  • Addresses

  • Calls

  • Phone Devices

Line devices own addresses and addresses support calls. On the same level as line devices are phone devices. Several new objects have been introduced in TAPI 3.0. One of these is streams. Streams are operated on when setting up sources and sinks for media. Calls support streams and streams provide access to terminals. Terminals act as the endpoints for calls. Streams and terminals are discussed later in the paper.

One difference between TAPI 2.x and TAPI 3.0 is that the TAPI 2.x API is line-based while the TAPI 3.0 API is address-based. (In TAPI 3.0, applications constantly acquire addresses, rather than lines, and query them for their capabilities.) This doesn't affect TSPs because TAPI 3.0 deals with the differences, and line devices can still be accessed from address objects.

Modeling TAPI Lines

When providing connection and media service, the first question is how to model devices. TAPI doesn't mandate any particular scheme. It is possible to have a single address per line or multiple addresses per line. Similarly, it is possible to have a single address type per line or many address types.

For example, a telephone with both an ISDN (Integrated Services Digital Network) line and a POTS (Plain Old Telephone Service) line could be modeled as a single line device with a POTS-type address and an ISDN-type address. It would be up to the application to query those addresses to find out their capabilities. On the other hand, the device could be modeled as having two lines. This might make more sense because applications typically pick up a line and assume that every address on that line is of the same type.

These decisions also have repercussions for security. For example, if a PBX (Private Branch Exchange) is modeled as a single line, where every station on the line has a separate address, then you can't offer any per-station security. On the other hand, if you model the PBX with a separate line for every station, you can grant users access to their individual lines.
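The trade-off between the two models can be sketched in a few lines of Python. This is a toy model rather than TAPI code: the Line class, the can_open and reachable_addresses helpers, and the user and station names are all hypothetical. The only point taken from the text is that access is granted per line and never per address.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    """Toy line device: a list of addresses plus the users allowed to open it."""
    addresses: list
    allowed_users: set = field(default_factory=set)


def can_open(user, line):
    # Access is decided for the whole line; individual addresses are not consulted.
    return user in line.allowed_users


def reachable_addresses(user, lines):
    """Every address a user can reach through the lines they may open."""
    return sorted(a for line in lines if can_open(user, line) for a in line.addresses)


# Model 1: one line for the whole PBX, one address per station.
pbx_line = Line(addresses=["x101", "x102"], allowed_users={"alice", "bob"})

# Model 2: one line per station, one address per line.
station_lines = {
    "x101": Line(addresses=["x101"], allowed_users={"alice"}),
    "x102": Line(addresses=["x102"], allowed_users={"bob"}),
}
```

With the single-line model, any user who can open the line reaches every station. With one line per station, each user reaches only the station they were granted.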

TAPI 3.0 also has new protocol and address types, discussed later in the paper, beyond the implicit PSTN types available in TAPI 2.x. These new types are all per-line, which means that all addresses on a line are assumed to be of the same protocol type.

Finding a Suitable Line

In TAPI 2.x, an application picks an SP by doing the following:

  1. The application calls lineInitialize or lineInitializeEx and iterates through the line devices.

  2. It negotiates the version numbers for all of the lines in the application with lineNegotiateAPIVersion.

  3. It determines the line capabilities with lineGetDevCaps.

  4. It determines the address capabilities with lineGetAddressCaps.

  5. It looks for appropriate media and bearer modes with lineGetCallInfo.

  6. It may also look for LINEDEVCAPS.dwPermanentLineID.

TAPI 3.0 objects, which also support streaming, have similar mechanisms for initialization and for discovering capabilities. One enhancement in TAPI 3.0 concerns the LINEDEVCAPS.dwPermanentLineID attribute. Because this attribute is not necessarily permanent or unique in TAPI 2.x, and because uniqueness in both space and time is necessary in a distributed environment, TAPI 3.0 uses a globally unique identifier (GUID), which is guaranteed to be unique across all services.
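The selection logic in the TAPI 2.x steps above amounts to a filter over the line devices. The Python sketch below models only that logic and makes no TAPI calls. The LineInfo record, the helper names, and the bit values are assumptions chosen for illustration; the real constants are defined in tapi.h.

```python
from dataclasses import dataclass

# Illustrative bit values only; consult tapi.h for the real constants.
MEDIAMODE_INTERACTIVEVOICE = 0x04
MEDIAMODE_DATAMODEM = 0x10
BEARERMODE_VOICE = 0x01


@dataclass
class LineInfo:
    """Hypothetical summary of what the capability queries return for one line."""
    device_id: int
    lowest_version: int    # oldest API version the line supports
    highest_version: int   # newest API version the line supports
    media_modes: int
    bearer_modes: int


def negotiate(line, app_low, app_high):
    """Return the highest version both sides support, or None if the ranges don't overlap."""
    low = max(line.lowest_version, app_low)
    high = min(line.highest_version, app_high)
    return high if low <= high else None


def find_suitable_lines(lines, app_low, app_high, media, bearer):
    """Return (device_id, version) for each line that negotiates and has all requested modes."""
    suitable = []
    for line in lines:
        version = negotiate(line, app_low, app_high)
        if version is None:
            continue  # incompatible line: skip it, as after a failed negotiation
        has_media = (line.media_modes & media) == media
        has_bearer = (line.bearer_modes & bearer) == bearer
        if has_media and has_bearer:
            suitable.append((line.device_id, version))
    return suitable
```

An application would typically stop at the first suitable line, or present the candidates to the user.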

Opening a Line

At the SPI (Service Provider Interface) level, a line in TAPI 2.x should be opened only once, when an application first shows an interest in it. The TSPI_lineOpen function should return reasonably quickly. If the open is time-consuming, consider using a thread. Also, if there is an error, first return, and then send a LINE_CLOSE message.

Once the line is opened, TSPI_lineSetDefaultMediaDetection indicates what types of calls the application is interested in. Calls of the requested type(s) may arrive once, many times, or never. Even if a call of the requested type never arrives, it's important that the SP never indicate new calls of any other type, because TAPI will immediately close such a call and invalidate its context. The upshot is that, when users pick up the phone to make a call, the SP disconnects them.
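The rule about indicating only calls of the requested type can be sketched as a bitmask check. The LineState class and its method names below are hypothetical stand-ins for state a TSP might keep, and the media mode values are illustrative rather than copied from tapi.h.

```python
# Illustrative bit values only; consult tapi.h for the real constants.
MEDIAMODE_INTERACTIVEVOICE = 0x04
MEDIAMODE_G3FAX = 0x20


class LineState:
    """Hypothetical per-line state kept by a TSP."""

    def __init__(self):
        self.detect_mask = 0   # media modes TAPI asked to be told about
        self.indicated = []    # calls that were reported to TAPI

    def set_default_media_detection(self, media_modes):
        # Stand-in for handling TSPI_lineSetDefaultMediaDetection.
        self.detect_mask = media_modes

    def on_incoming_call(self, call_id, media_mode):
        """Indicate the call to TAPI only if its media mode was requested."""
        if media_mode & self.detect_mask:
            self.indicated.append(call_id)
            return True
        return False  # stay silent so TAPI never sees (and closes) the call
```

A call whose media mode is outside the mask is simply never reported, so TAPI has no chance to close it.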

A Review of TAPI Handles

There are three types of TAPI handles:

  • The hXxx handle. This is the application's handle and is generated by TAPI.

  • The htXxx handle. This is generated by TAPI and given to the TSP. It describes the TAPI context.

  • The hdXxx handle. This is generated by the TSP and given to TAPI. It describes the TSP objects, and TAPI uses the handle to refer to those objects when talking to the TSP.

The second and third types of handles are of the greatest interest. A series of diagrams, shown below, illustrate how an SP uses the handles while processing a call.

Figure 3: Service Provider processing a call

Figure 3 shows three applications. The first application opens the device as an owner while the other two are only monitoring and won't be expected to pick up any calls and own them.

The LINE_NEWCALL message indicates that a call has come in on the device. The message gives TAPI the SP-generated driver handle (the hdCall handle) and also has a place for TAPI to write its own handle to the call (the htCall handle). TAPI then generates call handles hCall1, hCall2, and hCall3 for each of the applications.

The state transition to offering is shown in Figure 4.

Figure 4: State transition to offering

Notice that the LINECALLSTATE_OFFERING notification uses the TAPI context (htCall) that TAPI filled in during the previous step. The next diagram, Figure 5, shows what happens when the call is answered.

Figure 5: Call answered

The owner application performs a lineAnswer and TAPI responds with a TSPI_lineAnswer, using the driver call handle to indicate the context. The next diagram, Figure 6, shows what happens when the call state is active.

Figure 6: An active call state

Here, the SP responds and changes the state to active. The LINECALLSTATE_ACTIVE notification uses the TAPI handle (htCall) and all the applications are notified of the state change.

Figure 7 illustrates what happens when a call is dropped.

Figure 7: A call is dropped

When the owner application is finished with the call, it calls the lineDrop function and the SP receives a TSPI_lineDrop, with its own context. The next diagram, Figure 8, shows what happens when the call state is idle.

Figure 8: Call state is idle

Once a call is idle, it can't transition to any other state. It is still possible to get ITCallInfo information about the call to find out such things as the duration of the call and the parties involved in that call. The call is alive as long as an application has a handle open to it. This means that TAPI isn't going to deallocate the call, the SP call handle, or its own context until the last application has deallocated its application call handle. This is shown in Figure 9.

When the first application deallocates the call nothing happens, and this is also true when the second application does the same. Only when the third application deallocates the call does the SP receive the TSPI_lineCloseCall, again with its own driver handle. Once this has happened, you can no longer rely on any of the contexts surviving. The htCall handle may be reused immediately, which means it is important that such things as tables and pointers no longer reference it.

Figure 9: Deallocating the call
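The handle lifetime described in this section can be modeled as a small reference-counting exercise. The sketch below is a toy model of TAPI's bookkeeping, not an implementation of it. The TapiCallContext class, its fields, and the close_call callback (standing in for TSPI_lineCloseCall) are all hypothetical names.

```python
import itertools


class TapiCallContext:
    """Toy model of TAPI's bookkeeping for one call reported by a TSP."""

    _next_id = itertools.count(1)

    def __init__(self, hd_call, app_names, close_call):
        self.hd_call = hd_call                          # driver handle, made by the TSP
        self.ht_call = f"htCall{next(self._next_id)}"   # TAPI handle, given to the TSP
        # One application handle per application that has the line open.
        self.app_handles = {name: f"hCall{i}" for i, name in enumerate(app_names, 1)}
        self._close_call = close_call                   # stand-in for TSPI_lineCloseCall

    def deallocate(self, app_name):
        """An application releases its handle; the TSP is told only after the last one."""
        del self.app_handles[app_name]
        if not self.app_handles:
            self._close_call(self.hd_call)
```

After the callback fires, the model mirrors the rule in the text: the TSP should treat the htCall value as dead and drop any table entries or pointers that reference it.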

TAPI 3.0 Overview

Now that you've reviewed the basics of earlier versions of TAPI, you can examine the changes that have occurred in TAPI Service Provider Interface (TSPI) 3.0. It's important to remember that the changes at the TSPI level are incremental. Existing, connection-oriented SPs will work and you can take advantage of some of the new features in TAPI 3.0 without necessarily having to use COM or write an MSP. Examples of this are adding CallHub call control and reporting IP telephony capabilities. If you have a wave device, you can take advantage of the DirectShow integration that is part of TAPI 3.0 by using the MSP that is already included.

The following sections discuss the TSPI enhancements that are in TAPI 3.0. These enhancements include:

  • Address types.

  • Protocol types.

  • A permanent and unique line and phone GUID.

  • CallHub support.

  • New capabilities reporting.

  • MSP support.

Address Types

Address types tell the application what sort of dialable strings the SP supports and also tell the SP the format of those strings. Earlier versions of TAPI always assumed that dialable strings were phone numbers. This is no longer true. There is a session descriptor type for IP telephony and a variety of ways for addressing users, such as by name, machine name, and IP address. Here is the complete list of address types:

  • LINEADDRESSTYPE_PHONENUMBER, which means the address is a phone number.

  • LINEADDRESSTYPE_SDP, which is a Session Description Protocol (SDP) address. This protocol is an IETF standard for announcing multicast conferences.

  • LINEADDRESSTYPE_EMAILNAME, which means the address is an email name.

  • LINEADDRESSTYPE_DOMAINNAME, which means the address is a domain name.

  • LINEADDRESSTYPE_IPADDRESS, which means the address is an IP address.

These types are provided to applications in the LINECALLINFO, LINECALLPARAMS, and LINEDEVCAPS structures. The LINECALLINFO structure indicates the call's address type. The LINECALLPARAMS structure tells the SP the format of the destination and source strings, and the LINEDEVCAPS structure holds a variety of information about the line's capabilities, such as the different kinds of tones that can be generated or the digit modes.
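Because a line can advertise several address types at once, the types behave like bit flags. The following Python sketch mirrors the five LINEADDRESSTYPE_ names as an IntFlag. The numeric values are assumptions for illustration and should be checked against tapi.h, and the can_dial helper and the ip_line example are hypothetical.

```python
from enum import IntFlag


class AddressType(IntFlag):
    """Mirror of the LINEADDRESSTYPE_ constants as bit flags (values assumed)."""
    PHONENUMBER = 0x01
    SDP = 0x02
    EMAILNAME = 0x04
    DOMAINNAME = 0x08
    IPADDRESS = 0x10


def can_dial(supported, requested):
    """True if the line advertises the address type the caller wants to use."""
    return bool(supported & requested)


# A hypothetical IP telephony line that accepts names and IP addresses
# but not phone numbers.
ip_line = AddressType.EMAILNAME | AddressType.DOMAINNAME | AddressType.IPADDRESS
```

An application would make this kind of check before building a dialable string, instead of assuming that every address is a phone number.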

TAPI 3.0 Protocol Types

Just as there are new address types in TAPI 3.0, there are also new protocol types. The protocol types supported in TAPI 3.0 are:

  • TAPIPROTOCOL_PSTN, which supports voice.

  • TAPIPROTOCOL_H323, which supports the ITU standard, H.323, for videoconferencing over packet-switched networks such as LANs and the Internet.

  • TAPIPROTOCOL_Multicast, which supports calls made using the Multicast Backbone (MBONE).

Earlier versions of TAPI assumed that call control was always across PSTN but in TAPI 3.0 there is support for IP telephony with the H.323 and multicast protocols. These protocols are GUIDs with one GUID per line. This means that there must be a separate line for each protocol that the SP supports. The permanent line GUID is stored in the LINEDEVCAPS structure. There is also a permanent phone GUID stored in the PHONECAPS structure.

TAPI 3.0 SPI CallHub Support

Earlier versions of TAPI presented a first-party view of a call, which means the call handle only represented one endpoint of the connection. The notion of a half-call view (in essence, a switch) didn't really exist. In TAPI 3.0, the CallHub object provides the switch's view of a call. This is also called a third-party view. There is one CallHub per conference, with some number of half-calls or call handles connected to it, one for each user. Each of these handles can be manipulated separately.

For current TSPs that use the dwCallID field in LINECALLINFO, TAPI automatically creates a CallHub for a call. TAPI assumes that every call that references the same ID is part of the same CallHub.
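The grouping rule is simple enough to state as code. This minimal Python sketch shows only the idea of keying hubs on the call ID. The function name and the (handle, call ID) pair representation are hypothetical, and TAPI's real CallHub objects carry far more state.

```python
from collections import defaultdict


def group_into_callhubs(calls):
    """Group (call_handle, call_id) pairs so that calls sharing an ID share a hub."""
    hubs = defaultdict(list)
    for handle, call_id in calls:
        hubs[call_id].append(handle)
    return dict(hubs)
```

For example, two call handles that both report call ID 7 end up in the same hub, while a handle reporting call ID 9 gets a hub of its own.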

There is also new information available for tracking. The CALLHUBTRACKING constants let you monitor CallHubs across all devices on a TSP or across only those devices that have SetDefaultMediaDetection selected, or indicate that you don't want TAPI to create CallHubs at all. One reason to choose the last option is concern about security. To retrieve and set the type of CallHub tracking, use the TSPI_lineGetCallHubTracking and TSPI_lineSetCallHubTracking functions.


For capabilities reporting, the following LINEDEVCAPFLAGS_ constants have been added for use in the LINEDEVCAPS structure:

  • LINEDEVCAPFLAGS_MSP, which reports if the line has MSP capabilities.

  • LINEDEVCAPFLAGS_CALLHUB, which reports if the line has CallHub capabilities.

  • LINEDEVCAPFLAGS_CALLHUBTRACKING, which reports if the line has CallHub tracking capabilities.

  • LINEDEVCAPFLAGS_PRIVATEOBJECTS, which reports if the line has any private objects.

Private objects allow you to perform tasks that are specific to your SP. Some device-specific operations are a matter of implementing additional private interfaces on the SP. (You're free to expose whatever interfaces you like.) These are cases where the interfaces are exposed directly to the application and are not aggregated. An example is making a terminal object behave any way you like.

In other cases, private objects are actually aggregated by TAPI into TAPI 3.0 standard objects, again allowing you to do things that are specific to your SP. For instance, a Call object could contain your object's private data, allowing you to treat a call in some non-standard way.

TAPI 3.0 SPI MSP Support

Enhancements in TAPI 3.0 for supporting MSPs include:

  • A function to find out the MSP CLSID, which is used when creating an MSP.

  • A function to open a communication channel between the MSP and TSP.

  • A channel function message and response system. This allows an opaque structure to be passed back and forth between the TSP and the MSP.

The communication channel allows an MSP to set all the characteristics of a call before a connection is made. Typically, the application first sets up the DirectShow filters, the terminals, and any other conditions that it requires. Once this is done, TAPI asks the TSP to actually establish the connection. Initially, then, it is usually the TSP that uses the communication channel to query the MSP to find out the characteristics of the call. The MSP uses the channel to indicate to the TSP when those characteristics have changed.
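The essential property of the channel is that TAPI relays bytes it does not interpret. The sketch below illustrates that with a made-up private message format. The layout (command, call ID, port), the CMD_STREAM_CHANGED value, and all three function names are hypothetical, since each TSP/MSP pair defines its own format.

```python
import struct

# Hypothetical private layout shared by one TSP/MSP pair:
# 16-bit command, 32-bit call ID, 16-bit port, little-endian, no padding.
_FORMAT = "<HIH"
CMD_STREAM_CHANGED = 1


def msp_pack(command, call_id, port):
    """MSP side: build the opaque blob."""
    return struct.pack(_FORMAT, command, call_id, port)


def tapi_forward(blob):
    """TAPI side: relay the bytes without inspecting them."""
    return bytes(blob)


def tsp_unpack(blob):
    """TSP side: decode the blob using the shared private format."""
    return struct.unpack(_FORMAT, blob)
```

Only the two ends know the layout, which is one way to see why a TSP and its MSP cannot be mixed and matched with those of another vendor.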

An Overview of MSPs

An MSP is a COM-based object used to construct application-specific streams that are exposed through terminal objects. An MSP is similar to the UI DLL in TAPI 2.x. It is loaded into the application process because it constructs the streams and terminals, which means it must be available to that instance of the application. One way to think of an MSP is as a part of the TSP that operates in the application process. MSPs and TSPs are tied together 1-to-1. For example, there is an H.323 TSP and MSP. These can't be mixed and matched because the communication between the two is opaque. The following diagram, Figure 10, shows the MSP object model.

Figure 10: MSP object model

TAPI first creates an address object for each TSP address. This means that for each line device that exposes an address, TAPI creates an object. This object provides an application's access to the various services.

If the TSP has an MSP associated with it, then for each address object, TAPI asks the MSP to create an MSP address object. This is aggregated through the TAPI address object so that its interfaces become available to the application. The MSP address object is used to enumerate the terminals and these are exposed directly to the application by TAPI-defined interfaces.

When a call is created, TAPI first creates its own call object but it also requests the MSP to create an MSP call object, which TAPI aggregates. Finally, the stream objects are created from the MSP call object. Once again, the application can access these interfaces directly.
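Python has no COM aggregation, but attribute delegation gives a loose analogy for how an application sees one call object that answers for both TAPI and MSP interfaces. The class and method names below are hypothetical, and the sketch ignores reference counting and interface identity, which real aggregation must handle.

```python
class MspCall:
    """Hypothetical inner object supplied by the MSP."""

    def enumerate_streams(self):
        return ["audio capture", "audio render"]


class TapiCall:
    """Hypothetical outer object; anything it lacks falls through to the MSP object."""

    def __init__(self, msp_call=None):
        self._msp_call = msp_call

    def connect(self):
        return "connecting"

    def __getattr__(self, name):
        # Python calls this only when normal lookup fails, that is, for
        # members the outer object does not implement itself.
        if self._msp_call is None:
            raise AttributeError(name)
        return getattr(self._msp_call, name)
```

A TapiCall built without an MSP object still supports call control, which parallels the way a connection-only TSP works without an MSP.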

TAPI 3.0 Interfaces

All the TAPI 3.0 interfaces are illustrated in Figure 11.

Figure 11: TAPI 3.0 interfaces

Writers of SPs are particularly interested in the TSPI interface to tapisrv.exe, the MSPI interface to TAPI 3.0 and the application process context, and the terminal manager interface by which existing terminals can be discovered, built, and connected to the DirectShow filtergraph.

Note that, in the diagram, the TAPI 3.0 interface extends over the MSP. This is to indicate that the MSP exposes objects that implement TAPI interfaces. It doesn't mean that TAPI is actually aggregating those objects. Because the MSP does implement TAPI interfaces, it's important that those interfaces be implemented in a way that's compatible with scripting languages used in HTML pages, such as Visual Basic Scripting Edition (VBScript).

An Overview of Terminals

Terminals are devices that are the final source or sink of media on a call. They are usually implemented by an MSP and can be divided into two types: static and dynamic. Static terminals are typically resource-intensive and nonshareable. Examples of static terminals include sound cards, microphones and speakers. Dynamic terminals are usually only limited by hardware resources such as memory. Examples of dynamic terminals include video windows and DTMF (Dual Tone Multi-Frequency) detection and generation. TAPI3.DLL implements terminals corresponding to TAPI phone devices.

An application can select a terminal on a stream at any point in the lifetime of a call. (This means before the SP receives a TSPI_lineCloseCall.) Although it's possible for the MSP to fail a request for a terminal, the preferred method is that the MSP be able to select and deselect terminals on the fly.

In TAPI 2.x, most SPs had a wave device for accessing the media streams. TAPI provided a generic way for the application to retrieve the device (the lineGetID function) but after that it was up to the application to handle media control. However, many applications would prefer to simply direct the bitstream rather than operate on it. To relieve applications from actually having to deal with the bits, TAPI 3.0 has transferred these responsibilities onto the MSP, which, in turn, uses the capabilities of DirectShow for handling the raw bitstream.

Terminal interfaces are designed to define specific tasks that an application may want to perform. They provide a standard model for media control and simplify most streaming tasks such as the following:

  • DTMF.

  • Speech recognition and text to speech.

  • Recording files and playing them back.

  • Interactive audio and video.

Acting as COM interfaces, terminals represent a contract between the media streaming implementation and the application. They guarantee that media control in applications will work across various TSPs. The abstract layer remains the same, even if the media streaming tasks are implemented differently under the surface.

An Overview of Streams

A stream represents a single media type and a single direction on a call. (Note that it is not at all uncommon for a single call to have multiple streams.) An application selects terminals on streams. Doing this tells the MSP how to set up its media streaming. The following drawing, Figure 12, illustrates streams on a full-duplex audio/video call.

Figure 12: Stream on a full-duplex audio/video call

This call has four streams:

  • An audio render stream.

  • An audio capture stream.

  • A video render stream.

  • A video capture stream.

Through these streams, you can plumb in a speaker terminal to render the audio, a microphone terminal as a source for the audio, a video window terminal to render the incoming video, a video camera terminal as a source for the video, and finally, another video window terminal on the video capture stream. This last window lets you see what is going out over the network, which means a render terminal is selected on a capture stream.

Cases where directions are contradictory are not uncommon and the MSP should be able to take a stream that is going in one direction, demultiplex it, and push it in the other direction. Of course, it is possible to simply return an error, but the preferred method is that the MSP be able to handle the situation. Another example of contradictory streams could be rendering audio to multiple locations, such as to a speaker and to a file while recording a conference.

Streams provide applications with a standard model for dealing with media streaming on a call. They allow applications to tell the MSP how to set up media streaming unambiguously. Streams separate the media from the call and divide it into four parts:

  • The media type.

  • The media direction.

  • The media source.

  • The media sink.

This means there is a much finer level of manipulation through streams than was possible through the call object. Distinguishable sources and destinations can always benefit by being exposed through separate streams.
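The four-part split can be made concrete with a small data model. In this Python sketch the Stream class, its method names, and the terminal names are hypothetical. It shows only that a stream is identified by media type and direction and that terminals are selected on it.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Stream:
    media_type: str   # "audio" or "video"
    direction: str    # "capture" or "render"
    terminals: list = field(default_factory=list)

    def select_terminal(self, terminal):
        self.terminals.append(terminal)

    def unselect_terminal(self, terminal):
        self.terminals.remove(terminal)


def full_duplex_av_streams():
    """One stream per (media type, direction) pair: four in total."""
    pairs = product(("audio", "video"), ("capture", "render"))
    return {(m, d): Stream(m, d) for m, d in pairs}
```

Selecting both a camera and a preview window on the ("video", "capture") stream reproduces the contradictory-direction case described earlier, where a render terminal sits on a capture stream.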

Streams support substreams, which are exactly like streams except that they are one level lower. An example of substreams, used by the IP multicast capabilities supported in TAPI 3.0, is dividing an incoming video stream into substreams, where each substream represents one of the participants in the conference. You can then apply a separate video terminal on each substream to display each of them in different windows.

The substream is part of the stream object's capabilities. It is implemented by the MSP and is directly exposed to the application, not aggregated through TAPI. An MSP can implement private interfaces to gain even more control over a substream. For example, in the multicast example, an MSP could implement a participant interface to further identify conference members.

An Overview of the Wave MSP

The WaveMSP is a generic MSP provided by Microsoft that can be used with any TSP that has a wave device. In earlier versions of TAPI, applications needed to use lineGetID to retrieve the wave device ID. They then needed to open the wave device, which would be locked up for the duration of the session, and they had to handle all the media control themselves. The WaveMSP is an out-of-the-box solution. During initialization, TAPI queries the TSP for a wave device. If one exists, then the WaveMSP is used. The WaveMSP wraps the wave device to fit into the TAPI 3.0 object model. The WaveMSP exposes the DirectShow interfaces directly above the wave device. It uses the Terminal Manager to discover, create, and use terminals.

Determining the Need for Writing MSPs

To summarize, the functions of an MSP are the following:

  • It implements terminals.

  • It communicates terminal selection with the TSP.

  • It performs media control (the TSP handles call control).

  • It can set up streaming for the client application in the application context.

There is no reason to write an MSP if only call control is needed and not media control. There is also no reason to write one if the TSP already has a wave device. In this case, use the WaveMSP. The only reason not to use the WaveMSP would be to take advantage of some specific feature by writing new interfaces that the WaveMSP doesn't provide. For example, if there is very fine granularity for measuring progress detection on a device, you may want to implement this as a feature on the stream, using your own MSP. Of course, the most general reason to write an MSP is to provide TAPI 3.0 applications with a finer degree of control of media on the telephony device.

The TAPI 3.0 Algorithm for Terminal Creation and Discovery

The way TAPI 3.0 discovers and creates terminals is summarized in the following piece of pseudocode:

if the TSP has an MSP
    enumerate terminals from MSP
else if the TSP has a wave device
    enumerate terminals from WaveMSP
else
    create terminals based on phone devices

TAPI first checks to see if the TSP has an MSP. If it does, then TAPI asks the MSP to enumerate the terminals. If there is no MSP, TAPI then checks to see if the TSP has a wave device. If it does, TAPI uses the WaveMSP that enumerates the terminals, discovering what hardware is available for supporting audio. Lastly, if there is no MSP or wave device, TAPI creates terminals based on the phone devices that are available. TAPI creates terminal and stream objects for phone devices, so that you can direct the media stream to your phone device on your call.
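The fallback order in the preceding paragraph can be written as a short Python function. This is a sketch of the decision order only. The function name, its parameters, and the terminal names used to exercise it are hypothetical, and None is used to mean that the TSP has no such component.

```python
def discover_terminals(msp_terminals=None, wave_terminals=None, phone_terminals=()):
    """Return (source, terminals), trying the MSP, then the WaveMSP, then phone devices.

    A value of None means the TSP has no such component.
    """
    if msp_terminals is not None:
        return "MSP", list(msp_terminals)
    if wave_terminals is not None:
        return "WaveMSP", list(wave_terminals)
    return "phone devices", list(phone_terminals)
```

The first branch wins even when a wave device is also present, which matches the preference for a TSP's own MSP over the generic WaveMSP.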

An Overview of MSPI

The Media Service Provider Interface (MSPI) allows the MSP to enumerate and create terminals for the application. The MSPI is used to create the MSP Call object, which is then aggregated. Once this happens, the application talks directly to the MSP, bypassing TAPI. It selects terminals the MSP has created, on streams that the MSP has implemented and created. Because the application talks directly to the MSP, it's important that the MSP conform to the defined TAPI interfaces and be scripting compatible.

An Overview of the Terminal Manager

The terminal manager, like the WaveMSP, is intended to simplify MSP development. The terminal manager makes it easier to write MSPs, to interface with DirectShow, and to locate hardware resources on the computer.

The terminal manager is a small helper DLL that enumerates the static terminals and also has base classes that you can derive from when writing MSPs. These classes are designed to make it easy to write MSPs based on DirectShow. They do the following:

  • Enumerate and create DirectShow terminals.

  • Insert filters into a DirectShow filtergraph.

The base classes create terminal objects that correspond to found devices that are of concern to TAPI. The helper functions insert filters, based on these terminals, into a DirectShow filtergraph.

Although the terminal manager is based on standard DirectShow devices, you can use whatever you like, creating your own terminal objects and implementing your own interfaces. You can also define additional interfaces on standard DirectShow terminals that an application can query for and use.

Making a TAPI 3.0 Call

This section shows how to establish a TAPI call from the SP point of view. Remember that a call can be voice along with other media streams. The steps are as follows:

  1. The application selects an address offered by the SP.

  2. The application asks that the SP enumerate the terminals on that address.

    1. TAPI passes this request on to the MSP.

  3. The application creates a call on the address. Creating a call means that it is creating a Call object, not actually establishing a connection. To create a Call object:

    1. TAPI informs the MSP of the call.

    2. The MSP creates the MSP Call object.

    3. TAPI aggregates the MSP Call object into the TAPI Call object.

  4. The application performs a QueryInterface to acquire the stream control interface.

    1. Through aggregation, this interface is passed on to the MSP.

  5. The streams available on that MSP call are enumerated. This involves enumerating objects that are created by the MSP using interfaces implemented by the MSP.

  6. A terminal is selected on the stream.

  7. Finally, the call is connected. To accomplish this:

    1. TAPI calls TSPI_lineMakeCall.

    2. The TSP and the MSP communicate about the call. The TSP uses the pipe through TAPI to track connection states that will affect characteristics of the stream.
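The ordering of these steps can be sketched with mock components that record when they are invoked. Apart from the TSPI_lineMakeCall label, every name here (Trace, Msp, Tsp, PlaceCall) is hypothetical, and the mocks do no real telephony work.

```cpp
#include <string>
#include <vector>

// Each component appends a line to the trace when it is invoked.
using Trace = std::vector<std::string>;

// Mock MSP: owns terminals, streams, and the MSP Call object.
struct Msp {
    Trace& trace;
    void EnumerateTerminals() { trace.push_back("MSP: enumerate terminals"); }
    void CreateCall()         { trace.push_back("MSP: create MSP Call object"); }
    void EnumerateStreams()   { trace.push_back("MSP: enumerate streams"); }
    void SelectTerminal()     { trace.push_back("MSP: select terminal on stream"); }
};

// Mock TSP: only involved once the call is actually connected.
struct Tsp {
    Trace& trace;
    void MakeCall() { trace.push_back("TSP: TSPI_lineMakeCall"); }
};

// Walks the call setup in the order listed above and returns the trace.
Trace PlaceCall() {
    Trace trace;
    Msp msp{trace};
    Tsp tsp{trace};

    trace.push_back("App: select address");
    msp.EnumerateTerminals();  // TAPI forwards the request to the MSP
    msp.CreateCall();          // TAPI informs the MSP of the new call
    trace.push_back("TAPI: aggregate MSP Call into TAPI Call");
    msp.EnumerateStreams();    // reached through the aggregated interface
    msp.SelectTerminal();
    tsp.MakeCall();            // only now is a connection attempted
    return trace;
}
```

The point of the sketch is that the TSP is not called until the last step: everything before it sets up objects and media selection without establishing a connection.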

An Overview of the ACD Proxy Functions

This section discusses the support at the service level for the TAPI 3.0 Automatic Call Distribution (ACD) proxy functions. (For a more complete description of ACD, refer to the TAPI 3.0 documentation.) These functions offer an interface for call center functions provided by proxy applications running on the TAPI server and provide flexible options for modeling ACD components.

An ACD component can be modeled in these ways:

  • Modeled 1-for-1.

  • Modeled entirely in software and built against a known PBX or switch. In this case, all the queuing and vectoring is done in the software.

  • Modeled as a hybrid. In this case, you can add specific software functionality to an existing ACD component.

There is transparent support for existing TSPs. For example, it is possible to build a proxy application that runs against an existing PBX TSP.

ACD Proxy Components

The following diagram, Figure 13, illustrates the ACD proxy components.

Figure 13: ACD proxy components

The ACD proxy application sits on the server and runs against a regular TSP. The TSP has no knowledge about the ACD system. It only understands lines and addresses.

The application sits on the client. It expects to query for an ITTAPICallCenter interface, pick up an agent handler application, register an agent, and finally end up in a session with an address, ready to receive calls.

First, the ACD proxy selects lines from the TSP that will be used for incoming calls. These lines aren't seen by the application. They are used only by the ACD proxy for accepting calls. The proxy selects other lines from the TSP that will be used as agent lines. These are addresses that will underlie the agent session. Every agent is dynamically assigned a session containing a queue and a group, and an address within that group to which calls will come.

The application also constructs queues. Agents will handle some of these and others will be vectors that are transient and used for such things as announcements. Finally, the session is established and the agent is set up and associated with a particular group and a particular address within that group.

The application communicates with TAPI by function calls that are part of the COM interface in TAPI 3.0. The ACD proxy communicates with TAPI by registering with TAPI through ACD functions. When an application makes a function call, TAPI sends proxy request messages to the ACD proxy. These contain all the arguments that were passed in the application's function call. The proxy responds with response messages that are taken back to the client as function completions. The proxy can also send unsolicited messages for state changes and events.
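The request and response exchange can be pictured with a small model. The message shapes below are assumptions made for illustration; the real structures are defined by TAPI, and ProxyRequest, ProxyResponse, and AcdProxy are hypothetical names.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical request: carries everything the client passed in its call.
struct ProxyRequest {
    std::uint32_t id;               // lets the response be matched to the call
    std::string function;           // which client function was called
    std::vector<std::string> args;  // arguments copied from that call
};

// Hypothetical response: returned to the client as a function completion.
struct ProxyResponse {
    std::uint32_t id;
    bool succeeded;
};

// Mock proxy that accepts the two agent operations used in this paper.
class AcdProxy {
public:
    ProxyResponse Handle(const ProxyRequest& request) const {
        const bool known = request.function == "CreateAgent" ||
                           request.function == "CreateSession";
        // The response echoes the request id so that the completion
        // can be paired with the original function call.
        return ProxyResponse{request.id, known};
    }
};
```

Unsolicited state-change messages are left out of the sketch; they would travel from the proxy to the client without a matching request.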

An Example of Inbound Call Routing

This section gives an example of inbound call routing, using an ACD proxy. Initially, the ACD proxy opens lines for incoming calls, opens lines for agents, and sets LINEPROXYREQUEST constants on those addresses. The agent lines are associated with particular agent instances. The ACD proxy receives proxy requests on these lines and responds to those requests on these lines.

To obtain an ITTAPICallCenter interface, the ACD client queries with:

pTapi->QueryInterface (IID_ITTAPICallCenter, (void **)&pCallCenter);

The ACD client then enumerates agent handlers with:

pCallCenter->EnumerateAgentHandlers (&pEnum); 

It creates an agent with:

pAgentHandler->CreateAgent (&pAgent); 

It puts the agent in an agent session with a specified group and address, using this call:

pAgent->CreateSession (pACDGroup, pAddress, &pAgentSession); 

Finally, the ACD proxy receives an incoming call on one of its incoming call lines. It selects a queue for the call, giving it the call treatment that the call requires. The proxy next selects a free agent from the appropriate group, probably basing its selection on ITCallInfo requests. It either transfers or redirects the call to an agent, depending on whether or not the call was picked up. The client then receives the call notification on an associated address.
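The proxy's last two decisions can be sketched as follows. The Agent record, the first-idle-agent policy, and the function names are assumptions for illustration. The mapping of an answered call to a transfer and an unanswered call to a redirect follows the usual TAPI distinction between connected and offering calls, which the text above does not spell out.

```cpp
#include <string>
#include <vector>

// Hypothetical agent record kept by the proxy.
struct Agent {
    std::string address;  // address the agent session is bound to
    std::string group;    // ACD group the agent belongs to
    bool busy;
};

enum class Handoff { Redirect, Transfer };

// Returns the first idle agent in the group, or nullptr if none is free.
// A real proxy would likely apply a richer selection policy.
const Agent* SelectFreeAgent(const std::vector<Agent>& agents,
                             const std::string& group) {
    for (const Agent& agent : agents) {
        if (agent.group == group && !agent.busy) {
            return &agent;
        }
    }
    return nullptr;
}

// An answered (connected) call is transferred; a call that is still
// offering is redirected.
Handoff ChooseHandoff(bool callAnswered) {
    return callAnswered ? Handoff::Transfer : Handoff::Redirect;
}
```

When SelectFreeAgent returns nullptr, the call would stay in its queue until an agent in the group becomes available.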


TAPI 3.0 supports existing TSPs while making it easy to extend their capabilities to support IP telephony. Developers can take advantage of the SPI extensions for adding media control. The WaveMSP provides an out-of-the-box solution for media control for TSPs that already have a wave device. The terminal manager simplifies writing MSPs, interfacing with DirectShow, and locating hardware resources on the computer. By taking advantage of the new ACD functions, developers can provide greatly enhanced call center services.

For More Information

For more information about TAPI 3.0 and Microsoft Windows 2000 servers, check out Microsoft TechNet or consult the following Web sites:

For information about TAPI 3.0, go to the Communication Services page of the Windows NT Server Web site.

For information about the family of Windows 2000 servers, go to the Windows 2000 Server Web sites.

You can download the Microsoft Windows Platform SDK.

For sample code, articles, and information about the platform SDK, go to the Microsoft MSDN Web site.