Cutting Edge: Pluggable Protocols; Microsoft Internet Developer January 1999


Download the code (54KB)

Dino Esposito

Pluggable Protocols

Most of us are so accustomed to strings like "http:" that we use them without thinking. We use them with the same nonchalance with which we turn a TV on and off with a remote control. In many cases, we even omit the http prefix when we enter the URL for a Web site we're going to surf. It's so common that you don't even have to bother to type it in.
      Together with http:, there are a couple of other magic strings that we use frequently—not as often as with http:, but often enough. The mailto: URL prefix helps us send email. News: lets us browse newsgroups. And let's not forget the ftp: and gopher: protocols.
      But how many of us actually know what exactly http: or mailto: are? First of all, they represent protocols—systems of rules that define the proper behavior in various situations. On the Internet, a protocol is a set of rules that determines how a certain resource is accessed. When you type http://www.microsoft.com in the address bar of your browser, you're actually telling it to use the rules of the HTTP protocol to get in touch with the server at www.microsoft.com.
      Each protocol has its own set of rules and provides a certain behavior to the user. Depending on what the protocol is designed to do, it might also need a transport mechanism. In the case of HTTP this mechanism is TCP/IP, another group of protocols that is designed to control data transmission across a network.
      If you're thinking that http: is just a magic word like abracadabra, you're only partially wrong. There's nothing magic behind HTTP or any other protocol. But you could name your own custom protocol abracadabra, and specify how it should access remote and hidden resources.
      Let's examine exactly what an Internet protocol is. I'll review the fundamentals of HTTP and the minimal set of functionality that characterizes an Internet protocol. With this in mind, I'll show how to code a custom URL protocol. Custom protocols provide highly specialized behavior, particularly within an intranet or in a local application based on the WebBrowser engine. If the idea of custom URL protocols sounds a bit odd to you, consider that you're probably already using them. The MSDN™ InfoViewer module uses the ivt: protocol. Microsoft® Internet Explorer accesses resources in executable modules with res: and displays help pages or raw text with about:.

A Quick Tour of the HTTP Protocol
      As you probably know, HTTP stands for Hypertext Transfer Protocol. You can transmit other types of data with this protocol, but it's particularly efficient when using hypertext. Moving from page to page by following a hyperlink with HTTP requires very little overhead.
      HTTP is a stateless, transaction-oriented protocol that treats each data exchange between a client and a server as if it were a single and independent communication and not part of a longer conversation. Each HTML page received, and each request sent, has no knowledge of previous transactions at the protocol level. At the application level, there are a few techniques you can use to work around this. A communication performed through HTTP is generally composed of a pair of messages: one going from the browser to the server and one coming back. The browser sends a request message and, if all goes well, receives a response message. A complete description of the various fields of the request and response messages is beyond the scope of this article. For more information about the HTTP protocol, take a look at http://www.w3.org/pub/WWW/Protocols.
      The protocol also defines how messages should be formatted. If it's using HTTP, a message should consist of a header and a body. The header contains information that qualifies the content of the request and the returned data. This is useful when a server returns a varied set of data types; if a browser can get information from an HTTP header, it doesn't have to auto-detect the difference between GIF images, JPEG images, HTML pages, plain text, and so on.
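      To make the header/body split concrete, here is a minimal sketch in portable C++ rather than anything taken from a real browser. It assumes the message uses CRLF line endings and that a blank line separates the header from the body. The names HttpMessage and ParseHttpMessage are made up for this illustration, and header names are matched exactly as written.

```cpp
#include <map>
#include <string>

// A parsed HTTP message: the first line, the header fields, and the body.
// (Hypothetical type, for illustration only.)
struct HttpMessage {
    std::string startLine;
    std::map<std::string, std::string> headers;
    std::string body;
};

// Splits a raw message at the first blank line (CRLF CRLF) and
// breaks the header part into "Name: value" fields.
HttpMessage ParseHttpMessage(const std::string& raw) {
    HttpMessage msg;
    const std::string::size_type headerEnd = raw.find("\r\n\r\n");
    const std::string head = raw.substr(0, headerEnd);
    if (headerEnd != std::string::npos)
        msg.body = raw.substr(headerEnd + 4);

    std::string::size_type pos = 0;
    bool first = true;
    while (pos <= head.size()) {
        std::string::size_type eol = head.find("\r\n", pos);
        if (eol == std::string::npos) eol = head.size();
        const std::string line = head.substr(pos, eol - pos);
        if (first) {
            msg.startLine = line;  // request line or status line
            first = false;
        } else {
            const std::string::size_type colon = line.find(':');
            if (colon != std::string::npos) {
                std::string value = line.substr(colon + 1);
                value.erase(0, value.find_first_not_of(' '));  // trim leading spaces
                msg.headers[line.substr(0, colon)] = value;
            }
        }
        pos = eol + 2;
    }
    return msg;
}
```

      With a response parsed this way, a client can look at the Content-Type field to decide how to treat the body instead of guessing from the bytes.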

A Definition for a URL
      Another central concept that relates to the Web is the URL, which stands for Uniform Resource Locator. A URL is a way to fully describe a resource accessible through the Internet. You can think of it as a generalization of the file system's path name. In fact, file: is a protocol that browsers often use to access local files. A URL includes the protocol to be used, the computer on which the resource is stored, and the internal path that usually comprises one or more directories and the resource name. For example, consider the following URL:

http://msdn.microsoft.com/scripting/default.asp

The protocol is http:. The host computer is msdn.microsoft.com. The directory is scripting, and default.asp is the file. Taken together, these parts identify one specific resource on the Internet, which can be accessed through HTTP.
      In general, a URL has two parts that must be specified: the protocol and the resource. Here's another example of a valid URL:

res://E:\Programs\myfile.exe/HTML_RESOURCE

The protocol is res:, and there's no explicit mention of the host computer. More importantly, the resource's specification isn't limited to a fully qualified path name. In fact, this line contains a file name plus additional information that refers to a specific item in the file's resources by name. In this case, "resource" is admittedly a bit misleading. The resource addressed by the URL evaluates to a given item in the file's user interface resources (bitmaps, icons, dialogs, and the like). The res: protocol was introduced to allow access to embedded HTML pages as special custom resources. For more information about the standard definition of URLs, check out RFC 1738 at http://www.isi.edu/in-notes/rfc1738.txt.
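      The two-part view of a URL (protocol, then resource) can be sketched in a few lines of portable C++. This is only an illustration under simple assumptions: the protocol is whatever precedes the first colon, and a leading // in the resource part is skipped. UrlParts and SplitUrl are hypothetical names, and the full grammar in RFC 1738 is considerably richer.

```cpp
#include <string>

// Hypothetical holder for the two mandatory parts of a URL.
struct UrlParts {
    std::string protocol;  // text before the first colon, e.g. "http" or "res"
    std::string resource;  // everything after the colon and any leading "//"
};

// Returns false when the string has no protocol prefix at all.
bool SplitUrl(const std::string& url, UrlParts& out) {
    const std::string::size_type colon = url.find(':');
    if (colon == std::string::npos || colon == 0) return false;
    out.protocol = url.substr(0, colon);
    std::string::size_type start = colon + 1;
    if (url.compare(start, 2, "//") == 0) start += 2;  // skip the optional "//"
    out.resource = url.substr(start);
    return true;
}
```

      For the res: example above, this yields res as the protocol and E:\Programs\myfile.exe/HTML_RESOURCE as the resource. How the resource part is interpreted is left entirely to the protocol.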

Browsers and Protocols
      Although HTTP is the best-known and most widely used protocol on the Internet, browsers generally support a variety of different protocols. Figure 1 just scratches the surface. (It also contains references to sources where you can find the most up-to-date information about the standards.) HTTP, FTP, and Gopher are probably the three most common protocols implemented by Web servers. All the protocols listed in Figure 1 are supported not only by Internet Explorer, but also by Netscape Communicator and most other vendors' browsers.
      It's important to always remember that a browser is in no way tied only to HTTP. A browser is simply a piece of software that performs some actions by following a given protocol. Ultimately, a protocol is implemented by a piece of software, resident on the client machine, that is invoked when a browser encounters the prefix used as the protocol identifier. In this way, when the browser finds an address that begins with http:, it relies on the functions exposed by the module that handles HTTP. When the browser encounters an ftp: link, it calls the module that handles FTP protocol conversations.
      Once the interface of such a module is formalized, you have a generic layer of code that acts as a conduit to transfer data between the browser and the server. In this case, the server can be anything that can provide the requested information. It could be a Web server for http:, an email program for mailto:, or the local file system if the protocol is file:. Interestingly, the server will be a file if the protocol is res:, as I hinted above.
      By generalizing the structure of this protocol-handling layer and implementing it via a component object model like COM, the browser now has a far more modular architecture. At the same time, it is more extensible, since it's not dependent upon a fixed number of protocols.

Pluggable Protocols
      In the case of Internet Explorer 4.0 and higher, the natural evolution of this approach is geared toward pluggable protocols—that is, protocols whose support can be added after the browser's installation. By writing and registering protocol modules, you can extend and enhance the behavior of Internet Explorer.
      This is especially interesting when you consider the impact it can have on desktop applications based on the WebBrowser control. In fact, each new registered protocol allows you to define an easy and immediate way to access local or network resources in a custom, URL-based manner. By defining the system of rules that form a protocol, you define the steps that extract a given resource from its original context. In other words, a protocol is a way to encapsulate a behavior and complex logic in a short string like mailto: or http:. This has several advantages:

You can reduce a complex series of operations on a file to a short name with a few parameters.

You can access a specific piece of information within a file (like an HTML page in an executable module or a record in a database).

You can filter and check access to resources by sending them all through a single handler.
      There are many possible practical applications of custom URL protocols. You can extract a recordset from an OLE DB provider and format it as an HTML table. You can implement a mechanism that provides a list of files matching wildcards, like the MS-DOS® dir command. I'll discuss this in a moment, as well as investigate the relationship between pluggable protocols and the Windows® shell. But first, let's take a quick look at some of the additional protocols that are supported by Internet Explorer 4.0.

The res: Protocol
      The res: protocol lets you extract a resource from a compiled module like an EXE or DLL. While this protocol was introduced to work with HTML pages, you can use it to work with other types of resources as well, including custom resources. A URL based on this protocol looks like this:

res://resource file[/resource type]/resource id

where resource file is the name of the executable module. If the file is in the search path (for instance, in the Windows directory), it may be specified by file name alone.
      The second chunk of information, resource type, is optional. The res: protocol supports numbers for each of its predefined resource types, and allows you to use literal strings to identify custom resources. The complete list of the resource types is declared in winuser.h. Figure 2 lists the entries most commonly used via the res: protocol. Those are the same IDs required by some API functions like FindResource. If you don't provide a resource type, it defaults to HTML (type 23). This means that

res://ie4tour.dll/23/welcome.htm

and

res://ie4tour.dll/welcome.htm

are equivalent and will access the same page in the specified executable. If you want to refer to a custom resource, namely one whose type is not defined in winuser.h, then you have to use the name of the type. For example, if you have a file called resProt.dll with this line in its .rc file

MindLogo GIF mind.gif

then

res://resProt.dll/gif/MindLogo

is the correct way to reference the image.
      The final piece of information in the URL is the resource name. A resource name can be a number or a string. You can use any string to identify a resource, but if it evaluates to an external file name that's been embedded in the executable file, using the file name as an identifier makes sense. For example, if you have the following line in your .rc file

mind.gif GIF mind.gif

you can invoke it within an HTML page like this:

<img src="res://resProt.dll/gif/mind.gif"></img>

      The res: protocol allows you to compile HTML pages within your application so that there's just one file to distribute: the EXE. Figures 3 and 4 show the contents of an .rc file that includes very simple HTML pages. Notice that all the internal references are based on the res: protocol. This lets you embed an entire HTML-based application within a compiled module.
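      The res: URL syntax described above can be sketched as a small parser in portable C++. This is an illustration, not how Internet Explorer actually parses these URLs. It assumes that the module path contains only backslashes, so that forward slashes separate the three parts, and it fills in type 23 (HTML) when the type is omitted. ResUrl and ParseResUrl are hypothetical names.

```cpp
#include <string>

// Hypothetical holder for the three parts of a res: URL.
struct ResUrl {
    std::string file;  // executable module
    std::string type;  // resource type; "23" (HTML) when omitted
    std::string id;    // resource name or number
};

// Parses res://file[/type]/id, working backward from the last slash.
bool ParseResUrl(const std::string& url, ResUrl& out) {
    const std::string prefix = "res://";
    if (url.compare(0, prefix.size(), prefix) != 0) return false;
    const std::string rest = url.substr(prefix.size());

    // The resource id is whatever follows the last forward slash.
    const std::string::size_type lastSlash = rest.rfind('/');
    if (lastSlash == std::string::npos || lastSlash == 0 ||
        lastSlash + 1 == rest.size())
        return false;
    out.id = rest.substr(lastSlash + 1);

    // What remains is either "file" or "file/type".
    const std::string front = rest.substr(0, lastSlash);
    const std::string::size_type typeSlash = front.rfind('/');
    if (typeSlash == std::string::npos) {
        out.file = front;
        out.type = "23";  // default: HTML resource
    } else {
        if (typeSlash == 0 || typeSlash + 1 == front.size()) return false;
        out.file = front.substr(0, typeSlash);
        out.type = front.substr(typeSlash + 1);
    }
    return true;
}
```

      Under these assumptions, res://ie4tour.dll/23/welcome.htm and res://ie4tour.dll/welcome.htm parse to the same file, type, and id.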
      Figure 5 illustrates a sample that was written with Visual Basic®. The key is adding a compiled resource file (.res) to the project. The one I used was generated with Visual C++®. Though the sample is written with Visual Basic 6.0, it works with any 32-bit version of Visual Basic, provided that you have the Internet Explorer 4.0 WebBrowser control.
      The version of MFC that ships with Microsoft Visual Studio® 6.0 contains a cool feature to let you obtain the same result. As you may know, MFC includes a new class, CHtmlView, that is just a wrapper around the WebBrowser component. This class exposes a method called Navigate that lets you open the specified URL. Interestingly, CHtmlView also defines a method, LoadFromResource, that navigates to a page that's embedded in the application's resources. Here's how the method is declared:

BOOL LoadFromResource( LPCTSTR lpszResource );
BOOL LoadFromResource( UINT nRes );

As you can guess, LoadFromResource makes internal use of the res: protocol. The function's pseudocode looks like this:

CString strURL;
CHAR szMod[_MAX_PATH];
// Get the full path of the module that holds the resources
if (GetModuleFileName(hInstance, szMod, _MAX_PATH))
{
    strURL.Format("res://%s/%s", szMod, szResrc);
    Navigate(strURL, 0, 0, 0);
}

      The res: protocol addresses an issue to which many developers are sensitive. It's not the only possible solution for embedding HTML resources into an application. A lower-level approach might be to embed your resources as shown above, then extract and recreate them as separate files at startup using FindResource and other related APIs. This solution might be worth consideration if your target browser isn't Internet Explorer. The res: protocol is the most elegant solution, but browsers other than Internet Explorer 4.0 don't support it.

The about: Protocol
      Have you ever wondered about the about:NavigationCanceled URL that appears when you try to access unavailable resources with Internet Explorer? Well, about: is another pluggable protocol. Its role is to display either raw text or predefined pages using a short moniker. In one sense, about: is the Web equivalent of MessageBox. It is meant to help you display messages in an HTML page.
      The syntax for this protocol is:

about:text

The text portion can be raw HTML text or a kind of pointer to an HTML page. The browser first tries to find a matching page for the specified text. If it fails, it next considers the text portion to be plain text to display. The about: protocol is implemented in shdocvw.dll. Under the hood, the protocol's implementation ends up writing text in the document body. If you type the following in the Internet Explorer 4.0 address bar

about:Hello, MIND

the string "Hello, MIND" will appear on a blank page as if you'd loaded a page with this source code:

<HTML>Hello, MIND</HTML>

You can also enter more complex text such as:

about:Hello, <a href=www.microsoft.com/mind> MIND</a>

The result is shown in Figure 6.

Figure 6: Using about:


What's cool with about: is that you can define monikers to address specific HTML pages instead of plain text. For example, the content displayed by about:NavigationCanceled actually comes from an HTML resource, res://shdocvw.dll/navcancl.htm, that's stored in shdocvw.dll, as shown in Figure 7. But how does the browser know how to associate the NavigationCanceled moniker with the resource navcancl.htm? It's all stored in a table within the system registry, under this easy-to-remember key:

HKEY_LOCAL_MACHINE\Software\Microsoft\Internet Explorer\AboutURLs

Adding a new moniker is as easy as writing a new entry in the registry. If you have resprot.dll installed in your system directory and add an entry to the table that associates mind with res://resprot.dll/mind.htm, then about:mind will be a command that Internet Explorer 4.0 recognizes.
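      The lookup order described above (try the moniker table first, then fall back to plain text) can be sketched in portable C++. This is only a model of the behavior, not the code in shdocvw.dll. A std::map stands in for the AboutURLs registry table, the match is exact and case-sensitive for simplicity, and AboutTable, AboutResult, and ResolveAbout are hypothetical names.

```cpp
#include <map>
#include <string>

// Stand-in for the AboutURLs registry table: moniker -> URL of the page.
typedef std::map<std::string, std::string> AboutTable;

// Hypothetical result type for this sketch.
struct AboutResult {
    bool isRedirect;    // true: 'value' is a URL to navigate to
    std::string value;  // a URL, or the HTML text to write into the document
};

// Resolves an about: URL against the table, falling back to raw text.
bool ResolveAbout(const std::string& url, const AboutTable& table,
                  AboutResult& out) {
    const std::string prefix = "about:";
    if (url.compare(0, prefix.size(), prefix) != 0) return false;
    const std::string text = url.substr(prefix.size());

    const AboutTable::const_iterator it = table.find(text);
    if (it != table.end()) {
        out.isRedirect = true;  // a registered moniker: show the mapped page
        out.value = it->second;
    } else {
        out.isRedirect = false;  // no match: treat the text as page content
        out.value = "<HTML>" + text + "</HTML>";
    }
    return true;
}
```

      With a table entry that maps NavigationCanceled to res://shdocvw.dll/navcancl.htm, about:NavigationCanceled resolves to that page, while about:Hello, MIND falls through to the text branch.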

Figure 7: about: Navigation Canceled


      The about: protocol is also supported by Netscape Communicator 4.05, but it doesn't support any conversion tables in its implementation, and it's limited to outputting text in the document's body.

Invoking Protocols from the Shell
      The Windows shell API boasts a couple of functions, ShellExecute and ShellExecuteEx, that turn out to be really useful when you need to invoke resources through Internet protocols. Both let you execute actions (called verbs) upon executable programs and documents (including URLs).
      This means that you can connect to a remote URL simply by executing a command like:

ShellExecute( NULL, NULL, "mailto:despos@infomedia.it", NULL, NULL, SW_SHOW );

The third parameter can be any string that the browser can parse as a complete URL. This particular call uses the interface of your standard mail program to drop me a line. Likewise,

ShellExecute( NULL, NULL, "res://resprot.dll/mind.htm", NULL, NULL, SW_SHOW );

displays the specified page from the DLL, just as the URL about:mind would.

The IURLSearchHook Interface
      Internet Explorer 4.0 and above can manage URLs very flexibly. It not only lets you define custom URL protocols, but it also gives you a chance to modify and translate an unrecognized URL—that is, a URL without a specified protocol.
      Internet Explorer 4.0 first attempts to resolve the protocol itself by applying default protocols such as http: and file:. If those fail, it then tries enumerating all the in-process COM objects whose CLSIDs can be found under the following registry path:

HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\UrlSearchHooks

Each CLSID listed under this key has an empty string as its value. By default, the only entry under this key points to the Microsoft URL Search Hook, which is implemented in shdocvw.dll.
      A search hook is an in-process COM object that exposes the IURLSearchHook interface. This interface has a very simple structure with a single method, Translate, with the following prototype:

HRESULT Translate( LPWSTR lpwszSearchURL, DWORD cchBufferSize );

The lpwszSearchURL argument contains the URL the browser is unable to resolve. If the function is successful, it'll be replaced with the translated URL. The string is Unicode; cchBufferSize denotes the size of the buffer in characters, so the translated URL must fit within it.
       Figure 8 contains a simple ATL-based COM object that implements the IURLSearchHook interface. When it's loaded by Internet Explorer 4.0, it tries to verify whether the URL is an email address. If so, it prefixes the URL with "mailto:". The algorithm to determine whether the URL evaluates to an email address is the world's simplest: it checks for an @ character. If the hook receives

despos@infomedia.it

it returns

mailto:despos@infomedia.it

      Once you've installed this extension, you'll be able to type an email address in the Windows Explorer or Internet Explorer address bar and have a new message generated automatically by your default mail program. The ATL project, which was generated with Visual C++ 6.0 and ATL 3.0, is included in the source code for this article. The object makes all the necessary changes to your system registry during the registration process, which you can start manually with this call:

regsvr32 <path>/mailhook.dll
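      The translation step performed by the mail hook can be sketched in portable C++ without any COM plumbing. This is not the code from Figure 8. It is a stand-alone function that mimics the shape of Translate by rewriting the URL in place in a caller-owned buffer. TranslateMailAddress is a hypothetical name, the bool return value stands in for the HRESULT, and the buffer-size check is my own assumption about what a careful implementation would do.

```cpp
#include <cstddef>
#include <cwchar>

// Rewrites 'url' in place to "mailto:<url>" when it looks like an email
// address. 'cchBufferSize' is the size of the buffer in wide characters.
// Returns true when the URL was translated, false when it was left alone.
bool TranslateMailAddress(wchar_t* url, std::size_t cchBufferSize) {
    static const wchar_t kPrefix[] = L"mailto:";
    const std::size_t prefixLen = sizeof(kPrefix) / sizeof(kPrefix[0]) - 1;

    if (url == nullptr) return false;
    // The world's simplest email test: look for an @ character.
    if (std::wcschr(url, L'@') == nullptr) return false;

    const std::size_t len = std::wcslen(url);
    // Make sure prefix + text + terminator fit in the caller's buffer.
    if (len + prefixLen + 1 > cchBufferSize) return false;

    // Shift the existing text (including its terminator) to the right,
    // then copy the prefix into the gap.
    std::wmemmove(url + prefixLen, url, len + 1);
    std::wmemcpy(url, kPrefix, prefixLen);
    return true;
}
```

      Because the URL is rewritten in the same buffer it arrived in, checking the buffer size before shifting the text is what keeps the prefix from overrunning the caller's memory.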

How Pluggable Protocols Work
      Pluggable protocols are built on top of asynchronous URL monikers. A moniker is a system object that encapsulates a particular instance of a COM object, along with all its persistent data. It has the ability to locate and load the object into memory, starting from a file name. Monikers were introduced with the OLE 2 specification, and they were originally synchronous objects. The increasing prominence of the Internet and the need to access objects further away than a local-area network brought about a double evolution: asynchronous components with the ability to locate objects via a URL. URL monikers are a concrete implementation of the asynchronous moniker specification.

Figure 9: URL Monikers and Pluggable Protocols


      URL monikers are a COM-based technology that extends the original concept of the OLE moniker to provide a uniform interface to the download process. URL monikers are coded into the urlmon.dll module. Figure 9 illustrates how URL monikers and pluggable protocols fit in with the global picture.
      The power of urlmon.dll has been enhanced in Internet Explorer 4.0. It now can use external modules to implement and support new asynchronous protocols. In practice, urlmon.dll knows how to handle URLs that use the HTTP, FTP, and Gopher protocols. It delegates this to an internal COM module that utilizes WinInet calls to do the job. For any other required protocol, urlmon.dll passes the control to the specified pluggable object. Internet Explorer 4.0 itself comes with about a dozen asynchronous pluggable protocols. There are mailto:, res:, and about:, as well as lesser-known ones like javascript: and vbscript:. The javascript: protocol lets you execute JavaScript code directly from the address bar. The URL

javascript:d=new Date();alert(d.getYear());

displays a message box with the current year.
      An asynchronous pluggable protocol (APP) is required to implement a few COM interfaces to offer the basic functionality required by urlmon.dll. In general, an APP is expected to receive a string with an unknown URL protocol, parse it, analyze the various elements, and produce a stream of bytes as the output.

Figure 10: App Interfaces


      APPs implement two interfaces: IInternetProtocol and IInternetProtocolRoot. You need to provide about 10 functions to make an APP work. Two are particularly noteworthy: Start and Read. Both functions are called by urlmon.dll. The first one begins the protocol process, while the second actually obtains the data to display. The Start function receives a pointer to the IInternetProtocolSink interface implemented by urlmon.dll. This interface constitutes the means of communication between the APP and its caller (see Figure 10).
      An interesting example of an asynchronous pluggable protocol is available for download from http://support.microsoft.com. It's a Visual C++ 5.0 project that uses ATL to create the COM object. When I tried to compile this project, I found that it had some missing files. But creating a new ATL project and adding the downloaded files worked fine. Modifying this code to implement any other kind of custom protocol is a matter of replacing the behavior of a couple of functions. In particular, you should look at DoParse, which analyzes the URL composition, and Read, which returns the stream of bytes to urlmon.dll. Figures 11 and 12 list the functions to take into account when coding a pluggable protocol.
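      To show how Start and Read cooperate, here is a deliberately simplified sketch in portable C++. It is not a COM object and does not reproduce the real IInternetProtocol signatures, which take more parameters and return HRESULTs. The echo: protocol and the EchoProtocol class are hypothetical, and the caller's loop plays the role of urlmon.dll pulling data.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Simplified stand-in for a pluggable protocol handler.
class EchoProtocol {
public:
    EchoProtocol() : m_pos(0) {}

    // Start step: parse the URL and prepare the data to return.
    // "echo:" is a made-up protocol that wraps its text in HTML.
    bool Start(const std::string& url) {
        const std::string prefix = "echo:";
        if (url.compare(0, prefix.size(), prefix) != 0) return false;
        m_data = "<HTML>" + url.substr(prefix.size()) + "</HTML>";
        m_pos = 0;
        return true;
    }

    // Read step: copy up to cb bytes into pv and report how many were
    // copied. Returns false once nothing was copied.
    bool Read(void* pv, std::size_t cb, std::size_t* pcbRead) {
        const std::size_t n = std::min(cb, m_data.size() - m_pos);
        std::memcpy(pv, m_data.data() + m_pos, n);
        m_pos += n;
        *pcbRead = n;
        return n > 0;
    }

private:
    std::string m_data;  // the byte stream produced for the URL
    std::size_t m_pos;   // how much of it has been handed out so far
};
```

      A caller would invoke Start once with the URL and then call Read repeatedly with a fixed-size buffer until it returns false, appending each chunk to the output.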
      An APP must be registered to the system registry under

HKEY_CLASSES_ROOT\PROTOCOLS\Handler\<protocol>

where <protocol> is the name before the colon. Each such key has a CLSID value that represents the module that implements IInternetProtocol.

Application-based Protocols
      So far, I've examined the most powerful way to add custom URL protocols to Internet Explorer 4.0. There is another equally useful technique. Basically, you can associate a protocol with an application. This is what happens with protocols like mailto: or outlook:. In practice, each time you call a URL with that protocol, you end up invoking the registered application with the specified command line. For example, on my PC the following command lines execute after a URL with the mailto: or outlook: protocol is specified:

"C:\PROGRAM FILES\OUTLOOK EXPRESS\MSIMN.EXE" /mailurl:%1
D:\msoffice\Office\outlook.exe "%1"

This information is stored in the registry under

HKEY_CLASSES_ROOT\<protocol>\shell\open\command

where <protocol> is the name of the protocol in question, without the colon (like mailto or outlook).
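      The substitution that turns a registered command line into the one actually executed can be sketched in portable C++. This models only the %1 replacement. It assumes the URL is inserted verbatim, and ExpandCommand is a hypothetical name rather than a shell API.

```cpp
#include <string>

// Replaces every "%1" in the registered command line with the URL.
std::string ExpandCommand(const std::string& commandTemplate,
                          const std::string& url) {
    std::string result;
    std::string::size_type pos = 0;
    for (;;) {
        const std::string::size_type hit = commandTemplate.find("%1", pos);
        if (hit == std::string::npos) {
            // No more placeholders: copy the remainder and stop.
            result.append(commandTemplate, pos, std::string::npos);
            break;
        }
        result.append(commandTemplate, pos, hit - pos);
        result += url;
        pos = hit + 2;  // skip past the "%1"
    }
    return result;
}
```

      Applied to the Outlook Express command line above with the URL mailto:despos@infomedia.it, the sketch produces a command line ending in /mailurl:mailto:despos@infomedia.it.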

Further Reading
      This article doesn't exhaust the theme of Internet Explorer 4.0-based pluggable protocols. For example, security zones, namespace handlers, and MIME-type filters are all topics for further research. If you're interested, take a look at the documentation available on MSDN (http://msdn.microsoft.com). In addition, the Internet Client SDK area has an entire chapter dedicated to this topic.

From the January 1999 issue of Microsoft Internet Developer.