Incorporating the WebBrowser Control into Your Program, MIND July 1998

This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.

This article assumes you're familiar with Internet Explorer 4.0, C++, COM.

Incorporating the WebBrowser Control into Your Program
Michael Heydt

Microsoft Internet Explorer 4.0 is really a shell program that hosts the WebBrowser control. Your application can do the same with just a bit of COM.

Now and then I've wondered how I could use HTML within applications that I develop. It sure would be neat if I had an HTML parser and display engine that I could use to generate rich content displays in my C++ applications. Unfortunately, I've been too busy to take the time to write what would, in most respects, amount to a component-based Web browser.
      Well, with Microsoft® Internet Explorer 4.0, I got just that. Version 4.0 adds extensions that allow applications hosting an Internet Explorer WebBrowser control to interact with the underlying DHTML, and even to extend the DHTML object model. Whoo-hoo!
      Over the last year or so I have used the WebBrowser control to display HTML within various applications that I have written. The control was straightforward to add and provided quite a nice feature for those applications. But I couldn't design my own forms to use for data entry and also receive user-entered data from my application. The WebBrowser control was, for all intents and purposes, display-only.
      In my home office, I like to use Outlook™ Express as my Internet mail client. It's lightweight and has a very friendly user interface. Interestingly, this app's interface looks like it was implemented as HTML displayed in a WebBrowser control (see Figure 1). I found that Outlook Express was useful for interpreting and displaying HTML as part of email messages and newsgroup postings, a feature not available in other products at the time.
      Recently, Microsoft Outlook 98 was made available for download. I had heard that Outlook 98 was, like Outlook Express, able to compose and interpret HTML-based mail. So I spent what seemed like days downloading the beta on my 28.8 Kbps modem connection, and I was not disappointed. The Outlook 98 client does support HTML in mail messages as well as the stationery concept introduced with Outlook Express. It looks to me like Outlook 98 uses HTML to present the input panes for searching and displaying messages. Figure 2 shows the release version of Microsoft Outlook 98, displaying several forms that look suspiciously like HTML. If HTML is indeed being used here, the program can obviously access user-entered data.
Dr. Watson, Please!
      Fortunately, there's a relatively simple way to determine what's going on with a program: spyxx.exe (also known as Spy++). In my opinion, the Spy++ tool has been one of the most useful programs ever created for the Windows® platform. The information it gives you is invaluable for peeking into how an application is implemented. I fired up Spy++ and started to take a look at several applications.
      The first application I looked at was Outlook Express. Figure 3 shows Outlook Express with the Spy++ Find Window tool highlighting the Outlook Express window that appears to be hosting the HTML. The result of this operation is shown in Figure 4.
      The form in Figure 4 exposes the registered class name of the window: Internet Explorer_Server. A-ha! It sure looks like Internet Explorer is being reused by Outlook Express. Due diligence then led me to investigate Outlook 98, the Windows Explorer (with Internet Explorer 4.0 shell integration installed), and the Internet Explorer 4.0 application itself. Figures 5, 6, and 7 all show the programs being interrogated by Spy++, and each of these tests produced the same results. In each, Spy++ detected a rectangular window whose class name just happened to be Internet Explorer_Server.
      These results were actually just what I had expected. The Internet Explorer WebBrowser control is being used by all of these applications to provide rich content displays. But I was still not sure if it was actually the WebBrowser control, so I put together a quick and dirty application that displayed a Microsoft® WebBrowser control (CLSID_WebBrowser), and then used Spy++ on that application. Lo and behold, the class of the window was indeed Internet Explorer_Server! I had the confirmation I needed.
      Being able to reuse the WebBrowser control to display HTML is just fine and dandy, but it still doesn't tell me how these applications are interacting with the HTML and with any data that may have been input by the user. Since Internet Explorer 3.0, you've been able to connect an event sink object to the WebBrowser control to intercept certain events that it generates (BeforeNavigate, DownloadComplete, and StatusTextChange, for example). But version 3.0 of the control has no methods for accessing the HTML itself.
      After much research, I have found this to be exactly the area where Internet Explorer 4.0 adds functionality, through a new COM interface, IWebBrowser2, supported by the WebBrowser control. This new interface allows the application that is hosting the WebBrowser control to access the underlying DHTML document object model and extend it.
Internet Explorer and DHTML Object Models
      To effectively understand the features provided to your application by the IWebBrowser2 interface, you will first need a basic understanding of the object models of both Internet Explorer and DHTML. Of course, completely explaining these models would literally fill volumes, because each individual type of element (<B>, <HTML>, <BODY>, and so on) in a DHTML file has its own COM object class or interface to represent it.
      The best way to get a thorough understanding of these object models is to pick up a few books and development tools. Inside Dynamic HTML (Microsoft Press, 1997) is one of the best references available for the DHTML object model. Further technical information on the object model can be found by studying the header files from the Internet Client SDK that are associated with DHTML (in particular, mshtml.h, exdispid.h, mshtmhst.h, and mshtmdid.h). These files specify the interfaces supported by the various classes of objects. Unfortunately, the help files available on the Internet Client SDK are only of moderate assistance. They explain how to use the IWebBrowser2 interface in detail, but don't provide much information on other related COM interfaces like ICustomDoc and IDocHostUIHandler.
Essential Interface Usage
      Before I discuss how to use the Internet Explorer 4.0 objects within your application, let's go over a few interfaces.
      CLSID_WebBrowser is where everything must start—it's the CLSID of the WebBrowser ActiveX® control. Internet Explorer 4.0 is really just this control being hosted by an application called Internet Explorer.
      The IWebBrowser2 interface is implemented by the WebBrowser control and represents the primary means of interacting with the control. As an extension to the legacy interface IWebBrowser, this new interface provides a means of accessing the underlying DHTML document via the new get_Document method.
      You access and manipulate the DHTML document currently being displayed in the WebBrowser control via IHTMLDocument2. It lets you set and retrieve HTML elements of the document, set scripts to be handled by various events, and retrieve the interface of the window and frames displaying the document.
      IHTMLWindow2 is the interface to the window displaying an HTML document. This interface actually represents a collection of frames, of which the current window may be a member. It's similar to the HTML document object in that it allows you to manipulate the window, set the script code for various events, and obtain an event object representing the specifics of the user interaction with the window or document.
      IHTMLEventObj is an interface on an event object that contains information on the event such as the location of the cursor, keys pressed, and any HTML elements selected by the operation.
      IHTMLElement is the interface onto an object that represents an element of an HTML document. HTML elements roughly correspond to individual tags in the document. Its methods allow you to do things like set the script to be executed on various events and manipulate the innerHTML and outerHTML attributes of the element to change the actual displayed HTML.
      All of the interfaces described up to this point exist on various objects in the Internet Explorer DHTML object model. But in order for your application to handle events generated by these objects, you will need to understand several dispatch interfaces. The interfaces I'll go over next are all derived from IDispatch and are event sinks that an application must implement to receive events from an object. To use these sink objects, you would request the IConnectionPointContainer interface on the object for which you want to register one of these sinks. Once you have the connection point container, you can make the connection between the object and your event sink. When events occur, the object will send them to the provided sink. At a minimum, you should understand the following three dispatch interfaces.
      The DIID_DWebBrowserEvents2 interface allows you to monitor events generated by the WebBrowser control. One event of particular importance is DISPID_NAVIGATECOMPLETE2, which informs its host that the document has been completely constructed by the WebBrowser control. Once this event is fired, the IWebBrowser2::get_Document method will return an interface pointer to a valid DHTML document.
      With the DIID_HTMLDocumentEvents interface, a program can receive events raised by the DHTML document object, such as onmouseover, onmousemove, or onclick.
      The third interface is DIID_HTMLWindowEvents. The events provided by DIID_HTMLWindowEvents are raised by windows in the DHTML object model. Examples of these events include OnLoad, OnUnload, OnFocus, and OnBlur.
      Two other important interfaces, ICustomDoc and IDocHostUIHandler, allow you to customize the display of UI adornments for Internet Explorer and to extend the DHTML object model.
      The ICustomDoc interface is implemented by HTML document objects to allow a WebBrowser control host to set its IDocHostUIHandler interface. Normally Internet Explorer will call the host client site's QueryInterface method to obtain the IDocHostUIHandler interface. However, a hosting application can explicitly set the IDocHostUIHandler interface on an HTML document via ICustomDoc if the application does not support an IOleClientSite interface or if the application wants to save the document the trouble of repeated calls to QueryInterface.
      The IDocHostUIHandler interface is optionally implemented by the application hosting the WebBrowser control. By implementing this interface the host can replace the menus, toolbars, and context menus used by Internet Explorer 4.0. When WebBrowser needs to negotiate for user interface considerations (like window size), or when it needs to resolve the External property of the DHTML Window object, the WebBrowser control will QueryInterface the containing application's IOleClientSite interface for this interface (unless it is explicitly set).
      The control host can also extend the DHTML object model by providing an external automation object to the scripting engine that represents the host application. References to this external object in a script are resolved by the scripting engine, which issues a query to the application hosting WebBrowser for its IDocHostUIHandler interface. Then the scripting engine calls the get_External method on that interface. The container application then uses this call to return a dispatch interface to an automation object. Next, the scripting engine will invoke methods on this object via normal COM automation dispatch methods.

Reusing the WebBrowser Control to Host DHTML
      The first step along the road to fully implementing Internet Explorer hosting in your application is to include the WebBrowser control. The WebBrowser control is an ActiveX control, so you need only enable your application to host ActiveX components. A full discussion of providing ActiveX control support in a container is beyond the scope of this article, but you shouldn't have much trouble finding plenty of suitable documentation. In the sample application, the implementations of the ActiveX control container interfaces are handled by MFC.
      The sample provided with this article implements a form-based application that presents the user with two screens. The first form allows the user to enter properties and their values, and gives the user the option of pressing a button to navigate to a form that displays all of the previously entered properties. This report form has the ability to navigate back to the data entry form at a click of a button.
      The basic functionality of the input form is implemented with DHTML and some embedded VBScript. The application itself provides an extension to the DHTML object model that's presented to both of the DHTML forms; the extension allows the script in these forms to store and retrieve property name/value pairs from within the application. This is an interesting part of the application because the scripting extension lets a Web-based application use this client-side environment to maintain state between Web pages stored on the server.
      The sample application shows how to intercept events from the underlying DHTML document object to track the user's interaction with the elements on the DHTML forms. As the user moves the mouse over the document, the application intercepts onmouseover and onmouseout events to display the tag name and value of the HTML element that the mouse is currently moving over.
      The sample app also shows how to intercept the default functionality of Internet Explorer to display a context menu when the user right-clicks on the form. The application intercepts this event and short circuits the display of the menu, effectively hiding from the user the fact that Internet Explorer is being used to provide the display for the application.
      One last architectural issue that is embodied by the sample warrants explanation. Like Microsoft Outlook 98, this sample application displays HTML but does not actually retrieve this content from a Web server. In fact, the sample application (and, I assume, Outlook 98) retrieves its forms from within its resources. The WebBrowser control is capable of extracting resources from Win32® image files by a simple change in the format of the target URL. Instead of beginning the URL with http://, a URL requested by the sample application begins with res://. This tells the WebBrowser control to look for the data as a resource in an executable file. The format of the URLs used by the sample application is res://<exepath>/<ResourceName>, where <exepath> is obtained when the sample is initialized, via GetModuleFileName. The HTML forms are simply ASCII files that are included in the executable as custom resources. The resources can be given any name within the resource file; this name does not need to match the actual file name.
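      To make the res:// URL format concrete, here is a minimal sketch of the string assembly. The helper name make_res_url is hypothetical and does not come from the sample; the sketch assumes the executable path has already been obtained (on Windows, typically from GetModuleFileName) and simply joins the pieces.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the article's sample): assembles a URL
// that begins with res:// followed by the executable path and the name of
// a custom resource. exe_path is assumed to come from GetModuleFileName
// on Windows; here it is passed in so the sketch stays portable.
std::string make_res_url(const std::string& exe_path,
                         const std::string& resource_name)
{
    assert(!exe_path.empty() && !resource_name.empty());
    return "res://" + exe_path + "/" + resource_name;
}
```

      A host would pass the resulting string to the control's navigation method in place of an http:// address.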

Adding and Retrieving Properties
       Figure 8 shows the main screen of the sample app in action. Notice the elegant graphical DHTML display provided by the system (I borrowed the notebook motif from an Outlook 98 template). The form allows users to enter property name/value pairs by entering data into the edit fields and pressing the Add/Modify button. This button executes the script shown in Figure 9 from within the form. In the given example, the user has successfully entered a property name/value pair of "a,1" and a confirmation message box has been displayed.
       Figure 10 shows the results of pressing the List Properties button after the user has entered a few properties and their values. When the button is pressed, the script on the main form traps the onclick event for the button and calls the window.navigate method to move the WebBrowser control to the Property Report form.
      The Property Report page is generated by the HTML shown in Figure 11. This form contains an embedded script that executes when the form is loaded. The script calls into the sample and dynamically generates a table based upon the data the user has entered.

Intercepting Events in the DHTML Object Model
      Once you have the code to get the control up and displaying HTML, you can start to monitor events on various objects. The sample app accomplishes this by establishing event sinks on three objects in the model: WebBrowser, IHTMLDocument2, and IHTMLWindow2.
      Establishing an event sink on each of these interfaces follows the standard COM procedures for establishing a connection point. Take, for example, the sample program establishing a connection with the HTML document object via the IHTMLDocument2 interface. Figure 12 shows a code snippet from this sample that connects the HTML document object (IHTMLDocument2) to an event sink that can handle events from the HTML document object. This code is executed when the WebBrowser component fires the NavigateComplete event, signifying that a new HTML document object is available.
      The call to com_util::establish_connection_point does the work of establishing the connection point. This function will attempt to make the connection given four arguments: a pointer to an interface of the object to which you want to connect, the type of the sink to be established, a pointer to the actual sink, and a place to put the cookie representing the connection.
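      The shape of that call can be pictured with a small portable model. This is not the sample's com_util code, and it uses no COM headers; every type and function name below is a hypothetical stand-in. In real COM code the same steps are a QueryInterface for IConnectionPointContainer, a FindConnectionPoint call with the sink's interface ID, and IConnectionPoint::Advise, which hands back the cookie.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Stand-in for an event sink: a callable that receives an event ID.
using event_sink = std::function<void(long dispid)>;

// Simplified model of one connection point on a source object.
class connection_point_model {
public:
    // Like Advise: registers the sink and returns a nonzero cookie.
    std::uint32_t advise(event_sink sink) {
        assert(static_cast<bool>(sink));
        const std::uint32_t cookie = next_cookie_++;
        sinks_[cookie] = std::move(sink);
        return cookie;
    }
    // Like Unadvise: breaks the connection identified by the cookie.
    bool unadvise(std::uint32_t cookie) { return sinks_.erase(cookie) == 1; }
    // The source object notifies every connected sink when an event occurs.
    void fire(long dispid) const {
        for (const auto& entry : sinks_) entry.second(dispid);
    }
private:
    std::uint32_t next_cookie_ = 1;
    std::map<std::uint32_t, event_sink> sinks_;
};

// Model of a source object: one connection point per supported sink type.
struct event_source_model {
    std::map<std::string, connection_point_model> points;
};

// Hypothetical analogue of the four-argument helper: source object,
// sink type, the sink itself, and a place to store the cookie.
bool establish_connection(event_source_model& source,
                          const std::string& sink_type,
                          event_sink sink,
                          std::uint32_t& cookie)
{
    auto it = source.points.find(sink_type);   // like FindConnectionPoint
    if (it == source.points.end()) return false;
    cookie = it->second.advise(std::move(sink));
    return true;
}
```

      The cookie matters because it is the only handle the host keeps for the connection; it has to be presented again when the sink is disconnected.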
      Once the connection point is established, events will begin to flow from the document object to the specified sink. Figure 13 shows the Invoke method of the html_document object's sink and how it maps DISPIDs to method calls on the C++ html_document object. All the event-handling methods in the html_document base class initially return E_NOTIMPL. You must override members in this class to provide the required functionality.
      This is exactly what the my_document class does. my_document overrides the onmouseover and onmouseout events from the underlying DHTML document object. The my_document::onmouseover method (see Figure 14) is most important because it shows how to obtain the window that is displaying the document. From that window, it also shows how to obtain the event object that describes the onmouseover event. From the event object it is possible to obtain the html_element object that represents the actual element in the document that the mouse has moved over. Then, from this html_element object, the sample application obtains the tag name and ID of the element and displays them in the status bar.
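      The dispatch-and-override pattern described here can be sketched without any Windows headers. The classes below are simplified models rather than the sample's html_document and my_document; the event IDs are placeholders (the real DISPIDs live in mshtmdid.h), and the window, event, and element are reduced to plain structs. In the real object model the same walk goes from the window to its event object, then to the source element and its tag name and ID.

```cpp
#include <cassert>
#include <string>

enum class result { ok, not_implemented };   // stand-ins for S_OK / E_NOTIMPL

// Placeholder event IDs; the real values are defined in mshtmdid.h.
constexpr long id_onmouseover = 1;
constexpr long id_onmouseout  = 2;

struct element_model { std::string tag_name; std::string id; };
struct window_model  { element_model event_source; };  // window -> event -> element

class html_document_model {
public:
    virtual ~html_document_model() = default;
    // Counterpart of the sink's Invoke: map an event ID to a virtual handler.
    result invoke(long dispid) {
        switch (dispid) {
        case id_onmouseover: return onmouseover();
        case id_onmouseout:  return onmouseout();
        default:             return result::not_implemented;
        }
    }
protected:
    // Base handlers do nothing, mirroring the E_NOTIMPL defaults.
    virtual result onmouseover() { return result::not_implemented; }
    virtual result onmouseout()  { return result::not_implemented; }
};

class my_document_model : public html_document_model {
public:
    explicit my_document_model(const window_model* window) : window_(window) {
        assert(window_ != nullptr);
    }
    std::string status_text;   // what the host would show in its status bar
protected:
    result onmouseover() override {
        const element_model& element = window_->event_source;
        status_text = element.tag_name + " " + element.id;
        return result::ok;
    }
    result onmouseout() override {
        status_text.clear();
        return result::ok;
    }
private:
    const window_model* window_;
};
```

      Keeping the ID-to-method mapping in the base class means a derived class only overrides the events it cares about.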

Extending the DHTML Object Model
      The ability to extend the DHTML object model is arguably the most important feature that has been added to the functionality of the WebBrowser control. The DHTML object model that's accessible from a script exposes various objects to the code, regardless of the scripting language. One of these objects, named window, represents the window that is displaying the current document.
      The window object has a property named external. In VBScript, the external property represents an application hosting the HTML document. When a script accesses this object, the scripting engine issues a QueryInterface call to the IOleClientSite interface of the WebBrowser component's host application, asking it to return the IDocHostUIHandler interface (in the sample code here, it has been explicitly set). If the component's host supports the resolution of this external object, it will return a pointer to this interface. The scripting engine will then call that interface's get_External method. Through get_External, the host application can return an IDispatch interface pointer to an automation object that will represent the external object to the scripting engine.
      If the host returns an interface to an automation object from the get_External method, the scripting engine will use this information to resolve method calls that a script makes to the object. The engine will call GetIDsOfNames, and then make the appropriate Invoke calls on the automation object (see Figure 15).
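      The two-step resolution (name to ID, then ID to call) can be modeled in portable C++. This is a sketch under stated assumptions, not the sample's automation object: the member names SetProperty and GetProperty, the ID values, and all class and function names are hypothetical, and plain strings stand in for the VARIANT arguments that IDispatch::Invoke really receives.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical member IDs; a real automation object picks its own DISPIDs.
constexpr long id_unknown      = -1;  // same value as DISPID_UNKNOWN
constexpr long id_set_property = 1;
constexpr long id_get_property = 2;

// Model of the object a host could hand back for window.external.
class external_object_model {
public:
    // Counterpart of IDispatch::GetIDsOfNames: translate a name to an ID.
    long get_id_of_name(const std::string& name) const {
        if (name == "SetProperty") return id_set_property;
        if (name == "GetProperty") return id_get_property;
        return id_unknown;
    }
    // Counterpart of IDispatch::Invoke: run the member selected by the ID.
    bool invoke(long id, const std::vector<std::string>& args, std::string& out) {
        switch (id) {
        case id_set_property:
            if (args.size() != 2) return false;
            properties_[args[0]] = args[1];
            return true;
        case id_get_property: {
            if (args.size() != 1) return false;
            const auto it = properties_.find(args[0]);
            if (it == properties_.end()) return false;
            out = it->second;
            return true;
        }
        default:
            return false;
        }
    }
private:
    std::map<std::string, std::string> properties_;
};

// What the scripting engine does for each call a script makes on the object.
bool script_call(external_object_model& external, const std::string& member,
                 const std::vector<std::string>& args, std::string& out)
{
    const long id = external.get_id_of_name(member);
    if (id == id_unknown) return false;
    return external.invoke(id, args, out);
}

// Usage: storing a name/value pair and reading it back, as a script might.
void example_script_session()
{
    external_object_model external;
    std::string value;
    bool ok = script_call(external, "SetProperty", {"a", "1"}, value);
    assert(ok);
    ok = script_call(external, "GetProperty", {"a"}, value);
    assert(ok && value == "1");
}
```

      Because the properties live in the host object rather than in the page, they survive navigation from one form to the next.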
      You can use this feature for almost any purpose. The sample application uses it to provide client-side data storage to maintain state within a Web-based application. You could also use it to store objects for later retrieval, or let the framework download objects ahead of time. By preloading objects like this, you could have them ready when a page needs them, making the system a lot more responsive.

User Interface Issues
      Earlier I mentioned that if you provide an implementation of IDocHostUIHandler in your application, Internet Explorer will call your container to tell it how to handle user interface actions that it is about to take. An excellent example of this is when the user right-clicks on a Web page. Typically, Internet Explorer displays a popup menu that has a View Source choice. If the user selects this menu item, Internet Explorer starts a program like Notepad to display the HTML that comprises the page.
      This feature is great for learning about how Web pages are constructed. However, this feature also poses a significant problem. Suppose you have developed a Web-based application and have client-side application code implemented amongst the DHTML that makes up the Web pages. A user navigates to a page in the application and selects View Source. All of a sudden, your application code is exposed to the user and the rest of the world. Not too good for keeping your code proprietary!
      The sample shows how to prevent this scenario from happening. The solution is actually fairly simple. The implementation of the IDocHostUIHandler::ShowContextMenu method is handled within the C++ browser_control class by a contained class. The method implementation simply forwards the call to a virtual function, showcontextmenu, in browser_control. This method returns E_NOTIMPL, which causes Internet Explorer to go about its usual business. But the browser_control class in the sample is actually subclassed by my_browser_control, which overrides the showcontextmenu method. This implementation of the method returns S_OK, which tells Internet Explorer not to display its own context menu. Voilà! Your source code is no longer accessible.
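      Reduced to its essentials, the override looks like the sketch below. The classes are portable models named after the ones in the text, not the sample's actual code, and the enum stands in for the S_OK and E_NOTIMPL return values; builtin_menu_allowed is a hypothetical function playing the browser's side of the exchange.

```cpp
#include <cassert>

enum class result { ok, not_implemented };   // stand-ins for S_OK / E_NOTIMPL

// Model of the base class: declines to handle the menu, so the browser
// carries on and shows its own.
class browser_control_model {
public:
    virtual ~browser_control_model() = default;
    virtual result showcontextmenu() { return result::not_implemented; }
};

// Model of the derived class: claims the menu has been handled, which
// suppresses the built-in one (and with it the View Source choice).
class my_browser_control_model : public browser_control_model {
public:
    result showcontextmenu() override { return result::ok; }
};

// The browser's side of the exchange: it displays its own context menu
// only when the host did not report the request as handled.
bool builtin_menu_allowed(browser_control_model* host)
{
    assert(host != nullptr);
    return host->showcontextmenu() != result::ok;
}
```

      A host that wants its own popup menu instead of none would display it inside the override before returning the handled result.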

Redistributing Internet Explorer 4.0
      You may have been wondering what happens if the user does not have Internet Explorer 4.0 installed on their computer. Can your application still work? The short answer to this question is no. The long answer is that if Internet Explorer is not installed on the system, all hope is not lost. By obtaining a license from Microsoft, you can redistribute Internet Explorer with your application.
      Redistribution of Internet Explorer requires that your application setup program determine whether the proper version of Internet Explorer is currently installed on the target system. If version 4.0 is determined to not be present, your setup program can take advantage of the Internet Explorer 4.0 minimal installation package. This package will install the Internet Explorer 4.0 base browser, DirectShow™, and the Microsoft Win32 VM for Java. For more information on redistributing Internet Explorer 4.0 with your application see the Internet Client SDK, Internet Tools and Technologies, Licensing and Distribution.
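      The version test itself is simple once the setup program has a version string in hand. The sketch below assumes a dotted string such as "4.72.3110.8" has already been read from the target system; where that string comes from is left open here, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns the leading major number of a dotted version string, or -1 when
// the string is empty or does not start with a digit.
int major_version(const std::string& version)
{
    if (version.empty() ||
        !std::isdigit(static_cast<unsigned char>(version[0])))
        return -1;
    int major = 0;
    for (char c : version) {
        if (!std::isdigit(static_cast<unsigned char>(c))) break;
        major = major * 10 + (c - '0');
        if (major > 9999) break;   // guard against absurdly long digit runs
    }
    return major;
}

// True when setup should run the minimal installation package because the
// installed browser is missing or older than the required major version.
bool needs_browser_install(const std::string& installed_version,
                           int required_major = 4)
{
    assert(required_major > 0);
    return major_version(installed_version) < required_major;
}
```

      An empty string models the case where no browser version could be found at all, which is treated the same as an outdated one.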

Conclusion
      I like the model for developing applications provided by the Web: a browser on the client and the server anywhere you like. But with all of the hoopla, I think that some people have lost sight of the fact that this model is not appropriate for all applications. There are problems with window management and state. Nothing prevents a user from just going off and surfing away in the middle of their work, leaving the server in a bind. And don't get me started on the View Source issue again.
      I see the combination of reusing the WebBrowser control on the client side—along with HTML forms that are either delivered from a Web server or embedded within the application on the client—as providing solutions to a number of difficult application development questions. I leave you with the following thoughts on how this architecture can help solve some problems.

Controlling User Navigation in Web-based Applications
      Controlling user navigation in Web-based applications has two ramifications. First, by deploying a customized browser using the WebBrowser control instead of distributing Internet Explorer, the app can track and verify user navigation.
      Second, the custom browser may connect to a particular URL by default, and may not allow users to explicitly enter other URLs to wander off to. This will provide for a browser that is rooted (as in namespaces in the Windows Explorer) to a particular URL. This would also allow the server-side code to be more effectively designed since the server will not have to manage scenarios where the user has inadvertently exited the site.
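      A rooted browser comes down to one decision made each time the control is about to navigate: is the target still under the root? The sketch below shows one way to make that decision; the function names are hypothetical, the comparison is deliberately simple (case-sensitive, no handling of ".." segments or encoded characters), and a host would call it from its handler for the control's before-navigate event, which lets the host cancel the navigation.

```cpp
#include <cassert>
#include <string>

// True when url is the root itself or lies beneath it. A bare prefix match
// is not enough: "http://host/app" must not admit "http://host/application".
bool is_within_root(const std::string& url, const std::string& root)
{
    assert(!root.empty());
    if (url.size() < root.size()) return false;
    if (url.compare(0, root.size(), root) != 0) return false;
    if (url.size() == root.size()) return true;    // exactly the root
    if (root.back() == '/') return true;           // root already ends a segment
    const char next = url[root.size()];
    return next == '/' || next == '?' || next == '#';
}

// The value a before-navigate handler would hand back as its cancel flag.
bool should_cancel_navigation(const std::string& url, const std::string& root)
{
    return !is_within_root(url, root);
}
```

      With the check on the client, the server-side code can assume that every request it sees came from a page inside the application.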
      One of the major problems with developing a Web-based application is that data cannot be saved on the client between Web pages. Sure there are workarounds, but they all have their drawbacks. As elegant as some are (like ASP), they are still hacks to get around the problem. By using an application architecture that encompasses the techniques in this article to store data on the client, you will be able to more easily develop and deliver Web-enabled applications. You won't have to worry about passing data back and forth between client and server to maintain state.

Hiding HTML Source from Users
      The problem of hiding HTML source code from users has been dealt with directly within this article. But it is worth noting that hiding the code has several effects. First, your users can't see how your pages are structured and how the scripting works. Besides the fact that it keeps your application code and logic private, hiding code also prevents users from having valuable information that might help them in reverse engineering the operation and layout of your Web site in an effort to hack it.
      And hiding the HTML source will also hide the fact that the browser is being reused. This helps with the appearance of totally seamless integration of the browser into your application. Remember, one of the ideas of component software is that the user does not need to know what components are being used, or even that components are being used.
      Another major problem with developing Web-based applications is that you are basically limited to a forms-based model. How many times have you just wanted to pop up a dialog box on top of your HTML form? If you use the model described in this article, you can get one up there by providing an extension to the DHTML object model and having the hosting application do this work for you.
      Let's flip the coin and view this from a different perspective. Up to now I've mostly talked about how this framework could help develop Web-based applications. You can also look at another angle: adding Web application functionality to Win32-based applications. The use of the browser control in your application can allow you to build content-rich user interfaces without binding them to your executable. Your app could load its user interface dynamically from a Web server.
      Finally, the Web gives you easy implementation of a richly graphical user interface. One of the biggest challenges to developing Win32-based applications is programming a really nice graphical user interface for your application. To provide a great interface you need to deal with device contexts, palettes, graphics libraries, and a myriad of other issues. If you reuse the WebBrowser control, Internet Explorer takes care of this for you.

From the July 1998 issue of Microsoft Interactive Developer.