

MIND


This article assumes you're familiar with Dynamic HTML and Visual Basic.

Accessing the Internet Explorer Document Object Model from Visual Basic 5.0
Yasser Asmi

Want more control over the Web? The Internet Explorer Document Object Model brings the benefits of Dynamic HTML to apps built with Visual Basic 5.0.
You create rich user interfaces for your Web apps using Microsoft® Internet Explorer 4.0 and Dynamic HTML (DHTML). Why not use DHTML to do the same for your Visual Basic®-based applications? Blending technologies from Visual Basic and Internet Explorer gives you a powerful platform for building applications.
      In this article, I will look at the basics of working with the Internet Explorer Document Object Model and how to access it from Visual Basic. I will also show you how to build two sample apps. The first sample uses the WebBrowser control to analyze the elements in HTML documents. The second sample is an ActiveX® component that dynamically adds new HTML elements to the page and sinks events with those elements.

The Document Object Model
      The Internet Explorer Document Object Model is a set of COM objects that correspond to the elements you see on a Web page. Within Internet Explorer, every HTML element (<IMG>, <A>, <TABLE>, and so on) is programmable via an object that is part of the overall object model. You can modify the appearance and behavior of an HTML element by altering an object's properties and calling its methods. In addition to methods and properties, the objects also fire events to signal user interaction or changes in the corresponding HTML element. Basically, the Document Object Model lets you access the engine behind DHTML.
      To allow access to these objects, Internet Explorer creates a top-level window object for each HTML document it displays. From this window object you can access the rest of the object hierarchy by using properties and collections. For example, you will use the document property of the window object to retrieve the actual document object that represents the entire page. If your Web page has <FRAME> tags, then you can use the frames collection to access each frame object. I will cover the details of enumerating the frames collection later.
      When you write script code within your HTML document, you have access to the object model. Your script runs in the context of the window object that represents the HTML document. The window object is implied in script code, so use of the window keyword is optional. The following VBScript snippet shows how to use the window object to get to the location object and enumerate the frames collection:


 window.location.href = "http://www.microsoft.com"
 
 For i = 0 To window.frames.length - 1
     MsgBox window.frames(i).document.title
 Next
The Document Object Model is fairly complex but is well documented in MSDN Workshop, available on the Microsoft Web site (http://msdn.microsoft.com/workshop). To give you a general idea of what's available, Figure 1 lists a few important objects and their commonly used features. As mentioned earlier, there is a corresponding object for every HTML element. Some elements have more than one object, such as the <INPUT> tag, which can have various TYPE= attributes.

Accessing the Document Object Model
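      So far I have described what the object model gives you and how you can access it from a script. But how can you take advantage of this functionality from a Visual Basic-based app? Your application either needs to host Internet Explorer or Internet Explorer needs to host your application. From a high-level perspective, there are three scenarios that make this possible:

  1. Host the WebBrowser control on a form in your Visual Basic-based application.
  2. Automate Internet Explorer from your Visual Basic-based application.
  3. Build an ActiveX control in Visual Basic and embed it in a Web page.
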
      The method you select depends on the needs of your application. The first two methods require the user to start your Visual Basic-based application separately. The ActiveX control method allows a user to simply navigate to your Web page in an Internet Explorer window. Internet Explorer has extensive code download facilities for ActiveX controls, but keep in mind that the ActiveX control may have to be signed depending on the user's security settings. Now let's look briefly at the implementation details of these scenarios.

The WebBrowser Control
      Using the WebBrowser control from Visual Basic is pretty simple. The Visual Basic Application Wizard has an option to create a WebBrowser form. The template used by the Application Wizard was designed to work with Internet Explorer 3.x, so I will go over the steps needed to create a simple WebBrowser application in version 4.0.

  1. Create a standalone Visual Basic project.
  2. From the Components menu, add Microsoft Internet Controls (shdocvw.dll).
  3. From the References menu, add a reference to Microsoft HTML Object Library (mshtml.dll). If this is not listed, click Browse and select mshtml.dll from the Windows System folder.
  4. Drag the WebBrowser control onto your form and size it.
  5. Issue the initial navigate command from your Form_Load event procedure:
 WebBrowser1.Navigate "www.microsoft.com"
  6. Handle at least the DocumentComplete event of the WebBrowser control. This is important because it indicates that your Web page has finished loading. However, this event is fired once for each frame in your page as well as for the page itself. The following code demonstrates how to determine when the entire page has completed:
 Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, _
                                          URL As Variant)
     If (pDisp Is WebBrowser1.Object) Then
         MsgBox "your document is ready"
     End If
 End Sub
  7. Press F5 to run your project.

      To complete this simple WebBrowser application, you can add a text box where the user can type in a new URL and use the Navigate method again. You can add Back and Forward buttons and use the GoBack and GoForward methods. You can also make the WebBrowser control size properly when your form is resized.
      Remember that the DocumentComplete event is fired when the page is fully loaded. At this point you can use the Document property of the WebBrowser control to access all the objects of the currently loaded Web page. This is not the top-level window object, but it provides a way to get there. To access the window object, you can use the parentWindow property of the document object. The following statement demonstrates this:

WebBrowser1.Document.parentWindow
Remember that the Document property is valid only after the DocumentComplete event has fired.
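      As an illustration only, the DocumentComplete handler shown earlier could store typed references to both objects once the page is ready. The variable names mDoc and mWin are assumptions introduced for this sketch, and the Set statements assume the loaded document is an HTML page; a non-HTML document would fail the assignment to HTMLDocument.

 ' Module-level typed references (they require the MSHTML project reference).
 Private mDoc As HTMLDocument
 Private mWin As HTMLWindow2

 Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, _
                                          URL As Variant)
     ' Act only when the top-level page, not an individual frame, has finished.
     If (pDisp Is WebBrowser1.Object) Then
         Set mDoc = WebBrowser1.Document    ' the document object
         Set mWin = mDoc.parentWindow       ' the top-level window object
         Debug.Print mDoc.Title, mWin.location.href
     End If
 End Sub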

Internet Explorer Automation
      Although automating Internet Explorer is a little different from using the WebBrowser control, you have similar capabilities since it is based on the same IWebBrowser2 programming interface. The main difference is that an automated instance of Internet Explorer runs in its own separate window. Here are the steps to do this:

  1. Create a standalone Visual Basic project.
  2. From the References menu choice, add a reference to Microsoft Internet Controls (shdocvw.dll).
  3. From References, add a reference to the Microsoft HTML Object Library (mshtml.dll).
  4. Declare a private variable in your form's declarations section:
Dim WithEvents mIE As InternetExplorer
  5. Add the following code to your Form_Load event procedure:
 Set mIE = New InternetExplorer
 mIE.Visible = True
 mIE.Navigate "www.microsoft.com"
  6. Add the following code to the mIE_DocumentComplete event:
 If pDisp Is mIE Then
     MsgBox "Document is ready"
 End If
  7. Press F5 to run your project.
      In this case, the mIE variable holds a reference to the Internet Explorer automation object. Since I declared it as WithEvents in step 4, my Visual Basic project will receive event notification from the referenced object. You can use this reference just as you would the WebBrowser control in the previous example. The events you receive are identical to the events for the WebBrowser control. Similarly, you can use the following statement to refer to the top-level window object of the Web page being displayed:

 mIE.Document.ParentWindow
Again, the Document property is only valid after the DocumentComplete event has fired.

ActiveX Control
      Creating an ActiveX control is quite simple in Visual Basic.

  1. Create an ActiveX Control project in Visual Basic.
  2. From References, add a reference to Microsoft HTML Object Library (mshtml.dll).
  3. Save the project and compile it into an OCX.
  4. From the Project Properties Component tab, set Versioning to Binary compatibility and specify the OCX file you just created. This keeps the same CLSID for your control every time you compile.
  5. Using an <OBJECT> tag, insert your control into an HTML file. You can use Visual InterDev™ to do this, or you can create an Internet Download Setup for your project in Visual Basic and use the default HTML file generated for your control.
      Now that you have an ActiveX control embedded in an HTML file, you can access the HTML document containing your control by adding the following expression in your UserControl code:

 UserControl.Parent.script.document
To access the top-level window, you can use this code:

 UserControl.Parent.script.document.parentWindow
Since this expression is somewhat tedious to type, you can store a reference to the document in a variable. That is why I added MSHTML to the project references.
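      A minimal sketch of that idea follows. It assumes the reference is taken in the UserControl_Show event, when the control is already sited in a page, and the variable name mDoc and the status-bar message are illustrative only.

 ' Typed reference to the page that hosts this control.
 Private mDoc As HTMLDocument

 Private Sub UserControl_Show()
     ' Parent is only meaningful once the control is sited in an HTML page.
     Set mDoc = UserControl.Parent.Script.document

     ' From here on, the shorter name can be used for the rest of the model.
     mDoc.parentWindow.status = "Control loaded in: " & mDoc.Title
 End Sub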

Using mshtml.dll in References
      You may be wondering why I have been adding references to mshtml.dll in the projects. MSHTML provides type information for all the objects in the Internet Explorer Document Object Model. This DLL is responsible for rendering HTML, hosting ActiveX controls, and providing the object model. By adding a reference to it in the Visual Basic project, I can declare early-bound, typed variables like this:


 Private mDoc As HTMLDocument
 ' in  DocumentComplete
 Set mDoc = WebBrowser1.Document     
More importantly, I can declare the mDoc variable using the WithEvents keyword and start to receive events for it as soon as I set it to a document object.
Figure 2: Visual Basic Object Browser

      To see the rich object model in Internet Explorer, you can use the Visual Basic Object Browser. After you have added a reference to MSHTML to your project, press F2 to display the Object Browser, then select the MSHTML library from the combobox. You will see numerous classes—with names like HTMLWindow2—in the left-hand pane. Once you click on a class, the Object Browser will display all the properties, methods, and events for that object (see Figure 2).
      These classes in MSHTML are equivalent to the objects I have been discussing. For example, HTMLDocument is the class that corresponds to a document object; HTMLWindow2 corresponds to the window object. In fact, if you use the TypeName function on various objects, the class names that you will get are shown in Figure 3.
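      To explore these class names yourself, a small helper along the following lines can print them to the Immediate window. DumpTypeNames is a hypothetical name introduced for this sketch, and the exact names printed depend on the page and on the version of MSHTML installed.

 ' Hypothetical helper; call it from DocumentComplete, for example:
 '     DumpTypeNames WebBrowser1.Document
 Private Sub DumpTypeNames(ByVal doc As Object)
     Dim elem As Object

     Debug.Print TypeName(doc)                ' class behind the document object
     Debug.Print TypeName(doc.parentWindow)   ' class behind the window object

     ' One line per element: the tag name and the class that implements it.
     For Each elem In doc.All
         Debug.Print elem.tagName, TypeName(elem)
     Next
 End Sub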

Inspecting Elements on a Web Page
      My first sample app (see Figure 4) will list all the elements in a Web page by browsing through the all collection of each frame and displaying each item in a TreeView control. The sample also creates a TextRange object for each body element to display just the text information out of the Web page.
      You can download the sample from the MIND Web site, but it is fairly simple to create yourself. Simply start from the WebBrowser control project I created earlier. On the same form as WebBrowser1, add a command button (cmdForward), a textbox (txtAddress), a multiline textbox (txtText), and a TreeView control (tvTreeView). Now you only need to add the code.
      To follow the code, start in WebBrowser1_DocumentComplete. In this event handler, I clear the tree view and call RecurseFrames with the WebBrowser1.Document property. This gives me the document object. RecurseFrames fills the tree view by calling FillTree and displaying the text information. It will also call itself for each frame contained within this document. A document may have frames that are represented with window objects and have a document object themselves. This means you can walk through all the frames recursively, passing a reference to the document object. A reference to the tree node is also passed as a parameter so the function knows where to add new items. FillTree is used to display the all collection. It can filter and display only the elements whose tag name is passed as a parameter.
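      The flow just described can be sketched as follows. This is a reconstruction from the description, not the listing in Figure 4: the procedure signatures, the flat layout of the tree nodes, and the use of late-bound Object variables are all assumptions. Frames that come from a different domain can raise an access-denied error, which this sketch does not handle.

 Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, _
                                          URL As Variant)
     ' Rebuild the display only when the whole page has finished loading.
     If (pDisp Is WebBrowser1.Object) Then
         tvTreeView.Nodes.Clear
         txtText.Text = ""
         RecurseFrames WebBrowser1.Document, Nothing
     End If
 End Sub

 Private Sub RecurseFrames(ByVal doc As Object, ByVal parentNode As Node)
     Dim docNode As Node
     Dim i As Long
     Dim idx As Variant

     ' One tree node per document, nested under the node of its parent frame.
     If parentNode Is Nothing Then
         Set docNode = tvTreeView.Nodes.Add(, , , doc.Title)
     Else
         Set docNode = tvTreeView.Nodes.Add(parentNode.Index, tvwChild, , doc.Title)
     End If

     FillTree doc, docNode, ""          ' an empty filter lists every element

     ' Frameset pages have a FRAMESET instead of a BODY, so check before
     ' creating the TextRange that supplies the plain text.
     If Not (doc.body Is Nothing) Then
         If UCase$(doc.body.tagName) = "BODY" Then
             txtText.Text = txtText.Text & doc.body.createTextRange.Text & vbCrLf
         End If
     End If

     ' Each frame is a window object with a document object of its own.
     For i = 0 To doc.frames.length - 1
         idx = i                        ' the index is passed as a Variant
         RecurseFrames doc.frames.Item(idx).document, docNode
     Next
 End Sub

 Private Sub FillTree(ByVal doc As Object, ByVal parentNode As Node, _
                      ByVal tagFilter As String)
     Dim elem As Object

     ' Add one child node per element whose tag name passes the filter.
     For Each elem In doc.All
         If tagFilter = "" Or UCase$(elem.tagName) = UCase$(tagFilter) Then
             tvTreeView.Nodes.Add parentNode.Index, tvwChild, , elem.tagName
         End If
     Next
 End Sub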
      From a document object, you can access each HTML element on the page using the all collection. You can use a For Each loop, a regular For loop, or an ID to search for an element. Here's how to do each of these:


 For Each Obj in mDoc.All  'or For I = 0 To mDoc.All.Length - 1
    'do something with Obj or mDoc.All.Item(I)
 Next
 mDoc.All("MyFieldID1").Style.Visibility = "hidden"
Note that when searching using an ID, this statement will not work if more than one element has the same ID, which is a common technique in DHTML. In that case, the all collection returns a collection object containing every matching element instead of a single element.

Accessing Properties and Methods
      Each object returned from the all collection represents an HTML element. Check the tagName property to determine which element you are looking at. For example, an anchor element will have a tag name of A. All the attributes that can be specified for the tag are exposed as properties. Based on the tagName, you can access the relevant properties. For example, the most important attribute of an anchor element is HREF. Therefore, HREF is exposed as the href property of the anchor object. For an <INPUT> object you may want to look at the type property to see if it is checkbox, radio, text, and so on.
      In addition to the properties that correspond to the attributes, there are other properties to facilitate DHTML's dynamic nature, such as the innerText and the innerHTML of an HTML element. The innerText property gives you just the unformatted text, while innerHTML gives you the HTML contained within the tags. CSS Style is exposed as a style property that refers to an HTMLStyle object. The HTMLStyle object lets you examine or change all the CSS properties such as positioning and background color.
      You can also call methods on the object, including focus, blur, and scrollIntoView. Some elements also have the click method. The <SELECT> element, for example, has an add method that allows you to add new items to the listbox it represents. Additionally, the insertAdjacentText and insertAdjacentHTML methods allow you to dynamically add new HTML elements to the page before or after the element you call the method on. My second sample app will demonstrate this technique.

Dynamic Content from an ActiveX Control
      My second sample app is an ActiveX control that creates the rest of the page dynamically from within the UserControl code. The only element you place in your HTML is the <OBJECT> tag for this control. When the page loads, the control adds a few links and fields on the page. It sinks events with these newly created elements and processes the events in the Visual Basic code.
      All the code required is shown in Figure 5. Simply start from the ActiveX control project I created earlier in this article, then add the code you see in the UserControl module and compile the OCX. Set Binary Compatibility so the component's ID remains constant, and reference the newly created OCX. This is a good idea if you are not going to be changing the public interface of the control. Then, insert the control in HTML as follows:


 <HTML>
 <HEAD>
 </HEAD>
 <BODY>

 <OBJECT ID="TEST" WIDTH=0 HEIGHT=0
 CLASSID="CLSID:F9F570DC-D47C-11D1-B676-00C04FC203B4">
 </OBJECT>

 </BODY>
 </HTML>
Use your own CLSID instead of the one shown here. You can find your CLSID by creating an Internet Component Download for your control using SetupWizard.
      You can also use Visual InterDev to insert this <OBJECT> tag. Make sure that WIDTH=0 and HEIGHT=0 so the ActiveX control itself is not displayed. As you can see, the HTML page does not include any elements other than a hidden ActiveX control—this object does all the work on the page.
      To follow the code, look at the declarations section. I declared two object references using the WithEvents keyword: HTMLDocument and HTMLInputButtonElement. I set values for these references from my UserControl_Show event, which gets called when the page is fully loaded and my ActiveX control is becoming visible for the first time. The first thing UserControl_Show does is get a reference to the document it is in. This reference triggers the flow of events for the document. The ActiveX control also inserts new content into the document.

Inserting Elements and Receiving Events
      You can insert new HTML elements into your page by using a method called insertAdjacentHTML. To call this method you need to have a reference to an HTML element; you insert new elements either before or after an existing element. The first parameter to this method indicates where you want to insert the elements. Commonly used values for this are AfterEnd or BeforeBegin. The second parameter is a string value that contains the actual HTML code you want to insert.
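      For illustration, the two calls below add a button and then a link. They assume mDoc is an HTMLDocument reference obtained earlier; the IDs btnHello and lnkHome and the inserted markup are made up for this sketch.

 ' Append a button as the last child of the BODY element.
 mDoc.body.insertAdjacentHTML "BeforeEnd", _
     "<INPUT TYPE=button ID=btnHello VALUE=""Say hello"">"

 ' Insert a line break and a link immediately after the new button.
 mDoc.All("btnHello").insertAdjacentHTML "AfterEnd", _
     "<BR><A ID=lnkHome HREF=""http://www.microsoft.com"">Home</A>"

BeforeEnd places the new content inside the element, just before its closing tag, whereas AfterEnd places it outside the element, immediately after it.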
      You have already seen how to call methods and properties of the DHTML objects from your Visual Basic code. But how do you receive events from these objects? First you need to declare a reference variable using the WithEvents keyword. You must specify a type since you cannot use WithEvents when declaring something as a generic Object. Visual Basic needs to know what events to let you write code for. You can get this type information by using the Object Browser and finding the element for which you are interested in trapping events. Make sure that the element has events listed in the Object Browser. Once you declare something as WithEvents, the Procedures/Events Box of your code window will list all the object's available, scriptable events. Simply select an event and write code to respond to it.
      While writing code for an event handler for an HTML element, you need to look at the DHTML event object. This is exposed as a property of the window object. You can access this using mDoc.parentWindow.event, where mDoc is a document object. The event object gives you information like mouse position for click events and a reference to the source element (srcElement) where the event originated.
      The event object has another important property: cancelBubble. Most DHTML events bubble up through the element hierarchy. For example, if you click on a field in an HTML form, you can capture this event at the field level, at the form level, and at the document level. This is why it is important to look at the srcElement property to find out where the event took place. At the same time, leave the cancelBubble property set to False (its default) if you want the bubbling to continue; set it to True to stop the event from reaching the containing elements. Event bubbling is a very powerful concept and makes handling events really flexible.
      Now, you don't actually start receiving any events until you set the WithEvents-declared reference variable to some value. You have to point the WithEvents variable to a real object to sink events with that object. This can be done using the all collection:


 Private WithEvents mMyDivTag As HTMLDivElement
 Set mMyDivTag = mDoc.all("MyDivTag")
Once you set the reference to an object, your code will be called every time an event is raised. This will continue to happen until you set your reference to Nothing.
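      Putting these pieces together, the following is a minimal sketch rather than the article's own listing. It assumes a page containing a DIV whose ID is MyDivTag, and the helper names HookEvents and UnhookEvents are hypothetical. The event procedures are written as Boolean functions because that is the shape Visual Basic typically generates for MSHTML click events; select the event from the Procedures/Events box rather than typing the signature, since it can differ between versions of MSHTML.

 Private WithEvents mDoc As HTMLDocument
 Private WithEvents mMyDivTag As HTMLDivElement

 ' Hypothetical helper: call it once the document has finished loading.
 Private Sub HookEvents(ByVal doc As HTMLDocument)
     Set mDoc = doc                         ' document-level events start here
     Set mMyDivTag = doc.All("MyDivTag")    ' element-level events start here
 End Sub

 ' Fires first for clicks that originate inside the DIV.
 Private Function mMyDivTag_onclick() As Boolean
     ' Uncomment the next line to keep the click from bubbling to mDoc_onclick.
     ' mDoc.parentWindow.event.cancelBubble = True
     mMyDivTag_onclick = True               ' True lets the default action proceed
 End Function

 ' Fires for every click on the page that is allowed to bubble this far.
 Private Function mDoc_onclick() As Boolean
     Dim evt As Object
     Set evt = mDoc.parentWindow.event

     ' srcElement identifies where the click actually took place.
     Debug.Print evt.srcElement.tagName, evt.clientX, evt.clientY
     mDoc_onclick = True
 End Function

 ' Hypothetical helper: release the references to stop receiving events.
 Private Sub UnhookEvents()
     Set mMyDivTag = Nothing
     Set mDoc = Nothing
 End Sub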
      If you are sinking events with an ActiveX control embedded in the Web page, you have to add a reference to the typelib for the ActiveX control in your project References. For your own Visual Basic-based ActiveX control, you can create a typelib by checking the Remote Server Files option under project properties.

Conclusion
      In this article I showed you three ways to hook into Internet Explorer from Visual Basic: using the WebBrowser control, Internet Explorer automation, and an ActiveX control. I looked at accessing the Internet Explorer Document Object Model, as well as calling methods and properties on HTML elements and receiving events from these elements.
      Using both the rich object model of Internet Explorer and the ease of Visual Basic provides a powerful programming model. With the information in this article, you should be able to start thinking of problems you can solve using this blend of technologies.
      To find more detailed information, please visit the Microsoft MSDN Workshop home page. This site has detailed online help and samples. Additionally, please visit the Internet Client SDK support site, which contains FAQs and links to related Knowledge Base articles.

From the August 1998 issue of Microsoft Interactive Developer.