MSHTML is a rendering engine and parser for HTML. Introduced in Microsoft® Internet Explorer 4.0, it is the main HTML component of the Internet Explorer Web browser and can be reused in other applications. It hosts ActiveX Controls and supports the OLE Controls 96 (OC96) specification for windowless controls. Controls that use the OC96 interfaces can perform better, and they can be transparent and irregularly shaped.
MSHTML itself is an Active Document, so it can be hosted by implementing the Active Document interfaces in an application. MSHTML can also be customized by aggregation to create a special-purpose HTML Active Document. Applications that contain MSHTML supply their own toolbars and menu UI. The container can also override the default context menus.
MSHTML can also be used without UI activation, purely for its ability to parse HTML. After loading HTML, you can use the object model to access the underlying HTML and modify any element. COM objects hosted by MSHTML (such as ActiveX Controls) can also access the DHTML Object Model and, as a result, integrate closely with the MSHTML container and affect the host.
For many purposes, it is more appropriate to host the WebBrowser control to integrate browsing into an application. The WebBrowser control hosts MSHTML and adds support for in-place linking and navigation. For additional information on the WebBrowser control, see Reusing the WebBrowser Control.
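MSHTML can be hosted to form an integral part of an application. This can be done in two ways: by hosting MSHTML as an Active Document, or by aggregating MSHTML to create a customized Active Document.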
As an Active Document, MSHTML follows the standard OLE mechanisms for negotiating with the container for menu merging and displaying its toolbars. MSHTML supports only the single-view implementation of Active Documents. Commands can be sent and received using the IOleCommandTarget interface, which is also part of the Active Document specification. The standard command group (identified by a NULL command group GUID) is supported, as well as a group of extended commands specific to MSHTML.
A host of MSHTML can turn off the MSHTML menus and toolbars and supply its own UI. The host can also intercept any of the MSHTML UI, such as context menus and message boxes. For more information about intercepting UI, see Replacing the MSHTML User Interface. For more information about implementing an Active Document host, see Programming an Active Document Container.
If your application needs an Active Document that customizes MSHTML behavior, it can aggregate MSHTML to create a new class. MSHTML is designed to support aggregation, and overriding a small number of interfaces can customize much of its behavior.
The minimum interfaces that need to be supplied by an aggregator are IOleObject and the IPersist* interfaces.
These interfaces supply the object's ClassID and UserType properties (through IOleObject::GetUserClassID, IOleObject::GetUserType, and IPersist::GetClassID methods). The aggregator should supply its own implementations to differentiate the new object from MSHTML.
Some common functionality that can be customized by aggregation includes:
Other customization can be done by supplying services to MSHTML.
MSHTML provides or uses interfaces that allow an aggregator to supply ambient properties to it without having to implement a full client site for hosting MSHTML. The site object normally supplies ambient properties, but a site object has a number of other interfaces that are not trivial to implement. Using this technique, the aggregator can let the container's client site pass through to MSHTML while the aggregator still supplies ambient properties.
If MSHTML is tightly integrated into an application, it might be desirable to have the application supply its own menus, toolbars, and other UI. To do this, the host can supply optional services to MSHTML that allow the host to hook into and replace the MSHTML UI. In implementing these interfaces, a host knows that MSHTML is the component being hosted and has knowledge of the MSHTML command set.
When replacing the MSHTML UI, the host turns off the MSHTML menus and toolbars and can replace them. This does not involve any negotiation. A host of MSHTML can display message boxes, Help UI, and all in-place active UI on behalf of MSHTML by implementing two interfaces, IDocHostUIHandler and IDocHostShowUI.
To replace the menus, toolbars, and other UI for MSHTML, the host should implement the IDocHostUIHandler interface. MSHTML obtains this interface by passing IID_IDocHostUIHandler to the QueryInterface method of the host's client site object. If the host does not implement a client site, the host can query the MSHTML document object for the ICustomDoc interface and call the ICustomDoc::SetUIHandler method to set the MSHTML UI handler.
When the IDocHostUIHandler interface is present, MSHTML delegates a number of IOleInPlaceObject and IOleInPlaceActiveObject methods directly to it, allowing a host to hook into methods such as IOleInPlaceActiveObject::TranslateAccelerator and IOleInPlaceActiveObject::ResizeBorder. If the host returns S_OK from IDocHostUIHandler::ShowUI, MSHTML hides its menus and toolbars and does not call the IOleInPlaceFrame methods for menu merging and border space negotiation.
To supply message boxes and Help UI, the host should implement the IDocHostShowUI interface. MSHTML obtains this interface by calling the host's client site QueryInterface, requesting IID_IDocHostShowUI. The site object implements both IOleDocumentSite and IOleClientSite.
Because MSHTML can be used in a number of different host environments, each host environment must be able to set its own default user preferences in the registry. MSHTML allows its host to specify a special registry key under which to store the default preferences.
MSHTML calls IDocHostUIHandler::GetOptionKeyPath on the host at initialization to allow the host to specify the registry location for the preference settings. MSHTML then uses this registry key to store and get the settings for its preference property pages. If the host returns S_FALSE for this method, or the returned registry key path is NULL or empty, MSHTML reverts to its own default set of options.
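The fallback rule described above can be sketched as a small decision function. This is a portable model, not Windows SDK code: `EffectiveOptionKey`, `kSOk`, `kSFalse`, and `kDefaultOptionKey` are hypothetical names introduced for illustration, and the placeholder string stands in for whatever default location MSHTML actually uses. Failure codes other than S_FALSE are not covered by the text and are not modeled here.

```cpp
#include <string>

// Stand-ins for the Windows HRESULT values S_OK (0) and S_FALSE (1).
constexpr long kSOk = 0L;
constexpr long kSFalse = 1L;

// Hypothetical placeholder for the options MSHTML uses when the host supplies no key.
const std::wstring kDefaultOptionKey = L"<MSHTML default options>";

// Models the rule in the text: the host-supplied key is honored only when the
// host did not return S_FALSE and the returned path is neither NULL nor empty.
std::wstring EffectiveOptionKey(long hostResult, const wchar_t* hostKeyPath)
{
    if (hostResult == kSFalse)
        return kDefaultOptionKey;              // host declined to supply a key
    if (hostKeyPath == nullptr || hostKeyPath[0] == L'\0')
        return kDefaultOptionKey;              // NULL or empty path
    return hostKeyPath;                        // host-specific preference key
}
```

For example, with a made-up key such as `L"Software\\Contoso\\Viewer"`, passing `kSOk` returns the host's key, while passing `kSFalse`, a null pointer, or an empty string returns the placeholder default.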
There are three situations where MSHTML needs to resolve URLs or load data:
For a host to get MSHTML to load and display the page at a specific URL, the host constructs a moniker using the CreateURLMoniker API call. Then it calls IPersistMoniker::Load on MSHTML. The MSHTML implementation of IPersistMoniker is specifically designed to support the loading of data asynchronously over slow links.
For more information about these interfaces, see URL Monikers Overview.
MSHTML is also capable of loading and saving HTML through its implementation of the IPersistStreamInit and IPersistFile interfaces. Both implementations operate asynchronously.
Because MSHTML loads documents asynchronously, it might not be possible to gain immediate access to the object model of the requested document. To determine when the requested document has completely loaded, a hosting application should implement the OnChanged method of the IPropertyNotifySink interface. Use the standard connection point protocol to advise MSHTML of the availability of this outgoing interface.
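The advise, notify, and unadvise life cycle of the connection point protocol can be illustrated with a small portable model. In real code the host would call IConnectionPointContainer::FindConnectionPoint for IID_IPropertyNotifySink and then IConnectionPoint::Advise, which returns a cookie that is later passed to IConnectionPoint::Unadvise. The class below is only a stand-in for that behavior: `PropertyNotifyPoint`, `FireOnChanged`, and `kDispidReadyState` are hypothetical names, and a `std::function` takes the place of the sink interface.

```cpp
#include <functional>
#include <map>
#include <utility>

// Value of DISPID_READYSTATE in the Windows SDK headers.
constexpr long kDispidReadyState = -525;

// Hypothetical stand-in for a connection point. Sinks are registered with
// Advise, identified by a cookie, and removed with Unadvise.
class PropertyNotifyPoint
{
public:
    using Sink = std::function<void(long dispid)>;

    // Registers a sink and returns the cookie that identifies it.
    unsigned long Advise(Sink sink)
    {
        const unsigned long cookie = nextCookie_++;
        sinks_[cookie] = std::move(sink);
        return cookie;
    }

    // Removes a sink; returns false if the cookie is unknown.
    bool Unadvise(unsigned long cookie)
    {
        return sinks_.erase(cookie) == 1;
    }

    // Plays the role of the document calling OnChanged on every advised sink.
    void FireOnChanged(long dispid) const
    {
        for (const auto& entry : sinks_)
            entry.second(dispid);
    }

private:
    unsigned long nextCookie_ = 1;
    std::map<unsigned long, Sink> sinks_;
};
```

A host-side sink in this model would compare the incoming DISPID with `kDispidReadyState` and ignore every other property change, then call `Unadvise` with the saved cookie before releasing the document.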
When the ready state of the document changes, MSHTML calls the host's implementation of IPropertyNotifySink::OnChanged. The following code example shows how to get the ready state from the document as the ready state changes.
    // Forward declaration so that OnChanged can call the helper defined below.
    HRESULT GetReadyState(IHTMLDocument* pDoc, READYSTATE* pReadyState);

    STDMETHODIMP CApp::OnChanged(DISPID dispID)
    {
        if (DISPID_READYSTATE == dispID)
        {
            READYSTATE rs = READYSTATE_UNINITIALIZED;
            HRESULT hr = GetReadyState(m_pDoc, &rs);
            if (SUCCEEDED(hr) && READYSTATE_COMPLETE == rs)
            {
                // The document has finished loading; notify the message loop.
                PostThreadMessage(GetCurrentThreadId(), WM_USER_STARTWALKING,
                                  (WPARAM)0, (LPARAM)0);
            }
        }
        return S_OK;
    }

    HRESULT GetReadyState(IHTMLDocument* pDoc, READYSTATE* pReadyState)
    {
        VARIANT vResult;
        VariantInit(&vResult);
        EXCEPINFO excepInfo = {0};
        UINT uArgErr = 0;
        DISPPARAMS dp = {NULL, NULL, 0, 0};

        // Read the readyState property through IDispatch.
        HRESULT hr = pDoc->Invoke(DISPID_READYSTATE, IID_NULL, LOCALE_SYSTEM_DEFAULT,
                                  DISPATCH_PROPERTYGET, &dp, &vResult, &excepInfo, &uArgErr);
        if (FAILED(hr))
            return hr;

        *pReadyState = (READYSTATE)V_I4(&vResult);
        VariantClear(&vResult);
        return S_OK;
    }
For details on how to load an HTML document asynchronously, see the WalkAll sample.
If a user clicks a link within an HTML page viewed in MSHTML, MSHTML calls the HlinkNavigate API (after having set up an IHlink object). If the host does not implement IHlinkFrame, this API launches a separate application to follow the hyperlink.
Hosts that want to act as browsers (and navigate within the same frame) can implement IHlinkFrame on the frame object. The HlinkNavigate API then calls IHlinkFrame::Navigate, allowing a browser application to hide the previous instance of MSHTML and create, load, and show a new instance of MSHTML (or another Active Document) to display the new page.
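The two outcomes of a link click can be summarized in a short portable model. It does not use the hyperlinking APIs themselves: `FrameModel`, `LinkOutcome`, and `FollowLink` are hypothetical names, a null frame pointer stands for a host that does not implement IHlinkFrame, and the member function `Navigate` only mirrors the role of IHlinkFrame::Navigate.

```cpp
#include <string>

// Hypothetical stand-in for a frame object that implements IHlinkFrame.
struct FrameModel
{
    std::wstring currentUrl;

    // Mirrors the role of IHlinkFrame::Navigate: a browser host would hide
    // the old document and create, load, and show a new one here.
    void Navigate(const std::wstring& url) { currentUrl = url; }
};

enum class LinkOutcome { NavigatedInFrame, LaunchedSeparateApp };

// Models the choice described in the text. Pass nullptr for a host that does
// not implement IHlinkFrame.
LinkOutcome FollowLink(FrameModel* frame, const std::wstring& url)
{
    if (frame == nullptr)
        return LinkOutcome::LaunchedSeparateApp;  // no frame support: separate application
    frame->Navigate(url);                         // frame support: stay in the same frame
    return LinkOutcome::NavigatedInFrame;
}
```

In this sketch, a host that supplies a `FrameModel` keeps navigation inside its own frame, which is the behavior a browser-style application wants.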
For more information on hyperlinking, see Hyperlinks.
Because the WebBrowser control has built-in support for hyperlinking, hosting the WebBrowser control is the preferred solution here.
MSHTML can be automated using IDispatch and IConnectionPointContainer-style automation interfaces. These interfaces allow a host to automate MSHTML through the object model.
MSHTML is responsible for loading and running scripts that appear within HTML. Because scripting is done by using the ActiveX Scripting interfaces, any ActiveX script engine can be hosted by MSHTML.
For more information about ActiveX Scripting engines and hosts, see ActiveX Scripting.
Because MSHTML is an Active Document, it communicates with its host using the IOleCommandTarget interface. With this interface, the following details are communicated to the hosting frame:
In addition to the standard command group, MSHTML supports a group of MSHTML-specific commands that provide simple access to a number of MSHTML-specific features. For more information, see the OLECMDID enumeration.
Active Document: Also known as an OLE Document Object or Doc Object. An Active Document is a contained object; the container provides the frame and some basic UI. In such containers, the Active Document can be interchanged with other Active Documents while the containing frame and its UI remain constant. Examples of Active Document containers include the Office Binder and Internet Explorer 3.0x, where the Active Document could change to Microsoft® Word or Microsoft® Excel while maintaining the same outer frame application. An Active Document is similar to an OLE Embedding scenario or an ActiveX control, but its interfaces are designed to support an object that is at the top level and takes up the entire content area of the frame. There are specific interfaces required to support Active Document functionality. For more information, see Active Documents.
ActiveX Control: Also known as an OLE Control. An ActiveX Control is a contained mini-application. It can (optionally) maintain state, draw itself, persist itself, have its own window, respond to automation methods, throw events, take keyboard focus, respond to mouse and keyboard input, and show merged menu and toolbar UI. Most of the control interfaces are documented in the MSDN Online site. Support was added to Microsoft Internet Explorer 4.0 to take advantage of new interfaces that improve the performance of ActiveX Controls and make them suitable for the Internet. For more details, see Building ActiveX Controls for Internet Explorer.
ActiveX Scripting: A standard set of interfaces that allows for language-independent script integration to applications. Any scripting engine (Visual Basic® Scripting Edition [VBScript], JScript® [compatible with ECMA 262 language specification], or a third-party scripting language, for example) that supports the standard interfaces can be integrated with an ActiveX scripting host such as MSHTML.
Aggregation: A kind of run-time inheritance. By aggregating, an object can extend and enhance the functionality of another object but still take advantage of the functionality and interfaces of the aggregated object. Objects can be designed to be aggregated or not. MSHTML is designed to be aggregated.
Ambient: A property owned by the container and supplied to an ActiveX object through the IDispatch interface on its hosting site.
Automation: A set of standards to allow an object to be programmed by scripts. Every object that can be automated can have methods and properties that can be used by a script, as well as events that can trigger scripts to be run.
Command: A simple action sent to an ActiveX object through the IOleCommandTarget interface. Commands usually correspond to user-level commands, such as the commands on menus, and can be enabled or disabled by the command target. A command can be sent to the frame, the container site, MSHTML, or a control.
Container: The ActiveX object that owns the site obtained through IOleClientSite::GetContainer (can be null). From the container, the contained objects can be enumerated through IOleContainer::EnumObjects. This concept of containment should not be confused with the concept of containment used for scripting and supplied by automation interfaces. Some contained automation objects are not contained ActiveX objects, and some contained ActiveX objects cannot be automated.
Dispatch Interface: An interface inheriting from IDispatch that is used to access named automation properties and methods of an object from a script.
Document Window: The document window, with toolbar space, supplied to an ActiveX object through IOleInPlaceSite::GetWindowContext. (In the multiple document interface [MDI] it is the document window, and in the single document interface [SDI] it is NULL.) MSHTML currently ignores the document window.
Event Interface: A callback interface attached to an object using IConnectionPointContainer. This is used by script engines to get notification of events thrown by objects.
Frame: The outer application frame, with menu and toolbar space, supplied to an ActiveX object through IOleInPlaceSite::GetWindowContext. This is the object with which MSHTML negotiates for menu and toolbar space and also for links.
MSHTML: An Active Document that is capable of rendering HTML and laying out contained ActiveX Controls. In this document, it refers to an instance of the MSHTML COM object that implements IOleObject, IOleDocument, IOleDocumentView, and many other interfaces.
Service: Functionality supplied by the host to an ActiveX object through the IServiceProvider interface on the container site. Each service is identified by a service identifier (SID), allowing access to interfaces and methods.
Site: The object supplied by a container to a contained object through IOleObject::SetClientSite. Containers of an ActiveX object must supply a site before doing anything else. MSHTML gets much of its information about its geometry, activation, ambient properties, and so on, from its container site. MSHTML supplies a site for each ActiveX control it hosts.
Type Info: A structure used to specify properties, methods, and events. Can be obtained from a dispatch interface through IDispatch or an object that supplies IProvideClassInfo*.
X Object: An object that MSHTML wraps around each hosted control to supply common, per-control, container-owned properties and events. MSHTML aggregates the X object to the control, if possible, and merges types with the control.