Discardable Properties for Your Web Pages in Internet Explorer 4.0

Matt Oshry, Developer
Michael Edwards, Developer Technology Engineer
Microsoft Corporation

November 1997
Updated December 30, 1997
(including a new discussion of security issues and code signing)

Click to open or copy the sample files associated with this technical article.

Contents

Introduction
How Does It Work?
How Do I Use It?
How Do I Make My COM Object Cacheable?
How Does the PropertyCache Sample Work?
Hey, I'm No C++ Programmer, Just Show Me How to Script It
Just Give Me the Samples Already!
Summary

Introduction

Microsoft® Internet Explorer version 4.0 includes a new feature to save temporary data from one Web page, and access it from any other page, within the same browser window. Multipage processes (such as registration or online shopping) often require information that was entered by the user on previous pages, but doesn't need to be available longer than one browser session. The new IDiscardableBrowserProperty interface allows data to be contained in a very lightweight object that is automatically discarded when the browser window is closed, or when the object has not been accessed for over ten minutes.

This article begins by describing how the IDiscardableBrowserProperty interface works, then goes on to explain how you can enhance an existing COM object to support it. After that, for C++ developers we describe some important implementation details for the PropertyCache control we've provided. If you aren't the COM type, and just want to use this new feature from script, don't despair. After reading the next section on how this feature works, skip that scary COM stuff and learn how to cache data in Hypertext Markup Language (HTML) scripts.

How Does It Work?

If you have read the Internet Client Software Development Kit (SDK) documentation (see the MSDN™ Library, SDK Documentation bin for the Internet Client documentation) on the WebBrowser Control, you know that Internet Explorer 4.0 includes the new IWebBrowser2 interface for adding Internet browsing functionality to your application. And, if you've been around the block a couple of times, you might remember the old Internet Explorer 3.x IWebBrowserApp interface was replaced by IWebBrowser2. IWebBrowserApp included the PutProperty method to cache a property that can be retrieved later with GetProperty. You can read more about these methods in the Internet Client SDK, but let's look at their function prototypes right here:

HRESULT PutProperty(BSTR szProperty, VARIANT vtValue);
HRESULT GetProperty(BSTR szProperty, VARIANT FAR* pvtValue);
// szProperty is a caller-allocated buffer with a unique property name.
// vtValue is a VARIANT value associated with the given property.

As you can see, since Internet Explorer 3.0 you've had the ability to store arbitrary data (VARIANT objects) as properties of a browser window. But you may not know (unless you have a pretty good testing staff) that this mechanism is not very useful for Microsoft ActiveX® controls or documents, because the Internet Explorer 3.0 browser does not have the capability to void these properties after a period of time. Therefore, stored properties hang around until the browser window is closed, which might be a long time (while the client's available memory slowly dissipates).

The IWebBrowser2 implementation for GetProperty and PutProperty fixes this problem by making the stored property discardable. All properties that are associated with a browser window are regularly scanned for discarding. A property is considered discardable if the page that stored it has been unloaded, and more than 10 minutes have elapsed since the property was last accessed. Properties associated with a given browser window are not applied to other windows. Of course, all the properties associated with a given browser window are discarded when the window is closed.
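The discard rule described above can be modeled in a few lines of portable C++. This is only a sketch of the policy as the text states it, not the browser's actual implementation; CachedProperty, IsDiscardable, and ScanAndDiscard are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

using Clock = std::chrono::steady_clock;

// Hypothetical model of one cached property (not the browser's real structure).
struct CachedProperty {
    std::string value;
    bool owningPageLoaded;          // is the page that stored it still loaded?
    Clock::time_point lastAccess;   // updated on every store or retrieve
};

// The fixed limit mentioned in the text.
const std::chrono::minutes kDiscardAfter(10);

// The rule from the text: the storing page has been unloaded AND more than
// 10 minutes have passed since the property was last accessed.
bool IsDiscardable(const CachedProperty& p, Clock::time_point now) {
    assert(now >= p.lastAccess);
    return !p.owningPageLoaded && (now - p.lastAccess) > kDiscardAfter;
}

// One scan over the properties of a single window.
// Returns the number of properties that were discarded.
int ScanAndDiscard(std::map<std::string, CachedProperty>& props,
                   Clock::time_point now) {
    int discarded = 0;
    for (auto it = props.begin(); it != props.end();) {
        if (IsDiscardable(it->second, now)) {
            it = props.erase(it);
            ++discarded;
        } else {
            ++it;
        }
    }
    return discarded;
}
```

Each browser window would own its own map in this model, which mirrors the statement that properties of one window are not applied to other windows.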

To take advantage of this enhanced property cache feature, you need to package the data into a COM object that supports the IDiscardableBrowserProperty interface. That's because IWebBrowser2::PutProperty calls QueryInterface on the passed VARIANT object for an IID_IDiscardableBrowserProperty interface. If the call is successful (the interface is supported), the browser will consider the object to be discardable.

How Do I Use It?

If you want to do something as simple as remembering whether the user has logged on, or whether they clicked something on a previous page, then some scripting examples are probably what you're most interested in seeing. However, if you have something more complicated in mind, then you may need to incorporate discardable functionality directly into objects you want to cache for other pages. So, let's consider both perspectives, starting with the developer who needs to build discardable support into a COM object.

How Do I Make My COM Object Cacheable?

Storing any COM object as a discardable property of the browser window is easy. The object needs only to update its implementation of IUnknown to respond to a QueryInterface for IID_IDiscardableBrowserProperty by returning its IUnknown pointer.
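Before turning to ATL, it may help to see the idea written out by hand. The following is a rough, portable sketch and not real COM code: Iid, UnknownLike, YourDataLike, and YourData are hypothetical stand-ins for REFIID, IUnknown, IYourData, and your own class, and a bool replaces HRESULT so the example compiles without the Windows headers.

```cpp
#include <cassert>
#include <cstring>

// Portable stand-in for an interface ID. Real code compares GUIDs.
struct Iid { const char* name; };
inline bool SameIid(const Iid& a, const Iid& b) {
    return std::strcmp(a.name, b.name) == 0;
}

const Iid kIidUnknown     = { "IUnknown" };
const Iid kIidYourData    = { "IYourData" };
const Iid kIidDiscardable = { "IDiscardableBrowserProperty" };

// Stand-in for IUnknown.
struct UnknownLike {
    virtual bool QueryInterface(const Iid& iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~UnknownLike() {}
};

// Stand-in for the object's own interface.
struct YourDataLike : UnknownLike {
    virtual int GetValue() = 0;
};

class YourData : public YourDataLike {
public:
    explicit YourData(int value) : m_refs(1), m_value(value) {}

    bool QueryInterface(const Iid& iid, void** out) override {
        assert(out != nullptr);
        if (SameIid(iid, kIidUnknown) || SameIid(iid, kIidDiscardable)) {
            // Per the text, answering the discardable query with the
            // object's own IUnknown pointer is all that is required.
            *out = static_cast<UnknownLike*>(this);
        } else if (SameIid(iid, kIidYourData)) {
            *out = static_cast<YourDataLike*>(this);
        } else {
            *out = nullptr;
            return false;            // E_NOINTERFACE in real COM
        }
        AddRef();
        return true;                 // S_OK in real COM
    }

    unsigned long AddRef() override { return ++m_refs; }

    unsigned long Release() override {
        unsigned long left = --m_refs;
        if (left == 0)
            delete this;
        return left;
    }

    int GetValue() override { return m_value; }

private:
    ~YourData() override {}
    unsigned long m_refs;
    int m_value;
};
```

The only line that matters for caching is the extra comparison against kIidDiscardable; everything else is ordinary reference-counting boilerplate that ATL would otherwise generate.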

We're going to explain how to do this by using Active Template Library (ATL) code examples. If you are not familiar with ATL, that's OK; the explanation will still make sense to you as long as you have some familiarity with COM. However, if you get confused (or you are just the meticulous type), there's an MSDN Library article called "The Active Template Library Makes Building COM Objects a Joy" (MSDN Library, Periodicals) by Don Box, that will introduce you to ATL (the "Implementing IUnknown" section is most relevant to the following discussion).

If you create a simple COM object using ATL, the interface map in your class declaration (which specifies how ATL should implement your QueryInterface method) looks like this:

BEGIN_COM_MAP(CYourData)
COM_INTERFACE_ENTRY(IYourData)
END_COM_MAP( )

If you want to be able to store an instance of CYourData as a discardable property of the browser window, all you have to do is modify the interface map to handle queries for the IID_IDiscardableBrowserProperty interface:

// shlguid.h defines the IID_IDiscardableBrowserProperty GUID.
#include <shlguid.h>

BEGIN_COM_MAP(CYourData)
COM_INTERFACE_ENTRY(IYourData)
// Tell ATL to return our IUnknown when asked for the discardable interface.
// (COM_INTERFACE_ENTRY_IID is the map entry that takes an explicit IID.)
COM_INTERFACE_ENTRY_IID(IID_IDiscardableBrowserProperty, CYourData)
END_COM_MAP( )

Then, to cache an instance of CYourData, you just need to create a VARIANT object to point to your object, give it a unique property name, and store it:

#include "stdafx.h"   // VARIANT, V_* and other "standard" OLE goodies.
#include <exdisp.h>   // IWebBrowserXXX

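// A wide string literal is not a BSTR, so keep the name as an OLECHAR
// array and allocate a BSTR from it when calling the browser.
static const OLECHAR szName[] = OLESTR("some GUID generated exclusively for this property");

HRESULT SaveIt(IYourData * pData, IWebBrowser2 * pWebBrowser) {

   VARIANT vCache;
   LPUNKNOWN pUnknown = NULL;
   HRESULT hr;

   VariantInit(&vCache);

   // Make the VARIANT point at our object. QueryInterface adds a
   // reference, which VariantClear releases below.
   hr = pData->QueryInterface(IID_IUnknown, (LPVOID*)&pUnknown);
   if (FAILED(hr))
      return hr;

   // Set the VARIANT's type to an IUnknown pointer.
   V_VT(&vCache) = VT_UNKNOWN;
   V_UNKNOWN(&vCache) = pUnknown;

   BSTR bstrName = SysAllocString(szName);
   if (bstrName == NULL) {
      VariantClear(&vCache);
      return E_OUTOFMEMORY;
   }

   // IWebBrowser2 will QI vCache for the discardable interface,
   // and if successful, automatically retire it if it's not accessed
   // for a while.
   hr = pWebBrowser->PutProperty(bstrName, vCache);

   // The browser holds its own reference now, so release ours.
   SysFreeString(bstrName);
   VariantClear(&vCache);

   return hr;
}

To get the object back later:

// The caller receives the object through ppData and must Release it.
HRESULT GetItBack(IYourData ** ppData, IWebBrowser2 * pWebBrowser) {

   VARIANT vCache;
   HRESULT hr;

   *ppData = NULL;
   VariantInit(&vCache);

   BSTR bstrName = SysAllocString(szName);
   if (bstrName == NULL)
      return E_OUTOFMEMORY;

   // Retrieve the VARIANT we stashed.
   hr = pWebBrowser->GetProperty(bstrName, &vCache);
   SysFreeString(bstrName);

   // Get our object out of it.
   if (SUCCEEDED(hr)) {
      if (V_VT(&vCache) == VT_UNKNOWN && V_UNKNOWN(&vCache) != NULL)
         hr = V_UNKNOWN(&vCache)->QueryInterface(IID_IYourData, (LPVOID*)ppData);
      else
         hr = E_FAIL;   // Nothing usable was stored under this name.
   }

   VariantClear(&vCache);

   return hr;
}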

OK, we're cheating just a bit here. You cynical types are wondering how the heck you're supposed to come up with a pointer to the IWebBrowser2 object for your browser window. That's a good question. There's an easy answer, too, and we'll use ATL to demonstrate.

In addition to providing objects with their implementation for IUnknown, ATL provides a default implementation of the IObjectWithSite interface, IObjectWithSiteImpl, which provides an object with a pointer to its activation site (in this case, the browser window). To use ATL's built-in implementation, you simply add IObjectWithSiteImpl to the base class list for your object, and add an entry to the interface map to expose the IObjectWithSite interface via QueryInterface.

For example, if CYourObject is an object that you are loading on your Web page via an <OBJECT> tag, and you want to be able to get a pointer to the IWebBrowser2 object, do the following in your class declaration:

class CYourObject :
   // Other base classes for CYourObject go here.
   public IObjectWithSiteImpl<CYourObject>
{
...
BEGIN_COM_MAP(CYourObject)
COM_INTERFACE_ENTRY(IYourObject)
// Interfaces for other base classes in CYourObject go here.
COM_INTERFACE_ENTRY_IMPL(IObjectWithSite)
END_COM_MAP( )
...
};

After the browser loads your object, IObjectWithSiteImpl provides the m_spUnkSite member variable, which can be used to obtain a pointer to the IWebBrowser2 interface for your browser window:

#include <servprov.h> // IServiceProvider

// The caller receives the browser through ppWebBrowser2 and must Release it.
HRESULT GetWebBrowser(IWebBrowser2 ** ppWebBrowser2)
{
   *ppWebBrowser2 = NULL;

   // No site yet means the browser has not finished loading us.
   if (!m_spUnkSite)
      return E_FAIL;

   // CComQIPtr is an ATL smart-pointer template; this declares spSP as a
   // pointer to IServiceProvider. (And takes care of reference counting
   // for us too!) See
   // http://www.microsoft.com/msdn/sdk/inetsdk/help/compdev/comobj/comobj.htm
   // for IServiceProvider documentation.
   CComQIPtr<IServiceProvider, &IID_IServiceProvider> spSP(m_spUnkSite);
   if (!spSP)
      return E_NOINTERFACE;

   return spSP->QueryService(IID_IWebBrowserApp, IID_IWebBrowser2, (LPVOID*)ppWebBrowser2);
}

Typically, SaveIt, GetItBack, and GetWebBrowser would all be methods in the CYourObject class, and CYourObject would create the instance of CYourData that encapsulated the data you wanted to cache. You could then use scripting events for page exit (onbeforeunload) to save your object, and page enter (onload) to reload the object on a subsequent page that needed the saved information.

For information on how to do the scripting, and to learn about the simple PropertyCache ATL sample that Matt put together, read on.

How Does the PropertyCache Sample Work?

There's nothing like decent sample code to help you get something working, so we included the source code for the PropertyCache control with this article. But before you go off and start changing the code, let's go over some important caveats.

First, to build this project you need the Internet Client SDK installed on your computer. Also, make sure the libs and includes from the Internet Client SDK are searched in front of those supplied elsewhere (in Microsoft Developer Studio® you can do this from the Tools menu by choosing Options, and Directories).

If you read the above section on how to modify an ATL object to masquerade as a discardable browser object, then the source code in the PropertyCache project will look very familiar. There are two classes in PropertyCache. CDiscardable is a bare-bones COM object, an implementation of IUnknown that supports the IID_IDiscardableBrowserProperty interface and a single VARIANT property (similar to the CYourData class from the above example code). CPropertyCache is also a bare-bones COM object with methods that accept and return a VARIANT. Since CPropertyCache inherits from IObjectWithSiteImpl, it goes on your Web page (and thus is similar to the CYourObject class from the above example code).

The PropertyCache project builds a COM server named PCACHE.DLL. When PropertyCache is loaded on your Web page (via an <OBJECT> tag), the browser creates a CPropertyCache object. You can invoke its CacheData method to save script data, and RetrieveCachedData to get the data back later. When your script calls CacheData, the control dynamically creates a CDiscardable object, sticks the passed VARIANT item into it, and hands it off to the browser window (via PutProperty on the IWebBrowser2 object). The browser window increments the reference count for the discardable object, so when the PropertyCache control releases its own reference, the browser window is left with the only remaining reference. The browser window keeps track of when the object is accessed (via GetProperty on the IWebBrowser2 object called by RetrieveCachedData), and every time a page completes loading it checks whether any objects can be discarded. If a discardable object has not been accessed for 10 minutes it is discarded. This means that if your script caches some data from one page when it unloads, and accesses it from another page, you'll just need to finish loading the second page within 10 minutes of leaving the first page. You can't modify the 10-minute parameter.
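The ownership handoff in that paragraph can be illustrated with a small portable model. This is a sketch only: std::shared_ptr stands in for COM reference counting, and Discardable, BrowserWindowModel, and this CacheData function are hypothetical stand-ins for CDiscardable, the browser's property table, and the control's method.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for CDiscardable: an object holding a single value.
struct Discardable {
    std::string data;
};

// Stand-in for the browser window's property table.
struct BrowserWindowModel {
    std::map<std::string, std::shared_ptr<Discardable>> properties;

    // Storing a property gives the window its own reference.
    void PutProperty(const std::string& name, std::shared_ptr<Discardable> obj) {
        properties[name] = std::move(obj);
    }

    // Returns an empty pointer when nothing is stored under the name.
    std::shared_ptr<Discardable> GetProperty(const std::string& name) const {
        auto it = properties.find(name);
        if (it == properties.end())
            return std::shared_ptr<Discardable>();
        return it->second;
    }
};

// Mirrors the steps in the text: create the object, hand it to the window,
// then drop the control's own reference.
// Returns the number of references left on the stored object.
long CacheData(BrowserWindowModel& window, const std::string& name,
               const std::string& data) {
    auto obj = std::make_shared<Discardable>();
    obj->data = data;

    window.PutProperty(name, obj);   // window and control each hold one
    assert(obj.use_count() == 2);

    obj.reset();                     // control releases its reference
    return window.properties[name].use_count();
}
```

After CacheData returns, the window model holds the only reference, so erasing the map entry (the equivalent of a discard) destroys the object immediately.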

What if I need to modify the PropertyCache sample?

If you plan to modify the PropertyCache source code in order to use a variation of it on your own pages, you will need to make three easy but important changes to make sure your control is unique and secure.

First, you'll need to create your own Globally Unique Identifier (GUID) for the property name. Why? Security. If you don't change the GUID, anyone who reads this article and wants to get at your cached data can use the GUID from the sample source code to reconstruct your property names. The GUID for the property name is defined in the file PageAcc.CPP:

// The gc_szKey string provides a unique property name. This GUID is
// concatenated to the property name passed to the CacheData method
// to ensure that property names are secure.
// Naturally, anybody reusing this sample code will want to change
// this GUID by generating one of their own (use the GUIDGEN.exe
// utility that ships with Developer Studio).
const OLECHAR gc_szKey[] = OLESTR("{your_GUID}");

Note this isn't the actual GUID for gc_szKey used in the downloadable, code-signed PropertyCache control included with this article. Using a secret value for that GUID ensures that the property names used in your script can't be viewed by a hacker intent on stealing cached data.
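To make the role of the key concrete, here is a minimal portable sketch of how a private GUID string might be combined with the name a script passes in. The function MakeStorageName and the constant kSecretKey are hypothetical, and appending the GUID after the name is an assumption; the sample source may combine the two differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical placeholder. A real control would embed its own private
// GUID string here, generated with a tool such as GUIDGEN.exe.
const std::wstring kSecretKey = L"{your_GUID}";

// Builds the name actually stored in the browser window.
// Assumption: the GUID is appended to the caller's property name.
std::wstring MakeStorageName(const std::wstring& propName) {
    assert(!propName.empty());
    return propName + kSecretKey;
}
```

Because the stored name never equals the name visible in the page's script, code that knows only the script-level name cannot ask the browser window for the property directly.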

Second, you need to change the GUIDs used to identify the classes, interfaces, and type library for the PropertyCache control. Why? The browser distinguishes between these control elements by their IDs. If your control uses the same IDs as some other control that is already on the local computer, the browser will get confused. These five GUIDs are identified by the uuid() attributes in the pcache.IDL file, such as this one for the IPropertyCache interface ID:

[
   object,
   uuid(4F157AE1-3F9A-11D1-9E78-00AA00BBF119),
   dual,
   helpstring("IPropertyCache Interface"),
   pointer_default(unique)
]
interface IPropertyCache : IDispatch
{
   [id(1), helpstring("Cache some arbitrary data")]
   HRESULT CacheData([in] BSTR bstrPropName, [in] VARIANT vData,
                     [in, optional] long lSecurityLevel);
   [id(2), helpstring("Retrieve some arbitrary data")]
   HRESULT RetrieveCachedData([in] BSTR bstrPropName,
                              [out, retval] VARIANT* pvData);
};

What if I want to use the PropertyCache control from C++?

If you want to write C++ code to use the PropertyCache object as is, then you'll need to include the PCACHE.H header file and link with the pcache.LIB import library included with the sample files for this article. The pcache.DLL file will be a dependent DLL to your code, so you will have to make sure the library is part of your download package.

The following ATL code will create a PropertyCache object and store a property with the default security setting:

#include "stdafx.h"
#include "pcache.h"
...
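VARIANT vCache;
VariantInit(&vCache);
// Fill vCache with something.
CComObject<CPropertyCache>* pCache = NULL;
if (SUCCEEDED(CComObject<CPropertyCache>::CreateInstance(&pCache)))
{
   // CreateInstance hands back an object with no references yet.
   pCache->AddRef();
   pCache->CacheData(L"aUniquePropertyName", vCache);
   pCache->Release();
}
// CacheData() makes a copy of vCache, so you need to free it.
VariantClear(&vCache);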
...

Then, to retrieve the property later, do this:

#include "stdafx.h"
#include "pcache.h"
...
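CComObject<CPropertyCache>* pCache = NULL;
if (SUCCEEDED(CComObject<CPropertyCache>::CreateInstance(&pCache)))
{
   // CreateInstance hands back an object with no references yet.
   pCache->AddRef();
   VARIANT vCache;
   VariantInit(&vCache);
   pCache->RetrieveCachedData(L"aUniquePropertyName", &vCache);
   // RetrieveCachedData() copies the data into vCache, so use it here
   // and free it when you're done.
   VariantClear(&vCache);
   pCache->Release();
}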
...

Hey, I'm No C++ Programmer, Just Show Me How to Script It

If you're like many Web authors, you like to do as much as you can with HTML and script (especially when somebody is offering free sample code). In that case, you'll really appreciate the PropertyCache sample we wrote to give away with this article. With this sample, you can save data stored in script variables on your Web page, and access the data from another page in that browser window. We included the source code for the PropertyCache sample for you C++ programmers, but the rest of you (who just want to script it), can use the version we already built and code-signed. To use the PropertyCache control insert this HTML in the body of your HTML document:

<!-- The data caching and retrieving control -->
<OBJECT CLASSID="clsid:68A12882-7584-11d1-A259-00C04FD97350" CODEBASE="pcache.cab#Version=1,0,0,1" HEIGHT=0 WIDTH=0 ID=oCacher>
</OBJECT>

Note the CODEBASE parameter refers to a CABinet (.CAB) file. We code-signed the PropertyCache control with a Microsoft digital signature, so the control is actually inside pcache.CAB. (If you want to know what a CAB file is, go to the Microsoft Site Builder Network Web site at http://www.microsoft.com/workshop/prog/cab/.) Since Microsoft is code-signing the PropertyCache control, when your viewers are asked whether they should download it, they'll know the control was developed by Microsoft. Hopefully, knowing this will give your customers a profound sense of reassurance! The pcache.CAB file is included in the HTML sample downloads below.

While we are on the subject of security, let's talk about why this control is very secure. First, the PropertyCache control doesn't access any resources on the customer's local computer. Thus, there is no way that a malicious Web author can misuse this control to gain access to your customer's local resources. But in addition to protecting your customer's local resources from hackers, you also need to protect the data that you cache using the PropertyCache control. For example, if you are going to use this control to implement a shopping cart on a commerce page, you probably don't want some other page to be able to peek into that shopping cart and see what items somebody is purchasing. For this reason, CacheData accepts a security parameter when you cache data. This parameter indicates how strictly you want to limit access to the data, including whether you will allow the data to be accessed by any page in your domain, by any page in your virtual root, or only by the page that cached the data in the first place (this last, most restrictive setting, is the default, so it applies if you omit the parameter). Under no circumstances will a page that is not on your domain be able to access the data.
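The three access levels can be pictured with a small portable sketch. It is not taken from the control's source: PageLocation, ParseUrl, and MayRead are hypothetical names, the level numbers follow the script comments in this article (1 = domain, 2 = virtual root, 3 = same page), and treating the host name as the domain and the first path segment as the virtual root are both assumptions.

```cpp
#include <cassert>
#include <string>

// Hypothetical breakdown of a page URL such as
// "http://www.example.com/store/cart.htm".
struct PageLocation {
    std::string host;    // "www.example.com"
    std::string vroot;   // first path segment ("store"), empty if none
    std::string path;    // everything after the host ("/store/cart.htm")
};

PageLocation ParseUrl(const std::string& url) {
    PageLocation loc;
    std::string::size_type start = url.find("://");
    start = (start == std::string::npos) ? 0 : start + 3;

    std::string::size_type slash = url.find('/', start);
    if (slash == std::string::npos) {
        loc.host = url.substr(start);
        loc.path = "/";
        return loc;
    }
    loc.host = url.substr(start, slash - start);
    loc.path = url.substr(slash);

    std::string::size_type next = url.find('/', slash + 1);
    if (next != std::string::npos)
        loc.vroot = url.substr(slash + 1, next - slash - 1);
    return loc;
}

// Decides whether the page at readerUrl may read data cached by the
// page at writerUrl under the given security level.
bool MayRead(int level, const std::string& writerUrl,
             const std::string& readerUrl) {
    assert(level >= 1 && level <= 3);
    const PageLocation w = ParseUrl(writerUrl);
    const PageLocation r = ParseUrl(readerUrl);

    if (w.host != r.host)
        return false;                  // never readable from another domain
    if (level == 1)
        return true;                   // any page on the domain
    if (level == 2)
        return w.vroot == r.vroot;     // any page in the same virtual root
    return w.path == r.path;           // only the page that cached the data
}
```

Whatever the level, the domain comparison comes first, which matches the guarantee that a page outside your domain can never read the data.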

From a performance standpoint, it is more efficient to add a single property to the browser than it is to write many properties. So, if you have a lot of stuff to save, you should put it all in an array variable and cache the array (although you can also cache any JavaScript variable or object). Typically, your script will accumulate data from user interaction with the page, and then cache it on page exit:

<SCRIPT for=window event=onbeforeunload language=javascript>
 // Security == 1 // Allow any page on your domain to read the data.
 // Security == 2 // Allow any page in your vroot to read the data.
 // Security == 3 // Allow only this page to read the data.
 var Security = 2;
 var aList = new Array();
 populate(aList);
 oCacher.CacheData("aUniquePropertyName", aList, Security);
</SCRIPT>

A subsequent page that needs previously cached data would then access it while the page is loading:

<SCRIPT for=window event=onload language=javascript>
 var aList = oCacher.RetrieveCachedData("aUniquePropertyName");
 if (aList == null) {
  // Must've been greater than 10 minutes since the data was stored
  // or last accessed.
 }
</SCRIPT>

CacheData() and RetrieveCachedData() are the only methods for this control, and there are no events or properties. Pretty simple, huh?

Just Give Me the Samples Already!

If you copy the samples associated with this article, you can play with a sample that keeps track of whether a user has already logged in to your Web site, or with a sample that uses an array variable to keep track of products a user wants to buy from your store. The shopping sample also uses the Tabular Data Control to manipulate and present shopping items from a comma-delimited list. Remember that you need to be using Internet Explorer 4.0 for the samples to work!

Summary

Even though today's Web pages are often offered in the context of a larger site, they tend to be pretty stand-alone views of information. That is, what a user does on one page in a given site doesn't impact what they see or do in other pages on that site. However, in cutting-edge sites like Microsoft Expedia (http://expedia.com/daily/home/default.hts?), you are beginning to see multiple Web pages offered as a single unit that share important information and events. In this context, what a user does on one page has very much to do with what they see and do on other pages in that site. Thus, with the discardable browser property feature, Internet Explorer 4.0 introduces the next generation in data-sharing functionality that is needed to turn an unconnected set of Web pages into something more akin to an integrated application.

So get busy!