Jeff Brown
Rafael M. Muñoz
Microsoft Corporation
June 1, 1998
The following article was originally published in the Site Builder Magazine (now known as MSDN Online Voices) "Web Men Talking" column.
Contents
Where have I been? - How to set up logging with CDF
Ugliness - Active Desktop items displaying error pages
I'm late, I'm late ... - Creating an update schedule with CDF
Call me - Autodial problem with subscriptions
Everyone is asking about channels. The Web Men have no way to contact ancient princes or past lives -- but they have loads of information on channels of the Internet Explorer kind. They will help you with logging user activity, caching on the client machine, and scheduling. We want to thank Alan McBee, of Microsoft Support, for his contributions to this column.
Dear Web Men:
I have been keeping a daily journal of my life since grade school, and I recently started producing my journal as a channel. I thought that I could use CDF logging to add value to my journal by telling me how people are using my channel, and what they are doing online and off. Could you shed some light on the topic of logging using CDF files, so that I can track my readers better?
Ben Johnson
The Web Men reply:
Lonely guy, aren't you, Ben? Logging the activity of a Channel Definition Format (CDF) file probably isn't going to complete your life's journal, but let's see what we can do to help.
We recommend starting with a couple of great references: the Page-Hit Logging topic here in the Web Workshop, and the World Wide Web Consortium (W3C) site, where you'll find information on the Extended Log File Format that is being used as the standard for the log file.
The two CDF elements necessary when creating a log file for a channel are LOGTARGET and LOG. The LOGTARGET element, which must occur before any ITEM elements, describes the location to which the log file will be sent. The LOG element is defined for each ITEM, and specifies which ITEM you want to have recorded in your log.
A sample CDF file would look like this (note that this is just an example):
<?XML VERSION="1.0" ENCODING="UTF-8"?>
<CHANNEL HREF="ChannelPage.htm" BASE="http://example.microsoft.com/yourchannel/">
    <LOGTARGET HREF="http://example.microsoft.com/scripts/posting.dll" METHOD="POST" SCOPE="ALL">
    </LOGTARGET>
    <TITLE>Your Channel</TITLE>
    <ABSTRACT>This is your sample channel</ABSTRACT>
    <LOGO HREF="your.ico" STYLE="ICON"/>
    <LOGO HREF="your_med.gif" STYLE="IMAGE"/>
    <LOGO HREF="your_big.gif" STYLE="IMAGE-WIDE"/>
    <ITEM HREF="page1.asp" PRECACHE="YES">
        <TITLE>Page 1</TITLE>
        <ABSTRACT>Page 1 ASP</ABSTRACT>
        <LOGO HREF="your.ico" STYLE="ICON"/>
        <LOG VALUE="document:view"/>
    </ITEM>
</CHANNEL>
The simple CHANNEL defined above has just one ITEM. The HREF attribute of LOGTARGET gives the location where we have created an ISAPI DLL to receive and parse the incoming log file. The SCOPE attribute defines which page hits are logged: OFFLINE, ONLINE, or ALL. Since we are using an ISAPI DLL, we need to use Microsoft Internet Information Server (IIS) on our server. We could just as easily have used a Perl script, such as the one described in Page-Hit Logging, which wouldn't necessarily require IIS.
For a great example of how to create an ISAPI DLL for parsing purposes, download the Internet Client SDK. Browse to the c:\inetsdk\samples\iislog\ directory (default installation location), and refer to the Readme file describing the IIS Log sample.
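Whatever receives the log has to split it into directives and entries. As a rough illustration only, here is a minimal Python sketch of reading text in the W3C Extended Log File Format. It is not the SDK sample and not what posting.dll does; the function name and the sample fields (time, cs-method, cs-uri) are assumptions chosen for the example, and the fields Internet Explorer actually posts may differ.

```python
def parse_extended_log(text):
    """Split Extended Log File Format text into directives and entries.

    Simplification (assumption): values are split on whitespace, so quoted
    strings containing spaces are not handled.
    """
    directives = {}
    fields = []
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive lines look like "#Name: value".
            name, _, value = line[1:].partition(":")
            name = name.strip()
            directives[name] = value.strip()
            if name == "Fields":
                # The Fields directive names the columns of later entries.
                fields = value.split()
            continue
        # Entry lines carry one whitespace-separated value per declared field.
        entries.append(dict(zip(fields, line.split())))
    return directives, entries


# Hypothetical sample data, for illustration only.
SAMPLE_LOG = """\
#Version: 1.0
#Date: 12-Jan-1996 00:00:00
#Fields: time cs-method cs-uri
00:34:23 GET /foo/bar.html
12:21:16 GET /foo/bar.html
"""

directives, entries = parse_extended_log(SAMPLE_LOG)
print(directives["Fields"])
for entry in entries:
    print(entry)
```

A real receiver would also need to read the POST body from the Web server and store the parsed entries somewhere; both steps are left out here.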
Now that we have set up a way for the server to receive a log file, when does the log file get sent back to the server? The log is sent to the server every time the CDF is updated. This can be done either manually or according to the schedule set up by the CDF author. See below, "I'm late, I'm late ...," for more information on scheduling.
Before you dive into creating a log file, you should be aware of a few details. The most important issue is that logging is still in its infancy, and new and improved methods will be forthcoming with new releases of Internet Explorer. Some other current issues either concern or confuse people, and we'll address a few of them here:
Dear Web Men:
I developed an Active Desktop item, and I've been told by some of my users that when they are disconnected from the Internet -- for example, when Windows starts up -- the item displays an ugly "Navigation Cancelled" page on their desktops. They don't see the Web page I refer to in my CDF. Do you know what's causing this?
M. Holmes
The Web Men reply:
Yup, we have some ideas about this. First, the users should only see this error page if the Web page for your Active Desktop item is not cached on their computer. If the page is not available locally, and the computer is not connected to the Web, Internet Explorer will not be able to retrieve and display the page.
The top-level page of an Active Desktop item will be cached by default, just because of the way you must define its CDF file. So it's likely the users have done something on their end. The problem could happen because a user has gone into the Internet Options dialog box of Internet Explorer, and deleted temporary Internet files and subscription content (i.e., your Active Desktop item content).
Another possibility is that users have changed the default subscription properties for the item. They may have changed the subscription type so that they are notified only when updates occur, and do not have the content downloaded for offline viewing.
In either case, there is not much you can do to prevent it (don't you hate when that happens?). If the top-level page of the Active Desktop item, and any content it includes, is not cached on the computer, it will not be available for offline viewing. Put another way, your Active Desktop item is toast.
You can double-check that you are doing all the right things in your CDF file to cache content for your Active Desktop item. Use the ITEM element for all content that must be available offline. For the complete story on this, see Active Desktop Item Development Guidelines in the Internet Client SDK documentation.
Dear Web Men:
I am a very organized type of person. I live by my day planner and without it I would be lost. I am running into some problems though trying to understand the SCHEDULE tag used in CDF. Exactly how does everything work? I need to get my life back on schedule.
Thanks,
Ian Latham
The Web Men reply:
Ian, we wouldn't want you to commit any schedule faux pas, so we have combed through all the documentation on scheduling and cashed in on some favors from our buddies in Developer Support to come up with a sure-fire checklist of rules for you to live by. (Add them to that day planner!)
Let us first point out a few tidbits that might be known by the CDF gurus, but not by us little folk. Using the SCHEDULE element is actually very simple, despite all its attributes (STARTDATE, STOPDATE, and TIMEZONE) and child elements (INTERVALTIME, EARLIESTTIME, and LATESTTIME). Only three elements are required to set up an entire schedule: SCHEDULE, INTERVALTIME, and LATESTTIME. To create a schedule that will update a CDF file once every day, you could use the following, which would update the CDF sometime between 12:00 A.M. and 9:00 A.M.:
<SCHEDULE>
    <INTERVALTIME DAY="1"/>
    <LATESTTIME HOUR="9"/>
</SCHEDULE>
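To see the numbers behind the schedule above, the following Python sketch reads a SCHEDULE element and converts each child element to hours. The helper names are hypothetical, and treating a missing EARLIESTTIME as zero is an assumption based on the "between 12:00 A.M. and 9:00 A.M." description; this is a reading aid, not how Internet Explorer processes the file.

```python
import xml.etree.ElementTree as ET

SCHEDULE_XML = """
<SCHEDULE>
    <INTERVALTIME DAY="1"/>
    <LATESTTIME HOUR="9"/>
</SCHEDULE>
"""


def to_hours(element):
    """Convert DAY/HOUR/MIN attributes to hours; a missing element counts as 0."""
    if element is None:
        return 0.0
    days = int(element.get("DAY", 0))
    hours = int(element.get("HOUR", 0))
    minutes = int(element.get("MIN", 0))
    return days * 24 + hours + minutes / 60


def read_schedule(xml_text):
    """Return the interval, earliest, and latest values of a SCHEDULE, in hours."""
    root = ET.fromstring(xml_text.strip())
    return {
        "interval": to_hours(root.find("INTERVALTIME")),
        "earliest": to_hours(root.find("EARLIESTTIME")),
        "latest": to_hours(root.find("LATESTTIME")),
    }


schedule = read_schedule(SCHEDULE_XML)
print(schedule)
```

Under these assumptions the schedule works out to a 24-hour interval with an update window running from hour 0 to hour 9.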
We highly recommend that you position the SCHEDULE element at the top of the CDF file. Sure, the documentation says that it "should appear" at the top, but that's why you came to us -- for the "User's Necessary Interpretation" of the documentation. When the SCHEDULE tag is placed at the top, it will be easier to find and maintain (especially in larger CDF files).
A final point of interest: The ENDDATE attribute -- forget it, delete it, replace it. This attribute has become obsolete, and should no longer be used within your CDF file; instead, use the STOPDATE attribute.
Here is a handy summary.
Key:
INTERVALTIME | I |
EARLIESTTIME | E |
LATESTTIME | L |
# updates per day | 1 | 2 | 3 | 4 | 6 | 8 | 12 | 24 |
INTERVALTIME values (hours) | 24 | 12 | 8 | 6 | 4 | 3 | 2 | 1 |
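The second row of the table is simply 24 divided by the first. As a quick check, this Python sketch reproduces it; the function name is hypothetical, and rejecting intervals that do not divide evenly into 24 hours is an assumption drawn from the values the table lists.

```python
def updates_per_day(interval_hours):
    """Return how many updates fit in a day for an INTERVALTIME given in hours."""
    if interval_hours <= 0 or 24 % interval_hours != 0:
        # Assumption: only intervals that divide a day evenly are considered.
        raise ValueError("interval must divide evenly into 24 hours")
    return 24 // interval_hours


for hours in (24, 12, 8, 6, 4, 3, 2, 1):
    print(f"INTERVALTIME of {hours:>2} hours -> {updates_per_day(hours):>2} updates per day")
```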
Rules:
Let's apply these rules to a couple of sample schedules.
Good Schedule Scenario #1:
Requirements: Update once a day
Schedule:
<SCHEDULE>
    <INTERVALTIME DAY="1"/>
    <EARLIESTTIME HOUR="7"/>
    <LATESTTIME HOUR="9"/>
</SCHEDULE>
Let's apply our rules: INTERVALTIME is equal to one day, or a 24-hour period. The EARLIESTTIME this can start is seven hours from midnight, or 7:00 A.M.; the LATESTTIME this will end is nine hours from midnight, or 9:00 A.M. Now, you're thinking, "These match exactly with the time of day," and that's why we used them in this example. The fact that they match is purely coincidental -- and as we point out in our rules, they shouldn't have anything to do with a specific time of the day. If you get in the habit of thinking this way, scheduling will be easier.
Figuring in our last rules:
E (7) < I (24) | TRUE |
E (7) < L (9) | TRUE |
L (9) - E (7) = 2 < I (24) | TRUE |
Good Schedule Scenario #2:
Requirements: Update three times a day, within a four-hour period
Schedule:
<SCHEDULE>
    <INTERVALTIME HOUR="8"/>
    <EARLIESTTIME HOUR="2"/>
    <LATESTTIME HOUR="6"/>
</SCHEDULE>
Let's apply our rules: INTERVALTIME is equal to eight hours. Checking our chart above, we see that this corresponds with three times a day, starting at midnight. The EARLIESTTIME this can start is two hours into the interval, and the LATESTTIME this will end is six hours into the interval. Our schedule looks like this: updates fall between 2:00 A.M. and 6:00 A.M., between 10:00 A.M. and 2:00 P.M., and between 6:00 P.M. and 10:00 P.M.
Figuring in our last rules:
E (2) < I (8) | TRUE |
E (2) < L (6) | TRUE |
L (6) - E (2) = 4 < I (8) | TRUE |
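To make the "hours into the interval" reading concrete, this Python sketch lists the update windows for a day. It assumes, as described above, that intervals are counted from midnight; the function names are hypothetical, and Internet Explorer's actual timing within each window is not modeled.

```python
def update_windows(interval, earliest, latest, day_hours=24):
    """Return (start, end) hours of each update window in one day.

    Assumption: the first interval begins at midnight, and earliest/latest
    are offsets measured from the start of each interval.
    """
    windows = []
    interval_start = 0
    while interval_start < day_hours:
        windows.append((interval_start + earliest, interval_start + latest))
        interval_start += interval
    return windows


def as_clock(hour):
    """Format a whole hour as a 24-hour clock time, such as 14:00."""
    return f"{hour:02d}:00"


# Good Schedule Scenario #2: INTERVALTIME 8, EARLIESTTIME 2, LATESTTIME 6.
for start, end in update_windows(8, 2, 6):
    print(f"update sometime between {as_clock(start)} and {as_clock(end)}")
```

Each window is four hours wide and the windows start eight hours apart, which matches the requirement of three updates a day within a four-hour period.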
Bad Schedule Scenario:
Requirements: Update twice a day, once at 7:00 A.M. and again at 3:00 P.M.
Schedule:
<SCHEDULE>
    <INTERVALTIME HOUR="8"/>
    <EARLIESTTIME HOUR="7"/>
    <LATESTTIME HOUR="3"/>
</SCHEDULE>
We should have hesitated right from the start -- when we saw specific times in the day. Let's apply our rules: INTERVALTIME is equal to eight hours; therefore, we are updating three times -- and not twice -- a day. The EARLIESTTIME this can start is seven hours from midnight, or 7:00 A.M. (lucky match), and the LATESTTIME this will end is three hours from midnight, or 3:00 A.M. (really?). So this schedule is trying to say that we want to end even before we start! Notice how thinking about a specific hour in the day causes problems?
Figuring in our last rules:
E (7) < I (8) | TRUE |
E (7) < L (3) | FALSE |
L (3) - E (7) = -4 < I (8) | FALSE ( -4 is less than 8, but who ever heard of a negative time frame?) |
Our rules show us that this schedule is inaccurate. In fact, we challenge you to come up with a schedule that would work for this scenario.
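The three checks used in the scenarios above are easy to automate. Here is a minimal Python sketch, assuming all three values are given in hours; the function name is hypothetical, and counting a negative window as a failure follows our remark about negative time frames rather than any documented Internet Explorer behavior.

```python
def check_schedule(interval, earliest, latest):
    """Apply the three checks from the scenarios; all values are in hours."""
    window = latest - earliest
    return {
        "E < I": earliest < interval,
        "E < L": earliest < latest,
        # A negative window fails, even though it is numerically less than I.
        "L - E < I": 0 <= window < interval,
    }


scenarios = {
    "Good Schedule Scenario #1": (24, 7, 9),
    "Good Schedule Scenario #2": (8, 2, 6),
    "Bad Schedule Scenario": (8, 7, 3),
}

for name, (interval, earliest, latest) in scenarios.items():
    results = check_schedule(interval, earliest, latest)
    verdict = "OK" if all(results.values()) else "PROBLEM"
    print(f"{name}: {verdict} {results}")
```

Running a check like this on a schedule before publishing the CDF file is a cheap way to catch the end-before-start mistake shown in the bad scenario.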
Dear Web Men:
I'm running into problems with an Active Desktop item whose subscription is set up to update frequently. I'm using a modem connected to my computer to access the Internet, and have set the subscription properties to autodial the modem whenever the content is scheduled to be updated. But after Internet Explorer does the auto-dial and downloads the content the first time, it doesn't dial and update the content again unless I restart Windows. What's going on?
Deborah H.
The Web Men reply:
Deborah, you are running into a problem with the subscription manager component of Internet Explorer. It's not correctly checking for connections to the Internet in this situation. After updating once, it will not cause the modem to dial again for subsequent updates, and the Active Desktop item will not be updated until you restart Windows. Not good. But there's hope for the future! The problem is understood, and is being looked at by the smart folks who are working hard on upcoming versions of Internet Explorer. (In fact, they recently barricaded themselves in their offices, and are now subsisting on bread and water. They have proclaimed they will not leave until they have constructed the ultimate Web browser. Now that's commitment.)
Jeff Brown, when not forcing family and friends to listen to Zydeco and country blues music, helps develop training courses for the Microsoft Mastering Series -- with a smile.
Rafael M. Muñoz is a part-time Adonis, and full-time support engineer for Microsoft Technical Support. He takes it very, very personally every time you flame Microsoft.