Robert Carter
MSDN Technical Writer
March 9, 1999
Note This is the second article in a series documenting how we built the new MSDN Online. The first discussed the architecture of the site and gave an outline of the production process and tools we use to deliver our stories to you:
"How We're Building the New MSDN Online: Functional Overview".
Since this went live, we've published an additional article in this series:
"Personalizing the Developer Start Page"
Go to the page that hosts the LookupTable Object and installation instructions.
Contents
Introduction
What's a Dictionary Object?
Some Background on ASP, Threading, and the Dictionary Object
Home is Where the LookupTable Object Is
Introduction
We think the brand-spanking new MSDN Online home page is pretty cool. Based on the feedback you've been giving us, you do, too. One key reason we were able to release the new home page is that we convinced the Active Server Pages (ASP) team to build a new object, based on the Visual Basic® Scripting Edition (VBScript) Dictionary object, which enabled us to meet our performance objectives.
This article is all about the Dictionary object and its streamlined cousin, the LookupTable object: what they are, how they work, what they're used for, and so on. Be forewarned that we'll be throwing around terms such as apartment threading -- and although we'll try to explain things clearly, with links to other articles when it's hard to distill some concepts into simple summaries, don't be afraid to skip ahead to other sections. You don't have to understand all the intricacies of our LookupTable object to put it to work on your site.
What's a Dictionary Object?
A Dictionary object is the Visual Basic implementation of the hallowed Perl hash, or associative array. It's a very simple construct: a collection of key-value pairs in which each key maps to exactly one value. Bananas are a yellow food; my daughter's name is Hedy. In Perl, a hash is its own data type, just like a scalar or an array. It's also a fundamental structure of JScript®, where every object is stored as a hash.
Seasoned (some would say, "hardened") developers look at you askance when you then ask, "So what's it good for?" The hash is such a fundamental construct that surely you're a doofus for even asking. Accustomed to being a doofus, I asked anyway. Here's my take: A hash is useful any time you have a thing, or a set of things, that has a unique characteristic associated with it that you'd like to list. A banana is yellow (I realize that if it's a young banana, it's green -- but if we're talking about a hash, you can't have your bananas both ways). If I had a whole bunch of foods, I could characterize them all by their color. If I have another daughter, I will probably not name her Hedy, and I could set up a hash that associates each daughter with her hair color, her favorite food, whatever. For example:
Dim faveFood
Set faveFood = CreateObject("Scripting.Dictionary")
faveFood.Add "Hedy", "Cheerios"
faveFood.Add "Madeleine", "Pizza"
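Once the pairs are in, getting them back out is just as simple. The following is a minimal sketch of the usual read operations (Item, Exists, and Keys), assuming the script runs under Windows Script Host so that WScript.Echo is available; the key "Greta" is made up purely for illustration.

```vbscript
Dim faveFood, kidName
Set faveFood = CreateObject("Scripting.Dictionary")
faveFood.Add "Hedy", "Cheerios"
faveFood.Add "Madeleine", "Pizza"

' Look up a single value by its key.
WScript.Echo faveFood.Item("Hedy")

' Exists tests for a key without adding it to the table.
' ("Greta" is a hypothetical key that was never added.)
If Not faveFood.Exists("Greta") Then
    WScript.Echo "No favorite food recorded for Greta"
End If

' Keys returns an array of every key, handy for walking the whole table.
For Each kidName In faveFood.Keys
    WScript.Echo kidName & " likes " & faveFood.Item(kidName)
Next
```

Checking with Exists first matters because reading Item with a key that isn't there quietly adds that key with an empty value.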
(Keep in mind that the importance of the subject being discussed is inversely proportional to the triviality of the examples used to explain it; given that we're talking about favorite toys and food colors, you can assume that hashes, and Dictionary objects, are pretty important.)
Remember that with the Dictionary object you can associate only one value with a key (hence the term associative). If you want more than one value, you have to step up to a full-fledged array. But you don't have to adhere to any particular data structure or construct within the object; for your key, you can associate an integer or a string with an object, a string, another integer, whatever your little heart desires (you can't use a reference to an array as a key, but you can use one as a value).
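To make the one-value-per-key rule concrete, here is a small sketch, again assuming Windows Script Host for WScript.Echo. It shows a duplicate Add being rejected, an integer used as a key, and an array stored as a value; the sample data is invented for illustration.

```vbscript
Dim faveFoods
Set faveFoods = CreateObject("Scripting.Dictionary")

' One key, one value: a second Add with the same key raises a run-time error.
faveFoods.Add "Hedy", "Cheerios"
On Error Resume Next
faveFoods.Add "Hedy", "Toast"
If Err.Number <> 0 Then
    WScript.Echo "Duplicate key rejected: " & Err.Description
    Err.Clear
End If
On Error GoTo 0

' Keys don't all have to be the same type; an integer works too.
faveFoods.Add 42, "an integer key"

' To hold several values under one key, store an array as the value.
faveFoods.Item("Madeleine") = Array("Pizza", "Grapes")
WScript.Echo Join(faveFoods.Item("Madeleine"), ", ")
```

Assigning through Item, rather than calling Add, creates the key if it is missing and overwrites the value if it is already there.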
Some Background on ASP, Threading, and the Dictionary Object
The VBScript Dictionary object was released with VBScript 2.0 in November of 1996. The Microsoft Scripting team added it to the VBScript run-time library (scrrun.dll) so that VBScript programmers could use associative arrays, just as their JScript brethren could. Unfortunately, due to a combination of things, the team became a victim of its own implementation.
Here's how. When the Scripting team released the Dictionary object, they only really intended it to be used on the client side. Some of you are probably thinking, as I did, "Big deal." Well, apparently it is. Without going into too much detail, performance can go you-know-where in a handbasket if you don't watch your threads, the term for all the execution processes your computer manages in the course of handling your requests. We'll provide a brief overview of threading here; for a fuller introduction, see "Geek Speak Decoded #7: Tasks, Processes, and Threads" by Nancy Cluts, or "Agility in Server Components" by Neil Allain.
On a Web server, the number of requests that need to be handled simultaneously is much higher than on the box from which you accessed this page. A Web server running ASP technology, for example, assigns every page request a thread from a pool established on start-up; if all the threads are in use, subsequent page requests are placed in a queue. For more information on setting pool size and so on, check out "Tuning Internet Information Server Performance" and "IIS 4.0 Tuning Parameters for High-Volume Sites". The ASP server then applies a processing algorithm to figure out which threads to process when. One approach resembles the technique we're advised to follow when taking standardized tests: Do the easy ones first, save the hard ones for later. While there is some processing overhead associated with keeping track of where the server is in answering different requests on different threads (called context switching) -- and while really long requests often take longer to fulfill, because they keep getting stopped and re-started -- overall, the average response time is quicker, because the easy requests don't have to wait.
One way the ASP server figures out whether to apply its context-switching skills is based on what kind of threading model an object claims it needs. If the ASP server is told (by a flag setting in the Registry) that an object is apartment-threaded, the ASP server's context-switching skills aren't used (although the processing overhead for context switching continues to run in the background). All server requests for that object are "locked down" to the thread on which it was first called. If a hard question comes along and clogs up that thread, every other request has to wait in line until the server can puzzle it out. Things can get pretty slow that way.
If the ASP server is told that an object is both-threaded (able to work on either a single or multiple apartment threads), any request to that object is answered really fast, because the ASP server assumes that the object is doing all the thread-managing itself, so the ASP server doesn't need to handle any processing overhead.
Designing an object to be both-threaded is a wee bit harder than designing an object to be apartment-threaded. You have to synchronize all the requests you get (so you can return the proper information to each thread). You have to ensure that two or more threads don't try to access the same piece of data at the same time (especially if they can change the value of the data they're accessing). And so on. In short, a both-threaded object is more robust, but much harder to design.
The original Dictionary object developed by the scripting team was apartment-threaded, because the team assumed that everybody would use the object on the client side -- and the extra effort to make it both-threaded wouldn't be justified. As it turns out, everybody wanted to use it on the server.
Even that wouldn't have been a problem if the Dictionary object call were limited to page scope on the server (it's unlikely that the object would be called more than once or twice from a single page). But if the Dictionary object call were set to session or, heaven forbid, application scope, the number of requests it could be forced to process would mushroom -- and with it, the probability that server speed would bog down.
To make matters worse, there was a hiccup in the installation process. Even though the Dictionary object was apartment-threaded, the installation process set the host computer's Registry key as if the object were both-threaded. The ASP team was excited to see a both-threaded Dictionary object appear, and encouraged folks to use it on the server in application scope to reap all the benefits of hashes. Whoops.
So what happens when an apartment-threaded object is treated by ASP as if it were both-threaded? At some point -- usually when two threads modify the same piece of data without being aware of each other's actions -- a piece of data on the server is corrupted or set to a value the Web application isn't designed to handle. Some time later, the server tries to access the data -- and boom, server crash. The likelihood of this happening increases with the traffic the server faces, so the server tends to crash at the times when you will aggravate the most people possible.
Hardly the ideal choice for Web developers: a slow object or a crash-prone object.
Home is Where the LookupTable Object Is
Which is where home.microsoft.com (HMC) entered the picture. Our team was assigned to build a new, customizable home page for the Home.Microsoft.com site (now the home.msn.com site). We wanted to use the Dictionary object for two main reasons:
But for some reason the server kept crashing. Hmmm... By this time, the ASP team was already aware of the boo-boo, and realized it was time to build a new object -- one that was fast, and one that could be used in application scope, as the HMC team needed.
So the ASP team built the new object. But to meet the performance and stability requirements of the HMC team, they limited the object to a subset of traditional hash functionality. In essence, the object they developed is a whizzy-fast read-only lookup table. In an inspired moment, the ASP team named it the LookupTable object.
We create references to the LookupTable object and set them to application scope by calling them in the ASP file global.asa:
<OBJECT ID=categories PROGID="IISSample.LookupTable" SCOPE=Application RUNAT=Server></OBJECT>
(We load the object more than twenty-five times, each time using a different ID, to store lots of different text file lookup tables, but that's another story.)
We populate all the object references (save the lookup tables that hold all our "stories") when the application is started using a server-side include. Why an include? In the event that we want to update the objects after the server starts, we can use an ASP file referencing the same code to load newer versions of the lookup tables.
<!--#include file="msdn-online/dictionaryloadfile.inc" -->
Within the include file, we declare our variables and use the LoadValues method of the LookupTable object to load our lookup table key-value pairs. The method takes two parameters -- one for the URL of the text file, the other to specify the data types of the key-value pairs (strings or integers) and whether to use error-checking. Below is an abridged snippet from our include file showing how we handle the lookup table that links the category codes to the descriptions of them you see on the home page.
<SCRIPT LANGUAGE=VBScript RUNAT=Server>
Function InitDictionaries()
    Dim UNCRoot
    Dim CategoriesResult

    UNCRoot = Server.MapPath("/") & _
        "\msdn-online\lookupfiledirectory\"
    CategoriesResult = Categories.LoadValues _
        (UNCRoot + "Categorieslookup.txt", 10)

    If CategoriesResult=0 Then
        InitDictionaries = True
        Application("LoadDictionariesUpdateResult") = 0
    Else
        InitDictionaries = False
        Application("LoadDictionariesUpdateResult") = 4
    End If
End Function
</SCRIPT>
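Because the loading code lives in an include file, a separate ASP page can pull in the same file and call InitDictionaries to refresh the tables after the server has started. What follows is only a sketch of what such a page might look like, not the page we actually run: the idea of a dedicated reload page and the messages it writes are assumptions, and the include path would need adjusting to wherever the page sits relative to the include file.

```vbscript
<!--#include file="msdn-online/dictionaryloadfile.inc" -->
<%
' Hypothetical maintenance page: reloads the lookup tables on demand.
' InitDictionaries comes from the include file above and returns True
' when the load succeeded.
If InitDictionaries() Then
    Response.Write "Lookup tables reloaded."
Else
    ' The include file records a nonzero result code on failure.
    Response.Write "Reload failed; result code " & _
        Application("LoadDictionariesUpdateResult")
End If
%>
```

A page like this lets you push an updated text file to the server and load it without restarting the application.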
The only requirement for the text files that contain the key-value pairs is that each key be separated from its value by a comma, with one pair per line (pairs that are longer than one line use "\" as a continuation character).
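As an illustration of that format, a lookup file might look something like the sample below. The category codes and descriptions are invented, and placing the "\" at the end of the line being continued is an assumption about where the continuation character goes.

```text
web,Web Development
vb,Visual Basic
xml,Everything you wanted to know about XML but were afraid \
to ask
```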
Other methods available for the new LookupTable object (although we don't use them for the home page) are:
If you want even more information on how the Dictionary object works, more complete descriptions of its methods and use, or simply want to add it to your own collection, travel over to the LookupTable object download page.
As useful as the LookupTable and Dictionary objects may be, there are times when you may need to store more than a key-value pair in memory. In those cases, you might consider creating an Application-state Recordset -- essentially storing a table, or the results of a query, directly in server memory. See the excellent Nancy Cluts article on this subject: "Got Any Cache?"
In our next installment of the MSDN Chronicles, we'll explore in detail the cookie-based personalization scheme we use. Stay tuned.
Robert Carter is an MSDN technical writer.