Michael Wallent
Lead Program Manager
Microsoft Corporation
January 4, 1999
The following article was originally published in the Site Builder Magazine (now known as MSDN Online Voices) "DHTML Dude" column. To fully understand Michael Wallent's Dynamic HTML column, you need Internet Explorer 4.0 or higher.
One of the most fun things that I do in my job at Microsoft is meet with customers who have built, or are in the process of building, systems using Internet Explorer and Dynamic HTML (DHTML). Last month, in a meeting with a large company, an interesting problem arose. The customer had a list of 2,000 or so products from which users could choose, and needed to present data on those products in an easy-to-use fashion.
No problem, I thought; use data binding. The company representatives told me that they had already thought of that -- but in their experience, it was too slow. After some discussion, it turned out that they were loading all 2,000 elements into a table, and that was slow. One alternative we discussed was using the datapagesize property on the data-bound table to limit the amount of data initially loaded, and then use buttons to page back and forth through the data. They felt that this, too, wasn't a good solution, mostly because the users of the system would have to learn a new user-interface paradigm. Point well taken. Then, they said something that really surprised me. "We can do this in Visual Basic, and it's really fast." Hmmm. Was Visual Basic loading all 2,000 rows in a second? Nope. They were using a virtual list. The HTML solution they were comparing it to was being asked to load all 2,000 rows just as fast as this virtual list was loading about 20. Two orders of magnitude: not a very clean comparison. So, the gauntlet was thrown. Could we build a virtual list using DHTML?
Oh, yeah.
Note that this month's sample is designed to run with Internet Explorer 5 Beta or later.
The basic requirement of a virtual list is to give the user the impression that all of the possible data is in the list, while behind the scenes, the elements are retrieved only when the user scrolls to see them. In this way, the user experience is a seamless list, scrolling as expected. If the entire data set were loaded initially, the performance of the page load would be intolerable.
Even though data binding alone doesn't have all the functionality to solve this problem, it is the critical component. The basic method here is to have two tables. The first table is our virtual list table (vtable). The vtable is actually a <DIV> of a defined width and height, with the overflow Cascading Style Sheets (CSS) property set to "scroll." The second, hidden, table (we'll call that the data table) actually does the data binding. The data table is set up to use the datapagesize property, so the amount of data downloaded can be easily controlled. As data becomes available in the data table, the rows are copied into the vtable. As the vtable is scrolled, it signals the data table that it needs more data, and then as soon as the data is downloaded, it's copied to the vtable. All the user sees is a seamless, scrolling list.
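To make the division of labor concrete, here is a minimal sketch of the same idea in plain JavaScript, with no browser involved. It is a model under stated assumptions, not the sample's code: createPagedSource() and createVirtualList() are hypothetical names that stand in for the hidden data table and the vtable, and the product data is made up.

```javascript
// Hypothetical stand-in for the hidden data-bound table: it hands out
// records one page at a time, the way datapagesize limits the bound table.
function createPagedSource(records, pageSize) {
  var page = -1; // no page loaded yet
  return {
    nextPage: function () {
      page += 1;
      var start = page * pageSize;
      return records.slice(start, start + pageSize).map(function (value, i) {
        return { recordNumber: start + i + 1, data: value };
      });
    }
  };
}

// Hypothetical stand-in for the vtable: it only ever holds the rows
// that have been pulled from the source so far.
function createVirtualList(source) {
  var rows = [];
  return {
    rows: rows,
    loadMore: function () {
      var fresh = source.nextPage();
      for (var i = 0; i < fresh.length; i++) { rows.push(fresh[i]); }
      return fresh.length;
    }
  };
}

// Made-up data set of 2,000 products.
var records = [];
for (var n = 1; n <= 2000; n++) { records.push("Product " + n); }

var list = createVirtualList(createPagedSource(records, 20));
list.loadMore(); // the initial page
list.loadMore(); // the user scrolled near the end of the loaded rows

console.log(list.rows.length, list.rows[39].data); // prints: 40 Product 40
```

Only 40 of the 2,000 records have been touched at this point, which is the whole appeal of the approach.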
I'm first going to review the source that actually gets the data into the data table.
<xml id=LargeData src=dom2.xml> </xml>

<table id=SourceTable datasrc=#LargeData datapagesize=20 width=100%
       style="table-layout: fixed; display: none">
  <tr>
    <td>
      <div datafld=data></div>
    </td>
  </tr>
</table>
I'm using data binding with an XML Data Source Object (DSO); however, any DSO will do (Tabular Data Control, Remote Data Service, or one of your own invention). The bound table is similar to any other simple bound table, with two exceptions. First, the datapagesize property is set to 20, indicating that data will be shown 20 rows at a time. (For fun, you might want to try modifying this value; it has a pretty dramatic effect on the performance of this code.) Second, the table is hidden using the CSS display: none property. The data table won't be used to display content; rather, it will be used as a data pump for the vtable.
Data binding, like many other HTML features, is asynchronous. This means that the data doesn't arrive at a predictable moment: the timing on a fast line won't be the same as it would be on a 28.8 modem. However, we can monitor a series of events that tell us when the data is available.
// When the data-bound table reaches readyState == "complete",
// there is new data to transfer. This handler then calls
// copyRows(), and calls testForScroll() to handle the case
// where the scroll thumb has been moved by a large amount.
function dataAvailable() {
  if (SourceTable.readyState == "complete") {
    copyRows();
  }
  testForScroll();
}

// The readyState of the data table will change as data
// fills in; this tells us when it's OK to copy data.
SourceTable.onreadystatechange = dataAvailable;
When the page initially loads, the first 20 rows of the data set will automatically load. Those first 20 rows need to be copied into the vtable as soon as they are available. As any data-bound table is loaded with data, its readyState property will sequence through "loading" and "interactive," and finally reach "complete." Every time the readyState property of the data table changes, the onreadystatechange event will fire. Note that with the code above, I handle this event for the data table (the data table has the ID of SourceTable in this code). When the readyState of the table is "complete," the data is in the data table. At this point, the next chunk of rows in the data table can be copied to the vtable.
I'll cover exactly what the testForScroll() method does in the scrolling section.
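The gating on readyState can be tried outside the browser with a small simulation. This is only a sketch: fakeTable and simulatePageLoad() are hypothetical, and the assumption that each page load fires exactly one event per state is mine, made to keep the example short.

```javascript
// Hypothetical stand-in for the data-bound table.
var fakeTable = { readyState: "loading", onreadystatechange: null };
var copyCount = 0;    // how many times rows would have been copied
var scrollChecks = 0; // how many times the scroll test would have run

// Same shape as dataAvailable(): copy only on "complete",
// but run the scroll check on every state change.
function onStateChange() {
  if (fakeTable.readyState === "complete") { copyCount += 1; }
  scrollChecks += 1;
}
fakeTable.onreadystatechange = onStateChange;

// Walk the table through the three states, firing the handler each time
// (assumed sequence: one event per state).
function simulatePageLoad(table) {
  var states = ["loading", "interactive", "complete"];
  for (var i = 0; i < states.length; i++) {
    table.readyState = states[i];
    if (table.onreadystatechange) { table.onreadystatechange(); }
  }
}

simulatePageLoad(fakeTable); // first page
simulatePageLoad(fakeTable); // second page
console.log(copyCount, scrollChecks); // prints: 2 6
```

Under those assumptions, the copy happens once per page no matter how many intermediate state changes arrive.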
// This method actually moves the data from the bound table
// to the virtual table on demand. It detects when all the
// data has been copied, and stops the transfer.
function copyRows() {
  var i, copyRow, row;
  for (i = 0; i < SourceTable.rows.length; i++) {
    row = SourceTable.rows[i];
    // test if all data copied
    if (row.recordNumber <= LastRecordNumber) {
      // if so, no more data to copy, so unhook
      ScrollContainer.onscroll = null;
      SourceTable.onreadystatechange = null;
      return;
    }
    // deep copy the row from the table
    copyRow = row.cloneNode(true);
    TargetBody.insertBefore(copyRow, null);
    // used to keep track of how many rows are copied
    // to ensure no duplicates
    LastRecordNumber = row.recordNumber;
    // removes a placeholder for scrolling for each row
    // inserted in the vtable
    deletePlaceholder();
  }
}
The method above does the dirty work of moving the rows from the data table to the vtable. This task is made significantly easier with Internet Explorer 5 and the addition of the cloneNode() method. This allows a row to be copied in one line of code (this was previously pretty tricky to get right).
This code loops through all the rows in the data table, and for each row, does a deep copy, and inserts the copy into the vtable.
There are two interesting bits here. First, there is a global variable LastRecordNumber, which keeps track of the last row that was inserted. Using this variable, the code checks to make sure that the same records aren't inserted multiple times into the vtable.
The second interesting bit is the deletePlaceholder() method, which has to do with properly maintaining the scroll bar, which we'll cover next.
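The duplicate check built on LastRecordNumber can be modeled with plain arrays. The sketch below is an illustration rather than the sample's code: copyNewRows() and the state object are hypothetical names, and the pages are made-up data. It assumes, as copyRows() does, that seeing an already-copied record number means there is nothing new left to transfer.

```javascript
// Copy rows into target, stopping at the first record that has
// already been copied. state.lastRecordNumber plays the role of the
// global LastRecordNumber; state.done marks where the real code
// unhooks its event handlers.
function copyNewRows(sourceRows, target, state) {
  for (var i = 0; i < sourceRows.length; i++) {
    var row = sourceRows[i];
    if (row.recordNumber <= state.lastRecordNumber) {
      state.done = true;
      return;
    }
    target.push(row);
    state.lastRecordNumber = row.recordNumber;
  }
}

var state = { lastRecordNumber: 0, done: false };
var target = [];
var page1 = [{ recordNumber: 1 }, { recordNumber: 2 }, { recordNumber: 3 }];
var page2 = [{ recordNumber: 4 }, { recordNumber: 5 }];

copyNewRows(page1, target, state);
copyNewRows(page2, target, state);
copyNewRows(page2, target, state); // same page delivered again: nothing is copied

console.log(target.length, state.lastRecordNumber, state.done); // prints: 5 5 true
```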
Getting the scroll bar to behave properly was the trickiest part of this program. The scroll bar needs to feel consistent to create the proper "virtual" illusion for the end user. It would be unacceptable for the scroll bar to grow, or change position as the user was scrolling down in the list. However, this would be exactly the behavior if it weren't for some of the code below.
Before I cover how to keep the scroll bar the right size, I'll explain how to figure out exactly when to load more data.
// Event handler for onscroll for the ScrollContainer.
// Each time the table is scrolled, we check to see
// if the first placeholder is close to coming into view.
// If it is, we need to get more data.
function testForScroll() {
  if (ScrollContainer.scrollTop + ScrollContainer.offsetHeight + ScrollBuffer >= firstspacer.offsetTop) {
    getMoreData();
  }
}

// This function advances the page in the data-bound table.
// Note that the nextPage() method is asynchronous, and will
// return before the new data is in.
// To resolve that problem, the onreadystatechange event
// is hooked on the data-bound table.
// That event is handled by the dataAvailable() method.
function getMoreData() {
  if (SourceTable.readyState != "complete") return;
  SourceTable.nextPage();
}
Earlier, in the definition of the dataAvailable() method, the testForScroll() function got called. The testForScroll() function is also the event handler for the onscroll event of the vtable. Whenever the vtable scrolls, if the current end of the list will be visually exposed, or is already exposed, then more data needs to be retrieved from the data table. The firstspacer object sits immediately after the current last object in the vtable. By calculating its position, with respect to how much the container is scrolled, it's possible to figure out whether more data is required. The reason that testForScroll() is called in the dataAvailable() method is to catch the condition where more than one set of data needs to be transferred to fill the scrolled region of the vtable list. This will happen when the user drags the thumb faster than the data can load.
The getMoreData() method actually does the work of advancing the cursor on the table to get the next chunk of data. However, as discussed above, the actual copying occurs only after the data is loaded.
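The comparison inside testForScroll() is easier to reason about with numbers plugged in. The sketch below pulls the test out into a pure function; needsMoreData() and pagesNeeded() are hypothetical names, and the 18-pixel row height, 200-pixel viewport, and 50-pixel buffer are assumed values chosen only for illustration.

```javascript
// Same inequality as testForScroll(), with the DOM reads passed in.
function needsMoreData(scrollTop, viewportHeight, scrollBuffer, firstSpacerTop) {
  return scrollTop + viewportHeight + scrollBuffer >= firstSpacerTop;
}

// How many pages must be loaded before the test stops asking for more.
// Assumes fixed-height rows and an unlimited supply of data.
function pagesNeeded(scrollTop, viewportHeight, scrollBuffer, rowHeight, pageSize) {
  var pages = 1; // the first page loads automatically
  while (needsMoreData(scrollTop, viewportHeight, scrollBuffer,
                       pages * pageSize * rowHeight)) {
    pages += 1;
  }
  return pages;
}

// 20 rows of 18 pixels put the first spacer at offset 360.
console.log(needsMoreData(0, 200, 50, 360));   // prints: false
console.log(needsMoreData(110, 200, 50, 360)); // prints: true

// The thumb was dragged to scrollTop 1000 in one motion.
console.log(pagesNeeded(1000, 200, 50, 18, 20)); // prints: 4
```

The last call shows why the test has to run again after each page arrives: a single long drag of the thumb can require several pages before the visible region is filled.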
// To make sure that the scroll bars are the right size,
// one spacer is created for each row that can be
// inserted. These are removed as real rows are created.
function insertScrollBlocks() {
  var i, br;
  for (i = 0; i < TotalRecordCount; i++) {
    br = document.createElement("SPAN");
    if (i == 0) {
      br.id = "firstspacer";
    } else {
      br.id = "spacer";
    }
    br.style.display = "block";
    ScrollContainer.insertBefore(br, null);
  }
}

// method to remove a placeholder when a real row is inserted
function deletePlaceholder() {
  var p = document.all.spacer;
  var brToRemove;
  if (p == null) return;
  if (p.length != null) {
    // several spacers remain: p is a collection, remove the last one
    brToRemove = p[p.length - 1];
    if (brToRemove == null) return;
    brToRemove.removeNode();
  } else {
    // only one spacer remains: document.all returns the element itself
    p.removeNode();
  }
}
The real trick of this sample was to get the scroll bar to work properly. The solution was to insert "spacer" objects in the vtable. One spacer was inserted for each record that could come in. As each real row came in, a corresponding spacer was deleted.
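The bookkeeping behind the spacers is simple arithmetic, sketched here as a small model. createScrollModel() is a hypothetical name, and the model assumes that a spacer is exactly as tall as a real row (18 pixels here, an arbitrary value); the article does not show how the spacers are sized.

```javascript
// Model of the scrolling region: every real row that arrives
// replaces one spacer, so the total content height never changes.
function createScrollModel(totalRows, rowHeight) {
  var realRows = 0;
  var spacers = totalRows;
  return {
    addRow: function () {
      if (spacers === 0) { return false; } // nothing left to replace
      realRows += 1;
      spacers -= 1;
      return true;
    },
    // Height the scroll bar is sized against.
    contentHeight: function () { return (realRows + spacers) * rowHeight; },
    // Offset of the first remaining spacer, the value testForScroll() reads.
    firstSpacerTop: function () { return realRows * rowHeight; }
  };
}

var model = createScrollModel(2000, 18);
var before = model.contentHeight();
for (var k = 0; k < 20; k++) { model.addRow(); }

console.log(before, model.contentHeight(), model.firstSpacerTop());
// prints: 36000 36000 360
```

The content height stays put while the first spacer moves down the list, so the thumb keeps its size and position as data streams in.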
This virtual list implementation was optimized for fast first load of long lists of data. However, some use cases would need further optimization to make this truly general purpose. If you had a million rows of data, then you would probably need to delete rows as they got far away from the current viewport, or your page would soon grind to a halt. However, this implementation is reasonable for lists in the range of 2,000 to 5,000 rows, depending on complexity.
Until next time.
Michael Wallent is Microsoft's group program manager for DHTML.