Fitch & Mather Sample: Using XML to Display Data

Scott Stanfield
Vertigo Software, Inc.

September 1998

Summary: Discusses the decision to use XML to encode FMCorp expense report data. (8 printed pages) Covers:

Introduction
XML Overview
Original Architecture
Creating the XML Document
XML Parsing Technology
Tools
For More Information

Introduction

The World Wide Web Consortium (W3C) recently defined a new standard for data interchange called the Extensible Markup Language (XML). FMCorp uses XML to encode the expense report data for transmission over the Internet. This document describes why the decision to use XML was made, how the .xml files are created, and how they are parsed on the server.

XML Overview

Hypertext Markup Langauge (HTML) is a great way to display data, but it's pretty lousy when it comes to describing data. The W3C recognized the need for a new language to address this problem. The solution, XML, provides a standard for defining your own markup tags and data structure so that data can easily be exchanged online.

XML tags can be used directly within an HTML page to make the page "smarter." In fact, most of the information on the Internet now describes how XML will revolutionize Web pages. Understanding the design goals will make it easier to see why it was adapted for our project.

One of these goals is to define an exact grammar for XML to make it easy to parse and understand. The rules for start/end tags, name/value pairs, and layout look similar to HTML. But the rules only determine if the document is well formed, not if it makes sense. The semantics of any XML data stream are dependent on the implementation.

What does an XML document look like? The .xml file that describes a single-entry expense report is shown here:

<?XML VERSION="1.0"?>
<EXPENSEREQUEST>
   <VERSION>Expense Report 1.0</VERSION>
   <USEREMAIL>alice@fmcorp.com</USEREMAIL>
   <DESCRIPTION>Enter Description here</DESCRIPTION>
   <ITEMS>
      <ITEM>
         <TYPE>Travel</TYPE>
         <DESCRIPTION>rental car</DESCRIPTION>
         <COST>44</COST>
         <DATE>6/1/98</DATE>
      </ITEM>
   </ITEMS>
</ExpenseRequest>

In order to understand how XML is used in FMCorp, it's important to understand the site's original architecture flaws.

Original Architecture

FMCorp's original architecture didn't use XML. A user's expense report was transferred to the Web server as a native Microsoft® Excel document (an *.xls file). When the Web server received the .xls file, a Microsoft Visual Basic® Component Object Model (COM) component used Excel's Automation interface to parse the expense report data. The data was then entered into the database and the file was discarded.

This technique should be raising some eyebrows. It put an incredible burden on the Web server. Not only did it require an installation of Excel on the Web server, but it also took approximately 20 seconds to complete the entire process. Most of this time was spent starting the Excel engine.

No Webmaster would ever allow this extra processing burden on the server. Instead, we rethought the whole problem and reasoned that the processing should be done on the client. After all, the employee already had the data in a parsable format: Excel.

Because Excel supported macros, we discovered that we could add a macro (see the discussion to follow) that was executed when the file was saved. This macro could use the Excel object model to walk through the data and create an ASCII file representing the contents on the local hard drive. Later, the contents of the file could be uploaded via the Web site.

The question remained as to what format the data should take as it is transferred across to the server. We considered creating our own proprietary ASCII format, but that would require two pieces of code: one to build the file, the other to parse the file.

Because Microsoft Internet Explorer 4.0 comes with an XML Parser object, we reasoned that we could use it to decompose the .xml file created by the Excel macro. Because our Web site required Microsoft Internet Information Server (IIS) 4.0, which installs Internet Explorer 4.0, we knew the parser would be available.

XML Parser support is provided by a COM object whose implementation lives in a file called msxml.dll. The ProgID for this component is "MSXML."

Creating the XML Document

Although you can use the XML Parser to programmatically create an XML document, we couldn't ensure the user had Internet Explorer 4.0 on the machine that was running Excel. We opted to create the file "by hand," using a long macro in Excel written in Visual Basic for Applications.

The Excel macro is embedded in the downloadable template. The code looks like this:

Private Sub Workbook_BeforeSave(ByVal SaveAsUI As Boolean, Cancel As Boolean)
    createXMLfile (Left(ActiveWorkbook.FullName, _
      Len(ActiveWorkbook.FullName) - 4) & ".xml")
End Sub

Sub createXMLfile(filepath As String)
    Dim vbNewLine As String
    vbNewLine = Chr$(13) + Chr$(10)
    
    Dim XMLFileText As String 'working XML conversion of this speadsheet
    Dim ERsheet As Worksheet
    Set ERsheet = ActiveWorkbook.Worksheets("Expense Report")
    
    Const tagItemDescription = "ItemDescription"
    Const tagItemDate = "ItemDate"
    
    'list of Expense types
    Dim eTypes(6) As String
    eTypes(1) = "ItemRoom"
    eTypes(2) = "ItemTransport"
    eTypes(3) = "ItemFuel"
    eTypes(4) = "ItemPhone"
    eTypes(5) = "ItemEntertain"
    eTypes(6) = "ItemOther"
    eTypes(0) = "ItemMeals"
    
    XMLFileText = ""
    XMLFileText = XMLFileText & "<?XML VERSION=" & Chr(34) & "1.0" & Chr(34) & "?>" & vbNewLine
    XMLFileText = XMLFileText & tabs(0) & UCase("<ExpenseRequest>") & vbNewLine
    XMLFileText = XMLFileText & tabs(1) & "<VERSION>" & ERsheet.Range("ExpenseVersion") & "</VERSION>" & vbNewLine
    XMLFileText = XMLFileText & tabs(1) & "<USEREMAIL>" & ERsheet.Range("Email") & "</USEREMAIL>" & vbNewLine
    XMLFileText = XMLFileText & tabs(1) & "<DESCRIPTION>" & ERsheet.Range("description") & "</DESCRIPTION>" & vbNewLine
    
    XMLFileText = XMLFileText & tabs(1) & "<ITEMS>" & vbNewLine
    
    Dim nMaxItems As Integer
    nMaxItems = ERsheet.Range("ItemDescription").Count
    
    With ERsheet
        ' note we start with an index of 2 instead of 1
        ' the excel column ranges first index is the title
        ' of the column, the actual items start at index 2
        For col = 0 To UBound(eTypes)
            'for each column of data
            'set the column we are working on
            sColumn = eTypes(col)
            
            For nIndex = 2 To nMaxItems
                If .Range(sColumn)(nIndex) <> Empty And .Range(tagItemDescription)(nIndex) <> Empty Then
                    ' get the item information
                    XMLFileText = XMLFileText & tabs(2) & "<ITEM>" & vbNewLine
                    XMLFileText = XMLFileText & tabs(3) & "<TYPE>" & .Range(sColumn)(1) & "</TYPE>" & vbNewLine
                    XMLFileText = XMLFileText & tabs(3) & UCase("<Description>") & .Range(tagItemDescription)(nIndex) & UCase("</Description>") & vbNewLine
                    XMLFileText = XMLFileText & tabs(3) & UCase("<Cost>") & CCur(.Range(sColumn)(nIndex)) & UCase("</Cost>") & vbNewLine
                    XMLFileText = XMLFileText & tabs(3) & UCase("<Date>") & .Range(tagItemDate)(nIndex) & UCase("</Date>") & vbNewLine
                    XMLFileText = XMLFileText & tabs(2) & "</ITEM>" & vbNewLine
                End If
            Next nIndex
        Next col
    End With
    
    XMLFileText = XMLFileText & tabs(1) & "</ITEMS>" & vbNewLine
    
    XMLFileText = XMLFileText & tabs(0) & "</ExpenseRequest>" & vbNewLine
    
    'MsgBox XMLFileText
    Dim FSO As Object
    Dim newFile As Object
    
    Set FSO = CreateObject("Scripting.FileSystemObject")
    Set newFile = FSO.CreateTextFile(filepath)
    
    newFile.Write (XMLFileText)
    newFile.Close
    
    Set newFile = Nothing
    Set FSO = Nothing
End Sub

Function tabs(n As Integer) As String
    tabs = ""
    For x = 0 To n - 1
        tabs = tabs & vbTab
    Next
End Function

Workbook_BeforeSave is called right before the spreadsheet is saved. Because it is called before the employee has a chance to name the file, the default file name will be ExpenseReportTemplate.xml. Subsequent saves will pick up the saved file name, like Jack1.xml.

XML Parsing Technology

Once the .xml file has been created on the client, it's the user's responsibility to upload the file to the server. The file upload technique is discussed in "The Fitch & Mather Corporation Web Site."

Once the file has been received on the server, Active Server Pages (ASP) code uses the FMCorp ExpenseReports component SubmitXMLFile method to take apart the file. This method uses the IXMLDocument interface that is exposed by the XML Parser. The full details of the interfaces and methods exposed by MSXML can be found in the links referenced at the end of this document.

Let's examine the InitialParse method, because it shows the minimal amount of code necessary to take apart an .xml file:

Private Function InitialParse(ByVal XMLFilePath As String, ByRef xml As IXMLDocument, _
        ByRef Email As String, ByRef Description As String) As Boolean
    
    On Error GoTo InitialParse_Error
    
    Dim xd As New XMLDocument
    Dim root As IXMLElement
    
    InitialParse = False    ' assume the worst
    
    ' Check version
    xd.URL = XMLFilePath
    Set root = xd.root
    If root.children.item("VERSION").Text <> "Expense Report 1.0" Then
        Set xd = Nothing
        Exit Function
    End If
    
    Email = root.children.item("USEREMAIL").Text
    Description = root.children.item("DESCRIPTION").Text
    
    Dim item As Object
    
    For Each item In root.children.item("ITEMS").children
        Debug.Print item.children.item("TYPE").Text
        Debug.Print item.children.item("DESCRIPTION").Text
        Debug.Print item.children.item("DATE").Text
        Debug.Print item.children.item("COST").Text
    Next
    
    InitialParse = True
    
InitialParse_Exit:
    If Not xd Is Nothing Then
        Set xml = xd        ' possible memory leak here--call set xd = nothing next
    End If
    Exit Function
    
InitialParse_Error:
    InitialParse = False
    Resume InitialParse_Exit

End Function

An instance of the IXMLDocument interface is instantiated as the variable "xd." Its URL property points to the full path representing the .xml file on the server. The next block of code checks the version string. If this attribute isn't found, the code returns.

Next, the USEREMAIL tag and DESCRIPTION tag are extracted into two output parameters. The For Each syntax provides a convenient way to enumerate each item in the ITEMS collection. Each item should contain the TYPE, DESCRIPTION, DATE, and COST attributes. For debugging purposes, these items are sent to the Debug window.

The return value for the method is set to true and the XML Parser interface is returned. The code that uses InitialParse later enumerates the items in the collection and writes them to the database.

Tools

For general editing of any XML-based data, you can try Microsoft's XML Notepad tool, which is available free of charge at http://msdn.microsoft.com/xml/notepad/intro.asp. XML Notepad offers a simple user interface that shows the tree structure of XML data. As of this writing, the tool is in beta, but it was stable enough for our purposes.

Figure 1 shows an expense report .xml file that was generated by the Excel template.

Figure 1. XML Notepad showing a simple expense report

For More Information