XML: One Hot Abbreviation, but What Does It Mean?

IMHO
December 12, 1997

John Swenson
MSDN Online

Even if you're an experienced software developer, important new technologies always come along that you know nothing about. Such seems to be the case with XML, which garnered a lot of attention at this fall's Internet World 97 because of its newness. Based on many of the questions XML vendors were asked at their Internet World booths, few of the 50,000-plus people attending Internet World had a clue about what XML (eXtensible Markup Language) actually does.

"They don't understand it," said Denise Graves, an account manager staffing the Internet World booth for ArborText, a Michigan software company that is one of the first vendors to offer an XML tool. "I've had lots of people come up to me and say 'I know I need to learn XML, I know it's the next thing for the Web, can you tell me about it?'" Many people incorrectly assume XML is a replacement for HTML, and don't understand that XML actually complements HTML, she says.

It's still early

If you're among those who don't know XML from HTML, don't worry. The W3C is still working on the first XML specification, so there's still time to get up to speed on this new technology before it arrives in full force. Microsoft is touting the fact that Internet Explorer 4.0 is the first browser to support XML, but even Microsoft product managers admit there really isn't any XML on the Web yet. Just wait, they promise, and you'll see how important XML will become.

Unless you already know what a parser does and understand acronyms such as DTD (Document Type Definition), XSL (Extensible Style Language), and RDF (Resource Description Framework), you have some learning to do. The best place to start is probably the XML page on the W3C Web site (http://www.w3.org/XML/). Two days before the start of Internet World, the W3C kick-started XML by releasing the initial proposal for the XML 1.0 specification. Developers who are serious about learning XML should read through the proposed specification (http://www.w3.org/TR/PR-xml-971208/), one Microsoft XML expert advised.

In search of XML

I spent most of my second day at Internet World talking to people about XML. Everyone seemed to explain it in a different way, and most recommended seeing a demo to help understand what XML can do. You may have trouble catching a demo, but you can point your browser to the new Extensible Markup Language (XML) pages of the Microsoft Site Builder Network Web site (http://www.microsoft.com/xml/). The site offers an XML demo you can download (http://www.microsoft.com/workshop/author/xml/parser/) and lots of information about XML. Be sure to also take a look at the "Frequently Asked Questions About Extensible Markup Language (XML)" document, available in the MSDN Library.

Here's how Microsoft defines XML:

A simplified subset of the Standard Generalized Markup Language (SGML) specifically designed for Web applications. XML provides a standard format to describe different types of data—for example, an appointment record, a purchase order, a database record—so that the information can be decoded, manipulated, and displayed consistently and correctly. XML provides a file format for representing data, a schema for describing data structure, and a mechanism for extending and annotating HTML with semantic information.

Don't be confused

Okay, so what does that really mean? It's really not so complicated, said Adam Bosworth, Microsoft's chief XML expert. "It's simple—XML is just an unlimited set of tags. It's like HTML, but XML lets you build your own tags," he says.

Another good explanation came from Rick Shaw, a senior systems engineer at Poet Software, one of several companies shipping the first XML tools. "XML allows self-describing documents," he told me during a stop at Poet's booth in the Microsoft Partners Pavilion. "If you look at the tags in HTML, they're really just formatting tags. They don't really say anything about what's in the text. An HTML tag would just say 'this is bold' or 'this is blue.'"

XML tags go much further than HTML by letting you define the actual content within a set of tags. For example, an XML tag might define a word or series of words as the name of a person (<person>Rick Shaw</person>) or technology (<technology>XML</technology>). If you had a long document or series of documents you wanted to search for names, you could easily pull out all of them. With straight HTML you couldn't do that, since the search engine would have no way to know which words represented names. If you searched for "name" in HTML, you'd just find every instance of the word "name."
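To make the search idea concrete, here is a minimal sketch in Python that uses the standard library's xml.etree.ElementTree module to pull every tagged name out of a small document. Only the <person> and <technology> tags come from the example above; the <article> and <para> wrapper tags, the sample sentences, and the find_tagged helper are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. Only <person> and <technology> come from
# the article; <article> and <para> are invented wrappers.
SAMPLE = """<article>
  <para><person>Rick Shaw</person> works with <technology>XML</technology>.</para>
  <para><person>Adam Bosworth</person> says the word name is not a name.</para>
</article>"""


def find_tagged(xml_text, tag):
    """Return the text of every element with the given tag, in document order."""
    root = ET.fromstring(xml_text)
    return [elem.text for elem in root.iter(tag)]


people = find_tagged(SAMPLE, "person")
technologies = find_tagged(SAMPLE, "technology")
print(people)
print(technologies)
```

Note that the plain word "name" in the second paragraph is not returned, because the lookup matches on tags rather than on the text itself.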

What XML can do

When a document is fully tagged with XML, you can do amazing things with it, said Kari Johnson of Chrystal Software, a Xerox spin-off and one of the other XML tools vendors in the Microsoft Partners Pavilion. XML lets you view a document in whatever form you want to see it, rearranging its structure on the fly, or just pulling out sections that contain the information you're looking for. You can break a document into as many fragments of text as you want and tag them all, she said. In an extreme case, you could apply an XML tag to every word, although that probably wouldn't make sense. You can even define your own new XML tags, allowing for infinite tagging possibilities.
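A rough sketch of what pulling out sections and rearranging a document might look like, again in Python with xml.etree.ElementTree. Every tag and attribute name here (<report>, <section>, <title>, <body>, topic) is invented for illustration, which is itself the point: nothing in XML restricts you to a fixed tag set. This is not how any of the vendors' tools work, just one way to show the idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical report; all tag and attribute names are invented.
REPORT = """<report>
  <section topic="pricing"><title>Pricing</title><body>Costs are falling.</body></section>
  <section topic="tools"><title>Tools</title><body>Editors are shipping.</body></section>
  <section topic="pricing"><title>Discounts</title><body>Volume deals exist.</body></section>
</report>"""


def sections_about(root, topic):
    """Pull out only the sections whose topic attribute matches."""
    return [s for s in root.findall("section") if s.get("topic") == topic]


def titles_sorted(root):
    """Present the same sections in a different order: alphabetical by title."""
    return sorted(s.findtext("title") for s in root.findall("section"))


root = ET.fromstring(REPORT)
pricing_titles = [s.findtext("title") for s in sections_about(root, "pricing")]
all_titles = titles_sorted(root)
print(pricing_titles)
print(all_titles)
```

The source document is never modified; each view is computed from the same tagged text.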

XML is not just another nifty new technology in search of a market, Johnson said. Although most Internet World attendees who stopped by the Chrystal Software booth didn't understand XML, they did understand the problems it can solve, she said. "That's a good sign, because it means not only is this a good technology, but there's also a market behind XML, a need for this," she said.

Large companies will be among the first to use XML, Johnson predicts, since it can help them catalog, search, and retrieve information from their vast stores of documents. Because XML is extensible, it can describe data contained in a wide variety of applications, from entire collections of Web pages to individual database records.
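As an illustration of describing an individual database record, the following Python sketch wraps a record's fields in XML elements. The record_to_xml helper, the purchaseOrder tag, and the field names and values are all hypothetical; a real application would choose tags that match its own data.

```python
import xml.etree.ElementTree as ET


def record_to_xml(tag, record):
    """Wrap a dict of field names and values in an element named tag."""
    elem = ET.Element(tag)
    for field, value in record.items():
        child = ET.SubElement(elem, field)
        child.text = str(value)
    return elem


# Hypothetical purchase-order record, standing in for a database row.
order = {"number": 1042, "customer": "Example Corp", "total": "199.00"}
xml_text = ET.tostring(record_to_xml("purchaseOrder", order), encoding="unicode")
print(xml_text)

# Any XML-aware program can read the fields back by name.
customer = ET.fromstring(xml_text).findtext("customer")
print(customer)
```

Because each value travels with a tag naming what it is, a program that has never seen the original database can still find the customer or the total.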

Microsoft is betting most people will want to display XML information in a Web browser. That's why the company is building support for XML into Internet Explorer, and in the future plans to add it to other Microsoft products as well. To push XML along, Microsoft is working closely with the W3C to develop the XML 1.0 specification.

A lot more to come

Since XML is just getting started, we'll be sure to give you a lot more information on MSDN Online as this new technology develops. Be sure to check the XML pages of the Microsoft Site Builder Network Web site (http://www.microsoft.com/xml/) too. In the meantime, think how much better off your users would be if you could give them a better way to organize, search, and view information in your applications.
