It's Not Just a .doc and .xls World Anymore

IMHO
December 18, 1997

John Swenson
MSDN Online

With all the announcements Microsoft makes these days, it's easy for important news to slip by without notice. It would have been easy, for example, to miss the relatively quiet December 15 announcement that the next version of Office will feature HTML as a companion file format to Microsoft's proprietary Office file formats.

When I read the December 15, 1997 Microsoft press release, "Microsoft Office Breaks Ground by Adopting HTML Standard as File Format" (http://www.microsoft.com/corpinfo/press/1997/Dec97/htmlpr.htm), I did a double take. You mean after all these years of working with Word documents, Excel spreadsheets, and other Office file formats, Microsoft is suddenly going to let millions of Office users start working with a single, standard .htm Web file format? Wow.

Reading further, I saw that the next version of Office will continue to let users open, save, and create .doc files, .xls files, .ppt files, and other native Office file formats. But Office users who want to switch to a native Web format will be able to work with all their Office documents as .htm files, or convert any native Office file to HTML.

But can't you already make this HTML conversion in Office 97, I wondered, simply by choosing Save as HTML . . . in the File menu of each Office application? The press release was short on details, so I called Andrew Dixon, a product manager on the Office team.

Separate but equal

"The best way to describe this is that we're elevating HTML to the same level as our own proprietary Office file formats," Dixon explained.

This improved HTML support in Office will enable seamless "round-tripping" between HTML file formats and native Office file formats. In other words, users will be able to switch their Office documents back and forth between HTML and native Office file formats at any time, without losing any formatting.

Office users will be able to save a Word document in HTML, for example, and open it back up in Word (or a browser) while preserving all important data such as PivotTable dynamic views and complex charts and tables. Even long documents filled with editing marks and WordArt will look exactly the same whether they're saved as Word documents or HTML files.

The same goes for documents created in any Office application, including Excel, PowerPoint, Access, and Outlook. Excel users, for example, will be able to save their spreadsheets as HTML and still preserve their PivotTables.

Today, Office documents saved as HTML look very similar to their native-format versions, but not identical. In the future, they will look identical. This means anyone using a Web browser on any platform will be able to open Office documents and see them exactly as they should look, even without Office installed on their PC.

How'd they do that?

This seamless back-and-forth switching between file formats is possible because some clever developers on the Office team figured out how to save Office documents in HTML without losing any of the rich document formatting possible with Excel, Word, or the other Office applications.

Microsoft couldn't accomplish this file-format conversion trick by using straight HTML, though. The next version of Office will also rely on XML (Extensible Markup Language) to preserve richly formatted Office documents in the .htm format.

In case you're unfamiliar with XML, this is the new Web technology that made a big splash at the December Internet World 1997 trade show in New York City. See the "XML: One Hot Abbreviation, but What Does It Mean?" article I wrote on the topic for more information.

XML complements HTML rather than replacing it. It provides a standard format to describe different types of data, so that the information can be decoded, manipulated, and displayed consistently and correctly. Like HTML, XML is an industry standard, or is at least in the process of becoming one. (Microsoft is working closely with the W3C to develop the XML 1.0 specification, which is now in the "proposed recommendation" stage.)
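To make "describing data" a little more concrete, here is a minimal sketch in Python. The element and attribute names (`sheet`, `cell`, `ref`, `type`) are invented for this illustration; they are an assumption and not the markup Office will actually use, which Microsoft has not detailed. The point is only that once each value is labeled with what it is, any program can decode it the same way.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these element and attribute names are made up for
# illustration and are not taken from any Office file format.
SAMPLE = """
<sheet name="Budget">
  <cell ref="A1" type="text">Widgets</cell>
  <cell ref="B1" type="number">1250.50</cell>
  <cell ref="C1" type="date">1997-12-15</cell>
</sheet>
"""

def decode_cells(xml_text):
    """Return {cell reference: value}, converting each value by its declared type."""
    # Only "number" gets a real conversion here; other types stay as strings.
    converters = {"number": float, "text": str, "date": str}
    root = ET.fromstring(xml_text)
    cells = {}
    for cell in root.iter("cell"):
        convert = converters.get(cell.get("type"), str)
        cells[cell.get("ref")] = convert(cell.text or "")
    return cells

cells = decode_cells(SAMPLE)
print(cells)
```

Because the type travels with the value, the reader does not have to guess whether "1250.50" is a number or a label.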

Rather than get bogged down in a technical explanation of how Office will use XML, I'll stick to the topic of the file-format change.

The decision to make HTML a companion file format to the native Office file formats was "an incredibly important design decision for the next version of Office," Dixon says.

The chief reason for making this big file-format switch is—you guessed it—the rising importance of the Web. Letting Office users save their documents as HTML will make it a snap for companies and other organizations to post documents on their intranet and Internet Web sites. If Office users save their original documents as HTML files, there won't even be any conversion process. Documents will be able to go straight onto the Web.

Making HTML a standard Office file format also promises to eliminate the file-exchange headache for organizations trying to share their documents with the outside world. Users will be able to send Office documents via e-mail and know the person at the other end of the line can open the documents with all their formatting intact.

The developer opportunity

So what is this file-format change likely to mean for developers? A lot. Since HTML and XML are industry-standard file formats, the next version of Office should open the door to all sorts of new third-party and custom applications, Dixon says. Any application that supports HTML will be able to open Office documents and edit them, allowing developers to create new applications linked to Office. "That opens all kinds of doors," Dixon says.
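As a rough illustration of why a standard format lowers the barrier for developers, the Python sketch below pulls a table out of an .htm document using nothing but a general-purpose HTML parser. The sample markup is hand-written for this example; it is an assumption and not output from any version of Office, and the `TableReader` and `read_table` names are hypothetical.

```python
from html.parser import HTMLParser

# Hand-written sample standing in for a saved document; not real Office output.
SAMPLE_HTML = """
<html><body>
<table>
  <tr><td>Region</td><td>Sales</td></tr>
  <tr><td>East</td><td>1200</td></tr>
  <tr><td>West</td><td>950</td></tr>
</table>
</body></html>
"""

class TableReader(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            if not self.rows:  # tolerate a cell that appears before any row
                self.rows.append([])
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Accumulate text only while inside a cell.
        if self._in_cell:
            self.rows[-1][-1] += data

def read_table(html_text):
    """Parse html_text and return its table cells as a list of rows."""
    reader = TableReader()
    reader.feed(html_text)
    reader.close()
    return [[cell.strip() for cell in row] for row in reader.rows]

rows = read_table(SAMPLE_HTML)
print(rows)
```

Nothing here depends on knowing a proprietary binary layout, which is the kind of door Dixon is talking about.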

It's still too early to discuss what the new opportunities might be for third-party developers, he says. (Microsoft isn't talking release dates yet for the next version of Office, in case you were wondering.) But forward-thinking developers can use their imaginations and start thinking now about how their applications might be able to take advantage of an Office that uses HTML as a standard file format.

With the rapid rise of the Web, it was inevitable that the Office team would tie the suite even more tightly to HTML. But until now, who would have predicted they'd find a way to create Word, Excel, PowerPoint, and Access documents in HTML without sacrificing any formatting?
