This article is adapted from Lessons from the Component Wars: An XML Manifesto, which can be found on MSDN Online.
Over the past two years, XML seems to have overtaken the Java language and Design Patterns as the solution to whatever ills plague the software industry. In this month's column I will discuss the emerging use of XML as a component technology after first taking a look at other more established component technologies such as COM and CORBA. As someone who has spent the last six years working with COM, I believe the primary goal of component software is to enable collaboration and cooperation among software development organizations, and the primary function of a component technology is to act as glue between multiple pieces of software. This is true of COM, the Java language, and CORBA. These three technologies provide infrastructure for integrating software components written by independent organizations. From the 10,000-foot view, these technologies look the same. If you look up close, however, each technology uses radically different techniques and programming styles to achieve its goals.
In-memory Interoperation
Figure 1 Degrees of Interoperation
Mixing multiple components in memory is easily the most intimate interoperation possible. By standardizing an in-memory representation that all components must adhere to, a component technology can offer extremely high performance. Additionally, having a standardized in-memory representation allows the supporting runtime to offer a wider array of component management services at a substantially lower performance cost. COM standardizes the in-memory representation of object references based on simple C++-style virtual function tables. This makes in-process COM easy to support on just about any platform. The Java language standardizes the representation of component code, and each Java virtual machine defines its own in-memory representation for objects. The advantage of this approach is that each virtual machine implementor is theoretically free to innovate while still building on a common component format. The disadvantage is that components must run in the same virtual machine to interoperate, which, in the presence of versioning, is not always possible. The CORBA specification punts on in-memory representation, as the original goal of CORBA was to provide an object-based Remote Procedure Call (RPC) system.
Source Code Interoperation
Type Information Interoperation
Wire Interoperation
Components and Culture Clash
Component Technology for the Web
XML as a Component Technology
Here, customer number 4462 is ordering items 3352 and 1829. Assuming that both the sender and receiver understand what this means, everything is great. But what if the sender wanted to annotate this message with additional information, for example by adding an identifier to the order that associates it with a larger financial transaction? You could imagine the sender simply adding a transid attribute to the order.
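As a minimal sketch of this kind of annotation, the following Python fragment builds one possible order message and adds an unqualified transid attribute to it. The element and attribute names (order, item, custid, id), the sample value TX-0001, and the annotate helper are all assumptions made for illustration; only the customer and item numbers come from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical order message: the element and attribute names are assumptions,
# chosen only to match the prose (customer 4462 ordering items 3352 and 1829).
BASE_ORDER = '<order custid="4462"><item id="3352"/><item id="1829"/></order>'


def annotate(order_xml: str, name: str, value: str) -> str:
    """Return the message with one extra, unqualified attribute on the root."""
    root = ET.fromstring(order_xml)
    root.set(name, value)
    return ET.tostring(root, encoding="unicode")


# "TX-0001" is a made-up financial transaction identifier.
annotated = annotate(BASE_ORDER, "transid", "TX-0001")
print(annotated)
```

A receiver written against the unannotated message now sees an attribute it was never told about, and nothing in the message says who defined it.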
However, because the application receiving the message may have been developed independently from the sending application, there are several potential problems. For one, the receiver may not allow additional attributes or elements to be added to a message. If the receiver interprets the presence of this attribute as a parsing error, the request will fail. To deal with this problem, newer XML description technologies (such as Microsoft XML-Data) allow XML vocabularies to be defined as either open or closed. A closed vocabulary cannot be extended beyond what is described in the base vocabulary schema. An open vocabulary can be extended, with the receiving application deciding how to interpret extended elements and attributes. Depending on the application, unrecognized extensions to a vocabulary can often be ignored.
Assuming that the order message shown earlier is part of an open XML vocabulary, it should be safe to add the transid attribute. However, what if the receiver of the request also wanted to extend the vocabulary? What if the receiver had defined a new attribute for associating orders with low-level database transactions? If the receiver had the misfortune of choosing transid as the attribute name, then the sender's financial transaction ID would be misinterpreted as a low-level database transaction ID.
To solve this problem, the W3C added namespaces to XML. XML namespaces allow attributes and elements to be scoped by a URI, so the sender can qualify its transid attribute with a namespace URI and add it to the order request unambiguously.
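To make the scoping concrete, here is a small Python sketch using the standard library's ElementTree parser. The namespace URI is the one named in the text; the fin prefix, the message layout, and the sample value are assumptions, not a reproduction of the original listing.

```python
import xml.etree.ElementTree as ET

FIN_NS = "http://money.org/FinancialXML/ns"  # namespace URI quoted in the text

# Hypothetical message: the "fin" prefix, element names, and values are assumptions.
QUALIFIED_ORDER = (
    '<order xmlns:fin="http://money.org/FinancialXML/ns" '
    'custid="4462" fin:transid="TX-0001">'
    '<item id="3352"/><item id="1829"/></order>'
)

root = ET.fromstring(QUALIFIED_ORDER)

# ElementTree exposes namespace-qualified names in "{uri}local" form,
# so the prefix chosen by the sender does not matter to the receiver.
financial_id = root.get(f"{{{FIN_NS}}}transid")

# A lookup by the bare name finds nothing: the qualified attribute cannot be
# mistaken for an unqualified transid defined by someone else.
unqualified_id = root.get("transid")

print(financial_id, unqualified_id)
```

The receiver matches on the URI rather than on the prefix, which is why two independently developed applications can pick the same local name without colliding.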
When a receiver parses a request annotated this way, it can detect that the transid attribute is scoped by the namespace http://money.org/FinancialXML/ns and is not the same as the transid attribute used to represent database transactions (which would have a different namespace URI). In fact, XML namespaces allow both transid attributes to appear in the same request unambiguously.
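A request carrying both attributes might look like the sketch below. Only the financial namespace URI comes from the text; the database namespace URI, both prefixes, the sample values, and the transaction_ids helper are hypothetical.

```python
import xml.etree.ElementTree as ET

FIN_NS = "http://money.org/FinancialXML/ns"   # from the text
DB_NS = "http://example.org/DatabaseXML/ns"   # hypothetical URI for illustration

# Two attributes share the local name "transid" but live in different namespaces.
BOTH = (
    f'<order xmlns:fin="{FIN_NS}" xmlns:db="{DB_NS}" custid="4462" '
    'fin:transid="TX-0001" db:transid="77">'
    '<item id="3352"/><item id="1829"/></order>'
)


def transaction_ids(order_xml: str) -> dict:
    """Read each transid by its namespace-qualified name."""
    root = ET.fromstring(order_xml)
    return {
        "financial": root.get(f"{{{FIN_NS}}}transid"),
        "database": root.get(f"{{{DB_NS}}}transid"),
    }


print(transaction_ids(BOTH))
```

Each party reads only the attribute scoped by the URI it owns and can ignore the other, which is the behavior an open vocabulary relies on.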
Despite the current energy being dedicated to XML-based type description, to date namespaces are still the most enabling enhancement to XML that has come out of the W3C.
XML can be loosely typed
Due to the use of open vocabularies and namespaces, XML can support loosely typed communications. While strong typing has many benefits (and is supported by XML using DTDs or their equivalents), it is extremely easy to build loosely typed systems using XML. This makes XML adaptable to generic application frameworks, data-driven applications, and rapid development scenarios such as disposable or transient Web-based applications. Many ADO Recordset fans are using the Microsoft XML parser (MSXML) to replace the Recordset as a data transfer mechanism for both its cross-platform benefits and its superior support for nontabular data.
XML translates XML
Simply adopting XML as a component integration technology does not completely solve the interoperability problem. Though much of the industry is embracing XML as an interoperability technology, this only pushes the interoperability problem up one level of abstraction. Even if the entire industry were to shift to XML overnight, this alone would not help, as different organizations are likely to use different XML vocabularies to represent the exact same information. Granted, there are currently industry-wide initiatives to standardize domain-specific XML vocabularies (such as BizTalk, FinXML, and OASIS). However, it is not known whether any of these efforts will achieve 100 percent penetration in a particular application domain.
Fortunately, the lack of standardized vocabularies can be addressed using XML technology. In particular, in the presence of two competing vocabularies, it is likely that application-level gateways will transform requests from vocabulary A into requests in vocabulary B. An even more promising solution lies in XML transforms, also called XSL transformations (XSLT).
XSLT allows one XML vocabulary to be transformed into another by specifying the transformation rules (in XML, of course). XSLT was originally devised to map XML to HTML, but is currently being applied in a variety of more interesting scenarios. To understand the power and elegance of XSLT, consider an XSLT document that converts all of the attributes of an XML document into elements.
If you were to run this transform over the <order> fragment shown earlier, each attribute in the fragment would appear in the output as an element instead.
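The effect of an attribute-to-element transform can be sketched without an XSLT processor. The Python below is not XSLT: it performs the same conversion imperatively with the standard library's ElementTree, purely to show the input and output shapes. The order message layout and the attributes_to_elements helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(elem: ET.Element) -> ET.Element:
    """Return a copy of elem in which every attribute becomes a child element."""
    out = ET.Element(elem.tag)          # same tag, but no attributes
    out.text = elem.text
    out.tail = elem.tail
    for name, value in elem.attrib.items():
        child = ET.SubElement(out, name)  # attribute name becomes element name
        child.text = value                # attribute value becomes element text
    for sub in elem:
        out.append(attributes_to_elements(sub))  # recurse into child elements
    return out


# Hypothetical order message; element and attribute names are assumptions.
ORDER = '<order custid="4462"><item id="3352"/><item id="1829"/></order>'

converted = attributes_to_elements(ET.fromstring(ORDER))
print(ET.tostring(converted, encoding="unicode"))
```

In an XSLT, the same rule is stated declaratively as a template rather than as a recursive function, which is what lets a generic processor apply it to any document.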
Each XML parser can have its own proprietary way to apply an XSLT. With the MSXML parser, a short JScript program will do the job.
Note that this generic five-line script will work on arbitrary XSLT transformations. Also note that the XSLT does not have to generate valid XML: a transform run over the same input <order> element can just as well emit output in a non-XML format.
As these two examples illustrate, XSLT makes it easy to automate translation from an XML vocabulary to other formats, including other XML vocabularies.
Summary
Have a question about programming with COM? Send your questions via email to Don Box: dbox@develop.com or http://www.develop.com/dbox.
From the January 2000 issue of Microsoft Systems Journal.