Chapter 17 OLE Documents and Embedding Containers

There are things, and there are places to put things.

Tony Williams, Microsoft OLE Architect

In our kitchen, my wife and I have about 150 resealable plastic storage units in all shapes and sizes, from tiny ones that hold barely a quarter cup to ones that should hold enough salad to feed a cast of thousands. (OK, so I'm exaggerating slightly.) Some are square, some are round, some have such unusual shapes that they can't be stacked in our freezer. Some are clear, some are bold seventies colors (such as avocado), and some are recycled yogurt containers that say, "Sell by June 1, 1962." (OK, so I'm exaggerating again.) Some we obtained as gifts from Mom, some we bought ourselves, and at least one I found at a campground in the Cascade Mountains. (No, I'm not exaggerating.)

Besides the ubiquitous plastic, there are boxes, bins, pots, jugs, jars, cans, bags, baskets, bottles, and small paper packets containing powders with ingredient lists long enough to choke a toastmaster. What lives inside these various storage units is just as diverse. Some contain dry goods such as assortments of beans, lentils, split peas, stone-ground whole wheat flour, rye flour, brown rice, couscous, millet, Wheat Chex, and spinach grown according to the California Organic Foods Act of 1990. Others, in the refrigerator, hold at least five different kinds of soups, last week's radishes, tomorrow's lunch, and usually some sort of edible yet unidentifiable leftover.

The problem with kitchen storage is that the stuff you want to put in a given container might not necessarily fit the container. Carrots, for example, do not lend themselves to storage in an egg carton. What we would really like is that any container have at least a basic set of attributes that allow it to contain any kind of stuff, regardless of how otherwise bizarre that container might be. In addition, we would really like to have foods that all share a basic set of attributes that would allow them to fit into any of these standardized containers, regardless of how otherwise fantastic the food might be.

I doubt this will ever happen with food storage, but the same problem exists in computing when we are trying to integrate arbitrary or unstructured data from different sources into one centralized place, which we call a compound document. By unstructured data I mean information whose internal format is not known to the application that manages the compound document—the data is simply seen as a blob of bytes. Now, it has always been relatively easy to have two specific applications exchange specific structured data when both applications understand the exact data formats in question. Such intimate knowledge allows the applications to fit together as well as eggs fit in an egg carton. But just as an egg carton really doesn't work well to store anything but eggs, such a specialized interface between applications is not all that useful to other applications.

How, then, can we create applications that can deal in a generic way with unstructured data from any other application? How can we create a container application—the one that manages a compound document—that uses information from any source without intimate knowledge of the source or the information itself? And how can we create a source of such data that needs no intimate knowledge about potential containers? The obvious solution is some sort of central standard that both sides recognize. In other words, a container application needs to view all sources as conforming to some generic prototype so that the container can treat all sources polymorphically. In the same manner, all sources need to see all containers as conforming to a prototype of their own. In this way, any container can use any source; any source can work with any container. Information is then freely shareable among them all.

OLE Documents is the specification that defines these two prototypes: one for compound document containers, or simply containers, and the other for compound document servers, also called sources or servers. OLE Documents is the means of integrating unstructured data from any arbitrary source in any arbitrary compound document (the persistent file) being managed by the container. The unit of exchange is called the compound document content object, or simply content object. (In the context of this and most of the chapters that follow, object is used to mean the same thing.) Each content object has its own identity—a CLSID—to uniquely mark its type as well as to identify the server code that knows how to manipulate that data at the container's request. Content objects encapsulate their internal data formats and manipulation code behind a set of interfaces that define the prototype. These interfaces provide for persistence, structured data exchange, viewing, caching, and what we call activation of the user interface in which the user can manipulate that data.
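To make the idea of two prototypes a little more concrete, here is a minimal sketch in portable C++. It is an illustration only and assumes nothing about the real definitions: ContentObject and ContainerSite are hypothetical stand-ins, not the actual OLE interfaces such as IOleObject and IOleClientSite, and the class identifiers are made-up strings rather than registered CLSIDs. The point is simply that the container works with every object through one generic prototype, and every object works with its container through another.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container-side prototype (NOT the real IOleClientSite).
struct ContainerSite {
    virtual ~ContainerSite() = default;
    virtual void SaveRequested(const std::string& classId) = 0;
};

// Hypothetical server-side prototype (NOT the real IOleObject).
struct ContentObject {
    virtual ~ContentObject() = default;
    virtual std::string ClassId() const = 0;   // marks the type and its server code
    virtual std::string SaveBlob() const = 0;  // opaque bytes as far as the container knows
    virtual std::string Render() const = 0;    // a presentation the container can display
    virtual void Activate(ContainerSite& site) = 0;
};

// Two unrelated "servers"; the class IDs are invented for this sketch.
struct ChartObject : ContentObject {
    std::string ClassId() const override { return "{00000000-0000-0000-0000-000000000001}"; }
    std::string SaveBlob() const override { return "3,1,4,1,5"; }
    std::string Render() const override { return "[chart]"; }
    void Activate(ContainerSite& site) override { site.SaveRequested(ClassId()); }
};

struct NoteObject : ContentObject {
    std::string ClassId() const override { return "{00000000-0000-0000-0000-000000000002}"; }
    std::string SaveBlob() const override { return "remember the radishes"; }
    std::string Render() const override { return "[note]"; }
    void Activate(ContainerSite& site) override { site.SaveRequested(ClassId()); }
};

// A container that never inspects an object's internal data format.
class Document : public ContainerSite {
public:
    void Embed(std::unique_ptr<ContentObject> obj) { objects_.push_back(std::move(obj)); }

    std::vector<std::string> RenderAll() const {
        std::vector<std::string> views;
        for (const auto& obj : objects_) views.push_back(obj->Render());
        return views;
    }

    void ActivateAll() {
        for (auto& obj : objects_) obj->Activate(*this);
    }

    // The object calls back through the container prototype.
    void SaveRequested(const std::string& classId) override { saveRequests_.push_back(classId); }

    const std::vector<std::string>& SaveRequests() const { return saveRequests_; }

private:
    std::vector<std::unique_ptr<ContentObject>> objects_;
    std::vector<std::string> saveRequests_;
};
```

Notice that Document compiles without knowing that ChartObject or NoteObject exist. A third kind of object could be embedded later without touching the container code, which is the property the real interfaces are meant to provide across process and application boundaries.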

OLE Documents is the last of the various means that OLE provides to share and integrate information. In Chapter 10, we saw how to exchange structured data through IDataObject. Chapter 11 explored how to view and cache graphical data, and Chapters 12 and 13 examined the exchange of structured data through the OLE Clipboard and OLE Drag and Drop. With OLE Automation, discussed in Chapters 14 and 15, we saw how data is shared through individual properties, and in Chapter 16, we saw how to share properties through persistent property sets as well as through the user interface of property pages. OLE Documents completes the picture, exchanging information through unstructured blobs. In fact, OLE Documents uses many of these other technologies to fulfill the necessary parts of its own protocol.

Nevertheless, OLE Documents is a rich technology, and we'll take the next seven chapters to explore it all. Overall, the number of new interfaces is relatively small. We'll see, for example, IOleObject, IOleClientSite, and IRunnableObject. Most of what this and the following chapters discuss is the set of protocols by which applications interact, through these and other interfaces we've already seen, to make OLE Documents work. In addition, the user interface involved in object activation will be a significant topic in these chapters.

In this chapter, we'll look specifically at the architecture for OLE Documents as a whole, including additional object states we have not yet encountered. We'll then examine embedded content objects, a mechanism in which an object's unstructured data is stored inside the compound document directly, using storage-based persistence. Embedding is the most basic form of OLE Documents, and it forms the basis for everything else in this technology. Once we look at what embedded objects are and how they behave, we'll see the details of container-side implementation as we enhance the Patron sample to work with OLE Documents.

In Chapter 18, we'll look at the implementation details of a local server as we enhance the Cosmo sample to serve up Polyline Figures as embeddable objects. In Chapter 19, we'll complete our discussion of the basic embedded object by looking at object handlers and in-process servers for embedded objects, creating a rendering handler for Cosmo and also enhancing the Polyline sample to serve embedded objects as well. We'll see that both in-process handlers and servers have some special issues when dealing with the data cache and other container-side considerations.

Chapters 20 and 21 will build on what we know about embedded objects and explore linked content objects, in which the object's unstructured data is not stored directly in the compound document itself. Rather, that data is stored somewhere else, and the compound document includes a moniker that names that other place. Servers that support linking must supply these monikers and must also support the necessary mechanisms to bind those monikers, just as we saw in Chapter 9. So while we understand how monikers themselves work, Chapters 20 and 21 will show us how we move them from source to container to set up a link relationship.

Chapters 22 and 23 wrap up OLE Documents through a detailed discussion of in-place activation, which is a more document-centric user interface model than the one used for basic activation of an embedded object that we'll see in this chapter. In-place activation actually forms the basis for OLE Controls, so these chapters will lead naturally into Chapter 24, which covers the remaining details of OLE Controls.

Through these chapters, you'll see that OLE Documents truly enables any container to work with the data from any source—the server that provides that data—and that any server can provide data to any container, regardless of the nature of the compound document in that container. This means that we can, by analogy, fit lasagna noodles in a vinegar bottle, carrots in an egg carton, and ancho chile peppers in an ice-cube tray, without any trouble whatsoever. We can't do that in the kitchen, but, hey, this is just software...anything is possible.


Why MFC Is So Popular for OLE Documents

I imagine that you've already concluded that OLE Documents, not to mention much of the other material we've seen and have yet to see in this book, is complex. A set of protocols as powerful as OLE Documents, to be flexible enough to handle all the demands that are made of it, will be complex. The protocols themselves involve only a handful of functional interfaces, many of which we've already seen. If implementing an object with a few interfaces were all there was to it, everything in OLE would be easy. To make OLE Documents work, however, containers and servers need to not only implement and use various interfaces but also perform specific actions in a number of places around the rest of their code. For example, containers have to do certain things when they create, open, close, save, or rename a file. Servers have to do specific things when showing or hiding their window, working with files, and so on. What makes OLE Documents complex is all these little requirements strewn around an application, and that's mostly what the implementation sections in this and following chapters deal with. The Microsoft Foundation Classes (MFC) library makes OLE Documents much easier because it controls the application framework itself and so already has these pieces of code built in. You then need only implement the necessary customizations through virtual function overrides of the various C++ classes involved, which reduces the complexity tremendously. MFC is fabulously fit for working with OLE Documents and is well worth your time and investigation. This book will help you understand what MFC is doing by exploring the complete OLE Documents protocol in the raw.
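The general pattern behind this division of labor can be sketched in a few lines of C++. This is not MFC code. FrameworkDocument, OnWriteContents, and the logged strings are all hypothetical names invented for illustration. The sketch only shows the idea that a framework owns the fixed sequence of protocol steps around an operation such as saving a file, while the application overrides a single virtual function to supply what is specific to it.

```cpp
#include <string>
#include <vector>

// Hypothetical framework base class (NOT an MFC class).
class FrameworkDocument {
public:
    virtual ~FrameworkDocument() = default;

    // The framework owns the order of the protocol steps around a save.
    void Save(const std::string& path) {
        Note("tell embedded objects a save to " + path + " is starting");
        OnWriteContents(path);  // the one step the application customizes
        Note("tell embedded objects the save is complete");
    }

    const std::vector<std::string>& Log() const { return log_; }

protected:
    // Default behavior; applications override this.
    virtual void OnWriteContents(const std::string&) { Note("write nothing"); }

    void Note(const std::string& step) { log_.push_back(step); }

private:
    std::vector<std::string> log_;
};

// The application supplies only its own piece.
class MyDocument : public FrameworkDocument {
protected:
    void OnWriteContents(const std::string& path) override {
        Note("write my pages to " + path);
    }
};
```

Calling Save on a MyDocument runs the framework's bookkeeping before and after the override, so the application author never has to remember where those little requirements belong. In the raw protocol explored in these chapters, that bookkeeping is your responsibility.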