Motivation I: Sharing Files Between Components

The first computers, whether mechanical like an abacus or electronic like ENIAC, had no notion of "storage" of any kind: they were nothing more than mechanical or electronic calculators. As the idea of a computer evolved, it brought with it the idea of a physical storage device (punch cards, paper tape, and magnetic media such as disks, drums, and reel tapes) on which the computer could write information and recall it at a later time. An application at this time was simply the code that ran on the computer: one computer, one application. That single application was the heart of the computer, and it controlled all aspects of reading and writing information to the storage device, as illustrated in Figure 7-1.

Figure 7-1.

When computers ran only one application, the application had complete control over the storage device. On a hard disk, the application controlled the absolute sectors in which it stored information.

In that era, computer programming gurus were skilled at optimizing throughput by taking into account the rotation speed of a disk or a drum. They arranged data so that when the program needed the next set of data from the device, those sectors would be directly under the read head. Such were the days of real programming. None of this wimpy user interface glitz. (Just kidding.)

But these skills became obsolete with the advent of the operating system, which took control of system resources in order to allow multiple applications to run together on the same computer. Those applications now had to share system resources, such as the storage device, without overwriting one another's data. For this reason, applications had to ask the operating system (or, more accurately, the file system) for a file handle. The file handle represented space on the storage device set aside for the exclusive use of the application that opened the file and owned the file handle. When the application wrote information to that file, the file system found free sectors on the disk in which to store the information and kept a table describing which sectors contained the contents of the file and in what order. The idea of a file is unknown to the storage device itself, which understands only sectors. The file system is a piece of code that manages the allocation of those sectors, requiring applications to work through a conceptual file that maps information to certain sectors on the storage device, as shown in Figure 7-2. In this way, the file system prevented conflicts between applications.

Figure 7-2.

The file system introduced the concept of files to prevent conflicts between applications that shared a common storage device.
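The bookkeeping described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how any real file system is implemented: the names `ToyDisk`, `ToyFileSystem`, and `SECTOR_SIZE` are hypothetical, the "disk" is a list in memory, and the sector size is kept tiny so the table is easy to read.

```python
SECTOR_SIZE = 4  # hypothetical, deliberately tiny sector size


class ToyDisk:
    """A storage device that understands only numbered, fixed-size sectors."""

    def __init__(self, sector_count):
        self.sectors = [bytes(SECTOR_SIZE)] * sector_count


class ToyFileSystem:
    """Maps file names to ordered lists of sector numbers on a ToyDisk."""

    def __init__(self, disk):
        self.disk = disk
        self.free = list(range(len(disk.sectors)))  # unallocated sector numbers
        self.table = {}  # file name -> sector numbers, in content order

    def write(self, name, data):
        """Append data to the named file, taking free sectors as needed.

        Simplification: every write starts in a fresh sector and short
        chunks are zero-padded, so read() returns that padding as well.
        """
        chain = self.table.setdefault(name, [])
        for start in range(0, len(data), SECTOR_SIZE):
            if not self.free:
                raise OSError("disk full")
            sector = self.free.pop(0)
            chunk = data[start:start + SECTOR_SIZE]
            self.disk.sectors[sector] = chunk.ljust(SECTOR_SIZE, b"\0")
            chain.append(sector)

    def read(self, name):
        """Reassemble a file by following its sector list in order."""
        return b"".join(self.disk.sectors[s] for s in self.table[name])


# Two "applications" write to their own files on the same device.
fs = ToyFileSystem(ToyDisk(sector_count=8))
fs.write("a.dat", b"AAAABBBB")
fs.write("b.dat", b"1111")
fs.write("a.dat", b"CCCC")
```

After these calls the sectors belonging to `a.dat` are no longer adjacent on the device, yet each application still sees its own file as one contiguous run of bytes, and neither can touch the other's sectors.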

The idea of a file system was a boon to application developers. They no longer needed to understand the intricate details of disk controllers and sectors. Instead, they could ask for a file—which appeared as a flat, contiguous array of bytes—in which they could create any structures they wanted. Applications relinquished control of the device in order to gain this convenient way to share it.

For a long time, operating system APIs and language-based run-time libraries have provided applications with many satisfactory ways of working with singular file entities. Using these technologies, applications have made some incredible innovations in the ways they deal with a single stream of information, providing features such as incremental fast saves and garbage collection within a file.

But OLE as a technology changes the scene drastically. In a component software environment, an application is no longer a monolith that controls every aspect of its storage. Instead, an application might be built from many different components, written by different developers at different points in time. Those components still require a way to store their own persistent information. At the same time, all that information has to end up in a single file, as the end user understands it, because users perceive applications as unified entities rather than as an aggregation of disparate components.

Thus, component integration requires the ability for multiple components to share storage contained within a single file on the underlying file system. This is exactly the same problem that operating systems had to solve when they enabled multiple applications to share system resources. The operating system solution was to create a file system that provided a level of indirection between an application and the underlying device. That abstraction was the file. The solution for component integration is another level of indirection: a file system within a file, in which components can deal with entities called storages and streams that each correspond to specific areas within the file, as shown in Figure 7-3.

Figure 7-3.

A file system within a file enables multiple components to share the resources of a single file.

OLE's Structured Storage is the model that defines this second layer of abstraction, which involves not only streams that act like files but also storages that act like directories on a file system. Just as a file system removes from applications the burden of managing disk sectors, Structured Storage removes from components the burden of sharing a file. It is also a very powerful way to manage files even for an application that is not built of components. Where components are involved, Structured Storage is a necessity; where they are not, this technology is a gift. In either case, OLE has provided the next step in the evolution of storage.
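To make the directory and file analogy concrete, here is a minimal in-memory sketch of the storage and stream hierarchy. It is an illustration under stated assumptions, not the OLE API (OLE exposes these ideas through the IStorage and IStream interfaces): the `Storage` and `Stream` classes, their method names, and the element names "Contents", "Chart", "Data", and "Layout" are all hypothetical, and nothing is written to disk.

```python
class Stream:
    """Acts like a file: a flat, growable run of bytes."""

    def __init__(self):
        self._data = bytearray()

    def write(self, data):
        self._data.extend(data)  # append to the end of the stream

    def read(self):
        return bytes(self._data)


class Storage:
    """Acts like a directory: holds named streams and nested storages."""

    def __init__(self):
        self.elements = {}

    def create_storage(self, name):
        return self._add(name, Storage())

    def create_stream(self, name):
        return self._add(name, Stream())

    def _add(self, name, element):
        if name in self.elements:
            raise FileExistsError(name)
        self.elements[name] = element
        return element

    def walk(self, prefix=""):
        """Yield (path, stream) for every stream beneath this storage."""
        for name, element in sorted(self.elements.items()):
            path = prefix + "/" + name
            if isinstance(element, Storage):
                yield from element.walk(path)
            else:
                yield path, element


# The application owns the root storage and writes its own stream...
root = Storage()
root.create_stream("Contents").write(b"document text")

# ...then hands a sub-storage to a (hypothetical) chart component,
# which lays out its own streams without knowing about the rest.
chart = root.create_storage("Chart")
chart.create_stream("Data").write(b"1,2,3")
chart.create_stream("Layout").write(b"bar")

paths = [path for path, _ in root.walk()]
```

The point of the sketch is the division of responsibility: the chart component sees only the storage it was given, while the whole tree hangs off one root that, in a real implementation, would correspond to a single file on disk.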