Limitations and Features of Compound Files
As mentioned earlier, OLE's Compound Files implementation doesn't support absolutely every part of the Structured Storage model. That's one of those things you call engineering: certain features were left out because of time constraints and because few developers would be interested in them anyway. Here, then, are the aspects of Compound Files that differ from the ideal storage model, along with a few notes regarding the implementation of this technology:
- Storage objects completely implement all the functions in IStorage except SetStateBits, which doesn't do anything because no legal state bits are defined at this time. Do not be tempted to use this function with custom flags.
- Infrequently used operations such as IStorage::EnumElements (as well as MoveElementTo, RenameElement, and DestroyElement) are not optimized for performance and can be very slow. Microsoft recommends that you not use EnumElements to manage a list of substorages and streams but rather that you store an extra stream that contains a cache of that list. You'll get much faster performance that way, with only a little extra coding.
- Stream objects in compound files do not support region locking, nor do they support being opened in transacted mode themselves. Thus, the IStream members LockRegion, UnlockRegion, and Revert are no-ops, while Commit does nothing more than flush internal buffers. When you make a change to a stream, you will not be able to revert to the previous contents unless you've made a separate copy.
- The Structured Storage specifications allow streams to contain up to 2^64 bytes; that is, the seek offset is a 64-bit value. OLE's implementation is limited to 2^32 bytes, using a 32-bit seek offset instead. Microsoft didn't see a 4-gigabyte limit as a problem.
- Stream allocation happens on a granularity of 512 bytes, so a stream with 10 bytes of data will occupy 512 bytes in the file, and a stream of 513 bytes will occupy 1024 bytes in the file.
- Seeking backward in a stream is somewhat slower than seeking forward because OLE's implementation uses a singly linked list to manage the noncontiguous blocks of file space that make up the stream.
- All element names are stored as Unicode characters regardless of platform.
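To make the 512-byte allocation arithmetic from the list above concrete, here is a minimal sketch in portable C++. It calls no OLE functions. The names kBlockSize, AllocatedSize, and SlackBytes are hypothetical helpers invented for illustration, and the treatment of a zero-length stream (no blocks at all) is an assumption rather than something the text states.

```cpp
#include <cassert>
#include <cstdint>

// Allocation unit for streams in a compound file, as given in the text.
constexpr std::uint64_t kBlockSize = 512;

// Hypothetical helper: the number of bytes a stream holding `dataSize`
// bytes occupies in the file when space is handed out in whole blocks.
// Assumption: a zero-length stream occupies no blocks.
constexpr std::uint64_t AllocatedSize(std::uint64_t dataSize) {
    return ((dataSize + kBlockSize - 1) / kBlockSize) * kBlockSize;
}

// Hypothetical helper: bytes that are allocated to the stream but unused.
constexpr std::uint64_t SlackBytes(std::uint64_t dataSize) {
    return AllocatedSize(dataSize) - dataSize;
}

// The two cases quoted in the text, checked at compile time.
static_assert(AllocatedSize(10) == 512, "10 bytes round up to one block");
static_assert(AllocatedSize(513) == 1024, "513 bytes round up to two blocks");
```

Under these assumptions, SlackBytes(10) is 502, so a 10-byte stream leaves almost all of its block unused.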
Besides performance issues, the only real limitations in compound files are the absence of region locking and transactioning for streams. Keep these in mind when you design an application that uses this technology. The 512-byte granularity for streams is also an important design consideration: it becomes very inefficient to store many small data structures in individual streams because you'll end up with a lot of unused space in a file. If at all possible, design your use of compound files so that you use as much space in each 512-byte block of a stream as you can, which you can do simply by combining a few structures in the same stream. You can then use IStream::Clone to keep IStream pointers positioned at the beginning of each structure within the same stream. This way you don't have to make a large number of Seek calls to go from one structure to another.
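The space argument for combining structures can be sketched as follows. This is only an illustration in portable C++, not OLE code: kAllocUnit, RoundUpToBlock, SpaceSeparateStreams, SpaceCombinedStream, and StructureOffsets are hypothetical names, and the sketch assumes the structures are packed back to back with no padding. The offsets it returns are the positions at which you might leave each cloned IStream pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stream allocation granularity, as given in the text.
constexpr std::uint64_t kAllocUnit = 512;

// Round a byte count up to a whole number of allocation units.
constexpr std::uint64_t RoundUpToBlock(std::uint64_t n) {
    return ((n + kAllocUnit - 1) / kAllocUnit) * kAllocUnit;
}

// File space consumed when every structure is stored in its own stream:
// each stream is rounded up separately.
std::uint64_t SpaceSeparateStreams(const std::vector<std::uint64_t>& sizes) {
    std::uint64_t total = 0;
    for (std::uint64_t s : sizes) total += RoundUpToBlock(s);
    return total;
}

// File space consumed when all structures share one stream
// (assumption: packed back to back, no padding between them).
std::uint64_t SpaceCombinedStream(const std::vector<std::uint64_t>& sizes) {
    std::uint64_t sum = 0;
    for (std::uint64_t s : sizes) sum += s;
    return RoundUpToBlock(sum);
}

// Starting offset of each structure inside the combined stream. A clone
// of the stream could be seeked once to each of these offsets and kept.
std::vector<std::uint64_t> StructureOffsets(
        const std::vector<std::uint64_t>& sizes) {
    std::vector<std::uint64_t> offsets;
    offsets.reserve(sizes.size());
    std::uint64_t pos = 0;
    for (std::uint64_t s : sizes) {
        offsets.push_back(pos);
        pos += s;
    }
    return offsets;
}
```

For four structures of 100, 40, 200, and 60 bytes, separate streams would occupy 2048 bytes under these assumptions, while one combined stream would occupy 512 bytes, with the structures beginning at offsets 0, 100, 140, and 340.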