Berthold von Freyberg
Microsoft Office Program Management
June, 1996
In Microsoft® Office 97, we want to store hyperlinks in the compound documents created by Microsoft Word, Microsoft Excel or Microsoft PowerPoint® in such a way that external tools can read, modify, and delete them.
This article provides a detailed description of the format in which the applications write the hyperlinks to a compound file stream and how external tools should access, modify and delete hyperlinks in compound documents.
In the following, "hyperlink" refers to the set of TargetAddress (URL/UNC) and SubAddress (such as a cell range in Microsoft Excel, a slide name in PowerPoint, or a bookmark in Word).
This section summarizes the implementation and storage of hyperlinks in the various applications:
This section describes the OLE properties stream to which Office 97 applications write hyperlink information, and the format in which that information is written. Office 97 stores the Standard Summary Information property set in an IStream off the root IStorage, named "\005SummaryInformation". In addition, Office 97 stores two sections, one for the Office Summary Information (FormatID_DocumentSummaryInformation) and one for the user-defined properties (FormatID_UserDefinedProperties), in another IStream named "\005DocumentSummaryInformation". Office 97 adds one property, PID_HYPERLINKSCHANGED, to the existing twenty properties of the Office DocumentSummaryInformation section and one property, PID_HYPERLINKS, to the UserDefinedProperties section. Microsoft Excel, Word, and PowerPoint each write the PID_HYPERLINKS array when saving a document and, when opening the document later, read the array and reconcile the hyperlinks in the document with any changes made to the array.
Property Name | Property ID | Property ID code | Type | What is stored |
Hyperlinks | PID_HYPERLINKS | _PID_HLINKS | VT_BLOB | One hyperlink per six array elements. See below for the format. |
HyperlinksChanged | PID_HYPERLINKSCHANGED | 0x00000016 | VT_BOOL | The "dirty" bit: 0 = false = no links changed; 1 = true = links changed |
When saving a document, the application enumerates all hyperlinks (both its own and Office Art’s) and Office writes an array (with several array elements for each hyperlink, see below) in the same order in which it will later load and reconcile them. The application writes PID_HYPERLINKS as one VT_BLOB, but its internal structure, which is how the application later reads the array, is VT_VARIANT | VT_VECTOR.
Note that, for a given picture, Office Art might write up to three Hlinks in Office 97 to the OLE properties stream:
In addition, in a future version, Office Art might also write a fourth Hlink to the OLE properties stream for a linked line fill file. Internally, Office Art already supports this, but the Office applications themselves do not expose this functionality yet.
When saving a document, the application also sets PID_HYPERLINKSCHANGED to False.
A related property that should be used in conjunction with PID_HYPERLINKS is the new custom property PID_LINKBASE that stores the base URL/UNC of a document. This is important in instances where PID_HYPERLINKS contains relative links.
Property Name | Property ID | Property ID code | Type | What is stored |
Hyperlink Base | PID_LINKBASE | _PID_LINKBASE | VT_BLOB | Base address to be prepended to all relative hyperlinks, internally stored as VT_LPWSTR |
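The way an external tool might combine PID_LINKBASE with a relative entry from PID_HYPERLINKS can be sketched as follows. This is a minimal illustration, not the applications' actual logic: the function names `is_relative` and `resolve` are hypothetical, and the test for "relative" (no URL scheme, no drive letter, no UNC prefix) is an assumption, since the article does not define how relative links are recognized.

```python
import re

# ASSUMPTION: a target counts as absolute if it starts with a URL scheme
# ("http:", "mailto:"), a drive letter ("C:"), or a UNC prefix ("\\server").
_SCHEME_OR_DRIVE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def is_relative(target: str) -> bool:
    """Return True if the target address looks relative (heuristic)."""
    if _SCHEME_OR_DRIVE.match(target):
        return False
    if target.startswith("\\\\"):
        return False
    return True


def resolve(link_base: str, target: str) -> str:
    """Prepend the hyperlink base to a relative target address.

    Absolute targets, and any target when no base is set, are returned
    unchanged.
    """
    if not link_base or not is_relative(target):
        return target
    return link_base + target
```

For example, `resolve("http://example.com/docs/", "report.doc")` yields the base followed by `report.doc`, while a target that already starts with `http://` is passed through untouched.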
The PID_HYPERLINKS array property has the following format: the DWORD CElements indicates the number of array elements. This is equal to six times the number of hyperlinks because for every hyperlink that follows in PID_HYPERLINKS, there are six array elements:
Note: TargetAddress and SubAddress are padded so that they are DWORD-aligned.
The first three DWORDs are private to the application that writes the file and should not be modified by an external tool.
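The six-elements-per-hyperlink layout and the DWORD padding rule can be illustrated with a short sketch. It assumes the array elements have already been decoded into Python values, and that each group is ordered as three private DWORDs, the Info DWORD, TargetAddress, then SubAddress; that order is inferred from the surrounding text rather than stated in full. The names `HyperlinkRecord`, `group_elements`, and `padded_size` are hypothetical.

```python
from typing import List, NamedTuple, Sequence, Tuple

ELEMENTS_PER_HYPERLINK = 6  # CElements == 6 * number of hyperlinks


class HyperlinkRecord(NamedTuple):
    private: Tuple[int, int, int]  # application-private; keep unchanged
    info: int                      # the Info DWORD
    target_address: str            # URL or UNC path
    sub_address: str               # e.g. cell range, slide name, bookmark


def group_elements(elements: Sequence) -> List[HyperlinkRecord]:
    """Split a flat list of decoded array elements into hyperlink records."""
    if len(elements) % ELEMENTS_PER_HYPERLINK != 0:
        raise ValueError("element count is not a multiple of six")
    records = []
    for i in range(0, len(elements), ELEMENTS_PER_HYPERLINK):
        d0, d1, d2, info, target, sub = elements[i:i + ELEMENTS_PER_HYPERLINK]
        records.append(HyperlinkRecord((d0, d1, d2), info, target, sub))
    return records


def padded_size(byte_count: int) -> int:
    """Round a byte count up to the next multiple of 4 (DWORD alignment)."""
    return (byte_count + 3) & ~3
```

A tool that rewrites a record would copy the `private` tuple back verbatim and change only the remaining fields.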
The DWORD Info holds two pieces of information:
0 - do not change anything
1 - replace the hyperlink with the TargetAddress and SubAddress in the following two DWORDs (VT_LPWSTR)
2 - delete the hyperlink
0 - graphic shown as background of doc (link to a picture file)
1 - graphic shown as shape in doc (link to a picture file)
2 - graphic used to fill a shape (link to a fill file: picture fill, texture fill, or pattern fill)
3 - graphic used for shape outline (link to a line fill file: for future use only)
4 - hyperlink attached to a shape
5 - hyperlink attached to a (Word) field
6 - hyperlink attached to an (Excel) range
7 - hyperlink attached to a (PPT) text range
8 - hyperlink attached to a (Project) task
Note: While currently not yet used, negative values of HIWORD and LOWORD are reserved for Microsoft applications.
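Splitting the Info DWORD into its two 16-bit halves might look like the sketch below. The text here does not say which half carries which list, so the code assumes the action code (0 to 2) is in the HIWORD and the link type (0 to 8) is in the LOWORD; swap the two if the file format proves otherwise. The function names are hypothetical, and the description strings are abbreviated from the lists above.

```python
# Values taken from the two lists above.
ACTIONS = {
    0: "do not change anything",
    1: "replace the hyperlink",
    2: "delete the hyperlink",
}
LINK_TYPES = {
    0: "graphic shown as background of doc",
    1: "graphic shown as shape in doc",
    2: "graphic used to fill a shape",
    3: "graphic used for shape outline",
    4: "hyperlink attached to a shape",
    5: "hyperlink attached to a (Word) field",
    6: "hyperlink attached to an (Excel) range",
    7: "hyperlink attached to a (PPT) text range",
    8: "hyperlink attached to a (Project) task",
}


def split_info(info: int) -> tuple:
    """Return (HIWORD, LOWORD) of a 32-bit Info value."""
    info &= 0xFFFFFFFF
    return (info >> 16) & 0xFFFF, info & 0xFFFF


def make_info(hiword: int, loword: int) -> int:
    """Combine two 16-bit halves into a 32-bit Info value."""
    return ((hiword & 0xFFFF) << 16) | (loword & 0xFFFF)


def describe(info: int) -> tuple:
    """Map an Info value to (action text, link type text)."""
    # ASSUMPTION: action is in the HIWORD, link type in the LOWORD.
    action, link_type = split_info(info)
    return (
        ACTIONS.get(action, "reserved or unknown"),
        LINK_TYPES.get(link_type, "reserved or unknown"),
    )
```

Under that assumption, an external tool that wants a Word field hyperlink deleted would write `make_info(2, 5)` into the Info element and leave the three preceding DWORDs untouched.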
Note: The property array only comprises hyperlinks and links to pictures, textured fills and textured line files, not shortcuts, cross-references in Word, or cell references to other workbooks. Hyperlinks that appear in Word’s Undo document or in the AutoText table will also not be exposed. Finally, Office 97 applications do not expose hyperlinks from data path properties of ActiveX Controls in the OLE properties stream.
Note: In contrast to other custom OLE properties, the File::Properties UI does not expose PID_HYPERLINKS.