INFO: BIFF8 BOUNDSHEET Record Data for Uncompressed Unicode

ID: Q187919

The information in this article applies to:

  • Microsoft Excel 97 Developer's Kit
  • Microsoft Visual C++, 32-bit Editions, version 5.0

SUMMARY

The Binary Interchange File Format version 8.0 (BIFF8) record data information in the Microsoft Developer Network (MSDN) and in the "Microsoft Excel 97 Developer's Kit" book does not mention a new flag that specifies whether the name of the worksheet is represented in uncompressed Unicode. Without this information, a developer might interpret the name field of the BOUNDSHEET record incorrectly if the name is stored in uncompressed Unicode.

The "BIFF8 Record Data" table at the top of page 291 of the "Microsoft Excel 97 Developer's Kit" book states that the cch (count of characters) field beginning at offset 10 is two bytes in size. This is incorrect. The cch field is one byte, and it is followed at offset 11 by a one-byte flag field that indicates whether the name field is stored as compressed Unicode (one byte per character) or uncompressed Unicode (two bytes per character).

NOTE: The Microsoft Biffview utility program displays the BOUNDSHEET record under the name "BUNDLESHEET".

MORE INFORMATION

The default representation for a sheet name is compressed Unicode. Compressed Unicode uses one byte to represent the two-byte Unicode value of a character. It applies when the high-order byte of every character in the name is zero, so only the low-order byte of each character needs to be stored.

If the sheet name contains characters whose high-order byte is not zero (for example, a name written in a double-byte language), it is stored as uncompressed Unicode. Each character then requires two bytes, so the name occupies twice the space it would need as compressed Unicode.
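From the writer's side, the choice between the two forms can be sketched as follows. This is a minimal illustration, not code from the Developer's Kit: the function name encode_sheet_name and its parameters are hypothetical, and the rule "use the uncompressed form if any character is above 00FF hex" is an assumption drawn from the description above. Uncompressed characters are written low-order byte first, as in the sample record later in this article.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes a sheet name given as 16-bit code units.
 * Sets *grbitChr to 0 (compressed) or 1 (uncompressed) and fills rgch.
 * Returns the number of bytes written, or -1 if cap is too small. */
int encode_sheet_name(const uint16_t *name, uint8_t cch,
                      uint8_t *grbitChr, uint8_t *rgch, size_t cap)
{
    size_t i, need;
    int wide = 0;

    /* Assumption: any nonzero high-order byte forces the 2-byte form. */
    for (i = 0; i < cch; i++)
        if (name[i] > 0x00FF)
            wide = 1;

    need = (size_t)cch * (wide ? 2u : 1u);
    if (need > cap)
        return -1;

    *grbitChr = (uint8_t)wide;
    for (i = 0; i < cch; i++) {
        if (wide) {
            rgch[2 * i]     = (uint8_t)(name[i] & 0xFF);  /* low byte  */
            rgch[2 * i + 1] = (uint8_t)(name[i] >> 8);    /* high byte */
        } else {
            rgch[i] = (uint8_t)name[i];  /* high byte is zero, drop it */
        }
    }
    return (int)need;
}
```

In both cases the cch value written to the record is the character count passed in, not the byte count returned.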

The BIFF8 record uses the single byte at offset 11 to hold a flag indicating uncompressed Unicode. If that flag is 1, the cch value at offset 10 is the count of two-byte characters beginning at offset 12, and the name occupies 2 * cch bytes. If the flag is 0, the name occupies cch bytes.

The "BIFF8 Record Data" table at the top of page 291 should read as follows:

   OFFSET  NAME      SIZE  CONTENTS
   ----------------------------------------------------------------------
      4    lbPlyPos    4   Stream position of the start of the BOF
                           record for the sheet.
      8    grbit       2   Option flags.
     10    cch         1   Length of sheet name in characters, not bytes.
     11    grbitChr    1   0 = compressed Unicode (1 byte per character);
                           1 = uncompressed Unicode (2 bytes per character).
     12    rgch       var  Sheet name.
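A reader for this layout might look like the following sketch. The BoundSheet structure, the cbName member, and the parse_boundsheet function are hypothetical names introduced for illustration. The code assumes that the caller passes a pointer to offset 4 of the record (the first byte of lbPlyPos), and that only the low bit of grbitChr selects the uncompressed form.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure; member names follow the table above. */
typedef struct {
    uint32_t       lbPlyPos;  /* stream position of the sheet's BOF   */
    uint16_t       grbit;     /* option flags                         */
    uint8_t        cch;       /* name length in characters            */
    uint8_t        grbitChr;  /* 0 = compressed, 1 = uncompressed     */
    const uint8_t *rgch;      /* points into the caller's buffer      */
    size_t         cbName;    /* name length in bytes (derived)       */
} BoundSheet;

/* data points at offset 4 of the record; len is the number of bytes
 * available from there. Returns 0 on success, -1 if data is too short. */
int parse_boundsheet(const uint8_t *data, size_t len, BoundSheet *out)
{
    if (len < 8)
        return -1;  /* fixed part: 4 + 2 + 1 + 1 bytes */

    /* Multi-byte fields are stored low-order byte first. */
    out->lbPlyPos = (uint32_t)data[0]
                  | ((uint32_t)data[1] << 8)
                  | ((uint32_t)data[2] << 16)
                  | ((uint32_t)data[3] << 24);
    out->grbit    = (uint16_t)(data[4] | (data[5] << 8));
    out->cch      = data[6];   /* one byte, not two */
    out->grbitChr = data[7];

    /* Assumption: bit 0 of grbitChr selects 2 bytes per character. */
    out->cbName = (size_t)out->cch * ((out->grbitChr & 1) ? 2u : 1u);
    if (len - 8 < out->cbName)
        return -1;

    out->rgch = data + 8;      /* offset 12 of the record */
    return 0;
}
```

The derived byte count is what distinguishes the corrected layout from the printed one: a reader that treats cch as two bytes would consume the flag byte as part of the count and start the name one byte late.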

The following examples compare values with and without Unicode compression:

Uncompressed: Beginning at Offset 4 (16 bytes)

   20 0b 00 00   00 00   04   01   e5 5d  5c 4f  68 88  31 00

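The BOF record for this sheet starts at stream position 00 00 0b 20 (00000B20 hex). The four bytes appear in the record low-order byte first, as 20 0b 00 00. This byte-swapping is explained on page 268 of the printed edition of the Excel SDK.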
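The byte-swapping can be undone without depending on the byte order of the host machine by assembling the value one byte at a time. The helper name read_le32 below is hypothetical and is shown only as a sketch:

```c
#include <stdint.h>

/* Hypothetical helper: reads a 4-byte field stored low-order byte first. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Applied to the bytes 20 0b 00 00, this yields 00000B20 hex, the stream position quoted above.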

The option flags 00 00 tell you that this BOUNDSHEET record applies to a visible worksheet.

The cch value of 04 says the sheet name is 4 characters long.

The grbitChr value of 01 means the sheet name is uncompressed Unicode, so each character is stored in 2 bytes, as seen in the rgch field.

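The next 8 bytes are the rgch field, which stores the four characters 5d-e5 4f-5c 88-68 00-31. Each pair is shown here high-order byte first; in the record, the low-order byte of each character comes first.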
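Recovering the 16-bit character values from an uncompressed rgch field is a matter of pairing the bytes, low-order byte first. The following is a sketch only; the function name decode_uncompressed is hypothetical, and the caller is assumed to supply an output array with room for cch values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts 2 * cch bytes of uncompressed Unicode
 * into cch 16-bit character values. out must hold at least cch entries. */
void decode_uncompressed(const uint8_t *rgch, uint8_t cch, uint16_t *out)
{
    size_t i;

    for (i = 0; i < cch; i++) {
        /* First byte of each pair is the low-order byte. */
        out[i] = (uint16_t)(rgch[2 * i] | (rgch[2 * i + 1] << 8));
    }
}
```

For the sample bytes e5 5d 5c 4f 68 88 31 00 with cch equal to 4, the output values are 5DE5, 4F5C, 8868, and 0031 hex.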

Compressed: Beginning at Offset 4 (14 bytes)

   17 0d 00 00   00 00   06   00   53 68 65 65 74 32

The BOF for this sheet starts at 00 00 0d 17.

The option flags, 00 00, tell you that the BOUNDSHEET record applies to a worksheet that is visible.

The cch value of 06 says the sheet name is 6 characters long.

The grbitChr value of 00 means the sheet name is compressed Unicode. Each character in the name is stored in one byte, and the missing high-order byte of the character is assumed to be 00 hex.

Hence, the sheet name is 6 characters stored in 6 bytes as 53 68 65 65 74 32, which reads "Sheet2" in ASCII.
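Because every character of a compressed name fits in one byte, the name can be copied directly into an ordinary character buffer. The sketch below uses a hypothetical function name, compressed_name_to_cstr. It copies bytes above 7F hex unchanged, which amounts to assuming that the destination uses a character set whose first 256 values match Unicode; that assumption is not stated in the Developer's Kit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies a compressed-Unicode sheet name into a
 * NUL-terminated buffer. Returns 0 on success, -1 if dst is too small. */
int compressed_name_to_cstr(const uint8_t *rgch, uint8_t cch,
                            char *dst, size_t cap)
{
    size_t i;

    if (cap < (size_t)cch + 1)
        return -1;  /* need room for cch characters plus the terminator */

    for (i = 0; i < cch; i++)
        dst[i] = (char)rgch[i];  /* high-order byte is implicitly 00 */
    dst[cch] = '\0';
    return 0;
}
```

With the sample bytes 53 68 65 65 74 32 and cch equal to 6, the buffer receives the string "Sheet2".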

REFERENCES

"Microsoft Excel 97 Developer's Kit", Microsoft Press, ISBN 1-57231-498-2

Additional query words: BUNDLESHEET sdk xlsdk

Keywords          : kbVC500 kbGrpDSO kbOffice2000 
Issue type        : kbinfo


Last Reviewed: June 1, 1999
© 2000 Microsoft Corporation. All rights reserved.