Interpreting the File Allocation Table

Now that we understand how the disk is structured, let's see how we can use this knowledge to find a FAT position from a cluster number.

If the FAT has 12-bit entries, use the following procedure:

1. Use the directory entry to find the starting cluster of the file in question.

2. Multiply the cluster number by 1.5.

3. Use the integral part of the product as the offset into the FAT and move the word at that offset into a register. Remember that a FAT position can span a physical disk-sector boundary.

4. If the product is a whole number, AND the register with 0FFFH.

5. Otherwise, "logical shift" the register right 4 bits.

6. If the result is a value from 0FF8H through 0FFFH, the file has no more clusters. Otherwise, the result is the number of the next cluster in the file.
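The 12-bit procedure above can be sketched in a few lines of Python. This is only an illustration, not part of MS-DOS: the names fat12_entry and is_end_of_chain_12 are made up for this sketch, the FAT is assumed to be already in memory as a bytes object, and the sample bytes are the first nine bytes of the hex dump in Figure 10-9.

```python
def fat12_entry(fat: bytes, cluster: int) -> int:
    """Return the 12-bit FAT entry for a cluster (hypothetical helper)."""
    offset = (cluster * 3) // 2            # integral part of cluster * 1.5
    # Fetch the little-endian word at that offset.
    word = fat[offset] | (fat[offset + 1] << 8)
    if cluster % 2 == 0:                   # cluster * 1.5 is a whole number
        return word & 0x0FFF               # keep the bottom 12 bits
    return word >> 4                       # otherwise shift right 4 bits


def is_end_of_chain_12(entry: int) -> bool:
    """True if a 12-bit entry marks the last cluster of a file."""
    return 0xFF8 <= entry <= 0xFFF


# First nine bytes of the FAT shown in Figure 10-9.
fat = bytes.fromhex("FD FF FF 03 40 00 05 60 00")

print(fat12_entry(fat, 2))   # entry for cluster 2
print(fat12_entry(fat, 3))   # entry for cluster 3
```

Because the whole FAT is held in one buffer here, the sector-boundary caution in step 3 does not arise; a program that reads the FAT one sector at a time would have to handle it.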

On disks with at least 4087 clusters formatted under MS-DOS version 3.0 or later, the FAT entries use 16 bits, and the extraction of a cluster number from the table is much simpler:

1. Use the directory entry to find the starting cluster of the file in question.

2. Multiply the cluster number by 2.

3. Use the product as the offset into the FAT and move the word at that offset into a register.

4. If the result is a value from 0FFF8H through 0FFFFH, the file has no more clusters. Otherwise, the result is the number of the next cluster in the file.
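For comparison, the 16-bit case might look like the following sketch. The function names and the sample FAT bytes are invented for illustration (they do not come from any figure in this chapter), and the FAT is again assumed to be in memory as a bytes object.

```python
def fat16_entry(fat: bytes, cluster: int) -> int:
    """Return the 16-bit FAT entry for a cluster (hypothetical helper)."""
    offset = cluster * 2                   # each entry is one word
    return int.from_bytes(fat[offset:offset + 2], "little")


def is_end_of_chain_16(entry: int) -> bool:
    """True if a 16-bit entry marks the last cluster of a file."""
    return 0xFFF8 <= entry <= 0xFFFF


# Made-up sample: two reserved entries, then a file occupying clusters 2, 3, 4.
sample_fat = bytes.fromhex("F8 FF FF FF 03 00 04 00 FF FF")

cluster = 2
while True:
    entry = fat16_entry(sample_fat, cluster)
    print(cluster, hex(entry))
    if is_end_of_chain_16(entry):
        break
    cluster = entry
```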

To convert cluster numbers to logical sectors, subtract 2, multiply the result by the number of sectors per cluster, and add the logical-sector number of the beginning of the data area (this can be calculated from the information in the BPB).
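Expressed as code, the conversion is a one-line formula. The sketch below uses a hypothetical function name, cluster_to_sector; the sample values (2 sectors per cluster, data area starting at logical sector 12) are the ones worked out for the example disk in this section.

```python
def cluster_to_sector(cluster: int, sectors_per_cluster: int,
                      data_start: int) -> int:
    """Logical sector of the first sector in a cluster (hypothetical helper).

    Cluster numbering starts at 2, so 2 is subtracted before scaling.
    """
    return (cluster - 2) * sectors_per_cluster + data_start


# Values for the example disk: 2 sectors per cluster, data area at sector 0CH.
print(hex(cluster_to_sector(2, 2, 0x0C)))    # first cluster of the data area
print(hex(cluster_to_sector(23, 2, 0x0C)))   # cluster 23
```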

As an example, let's work out the disk location of the file IBMBIO.COM, which is the first entry in the directory shown in Figure 10-6. First, we need some information from the BPB, which is in the boot sector of the medium. (See Figures 10-3 and 10-4.) The BPB tells us that there are

512 bytes per sector

2 sectors per cluster

2 sectors per FAT

2 FATs

112 entries in the root directory

From the BPB information, we can calculate the starting logical-sector number of each of the disk's control areas and the files area by constructing a table, as follows:

Area                            Length      Sector
                                (sectors)   numbers

Boot sector                     1           00H
2 FATs * 2 sectors/FAT          4           01H-04H
112 directory entries           7           05H-0BH
  * 32 bytes/entry
  / 512 bytes/sector
Total sectors occupied by       12
  bootstrap, FATs, and
  root directory

Therefore, the first sector of the files area is 12 (0CH).
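The same table can be computed from the BPB values. The sketch below is an illustration only: the function name disk_layout and the dictionary keys are invented here, and the count of one reserved sector (the boot sector) is taken from the table above rather than read from a real BPB.

```python
def disk_layout(bytes_per_sector: int, reserved_sectors: int, fats: int,
                sectors_per_fat: int, root_entries: int) -> dict:
    """Starting logical sector of each control area (hypothetical helper)."""
    fat_start = reserved_sectors
    root_start = fat_start + fats * sectors_per_fat
    # Each directory entry is 32 bytes; round up to whole sectors.
    root_sectors = (root_entries * 32 + bytes_per_sector - 1) // bytes_per_sector
    return {
        "fat_start": fat_start,
        "root_start": root_start,
        "root_sectors": root_sectors,
        "data_start": root_start + root_sectors,
    }


# BPB values for the example disk; reserved_sectors=1 is the boot sector.
layout = disk_layout(bytes_per_sector=512, reserved_sectors=1, fats=2,
                     sectors_per_fat=2, root_entries=112)
for name, value in layout.items():
    print(name, value)
```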

The word at offset 01AH in the directory entry for IBMBIO.COM gives us the starting cluster number for that file: cluster 2. To find the logical-sector number of the first block in the file, we can follow the procedure given earlier:

1. Cluster number - 2 = 2 - 2 = 0.

2. Multiply by sectors per cluster = 0 * 2 = 0.

3. Add logical-sector number of start of the files area = 0 + 0CH = 0CH.

So the calculated sector number of the beginning of the file IBMBIO.COM is 0CH, which is exactly what we expect, because the FORMAT program always places the system files in contiguous sectors at the beginning of the data area.

Now let's trace IBMBIO.COM's chain through the file allocation table (Figures 10-9 and 10-10). This will be a little tedious, but a detailed understanding of the process is crucial. In an actual program, we would first read the boot sector using Int 25H, then calculate the address of the FAT from the contents of the BPB, and finally read the FAT into memory, again using Int 25H.

From IBMBIO.COM's directory entry, we already know that the first cluster in the file is cluster 2. To examine that cluster's entry in the FAT, we multiply the cluster number by 1.5, which gives 0003H as the FAT offset, and fetch the word at that offset (which contains 4003H). Because the product of the cluster number and 1.5 is a whole number, we AND the word from the FAT with 0FFFH, yielding the number 3, which is the number of the second cluster assigned to the file.

       0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0000  FD FF FF 03 40 00 05 60 00 07 80 00 09 A0 00 0B  ....@..`........
0010  C0 00 0D E0 00 0F 00 01 11 20 01 13 40 01 15 60  ......... ..@..`
0020  01 17 F0 FF 19 A0 01 1B C0 01 1D E0 01 1F 00 02  ................
0030  21 20 02 23 40 02 25 60 02 27 80 02 29 A0 02 2B  ! .#@.%`.'..)..+
  .
  .
  .

Figure 10-9. Hex dump of the first block of the file allocation table (track 0, head 0, sector 2) for the PC-DOS 3.3 disk whose root directory is shown in Figure 10-6. Notice that the first byte of the FAT contains the media descriptor byte for a 5.25-inch, 2-sided, 9-sector floppy disk.

getfat  proc    near            ; extracts the FAT field
                                ; for a given cluster
                                ; call    AX    = cluster #
                                ;         DS:BX = addr of FAT
                                ; returns AX    = FAT field
                                ; other registers unchanged

        push    bx              ; save affected registers
        push    cx
        mov     cx,ax
        shl     ax,1            ; cluster * 2
        add     ax,cx           ; cluster * 3
        test    ax,1
        pushf                   ; save remainder in Z flag
        shr     ax,1            ; cluster * 1.5
        add     bx,ax
        mov     ax,[bx]
        popf                    ; was cluster * 1.5 whole number?
        jnz     getfat1         ; no, jump
        and     ax,0fffh        ; yes, isolate bottom 12 bits
        jmp     getfat2
getfat1: mov    cx,4            ; shift word right 4 bits
        shr     ax,cx
getfat2: pop    cx              ; restore registers and exit
        pop     bx
        ret
getfat  endp

Figure 10-10. Assembly-language procedure to access the file allocation table (assumes 12-bit FAT fields). Given a cluster number, the procedure returns the contents of that cluster's FAT entry in the AX register. This simple example ignores the fact that FAT entries can span sector boundaries.

To examine cluster 3's entry in the FAT, we multiply 3 by 1.5, which gives 4.5, and fetch the word at offset 0004H (which contains 0040H). Because the product of 3 and 1.5 is not a whole number, we shift the word right 4 bits, yielding the number 4, which is the number of the third cluster assigned to IBMBIO.COM.

In this manner, we can follow the chain through the FAT until we come to a cluster (number 23, in this case) whose FAT entry contains the value 0FFFH, which is an end-of-file marker in FATs with 12-bit entries.
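To avoid doing the whole trace by hand, the walk can be sketched in Python using the 64 bytes of the FAT shown in Figure 10-9. The names fat12_entry and follow_chain are hypothetical helpers written for this illustration; the lookup uses the same arithmetic as the 12-bit procedure given earlier, and the loop limit is only a safeguard against a damaged FAT.

```python
# The 64 bytes of the FAT shown in Figure 10-9.
FAT = bytes.fromhex(
    "FD FF FF 03 40 00 05 60 00 07 80 00 09 A0 00 0B "
    "C0 00 0D E0 00 0F 00 01 11 20 01 13 40 01 15 60 "
    "01 17 F0 FF 19 A0 01 1B C0 01 1D E0 01 1F 00 02 "
    "21 20 02 23 40 02 25 60 02 27 80 02 29 A0 02 2B"
)


def fat12_entry(fat: bytes, cluster: int) -> int:
    """Return the 12-bit FAT entry for a cluster."""
    offset = (cluster * 3) // 2                  # integral part of cluster * 1.5
    word = fat[offset] | (fat[offset + 1] << 8)  # little-endian word
    return word & 0x0FFF if cluster % 2 == 0 else word >> 4


def follow_chain(fat: bytes, first_cluster: int) -> list:
    """List every cluster in a file, starting from its first cluster."""
    chain = [first_cluster]
    # Stop at an end-of-file marker (0FF8H-0FFFH) or after an implausible length.
    while len(chain) < 4096:
        entry = fat12_entry(fat, chain[-1])
        if 0xFF8 <= entry <= 0xFFF:
            break
        chain.append(entry)
    return chain


chain = follow_chain(FAT, 2)      # IBMBIO.COM starts at cluster 2
print(chain)                      # clusters 2 through 23
print(hex(fat12_entry(FAT, 23)))  # 0xfff, the end-of-file marker
```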

We have now established that the file IBMBIO.COM contains clusters 2 through 23 (02H-17H), from which we can calculate that logical sectors 0CH through 37H are assigned to the file (22 clusters of 2 sectors each, or 44 sectors, beginning at sector 0CH). Of course, the last cluster may be only partially filled with actual data; the portion of the last cluster used is the remainder of the file's size in bytes (found in the directory entry) divided by the bytes per cluster. A remainder of zero means that the last cluster is completely full.
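Both calculations are easy to check with a short sketch. The function names are invented for this illustration, sector_range assumes the clusters are contiguous (as they are for IBMBIO.COM), and the file size of 22,100 bytes is an assumed value chosen only to be consistent with a 22-cluster file; it is not read from the directory entry in Figure 10-6.

```python
def sector_range(first_cluster: int, last_cluster: int,
                 sectors_per_cluster: int, data_start: int) -> tuple:
    """First and last logical sectors of a contiguous run of clusters."""
    first = (first_cluster - 2) * sectors_per_cluster + data_start
    last = (last_cluster - 2) * sectors_per_cluster + data_start \
        + sectors_per_cluster - 1
    return first, last


def bytes_in_last_cluster(file_size: int, bytes_per_sector: int,
                          sectors_per_cluster: int) -> int:
    """Number of bytes of file data held in the file's final cluster."""
    bytes_per_cluster = bytes_per_sector * sectors_per_cluster
    remainder = file_size % bytes_per_cluster
    if remainder == 0 and file_size > 0:
        return bytes_per_cluster          # last cluster is completely full
    return remainder


first, last = sector_range(2, 23, sectors_per_cluster=2, data_start=0x0C)
print(hex(first), hex(last))              # 0xc 0x37

ASSUMED_SIZE = 22100                      # hypothetical file size in bytes
print(bytes_in_last_cluster(ASSUMED_SIZE, 512, 2))
```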