Platform SDK: Files and I/O
A file is stored on a disk drive (or other medium) in one or more clusters. A cluster is the smallest unit of storage that the file system allocates, and it is made up of one or more sectors. Sectors, in turn, are physical storage units.
As a file is written to the disk, it may not be written in contiguous clusters. Noncontiguous clusters slow down the process of reading and writing the file. The farther apart the noncontiguous clusters are on the disk, the worse the problem, because moving the hard drive's read/write head takes time. A file with noncontiguous clusters is said to be fragmented. To optimize files for fast access, a volume may be defragmented.
Defragmentation is the process of moving a file's clusters on the disk so that they become contiguous.
In a simple single-tasking operating system, defragmentation is straightforward: the defragmentation software is the sole task, and there are no other processes to read from or write to the disk. However, in a multitasking operating system, some processes may be reading from and writing to the hard drive while another process is trying to defragment that hard drive. The trick is to avoid writes to the file being defragmented without stopping the writing process for very long. Solving this problem is not trivial, but it is possible.
Some file systems are publicly documented, such as the FAT16 and FAT32 file systems used in the Microsoft® MS-DOS® and Windows® 98 operating systems. This allows programmers to manipulate on-disk data structures (such as file allocation tables, or FATs) directly. However, NTFS, the file system that the Windows NT® operating system uses, is deliberately opaque. To allow defragmentation of NTFS without requiring detailed knowledge of the disk structure of NTFS, a set of three DeviceIoControl operations is provided. The three operations allow applications to locate empty clusters, determine the disk location of file clusters, and move clusters on the disk. The DeviceIoControl operations transparently handle the problem of inhibiting and allowing other processes to read from and write to files during moves.
These same DeviceIoControl operations also work with FAT volumes.
These operations can be performed without inhibiting other processes from running. However, the other processes will have slower response times while a disk drive is being defragmented.
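The first of the three tasks, locating empty clusters, can be illustrated with a short sketch. It assumes the application has already obtained a bitmap of the volume with one bit per cluster, where a set bit means the cluster is allocated and bits are numbered from the least significant bit of each byte. That layout, the function name find_free_run, and its parameters are assumptions made for this example and are not part of the SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the number of the first cluster in the first run of at least
 * `needed` free clusters, or -1 if the bitmap contains no such run.
 * Assumption: bit i (least significant bit first within each byte) is 1
 * when cluster i is allocated and 0 when it is free. */
int64_t find_free_run(const uint8_t *bitmap, int64_t cluster_count,
                      int64_t needed)
{
    assert(bitmap != NULL && needed > 0);

    int64_t run_start = 0;  /* first cluster of the run being measured */
    int64_t run_len = 0;    /* free clusters counted so far in that run */

    for (int64_t i = 0; i < cluster_count; i++) {
        int in_use = (bitmap[i / 8] >> (i % 8)) & 1;
        if (in_use) {
            run_len = 0;            /* an allocated cluster ends the run */
        } else {
            if (run_len == 0)
                run_start = i;      /* a new run begins here */
            if (++run_len >= needed)
                return run_start;
        }
    }
    return -1;
}
```

A first-fit search like this is the simplest policy. A real defragmenter might instead prefer the smallest run that fits, or a run close to the file's existing clusters.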
Clusters may be referred to from two different perspectives: within the file and on the volume. Any cluster in a file has a virtual cluster number (VCN), which is its relative offset from the beginning of the file. VCNs are counted from zero. For example, a seek to twice the size of a cluster, followed by a read, will return data beginning at the third cluster of the file, which is VCN 2. A logical cluster number (LCN) describes the offset of a cluster from some arbitrary point within the volume. LCNs should be treated only as ordinal, or relative, numbers. There is no guaranteed mapping of logical clusters to physical hard drive sectors.
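The relationship between a byte offset in a file and a VCN is simple integer arithmetic, sketched below. The helper names are hypothetical, and the sketch assumes VCNs are counted from zero, so that a seek to twice the cluster size lands on VCN 2, the third cluster of the file.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-based VCN of the cluster that contains byte `offset` of a file. */
int64_t vcn_from_offset(int64_t offset, int64_t bytes_per_cluster)
{
    assert(offset >= 0 && bytes_per_cluster > 0);
    return offset / bytes_per_cluster;
}

/* Byte offset within the file at which cluster `vcn` begins. */
int64_t offset_from_vcn(int64_t vcn, int64_t bytes_per_cluster)
{
    assert(vcn >= 0 && bytes_per_cluster > 0);
    return vcn * bytes_per_cluster;
}
```

No comparable arithmetic exists for LCNs, because an LCN is only a relative position on the volume and does not map to a guaranteed physical sector.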
An extent is a run of contiguous clusters. For example, suppose a file consisting of 30 clusters is recorded in two extents. The first extent might consist of 5 contiguous clusters, and the second of the remaining 25 clusters.
The position of one extent on the disk has no guaranteed relationship to the position of any other extent. For example, the first extent may be at a higher LCN than a subsequent extent.
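The following sketch shows how a list of extents can be walked to translate a VCN into an LCN and to test whether a file is already contiguous. The extent_t type is a simplified, hypothetical stand-in for whatever layout information the application retrieves. It assumes each entry records the LCN where the extent starts and the VCN where the next extent begins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified description of one extent. */
typedef struct {
    int64_t next_vcn;  /* VCN at which the following extent begins */
    int64_t lcn;       /* LCN of the first cluster in this extent */
} extent_t;

/* Translates a VCN to the LCN that holds it, or returns -1 if the VCN
 * lies outside the extents described. */
int64_t vcn_to_lcn(const extent_t *extents, size_t count,
                   int64_t starting_vcn, int64_t vcn)
{
    assert(extents != NULL || count == 0);
    if (vcn < starting_vcn)
        return -1;

    int64_t extent_start = starting_vcn;  /* first VCN of extent i */
    for (size_t i = 0; i < count; i++) {
        if (vcn < extents[i].next_vcn)
            return extents[i].lcn + (vcn - extent_start);
        extent_start = extents[i].next_vcn;
    }
    return -1;
}

/* Returns 1 if every extent begins exactly where the previous one ends
 * on the volume, and 0 otherwise. */
int is_contiguous(const extent_t *extents, size_t count,
                  int64_t starting_vcn)
{
    int64_t extent_start = starting_vcn;
    for (size_t i = 0; i + 1 < count; i++) {
        int64_t length = extents[i].next_vcn - extent_start;
        if (extents[i + 1].lcn != extents[i].lcn + length)
            return 0;
        extent_start = extents[i].next_vcn;
    }
    return 1;
}
```

For the 30-cluster file described above, with a 5-cluster extent at LCN 900 followed by a 25-cluster extent at LCN 100 (both LCN values invented for illustration), the extents would be {5, 900} and {30, 100}. VCN 4 then maps to LCN 904, VCN 5 maps to LCN 100, and the file is not contiguous.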
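To defragment a file:
1. Use the FSCTL_GET_VOLUME_BITMAP operation to locate a run of empty clusters on the volume that is large enough to hold the file.
2. Use the FSCTL_GET_RETRIEVAL_POINTERS operation to determine the current disk location of the file's clusters.
3. Use the FSCTL_MOVE_FILE operation to move the file's clusters into the empty run.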
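The planning part of the move can be sketched in portable C. Given a file's extents and the LCN of an empty run, the function below computes one move request per extent so that the file ends up contiguous. The extent_t and move_t types and the plan_moves function are hypothetical illustrations. The sketch issues no DeviceIoControl calls, and it assumes the target run is large enough and does not overlap the file's current clusters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified description of one extent. */
typedef struct {
    int64_t next_vcn;  /* VCN at which the following extent begins */
    int64_t lcn;       /* LCN of the first cluster in this extent */
} extent_t;

/* Hypothetical description of one move request. */
typedef struct {
    int64_t starting_vcn;   /* first cluster of the file to move */
    int64_t cluster_count;  /* number of clusters to move */
    int64_t target_lcn;     /* where on the volume to put them */
} move_t;

/* Fills `moves` (which must have room for `count` entries) with one
 * request per extent, laying the extents end to end starting at
 * `target_lcn`. Returns the number of requests written. */
size_t plan_moves(const extent_t *extents, size_t count,
                  int64_t starting_vcn, int64_t target_lcn,
                  move_t *moves)
{
    assert((extents != NULL && moves != NULL) || count == 0);

    int64_t extent_start = starting_vcn;  /* first VCN of extent i */
    for (size_t i = 0; i < count; i++) {
        moves[i].starting_vcn = extent_start;
        moves[i].cluster_count = extents[i].next_vcn - extent_start;
        /* Keep each cluster at the same distance from the start of the
         * target run as it is from the start of the file. */
        moves[i].target_lcn = target_lcn + (extent_start - starting_vcn);
        extent_start = extents[i].next_vcn;
    }
    return count;
}
```

Each planned request would then be carried out with the FSCTL_MOVE_FILE operation. Moving one extent at a time keeps each request small, which limits how long other processes are held off from the file.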
Two of the operations used for defragmentation require handles to volumes. Only administrators can obtain handles to volumes, so only administrators can run defragmentation software. Your program should check the privileges of the user executing it and refuse gracefully to run for nonadministrators.
The DeviceIoControl FSCTL_MOVE_FILE operation only operates on NTFS volumes with a cluster size of 4K or less. NTFS format defaults to cluster sizes of 4K or less, so volumes with cluster sizes larger than 4K are rare.
Defragmentation DeviceIoControl Operations
The following table lists the defragmentation structures and the associated DeviceIoControl operations.