September 1999
The Windows 2000 Change Journal is a database that contains a list of every change made to the files or directories on an NTFS 5.0 volume. Each volume has its own Change Journal database that contains records reflecting the changes occurring to that volume's files and directories.
This article assumes you're familiar with Windows NT and the Platform SDK.
Code for this article: ChangeJournal.exe (4KB)
Jeffrey Cooperstein is an author, trainer, and Windows programming consultant. He can be reached at http://www.cooperstein.com. Jeffrey Richter wrote Advanced Windows, Third Edition (Microsoft Press, 1997) and Windows 95: A Developer's Guide (M&T Books, 1995). Jeff is a consultant and teaches Win32 programming courses (http://www.solsem.com). He can be reached at http://www.JeffreyRichter.com.
Implementation Details
The Change Journal is actually a special file on an NTFS volume. The system hides this file so that you cannot view it using familiar tools like the Explorer and the CMD shell. Whenever the file system makes a change to a file or directory, it appends a record to the journal. The record identifies the file name, the time of the change, and what type of change occurred. The actual data that changed is not kept in the journal, so don't get your hopes up about being able to roll back changes. Leaving the data out keeps the size of the journal as small as possible.
The Change Journal is initially an empty file on the disk volume. As changes occur to the volume, records are appended to the end of this file. Each record is assigned a 64-bit identifier called an Update Sequence Number (USN). When Microsoft was first developing the Change Journal, it was internally called the USN Journal. That's why the structures and defines in the winioctl.h header file refer to the Change Journal as the USN Journal. USNs are generated in increasing order, so you can compare USNs to find out the order of events (lower USNs are older events). USNs are not contiguous, so it's possible that the first USN record might be 0 and the second USN record might be 128.
The Change Journal always writes new records to the end of the file, so the implementors chose to use the file offset of a record as its USN. This makes querying the journal fast, since the system can simply seek to the desired record using the USN. Since records include a file name, they vary in length, so you'll notice varying distances between USNs of adjacent records. A typical record might be 100 bytes long. For performance, the system writes to the journal in 4KB blocks (as defined by USN_PAGE_SIZE in winioctl.h), each of which holds a group of 30 or 40 records. The system will not allow a single record to span the boundaries of a page, so you'll sometimes see a gap in USNs where empty space was used to pad the end of a block.
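The padding rule can be illustrated with a little arithmetic. The following is a minimal sketch, not the file system's actual allocation code: the function name place_record and the constant JOURNAL_PAGE_SIZE (standing in for USN_PAGE_SIZE) are our own, and any alignment the system may apply to records is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define JOURNAL_PAGE_SIZE 4096u  /* stands in for USN_PAGE_SIZE (assumed 4KB) */

/* Returns the USN (file offset) at which a record of record_len bytes
   would be placed, given the offset of the first free byte. A record
   that would cross a page boundary is moved to the start of the next
   page, which leaves a gap in the USN sequence. */
uint64_t place_record(uint64_t next_free, uint32_t record_len)
{
    assert(record_len > 0 && record_len <= JOURNAL_PAGE_SIZE);
    uint64_t page_end = (next_free / JOURNAL_PAGE_SIZE + 1) * JOURNAL_PAGE_SIZE;
    if (next_free + record_len > page_end)
        return page_end;   /* pad the rest of this page, start a new one */
    return next_free;
}
```

With these assumptions, a 100-byte record arriving when the first free byte is at offset 4000 would receive USN 4096 rather than 4000, since it cannot fit in the 96 bytes left on the page.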
On an NTFS volume, file and directory information is stored in the Master File Table (MFT). Each record in the MFT describes a file or directory's name, location, size, attributes, and more. With NTFS 5.0, each file's MFT entry records the Last USN generated for that file. This is also true for directories. As records are appended to the Change Journal, the file system updates the MFT's Last USN value for the changed file or directory. In our next article, we'll show how this information is useful with a technique that can quickly scan the MFT for all files that changed over a range of time.
If the journal file gets too big (as defined by the MaximumSize parameter), the system will purge the oldest records at the start of the file. Traditionally, truncating data at the start of a file requires lots of file I/O. The end of the file must be copied to a new location, which is a time-consuming task. Fortunately, NTFS 5.0 supports sparse files, a mechanism that allows unneeded portions of a file to be deleted while retaining the logical offsets of the remaining data. The Change Journal is a sparse file, allowing the purging of records without any performance penalty. In addition, remaining records can still be quickly located using the USN since they remain at the same logical offset. For more information on sparse files see the article "A File System for the 21st Century: Previewing the Windows NT 5.0 File System" by Jeffrey Richter and Luis Felipe Cabrera in the November 1998 issue of MSJ.
A Change Journal can be disabled on a given volume, preventing the system from logging file and directory changes. By default, an NTFS volume will have its Change Journal disabled. Some application must explicitly activate the journal. Also note that any application can activate or disable the volume's journal at any time. An application must be able to gracefully handle the situation in which another application disables the journal while the first is still using it. We'll describe how applications can handle this in a future article. When an application disables the Change Journal for a volume, the system will also purge any existing records to prevent recovery of the information. This prevents applications from inadvertently reading unreliable records. The journal will only contain records as long as the journal is continuously active.
In the current implementation of the Change Journal, the journal file on disk is actually deleted when the Change Journal is disabled. A new journal file is created the next time an application activates the Change Journal. Although applications should not care about this implementation detail, it is why the terms "creating" and "deleting" the journal are used in the Platform SDK. We prefer to think of a Change Journal as being active or disabled since it describes the Change Journal as a service provided by the system. Terms such as "create" and "delete" are useful when trying to understand the implementation of the Change Journal as a file on disk. We've found that thinking about the Change Journal as active or disabled helps in understanding how it is used by applications.
Change Journals are assigned a unique 64-bit Journal ID (not to be confused with a USN number). The system will change a journal's ID when there is a chance that file or directory changes were not logged in the journal. For example, if a volume's journal is disabled, then activated, the Journal ID will be changed. As long as the Journal ID does not change, applications can be assured that the Change Journal has recorded every file and directory change. Even if the system is rebooted, the Journal ID will typically not need to change. In other words, if the Journal ID does not change after a reboot, the system has recorded all file and directory changes throughout the shutdown and boot sequence. Observant developers may discover that Journal IDs are actually standard 64-bit UTC time stamps generated from the system time. Applications should not derive any meaning from this (and remember, Microsoft may change how Journal IDs are generated before Windows 2000 ships).
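An application that saves its position between runs can use the Journal ID to decide whether that position is still trustworthy. The sketch below is a minimal illustration under our own assumptions: the structures saved_state and journal_info and the function can_resume are hypothetical, and the values in journal_info are assumed to come from whatever query the application performs against the volume. The purge check is deliberately conservative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t journal_id;  /* Journal ID seen when the state was saved */
    int64_t  last_usn;    /* last USN the application processed */
} saved_state;

typedef struct {
    uint64_t journal_id;  /* current Journal ID of the volume */
    int64_t  first_usn;   /* oldest record still in the journal */
    int64_t  next_usn;    /* USN the next record will receive */
} journal_info;

/* Returns 1 if processing can resume after saved->last_usn, or 0 if the
   application must assume it missed changes and rescan the volume. */
int can_resume(const saved_state *saved, const journal_info *now)
{
    assert(saved != NULL && now != NULL);
    if (saved->journal_id != now->journal_id)
        return 0;  /* journal was disabled or recreated: changes may be missing */
    if (saved->last_usn < now->first_usn)
        return 0;  /* records following last_usn may have been purged */
    if (saved->last_usn > now->next_usn)
        return 0;  /* saved position does not belong to this journal */
    return 1;
}
```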
Windows NT 4.0 Service Pack 4 provides limited access to NTFS 5.0 volumes. Unfortunately, the Change Journal cannot be accessed and changes to the volume will not be recorded. On dual boot systems, a boot to Windows NT 4.0 will cause all Journal IDs to be changed when Windows 2000 is restarted. Again, this allows applications running on Windows 2000 to know that they may have missed some file or directory changes.
Usage
All features of the Change Journal are accessed via the DeviceIoControl function.
|
The first parameter is a handle to a file, directory, or device obtained by calling CreateFile. DeviceIoControl is a common method used to pass device-specific requests to the driver managing hDevice. The parameter dwIoControlCode specifies what operation to perform and defines the structure of input/output buffers. If CreateFile is called with FILE_FLAG_OVERLAPPED, DeviceIoControl will operate asynchronously in the same way as ReadFile/WriteFile.
The NTFS driver manages the Change Journal. To communicate with a volume about its Change Journal, call DeviceIoControl with a handle to the volume. Call CreateFile as shown to get a volume's handle: |
|
Access to this volume handle is restricted to the system and members of the Administrators group, so typical users will not be able to run Change Journal applications. This means that these applications will most likely be services or utilities run by administrators.
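Opening a volume handle might look like the following minimal sketch. It assumes the volume is addressed by drive letter through the \\.\X: device path; the helper names format_volume_path and open_volume are hypothetical, and the access and sharing flags are one reasonable choice rather than the only valid one. The Win32 call is guarded so that the path helper can be compiled and tested on any platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Builds the device path for a volume, for example 'C' gives \\.\C:
   Returns 0 on success, -1 for an invalid letter or a too-small buffer. */
int format_volume_path(char drive, char *out, size_t out_size)
{
    assert(out != NULL);
    int is_letter = (drive >= 'A' && drive <= 'Z') ||
                    (drive >= 'a' && drive <= 'z');
    if (!is_letter || out_size < 7)   /* 6 characters plus terminator */
        return -1;
    snprintf(out, out_size, "\\\\.\\%c:", drive);
    return 0;
}

#ifdef _WIN32
#include <windows.h>

/* Opens the volume for Change Journal requests. Returns
   INVALID_HANDLE_VALUE on failure, for example when the caller
   is not an administrator. */
HANDLE open_volume(char drive)
{
    char path[8];
    if (format_volume_path(drive, path, sizeof path) != 0)
        return INVALID_HANDLE_VALUE;
    return CreateFileA(path, GENERIC_READ | GENERIC_WRITE,
                       FILE_SHARE_READ | FILE_SHARE_WRITE,
                       NULL, OPEN_EXISTING, 0, NULL);
}
#endif
```

The returned handle is what would be passed as the first parameter to DeviceIoControl; it should be released with CloseHandle when no longer needed.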
The control codes that are supported for Change Journals are documented in the Platform SDK. They can be located in the index, but are not listed directly in the documentation for DeviceIoControl. The best way to locate the documentation is to search for "Change Journal."
Change Journal Statistics
Change Journal Records
|
|
Applications will never have to fill in this structure. Instead, the system populates an output buffer with USN_RECORD structures when an application reads from the journal.
RecordLength is the total length of the record, in bytes, including the file name. Multiple records will be provided in an output buffer, so RecordLength should be used to calculate the location of the next record.
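Stepping from one record to the next by RecordLength can be sketched as follows. This is a minimal, portable illustration that treats the buffer as raw bytes and relies only on RecordLength being the first four bytes of every record; the function names are our own, and real code would normally use the USN_RECORD definition from winioctl.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads the RecordLength stored in the first four bytes of a record. */
static uint32_t record_length(const unsigned char *record)
{
    uint32_t len;
    memcpy(&len, record, sizeof len);
    return len;
}

/* Counts the records packed into buf[0..size). Stops early if a
   RecordLength is zero or runs past the end of the buffer. */
size_t count_records(const unsigned char *buf, size_t size)
{
    assert(buf != NULL);
    size_t count = 0, pos = 0;
    while (size - pos >= sizeof(uint32_t)) {
        uint32_t len = record_length(buf + pos);
        if (len == 0 || len > size - pos)
            break;          /* malformed or truncated record */
        count++;
        pos += len;         /* next record starts RecordLength bytes later */
    }
    return count;
}
```

The guard against a zero or oversized RecordLength is our addition; it keeps a damaged buffer from causing an endless loop or a read past the end.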
|
MajorVersion and MinorVersion
It is easy to ignore the importance of version checking, but even easier to make a careless error that will infuriate users. Anyone who installed software on Windows NT 4.0 and received the message "Requires Windows NT 3.5 or greater" will testify to the disasters caused by the misuse of the GetVersion function! GetVersionEx was added to help clarify the versioning mess for developers, but even that was not enough. Windows 2000 has added VerifyVersionInfo to provide an even safer method for what should be a simple procedure. For the sake of this article, we don't really care about what version of Windows is running, but the Change Journal has its own version control. There aren't any fancy functions to help you out, so it's all the more important that you take the time to understand this information. (We only mention VerifyVersionInfo as a public service announcement. If you want more information, see the current Platform SDK documentation.)
The initial release of Windows 2000 is expected to use version 2.0 Change Journal records (MajorVersion is 2, MinorVersion is 0). As we are writing this article, the Platform SDK contains only the version 2.0 definition of the USN_RECORD structure (defined in winioctl.h). Your application is responsible for knowing the version of the structure that was used at compile time. Winioctl.h does not currently provide any defined constants that have this information, so the best bet is to look in this header file for comments. For safety, it is a good idea to create your own compile-time constants and perform a runtime check to verify that newer structure definitions were not inadvertently included.
|
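Such constants and a matching runtime check might be sketched as shown here. The names MY_USN_MAJOR, MY_USN_MINOR, usn_version_check, and check_record_version are our own inventions for illustration; as noted, winioctl.h does not supply anything equivalent.

```c
#include <stdint.h>

/* Version of the USN_RECORD definition this program was compiled
   against. These names are hypothetical; winioctl.h does not define them. */
#define MY_USN_MAJOR 2
#define MY_USN_MINOR 0

typedef enum {
    USN_VERSION_EXACT,          /* same major and minor version */
    USN_VERSION_MINOR_DIFFERS,  /* same major: shared leading members usable */
    USN_VERSION_UNUSABLE        /* different major: only the header is usable */
} usn_version_check;

/* Classifies the version found in a journal record against the
   version this program was built for. */
usn_version_check check_record_version(uint16_t major, uint16_t minor)
{
    if (major != MY_USN_MAJOR)
        return USN_VERSION_UNUSABLE;
    if (minor != MY_USN_MINOR)
        return USN_VERSION_MINOR_DIFFERS;
    return USN_VERSION_EXACT;
}
```

An application would pass in the MajorVersion and MinorVersion members of the first record it reads and refuse to interpret the remaining members when the result is USN_VERSION_UNUSABLE.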
At runtime, applications examine the MajorVersion and MinorVersion of journal records to determine compatibility with the information. If a change in MajorVersion is detected, the USN_RECORD structure has changed dramatically and the only members you can still use are RecordLength, MajorVersion, and MinorVersion. Unfortunately, the system does not provide any ability to negotiate a compatible version at runtime. In other words, if the system fills an output buffer with records using a different MajorVersion than expected, the information cannot be used at all! Change Journal records with a MajorVersion of 1 existed on earlier betas of Windows 2000, but they are no longer supported.
If a change in the MinorVersion is detected, new members have been added after the penultimate member of the older structure. Applications can assume that USN_RECORD structure members are valid up to the penultimate member of the older version. For example, consider the hypothetical version 2.3 USN_RECORD structure shown in Figure 3. If an application is compiled with today's version 2.0 USN_RECORD, it can still examine a memory buffer filled with the hypothetical version 2.3 USN_RECORD. It can reference all the members up to and including the FileNameOffset member. (We'll discuss the proper way to access FileName later.) On the other hand, imagine an application is compiled using version 2.3 USN_RECORD. If an output buffer has version 2.1 records, the version 2.3 USN_RECORD structure can still be used for members up to and including the ExtraInfo1 member (the penultimate member of version 2.1).
Even though the record version information is provided in every record, an application only has to check it once each time it is started. The version number will not vary between volumes on the same physical machine, and will only change during a system reboot after a service pack is installed with new Change Journal software.
Does this sound like a lot of work? Maybe, but consider the consequences if you incorrectly read a buffer provided by the system. Since most likely your software will be running as a service, an access violation will bring down the service! Fortunately, only version 2.0 structures currently exist.
FileNameLength, FileNameOffset, and FileName
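Locating the name through FileNameOffset and FileNameLength, rather than through a fixed structure member, can be sketched as follows. This is a portable illustration under stated assumptions: usn_record_v2_mirror is our own stand-in for the fixed part of the version 2.0 USN_RECORD (real code should include winioctl.h), the name is assumed to be stored as 16-bit characters without a terminator, FileNameLength is assumed to be in bytes, and copy_record_name is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the fixed part of a version 2.0 USN_RECORD. */
typedef struct {
    uint32_t RecordLength;
    uint16_t MajorVersion;
    uint16_t MinorVersion;
    uint64_t FileReferenceNumber;
    uint64_t ParentFileReferenceNumber;
    int64_t  Usn;
    int64_t  TimeStamp;
    uint32_t Reason;
    uint32_t SourceInfo;
    uint32_t SecurityId;
    uint32_t FileAttributes;
    uint16_t FileNameLength;   /* in bytes, not characters */
    uint16_t FileNameOffset;   /* from the start of the record */
} usn_record_v2_mirror;

#define USN_V2_HEADER_BYTES 60  /* bytes covered by the members above */

/* Copies the record's name into out and terminates it. Returns the
   number of 16-bit characters copied, or -1 on a bad record or a
   too-small output buffer. */
int copy_record_name(const void *record, uint16_t *out, size_t out_chars)
{
    usn_record_v2_mirror hdr;
    assert(record != NULL && out != NULL);
    memcpy(&hdr, record, USN_V2_HEADER_BYTES);
    size_t chars = hdr.FileNameLength / sizeof(uint16_t);
    if ((uint32_t)hdr.FileNameOffset + hdr.FileNameLength > hdr.RecordLength)
        return -1;              /* name would run past the record */
    if (chars + 1 > out_chars)
        return -1;              /* no room for name plus terminator */
    memcpy(out, (const unsigned char *)record + hdr.FileNameOffset,
           chars * sizeof(uint16_t));
    out[chars] = 0;
    return (int)chars;
}
```

Using the offset stored in the record keeps the code correct even if a later minor version inserts members before the name.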
FileReferenceNumber and ParentFileReferenceNumber
|
|
Unfortunately, the function PathFromParentFRN does not exist. In fact, there is no currently exposed API that directly converts an FRN to a full path. A large portion of our next article will be devoted to doing just this.
You may now be wondering about the FileReferenceNumber member. If we could convert this FRN to a full path, it would be the full path of the record we are trying to find (and we would never need to discuss FileNameOffset, FileNameLength, or ParentFileReferenceNumber). It turns out that finding the full path from a directory FRN is much easier than finding the full path from a file FRN. The FileReferenceNumber may be either a file or directory FRN (depending on whether the record describes a change to a file or directory), but the ParentFileReferenceNumber will always be a directory FRN. Because of this, the easiest way to find the full path of a record is to examine the ParentFileReferenceNumber and append the name using the FileNameOffset and FileNameLength members.
Usn, TimeStamp, and Reason
|
|
There is no input buffer, and the output buffer will be filled with sizeof(USN) bytes of data representing the USN of the generated close record. When this is done, the system immediately writes a record to the journal with the accumulated reason codes and the USN_REASON_CLOSE code, but it does not actually close the file. The Reason variable is reset to zero, and it will start accumulating changes all over again. If the Reason variable is zero when FSCTL_WRITE_USN_CLOSE_RECORD is used, it will still generate a journal record; this means you will see a record with only the USN_REASON_CLOSE code.
The only reason code that does not follow the previous rules is the USN_REASON_RENAME_OLD_NAME code. When a file is renamed, two records are added to the journal. First, the USN_REASON_RENAME_OLD_NAME code is added to the Reason variable, and a record is created. The members FileNameOffset and FileNameLength will specify the original name, and ParentFileReferenceNumber will specify the original directory. (Moving a file or directory to another location on the same volume is considered a rename.) Next, the USN_REASON_RENAME_OLD_NAME flag is removed from the Reason variable and replaced with USN_REASON_RENAME_NEW_NAME. A second record is generated with the new file name and new ParentFileReferenceNumber. Up through the next close record for the file or directory, the Reason member will continue to have the USN_REASON_RENAME_NEW_NAME code, but not the USN_REASON_RENAME_OLD_NAME code. The FileReferenceNumber of a file or directory will not change if it is renamed or moved to another location on the same volume. Suppose you rename and move the file D:\dir1\before.txt to D:\dir2\after.txt with the command: |
|
You'll see the following three records in the journal: |
|
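The sequence of Reason values for a rename followed by a close can be modeled in a few lines. This is a toy model of the bookkeeping described above, not system code: open_file_model and the model_ functions are hypothetical, and the three constants repeat the values we understand winioctl.h to assign to these reason codes.

```c
#include <stdint.h>

#define USN_REASON_RENAME_OLD_NAME 0x00001000u
#define USN_REASON_RENAME_NEW_NAME 0x00002000u
#define USN_REASON_CLOSE           0x80000000u

/* The per-file Reason variable that accumulates codes until close. */
typedef struct { uint32_t reason; } open_file_model;

/* Adds a reason code and returns the accumulated Reason value. */
uint32_t model_change(open_file_model *f, uint32_t code)
{
    f->reason |= code;
    return f->reason;
}

/* Models a rename: out[0] receives the Reason of the old-name record,
   out[1] the Reason of the new-name record. */
void model_rename(open_file_model *f, uint32_t out[2])
{
    out[0] = model_change(f, USN_REASON_RENAME_OLD_NAME);
    f->reason &= ~USN_REASON_RENAME_OLD_NAME;   /* old-name flag is dropped */
    out[1] = model_change(f, USN_REASON_RENAME_NEW_NAME);
}

/* Models the close record: accumulated reasons plus CLOSE, then reset. */
uint32_t model_close(open_file_model *f)
{
    uint32_t r = f->reason | USN_REASON_CLOSE;
    f->reason = 0;
    return r;
}
```

Calling model_rename and then model_close on a fresh model yields three values: the old-name code alone, the new-name code alone, and the new-name code combined with the close code.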
What happens if you rename a directory that has hundreds of files and subdirectories? Say you rename D:\Program Files to D:\Pfiles. The system will only generate the following three records: |
|
There is no need to create records for all the child files or directories since this information can be inferred by following the ParentFileReferenceNumber. For just this reason, you'll find that maintaining a database of files and directories is easier if entries are stored as a name and parent ID. The main drawback occurs when you try to monitor a specific file; you need to monitor all of its parent directories up to the root directory on the volume or you might miss a move or rename.
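A database keyed by name and parent ID can be sketched as a small table. The example below is a minimal in-memory illustration: db_entry, find_entry, and build_path are hypothetical names, a parent value of 0 is our own convention for marking the volume root (real FRNs do not work this way), and the table is assumed to contain no cycles.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    uint64_t frn;         /* this entry's file reference number */
    uint64_t parent_frn;  /* FRN of the containing directory; 0 marks the root */
    char     name[32];
} db_entry;

static const db_entry *find_entry(const db_entry *db, size_t n, uint64_t frn)
{
    for (size_t i = 0; i < n; i++)
        if (db[i].frn == frn)
            return &db[i];
    return NULL;
}

/* Writes the full path of frn into out by walking parent links up to
   the root. Returns 0 on success, -1 on an unknown FRN or a
   too-small buffer. */
int build_path(const db_entry *db, size_t n, uint64_t frn,
               char *out, size_t out_size)
{
    const db_entry *e = find_entry(db, n, frn);
    if (e == NULL)
        return -1;
    if (e->parent_frn == 0) {               /* root entry, such as "D:" */
        if (strlen(e->name) + 1 > out_size)
            return -1;
        strcpy(out, e->name);
        return 0;
    }
    if (build_path(db, n, e->parent_frn, out, out_size) != 0)
        return -1;
    size_t used = strlen(out);
    if (used + 1 + strlen(e->name) + 1 > out_size)
        return -1;
    snprintf(out + used, out_size - used, "\\%s", e->name);
    return 0;
}
```

Because each entry stores only its own name, handling the directory rename amounts to changing one name field; the paths of all children follow automatically the next time they are built.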
When a directory is deleted, you do not have to worry about inferring what child files or directories are affected. The system will not allow a directory to be deleted if it has any children. If you delete a whole tree in Explorer, you'll see delete records for all the children before the delete record for any directory.
It is important to understand that the Change Journal does not provide a superset of the Change Notification functionality provided through functions like FindFirstChangeNotification or ReadDirectoryChangesW. The Change Journal is designed to report all explicit actions on files or directories. Not all side effects are reported in the journal. For example, if an application calls the SetFileTime function, the Change Journal will report a Basic Information Change. However, if an application writes to a file, the Change Journal reports only the Data Overwrite (the explicit action), but not the time stamp change (the side effect). In a similar scenario, when a file or directory is created, the change to the parent directory's time stamp is not reported in the Change Journal. The Change Notification APIs, on the other hand, are designed to report all changes that they are aware of, even if it is the side effect of some other action.
SourceInfo, SecurityId, and FileAttributes
Reading from the Change Journal
|
|
To read some records, we call DeviceIoControl with the FSCTL_READ_USN_JOURNAL code. The input buffer must point to the following structure: |
|
Set StartUsn to the USN of the first record you want to read. It must be either zero, the USN of an existing record in the journal, or the USN of the next record that will be written to the journal. If StartUsn is zero, the system will start reading from the first record available. If StartUsn is the USN of an existing record, the system will start reading at that location. If it's the USN of the next record that will be written (such as ujd.NextUsn), the system waits for more data to appear in the journal, as specified by the Timeout/BytesToWaitFor members we'll describe later.
Since there is no way to know if the record identified by StartUsn will match the filter criteria (see our discussion of ReasonMask/ReturnOnlyOnClose), the output buffer may not contain that specific record. Applications must examine the Usn member of returned USN_RECORD structures to find out the USNs of the records actually returned. Since the system writes to the journal in 4KB blocks (USN_PAGE_SIZE), all 4KB aligned values from ujd.FirstUsn to ujd.NextUsn are guaranteed to be the USN of a record in the journal. Therefore, these are valid values for StartUsn. Other than that, the only way to get a valid value for StartUsn is through Change Journal APIs that return USNs.
ReasonMask and ReturnOnlyOnClose
|
|
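The two filter members in this section combine as a logical AND: a record must carry at least one reason code named in ReasonMask, and, when ReturnOnlyOnClose is nonzero, it must also carry USN_REASON_CLOSE. The following minimal sketch expresses that rule as a predicate. The function passes_filter is our own illustration of the rule and is not the ReturnRecord function from the article's figures; the constant repeats the value we understand winioctl.h to assign to the close code.

```c
#include <stdint.h>

#define USN_REASON_CLOSE 0x80000000u

/* Returns 1 if a record with the given Reason would be placed in the
   output buffer under the given filter settings, 0 otherwise. */
int passes_filter(uint32_t reason, uint32_t reason_mask,
                  int return_only_on_close)
{
    if ((reason & reason_mask) == 0)
        return 0;  /* none of the requested reason codes is present */
    if (return_only_on_close && (reason & USN_REASON_CLOSE) == 0)
        return 0;  /* caller asked for close records only */
    return 1;
}
```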
ReturnOnlyOnClose is another member that allows you to filter which records will be put in the output buffer. If this value is nonzero, only records with the USN_REASON_CLOSE code will be returned. This works in tandem with the ReasonMask member (both conditions must be satisfied). To retrieve just the close records, set ReasonMask to reason codes of interest, and ReturnOnlyOnClose to 1. The system will return just close records, and only close records with one or more of the reason codes specified by ReasonMask. The ReturnRecord function really looks like Figure 8.
Timeout and BytesToWaitFor
UsnJournalID
|
|
The USN returned at the start of the buffer is the USN of the next record following the last record returned. This is used to walk journal records without knowing exactly how much space is required. Use this USN as StartUsn on the next call to DeviceIoControl with the FSCTL_READ_USN_JOURNAL code. Figure 10 shows how to get all the data between two USNs, as well as how to walk the records in the output buffer.
Figure 9 Output Buffer Data
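Separating the leading USN from the records that follow it can be sketched as shown below. This is a minimal, portable illustration that treats the output buffer as raw bytes; split_output and its parameter names are hypothetical, and a USN is represented as a signed 64-bit integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Splits an FSCTL_READ_USN_JOURNAL output buffer into its leading USN
   and the record bytes that follow. Returns 0 on success, -1 if fewer
   bytes were returned than a USN occupies. */
int split_output(const unsigned char *buf, size_t bytes_returned,
                 int64_t *next_start_usn,
                 const unsigned char **records, size_t *records_size)
{
    assert(buf && next_start_usn && records && records_size);
    if (bytes_returned < sizeof(int64_t))
        return -1;
    /* The first eight bytes are the USN to pass as StartUsn next time. */
    memcpy(next_start_usn, buf, sizeof(int64_t));
    *records = buf + sizeof(int64_t);
    *records_size = bytes_returned - sizeof(int64_t);
    return 0;
}
```

When records_size comes back as zero, the call returned only the USN and no records, which a reading loop can treat as a signal that it has caught up.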
The code in Figure 10 should be used to read records that are known to exist in the journal. The usnStart and usnEnd parameters should be between or equal to the values FirstUsn and NextUsn determined by FSCTL_QUERY_USN_JOURNAL.
The Sample Application
What's Coming Up
|
For related information see the NTFS File System page at http://msdn.microsoft.com/library/psdk/cossdk/pgservices_events_2y9f.htm. Also check http://msdn.microsoft.com for daily updates on developer programs, resources and events.
From the September 1999 issue of Microsoft Systems Journal.