Keeping an Eye on Your NTFS Drives, Part II: Building a Change Journal Application -- MSJ, October 1999

Jeffrey Cooperstein and Jeffrey Richter

The Change Journal can be either active or disabled on an NTFS volume. If the Change Journal is active, the DeviceIoControl function will return TRUE. Any app that wants to use the Change Journal can activate it if it is disabled. The system can perform this operation very quickly.

This article assumes you're familiar with Windows NT, Platform SDK

Code for this article: ChangeJournal2.exe (22KB)

Jeffrey Cooperstein is an author, trainer, and Windows programming consultant. He can be reached at www.cooperstein.com. Jeffrey Richter wrote Advanced Windows, Third Edition (Microsoft Press, 1997) and Windows 95: A Developer's Guide (M&T Books, 1995). Jeff is a consultant and teaches Win32 programming courses (www.solsem.com). He can be reached at www.JeffreyRichter.com.

Last month (September 1999) we introduced you to the new Change Journal functionality of Windows® 2000. We explained its implementation, and provided enough information to dump the contents of the Change Journal on an NTFS 5.0 volume. This month we'll cover the rest of what you need to know to write a fully functional Change Journal application. The sample application, CJTest, will demonstrate all the techniques described in this article, and can be used as a template for your own application.

Change Journal States
      Last month we explained that the Change Journal could be either active or disabled on each NTFS volume. The FSCTL_QUERY_USN_JOURNAL code was used with the DeviceIoControl function to determine the state of the journal. If the Change Journal is active, the DeviceIoControl function will return TRUE. If the DeviceIoControl function returns FALSE, the GetLastError function will provide additional information about the state of the journal. If the GetLastError function returns ERROR_JOURNAL_NOT_ACTIVE, an application can activate the Change Journal using DeviceIoControl, passing the FSCTL_CREATE_USN_JOURNAL code. The input buffer must point to the following structure:
typedef struct {
    DWORDLONG MaximumSize;
    DWORDLONG AllocationDelta;
} CREATE_USN_JOURNAL_DATA, *PCREATE_USN_JOURNAL_DATA;
      The MaximumSize member is the maximum number of bytes the journal should use on the volume. It should be some small percentage of the size of the volume, and is limited to 4GB. For example, a reasonable size for a 9GB drive is an 8MB Change Journal. The AllocationDelta member specifies the number of bytes by which the journal file grows when it needs more space. This should be a whole multiple of the volume's cluster size, and approximately one-eighth to one-quarter the value of MaximumSize. The AllocationDelta is also the number of bytes that will be purged from the start of the file if the file grows past MaximumSize. The system is lazy about enforcing the MaximumSize of a journal: the journal can temporarily grow past this limit, but eventually the system will purge the old records.
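      The sizing guidance above can be turned into a small calculation. The following is a minimal sketch, not code from the sample application: choose_allocation_delta is a hypothetical helper that takes one-eighth of MaximumSize and rounds it up to a whole number of clusters. The cluster size is assumed to come from elsewhere (for example, from GetDiskFreeSpace).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the Win32 API): picks an
   AllocationDelta of roughly one-eighth of max_size, rounded up
   to a whole number of clusters. cluster_size must be nonzero. */
uint64_t choose_allocation_delta(uint64_t max_size, uint32_t cluster_size)
{
    assert(cluster_size != 0);

    uint64_t target = max_size / 8;
    uint64_t clusters = (target + cluster_size - 1) / cluster_size;

    if (clusters == 0)
        clusters = 1;   /* never return a zero delta */

    return clusters * cluster_size;
}
```

      For the 8MB journal mentioned above and a 4KB cluster, this yields a 1MB AllocationDelta. One-eighth is the low end of the suggested range; an application that expects bursts of activity might prefer one-quarter.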
      If a Change Journal already exists on a volume, calls to FSCTL_CREATE_USN_JOURNAL will still succeed, and it will update the journal's MaximumSize and AllocationDelta parameters. This allows you to expand the number of records that an active journal maintains without having to disable it.
      Any application that wants to use the Change Journal can activate it if it is disabled. Fortunately, the system can perform this operation very quickly. On the other hand, disabling (or deleting) an active journal can be quite time-consuming because the system must walk all the records in the master file table (MFT) and set the Last USN attribute to zero. This process can take several minutes, and will continue across reboots of the system if necessary. During this process, the journal is not considered active, but it is also not disabled (see Figure 1).

Figure 1 Possible Change Journal States

      While the system is disabling the journal, it cannot be accessed and all journal operations return ERROR_JOURNAL_DELETE_IN_PROGRESS. Applications should never disable an active journal because it will adversely affect other applications using the journal. However, if you really want to disable a journal you can do it by calling DeviceIoControl, passing the FSCTL_DELETE_USN_JOURNAL code to it. The following structure must also be supplied:
typedef struct {
    DWORDLONG UsnJournalID;
    DWORD DeleteFlags;
} DELETE_USN_JOURNAL_DATA, *PDELETE_USN_JOURNAL_DATA;
UsnJournalID should be zero unless the USN_DELETE_FLAG_DELETE flag is specified in the DeleteFlags member—in this particular case, it must be set to the ID of the active journal.
      The DeleteFlags member can contain USN_DELETE_FLAG_DELETE, USN_DELETE_FLAG_NOTIFY, or both. If the USN_DELETE_FLAG_DELETE flag is specified alone, the system will disable the active journal. DeviceIoControl will return immediately; it does not wait for the journal to be disabled. The journal must be active and UsnJournalID must be set to its ID or the call will fail. If the USN_DELETE_FLAG_NOTIFY flag is specified in addition to USN_DELETE_FLAG_DELETE, DeviceIoControl returns only after the journal is fully disabled.
      The USN_DELETE_FLAG_NOTIFY flag can be specified alone to wait for the journal to become available after receiving ERROR_JOURNAL_DELETE_IN_PROGRESS from any of the Change Journal functions. The following code fragment illustrates how you can wait for a journal to be disabled:
// Call this if you get ERROR_JOURNAL_DELETE_IN_PROGRESS
void WaitForJournalAvailablity(HANDLE hcj, LPOVERLAPPED po) {
    DWORD cb;
    DELETE_USN_JOURNAL_DATA dujd = { 0, USN_DELETE_FLAG_NOTIFY };

    // If the journal is active, this call will still succeed
    // and return immediately. It will not disable the journal.
    BOOL fOk = DeviceIoControl(hcj, FSCTL_DELETE_USN_JOURNAL,
        &dujd, sizeof(dujd), NULL, 0, &cb, po);

    // Wait for asynchronous I/O to complete if needed
    if (!fOk && (ERROR_IO_PENDING == GetLastError()) && (po != NULL))
        GetOverlappedResult(hcj, po, &cb, TRUE);
}
      Be aware that if the USN_DELETE_FLAG_NOTIFY flag is used with asynchronous I/O, DeviceIoControl might return while the journal is still disabling. In this case, DeviceIoControl returns FALSE, GetLastError returns ERROR_IO_PENDING, and any journal operations will return ERROR_JOURNAL_DELETE_IN_PROGRESS. You can wait on the device handle or the OVERLAPPED structure's event handle to become signaled, indicating that the journal is now available.
      The fact that applications cannot gain exclusive access to the Change Journal has some interesting repercussions. Consider the following pseudocode, which disables the journal and waits for it to become available:
// Pseudocode to guarantee the journal is disabled
void DisableAndWait() {
    // See if the journal is active
    if (QueryUsnJournal() returns some error)
        return;  // if Query fails, the journal is disabled

    // The query succeeded, it must have given us an ID.
    // Disable the journal using the ID, and wait for it to finish
    DisableUsnJournal(id,
        USN_DELETE_FLAG_DELETE | USN_DELETE_FLAG_NOTIFY);
}
      There are two mistakes in this code. First, QueryUsnJournal could return ERROR_JOURNAL_DELETE_IN_PROGRESS. In this case, DisableUsnJournal must be called using only the USN_DELETE_FLAG_NOTIFY flag in order to wait for the journal to be available. The second mistake is that the DisableUsnJournal call itself may return ERROR_JOURNAL_DELETE_IN_PROGRESS. Imagine the case where QueryUsnJournal successfully returns the active journal's ID, but another application immediately starts to disable it. The call to DisableUsnJournal will return ERROR_JOURNAL_DELETE_IN_PROGRESS, and the journal will not be available. The following pseudocode will be more successful:
void DisableAndWait() {
    DWORD dwErr = QueryUsnJournal();
    if (dwErr is success) {
        // A journal exists and we now have its ID.
        // Tell the system to disable it
        DisableUsnJournal(id, USN_DELETE_FLAG_DELETE);
    } else if (dwErr is not ERROR_JOURNAL_DELETE_IN_PROGRESS)
        return;  // the journal is disabled with no delete pending

    // Either we tried to disable the journal or it is
    // in the disabling state (delete-in-progress).
    // In either case, wait for it to finish. Use journal ID=0
    DisableUsnJournal(0, USN_DELETE_FLAG_NOTIFY);
}
      As you can see, a simple operation like disabling the journal is still not straightforward. Even the pseudocode is difficult to read. Consider everything an application has to do when it starts—open the volume handle, wait if the journal is currently being deleted, activate the journal if it is disabled, and, finally, query for the journal's information. Even worse, at any step of the way, another application can be trying to activate or disable the journal. To simplify things, it helps to create a single function that doesn't return until the journal is active.
      The flow chart in Figure 2 illustrates a robust process that will fill in a USN_JOURNAL_DATA structure. It is robust since it will do everything possible to ensure that the journal is active.
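      The startup sequence described above (query, wait out a pending delete, activate if disabled, query again) can be sketched as a small retry loop. This is one possible reading of that sequence, not a transcription of the flow chart. All names here are hypothetical: JournalStatus stands in for the result of a FSCTL_QUERY_USN_JOURNAL call, and SimVolume simulates a volume so the logic can run without a real handle. In a real application the two non-terminal steps would be DeviceIoControl calls using FSCTL_DELETE_USN_JOURNAL (with USN_DELETE_FLAG_NOTIFY) and FSCTL_CREATE_USN_JOURNAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the outcomes of a journal query (hypothetical names). */
typedef enum {
    J_ACTIVE,              /* query succeeded                        */
    J_NOT_ACTIVE,          /* ERROR_JOURNAL_NOT_ACTIVE               */
    J_DELETE_IN_PROGRESS,  /* ERROR_JOURNAL_DELETE_IN_PROGRESS       */
    J_OTHER_ERROR          /* anything else; give up                 */
} JournalStatus;

typedef enum { STEP_DONE, STEP_CREATE, STEP_WAIT, STEP_FAIL } Step;

/* Decide what to do next based on the latest query result. */
Step next_step(JournalStatus status)
{
    switch (status) {
    case J_ACTIVE:             return STEP_DONE;
    case J_NOT_ACTIVE:         return STEP_CREATE;
    case J_DELETE_IN_PROGRESS: return STEP_WAIT;
    default:                   return STEP_FAIL;
    }
}

/* Simulated volume: a real one can change state at any time because
   other applications may create or delete the journal concurrently. */
typedef struct { JournalStatus status; } SimVolume;

static void sim_apply(SimVolume *v, Step step)
{
    if (step == STEP_CREATE && v->status == J_NOT_ACTIVE)
        v->status = J_ACTIVE;
    else if (step == STEP_WAIT && v->status == J_DELETE_IN_PROGRESS)
        v->status = J_NOT_ACTIVE;
}

/* Re-query after every step instead of trusting the step's result,
   since another application may have changed the state in between.
   The attempt limit keeps the loop from running forever. */
bool ensure_journal_active(SimVolume *v, int max_attempts)
{
    assert(v != NULL);
    for (int i = 0; i < max_attempts; i++) {
        Step step = next_step(v->status);   /* the "query" */
        if (step == STEP_DONE) return true;
        if (step == STEP_FAIL) return false;
        sim_apply(v, step);
    }
    return false;
}
```

      The point of the design is that every action is followed by a fresh query. That makes the loop tolerant of the races described earlier, at the cost of one extra query per step.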

Figure 2 Filling a USN_JOURNAL_DATA Structure

New Record Notification
      Last month we showed how to dump existing records in the Change Journal. After processing all available records, an application will want to wait for new records to be available. A simple solution would be to poll the journal's NextUsn parameter using the FSCTL_QUERY_USN_JOURNAL code. Fortunately, there are more elegant solutions that don't require the polling technique.
      The following code shows how to wait for a specified USN to exist in the journal. It does not return the record. It is assumed that the specified USN does not currently exist in the journal, but the function will just return immediately if it does exist.
BOOL WaitForNextUsn(HANDLE hcj, DWORDLONG journalId, USN usn) {
    READ_USN_JOURNAL_DATA rujd;
    rujd.StartUsn = usn;
    rujd.ReasonMask = 0xFFFFFFFF;     // All bits
    rujd.ReturnOnlyOnClose = FALSE;   // All entries
    rujd.Timeout = 0;                 // No timeout
    rujd.BytesToWaitFor = 1;          // Wait for this USN to exist
    rujd.UsnJournalID = journalId;    // The journal we expect to read from

    DWORD cbRead;
    USN usnNext;  // Receives the USN at the start of the output buffer

    // This function does not return until the USN record exists
    BOOL fOk = DeviceIoControl(hcj, FSCTL_READ_USN_JOURNAL,
        &rujd, sizeof(rujd), &usnNext, sizeof(usnNext), &cbRead, NULL);
    return fOk;
}
      An application can obtain the USN of the next record that will be written to the journal by looking at the NextUsn value returned with the FSCTL_QUERY_USN_JOURNAL code. If this is used with the previous code, the function will return as soon as a new record is written to the journal.
      Another way an application can use the WaitForNextUsn function is after it has processed all existing records in the journal. In last month's code, we walked all available records using repetitive calls to DeviceIoControl with the FSCTL_READ_USN_JOURNAL code. We assumed that all records were processed when DeviceIoControl returned only sizeof(USN) bytes in the output buffer. When this happens, the first sizeof(USN) bytes of the output buffer will actually contain the same value as the StartUsn member specified in the input buffer. This means that you've read all available records and the USN returned will be the USN of the next record that will be created. The application can wait for more data to become available by using this USN in a call to the WaitForNextUsn function defined earlier.
      If an application uses the WaitForNextUsn function, it should probably sleep for a short period of time before processing the new records. If another application is performing multiple disk operations, a short delay will allow several journal records to be created before they are processed. Otherwise, the Change Journal application will be competing for system resources with the application making the changes. In addition, the Change Journal application may find that it is processing many small chunks of records as another application is doing its work.
      Another way to avoid this problem is to specify BytesToWaitFor or Timeout values. This allows an application to process journal entries only after a large number of new records are generated (or a specified time has passed). If you used the following values in the function defined earlier,
rujd.Timeout = (DWORDLONG) (-2500000000);  // 25 seconds
rujd.BytesToWaitFor = 16384;               // 16 KB
the function would not return immediately when the specified USN is created. Instead, once the USN exists, the call waits until an additional 16KB of raw journal data is available. Every 25 seconds it also checks whether any records exist past the specified USN and, if so, returns. The BytesToWaitFor and Timeout members are particularly useful if you specify filter conditions with the ReasonMask and ReturnOnlyOnClose members. Instead of processing all new records as they appear in the journal, the system will wait until it has at least one record that passes the filter conditions before returning.

Examining the MFT for Last USN
      It is possible to enumerate the MFT entries on a volume to examine the Last USN attribute of each file or directory. This tells you when a file or directory last changed, even if the USN record was purged from the journal. The FSCTL_ENUM_USN_DATA control code will let you find all files or directories with a Last USN in a specified range. The input buffer points to the MFT_ENUM_DATA structure:
typedef struct {
    DWORDLONG StartFileReferenceNumber;
    USN LowUsn;
    USN HighUsn;
} MFT_ENUM_DATA, *PMFT_ENUM_DATA;
      The LowUsn and HighUsn members are the USNs used to limit the search. The first time this function is called, the StartFileReferenceNumber member should be set to zero. The output buffer will be filled with as much information as it can hold. The MFT is enumerated from lowest FRN to highest FRN. If more data is available, the first eight bytes of the output buffer will be the FRN to use in the next call to DeviceIoControl with the FSCTL_ENUM_USN_DATA code. If you've enumerated all the data, this FRN will be higher than the highest FRN in use by the MFT; the next call using the FSCTL_ENUM_USN_DATA code will then return FALSE, and GetLastError will return ERROR_HANDLE_EOF.
      Following the first eight bytes of the output buffer is an array of USN_RECORD structures. The FileReferenceNumber and Usn members identify each file and its Last USN. Although the output buffer is very similar to that of FSCTL_READ_USN_JOURNAL, it is not returning records from the journal. Instead, it is just using the USN_RECORD structure as a convenient way to return the Last USN data.
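      The buffer layout described above (an eight-byte FRN followed by variable-length records) can be walked with a short loop. The following is a portable sketch rather than code from the sample application. It relies only on the fact that each USN_RECORD begins with its own RecordLength; walk_enum_buffer is a hypothetical name, and a real application would cast each position to a USN_RECORD pointer and examine its members where the comment indicates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Walks an FSCTL_ENUM_USN_DATA-style output buffer of cb bytes.
   Stores the leading 8-byte "next FRN" in *next_frn and returns the
   number of records that follow it, or -1 if the buffer is malformed. */
int walk_enum_buffer(const unsigned char *buf, size_t cb, uint64_t *next_frn)
{
    assert(buf != NULL && next_frn != NULL);

    if (cb < sizeof(uint64_t))
        return -1;
    memcpy(next_frn, buf, sizeof(uint64_t));

    size_t pos = sizeof(uint64_t);
    int count = 0;

    while (pos < cb) {
        uint32_t len;   /* mirrors the RecordLength member */

        if (cb - pos < sizeof len)
            return -1;
        memcpy(&len, buf + pos, sizeof len);
        if (len < sizeof len || len > cb - pos)
            return -1;  /* a zero or oversized length would loop or overrun */

        /* A real application would examine the record at buf + pos here. */
        count++;
        pos += len;
    }
    return count;
}
```

      Validating RecordLength before advancing is a defensive choice: it keeps a truncated or corrupt buffer from causing an endless loop or a read past the end.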
       Figure 3 illustrates how you can list all files that do not have valid journal data. In this case, it will be listing all files with a Last USN below the FirstUsn that is available in the journal.
      The FSCTL_ENUM_USN_DATA code is intended to give applications an easy way to recover from situations where journal data is purged before it can be processed. Imagine the series of events shown in Figure 4. Your application shuts down after it has processed USN 512. A call using the FSCTL_QUERY_USN_JOURNAL code will show that the NextUsn member contains 640. When your application is restarted, it queries the journal again, but finds out that the FirstUsn in the journal is 1152. Since records from 640 up to, but not including, 1152 were lost, your application might have to throw out all cached information. For some applications, the information you get from the FSCTL_ENUM_USN_DATA code is enough to reconstruct missing events.
      Specifying a LowUsn of 640 and a HighUsn of 1152 will return the following information.

USN    File     Last USN
128    File2    768
256    File5    1024

This will let the application know that something happened to File2, and you'll discover the newly created File5. Unfortunately, you won't know that File1 was deleted (it's not in the MFT any more, so there's no Last USN), or that File3 changed (you'll see that File3 changed at USN 1152, but you won't know that it changed twice while you were shut down). Depending on the type of application, this may be enough information, or it may help in deciding which files or directories need to be reexamined.
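      The check for lost records in this scenario reduces to comparing two numbers. The sketch below assumes the application saved the journal's NextUsn when it shut down; find_purged_range and UsnRange are hypothetical names. When the journal's current FirstUsn has moved past the saved value, the function reports the range to pass as LowUsn and HighUsn to FSCTL_ENUM_USN_DATA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int64_t low_usn;    /* first USN that was never processed    */
    int64_t high_usn;   /* first USN still present in the journal */
} UsnRange;

/* saved_next_usn: NextUsn recorded when the application shut down.
   first_usn:      FirstUsn reported by the journal at restart.
   Returns true and fills *range when records were purged unseen. */
bool find_purged_range(int64_t saved_next_usn, int64_t first_usn,
                       UsnRange *range)
{
    assert(range != NULL);

    if (first_usn <= saved_next_usn)
        return false;   /* nothing was lost; resume from saved_next_usn */

    range->low_usn = saved_next_usn;
    range->high_usn = first_usn;
    return true;
}
```

      With the values from the example (a saved NextUsn of 640 and a FirstUsn of 1152), the function reports the range 640 to 1152. A complete application would also compare the journal ID before trusting any saved USN.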
      The FSCTL_READ_FILE_USN_DATA code can be used to get Change Journal-related information about a specific file or directory. The DeviceIoControl function is called with the handle to an open file or directory (not a volume handle as is the case with other Change Journal functions), and the output buffer will be filled with a single USN_RECORD structure:
BYTE buffer[4096];  // buffer to hold single record
USN_RECORD *pRecord = (USN_RECORD *)buffer;
DWORD cb;

// Get journal related information regarding the open
// file or directory specified by 'hFile'
DeviceIoControl(hFile, FSCTL_READ_FILE_USN_DATA, NULL, 0,
    buffer, sizeof(buffer), &cb, NULL);

// Examine pRecord for journal information about the file
      If the call succeeds, the following members of the returned USN_RECORD structure will be valid: RecordLength, MajorVersion, MinorVersion, FileReferenceNumber, ParentFileReferenceNumber, Usn, SecurityId, FileNameOffset, and FileNameLength. The TimeStamp, Reason, and SourceInfo members will not contain valid information. The Usn member represents the Last USN written to the journal for this file or directory. The actual last record can be read (unless it has already been purged) by specifying the Usn member as the StartUsn in a subsequent call using the FSCTL_READ_USN_JOURNAL code.

File Name from File Reference Number
      As we mentioned last month, there is no easy way to take an FRN and convert it to a full path. If you want to do this, your applications must keep an internal database of all directories (not all files) on a volume and their FRNs. (Yes, this is a lot of extra code you'll have to write, but we'll explain the benefit of doing this at the end of this section.) It is up to you to implement the database functionality. We'll show you how to populate the database, use it to get the full path from a journal record, and how to keep it up to date while your application is running.
      Developers familiar with the Windows 2000 DDK may be aware of kernel mode APIs or undocumented NTDLL APIs that will convert FRNs to path names directly, but we don't recommend this since Microsoft may change the format of these functions in future releases.
      Let's go over two methods to initially gather all the directory names and their FRNs. For both methods, the directory database will represent every directory on a volume. Each record in the directory database will have the FRN of the directory, the FRN of the parent directory, and the name of the directory (stored as a short name, like system32—not as the full path).
      Here's the first method. The function GetFileInformationByHandle will let you find the FRN as long as you have a handle to a file or directory. The nFileIndexHigh and nFileIndexLow members of BY_HANDLE_FILE_INFORMATION provide the high 32 bits and low 32 bits of the FRN. To convert a directory name to an FRN, you need to open a handle to it. Figure 5 shows how to do this.
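      Combining the two halves into a single 64-bit FRN is a shift and an OR. The following portable sketch shows only that arithmetic; opening the directory handle and calling GetFileInformationByHandle are left out. The names frn_from_index and frn_to_index are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine the nFileIndexHigh and nFileIndexLow values into one FRN. */
uint64_t frn_from_index(uint32_t index_high, uint32_t index_low)
{
    return ((uint64_t)index_high << 32) | index_low;
}

/* Inverse operation, useful for checking a round trip. */
void frn_to_index(uint64_t frn, uint32_t *index_high, uint32_t *index_low)
{
    assert(index_high != NULL && index_low != NULL);
    *index_high = (uint32_t)(frn >> 32);
    *index_low  = (uint32_t)(frn & 0xFFFFFFFFu);
}
```

      The cast to a 64-bit type before shifting matters: shifting a 32-bit value left by 32 bits is undefined in C.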
      Using the standard FindFirstFile/FindNextFile functions, all the directories on the hard drive can be scanned to build the directory database. Code to do this is provided in the sample application. You don't need to store the FRN of every file on the disk, since journal records contain the file or directory name and parent FRN (which is always a directory). This is why we said last month that the USN_RECORD's members FileNameOffset, FileNameLength, and ParentFileReferenceNumber are more useful than the FileReferenceNumber member alone.
      There is an alternative way to build the directory database using the FSCTL_ENUM_USN_DATA code. When the journal is active, enumerating the range 0 to NextUsn (as returned using the FSCTL_QUERY_USN_JOURNAL code) will enumerate every file and directory on the volume. All files on a volume must have a Last USN below the journal's NextUsn value; this is why the system must reset all file and directory Last USN values to zero when the Change Journal is disabled. You can find all directories on the volume by examining the FileAttributes member of the USN_RECORD structure. The USN_RECORD structure will also tell you the name of the directory and its parent's FRN.
      We already showed how to use the FSCTL_ENUM_USN_DATA code, and the sample code demonstrates how to store the information in the directory database. (The database stores the object's name, FRN, and parent's FRN for every record returned that has the FILE_ATTRIBUTE_DIRECTORY flag set in the FileAttributes member.) Once the directory database has been constructed, you can convert the FRN of any directory to a full path by walking the chain of parent FRNs up to the FRN of the root directory. Figure 6 shows how to accomplish this. (The function requires the FRN of the root directory, but this can be found by calling the FRNFromPath function described earlier with "D:\".)
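      The parent-chain walk can be illustrated with a tiny in-memory database. This is a simplified sketch under stated assumptions, not the sample application's implementation: DirEntry, find_dir, and path_from_frn are hypothetical names, the database is a plain array searched linearly, and names are narrow strings rather than the Unicode names found in journal records.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    uint64_t frn;          /* this directory's FRN          */
    uint64_t parent_frn;   /* FRN of the parent directory   */
    const char *name;      /* short name, such as "system32" */
} DirEntry;

static const DirEntry *find_dir(const DirEntry *db, size_t n, uint64_t frn)
{
    for (size_t i = 0; i < n; i++)
        if (db[i].frn == frn)
            return &db[i];
    return NULL;
}

/* Writes the full path of directory 'frn' into out, starting with
   root_prefix (for example "D:"). Returns false if an FRN in the chain
   is unknown, the chain is deeper than 64 levels (which also guards
   against cycles), or the output buffer is too small. */
bool path_from_frn(const DirEntry *db, size_t n, uint64_t root_frn,
                   const char *root_prefix, uint64_t frn,
                   char *out, size_t cap)
{
    const char *parts[64];
    size_t depth = 0;

    assert(db != NULL && root_prefix != NULL && out != NULL);

    /* Walk up to the root, remembering each name along the way. */
    while (frn != root_frn) {
        const DirEntry *e = find_dir(db, n, frn);
        if (e == NULL || depth == 64)
            return false;
        parts[depth++] = e->name;
        frn = e->parent_frn;
    }

    int w = snprintf(out, cap, "%s\\", root_prefix);
    if (w < 0 || (size_t)w >= cap)
        return false;
    size_t used = (size_t)w;

    /* Emit the names from the root downward. */
    while (depth > 0) {
        const char *name = parts[--depth];
        w = snprintf(out + used, cap - used, "%s%s", name,
                     depth ? "\\" : "");
        if (w < 0 || (size_t)w >= cap - used)
            return false;
        used += (size_t)w;
    }
    return true;
}
```

      A database holding WINNT (parent: root) and system32 (parent: WINNT) resolves the FRN of system32 to D:\WINNT\system32. A production database would use a hash table or sorted index instead of a linear search.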

Maintaining the Directory Database
      After an application populates the directory database, it will be necessary to keep it up to date as changes occur. Fortunately, the Change Journal will tell you exactly what you need to know to keep the directory database accurate. Again, the sample code shows one implementation of the directory database, but you can come up with your own way of managing it.
      Assuming you've chosen to store just the FRN, parent FRN, and name in your directory database, you can use the logic shown in Figure 7 to maintain it.
      It is very important to process create, rename/new name, and then delete in that order, since more than one can show up in a record. In other words, you might find a record that has rename/new name and delete both set. You must process both of these to maintain your database, and if you do them out of order you'll corrupt your data. Also, note that the rename/new name reason can give you a new parent FRN (the file or directory was moved to another directory), so your ChangeRecord function must take both the new name and new parent FRN as parameters.
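      The ordering rule can be captured in a function that turns a record's Reason mask into an ordered list of database actions. This is a sketch of the rule as stated above, not the logic of Figure 7 itself: DbAction and plan_db_actions are hypothetical names. The three USN_REASON constants are repeated from the Platform SDK headers so the fragment compiles without them. The caller is assumed to have already checked that the record describes a directory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Repeated from the Platform SDK so the sketch stands alone. */
#define USN_REASON_FILE_CREATE      0x00000100u
#define USN_REASON_FILE_DELETE      0x00000200u
#define USN_REASON_RENAME_NEW_NAME  0x00002000u

typedef enum {
    ACT_ADD,      /* insert FRN, parent FRN, and name            */
    ACT_CHANGE,   /* update the name and parent FRN of the entry */
    ACT_REMOVE    /* delete the entry                            */
} DbAction;

/* Fills actions[] (room for three) in the order the directory
   database must apply them and returns how many were stored. */
int plan_db_actions(uint32_t reason, DbAction actions[3])
{
    int n = 0;

    assert(actions != NULL);

    if (reason & USN_REASON_FILE_CREATE)
        actions[n++] = ACT_ADD;
    if (reason & USN_REASON_RENAME_NEW_NAME)
        actions[n++] = ACT_CHANGE;
    if (reason & USN_REASON_FILE_DELETE)
        actions[n++] = ACT_REMOVE;

    return n;
}
```

      For a record with both rename/new name and delete set, the function returns the change first and the removal second, so the entry is updated before it is deleted.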
      Be careful about storing the full path in your directory database (just store the short name, like system32). If a parent is renamed, you will only get one journal record. By storing just the name, you can reconstruct the current path on demand. To understand this, consider what would happen when you rename Program Files to Pfiles. You'll receive one journal record and modify the appropriate record in your database. If you were storing full paths, you would need to look at every other record in the directory database and adjust Program Files to Pfiles.
      As long as you process all the Change Journal records, the directory database will remain accurate. Applications will only have to repopulate the database if the journal's ID changes, or if you miss some records before they are purged. When your application exits, it can store the database on disk. The next time your application is run, it can reload the data and bring it up to date by walking the journal. This may seem like a lot of overhead, but it actually provides one benefit you'd never get from the operating system. When a directory is destroyed, the MFT no longer keeps a record of its FRN. When querying the journal, the FRNs may belong to directories that no longer exist. If the operating system provided a simple FRN-to-path name function, it would fail for records when their directory is removed, and it would be inaccurate after a directory is renamed.
      If you maintain your directory database in tandem with processing records, you can accurately recover the full path from records. The best approach when populating or maintaining the directory database is to take advantage of all the information in the journal. Here's the logic:
1. Query the journal for the available FirstUsn.
2. Use the second method above to populate the database, but limit the enumeration with a LowUsn of zero and a HighUsn of the journal's FirstUsn. This returns just the directories that changed before the first available journal entry. (You don't have records older than this, so you don't have to worry about directories that were deleted or renamed before this point.)
3. Process the available journal records, maintaining the directory database as you read each record.
      By following these steps, you can calculate the correct full path for each record, even after directories are moved or deleted. Again, the sample source code demonstrates how to do all of this correctly.

Supplying Source Information
      Some apps modify files without intending to disturb the contents. For example, a service might want to use a private named stream on a file to store information, but users or other applications should not care that the file has changed. The SourceInfo member described last month for the USN_RECORD structure is how Change Journal applications find out the reason for a change. The FSCTL_MARK_HANDLE code is used by applications to provide this information. The DeviceIoControl function is called with the handle to an open file or directory (not a volume handle, as with other Change Journal functions). The input buffer points to the following structure:
typedef struct {
    DWORD UsnSourceInfo;
    HANDLE VolumeHandle;
    DWORD HandleInfo;
} MARK_HANDLE_INFO, *PMARK_HANDLE_INFO;
      The UsnSourceInfo member uses the same constants defined for the SourceInfo member of USN_RECORD. The VolumeHandle member is created the same way as for Change Journal operations. By requiring a volume handle, only applications with administrative privileges can add source information to Change Journal records. The HandleInfo member is currently reserved, and must be zero. Figure 8 illustrates the way a service should use this functionality to add private information in a named stream.
      It is perfectly OK to use the FSCTL_MARK_HANDLE code even if the journal is disabled. The source information is actually associated internally with the file, not the Change Journal, so the information persists when the journal is disabled.
      If you want to store data in a private stream, it is a good idea to generate your private stream name from a GUID. This prevents accidentally using a name that someone else is using. A good example of a private stream is the new thumbnail view in Explorer (see Figure 9). Instead of calculating thumbnails every time you open a folder, Explorer stores them in a private stream on each file. Microsoft generated a GUID that Explorer uses as the stream name, so it will never conflict with other applications that access the file.
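      A named stream is opened by appending a colon and the stream name to the file's path. The sketch below builds such a path from a GUID string. The helper name build_stream_path is hypothetical, the braces around the GUID are one possible convention rather than a requirement, and the GUID shown in the usage note is an arbitrary placeholder. A real application would generate its own GUID once and reuse it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Builds "<file_path>:{<guid>}" into out. Returns false when the
   result (including the terminating NUL) does not fit in cap bytes. */
bool build_stream_path(const char *file_path, const char *guid,
                       char *out, size_t cap)
{
    assert(file_path != NULL && guid != NULL && out != NULL);

    /* Three extra characters for ":{" and "}", one for the NUL. */
    if (strlen(file_path) + strlen(guid) + 4 > cap)
        return false;

    snprintf(out, cap, "%s:{%s}", file_path, guid);
    return true;
}
```

      Passing C:\data\report.doc and the placeholder 12345678-1234-1234-1234-123456789abc produces C:\data\report.doc:{12345678-1234-1234-1234-123456789abc}, which can then be handed to CreateFile like any other path.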

Figure 9 Windows Explorer Thumbnail View

Wrap-up
      The sample application, CJTest (see Figure 10), monitors the Change Journal and dumps information about records to the screen as they are created. It also lets you dump the current Change Journal statistics or delete the Change Journal on the current volume. Figure 11 shows CJTest as it would display a dump of the current journal information, a file being moved from C:\Directory1 to C:\Directory2, and the directory C:\Directory1 being deleted.

Figure 11 CJTest in Action

      CJTest uses just about every technique described in our two articles. First, it ensures that the Change Journal is always active. It populates a directory database, maintains it with the Change Journal, and persists it to disk between sessions. CJTest monitors the journal for changes, and displays new records as they arrive. CJTest also gracefully recovers from situations where the journal ID changes or the journal is disabled by another application. To test this, a Delete Journal button is provided. CJTest will automatically activate the journal as soon as possible. (It uses FSCTL_DELETE_USN_JOURNAL to be notified when the journal is available if it ever receives ERROR_JOURNAL_DELETE_IN_PROGRESS.) When CJTest starts, it uses the Change Journal, if possible, to bring its cached directory database up to date. Since each volume may have its own Change Journal, CJTest just uses the current drive letter when picking a volume to examine.
      Before you begin writing your own code, play around with CJTest and see how the system behaves. You should now have the tools to create a useful Change Journal app.

For related information see: NTFS Change Journal at: http://msdn.microsoft.com/library/psdk/winbase/fsys_3xrg.htm.

From the October 1999 issue of Microsoft Systems Journal.