Making Room for Long Filenames

Nancy Winnick Cluts
Microsoft Developer Network Technology Team

January 1995

Revised: August 1996 (new information regarding Windows NT 3.51 and Windows 95)

Abstract

Until recently, if you wanted to save information to a file in a Microsoft® Windows®-based application, you were limited to eight characters for the filename plus a three-character extension. As a result, your directories were filled with cryptically named files such as StkRept.xls or Stat794.doc. This limitation was removed when Microsoft Windows NT® version 3.1 was released. The current version of Windows, Windows 95, also brings the power of long filenames to the masses. This article covers the different file systems supported by Microsoft Win32® and what you, as a developer, must do in your applications to support long filenames.

Stk4Q94.xls?

The limitation on the length and content of filenames in previous versions of the Microsoft® Windows® operating system has been a source of much (deserved) grumbling among application developers and end users alike. How many times have you saved a file with a name such as Stk4Q94.xls or StJul94.doc when it would have been much nicer if you had been able to name these files more descriptively: "Stock report analysis 4Q 1994" and "Monthly status report for July 1994"? The good news is that the new operating systems developed by Microsoft (Windows NT 3.1 and Windows 95) support long filenames. If you are using the Win32® application programming interface (API) to develop your application for Windows NT or for Windows 95, you will have the ability to support long filenames for your end user.

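This article details some of the things that a developer must do to support long filenames within an application. I have attempted to make it as interesting as possible, and you will hopefully only fall asleep once or twice during your reading. I will cover the following topics, in this order (so if you aren't interested in one, go ahead and skip it): the file systems available to Win32-based applications, determining which file system is in use, general guidelines for supporting long filenames, bad assumptions to look for in your code, and user interface considerations.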

Is Your System FAT?

Win32-based applications rely on file systems to store and retrieve information on mass storage devices such as hard drives and floppy drives. File systems organize data on fixed disks and, sometimes, on floppy disks. They give an application the ability to create and access files and directories on the individual volumes associated with those devices. Depending on the configuration of a computer, a Win32-based application may have access to volumes managed by any of the following file systems:

FAT File System

The FAT file system organizes data on fixed disks and floppy disks. If you have a FAT file system running on your computer, and if you are running a version of Windows previous to Windows 95 or Windows NT 3.51, you have no doubt slammed up against its original filename convention, lovingly referred to as 8.3 (pronounced eight-point-three by most propeller heads). This is because a FAT filename consists of a filename (up to eight characters), a separating period (.), and a filename extension (up to three characters). Windows NT 3.51 supports what is known as FastFat, and Windows 95 supports VFAT; both of these versions of the FAT file system support long filenames.

The main advantage of FAT volumes is that they are accessible by MS-DOS®, Microsoft Windows, and Microsoft OS/2® systems. FAT is currently supported on floppy disks and other removable media. The major disadvantages of a FAT volume (previous to Windows 95 and Windows NT 3.51) are the limitations on the length and content of filenames, and the lack of Unicode support.

Valid FAT filenames have the following form:

[drive:][directory\]filename[.extension]

The drive parameter specifies the name of an existing drive; it can be any letter from A through Z. This parameter must be followed by a colon (:). A filename can include embedded spaces, with the restriction that the spaces be preceded and followed by one or more letters, digits, or special characters. For example, the string "disk 1" is a legal value for a filename. FAT volumes do not distinguish between uppercase and lowercase letters; therefore, the filenames "ALPHABET.DOC" and "alphabet.doc" would be the same file if accessed under the FAT file system.

The directory parameter specifies the directory that contains the file. It is separated from the filename by a backslash (\). This parameter must be fully qualified (that is, it must include the names of all directories in the file's path) if the desired directory is not the current directory. A directory name can consist of any combination (up to eight characters) of letters, digits, or the following special characters:

$ % ' - _ @ { } ~ ` ! # ( )

A directory name can also have an extension (up to three characters) of any combination of letters, digits, or the previously listed special characters. The extension is preceded by a period.
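To make the 8.3 pattern concrete, here is a minimal sketch in C of a length check for a single filename. The function name fits_8dot3 is hypothetical (it is not part of Win32), and the check is deliberately simplified: it looks only at the lengths and at the single period, and it does not validate the character set described above.

```c
#include <string.h>

/* Hypothetical helper, not a Win32 function.
   Returns 1 if `name` fits the 8.3 length pattern, 0 otherwise.
   Only the lengths and the single period are checked; the set of
   legal characters is not validated. */
int fits_8dot3(const char *name)
{
    const char *dot = strchr(name, '.');
    size_t baseLen;
    size_t extLen;

    if (dot == NULL) {
        /* No extension at all. */
        baseLen = strlen(name);
        extLen = 0;
    } else {
        if (strchr(dot + 1, '.') != NULL)
            return 0;                 /* more than one period */
        baseLen = (size_t)(dot - name);
        extLen = strlen(dot + 1);
    }

    return baseLen >= 1 && baseLen <= 8 && extLen <= 3;
}
```

With this sketch, "Stk4Q94.xls" passes and "Monthly status report for July 1994" does not, which is exactly the limitation the long filename file systems remove.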

Protected-Mode FAT File System (VFAT)

The VFAT file system organizes data on fixed and floppy disks just like the FAT file system. It is also compatible with the FAT file system, using file allocation tables and directory entries to store information about the contents of a disk. VFAT supports long filenames by storing these names and other information, such as the date and time the file was last accessed, in the FAT structures. It allows filenames of up to 255 characters, not counting the terminating null character. This is similar to NTFS, which also allows filenames of up to 255 characters (256 with the terminating null). VFAT allows paths of up to 260 characters, including the terminating null character. Do remember, though, that the path includes the full filename, so if you happen to have a filename that is 255 characters long, you will have only 4 characters left for the rest of the path (the last of the 260 characters is the null terminator).
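The path arithmetic is easy to get wrong by one, so here is a small sketch of it in C. The names PATH_LIMIT, NAME_LIMIT, and path_fits are hypothetical; the two limits are the 260- and 255-character figures used in the arithmetic above, and the directory string is assumed to already end with a backslash.

```c
#include <string.h>

#define PATH_LIMIT 260   /* whole path, counting the terminating null */
#define NAME_LIMIT 255   /* one filename, not counting the null */

/* Hypothetical helper, not a Win32 function.
   Returns 1 if `dir` followed by `name` fits within the path limit.
   `dir` is assumed to end with a backslash, for example "C:\\". */
int path_fits(const char *dir, const char *name)
{
    size_t nameLen = strlen(name);

    if (nameLen > NAME_LIMIT)
        return 0;

    /* +1 reserves room for the terminating null. */
    return strlen(dir) + nameLen + 1 <= PATH_LIMIT;
}
```

A 255-character filename fits under "C:\" (3 + 255 + 1 = 259), but it no longer fits one directory level down, under a five-character prefix such as "D:\x\" (5 + 255 + 1 = 261).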

New Technology File System (NTFS)

NTFS organizes data on fixed disks but not on floppy disks (that is, you cannot format a floppy disk to be NTFS). NTFS supports object-oriented applications by treating all files as objects with user- and system-defined attributes. It provides all the capabilities of the FAT and HPFS file systems without many of their limitations.

NTFS is also a fully recoverable file system. It is designed to restore consistency to a disk after a CPU failure, system crash, or I/O error. So, if you crash your NTFS volume, chances are that you will be able to recover your data. NTFS allows the operating system to recover without your having to use the autochk or chkdsk command. This saves the user a lot of time when he or she reboots after a system failure. How many times have you crashed your HPFS volume and then had to sit and wait while chkdsk ran on your 1-gigabyte drive? Did you remember to bring a book to pass the time during the wait?

NTFS does provide chkdsk and autochk in case the recovery fails or corruption occurs outside the control of the file system. NTFS also includes features not present in HPFS or FAT, such as security, Unicode filenames, automatic creation of MS-DOS aliases, multiple data streams, and unique functionality specific to the POSIX subsystem. NTFS follows the filename conventions described in the HPFS section below, but it also supports Unicode filenames; Unicode is the character set that Windows NT uses internally. NTFS cannot manipulate a file's extended attributes if the file was created on HPFS.

Note   When saving a file from an MS-DOS or Windows 3.x application on an NTFS volume, if that application saves to a temporary file, deletes the original file, and renames the temporary file to the original filename, the long filename is lost. Any unique permissions set on that file are also lost. Permissions are propagated again from the parent directory.

High-Performance File System (HPFS)

HPFS, like NTFS, organizes data on fixed disks but not on floppy disks. Filenames under HPFS can be up to 254 characters in length and can contain characters that are not valid for the FAT file system, such as spaces and periods. In many cases, accessing files under HPFS is faster than accessing similar files under the FAT file system. Blank spaces can be used anywhere in an HPFS filename or directory name, but blank spaces and periods at the end of a filename are ignored. So, the files "Test 1 " and "Test 1" would be treated as if they were the same filename. There is no requirement that HPFS filenames have extensions; however, many applications still create and use them. The following special characters can also be used in HPFS filenames:

, + = [ ] ; _

An HPFS filename can be all uppercase, all lowercase, or mixed case. The case is preserved for directory listings but is ignored in file searches and all other system operations. Therefore, in a given directory, there cannot be more than one file with the same name when the only difference is case.
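If your application keeps its own list of filenames, it has to compare them the way the file system does. The following is a minimal sketch in C of the two HPFS rules described above (trailing blanks and periods are ignored, and case is ignored in comparisons). The function names are hypothetical, and the sketch handles only single-byte characters in the default C locale.

```c
#include <ctype.h>
#include <string.h>

/* Length of `name` once trailing blanks and periods are dropped. */
static size_t effective_length(const char *name)
{
    size_t len = strlen(name);

    while (len > 0 && (name[len - 1] == ' ' || name[len - 1] == '.'))
        len--;
    return len;
}

/* Hypothetical helper, not a Win32 function.
   Returns 1 if the two names would refer to the same file under the
   rules described above, 0 otherwise. */
int same_hpfs_name(const char *a, const char *b)
{
    size_t lenA = effective_length(a);
    size_t lenB = effective_length(b);
    size_t i;

    if (lenA != lenB)
        return 0;

    for (i = 0; i < lenA; i++) {
        /* Case is preserved on disk but ignored when comparing. */
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}
```

Under this sketch, "Test 1 " and "Test 1" compare as the same name, and so do "ALPHABET.DOC" and "alphabet.doc".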

Determining Which File System Is in Use

To determine which file system manages a volume, an application calls the GetVolumeInformation function. This function returns information about the volume, such as the name of the file system and the maximum length of a filename. Once you have called this function, you can use the value returned as the maximum filename length within your application and dynamically allocate buffers for your filenames and paths. This is preferable to using static buffers. If you absolutely must use static buffers, reserve 256 characters for filenames and 260 characters for paths. The GetVolumeInformation function is as follows:

BOOL GetVolumeInformation(lpRootPathName, lpVolumeNameBuffer, nVolumeNameSize, 
   lpVolumeSerialNumber, lpMaximumComponentLength, lpFileSystemFlags, 
   lpFileSystemNameBuffer, nFileSystemNameSize)

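where

lpRootPathName names the root directory of the volume to describe (for example, "C:\"). If it is NULL, the root of the current directory is used.

lpVolumeNameBuffer and nVolumeNameSize give a buffer that receives the name of the volume, and the size of that buffer in characters.

lpVolumeSerialNumber points to a variable that receives the volume serial number.

lpMaximumComponentLength points to a variable that receives the maximum length, in characters, of a filename component (the part of a path between backslashes).

lpFileSystemFlags points to a variable that receives flags describing the file system, such as whether it preserves case or stores Unicode names on disk.

lpFileSystemNameBuffer and nFileSystemNameSize give a buffer that receives the name of the file system (such as FAT, NTFS, or HPFS), and the size of that buffer in characters.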
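Here is a minimal sketch of the dynamic allocation approach in C. The helper names query_max_component and alloc_filename_buffer are hypothetical, and so is the fallback: when the code is not built for Win32, or when the call fails, the sketch assumes the 255-character figure that sits behind the static buffer sizes recommended above. The sketch calls the ANSI entry point, GetVolumeInformationA, so that it works with plain char buffers.

```c
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Assumption: 255 characters plus the null matches the 256-character
   static buffer recommended for filenames. */
#define FALLBACK_MAX_COMPONENT 255UL

/* Hypothetical helper. Returns the longest filename component that the
   volume rooted at `root` (for example "C:\\") accepts, not counting
   the terminating null. */
unsigned long query_max_component(const char *root)
{
#ifdef _WIN32
    DWORD maxComponent = 0;
    DWORD fsFlags = 0;               /* received but unused in this sketch */
    char  fsName[MAX_PATH + 1];      /* receives "FAT", "NTFS", and so on */

    if (GetVolumeInformationA(root, NULL, 0, NULL,
                              &maxComponent, &fsFlags,
                              fsName, (DWORD)sizeof(fsName)))
        return (unsigned long)maxComponent;
#else
    (void)root;                      /* not a Win32 build: nothing to query */
#endif
    return FALLBACK_MAX_COMPONENT;
}

/* Hypothetical helper. Allocates a zeroed filename buffer sized for the
   volume. The caller releases it with free(). */
char *alloc_filename_buffer(const char *root, size_t *sizeOut)
{
    size_t size = (size_t)query_max_component(root) + 1;  /* +1 for null */
    char *buffer = calloc(size, 1);

    if (buffer != NULL && sizeOut != NULL)
        *sizeOut = size;
    return buffer;
}
```

An application would call alloc_filename_buffer once for each volume it works with, rather than once at startup, because different drives on the same machine can be managed by different file systems.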

General Guidelines for Supporting Long Filenames

Listed below are general guidelines that apply to all file systems supported within Windows. An application that follows these guidelines can create valid names for files and directories regardless of the file system in use.

Bad Assumptions

Since Windows has resided on FAT-only volumes for so long, it is natural that some developers cut corners and made some assumptions based on the 8.3 convention. This section lists several of the most common assumptions made by developers based upon this convention. If you are planning to support long filenames in your application, look for these things in your code.

Assumption: The file extension is three characters at maximum.

This is true if you are running under the FAT file system, but if you are running under NTFS, HPFS, or VFAT, the maximum length is best determined by a call to GetVolumeInformation. I make this assumption most often when trying to filter files based upon the file extension (the file type) or when I am stripping off the file extension.
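One way to filter on file type without assuming a three-character extension is to compare against everything after the last period, however long that is. The following is a minimal sketch in C. The function name has_extension is hypothetical, the comparison ignores case for single-byte characters only, and the argument is assumed to be a bare filename rather than a full path.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper, not a Win32 function.
   Returns 1 if `name` ends in a period followed by `ext`, ignoring
   case, whatever the length of the extension. `name` is assumed to be
   a bare filename with no directory part. */
int has_extension(const char *name, const char *ext)
{
    const char *p = strrchr(name, '.');   /* last period, not the first */

    if (p == NULL)
        return 0;                         /* no extension at all */
    p++;                                  /* step past the period */

    while (*p != '\0' && *ext != '\0') {
        if (tolower((unsigned char)*p) != tolower((unsigned char)*ext))
            return 0;
        p++;
        ext++;
    }

    /* Both strings must end together: "html" is not "htm". */
    return *p == '\0' && *ext == '\0';
}
```

Because nothing is copied into a fixed-size buffer, a five-character extension such as "sheet" is handled the same way as "doc".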

Assumption: The path is twelve characters at maximum.

As mentioned previously, if you are running under an operating system that supports long filenames, twelve-character buffers may not have enough space for all of the characters in the filename. Consider the following code snippet, where it is assumed that the buffer was twelve characters long.

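// Sized for an 8.3 filename only (8 + 1 + 3 characters), with no room
// for a path or even the terminating null. This is the bad assumption.
TCHAR szFile[12] = TEXT("");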

// Fill in the OPENFILENAME structure to support a template and hook.
OpenFileName.lStructSize       = sizeof(OPENFILENAME);
OpenFileName.hwndOwner         = hWnd;
OpenFileName.hInstance         = g_hInst;
OpenFileName.lpstrFilter       = NULL;
OpenFileName.lpstrCustomFilter = NULL;
OpenFileName.nMaxCustFilter    = 0;
OpenFileName.nFilterIndex      = 0;
OpenFileName.lpstrFile         = szFile;
OpenFileName.nMaxFile          = sizeof(szFile) / sizeof(szFile[0]);  // in characters, not bytes
OpenFileName.lpstrFileTitle    = NULL;
OpenFileName.nMaxFileTitle     = 0;
OpenFileName.lpstrInitialDir   = NULL;
OpenFileName.lpstrTitle        = "Open a File";
OpenFileName.nFileOffset       = 0;
OpenFileName.nFileExtension    = 0;
OpenFileName.lpstrDefExt       = NULL;
OpenFileName.lCustData         = 0;
OpenFileName.lpfnHook          = ComDlg32DlgProc;
OpenFileName.lpTemplateName    = MAKEINTRESOURCE(IDD_COMDLG32);
OpenFileName.Flags             = OFN_SHOWHELP | OFN_EXPLORER | OFN_ENABLEHOOK | 
                                    OFN_ENABLETEMPLATE;

// Call the common dialog function.
if (GetOpenFileName(&OpenFileName))
{
.
.
.

}
else
{
   ProcessCDError(CommDlgExtendedError(), hWnd );
   return FALSE;
}

If the user enters a filename that does not fit in the twelve-character buffer, GetOpenFileName fails and CommDlgExtendedError returns FNERR_BUFFERTOOSMALL. The common dialog box saved us from a nasty trap. But the buffer being too small is not the only problem. If you are doing your own file parsing, and if you accept only the first twelve characters of a filename, you can end up opening the wrong file. Consider the case where the user entered the filename "Marketing report" and the current directory contained the files "Marketing report" and "Marketing salaries." If your application accepted only the first eight characters for the filename and assumed an extension, which file would the application open?
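The "Marketing" problem is easy to reproduce. The following is a small sketch in C with a hypothetical function, count_prefix_matches, that counts how many names in a directory listing match the requested name when only the first few characters are compared.

```c
#include <string.h>

/* Hypothetical helper, not a Win32 function.
   Counts how many of the `count` entries in `names` agree with
   `wanted` over the first `prefixLen` characters. */
int count_prefix_matches(const char *wanted, const char *const names[],
                         int count, size_t prefixLen)
{
    int i;
    int matches = 0;

    for (i = 0; i < count; i++) {
        /* strncmp stops at the terminating null, so a large prefixLen
           amounts to a full comparison. */
        if (strncmp(wanted, names[i], prefixLen) == 0)
            matches++;
    }
    return matches;
}
```

With the two files from the example, comparing eight characters of "Marketing report" matches both names, so the application cannot tell them apart. Comparing up to the full path limit of 260 characters matches only one.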

Assumption: There is only one period in a filename.

The FAT file system allows only one period in a filename: the delimiter between the name and the extension. But what happens if you have an application that scans through a filename looking for a period in order to find the file extension? Under FAT, you know that the characters (at most three) that come after the period are the file extension; however, under file systems that support long filenames, the first period may simply be part of the name. Here's a bit of code from one of my samples that relies upon this (shame on me!):

  // Strip off the extension, if any
  if (pDot = strstr(szLink, "."))
    *pDot = (char)NULL;

  // Add in the .LNK extension
  lstrcat (szLink, ".LNK");

Had there been more than one period in the filename, my code would have cut the name off at the first period and created a link file with the wrong name. A better way to get the name of the file sans file extension is to use a string function that searches from the end of the string and returns a pointer to the last period:

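  // Strip off the extension, if any, by searching backward for the last period
  if ((pDot = strrchr(szLink, '.')) != NULL)
    *pDot = '\0';

  // Add in the .LNK extension
  lstrcat(szLink, ".LNK");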
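Searching from the end still has one gap if the string holds a full path: a directory name can contain a period even when the filename has no extension. The following is a minimal sketch in C that looks for the period only in the part after the last backslash. The function name strip_extension is hypothetical, and the sketch assumes backslash separators and single-byte characters.

```c
#include <string.h>

/* Hypothetical helper, not a Win32 function.
   Removes the extension from the filename part of `path`, in place.
   Periods that belong to directory names are left alone. */
void strip_extension(char *path)
{
    char *lastSlash = strrchr(path, '\\');
    char *name = (lastSlash != NULL) ? lastSlash + 1 : path;
    char *dot = strrchr(name, '.');   /* last period in the filename only */

    if (dot != NULL)
        *dot = '\0';
}
```

Given "C:\Reports\Status.July.1994.doc", the sketch leaves "C:\Reports\Status.July.1994". Given "C:\My.Files\Readme", it leaves the string untouched, where a plain search from the end of the whole path would have truncated it to "C:\My".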

Assumption: There are no spaces in filenames.

Under the FAT file system, you were allowed to have a filename that included spaces as long as each space was preceded and followed by a non-space character. That restriction no longer applies under VFAT. You can now have a filename such as "This has lots of spaces ", which includes many spaces and a trailing space character.

Assumption: The '+' character is an invalid character within a filename.

The '+' character is now a valid character within a filename under the operating systems that support long filenames. In other words, if I wanted to be incredibly romantic and name a file containing information about me and my husband, I could name it "Nancy + Jonathan". Of course, I would really never do that!

User Interface Considerations

I have given you the basics that you need to consider in your application when supporting long filenames internally. But what about the user interface of your application? Are there some considerations to bear in mind there, too?

In general, you need to make sure that your edit fields, list boxes, and static text strings have enough space allocated for long filenames. Consider the old-style File Open dialog box. It contains a field for entering the filename, but the field isn't really all that large. It is easy to see that, for a fairly long filename, the user would have to scroll horizontally to see the whole filename. This makes it easy for the user to confuse similarly named files.

Figure 1. The old-style File Open common dialog box

The design of the new common dialog box for opening files has taken this problem into consideration. Notice how the box used for entering or displaying current filename and path information has been expanded to allow more of the filename to be displayed without scrolling.

Figure 2. The new File Open common dialog box

In short, take a look at any dialog boxes that you may be using and be sure to update the width of the box to accommodate a longer filename. If you are sticking with the common dialogs, of course, you don't even have to worry about this because the common dialog box library takes care of it for you.

Summary

Now that you've read this article, you should have a better understanding of the different file systems available for use under the different Windows operating systems. You should also be able to take a look at your current applications and alter them as needed in order to support long filenames. It is important to remember that if you decide not to support long filenames within your application, you may find that your application will not interoperate with other applications well. Or, if you programmed with some bad assumptions, you may even find your buffers overflowing. And we all know that overflowing buffers are a very bad thing in the computer world.

Suggested Reading

The following references contain more detailed information about long filenames and file systems.

Oney, Walter. "Unconstrained Filenames on the PC! Introducing Chicago's Protected Mode FAT File System." Microsoft Systems Journal 9 (August 1994). (MSDN Library Archive, Books and Periodicals)

Richter, Jeffrey M. Advanced Windows NT.

Windows NT Resource Kit, Resource Guide, Chapter 5: "Windows NT File Systems and Advanced Disk Management." (MSDN Library, Platform, SDK, and DDK Documentation, Windows Resource Kits)