Safer Functions for Working with MS-DOS(R) Files

David Thielen


Opening a file under the MS-DOS® operating system seems like a simple operation. In reality, the manner in which a program opens a file under a given environment can adversely affect not only that environment (whether it’s the Microsoft® Windows™ operating system, DESQview®, or MS-DOS [1]) but also the program itself.

To complicate life further, there are several MS-DOS file open functions. The more valuable ones work only under certain newer versions of MS-DOS, leading to version-dependent code. I have attempted to create a set of functions that will handle any combination of file open requirements.

Before discussing the file open process, the limit on the number of available file handles must be examined. This is a subject about which there is mass, total, and absolute confusion.

When you set files equal to, say, 70 in CONFIG.SYS, you are actually specifying the number of System File Table (SFT) entries. This is the maximum number of different files that can be open at any time. If an application duplicates the file handle of a file it has already opened, it uses the same SFT entry; the limitation is on the number of different files. The way MS-DOS is designed, there can never be more than 255 SFT entries.

The limit set by the user in CONFIG.SYS can be extended by linking in additional SFT entries (since it is implemented as a linked list). Some applications do in fact add more SFT entries if they run out of file handles, but this is a very bad idea. When your application exits, what happens if another application has an open file in one of the SFT entries in your data space? (Windows [2] increases the number of SFT entries, but it’s considered an operating system, so it’s safer for Windows.) The simpler and cleaner answer is, if there are not enough SFT entries, your application should tell the user to increase the files= value in CONFIG.SYS up to 255. When your application starts up, it is reasonable to read the SFTs to determine if there are enough available handles.

The second limit on the number of open files is the Job File Name (JFN) table. Every application has its own JFN table, which is a table of bytes that maps application file handles to the correct SFT entries. Prior to MS-DOS 3.3, each JFN table was in the Program Segment Prefix (PSP) and consisted of 20 entries. Beginning with MS-DOS 3.3, the PSP contains a pointer to the JFN table, allowing each JFN table to have more than 20 entries (see Figure 1). However, when a program starts, the JFN table pointer points to the same 20 bytes in the PSP.

Figure 1 Relationship of JFN Tables to the System File Table

Regardless of the number of available SFT entries, if an application’s JFN table is filled, the application cannot open any more files. Likewise, regardless of the number of available entries in the JFN table, if the SFT is all filled, an application cannot open any more files. To open a file, there must be an empty slot in the JFN table for the requesting application as well as an empty SFT entry.
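
The two limits can be illustrated with a toy model in portable C. This is only a sketch of the bookkeeping described above, not MS-DOS code: the names (Process, model_open, model_dup), the tiny SFT size, and the use of 0xFF as the free-slot marker are assumptions made for the illustration, and the five default handles a real program starts with are ignored.

```c
#include <assert.h>
#include <string.h>

#define JFN_SIZE   20     /* default table size inside the PSP */
#define SFT_SIZE   8      /* tiny on purpose; the real count comes from files= */
#define JFN_UNUSED 0xFF   /* this model's marker for a free JFN slot */

/* Number of handles (in any process) that refer to each SFT entry. */
static unsigned char sft_refs[SFT_SIZE];

typedef struct {
    unsigned char jfn[JFN_SIZE];   /* handle -> SFT index, one byte each */
} Process;

void process_init(Process *p)
{
    memset(p->jfn, JFN_UNUSED, sizeof p->jfn);
}

/* Return the first free JFN slot, or -1 if the table is full. */
static int free_jfn_slot(const Process *p)
{
    int h;
    for (h = 0; h < JFN_SIZE; h++)
        if (p->jfn[h] == JFN_UNUSED)
            return h;
    return -1;
}

/* Opening a new file needs a free JFN slot AND a free SFT entry. */
int model_open(Process *p)
{
    int h = free_jfn_slot(p);
    int s;
    if (h < 0)
        return -1;                       /* this process's JFN table is full */
    for (s = 0; s < SFT_SIZE; s++) {
        if (sft_refs[s] == 0) {
            sft_refs[s] = 1;
            p->jfn[h] = (unsigned char)s;
            return h;
        }
    }
    return -1;                           /* the system-wide SFT is full */
}

/* Duplicating a handle needs only a JFN slot; the SFT entry is shared. */
int model_dup(Process *p, int h)
{
    int d;
    assert(h >= 0 && h < JFN_SIZE && p->jfn[h] != JFN_UNUSED);
    d = free_jfn_slot(p);
    if (d < 0)
        return -1;
    p->jfn[d] = p->jfn[h];
    sft_refs[p->jfn[h]]++;
    return d;
}
```

If one process opens SFT_SIZE files, a second process with an empty JFN table still cannot open anything, while the first process can go on duplicating handles because duplicates do not consume SFT entries.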

The number of entries in a JFN table can be increased in MS-DOS 3.3 and later with Function AH=67H. This function will increase the JFN table to the requested size up to 64KB. Although you are limited to 255 SFT entries, duplicate handles map to the same SFT entry, so it is possible to have 64K file handles. This also means that you can increase the size of the JFN table to more than the files= value. Usually, increasing the JFN table size beyond the number of free SFT entries is not terribly useful.

Also, some run-time libraries (like the Microsoft C version 6.0 and C/C++ version 7.0 run-time libraries) have an internal structure for each file handle. The run-time library generates an array of 20 of these structures per application to handle file opens. If you increase the size of the JFN table, you can call MS-DOS directly to open more than 20 files but you cannot use the open call. To use the open call, you need to increase the value of the _NFILE_ constant in the startup code and rebuild the startup code, increasing the size of this array. In Borland C++ 3.0, you need to define _NFILE_ in _NFILE.H and then recompile FILES.C and FILES2.C. (Don't blame me, I didn't design it.)

To complicate things further, if you run out of SFT entries under Windows, Windows attempts to create more of them. However, Windows adds the extra SFT entries to the Windows virtual machine, so the extra entries are available to Windows-based applications only. (Windows works this way because Windows itself, not WIN386, is the one adding SFT entries.) Unfortunately, Windows does this only after all existing SFT entries are used up. The result is that Windows-based applications are still able to open files when MS-DOS-based applications are out of file handles. The answer to this is not to run the MS-DOS-based application first, but (again) to increase the files= value in CONFIG.SYS.

If you run out of SFT entries, first ask the user to increase the files= value to one that gives you enough free SFT entries and have them reboot after changing it. Call Function 67H, Set Maximum Handle Count, to increase your JFN table size (it’s cheap—1 byte per file). And finally, rebuild the startup code so you can open as many files as you need. With these actions, you can actually open as many files as you wish.

FCBs

FCBs were the method of file I/O in DOS version 1.0. Handle I/O came with MS-DOS 2.0 and has been the recommended method since then. Although FCBs are still in use and new programs that use FCBs still occasionally appear, you do not want to use them. They are a much slower method of file I/O and they offer much less functionality. For example, FCBs are not aware of directories. They can only open files in the current directory. Furthermore, they cause some major problems for current versions of MS-DOS. FCBs keep their data in the calling application. They were designed before the advent of large disk drives or sharing; there is no space in the FCB structures for the extra data needed to track this information. MS-DOS has to work like crazy to handle FCB calls on large drives or on systems connected to networks. They do work but are much slower.

One advantage of FCBs is that there is no limit to the number of files you can have open. I still don’t recommend using them. If you really do need over 20 files open at once, simply require MS-DOS 3.3 or greater. After all, if you have that many files open, your users most likely have large hard disks.

Also, ask yourself if you really need to keep over 20 files open at one time. I’ve used db_Vista, an excellent database, for an application in which 19 database files needed to be accessed. I also needed up to 6 other files open at the same time. (I closed the default handles 3 and 4, stdaux and stdprn.) On versions of MS-DOS prior to 3.3, my application told db_Vista to limit itself to 11 open files. (db_Vista has a call to tell it how many file handles it can have.) This forced db_Vista to invoke its mechanism for reusing file handles. Under version 3.3 and later, I incremented the handle count and gave db_Vista its full 19. The application is a little slower running on versions of MS-DOS prior to 3.3 but it is more solid and a lot faster than if I had used FCBs. It is doubtful that any application would be faster using FCBs than it would be opening and closing files to share handles.

Sharing

Sharing first became a concern with the introduction of networks. At the time, the vast majority of software was not network-aware. This was generally not a problem because few attempted to use network-ignorant programs over networks.

Sharing was implemented in the Share program that came with DOS 3.0. If you don’t run Share (it’s a TSR), you have no sharing: two applications that open a file asking for exclusive access will both succeed. Prior to Windows, users without networks didn’t load Share as it just wasn’t a concern. However, the popularity of Windows means that it is now very common for programs that are network-ignorant to be running in multiple MS-DOS boxes on the same system. A network-ignorant program can crash a system that doesn’t even have a network loaded.

One place this occurs in Windows is when an MS-DOS-based program opens a file and assumes that no one else will touch it. It will then start reading from that file. Meanwhile, another program running in another MS-DOS box deletes the file. The first program is now lost; it is reading from a file that it assumes is still there. Things get even more interesting if two copies of a database program add records to the same file at the same time. To a user it is perfectly logical that if the program will allow them to access a database from two MS-DOS boxes, then it is OK to add from both at the same time.

Sharing was designed to resolve this problem. First, sharing lets the application define the sharing mode in which a file is opened. The file remains in this mode until closed. Sharing also allows regions of a file to be locked temporarily while records in the file are being updated.

Sharing modes (see Figure 2) are used to stop other applications from affecting a file in a way that will damage your application. There are several types of modes. The first is called compatibility mode. When MS-DOS 3.0 was designed, it had to handle all of the applications written for MS-DOS 2.x. Fortunately, these applications all opened files with the sharing bits set to 0. (Sharing bits existed even though sharing was not implemented and these bits were unused.) If the sharing bits are set to 0, the default is to open the file in compatibility mode. Sharing bits are set when you define sharing modes.

Figure 2 Sharing Modes

Value Meaning

OPEN_ACCESS_READONLY (00H) Open the file for read-only access.
OPEN_ACCESS_WRITEONLY (01H) Open the file for write-only access.
OPEN_ACCESS_READWRITE (02H) Open the file for read-and-write access.
OPEN_SHARE_COMPATIBILITY (00H) Used to make MS-DOS 3.0 or later programs compatible with MS-DOS 2.0. This is the default sharing value. Do not use.
OPEN_SHARE_DENYALL (10H) Do not permit any other program to open the file.
OPEN_SHARE_DENYWRITE (20H) Do not permit any other program to open the file for write access.
OPEN_SHARE_DENYREAD (30H) Do not permit any other program to open the file for read access.
OPEN_SHARE_DENYNONE (40H) Permit other programs read or write access, but no program may open the file for compatibility access.
OPEN_FLAGS_NOINHERIT (80H) A child program created with Load and Execute Program (Function 4B00H) does not inherit the file handle. If this mode is not set, child programs inherit the file handle.
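
The values in Figure 2 are combined with a bitwise OR into the single mode byte that is passed to the open call: one access value, one sharing value, and optionally the no-inherit flag. The following is a minimal sketch of that composition in portable C. The constants are copied from Figure 2; the helper name make_open_mode and its policy of rejecting compatibility mode are assumptions made for this illustration.

```c
#include <assert.h>

/* Values copied from Figure 2. */
#define OPEN_ACCESS_READONLY     0x00
#define OPEN_ACCESS_WRITEONLY    0x01
#define OPEN_ACCESS_READWRITE    0x02
#define OPEN_SHARE_COMPATIBILITY 0x00
#define OPEN_SHARE_DENYALL       0x10
#define OPEN_SHARE_DENYWRITE     0x20
#define OPEN_SHARE_DENYREAD      0x30
#define OPEN_SHARE_DENYNONE      0x40
#define OPEN_FLAGS_NOINHERIT     0x80

/*
 * Build the open-mode byte from one access value and one sharing value.
 * Returns -1 for an unknown access value or for compatibility mode,
 * so a caller cannot ask for compatibility mode by accident.
 */
int make_open_mode(unsigned access, unsigned share, int no_inherit)
{
    int mode;

    if (access > OPEN_ACCESS_READWRITE)
        return -1;
    if (share != OPEN_SHARE_DENYALL && share != OPEN_SHARE_DENYWRITE &&
        share != OPEN_SHARE_DENYREAD && share != OPEN_SHARE_DENYNONE)
        return -1;

    mode = (int)(access | share);
    if (no_inherit)
        mode |= OPEN_FLAGS_NOINHERIT;
    assert(mode > 0 && mode <= 0xFF);    /* always fits in one byte */
    return mode;
}
```

For example, make_open_mode(OPEN_ACCESS_READONLY, OPEN_SHARE_DENYWRITE, 0) yields 20H, the natural mode for a program that only wants to read a stable copy of a file.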

Once an application has opened a file in compatibility mode, no other application may open that same file, with two exceptions. If the file is read-only, other applications can open it in read-only mode. Also, the application that opened the file can open it as many additional times as it wishes. Applications that rely on compatibility mode therefore start failing as soon as Share is loaded and begins enforcing these rules; in fact, many users find that they cannot load Share if they are going to run certain applications. Yet this is all avoidable if applications do not open files in compatibility mode.

Compatibility mode was not designed for use by new applications but rather to handle existing applications. Unfortunately, because of its name, many developers assumed that to keep their applications “compatible,” files should be opened in compatibility mode. You should not use compatibility mode when opening a file—it is designed for applications written before MS-DOS 3.0 was released.

There are two things to consider when opening a file with the other sharing modes. First you need to determine what you want to do to a file, and second what you are willing to allow other applications to do to that same file.

When you open a file you may want the user to read and/or write to it. Many developers assume that if users will be writing to a file they might as well get read access also. Don’t; it can make a difference to another program. You tell open what you wish to do to the file (read, write, or read and write). Based on other file opens already made on the file, Share determines if the requested open can succeed. If the file open succeeds, MS-DOS will be set to allow only the requested access to the file.

When setting your sharing mode for the file, you have four choices: you can set an exclusive lock, disallow any other writes, disallow any other reads, or allow other opens of any type. You want to deny as little as possible. Not only do you want to allow other applications to access the file, but if another application already has the file open, setting your denys too strict may cause Share to fail your open. If you set deny_all and another application already has a file open in read_only, deny_none mode, your open will fail where an open with deny_write would succeed.
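
The rule behind this can be stated compactly: two opens can coexist only if neither one does something the other denies. The sketch below models that rule in portable C. It is a simplified model, not Share's actual implementation; it ignores compatibility mode, and the type and function names (OpenReq, opens_compatible) are hypothetical.

```c
#include <assert.h>

/* What an open wants to do to the file. */
#define ACC_READ   0x1
#define ACC_WRITE  0x2

/* What an open forbids other opens to do. */
#define DENY_READ  0x1
#define DENY_WRITE 0x2

typedef struct {
    unsigned access;   /* ACC_READ, ACC_WRITE, or both           */
    unsigned deny;     /* 0 (deny_none), DENY_READ, DENY_WRITE,  */
                       /* or both (deny_all)                      */
} OpenReq;

/*
 * Returns 1 if 'requested' can succeed while 'existing' is still open.
 * The access and deny bits line up, so each test is a single AND.
 */
int opens_compatible(OpenReq existing, OpenReq requested)
{
    assert((existing.access & (ACC_READ | ACC_WRITE)) != 0);
    assert((requested.access & (ACC_READ | ACC_WRITE)) != 0);

    if (requested.access & existing.deny)
        return 0;      /* the earlier open forbids what we want to do   */
    if (existing.access & requested.deny)
        return 0;      /* we forbid what the earlier open already does  */
    return 1;
}
```

With an existing read_only, deny_none open, a new read_only, deny_all request fails the second test, while read_only, deny_write passes both. The same model shows why asking for read access you do not need can hurt: a write-only open coexists with another program's deny_read, but a read-and-write open does not.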

Also, in some cases, you may want to try a second method if the first fails. If you are going to print a file, read_only, deny_write is the obvious mode to open in. If that fails, you may want to open in read_only, deny_none. (Then again, you may not, the risk being that the file may be only half-written.)

File Errors

When an open fails, most programs assume that it is because the file doesn’t exist. Don’t make this assumption. If a program can’t open a file, don’t print out the message File Not Found. It can be very painful for the user to try to determine why a program doesn’t find a file that obviously exists.

There are a small number of common errors. For each of these errors, you want to return a specific error message, the error number, and the complete filename. Otherwise, a user who has two copies of a file may assume it is one file while the program is opening the other.

Since each version of MS-DOS adds additional error numbers, you can never assume you know all possible error numbers. Figure 3 lists a sample File Open error message routine.
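
As a rough illustration of these points (it is not the routine from Figure 3), the sketch below maps a few common MS-DOS error numbers to text and always includes the number and the full filename, with a catch-all for numbers it does not recognize. The function names and message wording are assumptions; the code is written in portable C99 rather than for a particular MS-DOS compiler.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Text for the handful of error numbers an open is most likely to return. */
static const char *open_error_text(unsigned err)
{
    switch (err) {
    case 2:  return "File not found";
    case 3:  return "Path not found";
    case 4:  return "Too many open files";
    case 5:  return "Access denied";
    case 32: return "Sharing violation";
    default: return "Unrecognized error";   /* newer versions add numbers */
    }
}

/*
 * Format "<text> (MS-DOS error <n>): <full filename>" into buf.
 * Returns the value of snprintf, i.e. the length the message needs.
 */
int format_open_error(char *buf, size_t size, unsigned err,
                      const char *fullname)
{
    assert(buf != NULL && size > 0);
    assert(fullname != NULL && strlen(fullname) > 0);
    return snprintf(buf, size, "%s (MS-DOS error %u): %s",
                    open_error_text(err), err, fullname);
}
```

Passing the fully qualified name rather than whatever the user typed is what lets the user tell two copies of a file apart.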

Temporary Files

Temporary files add extra concerns. First, if your application needs a temporary file, keep in mind that a lot of your customers will have systems with EMS or XMS. If you will not be filling up this extra memory, you should use EMS and XMS and only go to disk if you run out of extra memory. Memory is a lot faster than a disk drive, and it is noticeably faster than a RAM disk so this is an easy way to speed up your program.

A Create Temporary File call (Function 5AH) was added in MS-DOS 3.0. The name returned to you is guaranteed to be unique. For versions of MS-DOS prior to 3.0, use the system time to create a unique filename. But in any version of MS-DOS (including 5.0), you can only create a temporary file in compatibility mode. Therefore, after creating a temporary file, you should close it and reopen it with read_write, deny_all.

The location of a temporary file is critical. Many systems have RAM disks primarily for temporary files. However, RAM disks generally have limited free space, which may not be enough for some temporary files. The convention is to set the environment variable TEMP or TMP to a specific directory that is used for temporary files. Unfortunately, most applications assume that TEMP/TMP is set to a single path. A better setup would be to have an environment variable like path that points to both a RAM disk (fast) and a fixed disk (large).

The source code in OPENTEMP.C (see Figure 4) includes an environment variable, TempPath=, that is allowed to have multiple directories separated by semicolons, the same as path. Furthermore, the directories on TempPath progress from fast to slow and from small to large. If you need a large temporary file, start at the end of TempPath and work to the front until you find a disk with enough space. If you want speed, start with the first and work through until you find one with enough space. If TempPath doesn’t exist or has no acceptable directories, try TEMP and then TMP. If all of those fail, try the default directory.
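
The search order can be sketched as follows. This is not the code from OPENTEMP.C; it is a small portable-C illustration of walking a semicolon-separated list from either end. The names pick_temp_dir and FreeSpaceFn are hypothetical, and demo_free_space is a stand-in with made-up numbers for whatever call the real program uses to ask a drive for its free space.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback: free bytes available in a directory's drive. */
typedef long (*FreeSpaceFn)(const char *dir);

/* Stand-in for a real free-space query; the figures are invented. */
long demo_free_space(const char *dir)
{
    if (strcmp(dir, "D:\\RAMDISK") == 0) return 512L * 1024;        /* fast, small */
    if (strcmp(dir, "C:\\TEMP") == 0)    return 40L * 1024 * 1024;  /* slow, large */
    return 0;
}

/* Count the ';'-separated entries (an empty string counts as one empty entry). */
static int count_dirs(const char *path)
{
    int n = 1;
    for (; *path != '\0'; path++)
        if (*path == ';')
            n++;
    return n;
}

/* Copy entry number 'index' into out. Returns 0 if it is missing, empty, or too long. */
static int get_dir(const char *path, int index, char *out, size_t outsz)
{
    const char *start = path;
    const char *end;
    size_t len;

    while (index-- > 0) {
        start = strchr(start, ';');
        if (start == NULL)
            return 0;
        start++;
    }
    end = strchr(start, ';');
    len = end ? (size_t)(end - start) : strlen(start);
    if (len == 0 || len >= outsz)
        return 0;
    memcpy(out, start, len);
    out[len] = '\0';
    return 1;
}

/*
 * Pick the first directory with at least 'needed' bytes free.
 * want_large == 0: search front to back (fastest first).
 * want_large != 0: search back to front (largest first).
 * Returns 1 and fills 'out' on success; returns 0 and empties 'out' otherwise.
 */
int pick_temp_dir(const char *temp_path, long needed, int want_large,
                  FreeSpaceFn free_space, char *out, size_t outsz)
{
    int n, i;

    assert(temp_path != NULL && free_space != NULL && out != NULL && outsz > 0);
    n = count_dirs(temp_path);
    for (i = 0; i < n; i++) {
        int idx = want_large ? n - 1 - i : i;
        if (get_dir(temp_path, idx, out, outsz) && free_space(out) >= needed)
            return 1;
    }
    out[0] = '\0';
    return 0;
}
```

A caller would try the TempPath value first and, if pick_temp_dir returns 0, repeat the call with the TEMP and then the TMP value before falling back to the default directory.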

When you call the close function on a temporary file, the file is not deleted. The file must be deleted after the close, or you will not only litter the temporary directories with unused files, you will be taking up needed disk space.
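
The close-then-delete sequence is short enough to wrap in one helper so that it is never forgotten. The sketch below uses the standard C library (fclose and remove) rather than MS-DOS handle calls, so it is an analogy for the idea rather than the article's own routine; the name close_and_delete and its return codes are assumptions.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Close a temporary file and then delete it by name.
 * Returns 0 on success, 2 if the delete failed, 1 if only the close failed.
 * The delete is attempted even when the close reports an error, so a
 * failed close does not leave the file behind.
 */
int close_and_delete(FILE *fp, const char *name)
{
    int close_failed;
    int delete_failed;

    assert(fp != NULL && name != NULL);
    close_failed  = (fclose(fp) != 0);
    delete_failed = (remove(name) != 0);

    if (delete_failed)
        return 2;
    if (close_failed)
        return 1;
    return 0;
}
```

The caller has to keep the file's name around for as long as the file is open, which is one more reason to record the full name at the moment the temporary file is created.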

Committing a File

Of interest in some cases, such as transaction processing, is the ability to commit a file. Committing a file is ensuring that all data, directory entry information, FAT chains, and so on related to a file are written to the disk. The notion is that if you lose power after a file is committed, the file is completely written to the disk and you will not have lost it.

Starting with the DOS 4.0 extended open, you can set a bit on the open telling it to commit all writes. On each write, MS-DOS will write the data, any FAT chain changes, and an updated directory entry. However, if you don’t need a commit on each write, this creates a large amount of overhead, even under MS-DOS 5. Starting with MS-DOS 3.3, Function 68H lets you commit a file. This call is separate from a write and will cause MS-DOS to write all dirty buffers for the file, including FAT tables, and MS-DOS will update the directory entry for the file. If you do not need a commit on every write, this call will give it to you only when needed, so it’s a lot more efficient.

For versions of MS-DOS prior to 3.3, there is a solution that usually works. If there is a free file handle, you can duplicate an existing handle and then close the duplicated handle. The close will cause the file to be updated to the disk but still keep the original file handle valid. If there are no free handles, you can flush the disk using Function 0DH, which will cause all dirty buffers to be written, but this does not update the directory entry. Your only other alternative at this point is to store the filename and just close/open it if there are no free handles. This is slower than the dup/close method so use it only if you can’t duplicate a handle.
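
The choice among these methods depends only on the MS-DOS version, on how the file was opened, and on whether a handle is free, so it can be isolated in one small decision function. The sketch below is a hedged illustration in portable C: the enum and function names are hypothetical, and the minor version is assumed to be given in hundredths (3.3 is major 3, minor 30), which is how the Get Version call reports it.

```c
#include <assert.h>

typedef enum {
    COMMIT_NOTHING,       /* file was opened with commit-on-write (4.0+) */
    COMMIT_FUNC_68H,      /* MS-DOS 3.3 and later: Function 68H          */
    COMMIT_DUP_CLOSE,     /* older MS-DOS, free handle: dup, then close  */
    COMMIT_CLOSE_REOPEN   /* older MS-DOS, no free handle: close, reopen */
} CommitStrategy;

/*
 * major/minor         MS-DOS version, minor in hundredths (3.3 -> 3, 30)
 * opened_with_commit  nonzero if the extended open set the commit bit
 * free_handle         nonzero if a handle is available for duplication
 */
CommitStrategy choose_commit(int major, int minor,
                             int opened_with_commit, int free_handle)
{
    int version;

    assert(major >= 0 && minor >= 0 && minor < 100);
    version = major * 100 + minor;

    if (opened_with_commit && version >= 400)
        return COMMIT_NOTHING;        /* every write is already committed */
    if (version >= 330)
        return COMMIT_FUNC_68H;
    if (free_handle)
        return COMMIT_DUP_CLOSE;
    return COMMIT_CLOSE_REOPEN;
}
```

Keeping the decision separate from the calls that carry it out makes the version-dependent part easy to check on its own; none of these strategies can force a software disk cache to write through.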

You also need to keep in mind that if there is a cache under MS-DOS (and most systems today do have a software cache), MS-DOS might believe that the data is on disk while it is actually in the cache’s memory. I know of no way to avoid this at present. (You definitely don’t want to eliminate caching because it makes MS-DOS so much faster.) If possible, use a cache that flushes its buffers before returning from calls that could cause a commit. Most caches will flush their buffers when they see a commit call.

TrueName

You also need to know the true, fully qualified filename. The use of append, join, subst, and networks means that drive C: could actually be anything. If you close a file and then need to open it later, Z:\DAVE could now be something totally different. If you need to save a filename for future use, whether that use is one clock cycle or one year away, use TrueName to get the name of the file you want to save.

TrueName uses an MS-DOS 3 function that returns the full canonical name of any file (it is covered in the book Undocumented DOS by Andrew Schulman et al., published by Addison-Wesley, 1990). This call does not verify the existence of a file, just that you could create a file with the specification you passed in, and it gives you what that name truly is. Unfortunately, TrueName cannot be called from Windows.

I have provided a set of file management functions (see “The Functions” sidebar and Figure 4) that let you avoid the problems outlined here. You may call them from MS-DOS-based programs, from Windows-based programs, and from assembly language. A program that demonstrates these functions, TEST.C, is available on any MSJ bulletin board.

The Functions

unsigned CloseTemp (BYTE *pFile, int hFil);

Used to close a temporary file (opened with OpenTemp). This call will truncate the file, close it, and delete it.

pFile Pointer to the file name; used for delete.
hFil Handle to the file.
Return: Error on delete. If no error on delete, error on close. 0 if no errors.

void CommitFile (int hFil, int fOpenCommit);

Used to commit a file. Always successful for MS-DOS 3.3 and up. Below MS-DOS 3.3, always successful if there is a free file handle or the file has not changed in size. Generally has no effect on disk caches flushing to disk.

If an application is running under MS-DOS 4.0 or greater and commit was turned on in the extended open, there is no need to flush the file. Since an application knows how it opened a file, fOpenCommit is used to denote that a file was opened with commit on. Therefore, if fOpenCommit is true and the version of MS-DOS is 4.0 or greater, CommitFile doesn’t do anything.

hFil Handle to the file.
fOpenCommit True if the file was opened with O_COMMIT.

int OpenFile (BYTE *pFile, unsigned uMode, unsigned short uAtr, unsigned *puErr);

Opens a file using the best create/open calls available under the version of MS-DOS running. Forces all opens and creates to use a sharing mode (cannot open in compatibility mode).

pFile For O_OPEN or O_CREATE, is the name of the file to be opened or created. Treated as a constant. For O_TEMP, holds the directory to open a temporary file in. Returns the full name of the file opened.
uMode Used to set the following:

  O_READ Can read from the file.
  O_WRITE Can write to the file.
    You can set either one or both of these. You must set at least one.
  O_OPEN Open a file (must exist).
  O_CREATE Create a file (can’t exist).
  O_TEMP Create a temporary file.
    You must set one of these. Set O_OPEN | O_CREATE | O_TRUNC if you want to force a create whether a file exists or not.
  S_DENY_READ Lock out other reads.
  S_DENY_WRITE Lock out other writes.
  S_DENY_CHILD Don’t pass handle to children.
    Set S_DENY_READ | S_DENY_WRITE for DENY_ALL. If neither is set, it is opened with DENY_NONE. You cannot get compatibility mode.
  O_COMMIT All writes will go to disk.
  O_APPEND Sets the pointer to the end of the file.
  O_TRUNC Sets the file to 0 length.

uAtr Sets the file attributes on a create. Uses the A_ defines in FILE.H.
puErr Returns the MS-DOS error number if the open fails. Returns 0 if the open is successful.
Return: Returns –1 on an error; otherwise returns the file handle.
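
The uMode rules above can be checked before any MS-DOS call is made. The sketch below is an illustration only: the bit values given to the flags are hypothetical (the real ones are defined in FILE.H, which is not reproduced here), and the helper names mode_is_valid and share_bits are not part of the function set. The sharing values returned are the ones listed in Figure 2.

```c
#include <assert.h>

/* Hypothetical bit assignments; the real values are in FILE.H. */
#define O_READ        0x0001
#define O_WRITE       0x0002
#define O_OPEN        0x0004
#define O_CREATE      0x0008
#define O_TEMP        0x0010
#define S_DENY_READ   0x0020
#define S_DENY_WRITE  0x0040

/* Returns 1 if uMode names at least one access and one kind of open. */
int mode_is_valid(unsigned uMode)
{
    if ((uMode & (O_READ | O_WRITE)) == 0)
        return 0;                 /* must be able to read, write, or both */
    if ((uMode & (O_OPEN | O_CREATE | O_TEMP)) == 0)
        return 0;                 /* must open, create, or make a temp    */
    return 1;
}

/* Translate the two deny flags into the sharing value from Figure 2. */
unsigned share_bits(unsigned uMode)
{
    int deny_read  = (uMode & S_DENY_READ)  != 0;
    int deny_write = (uMode & S_DENY_WRITE) != 0;
    unsigned bits;

    if (deny_read && deny_write) bits = 0x10;   /* DENY_ALL   */
    else if (deny_write)         bits = 0x20;   /* DENY_WRITE */
    else if (deny_read)          bits = 0x30;   /* DENY_READ  */
    else                         bits = 0x40;   /* DENY_NONE  */

    assert(bits != 0x00);         /* compatibility mode is unreachable */
    return bits;
}
```

Because every combination of the two deny flags maps to a nonzero sharing value, there is no way for a caller to end up in compatibility mode, which is the property the OpenFile description promises.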

int OpenTemp (BYTE *pFile, long lNum);

Opens a temporary file and returns the full true name of the temporary file in pFile (pFile is for return only). Attempts to first open the file in the directories pointed to by TempPath, then TEMP, then TMP, then the default directory. The file will be set to the length in lNum, guaranteeing at least that much space available in the file.

Temporary files are opened with O_READ | O_WRITE | S_DENY_READ | S_DENY_WRITE. This means your application can do anything you want to them and no other application can touch them.

pFile The true name of the temporary file created.
lNum The file will be set to this length.
Return: –1 on an error. The file handle if successful.

unsigned TrueName (BYTE *pFile, BYTE *pTrue);

Returns the true, full canonical file name for a filename passed in. The name passed in doesn’t have to exist but it must be creatable. The name can also be just a directory, instead of both a directory and a filename (that is, you can pass in C:\).

pFile The name to convert.
pTrue The true filename (should point to at least 128 bytes).
Return: 0 if OK. MS-DOS error number if an error (usually an illegal name in pFile).

[1] For ease of reading, “MS-DOS” refers to the Microsoft MS-DOS operating system. “MS-DOS” refers only to this Microsoft product.

[2] For ease of reading, “Windows” refers to the Microsoft Windows operating system. “Windows” refers only to this Microsoft product.