Rebasing Win32 DLLs: The Whole Story

Ruediger R. Asche
Microsoft Developer Network Technology Group

September 18, 1995

Abstract

This article discusses the ramifications of dynamic-link library (DLL) rebasing under both Microsoft® Windows NT™ and Windows® 95. ("Rebasing" in this context refers to the process of changing the base address of a DLL in memory space.) A sample application accompanies this article, as well as a suite of DLLs to provide comparison figures.

Introduction

One of the questions I have heard a lot recently from developers at Microsoft is, "Gee, what happens if the operating system has to rebase my DLLs? What is the penalty for that, and is there any way that I can prevent the penalty? Is there any way I can change the code to generate fewer fixups?"

I thought that was a really good question, so I decided to temporarily relocate to Empiric-land, investigate the costs of DLL loading, and pour a bucket of numbers at your feet so that you can decide for yourself what to do about the DLLs.

The results presented in this paper are probably not revolutionary, nor are they surprising: Prefer one large DLL over several small ones; make sure that the operating system does not need to search for the DLLs very long; and avoid many fixups if there is a chance that the DLL may be rebased by the operating system (or, alternatively, try to select your base addresses such that rebasing is unlikely). However, as the old saying goes, "The journey is the goal." In other words, on the way to writing this paper, I found a number of little things about DLLs and memory management that I think are worth sharing. A more appropriate title for the paper might actually have been "Bits and Pieces about DLLs."

In this paper I describe a sample test application that I wrote to measure DLL loading times, and I provide a set of DLLs to measure.

The Application

The architecture of the test set to measure dynamic-link library (DLL) load times is very simple: The PAGETEST application, written using the Microsoft® Foundation Class libraries (MFC), consists of two threads. The first (main application) thread creates and owns a mutex object. This first thread samples the current time and then calls LoadLibrary to explicitly load one of a number of libraries I provide (I discuss the libraries in the next section). Meanwhile, the second thread waits for the mutex object to become signaled.

All of the libraries consist of the DLL entry procedure only. In the PROCESS_ATTACH dispatch point of the DLL entry procedure, the mutex object is signaled. At that point, the secondary application thread wakes up and computes the difference between the current time and the time sampled before LoadLibrary was called. This difference is roughly the elapsed time that was used to load the DLL into memory. The MFC application has an option to load and unload the DLL repeatedly (50 times) so that a meaningful average loading time can be computed.

I will not discuss the specifics of the application here—it's a fairly straightforward MFC application, with all the relevant code located in the view class. The view is derived from CEasyOutputView to provide for easy display of results. (Please see "Windows NT Security in Theory and Practice" for details.)

Note that this empirical test has a number of drawbacks that may distort the actual results:

We are assuming that the time-sampling mechanism is efficient, reliable, and has a granularity that is fine enough. (I use the system performance counters.)
We are assuming that the thread-switching mechanism is consistently efficient and does not have too bad an effect on the time it takes to wake up the secondary thread.
The results from the tests naturally depend to a high degree on the underlying hardware (that is, the processor speed of the machine on which the test is run, the number of processors, the speed of the hard disk controller, and so forth).
The results were sampled using specific versions of the software involved (operating system versions, C run-time library versions, and so forth).
Normally most DLLs are loaded implicitly, not explicitly, and it is assumed that the implicit and explicit loading of DLLs takes the same amount of time, all other things being equal.

To make things worse, the numbers I did obtain vary widely at times.

Thus, you should take the results of the tests with a grain of salt. The most important deduction to make from the results is not the absolute load times, but the relative times—in other words, how changing one property changes the loading behavior, and how different strategies compare to each other.

If you wish to recreate the test results on your machine, follow the DLL positioning instructions in the next section, run PTAPP.EXE, and choose Run All Tests from the Run Multiple Tests menu.

The DLLs

I considered a number of properties of DLLs relevant to their load time:

The size of the library
The number of items to be relocated
Whether the DLL initializes C startup code or not
Whether the DLL exports symbols or not
Whether the DLL implicitly links to other libraries or not
How long it takes the operating system to locate the DLL executable

Aside from these issues, there are also a few factors independent of the DLL that determine how slowly or quickly a DLL can be loaded—for example, the underlying operating system, the overall current work load on the machine, the application's working set, whether the DLL needs to be rebased, and so forth.

To make a long story short, I wrote 18 little (or not-so-little) DLLs that represent almost all permutations of the following properties:

DLL size (can be small or large)
- If it's a large DLL, has load-time fixups or not
C run-time support (none, implicitly linked, or explicitly linked)
Symbols exported from the DLL (yes or no)

I loaded each of the 18 DLLs, under both Windows NT™ version 3.51 and Windows® 95 on the same machine, at its preferred base address and with the preferred virtual memory range taken. Each test was also run first with the DLL located in the current directory and then deep down in the path to measure how long it takes the operating system to locate the DLL in the search path. As I mentioned earlier, each test was run 50 times to obtain a meaningful average value.

The first observation I made was that under Windows NT, the initial load time for any given DLL was about three times the average time of subsequent loads of the same DLL. This is a side effect of Windows NT's memory management design: Once the DLL is initially loaded and subsequently unloaded, the pages that belong to the DLL image remain in memory; they are inserted into what is called the standby list (a system-maintained list of discarded pages that can be made available to the application if it should need the pages again, or to another application that requests new memory). For a more thorough description of the standby list, please consult Helen Custer's Inside Windows NT, pages 194–196.

Reloading the DLL's pages from the standby list is much more efficient than reloading from the disk. Over time, the pages will migrate from the standby list to the free list such that, if a lot of memory allocation and access activity goes on between the initial and subsequent DLL loading tests, the time difference will even out. To simulate this behavior (and make sure that I could obtain meaningful average DLL loading times from several tests), I added a little option that allows the test application to hog as much memory as it possibly can so that the standby list will be exhausted quickly. The Windows NT Resource Kit also includes a little utility (CLEARMEM.EXE) that can be used to force pages off the standby list.

That worked, but unfortunately, right after I freed the hogged memory, the load time was about 20 times the average load time—or 7 times the initial load time!

This phenomenon put me into some kind of Catch-22 situation: On the one hand, I wanted to obtain a reliable average figure for DLL loading times under normal working conditions; on the other hand, the only reliable and consistent figures I could obtain were not the ones under normal working conditions! My way out of that dilemma is a little daring but, I hope, valid: I base my results on the comparisons between the average DLL load times and assume that the relationships between the initial and subsequent average load times are constant so that the comparison values are still meaningful under normal working conditions.

If you wish to rebuild the DLLs or add your own DLL variations, or if you are just curious to see what I did to build 18 DLLs, read on; otherwise, skip this subsection and continue under the heading titled "The Theory."

Building the DLLs

The DLLs were built with Visual C++™ version 2.2 from a makefile generated by Visual C++. You will find the project in the attached sample code in the PAGETEST subdirectory. Each of the 18 DLLs is built from the same project; you should build each DLL as the retail (no debug) version and then copy the generated executable to a new location using the naming convention that follows.

The PTAPP sample application expects the name of the DLL to encode the information about what the DLL contains. Each letter in the DLL's name represents one property, according to the following scheme:

The first letter expresses whether the DLL is small (that is, does not contain any data) or large (that is, has 100,000 static data elements). An S in this position indicates small, an L indicates large, and F indicates that all of the 100,000 data elements are initialized to a relocatable string, the address of which must be fixed up at load time when the DLL is rebased.
Note that 100,000 relocatable strings does not necessarily mean 100,000 relocations. There is a problem with the linker in Visual C++ version 2.x that will limit to 64K the number of relocatable items in a portable executable (PE) file. Thus, if you run an .EXE header utility such as YAHU on one of the DLLs whose name begins with an F, you will find that there are only about 34K of relocations. This problem will be fixed in upcoming versions of Visual C++.
The second letter indicates whether the DLL supports C run-time code or not. This letter can either be N (meaning that the DLL has a custom entry point that does not call the C run-time initialization code), C (meaning that the DLL's entry point is DllMain, which implicitly initializes the C run-time support) or D (meaning that the DLL has a custom entry point that calls _CRT_INIT to initialize the C run-time library dynamically).
Finally, the third letter is N if the DLL does not export any symbols or E if a function is exported. The remaining letters are currently unassigned.
For example, SCNNNNNN.DLL is a small DLL that implicitly calls the C run-time initialization code, but does not export a symbol. FNENNNNN.DLL is a large DLL with many relocations that does not call the C run-time initialization code but exports a symbol.
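
The naming scheme above can be decoded mechanically. The following is a minimal sketch in portable C, not taken from the PTAPP sources; the type and function names (DllProps, decode_dll_name, and so on) are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Properties encoded in the first three letters of a test DLL's name.
   All identifiers here are illustrative, not part of PTAPP. */
typedef enum { DLL_SMALL, DLL_LARGE, DLL_LARGE_FIXUPS } DllSize;
typedef enum { CRT_NONE, CRT_DLLMAIN, CRT_DYNAMIC } CrtSupport;

typedef struct {
    DllSize    size;         /* S, L, or F */
    CrtSupport crt;          /* N, C, or D */
    bool       has_exports;  /* E = exports a function, N = none */
} DllProps;

/* Decode a name such as "FNENNNNN.DLL". Returns false if one of the
   first three letters is not defined by the naming scheme; in that
   case *out may be partially filled in and should be ignored. */
bool decode_dll_name(const char *name, DllProps *out)
{
    if (name == NULL || out == NULL)
        return false;
    /* Reject short names so we never read past the terminator. */
    if (name[0] == '\0' || name[1] == '\0' || name[2] == '\0')
        return false;

    switch (name[0]) {
    case 'S': out->size = DLL_SMALL;        break;
    case 'L': out->size = DLL_LARGE;        break;
    case 'F': out->size = DLL_LARGE_FIXUPS; break;
    default:  return false;
    }
    switch (name[1]) {
    case 'N': out->crt = CRT_NONE;    break;  /* custom entry, no CRT   */
    case 'C': out->crt = CRT_DLLMAIN; break;  /* DllMain, implicit CRT  */
    case 'D': out->crt = CRT_DYNAMIC; break;  /* custom entry + _CRT_INIT */
    default:  return false;
    }
    switch (name[2]) {
    case 'N': out->has_exports = false; break;
    case 'E': out->has_exports = true;  break;
    default:  return false;
    }
    return true;
}
```

With this sketch, decoding "SCNNNNNN.DLL" yields a small DLL with DllMain-style C run-time initialization and no exports, which matches the first example above. The remaining five letters are ignored because the scheme leaves them unassigned.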

In order not to introduce any unwanted side effects into the comparisons, I made the DLLs as small as I possibly could. The smallest DLL I provide has nothing but a custom DLL entry point that does not initialize the C run-time support code.

There is no MFC support in any of the DLLs because MFC DLLs implicitly link to other DLLs and perform custom initializations that I did not want introduced into the measurements. All of the other variations of DLLs are built with small modifications to the project, as follows:

In order to make a small DLL into a large DLL, add the symbol MANYPAGES to the preprocessor directives. Generate a large DLL with many fixups by adding both MANYPAGES and FIXUPS to the preprocessor definitions. To build a DLL with a custom entry point that calls the C run-time initialization code, add the symbol DYNACRT; to build a DLL with implicit C run-time initialization code, add the symbol STANDARDENTRY and rename the DLL entry point in the Settings/Link/Entry text box to DllMain. Finally, to have the DLL export a symbol, add the preprocessor directive HASSYMBOLS.
For the ambitious, I also define a symbol, HUGEBINARY, that will generate a really big DLL (a DLL with about 15,000 data pages which, if combined with the FIXUPS symbol, will yield about 15,000 relocations). The DLL is 40 to 61 MB in size, depending on whether you define FIXUPS or not; therefore, I do not include the binary in the DLL test set.

Whatever options you use to build the DLL, the resulting executable will be called PAGETEST.DLL in the WINREL subdirectory of the PAGETEST project. After building the DLL, you should copy the DLL to a different location, renaming the DLL according to the above naming conventions.

To see how searching for the DLL binary affects the load time, I kept two copies of each DLL on my machine—one in the same directory as PTAPP.EXE (the test application) and one in the subdirectory that is listed at the very end in the search path (in my case, C:\DOS). After having run the test with the DLL found in the same directory as the executable, I renamed all of the DLLs to force the operating system to look for the DLLs in another directory.

When I built the DLLs, I ran into a few scenarios where I changed one option for test purposes and was unable to recreate the original configuration afterwards. Thus, just to make sure that you can rebuild the DLLs exactly as I built them, here are the project options I used.

Compiler

/nologo /MT /W3 /GX /YX /O2 /D <see above> /FAcs /Fa "WinRel/" /FR "WinRel/" /Fp "WinRel/pagetest.pch" /Fo "WinRel/" /c

Preprocessor
The exact preprocessor options depend on the type of library built, as explained before.

Linker

kernel32 advapi msvcrt /nologo /subsystem:windows /DLL /incremental:no /PDB:"WinRel/pagetest.pdb" /MACHINE:I386

Note The PE file format contains time stamps. That means that if you build the same DLL two times, the resulting binary images will not be identical. For any pair of independently built, but otherwise identical, DLLs, a byte-by-byte file-comparison utility should report six differing bytes, in two groups of three consecutive bytes.
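
To see one of these time stamps directly, the sketch below locates the TimeDateStamp field of the PE file header in a file that has already been read into memory. It is an illustration only: the function name pe_header_timestamp is hypothetical, and the sketch looks at the file header stamp alone; it does not try to find the second group of differing bytes that the note mentions. A 32-bit stamp that counts seconds changes only in its lower bytes between builds made close together in time, which is one plausible reason a comparison reports three differing bytes per stamp rather than four.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Locate the TimeDateStamp in the PE file header of an image held in
   memory. On success, stores the file offset of the stamp and its
   value, and returns true. (Hypothetical helper for illustration.) */
bool pe_header_timestamp(const unsigned char *file, size_t len,
                         size_t *offset_out, uint32_t *stamp_out)
{
    /* The DOS header starts with "MZ" and stores the offset of the
       PE signature at 0x3C. */
    if (file == NULL || len < 0x40 || file[0] != 'M' || file[1] != 'Z')
        return false;

    size_t pe = le32(file + 0x3C);
    if (pe > len || len - pe < 12)
        return false;
    if (file[pe] != 'P' || file[pe + 1] != 'E' ||
        file[pe + 2] != 0 || file[pe + 3] != 0)
        return false;

    /* After the 4-byte signature come Machine (2 bytes) and
       NumberOfSections (2 bytes); TimeDateStamp follows. */
    *offset_out = pe + 8;
    *stamp_out  = le32(file + pe + 8);
    return true;
}
```

Comparing two builds while skipping the four bytes at the reported offset is one way to confirm that the remaining differences are also time stamps and not real changes.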

The Theory

The operating system has to go through these steps to load a DLL:

Locate the DLL executable file on disk.
Traverse the list of DLLs loaded into the application's address space to determine if the DLL is already loaded.
Allocate the memory for the DLL to reside in and map the DLL binary into that memory. (In Windows NT, this happens through section objects.)
Perform various manipulations in order for the DLL to work (that is, resolve fixups in the DLL, and so forth).

Various factors determine how fast a DLL will be loaded. Here is a (possibly incomplete) list of the ones that need to be taken into consideration:

The underlying hardware and software: How fast the computer is and what operating system it runs.
The current state of the system and the application: How tight the system is on virtual memory, and if the DLL can be loaded at the preferred base address.
The DLL itself: How big it is, how many locations in the DLL need to be fixed up (in conjunction with the second item above, that is, whether the DLL can be loaded at its preferred base address), and whether the DLL implicitly links to other DLLs that also need to be loaded.

This list tells us that rebasing a DLL is by no means the only factor that determines a DLL's loading time. In this article I present a lot of numbers that should give you an idea of how widely the loading time for a DLL can vary and how much an application can influence the loading time.

Note that rebasing a DLL may result not only in a greater load time, but also in a penalty in pagefile usage. One of the first steps in loading a DLL consists of creating a section object—that is, a contiguous region of memory that is backed by the DLL executable file. Whenever a page of the DLL is removed from an application's working set, the operating system will reload that page from the DLL executable file the next time the page is accessed.

Of course, when a DLL is rebased, this scheme no longer works because the pages that contain relocated addresses differ from the corresponding pages in the DLL executable image. Thus, as soon as the operating system attempts to fix up an address when loading an executable file, the corresponding page is copied (because the section was opened with the COPY_ON_WRITE flag), all the changes are made to the copy, and the operating system makes a note that from now on the page is to be swapped from and to the system pagefile instead of the executable image.

There are two potential performance hits in this setup: First, each page that contains an address to be relocated takes up a page on the system pagefile (which will, in effect, reduce the amount of virtual memory available to all applications); and second, as the operating system performs the first fixup in a DLL's page, a new page must be allocated from the pagefile, and the entire page is copied.

The act of performing fixups also increases a DLL's load time, although the algorithm that scans the relocation section of the DLL and applies the fixups is fairly efficient. (The complexity of the traversal is simply a linear function of the number of fixups to be performed.)
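
The two costs described above (a linear pass over the fixups, plus one private page for every page that receives at least one fixup) can be modeled in a few lines. This is a simplified sketch, not the operating system's loader: the function name apply_fixups is hypothetical, the image is an ordinary buffer, the fixup addresses are assumed to be sorted in ascending order, and a fixup that straddles a page boundary is counted only against the page in which it starts.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 0x1000u

/* Add 'delta' (actual base minus preferred base) to the 32-bit value
   at each relative address in rvas[]. The values are read and written
   in host byte order, which matches the image on little-endian x86.
   Returns the number of distinct pages written, that is, the number
   of pages that would become pagefile-backed copies. rvas[] must be
   sorted in ascending order; out-of-range entries are skipped. */
size_t apply_fixups(unsigned char *image, size_t image_size,
                    const uint32_t *rvas, size_t count, uint32_t delta)
{
    size_t   pages_touched = 0;
    uint32_t last_page     = UINT32_MAX;

    for (size_t i = 0; i < count; i++) {       /* one pass: O(count) */
        uint32_t rva = rvas[i];
        if (image_size < 4 || (size_t)rva > image_size - 4)
            continue;

        uint32_t value;
        memcpy(&value, image + rva, sizeof value);
        value += delta;
        memcpy(image + rva, &value, sizeof value);

        uint32_t page = rva / PAGE_SIZE;
        if (page != last_page) {               /* first write to this page */
            pages_touched++;
            last_page = page;
        }
    }
    return pages_touched;
}
```

The return value illustrates why the placement of relocatable items matters as much as their number: three fixups spread over two pages cost two private pages, whereas the same three fixups in one page would cost only one.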

Fixups

A couple of frequently asked questions about DLL rebasing are, "What exactly is a fixup, and is there any way that I can code so that I avoid a lot of fixups in my executable?" The answer to both questions depends to a high degree on the platform for which a particular executable has been built. In this article, I will limit the discussion to executables built for Intel 386, 486, and Pentium processors. (Note that executables built for other platforms have different notions of what a fixup is.)

On 386, 486, or Pentium processors, there are basically two things that can cause an address to be marked as relocatable: static objects and absolute jumps.

First, if a static object is referenced by DLL code, the absolute address of the object is used (assuming that the DLL is loaded at its preferred address). For example, in the code fragment

LPSTR lpName="Name";

the linker places the string "Name" in the DLL's data section and stores the beginning address of that string in the location that corresponds to the variable lpName. If the string "Name" must be relocated because the DLL could not be loaded at its preferred base address, the loader must update lpName accordingly. Note that in this case, every reference to lpName from within the code must also be fixed up.

Objects that can be subject to relocation are literal strings (for example, the string "Name" in the example above), as well as global and static data of every type, including statically allocated C++ objects. Note that especially in C++ there may be many hidden cross-references from one static object to another. Uninitialized data will (trivially) not be fixed up during the relocation process, but references to uninitialized static data will.
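
As an illustration of how the shape of static data affects the number of relocatable items, the sketch below contrasts two ways of storing the same strings. It is not taken from the sample DLLs, and the identifiers are hypothetical. A table of pointers stores one absolute address per element, and each stored address is a relocatable item; a table of fixed-size character arrays stores the characters themselves and contains no addresses. On i386, code that refers to either table by absolute address still needs its own fixup, as described above.

```c
#include <stddef.h>

/* One stored address per element: each element is a relocatable item
   if the DLL is rebased. */
static const char *const names_by_pointer[] = { "Open", "Close", "Save" };

/* The characters are stored inline (8 bytes per entry, including the
   terminator and padding), so the data itself holds no addresses. */
static const char names_inline[][8] = { "Open", "Close", "Save" };

/* Accessors; the caller must pass an index below 3. */
const char *name_by_pointer(size_t i) { return names_by_pointer[i]; }
const char *name_inline(size_t i)     { return names_inline[i]; }
```

The inline form trades a little padding for fewer relocatable items, so it only pays off when the strings are short and of similar length.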

The second category of items that can be relocated in an i386 executable is absolute jumps and function calls, including calls to system functions. Note that there is not much you can do in your code to avoid relocations, except for cutting down on statically allocated data. One way to accomplish that would be to avoid resource references by name in favor of referencing resources by ordinal (inasmuch as each name that you explicitly use in your code automatically becomes a potentially relocatable item).

I would not recommend, however, that you design your DLL code with the specific goal of minimizing load time unless (1) the number of statically allocated objects can be significantly reduced, and (2) such a coding practice does not sacrifice other goals in your software design.

One optimization you can perform rather easily, however, is to gather your relocatable data into only a few pages. It is obvious that two pages with one relocatable item each will both need to be backed by the pagefile if the DLL needs to be rebased. If both relocatable items occur in the same page, only one page is affected. You might want to use the #pragma data_seg directive to ensure that as many relocatable items as possible go into as few pages as absolutely necessary.

The Tools

The fun part about gathering the DLL load times was that I got to understand the internal workings of the operating systems, as well as the executable format, a little bit better. Here are a few tools I considered very useful for dissecting the DLLs as images and at run time:

An executable header utility (such as YAHU, written about in the technical article "YAHU, or Yet Another Header Utility") can tell you almost everything about the DLL after it has been built but before it is loaded—for example, how many sections there are, how big each section is, how many relocations there are, and so forth.
A process walker (for example, PWALK from the Win32 SDK) can tell you where in a process's address space a DLL has been loaded, and where the individual sections of the DLL go.
A process viewer (for example PVIEW from the Windows NT Resource Kit) tells you how many pages the DLL uses in memory at any given time.
The performance monitor that is shipped with Windows NT (by default installed in the Administrative Tools group) is an excellent tool to monitor the impact of DLL loading on the system, for example, in terms of pagefile usage.

Let us look at how we can use these tools to get a better understanding of the internal workings of a DLL. Running YAHU on the DLL SNNNNNNN.DLL, we obtain the following information on the five sections in the DLL:

Section .TEXT, size 0x28 bytes, located 0x400 bytes behind the beginning of the file. This section contains the complete DLL code. This section will be loaded one page (0x1000) behind the start address of the DLL image.
Section .DATA, size 9, located 0x600 bytes behind the beginning of the file. This section contains all the initialized data and will be loaded two pages (0x2000) behind the start address of the DLL image.
Section .IDATA, size 0x6c, located 0x800 bytes behind the beginning of the file. In this section you will find the import data; that is, the names of the DLLs that the executable links to and the names of the functions called in those DLLs. This section will be loaded three pages (0x3000) behind the start address of the DLL image.
Section .EDATA, size 0x35, located 0xA00 bytes behind the beginning of the file. This section contains the DLL's export information and will be loaded four pages (0x4000) behind the start address of the DLL image.
Section .RELOC, size 0x32, located 0xC00 bytes into the file. This section contains the complete relocation information for the entire DLL and will be loaded five pages (0x5000) behind the start of the DLL image in memory.
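
The load addresses in this list follow from rounding each section up to a whole number of pages. The sketch below reproduces that arithmetic under two assumptions that match SNNNNNNN.DLL but need not hold for other images: sections are aligned on 4K (0x1000) page boundaries, and the headers occupy exactly the first page. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u

/* Number of pages needed to hold 'size' bytes. */
static uint32_t pages_for(uint32_t size)
{
    return (size + PAGE_SIZE - 1) / PAGE_SIZE;
}

/* Assign each section the next page-aligned offset from the start of
   the image, beginning one page in (after the headers). Stores the
   offsets in rva_out[] and returns the total number of pages in the
   image, including the header page. */
uint32_t layout_sections(const uint32_t *sizes, size_t count,
                         uint32_t *rva_out)
{
    uint32_t next_rva = PAGE_SIZE;   /* page 0 holds the headers */

    for (size_t i = 0; i < count; i++) {
        rva_out[i] = next_rva;
        next_rva  += pages_for(sizes[i]) * PAGE_SIZE;
    }
    return next_rva / PAGE_SIZE;
}
```

Feeding in the five section sizes listed above (0x28, 9, 0x6c, 0x35, and 0x32 bytes) gives offsets 0x1000 through 0x5000 and a total of six pages, or 24K.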

In other DLLs, you may find more sections—for example, the .BSS section, which contains uninitialized data.

Note that the offsets of the respective sections in the file help you to look at the binary data. For example, open the DLL in binary mode in Visual C++, and scroll down to offset 0xc00. You will see eight bytes of heading followed by six data bytes. The exact format of the relocation records is described in the Microsoft Systems Journal article "Peering Inside the PE: A Tour of the Win32 Executable File Format" (Pietrek 1994) in the MSDN Library. Note that the information in the .RELOC section gives you all you need to determine where in memory the relocations will be performed.
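
The eight heading bytes and the data bytes that follow them form one base relocation block as defined by the PE format: a 32-bit page address, a 32-bit block size that includes the heading, and then 16-bit entries whose top 4 bits give the fixup type and whose low 12 bits give the offset within the page. The sketch below decodes one such block from a byte buffer. It is an illustration only, the function name is hypothetical, and it reports only the 32-bit (type 3) fixups used on i386, treating type 0 entries as padding.

```c
#include <stddef.h>
#include <stdint.h>

#define REL_BASED_ABSOLUTE 0u   /* padding entry, nothing to fix up */
#define REL_BASED_HIGHLOW  3u   /* full 32-bit fixup                */

static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one relocation block. Stores the image-relative address of
   each 32-bit fixup in out[] (at most 'max' entries) and returns the
   number of such fixups found, or 0 if the block is malformed. */
size_t parse_reloc_block(const unsigned char *block, size_t avail,
                         uint32_t *out, size_t max)
{
    if (block == NULL || avail < 8)
        return 0;

    uint32_t page_rva   = rd32(block);      /* heading, bytes 0..3 */
    uint32_t block_size = rd32(block + 4);  /* heading, bytes 4..7 */
    if (block_size < 8 || block_size > avail)
        return 0;

    size_t entries = (block_size - 8) / 2;
    size_t found   = 0;

    for (size_t i = 0; i < entries; i++) {
        uint16_t entry  = rd16(block + 8 + 2 * i);
        unsigned type   = entry >> 12;
        uint32_t offset = entry & 0x0FFFu;

        if (type == REL_BASED_HIGHLOW) {
            if (found < max)
                out[found] = page_rva + offset;
            found++;
        }
    }
    return found;
}
```

A block with six data bytes holds three entries; since blocks are commonly padded with a type 0 entry, such a block may describe fewer than three actual fixups.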

Thus, the DLL image of SNNNNNNN.DLL consists of six pages: The PE header and the five sections listed above, each of which happens to consist of one page. Now run PTAPP.EXE under control of PWALK, and select a small DLL with no exports and no CRT support from the Select DLL menu. You should see a message saying that SNNNNNNN.DLL was located somewhere on your hard drive. Choose Load DLL from the Run Single Tests menu. You should now see a message saying that the DLL was loaded at some address. Then go back to PWALK, rewalk the process, and scroll down to the address that PTAPP reported as the loading address (if the DLL was loaded at the preferred base address, this would be 0x10000000). You will then see the six pages of the DLL exactly in the order they were specified in the executable header. Note that the page that belongs to the .RELOC section is listed as a second page in the .EDATA section.

Then run PVIEW.EXE and select the process PTAPP.EXE from the process list combo box. In the User Address Space group box, select SNNNNNNN.DLL from the combo box. You should now see all of the DLL's pages sorted by access type: The DLL is listed as occupying a total of 24K (six pages). 12K (or three pages) are listed as read-only—the DLL header page beginning at 0x10000000 and the two pages in the .EDATA section beginning at 0x10004000. One page in the .IDATA section is marked as read/write. (This must be read/write because an import designation may refer to a DLL that must be rebased, so entries in this section may actually have to be updated.) The one page in the .TEXT section is marked as execute, and the .DATA section page has copy-on-write protection.

If you run the same procedure on one of the large DLLs, you will see that the .DATA section will grow as expected, and all of the relocatable data in that section will be marked as copy-on-write. As mentioned before, the copy-on-write scheme ensures that relocations will be performed not on the physical page of the DLL image, but on a copy on the pagefile.

The Numbers

One of the caveats I mentioned earlier about measuring DLL load times is that your mileage may vary greatly. I ran the test set several times and found that, although some patterns and general relationships can be detected, the influence of the overall machine work load may skew the results widely—differences of up to 20 percent from one test run to the other are not atypical.

Let me first describe how I obtained the numbers and then interpret the results. Please refer to Appendix A for the test runs on which I based the evaluations in this paper.

In order to obtain a set of numbers, run the test application PTAPP and choose Run All Tests from the Run Multiple Tests menu. This will invoke a script that loads all of the 18 DLLs 50 times each. (An individual scenario can be tested by choosing a particular DLL from the Select DLL menu, choosing Finish to locate the DLL and initialize the test, and then choosing Run Without Hogging from the Run Multiple Tests menu. A DLL can be loaded in a one-shot fashion using the Load DLL menu item from the Run Single Tests menu.) Caution: The test takes several minutes to complete.

The result of each test will be displayed in the application's main window. The first line displays the resolution of the system performance counter (this can be used to compute absolute times), and after the last test, you will find a table of 36 figures. These numbers are the average load times (in performance-counter ticks) for each of the 18 DLLs loaded both at the preferred address and rebased. As I mentioned earlier, the number of ticks, in conjunction with the performance counter resolution, can be used to compute the absolute loading times through this formula:

loading time in seconds = number of ticks / performance counter resolution

The test application also computes the relative load time in parentheses behind each result; this is based on the smallest result encountered while running the test.
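
The two conversions described here are simple enough to write down. The following sketch is not the PTAPP code; the function names are hypothetical, and the resolution is assumed to be expressed in ticks per second, as the formula implies.

```c
/* Convert a tick count to seconds, given the counter resolution in
   ticks per second. Returns -1.0 if the resolution is not positive. */
double ticks_to_seconds(long long ticks, long long ticks_per_second)
{
    if (ticks_per_second <= 0)
        return -1.0;
    return (double)ticks / (double)ticks_per_second;
}

/* Relative load time: a result divided by the smallest result of the
   same test run. Returns -1.0 if the smallest result is not positive. */
double relative_load_time(long long ticks, long long smallest_ticks)
{
    if (smallest_ticks <= 0)
        return -1.0;
    return (double)ticks / (double)smallest_ticks;
}
```

Because the relative value is a ratio of two tick counts from the same machine, it does not depend on the counter resolution, which is why the tables in Appendix A can state a single reference time per table.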

In order to obtain the four sets of 36 numbers each (as listed in Appendix A), you should run the test application four times: twice under Windows NT (once with the DLLs located in the same directory as PTAPP.EXE, and once with the DLLs located deep down in the search path), and twice under Windows 95 (same conditions).

As I mentioned before, the results I present are neither groundbreakingly new nor surprising. Here are the important conclusions:

The figures for Windows NT and Windows 95 do not differ wildly, except that Windows 95 does seem to load small DLLs slower and large DLLs faster than Windows NT.
All other things being equal, the size of the DLL does not matter; that is, the costs for loading a small DLL and a large DLL are pretty much equal. Thus, if possible, you should avoid writing a lot of small DLLs and instead write fewer large DLLs if load time is an issue for you. Note that this observation holds true over a very wide range of DLL sizes—when I ran the test on the huge binary DLL I mentioned earlier (the one with 15,000 pages), the load time did not differ very much from the load time for the small DLL that contains six pages total.
Rebasing a DLL with many fixups increases the load time to roughly six times the baseline on Windows NT and roughly four times the baseline on Windows 95 (see Appendix A). Note, however, that these figures reflect a very large number of fixups (34,000 in the sample suite). A typical DLL has far fewer; for example, the debug version of MFC30D.DLL, which ships with Visual C++ version 2.x, has about 1700 fixups, which is about 5 percent of the 34,000 fixups in the sample suite.
The single biggest factor that slows down the loading of DLLs is the location of the DLL. The documentation for LoadLibrary describes the algorithm that the operating system uses for locating the DLL image; a DLL located at the first search position (the current directory) typically loads in roughly 20 percent of the time that the same DLL takes when it is located deep down in the path. The exact load time difference depends a lot on the length of the path, the efficiency of the underlying file system, and the number of files and directories that need to be searched.
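
To put the rebasing figures in proportion, the cost of the fixups alone can be estimated from the relative load times in Appendix A. The sketch below is back-of-the-envelope arithmetic, not a measurement: it assumes that the fixup cost scales linearly with the number of fixups (as the linear traversal described earlier suggests) and it applies only to the test machine. The function names are hypothetical.

```c
/* Extra load time, in milliseconds, attributable to fixups: the
   difference between a rebased DLL with fixups and a rebased DLL
   without them, scaled by the table's reference time. */
double fixup_cost_ms(double rel_with_fixups, double rel_without_fixups,
                     double reference_ms)
{
    return (rel_with_fixups - rel_without_fixups) * reference_ms;
}

/* Scale a measured fixup cost to a different number of fixups,
   assuming the cost is linear in the fixup count. Returns -1.0 if
   measured_fixups is not positive. */
double scale_fixup_cost_ms(double measured_ms, long measured_fixups,
                           long target_fixups)
{
    if (measured_fixups <= 0)
        return -1.0;
    return measured_ms * (double)target_fixups / (double)measured_fixups;
}
```

For the first row of Table 1b (6.4 with fixups, 1.23 without, reference 17.5 ms), the first function gives roughly 90 ms for 34,000 fixups; scaled to 1700 fixups, the estimate is roughly 4.5 ms. Under these assumptions, rebasing a DLL with a typical number of fixups costs far less than a poorly placed DLL costs in search time.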

Here are the numbers in neat, digestible format. Please refer to Appendix A for information on how the numbers were pulled together.

Figure 1. Windows NT 3.51 DLL load times

Figure 2. Windows 95 DLL load times

The Recommendations

The single, major thing you can do to speed up DLL loading is to ensure that the operating system does not spend a lot of time locating the DLL: either put the DLL in the same directory from which the executable is started, or start the executable with your PATH environment variable set up so that the DLL in question can be located quickly. This is something you can do without even touching the DLL. If you load the DLL repeatedly and explicitly, you can use the SearchPath application programming interface (API) to first obtain the full path name of the DLL location so that you can provide the operating system with an exact location before loading the DLL.

The other main optimization that can help you speed up DLL loading—if there are a significant number of relocation items in the DLL—is to try to ensure that the DLL will not have to be rebased by the operating system. You will also notice that for very small DLLs, the presence of the C run-time initialization code may slow down the DLL loading a little bit.

As you can see from the numbers above, there is a fixed cost in loading a DLL, regardless of its size; thus, you are much better off writing one bigger DLL instead of a number of small DLLs.

Finally, I need to reiterate that due to the way both Windows NT and Windows 95 handle the management of pages that are to be discarded (the pages are, in fact, kept in memory and will be reused over time), the loading of an executable is much faster if the same executable has already been loaded into any application's address space or has recently been loaded and is still on the standby list.

What Else Is There?

I wouldn't like to end this article without mentioning another issue that is related to DLL loading: binding import addresses to external DLLs. Fortunately for me, there is no need to explicitly discuss this issue here because pretty much everything that is to be said has already been said in Matt Pietrek's article series on DLL binding in the "Windows Q&A" column in Microsoft Systems Journal, which describes the internals of DLL import binding as well as the usage of the BIND utility. (See the reference in the "Bibliography" section.)

Bibliography

Custer, Helen. Inside Windows NT. Redmond, WA: Microsoft Press, 1993.

Pietrek, Matt. "Peering Inside the PE: A Tour of the Win32 Executable File Format." Microsoft Systems Journal 9 (March 1994). (MSDN Library, Books and Periodicals)

Pietrek, Matt. "Windows Q&A." Microsoft Systems Journal 10 (July 1995). (MSDN Library, Books and Periodicals)

Pietrek, Matt. "Windows Q&A." Microsoft Systems Journal 10 (August 1995). (MSDN Library, Books and Periodicals)

Appendix A. Results from a Test Run

All tests were executed on an i486 machine running at 33 MHz with 24 MB of RAM. Note that the references (1.0 base value) differ from test set to test set. In the charts, the values from the respective second test sets (DLLs located in the search path) have been adjusted relative to the reference value of the first test set.
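
Because each table uses its own reference value, figures from different tables cannot be compared directly. The adjustment mentioned above amounts to re-expressing a relative value against another table's reference time, as in the following sketch (the function name is hypothetical).

```c
/* Re-express a relative load time against a different reference.
   'value' is relative to own_reference_ms; the result is relative to
   target_reference_ms. Returns -1.0 if the target is not positive. */
double rescale_relative(double value, double own_reference_ms,
                        double target_reference_ms)
{
    if (target_reference_ms <= 0.0)
        return -1.0;
    return value * own_reference_ms / target_reference_ms;
}
```

For example, a value of 1.0 from Table 2 (reference 85.4 ms) corresponds to about 4.9 when expressed against the 17.5 ms reference of Table 1.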

Table 1. Windows NT 3.51, DLLs Located in Current Directory (Reference: 1.0 == 17.5 ms)

1a. DLLs Loaded at Preferred Address

DLL Type	Small DLL	Large DLL	Large DLL with Fixups
No CRT, no exports	1.0	1.0	1.0
No CRT, exports	1.1	1.0	1.1
DllMain, no exports	1.25	1.22	1.24
DllMain, exports	1.23	1.18	1.21
CRT_INIT, no exports	1.18	1.2	1.15
CRT_INIT, exports	1.2	1.19	1.21

1b. DLLs Rebased

DLL Type	Small DLL	Large DLL	Large DLL with Fixups
No CRT, no exports	1.25	1.23	6.4
No CRT, exports	1.26	1.3	6.38
DllMain, no exports	1.4	1.4	6.5
DllMain, exports	1.29	1.42	6.52
CRT_INIT, no exports	1.4	1.4	6.45
CRT_INIT, exports	1.3	1.3	6.4

Table 2. Windows NT 3.51, DLLs Located in Search Path (Reference: 1.0 == 85.4 ms)

2a. DLLs Loaded at Preferred Address

DLL Type	Small DLL	Large DLL	Large DLL with Fixups
No CRT, no exports	1.0	1.0	1.0
No CRT, exports	1.0	1.0	1.0
DllMain, no exports	1.0	1.0	1.0
DllMain, exports	1.0	1.0	1.0
CRT_INIT, no exports	1.0	1.0	1.1
CRT_INIT, exports	1.0	1.0	1.0

2b. DLLs Rebased

DLL Type	Small DLL	Large DLL	Large DLL with Fixups
No CRT, no exports	1.1	1.0	2.1
No CRT, exports	1.1	1.0	2.1
DllMain, no exports	1.1	1.1	2.1
DllMain, exports	1.0	1.0	2.1
CRT_INIT, no exports	1.1	1.1	2.1
CRT_INIT, exports	1.1	1.0	2.1

Table 3. Windows 95, DLLs Located in Current Directory (Reference: 1.0 == 21.0 ms)

3a. DLLs Loaded at Preferred Address

DLL Type	Small DLL	Large DLL	Large DLL with Fixups
No CRT, no exports	1.0	1.2	1.2
No CRT, exports	1.0	1.2	1.1
DllMain, no exports	1.0	1.2	1.2
DllMain, exports	1.1	1.2	1.2
CRT_INIT, no exports	1.0	1.2	1.1
CRT_INIT, exports	1.0	1.2	1.2

3b. DLLs Rebased

DLL Type	Small DLL	Large DLL	Large DLL with Fixups
No CRT, no exports	1.1	1.2	4.0
No CRT, exports	1.1	1.2	3.8
DllMain, no exports	1.1	1.2	4.0
DllMain, exports	1.1	1.2	4.0
CRT_INIT, no exports	1.1	1.2	4.0
CRT_INIT, exports	1.1	1.2	4.1

Table 4. Windows 95, DLLs Located in Search Path (Reference: 1.0 == 94.7 ms)

4a. DLLs Loaded at Preferred Address

DLL Type	Small DLL	Large DLL	Large DLL with Fixups
No CRT, no exports	1.0	1.0	1.1
No CRT, exports	1.0	1.0	1.0
DllMain, no exports	1.0	1.0	1.1
DllMain, exports	1.0	1.0	1.0
CRT_INIT, no exports	1.0	1.0	1.0
CRT_INIT, exports	1.0	1.0	1.1

4b. DLLs Rebased

DLL Type	Small DLL	Large DLL	Large DLL with Fixups
No CRT, no exports	1.0	1.0	1.7
No CRT, exports	1.0	1.1	1.7
DllMain, no exports	1.0	1.1	1.7
DllMain, exports	1.0	1.0	1.7
CRT_INIT, no exports	1.0	1.1	1.7
CRT_INIT, exports	1.0	1.0	1.7