This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.


December 1998


Download Dec98Win32.exe (4KB)

Jeffrey Richter wrote Advanced Windows, Third Edition (Microsoft Press, 1998) and Windows 95: A Developer's Guide (M&T Books, 1995). Jeff is a consultant and teaches Win32 programming courses (http://www.jeffreyrichter.com).

Several readers have told me that they have had some problems using the code included with my article, "Custom Performance Monitoring for Your Windows NT® Applications" (MSJ, August 1998). The first problem is that they can't compile the code because it sets the low-level WH_KEYBOARD_LL and WH_MOUSE_LL hooks. These new hooks were added in Windows NT 4.0 Service Pack 3. The WinUser.H header file included with many compilers (such as Visual C++® 5.0) is not up to date. To build my code, you will need to get a newer WinUser.H header file (try the MSDN Platform SDK CD, disk 6). Note that these low-level hooks are not identical to the existing WH_KEYBOARD and WH_MOUSE hooks; you cannot simply remove the _LL and have my code work.
      The second problem is that PerfMon raises an access violation when you invoke its Add to Chart dialog box. I reported this PerfMon bug to Microsoft and they fixed it for Windows NT 5.0. The problem occurs because PerfMon has a string buffer that is too small, and subsequently its stack gets corrupted. To avoid this bug, just move the performance DLL closer to the root of your drive so that it has a shorter path name.

Q My application requires the use of several DLLs, which really hurts its load/initialization performance. I think I have done everything possible to improve load-time performance, including rebasing and binding my DLLs. I would really like to postpone loading my DLLs until they are actually needed. I know that I can accomplish this using explicit linking (calling LoadLibrary and GetProcAddress), but this means I would have to keep track of whether the DLL was already loaded and it would make coding more tedious. Isn't there an easier way to postpone loading DLLs until you actually call a function in them?

Kristin Trace

Q I am developing an application that needs to run on both Windows® 9x and Windows NT. My application checks to see if a certain process is running on the system. When my app runs on Windows NT, I use the PSAPI functions EnumProcessModules and GetModuleFileNameEx. For Windows 9x I use the Toolhelp functions CreateToolhelp32Snapshot, Process32First, Process32Next, and so on. When my application initializes, I call GetVersionEx to determine which OS my application is running on, and then call only the appropriate set of functions. My application compiles and links perfectly, but when I run it on Windows NT I get the following message: "The procedure entry point Process32Next could not be located in the dynamic link library KERNEL32.dll." Likewise, when I run my application on Windows 9x, I get a similar message regarding PSAPI.DLL. Can you tell me how I can get rid of such runtime errors?
 

Conor Kiernan

A Both of these problems can be solved using a new feature offered in Visual C++ 6.0: delay-load DLLs. A delay-load DLL is implicitly linked, but the loader will not load the DLL until your code actually calls a function in the DLL. This will improve Kristin's initialization performance because the loader will not do all its work up front. For Conor, the runtime errors will disappear because his application will only load the required DLL.
      I've spent quite a bit of time playing with the new delay-load DLL feature of Visual C++ 6.0, and I must say that Microsoft has done an excellent job in implementing it. Delay-load DLLs offer many features and work equally well on both Windows 9x and Windows NT. Let's start off with the easy stuff—just getting it to work.
      To make delay-load DLLs work with Visual C++ 6.0, create your DLL just as you normally would. You also create your executable as usual, but you'll have to change a couple of linker switches and relink the executable. Here are the two linker switches you need to add:

 /Lib:DelayImp.lib 
 /DelayLoad:MyDll.dll
The /Lib switch tells the linker to embed a special __delayLoadHelper function into your executable. The second switch tells the linker several things:
  • Remove MyDll.dll from the table in the executable that tells the OS loader to implicitly load the DLL when the process initializes.
  • Embed a special table in the executable indicating which functions are in MyDll.
  • Resolve calls to the delay-loaded functions by having calls jump to the __delayLoadHelper function.
      When the application runs, a call to a delay-loaded function actually calls the __delayLoadHelper function instead. This function references the special table I mentioned previously, and knows to call LoadLibrary followed by GetProcAddress. Once the address of the delay-loaded function is obtained, __delayLoadHelper fixes up calls to this function so that future calls go directly to the delay-loaded function. Note that other functions in the same DLL will still have to be fixed up the first time you call them. Also, you can specify the /DelayLoad linker switch multiple times, once for every DLL that you want to delay-load.
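To make the fix-up behavior concrete, here is a small portable sketch of the idea. It is not the actual __delayLoadHelper code: the names RealAdd, AddThunk, ResolveAdd, and g_pfnAdd are hypothetical, and the DLL lookup is faked so that the example builds without Windows headers. The point is only that the first call goes through a thunk, the thunk patches the function pointer, and later calls go straight to the target.

```cpp
// Hypothetical stand-in for a function exported by a delay-loaded DLL.
static int RealAdd(int a, int b) { return a + b; }

// Counts how often the simulated helper runs, so the fix-up is observable.
static int g_helperCalls = 0;

using AddFn = int (*)(int, int);

static int AddThunk(int a, int b);  // forward declaration of the thunk

// The "import slot". It starts out pointing at the thunk, much as a
// delay-loaded import initially routes calls to the helper.
static AddFn g_pfnAdd = AddThunk;

// Simulated helper. The real helper would call LoadLibrary and then
// GetProcAddress; here the lookup is faked by returning RealAdd directly.
static AddFn ResolveAdd() {
    ++g_helperCalls;
    return RealAdd;
}

// The first call lands here: resolve, patch the slot, then forward the call.
static int AddThunk(int a, int b) {
    g_pfnAdd = ResolveAdd();
    return g_pfnAdd(a, b);
}

// What call sites in the application effectively do.
int CallAdd(int a, int b) { return g_pfnAdd(a, b); }

// Exposes the helper call count for inspection.
int HelperCalls() { return g_helperCalls; }
```

Calling CallAdd twice runs the simulated helper only once; the second call goes through the patched pointer. Each delay-loaded function has its own slot, which is why other functions in the same DLL still need their own first-call fix-up.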

Error Conditions
      OK, that's it. It's that simple! But there are a couple of issues that you should take into consideration. Normally, when the OS loader loads your executable, it tries to load the required DLLs. If a DLL can't be loaded, the loader pops up a message box like the one shown in Figure 1. But for delay-load DLLs, the existence of the DLL is not checked at initialization time. If the DLL can't be found when a delay-loaded function is called, the __delayLoadHelper function raises a software exception. You can trap this exception using a structured exception handling (SEH) frame and keep your application running; if you don't trap the exception, your process will be terminated.

Figure 1 Message Box

      Another problem that can occur is that __delayLoadHelper finds your DLL, but the function you're trying to call isn't in the DLL. This can happen if the loader finds an old version of the DLL. In this case, __delayLoadHelper raises another software exception and the same rules apply. The code in Figure 2 shows how to write the SEH code properly to handle these errors. While examining the code in Figure 2, you'll notice a lot of other stuff that has nothing to do with SEH and error handling. This code has to do with additional features available when you use delay-load DLLs. I'll describe these more advanced features shortly. If you don't use these features, you can delete this additional code.
      As you can see, the Visual C++ team has defined two software exception codes, VcppException(ERROR_SEVERITY_ERROR, ERROR_MOD_NOT_FOUND) and VcppException(ERROR_SEVERITY_ERROR, ERROR_PROC_NOT_FOUND), to represent the DLL module not found and the function not found. My exception filter function DelayLoadDllExceptionFilter checks for these two exception codes. If neither of these exception codes is thrown, my filter returns EXCEPTION_CONTINUE_SEARCH, as any good filter should. (Never swallow exceptions that you don't know how to handle.) If one of these two exception codes is thrown, then the __delayLoadHelper function provides a pointer to a DelayLoadInfo structure containing some additional information. The DelayLoadInfo structure is defined in the DelayImp.H file in Visual C++ and is shown here:

 typedef struct DelayLoadInfo {
     DWORD cb;             // size of structure
     PCImgDelayDescr pidd; // raw form of data (everything is there)
     FARPROC * ppfn;       // points to address of function to load
     LPCSTR szDll;         // name of dll
     DelayLoadProc dlp;    // name or ordinal of procedure
     HMODULE hmodCur;      // the hInstance of the library we have loaded
     FARPROC pfnCur;       // the actual function that will be called
     DWORD dwLastError;    // error received (if an error notification)
 } DelayLoadInfo, * PDelayLoadInfo;
This data structure is allocated and initialized by the __delayLoadHelper function. As the function progresses through its work of dynamically loading the DLL and getting the address of the called function, it populates the members of this structure. Inside your SEH filter, the szDll member points to the name of the DLL that the helper attempted to load, and the dlp member identifies the function you attempted to look up. Since functions can be looked up via ordinal or name, the dlp member looks like this:

 typedef struct DelayLoadProc {
     BOOL fImportByName;
     union {
         LPCSTR szProcName;
         DWORD  dwOrdinal;
     };
 } DelayLoadProc;
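As a rough illustration of how a filter might turn these two members into a readable message, the sketch below checks fImportByName before touching the union. DelayLoadProcSketch is a portable mirror of the structure (plain C++ types instead of BOOL, LPCSTR, and DWORD) and DescribeImport is a hypothetical helper; real code would use the definitions from DelayImp.H.

```cpp
#include <cstdint>
#include <string>

// Portable mirror of DelayLoadProc, for illustration only.
struct DelayLoadProcSketch {
    int fImportByName;             // nonzero: szProcName is the valid member
    union {
        const char*   szProcName;  // function name
        std::uint32_t dwOrdinal;   // function ordinal
    };
};

// Hypothetical helper: builds "Dll!Name" or "Dll!#Ordinal" for a message.
std::string DescribeImport(const char* szDll, const DelayLoadProcSketch& dlp) {
    std::string text(szDll);
    text += "!";
    if (dlp.fImportByName) {
        text += dlp.szProcName;                       // looked up by name
    } else {
        text += "#" + std::to_string(dlp.dwOrdinal);  // looked up by ordinal
    }
    return text;
}
```

Reading szProcName when fImportByName is FALSE would interpret an ordinal as a pointer, so the flag has to be tested first.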
      If the DLL loaded successfully, but did not contain the desired function, then you might also look at the hmodCur member to see at what memory address the DLL loaded. Also, check the dwLastError member if you want to see what Win32 error caused the exception to be raised. For an exception filter this will probably be unnecessary because the exception code tells you what happened. The pfnCur member contains the address of the desired function. This will always be set to NULL in the exception filter because __delayLoadHelper couldn't find the address of the function.
      Of the remaining members, cb is for versioning, pidd points to the table embedded in the EXE file that contains the list of delay-load DLLs and functions, and the ppfn member is the address where the function's address will go if the function is found. pidd and ppfn are used by the __delayLoadHelper function internally; they are for super-advanced use and it is extremely unlikely that you will ever have to examine or understand them.
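It can help to see what the two exception codes discussed above actually are as numbers. The sketch below reproduces the arithmetic of the VcppException macro with constants copied from the Windows and Visual C++ headers, so it builds on any compiler. The k-prefixed names, VcppExceptionSketch, and FilterDisposition are my own; treat this as a sketch of the filter's decision, not as the code from Figure 2.

```cpp
#include <cstdint>

// Values mirrored from the Windows SDK and DelayImp.H so the sketch builds
// anywhere; real code should take them from the headers.
constexpr std::uint32_t kErrorSeverityError = 0xC0000000u;  // ERROR_SEVERITY_ERROR
constexpr std::uint32_t kFacilityVisualCpp  = 0x6Du;        // FACILITY_VISUALCPP
constexpr std::uint32_t kErrorModNotFound   = 126u;         // ERROR_MOD_NOT_FOUND
constexpr std::uint32_t kErrorProcNotFound  = 127u;         // ERROR_PROC_NOT_FOUND

// Same arithmetic as the VcppException macro: severity | facility << 16 | error.
constexpr std::uint32_t VcppExceptionSketch(std::uint32_t sev, std::uint32_t err) {
    return sev | (kFacilityVisualCpp << 16) | err;
}

constexpr std::uint32_t kDllNotFound =
    VcppExceptionSketch(kErrorSeverityError, kErrorModNotFound);
constexpr std::uint32_t kProcNotFound =
    VcppExceptionSketch(kErrorSeverityError, kErrorProcNotFound);

// SEH filter dispositions.
constexpr int kExecuteHandler = 1;  // EXCEPTION_EXECUTE_HANDLER
constexpr int kContinueSearch = 0;  // EXCEPTION_CONTINUE_SEARCH

// Handle only the two delay-load codes; pass everything else along.
constexpr int FilterDisposition(std::uint32_t code) {
    return (code == kDllNotFound || code == kProcNotFound) ? kExecuteHandler
                                                           : kContinueSearch;
}

static_assert(kDllNotFound == 0xC06D007Eu, "module-not-found code");
static_assert(kProcNotFound == 0xC06D007Fu, "procedure-not-found code");
```

An access violation (0xC0000005), for example, yields kContinueSearch, so the filter does not swallow it.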

Unloading a Delay-load DLL
      So far, I have explained the basics of using delay-load DLLs and recovering from error conditions. However, Microsoft did not stop here. They added the ability for your application to unload a delay-load DLL. For example, your application might require a special DLL to print the user's document. This DLL is a perfect candidate to make a delay-load DLL because most of the time it probably won't be used. However, if the user chooses the Print menu option, you can call a function in the DLL and it will be loaded automatically. This is great—but after the user's document is printed, it is unlikely that the user will print another document immediately and you could unload the DLL, freeing system resources. If the user does decide to print another document, the DLL will be reloaded on demand.
      To unload a delay-load DLL, you must do two things. First, you must specify an additional linker switch (/Delay:unload) when building your executable file. Second, you must modify your source code and place a call to the __FUnloadDelayLoadedDLL function at the point where you want the DLL to be unloaded:


 BOOL __FUnloadDelayLoadedDLL(LPCSTR szDll); 
      The /Delay:unload linker switch tells the linker to place another table inside the file. This table contains the information necessary to reset the functions you've already called so that these functions call the __delayLoadHelper function again. When you call __FUnloadDelayLoadedDLL, you pass it the name of the delay-load DLL that you want to unload. The function then goes to the unload table in the file and resets all of the DLL's function addresses. Then __FUnloadDelayLoadedDLL calls FreeLibrary to unload the DLL.
      Let me point out a few potential problems. First, make sure that you don't call FreeLibrary yourself to unload the DLL because the function's address will not be reset, causing an access violation the next time you attempt to call a function in the DLL. Second, when you call __FUnloadDelayLoadedDLL, the DLL name you pass should not include a path, and the letters in the name must be the same case as when you passed the DLL name to the /DelayLoad linker switch. If you don't comply, __FUnloadDelayLoadedDLL will fail. Third, if you never intend to unload a delay-load DLL, do not specify the /Delay:unload linker switch. Your executable file will be smaller. Finally, if you call __FUnloadDelayLoadedDLL from a module that was not built with the /Delay:unload switch, nothing bad happens; __FUnloadDelayLoadedDLL simply does nothing and returns FALSE.
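The reset-then-free order and the case-sensitive name match can be modeled in a few lines of portable C++. This is a simulation under stated assumptions, not the library's implementation: PrintLib.dll, RealPrint, PrintThunk, and UnloadSketch are hypothetical names, and the loaded module is represented by a simple flag instead of an HMODULE.

```cpp
#include <cstring>

using PrintFn = int (*)(int);

// Hypothetical export of the delay-loaded printing DLL.
static int RealPrint(int pages) { return pages; }
static int PrintThunk(int pages);  // forward declaration of the thunk

static PrintFn g_pfnPrint  = PrintThunk;  // import slot, starts at the thunk
static bool    g_dllLoaded = false;       // stands in for the loaded module
static int     g_loadCount = 0;           // how many times the DLL was "loaded"

// Name exactly as it is assumed to have been passed to /DelayLoad.
static const char kDllName[] = "PrintLib.dll";

// Simulated helper path: load on first use and patch the slot.
static int PrintThunk(int pages) {
    g_dllLoaded = true;  // real helper: LoadLibrary + GetProcAddress
    ++g_loadCount;
    g_pfnPrint = RealPrint;
    return g_pfnPrint(pages);
}

// Simulated unload: exact, case-sensitive name match, then reset the slot
// before releasing the module so that no stale address survives.
bool UnloadSketch(const char* szDll) {
    if (std::strcmp(szDll, kDllName) != 0) return false;
    g_pfnPrint  = PrintThunk;  // next call goes back through the thunk
    g_dllLoaded = false;       // real code: FreeLibrary
    return true;
}

int PrintPages(int pages) { return g_pfnPrint(pages); }
int LoadCount() { return g_loadCount; }
```

Passing "printlib.dll" fails here because the comparison is case-sensitive, mirroring the rule above. Skipping the slot reset (which is what calling FreeLibrary yourself amounts to) would leave g_pfnPrint pointing into a module that is no longer loaded.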

Other Features
      Another feature of delay-load DLLs is that, by default, the functions that you call are bindable to a memory address where the system thinks the function will be in a process's address space. I'm not going to give a detailed explanation of binding in this column because there is a lot of information about it in MSDN Knowledge Base articles. However, binding a module allows it to load significantly faster and is strongly encouraged. If you are unfamiliar with binding, you should research it to improve the performance of your application with respect to both speed and memory usage. Since creating bindable delay-load DLL tables can make your executable file bigger, the linker also supports a /Delay:nobind switch. Most applications should not use this linker switch since binding is generally preferred.
      The last feature of delay-load DLLs is for advanced users and really shows Microsoft's attention to detail. As the __delayLoadHelper function executes, it has the ability to call hook functions that you provide. These functions receive notifications of __delayLoadHelper's progress and errors. In addition, these functions can override how the DLL is loaded and how the function's memory address is obtained.
      To get the notification or override behavior, you must do two things to your source code. First, you must write a hook function. This function must look like the DliHook function that appears in Figure 2. The DliHook skeleton function does not affect __delayLoadHelper's operation. To alter the behavior, start with the DliHook function and then modify it as necessary. Once you write the function, you need to tell __delayLoadHelper the address of the function. Inside the DelayImp.LIB static-link library, two global variables are defined: __pfnDliNotifyHook and __pfnDliFailureHook. Both of these variables are of type PfnDliHook:


 typedef FARPROC (WINAPI *PfnDliHook)(unsigned dliNotify, PDelayLoadInfo pdli);
As you can see, this is a function data type and matches the prototype of my DliHook function. Inside DelayImp.LIB, the two variables are initialized to NULL, which tells __delayLoadHelper not to call any hook functions. So to have your hook function called, you must set either of these variables to your hook function's address. In my code, I simply add these two lines at global scope:

 PfnDliHook __pfnDliNotifyHook  = DliHook;
 PfnDliHook __pfnDliFailureHook = DliHook;
      As you can see, __delayLoadHelper actually works with two callback functions. __delayLoadHelper calls one to report notifications and the other to report failures. Since the prototypes are identical for both functions and the first parameter, dliNotify, tells you why the function is called, I like to make my life simpler by creating a single function and setting both variables to point to one function.
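The single-function approach amounts to a switch on the first parameter. The sketch below is a portable mock-up, not the DliHook function from Figure 2: the enumerator names are assumed to mirror the notification codes in DelayImp.H, FarProcSketch stands in for FARPROC, and the second parameter is left as an opaque pointer where the real hook receives a PDelayLoadInfo.

```cpp
// Notification codes, assumed to mirror the enum in DelayImp.H; real code
// should include that header instead of redeclaring them.
enum DliNotifySketch : unsigned {
    dliStartProcessing,
    dliNotePreLoadLibrary,
    dliNotePreGetProcAddress,
    dliFailLoadLib,
    dliFailGetProc,
    dliNoteEndProcessing
};

using FarProcSketch = void (*)();  // stand-in for FARPROC

// Simple counters so the effect of each call is observable.
struct HookCounts {
    int notifications = 0;
    int failures = 0;
};
static HookCounts g_counts;

// One function serves as both the notify hook and the failure hook.
FarProcSketch DliHookSketch(unsigned dliNotify, const void* /*pdli*/) {
    switch (dliNotify) {
    case dliFailLoadLib:   // the DLL could not be loaded
    case dliFailGetProc:   // the function was not found in the DLL
        ++g_counts.failures;
        break;
    default:               // progress notifications
        ++g_counts.notifications;
        break;
    }
    return nullptr;  // null result: leave the helper's default behavior alone
}
```

Returning a null pointer for every code keeps the hook purely observational. A hook that wanted to override loading would return a non-null value for the relevant notification instead.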

Wrap-up
      This new delay-load DLL feature of Visual C++ 6.0 is pretty cool, and I know many developers who wish they had this feature years ago. I can think of a lot of apps (especially Microsoft® applications) that will be taking advantage of this mechanism in the future. Try it out for yourself and see how it can help the performance of your apps.

Have a question about programming in Win32? Send your questions via email to Jeffrey Richter from his website at http://www.jeffreyrichter.com.

From the December 1998 issue of Microsoft Systems Journal.