Don't be concerned if you see the same DLL appear more than once in the list. ModuleList treats each instance of a DLL with a different load address as a separate instance. I intentionally kept the DLL names in alphabetical order to make it easier to notice when this occurs.
When I set out to write ModuleList, I had a seemingly simple goal: the program should run on as many versions of Windows NT and Windows 9x as possible. This would have been much easier if ModuleList didn't need to run on Windows NT 4.0. However, as I write this column, Windows NT 4.0 is still in widespread use as a development platform.
In an ideal world, I could just use the ToolHelp32 APIs, which provide all the capabilities I need. Alas, ToolHelp32 doesn't exist on Windows NT prior to version 5.0. This presents two problems. First, I can't directly call any ToolHelp32 APIs. Second, I need some other way to get the equivalent information on Windows NT 4.0. Luckily, PSAPI.DLL (which I wrote about in my August 1996 column) comes to the rescue.
By using a combination of ToolHelp32 and PSAPI.DLL APIs, I can access all the information I need to build the DLL list. Unfortunately, I can't just call the appropriate APIs directly based upon the operating system ModuleList is running on. By calling the APIs directly, I'd create an implicit reference to those APIs in the ModuleList executable. Since the ToolHelp32 APIs aren't in Windows NT 4.0, and since PSAPI.DLL won't load on Windows 9x, I'd create a program that wouldn't run on any operating system.
The way to overcome this problem is to bite the bullet and go through the drudgery of using GetProcAddress to obtain function pointers to the appropriate APIs, based upon the underlying operating system. In Windows NT 5.0, there's a new feature called Delay Load import descriptors that sounds like it could solve this problem. That is, the operating system lets you put off loading a DLL and hooking up to its APIs until you actually call the API. However, since this feature doesn't exist in Windows 9x and prior versions of Windows NT, it doesn't do you much good now.
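As a rough illustration of the GetProcAddress approach, here is a minimal sketch that looks up one ToolHelp32 API by name. It is not the code from Figure 2. The typedef and function names are hypothetical, and the #ifdef exists only so that the sketch also compiles on a non-Windows host.

```cpp
#ifdef _WIN32
#include <windows.h>

// Function pointer type matching the documented prototype of
// CreateToolhelp32Snapshot. The typedef name is hypothetical.
typedef HANDLE (WINAPI *PFNCREATETOOLHELP32SNAPSHOT)(DWORD dwFlags,
                                                     DWORD th32ProcessID);

// Returns the address of CreateToolhelp32Snapshot, or NULL when KERNEL32.DLL
// doesn't export it (as on Windows NT 4.0). Because the API is named only in
// a string, the executable carries no import entry for it.
PFNCREATETOOLHELP32SNAPSHOT FindCreateToolhelp32Snapshot()
{
    HMODULE hKernel32 = GetModuleHandleA("KERNEL32.DLL");
    if (!hKernel32)
        return NULL;

    return reinterpret_cast<PFNCREATETOOLHELP32SNAPSHOT>(
        GetProcAddress(hKernel32, "CreateToolhelp32Snapshot"));
}

// True when the ToolHelp32 route can be used on this system.
bool ToolHelp32Available()
{
    return FindCreateToolhelp32Snapshot() != NULL;
}
#else
// Non-Windows hosts: nothing to look up. This branch exists only so the
// sketch compiles everywhere.
bool ToolHelp32Available()
{
    return false;
}
#endif
```

A caller would test ToolHelp32Available() once and fall back to the PSAPI.DLL route when it returns false.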
The ModuleList Code
The central point for the ModuleList code is ModuleList.CPP (see Figure 2). Most of the code is boilerplate dialog code, which I won't waste time on. Instead, let's focus on the PopulateTree function. After clearing the contents of the tree view, the function resets global instances of a ModuleList class and a ProcessIdToNameMap class. Next, PopulateTree adds data to these class instances by calling the PopulateModuleList_ToolHelp32 or PopulateModuleList_PSAPI functions as necessary. Finally, PopulateTree walks through all the items of the ModuleList class and adds the relevant information to the TreeView control.
Before getting into how the ModuleList and ProcessIdToNameMap classes are filled, let's first look at the classes themselves. All of the code for the classes can be found in ModuleListClasses.H and ModuleListClasses.CPP (see Figure 2). The ModuleList class is just a linked list-based container class for ModuleInstance class instances. The ModuleList class has member functions to add a new module, enumerate through all the ModuleInstances, and look up a ModuleInstance given a base address (HMODULE) and file name.
The ModuleInstance class represents each loaded module that has a unique filespec and load address. In addition to storing the HMODULE and filespec, the ModuleInstance class also keeps a list of process IDs that reference the module. There are methods to add a new process ID to the list, enumerate through the process ID list, and retrieve the number of referencing processes.
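To make the relationship between the two classes concrete, here is a minimal sketch of how they might fit together. It is not the code from Figure 2: the member names are hypothetical, std::list and std::vector stand in for the hand-rolled linked list, and the ModuleBase and ProcessId typedefs are portable stand-ins for HMODULE and a DWORD process ID.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <string>
#include <vector>

// Portable stand-ins (assumptions) for the Win32 HMODULE and DWORD types.
typedef std::uintptr_t ModuleBase;
typedef unsigned long  ProcessId;

// One loaded module, identified by its load address plus filespec.
class ModuleInstance
{
public:
    ModuleInstance(ModuleBase base, const std::string& fileSpec)
        : m_base(base), m_fileSpec(fileSpec) {}

    void AddProcessReference(ProcessId pid) { m_pids.push_back(pid); }
    std::size_t GetProcessReferenceCount() const { return m_pids.size(); }
    const std::vector<ProcessId>& GetProcessIds() const { return m_pids; }

    ModuleBase GetBase() const { return m_base; }
    const std::string& GetFileSpec() const { return m_fileSpec; }

private:
    ModuleBase             m_base;
    std::string            m_fileSpec;
    std::vector<ProcessId> m_pids;   // processes that have this module loaded
};

class ModuleList
{
public:
    // Returns the matching instance, or nullptr if it hasn't been seen yet.
    // The filespec comparison is an exact match, which is a simplification.
    ModuleInstance* Lookup(ModuleBase base, const std::string& fileSpec)
    {
        for (ModuleInstance& candidate : m_modules)
            if (candidate.GetBase() == base && candidate.GetFileSpec() == fileSpec)
                return &candidate;
        return nullptr;
    }

    // Records that 'pid' has the module loaded, creating the instance on
    // first sight. The same DLL at a different base gets its own instance.
    ModuleInstance& AddReference(ModuleBase base, const std::string& fileSpec,
                                 ProcessId pid)
    {
        ModuleInstance* found = Lookup(base, fileSpec);
        if (!found)
        {
            m_modules.push_back(ModuleInstance(base, fileSpec));
            found = &m_modules.back();
        }
        found->AddProcessReference(pid);
        return *found;
    }

    std::size_t GetCount() const { return m_modules.size(); }

private:
    std::list<ModuleInstance> m_modules;   // std::list keeps element addresses stable
};
```

Calling AddReference twice with the same base and filespec but different process IDs yields one ModuleInstance with a reference count of two. The same filespec at a second base address yields a second instance, which is why a DLL can appear more than once in the list.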
The final member of this set of classes is the ProcessIdToNameMap class. Its sole reason for existence is to translate a process ID into a filespec for the executable associated with the process (in essence, the process name). The implementation of this class is rather crude, using a dynamically grown array and brute force scanning algorithms. Yes, using the STL Map class would be more elegant. However, I still spend more time wrestling with STL-induced compiler errors than I save by using the STL in a simple program like this.
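For comparison, here is a sketch of what the STL alternative mentioned above might look like. This is not the implementation in Figure 2, which uses a dynamically grown array. The class name ProcessIdToNameMapStl, its member names, and the ProcessId typedef are all hypothetical.

```cpp
#include <map>
#include <string>

// Portable stand-in (assumption) for a DWORD process ID.
typedef unsigned long ProcessId;

// Translates a process ID into the filespec of the process's executable.
class ProcessIdToNameMapStl
{
public:
    // Adds or replaces the name recorded for 'pid'.
    void Add(ProcessId pid, const std::string& exeFileSpec)
    {
        m_names[pid] = exeFileSpec;
    }

    // Returns true and fills 'name' if 'pid' is known; otherwise returns
    // false and leaves 'name' untouched.
    bool Lookup(ProcessId pid, std::string& name) const
    {
        std::map<ProcessId, std::string>::const_iterator it = m_names.find(pid);
        if (it == m_names.end())
            return false;
        name = it->second;
        return true;
    }

private:
    std::map<ProcessId, std::string> m_names;
};
```

The lookup is logarithmic rather than a linear scan, although for the number of processes on a typical machine the difference is unlikely to be noticeable.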
The third and final source file from the ModuleList program is ModuleListOSCode.CPP. This is where I isolated all the code that's specific to a particular operating system. There are only two functions in this module: PopulateModuleList_ToolHelp32 and PopulateModuleList_PSAPI. Both functions take references to empty ModuleList and ProcessIdToNameMap class instances and fill them up. Immediately preceding both functions is a series of typedefs. I needed all these typedefs so that I could use GetProcAddress and call the PSAPI and ToolHelp32 APIs through function pointers, thereby avoiding an implicit reference to the APIs.
The PopulateModuleList_ToolHelp32 function looks up the addresses of the five ToolHelp32 APIs it will use. It then creates a ToolHelp32 snapshot of the process list. Using this snapshot, the function iterates through each of the processes. At each stop, it creates a module list snapshot. As the code enumerates through the module list snapshot, it fills in the ModuleList and ProcessIdToNameMap classes that were passed to it. The function is also careful to call CloseHandle on each snapshot when it's done using the snapshot.
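The sequence just described can be sketched roughly as follows. This is a simplified stand-in, not the Figure 2 code: it prints each module instead of filling the ModuleList and ProcessIdToNameMap classes, the function and typedef names are hypothetical, and it assumes an ANSI (non-UNICODE) build. The #ifdef is there only so the sketch compiles on a non-Windows host.

```cpp
#include <cstdio>

#ifdef _WIN32
#include <windows.h>
#include <tlhelp32.h>   // declarations and structures only; nothing is imported
#ifdef UNICODE
#error This sketch assumes an ANSI build so the structures match the exports.
#endif

// Hypothetical typedef names for the five APIs (three distinct signatures).
typedef HANDLE (WINAPI *PFNSNAPSHOT)(DWORD, DWORD);
typedef BOOL   (WINAPI *PFNPROCESSWALK)(HANDLE, LPPROCESSENTRY32);
typedef BOOL   (WINAPI *PFNMODULEWALK)(HANDLE, LPMODULEENTRY32);

// Prints each module of each process. Returns the number of processes
// visited, or -1 if the ToolHelp32 APIs aren't available.
int DumpModules_ToolHelp32()
{
    HMODULE hK32 = GetModuleHandleA("KERNEL32.DLL");
    if (!hK32)
        return -1;

    PFNSNAPSHOT    pfnSnapshot  = (PFNSNAPSHOT)GetProcAddress(hK32, "CreateToolhelp32Snapshot");
    PFNPROCESSWALK pfnProcFirst = (PFNPROCESSWALK)GetProcAddress(hK32, "Process32First");
    PFNPROCESSWALK pfnProcNext  = (PFNPROCESSWALK)GetProcAddress(hK32, "Process32Next");
    PFNMODULEWALK  pfnModFirst  = (PFNMODULEWALK)GetProcAddress(hK32, "Module32First");
    PFNMODULEWALK  pfnModNext   = (PFNMODULEWALK)GetProcAddress(hK32, "Module32Next");
    if (!pfnSnapshot || !pfnProcFirst || !pfnProcNext || !pfnModFirst || !pfnModNext)
        return -1;

    HANDLE hProcSnap = pfnSnapshot(TH32CS_SNAPPROCESS, 0);
    if (hProcSnap == INVALID_HANDLE_VALUE)
        return -1;

    int nProcesses = 0;
    PROCESSENTRY32 pe;
    pe.dwSize = sizeof(pe);
    for (BOOL ok = pfnProcFirst(hProcSnap, &pe); ok; ok = pfnProcNext(hProcSnap, &pe))
    {
        nProcesses++;

        // A separate module snapshot is needed for every process.
        HANDLE hModSnap = pfnSnapshot(TH32CS_SNAPMODULE, pe.th32ProcessID);
        if (hModSnap == INVALID_HANDLE_VALUE)
            continue;   // the process may have exited or be inaccessible

        MODULEENTRY32 me;
        me.dwSize = sizeof(me);
        for (BOOL more = pfnModFirst(hModSnap, &me); more; more = pfnModNext(hModSnap, &me))
            printf("%lu  %p  %s\n", (unsigned long)pe.th32ProcessID,
                   (void*)me.hModule, me.szExePath);

        CloseHandle(hModSnap);
    }

    CloseHandle(hProcSnap);
    return nProcesses;
}
#else
// Non-Windows hosts: present only so the sketch compiles everywhere.
int DumpModules_ToolHelp32() { return -1; }
#endif
```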
The PopulateModuleList_PSAPI function looks up the addresses of the three APIs in PSAPI.DLL that it needs. The code then calls EnumProcesses to obtain an array of process IDs. Next, the code iterates through each of the process IDs and calls OpenProcess to get a corresponding process handle. If a process handle can be obtained (which isn't always the case), the function uses EnumProcessModules to get an array of all HMODULEs in the designated process. By itself, an HMODULE in another process is almost useless. Luckily, PSAPI.DLL has the GetModuleFileNameEx API (in both ANSI and Unicode) to retrieve a file name from a process handle and HMODULE combination. I used the ANSI version (GetModuleFileNameExA) since the rest of the program is ANSI-centric.
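Here is a comparable sketch of the PSAPI.DLL sequence. Again, this is a simplified stand-in for the Figure 2 code: it prints results instead of filling the two classes, the function and typedef names are hypothetical, and the fixed 1,024-entry arrays are an assumption made to keep the example short. The #ifdef is there only so the sketch compiles on a non-Windows host.

```cpp
#include <cstdio>

#ifdef _WIN32
#include <windows.h>

// Hypothetical typedef names matching the documented PSAPI prototypes.
typedef BOOL  (WINAPI *PFNENUMPROCESSES)(DWORD*, DWORD, DWORD*);
typedef BOOL  (WINAPI *PFNENUMPROCESSMODULES)(HANDLE, HMODULE*, DWORD, LPDWORD);
typedef DWORD (WINAPI *PFNGETMODULEFILENAMEEXA)(HANDLE, HMODULE, LPSTR, DWORD);

// Prints each module of each process that can be opened. Returns the number
// of processes opened, or -1 if PSAPI.DLL or its APIs aren't available.
int DumpModules_PSAPI()
{
    HMODULE hPsapi = LoadLibraryA("PSAPI.DLL");
    if (!hPsapi)
        return -1;

    PFNENUMPROCESSES pfnEnumProcesses =
        (PFNENUMPROCESSES)GetProcAddress(hPsapi, "EnumProcesses");
    PFNENUMPROCESSMODULES pfnEnumProcessModules =
        (PFNENUMPROCESSMODULES)GetProcAddress(hPsapi, "EnumProcessModules");
    PFNGETMODULEFILENAMEEXA pfnGetModuleFileNameExA =
        (PFNGETMODULEFILENAMEEXA)GetProcAddress(hPsapi, "GetModuleFileNameExA");

    DWORD pids[1024];
    DWORD cbNeeded = 0;
    if (!pfnEnumProcesses || !pfnEnumProcessModules || !pfnGetModuleFileNameExA
        || !pfnEnumProcesses(pids, sizeof(pids), &cbNeeded))
    {
        FreeLibrary(hPsapi);
        return -1;
    }

    int nOpened = 0;
    DWORD nPids = cbNeeded / sizeof(DWORD);
    for (DWORD i = 0; i < nPids; i++)
    {
        HANDLE hProcess = OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_VM_READ,
                                      FALSE, pids[i]);
        if (!hProcess)
            continue;   // not every process can be opened
        nOpened++;

        HMODULE mods[1024];
        DWORD cbMods = 0;
        if (pfnEnumProcessModules(hProcess, mods, sizeof(mods), &cbMods))
        {
            DWORD nMods = cbMods / sizeof(HMODULE);
            if (nMods > 1024)
                nMods = 1024;   // cbMods can exceed the buffer we supplied

            for (DWORD j = 0; j < nMods; j++)
            {
                char szPath[MAX_PATH];
                if (pfnGetModuleFileNameExA(hProcess, mods[j], szPath, sizeof(szPath)))
                    printf("%lu  %p  %s\n", (unsigned long)pids[i],
                           (void*)mods[j], szPath);
            }
        }
        CloseHandle(hProcess);
    }

    FreeLibrary(hPsapi);
    return nOpened;
}
#else
// Non-Windows hosts: present only so the sketch compiles everywhere.
int DumpModules_PSAPI() { return -1; }
#endif
```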
In both of these functions, you might notice a minor flaw: not all of the necessary information is collected at one time. With ToolHelp32, multiple module list snapshots are taken during the process enumeration. Likewise, the PSAPI-based function has to use a series of calls to EnumProcessModules. The hangup in both cases is that, during the enumeration, a DLL could load or unload. Even worse, a process could start or terminate. Either way, the results wouldn't be entirely consistent. Short of somehow suspending all other processes while the enumeration occurs, this potential loophole can't be avoided.
A DLL Mystery
Take another look at Figure 1. Notice that the highlighted line is spoolss.exe, which is the Windows NT print spooler subsystem. The DLL that it references is MSDBI.DLL. The description for MSDBI.DLL is "Microsoft® VC Program Database." In simpler terms, this is the Visual C++® DLL that reads debug symbol tables. What in the heck would a print spooler need to use a symbol table for? There isn't a reason. As a result, tracking down why spoolss.exe loads MSDBI.DLL is an interesting exercise.
If you're lucky, the program's EXE file or some DLL will implicitly link to the DLL in question. When this happens, you can use a module dependency-listing program to ferret out the connections. One such program is Depends.exe from my February 1997 column. An even better program is Microsoft's own Depends.exe, from the Platform SDK, which I highly recommend. When I wrote my Depends program, I was completely unaware of the Microsoft version. Unlike my version, the Microsoft program has a nice GUI and displays much more information than just module dependencies.
If you run Depends on spoolss.exe, you won't find MSDBI.DLL. That means that the MSDBI.DLL was loaded via LoadLibrary, either directly or indirectly. When I say directly, I mean that somebody explicitly called LoadLibrary on the DLL in question. An indirect load means that LoadLibrary was called for some other DLL, which in turn had an implicit reference to the DLL in question.
With a little thought, you can create numerous scenarios involving a mixture of LoadLibrary calls and implicit references. Figuring out the exact circumstances for a DLL being loaded can be a real nightmare. While a dependency program can help with implicit references, determining DLLs that were loaded via LoadLibrary is trickier. I use a system-level debugger to set a breakpoint on the LoadLibrary entry point in KERNEL32.DLL. (Under Windows 9x I'd use LoadLibraryA, and under Windows NT, I'd use LoadLibraryExW.) When the breakpoint is hit, you can look at the stack to find the parameter that points to the DLL name being loaded.
Returning to the example at hand, how does MSDBI.DLL get loaded into the spoolss process? When it starts, the initialization code in spoolss calls LoadLibrary to load WIN32SPL.DLL. WIN32SPL.DLL has an implicit reference to LocalSpl.DLL. LocalSpl.DLL uses a single IMAGEHLP.DLL function, ImageNtHeader. This reference to an IMAGEHLP API is enough to bring in IMAGEHLP.DLL, which in turn implicitly refers to MSDBI.DLL. Quite a twisted path! If you experiment with ModuleList, you'll no doubt find many other strange situations like this. Tracking down the dependencies is a great way to bone up on the various system components and their relationships.
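Once the individual links are known, the chain itself is just a path through a graph of "who pulls in whom." The following sketch models that idea with a breadth-first search. It is purely illustrative and isn't part of ModuleList: the function names and the DependencyGraph type are hypothetical, and the edges are simply the four links described above, entered by hand.

```cpp
#include <algorithm>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Maps a module name to the modules it pulls in, whether implicitly or
// through LoadLibrary.
typedef std::map<std::string, std::vector<std::string> > DependencyGraph;

// Returns the shortest chain of modules leading from 'from' to 'to', or an
// empty vector if 'to' can't be reached.
std::vector<std::string> FindLoadPath(const DependencyGraph& graph,
                                      const std::string& from,
                                      const std::string& to)
{
    std::map<std::string, std::string> parent;   // module -> who pulled it in
    std::queue<std::string> pending;
    parent[from] = "";
    pending.push(from);

    while (!pending.empty())
    {
        std::string current = pending.front();
        pending.pop();
        if (current == to)
            break;

        DependencyGraph::const_iterator it = graph.find(current);
        if (it == graph.end())
            continue;   // a leaf: no recorded dependencies

        for (const std::string& dep : it->second)
        {
            if (parent.find(dep) == parent.end())   // not visited yet
            {
                parent[dep] = current;
                pending.push(dep);
            }
        }
    }

    std::vector<std::string> path;
    if (parent.find(to) == parent.end())
        return path;

    // Walk back from the target to the starting module, then reverse.
    for (std::string m = to; !m.empty(); m = parent[m])
        path.push_back(m);
    std::reverse(path.begin(), path.end());
    return path;
}

// The four links from the spoolss example, entered by hand.
DependencyGraph BuildSpoolssExample()
{
    DependencyGraph graph;
    graph["spoolss.exe"].push_back("WIN32SPL.DLL");   // explicit LoadLibrary
    graph["WIN32SPL.DLL"].push_back("LocalSpl.DLL");  // implicit reference
    graph["LocalSpl.DLL"].push_back("IMAGEHLP.DLL");  // implicit (ImageNtHeader)
    graph["IMAGEHLP.DLL"].push_back("MSDBI.DLL");     // implicit reference
    return graph;
}
```

In practice the hard part is collecting the edges, since the LoadLibrary links only show up under a debugger. The search is trivial by comparison.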
Looking Forward
This issue marks my 60th consecutive monthly column for MSJ. That's five straight years without missing a month (although I've come close on a few occasions). When I first started out, this space was the Windows Questions and Answers column. I covered 16-bit Windows-based programming questions for nearly two years before switching the focus to Win32 programming. In a recent column, I described issues for the forthcoming 64-bit version of Windows NT. That's quite a leap in the evolution of Windows that I've had the privilege to write about.
Under normal circumstances, 60 consecutive months would probably be an MSJ record. However, that honor goes to Paul DiLascia, who started his column before me and is still going strong. I've joked with Paul that someday I'm going to catch up with his streak. However, that won't be happening. For a variety of reasons (all positive), I'm going to cut my column schedule down to once every three months.
Having more time between columns should allow me to focus more of my time on learning, and less on writing. However, in order to make the most of my columns, I need your help. Keep sending me those column topic suggestions. While I won't always be able to help every person with a particular problem, I'm always trying to spot trends where an in-depth Under the Hood column could help out. Thanks for reading!
Have a question about programming in Windows? Send it to Matt at mpietrek@tiac.com.
From the September 1998 issue of Microsoft Systems Journal.