In the December issue I introduced TraceSrv, a utility that makes it simple to have all the trace statements for your application go to the same place, whether from multiple processes or from many different machines. TraceSrv also works with any language that supports COM, so you can even have trace statements from your VBScript code and your back-end Windows NT® service completely coordinated, enabling you to find out what side of the network conversation had the problem. While TraceSrv can really help you out on brand new development, it would be truly great to use on existing development projects. In this column, I will show you how to simply call a single function and have all of the trace statements for your application automatically sent through TraceSrv. The other benefit is that the techniques that cause your app to use TraceSrv automatically can also be applied to other interesting bugslaying tools.
While it sounds like a pretty big order to make TraceSrv work its magic with a single function call, keep in mind that I am making one assumption here: that your application's existing trace statements are probably eventually going through a call to OutputDebugString. For the one or two teams out there that have developed their own tracing utility that never calls OutputDebugString, it's very simple to extend my code to fit your individual situation.
I called this extension TraceSrvHelper. Before I jump down to the nitty gritty code, I want to discuss my overall goals and how I will be accomplishing them. Based on what I set out above, you probably have an idea of what is needed to get your trace statements sent over to TraceSrv. You want to somehow hook all the calls to OutputDebugString so that you can have all the trace statements routed through TraceSrv. When I first started thinking about the problem, I quickly thought of some possible solutions and listed their pros and cons.
First, I decided to create a macro that redefines OutputDebugStringA and OutputDebugStringW to call my functions instead. It would work and be pretty simple to implement. On the con side, not everything calls OutputDebugString directly. MFC uses AfxTrace and calls OutputDebugString inside MFC42.DLL. This would require a recompile of MFC. Getting the include files straight could be problematic; you could end up with compilation errors by redefining a declaration as a macro. Also, if a new DLL were loaded, unless it did the OutputDebugString mapping, its trace statements would still go to the real OutputDebugString.
Second, I could have a separate static library that has my implementations of OutputDebugStringA, OutputDebugStringW, LoadLibraryA, LoadLibraryW, LoadLibraryExA, and LoadLibraryExW. Again, it's pretty simple to implement and the include problem would be solved. This would allow you to handle the case where LoadLibrary is called because the code could physically hook the imported call to OutputDebugString that the library calls.
Now, the negatives. It would still require complete recompiles of things like MFC to work. Also, this would require that you link to the TraceSrvHelper library before any other libraries to get my functions linked in. This might not be possible with different options like the #pragma lib directives in libraries like MFC.
Third, I could physically hook all imported calls to OutputDebugStringA, OutputDebugStringW, LoadLibraryA, LoadLibraryW, LoadLibraryExA, and LoadLibraryExW in all the modules loaded by the application. This would definitely solve the problem of getting the OutputDebugString calls out of modules like MFC42.DLL without requiring a recompile of anything. All calls to OutputDebugString would be hooked no matter where they were called in the application. All of the possible include and linking issues are taken care of; everything can just be done without a change. Any loaded modules can be handled because I will catch all calls to the LoadLibrary family of functions. But this would be more difficult to implement because more housekeeping and handling are needed to keep everything straight.
As you can probably tell, I felt that the third option held the most promise. However, I realize that many folks might start wondering what I mean when I "physically hook all imported calls." This means that I will need to patch in code to replace calls to OutputDebugString so that it calls my function instead. Before you really start wondering about my sanity, this does not mean that I will need to pound through all the actual code bytes disassembling everything while looking for call instructions. The magic of Win32® imported functions makes this very easy to hook.
Now that you see from a high level how I'll get the OutputDebugString calls rerouted to routines that send the output to TraceSrv, I want to list my original functional specs for each particular operation in TraceSrvHelper. This way you'll get a good idea what the main operations will be doing. I can also point out where I made some bad assumptions so you can learn from my mistakes.
- Initializing TraceSrvHelper. This is the only function the user is required to call. First, it needs to create an ITrace interface from TraceSrv so that it can call the Trace method. After the ITrace interface is properly created, it will hook all the necessary functions in all loaded modules so all trace statements can be rerouted and all calls to the LoadLibrary family can be monitored.
-
A call to OutputDebugString is made. Since the call is rerouted to my wrapper function, I simply call the ITrace::Trace or ITrace::FullTrace method to get the information over to TraceSrv.
-
One of the LoadLibrary family of functions is called. Make a call to the appropriate real LoadLibrary function to get the module loaded. Hook all the necessary functions in the loaded module.
-
Shutting down TraceSrvHelper. Since trace statement rerouting might be needed for only part of an application, I need to allow everything to be reset to normal. This function only needs to "unhook" the hooked functions in all modules so that they all call the original functions.
While it might look like a great deal of code is involved, it works out to not be that much. While I would like to say that TraceSrvHelper is a perfect piece of code, I have to admit that it only handles 99.837 percent of the tracing that any application might do. As I walk you through the implementation, I will point out the areas where I made some bad assumptions.
Common Code and BugslayerUtil.DLL
Looking over the functional specs for TraceSrvHelper, I realized that there were two ideas that could be abstracted and generalized: finding all the loaded modules currently in a process and hooking functions. Since I can see other cases where these could be useful, I decided to package these functions into a DLL called BugslayerUtil.DLL. As all good bugslayers know, after ASSERTs and trace statements, code reuse is a close number three on the Really Good Things list.
From now on, when I write code for this column I will place generalized functions into BugslayerUtil.DLL and keep moving the code forward from column to column. I also moved the memory dumping and validation code (MemDumpValidator) from the October 1997 Bugslayer column into BugslayerUtil.DLL. The only ramification is that MemDumpValidator assumes you're linking to the DLL version of the C runtime libraries. Since almost all applications should be doing this anyway, it's not much of a problem. If you need to use the static C runtime, you can easily remove the MemDumpValidator code from BugslayerUtil.DLL and use it as originally presented in the October column.
I want to point out one bugslaying tip about the code. Under the main BugslayerUtil directory is a directory called Tests where I put all the unit tests for BugslayerUtil.DLL. Keeping the unit tests as part of the code makes it easy for others on your team to quickly do a unit test if they add functionality or fix a bug. The easier it is to test something, the more likely that it will be tested.
Now that you know where the two new common functions will be packaged, it's time to figure out which modules are loaded into a process. There are standard APIs for this information, but the various Windows® platforms use different ones. Windows 95 and Windows 98 use the ToolHelp functions; Windows NT 4.0 uses the PSAPI library; and Windows NT 5.0 offers both ToolHelp and PSAPI. Unfortunately, there are wide enough differences between the two libraries that I felt it would be more valuable to have one function that hid all the differences and would work no matter which platform my programs were running on. The main work I did in creating GetLoadedModules was to provide some internal wrapper functionsNT4GetLoadedModules, and TLHELPGetLoadedModulesthat hide the differences between the libraries so that each one returns the same parameters. I'm sure to need a corresponding GetCurrentThreads and GetCurrentProcesses functions to complete the set of wrapper functions.
Hooking Imported Functions
The second reusable function in BugslayerUtil.DLL is HookImportedFunctionsByName. Before I dive into the function implementation, I need to explore what hooking a function really means.
An imported function is a function that comes out of a DLL. For example, when your program calls OutputDebugString, it is calling a function that resides in KERNEL32.DLL. When I first started doing Win32-based programming, I thought that calling an imported function would be just like calling any other functiona call or branch instruction would jump to an address and start executing the imported function. The only difference might be that with an imported function the operating system program loader would have to run through the executable and fix up the addresses to reflect where the DLL being called would be loaded into memory. When I looked at how a call to an imported function really is implemented, I was amazed at the simplicity and beauty of the design.
The problem with the way I was thinking becomes apparent when you consider how many APIs there are and that you can easily call the same ones many times throughout your whole program. If the loader had to find each of the times that you called OutputDebugString, it could take forever to get a program loaded. Even if the linker generated a complete table of where each call to OutputDebugString took place in the code, a huge amount of looping and memory writing would make load times excruciatingly slow.
It becomes apparent that each of the calls to OutputDebugString will always go to the same address where OutputDebugString is loaded into memory. The program loader could be much faster if all of the calls an executable made to the imported OutputDebugString function would go to an indirect address. The program loader would only have to fix up a single address that corresponds to OutputDebugString when loading the executable and the DLLs that it will call. This means that instead of being a direct call instruction, calling an imported function generates a jump to the indirect address. This indirect address is stored in the .idata (or import) section of the executable. If you import through __declspec(dllimport), instead of being an indirect jump, the code is an indirect call, as this can save some space.
Hooking an imported function is a matter of finding the import section of the executable, looking for the particular function you would like to hook, and then writing the hook function address into its place. While this might sound like a lot of work, it's not that bad because the Win32 PE executable format is laid out quite nicely.
In Chapter 10 of his excellent book, Windows 95 System Programming Secrets (IDG Books, 1995), Matt Pietrek describes a method for hooking imported functions. Matt's code simply finds the imports section for a module and, using the value returned from a call to GetProcAddress, loops through the imported functions. When it finds the function it's looking for, it overwrites the original imported function address with the hook function. I used Matt's function when I originally worked on TraceSrvHelper.
However, there are two small issues that need to be handled. The first is that when Matt wrote his book, most people did not merge their import section with other PE sections. So if the import section is in read-only memory, you cause an access violation placing the actual hook. This is fixed by setting the virtual memory protection so you can write to it before you write the hook function address to memory. The second issue, which is a little tougher to work around, is that under certain conditions you cannot hook imported functions under Windows 95 or Windows 98.
When you use TraceSrv and TraceSrvHelper, you want to be able to have your trace statements redirected any time you run your application, even when the app is running under the debugger. While you would not think that would be a problem, hooking functions running under a debugger is where the issue resides. In Windows NT, or when running a program in Windows 95 outside a debugger, when you call GetProcAddress to find a function and then look through the import section for that address, you will always find it. But, under Windows 95, calling GetProcAddress in your program while it is running under a debugger returns a different address than when it runs outside a debugger. What is actually returned when running under the debugger is a "debug thunk"a special wrapper around the real call.
Windows 95 doesn't implement copy-on-write in the operating system. With copy-on-write, the operating system will share a common code page in memory, but when
a process writes to that memory, the memory is copied so that the individual process gets its own copy that will not interfere with any other process. In the Windows 95 architecture, any memory that is above the 2GB line is shared among all processes. If one process were to write a breakpoint to this shared memory area without the copy-on-write, the breakpoint would apply to all processes, not just the one being debugged. That's why you cannot step into a system function like CreateThread using a GUI debugger on Windows 95. A kernel debugger like WDEB386 works differently and runs before Windows 95 starts; it inserts itself between Windows 95 and the CPU. Overall, the lack of copy-on-write isn't much of an issue for most people, only those who write debuggers or who want to hook functions correctly whether or not they are running under a debugger.
Fortunately, it's not too hard to get the real address for an imported functionit just takes a little more work and avoidance of GetProcAddress. The PE file IMAGE_IMPORT_
DESCRIPTOR structure, which holds all the information about functions imported from a specific DLL, has pointers to two arrays in the executable. These arrays are called the import address table (or sometimes the thunk data array). The first pointer is the import address table, which the program loader fixes up when the executable is loaded. The second pointer is the original list of imported functions that the loader does not touch. To find the real imported function address, simply work your way through the original import address table until you find the named function that you want to hook, then write in the hook address in the real import address table, which the program is using. By doing this extra step, the hooking code will always work no matter where it is called.
HookImportedFunctionsByName is the function that does all of the work. Since I wanted to make the hooking as generic as possible, I went to the trouble of allowing you to hook multiple functions imported from the same DLL at the same time. As the name of the function implies, it will only hook those functions that are imported by name. I will leave it as an exercise for the reader to extend this function to hook imported functions by ordinal. Figure 1 shows the parameters for HookImportedFunctionsByName.
The structure for each element in paHookArray is simply the name of the function to hook and the new hook procedure. Since you might want to hook or unhook functions
at will, all of the original imported function addresses will be returned.
The real guts of HookImportedFunctionsByName are not terribly difficult to follow. After doing the good bugslayer practice of validating every parameter thoroughly, I call the helper function, GetNamedImportDescriptor, to find the IMAGE_IMPORT_DESCRIPTOR for the requested module. After getting the pointers to the original and real thunk arrays, I loop through the original thunk array looking at each function imported by name to see if it's in
the paHookArray list. If the function is one that is to be hooked, I simply set the real thunk array's memory to PAGE_READWRITE so the hooked address can be written safely, copy the hook into the real thunk function, and reset the memory protection to its original setting. Since the unit test function for HookImportedFunctionsByName is included with the BugslayerUtil.DLL source code, you might want to use it to step through the code if you do not quite
see what's going on. Now that I have some of the supporting functions that will be needed, I can actually implement TraceSrvHelper.
TraceSrvHelper Implementation
All the code to the TraceSrvHelper library is provided in the source code. You might notice that in the list of functions to hook, I am also hooking ExitProcess. In my original specifications I planned to hook OutputDebugStringA/W, LoadLibraryA/W, and LoadLibraryExA/W, but not ExitProcess. This is a case where I forgot to take into account something that I should have. In other words, I had a bug in my specifications. I was not handling the case where the application is shutting down, TraceSrvHelper still has some hooks in place, and some of the supporting codelike the whole OLE subsystemis shut down underneath TraceSrvHelper. When I first ran into this, TraceSrvHelper ended up hanging the application. That's why TraceSrvHelper only handles 99.837 percent of all your daily tracing requirements.
While most of your development probably will not run into cases like this, it can be common when you are developing tools like TraceSrvHelper. In my test case, I had an MFC application where I caused a few memory leaks. MFC reports these memory leaks after WinMain ends because they are done from static objects deep inside MFC42.DLL.
As I described in the October installment of Bugslayer, you should pay attention to the way static objects are constructed and destructed. Since MFC was shutting down OLE (as it should have) before it did the memory leak output, TraceSrvHelper was hanging. Consequently, TraceSrvHelper needs to trap calls to ExitProcess so that it can clean itself up before there are any potential conflicts. What holds true for shutting down can have some ramifications on starting up as well. I handle these by having TraceSrvHelper initialize OLE as part of its startup. The only issue in using TraceSrvHelper in your static constructors is that you will have to call InitTraceSrvHelper in one of your constructors so that it is initialized before WinMain/main.
Since everyone wants to see their potential memory leaks on shutdown, I call _CrtDumpMemoryLeaks in the ExitProcess hook function, TSH_ExitProcess, to at least show the leaks at the point ExitProcess is called. The only drawback is that a few of the items reported as a leak might actually be memory that was allocated in static constructors. Also, if you have any trace statements in your static object destructors or in your DllMain DLL_PROCESS_
DETACH processing, then they will not be seen in TraceSrv.
I pondered various ways to get around the ExitProcess problem. All of the methods that I thought of would require large architectural changes and would affect TraceSrv and any trace viewer applications. Fortunately, the ExitProcess case is the only real issue with TraceSrvHelper and its impact should be felt by relatively few people. If you have a lot of trace statements that might get called after ExitProcess and you are doing MFC development, Paul DiLascia's excellent TraceWin library will fit the bill.
Now that I have shown you where the problem is, it's time to look at the main operation: hooking. When TraceSrvHelper functions are called, there are three times when hooking must occur: during initialization, immediately after one of the LoadLibrary functions is called, and at shutdown. The reason for hooking at initialization is that is how TraceSrvHelper gets trace statements redirected to TraceSrv. The hooking must happen again when one of the LoadLibrary calls occurs so that any and all DLLs that are brought into the process are subsequently hooked.
Again, if you look at my specifications, I failed to call the hooking after processing a LoadLibrary, so I missed those extra DLLs. I quickly fixed this the first time I ran the test program. Finally, I need to hook at TraceSrvHelper shutdown by putting the original function pointers back so that the application can continue to run. As I mentioned earlier, TraceSrvHelper can be loaded and unloaded at will, whenever you need it in your application.
Since all three of these hooking scenarios are essentially identicalonly the patch-in functions are differentI wrote the HookTraceSrvHelperFuncs function to handle all of the hooking. After it gets the list of modules loaded in the process by calling another helper function, AllocateLoadedModuleList, it starts looping through the modules to see which one needs to be hooked. In the file there is a Standard Template Library vector of module handles: g_vHookedModules. If the module is either in the array or meets some of the other criteria in IsHookableDLL, it is
not hooked. If the module is to be hooked, then I simply call the HookImportedFunctionsByName utility routine from BugslayerUtil.DLL to do the hooking. Then I add the module to g_vHookedModules array so it is not hooked again. Finally, I take the array returned by HookImportedFunctionsByName and copy any valid hooked original addresses to the master list. The master list is the seven original functions that I hook in TraceSrvHelper. This array is used to do the unhooking. I do not need to keep a separate array of hooked items for each module because the original functions are always at the same address in memory no matter which module they came from.
The IsHookableDLL function deserves a special mention for three lines of code. Since the memory above 2GB in Windows 95 is shared memory, it belongs to the operating system and is not accessible by user programs other than for reading. If I tried to patch this memory, then the code would crash. If you stepped through Windows 95 with a kernel debugger, you would see that the various functions like VirtualProtect, used to access memory above 2GB, work very hard to not allow you to do it. To avoid any problems, if you are running under Windows 95 or Windows 98 and the module is loaded above 2GB, then IsHookableDLL will return FALSE to not hook the module. With Windows NT, you can patch with impunity.
There are two other items to discuss as part of the implementation. If you compile the TraceSrvHelper.DLL with USE_TRACESRV_INTERNALLY defined, then all of the trace statements I have inside TraceSrvHelper will go through TraceSrv. If you want to extend TraceSrvHelper, defining this flag can make it pretty easy to debug TraceSrvHelper with itself!
The other item is summarized in the "big hairy note" near the top of the file. When I was implementing TraceSrvHelper, I used a lot of trace statements, as you can tell by the code. During a test, everything hung for a second until I got a stack overflow and everything crashed. A pause and then a stack overflow almost always indicates some sort of infinite recursion, so I started checking HookTraceSrvHelperFuncs to see if I messed up something looping through the currently loaded modules. After scratching my head a bit, I finally realized what happened: I had forgotten to always skip the TraceSrvHelper.DLL module handle when patching loaded functions. I patched TraceSrvHelper's own calls to OutputDebugStringA to point to TSH_OutputDebugStringA! If you look at TSH_OutputDebugStringA, you see that I call OutputDebugStringA. If you are running under a debugger, you can still see the trace statement. Therefore, TSH_OutputDebugStringA just called itself over and over. To get around my bug, I redefined the TRACEx macros to call InternalODS instead. Accidentally hooking yourself, while interesting, can lead to some very bizarre bugs.
The Whopping Strange DEC Alpha Bug
When I first moved TraceSrvHelper over to the DEC Alpha CPU, everything went fine with my initial testing since I was running inside the debugger. When I ran the test programs straight from the command line, nothing worked. I ran everything inside the debugger again and took extra care to look at each call stack when one of the hook functions was called. I did not see anything out of the ordinary. The only thing I had to go on was that the test programs would pause for a second and then just go away. As I mentioned before, this is a sure sign of something blowing the stack.
My first thoughts were that I had not really taken into account some of the operating system interactions and that I probably needed to exclude a particular system DLL from being hooked. I went back to my Intel machine and stepped through everything with a kernel debugger so I could have total visibility. After wading through all sorts of RPC calls and into the security subsystem and back, I couldn't see anything wrong. I was really stumped at this point because I have found that the Alpha and Intel versions of Windows NT are so close in implementation that when you mess up on one, you always have the same problem on the other. I shifted back to the Alpha and planned to debug this problem almost blind (I cannot run the kernel debugger because I am out of COM ports on my Intel computer).
My first step was to sprinkle many more trace statements on all of the hook functions. I then took the source code to DBMON and compiled it for the Alpha, hoping it would work since it was not on the Alpha portions of the SDK. DBMON is a little utility that allows you to see any OutputDebugString calls that are not handled by a debugger. It is very similar to the old 16-bit Windows DBWIN program. Fortunately, DBMON was supported on the Alpha. Just as I suspected, I was blowing the stack. I was stuck in a loop with TSH_LoadLibraryA being called over and over.
At this point, I was totally confused because the only way I could get into this case was if LoadLibrary somehow called HookTraceSrvHelperFuncs in the middle of its normal processing. Since that was not likely, I thought that there might be some sort of thread contention that was forcing me into an infinite loop. After adding the thread IDs to each trace statement, I saw that everything was staying in the same thread.
Since the problem never happened under a debugger, I thought I would see if I could attach to the test program with the debugger after it was running. Since my test program, CTEST.EXE, is pretty small, I just put DebugBreak calls into each of the TSH_XXX hook functions so that there would be an unhandled breakpoint exception when the first one was called. I played around with running and attaching the debugger as well as starting from the debugger to see if I could see any differences. At first I did not see anything. But when I compared the trace statements that occurred when TraceSrvHelper was doing the initial hooking, I noticed that, when running outside the debugger, there was an additional DLL that TraceSrvHelper was patching: FX32AGNT.DLL.
I slapped my forehead and let loose at the top of my voice with my best Homer Simpson "Doh!" FX32! is the awesome technology from Digital that allows you to run Intel binaries on a DEC Alpha. It works extremely well and runs Intel binaries almost as fast as native Alpha code. Since FX32! needs to see when a process attempts to load a DLL or create another process, it looks like it injects FX32AGNT.DLL into each process and hooks each of the APIs that can bring an image into memory. That would explain the infinite loop when LoadLibraryA was called; I hooked FX32AGNT.DLL's call to LoadLibraryA, and they in turn hooked my call so that each DLL just called the other until there was no more stack space left.
All I needed to do to avoid hooking FX32AGNT.DLL was to get its module handle and put it into the global hooked modules array, g_vHookedModules, so that IsHookableDLL would not allow it to be hooked. While it took me a couple of hours to find this, it was the kind of debugging that I find really fascinating because I learned much more about the operating system. Now I want to brush up on my Alpha assembler so I can see how they managed to hook functions on a global basis without putting anything in the AppInit_DLLs registry key.
Wrap Up
I tried to make TraceSrvHelper as painless as possible to use so that you can easily retrofit an entire project for TraceSrv. I also hope that between this column and the last one you can see the importance of trace statements. The more that you use them in your project, the easier it will be to debug. How do you know when you have enough trace statements? When you can debug almost any problem in your application without using a debugger!
Da Tips
Did you know that submitting a bugslaying tip to MSJ will keep you thin and make you better looking? Well, maybe not, but at least you will get the satisfaction of helping your fellow developers! Send any tips to me at john@jprobbins.com.
Tip 5
This tip is certainly not high-tech, but it helps stop one of the most annoying C/C++ errors people run into. I often compare a variable to a constant, a number, or the return value of a function in a conditional statement. Although it is slightly harder to read, if you get into the habit of placing the operand that cannot be used as a l-value (numbers, constants, functions, and so on) on the left-hand side of the == operator, forgetting to type the second equals sign in the == operator will result in a (more than obvious) compiler error (a five-second fix) rather than having to track down the logic error later (a five-minute fix).
|