This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.


August 1998


Download Aug98Bugslayer.exe (75KB)

John Robbins is a software engineer at NuMega Technologies Inc. who specializes in debuggers. He can be reached at john@jprobbins.com.

You've probably heard the proverb, "An ounce of prevention is worth a pound of cure." When it comes to handling crash problems in your code, the proverb should be: "A couple of key lines of code can keep your customers using your application so you can keep your job." Well, I guess that might not be as pithy and memorable as the original, but at least my proverb mentions code.
      Since I am not going to make a million bucks writing proverbs, I had better stick to what I need to talk about here in the Bugslayer column. This month I'll cover exception handlers and unhandled exception filters or crash handlers. If you have been doing any C++ programming at all, you have probably already dealt with exception handlers. Crash handlers are those routines that can gain control right before the application shows that nice fault dialog that drives your users crazy. While the exception handlers are C++-specific, the crash handlers work with both C++ and Visual Basic®-based code.
      To help you make your applications more robust, I will show you how to apply some of the assistance the operating system and the compiler offer. Additionally, if used judiciously, these ideas allow you to gather more information when your app does crash. This lets you solve potential problems faster. I will start out with a brief primer on exception and crash handling as the basis for some of the concepts that I will discuss later. I will also discuss the reusable code that I wrote for this column, which you can use in your exception and crash handlers. Finally, I will deal with some of the issues that have surfaced about the IMAGEHLP symbol engine that I first presented with the CrashFinder application in the April 1998 Bugslayer), this is exactly what I did to debug it. An alternative is to use a kernel debugger like WinDBG to get around this limitation.
      Another issue is that calling SetUnhandledExceptionFilter is a process global operation. If you build the coolest crash handler in the world for your OLE control and the container crashes—even if it's not your fault—your crash handler will be executed. While you might think this could keep you from using SetUnhandledExceptionFilter, I have some code that might help you out.

Handle Only This
      I wrote some simple functions to limit a crash handler to a specific module or modules (see Figure 3). I placed the code in the reusable BugslayerUtil.DLL, which you can find in Aug98Bugslayer.exe.
      The basic idea for limiting the crash handler is that I set an unhandled exception filter. When it is called, I check the module it came from. If it is from one of the modules requested, I call the exception handler, but if it is from a module outside those requested, I call the previous exception filter I replaced. By calling the replaced one, multiple modules could use the crash handling API I defined without stepping on each other.
      To set your filter function, simply call SetCrashHandlerFilter. Internally, SetCrashHandlerFilter saves your filter function to a static variable and calls SetUnhandledExceptionFilter to set the real exception filter, CrashHandlerExceptionFilter. If you do not add any modules that limit the exception filtering, CrashHandlerExceptionFilter will always call your exception filter no matter which module had the hard crash. It is best to call SetCrashHandlerFilter as soon as you can and to call it again with a NULL filter function right before you unload.
      Adding a module to limit crash handling is done through the AddCrashHandlerLimitModule function. All you need to pass to this function is the HMODULE for the module in question. If you have multiple modules that you want to limit crash handling to, just call AddCrashHandlerLimitModule for each one. The array of module handles used for limiting is allocated from the main process heap.
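      The dispatch decision described above can be sketched in portable C++. This is only an illustration under stated assumptions, not the code from Figure 3: ModuleRange, CrashFilterState, IsInLimitModule, and DispatchCrash are hypothetical names, a module is modeled as a plain address range instead of an HMODULE, and std::vector is used for brevity even though the real handler avoids the runtime library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a loaded module: the range [base, base + size).
struct ModuleRange
{
    std::uintptr_t base;
    std::size_t    size;
};

// Stand-in for a top-level exception filter; it receives the fault address.
typedef long (*FilterFn)(std::uintptr_t faultAddr);

// Same value as EXCEPTION_CONTINUE_SEARCH in the Windows headers.
const long kContinueSearch = 0;

struct CrashFilterState
{
    FilterFn                 userFilter = nullptr; // filter set by the client
    FilterFn                 prevFilter = nullptr; // filter that was replaced
    std::vector<ModuleRange> limits;               // empty: handle every crash
};

// True if the faulting address lies inside one of the limit modules.
bool IsInLimitModule(const CrashFilterState& st, std::uintptr_t addr)
{
    for (const ModuleRange& m : st.limits)
    {
        if (addr >= m.base && addr - m.base < m.size)
        {
            return true;
        }
    }
    return false;
}

// The choice the installed filter makes when a crash arrives.
long DispatchCrash(const CrashFilterState& st, std::uintptr_t addr)
{
    // No limit modules means the user filter sees everything.
    if (st.userFilter != nullptr &&
        (st.limits.empty() || IsInLimitModule(st, addr)))
    {
        return st.userFilter(addr);
    }
    // Otherwise chain to whatever filter was installed before.
    if (st.prevFilter != nullptr)
    {
        return st.prevFilter(addr);
    }
    return kContinueSearch;
}
```

      Chaining to the previous filter is what lets several modules share the process-wide filter slot without stepping on each other.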
      As you look at the various functions in Figure 3, you will see that I do not make any C runtime library calls at all. Since the crash handler routines are called in extraordinary situations, I cannot rely on the runtime being in a stable state. To clean up any memory that I allocated, I use the automatic static class trick that I first discussed in the October 1997 Bugslayer column. I also provide a couple of functions that allow you to get the number of limit modules and a copy of the array: GetLimitModuleCount and GetLimitModulesArray, respectively. I will leave it up to you to write a RemoveCrashHandlerLimitModule function.

Translate This
      Now that you have written your exception handlers and crash handlers, it's time to talk about those EXCEPTION_POINTERS structures each gets passed. Since this is where all the interesting information about the crash is stored, I wanted to develop a set of functions that you can call to translate the information into human-readable form. With these functions, all you need to concentrate on is the display of information to the user in a manner that's appropriate for your particular application. All of these functions are in Figure 3.
      I tried to keep the functions as simple as possible. All you need to do is to pass in the EXCEPTION_POINTERS structures. Each function returns a pointer to a constant string that holds the text. If you looked at the code, you might have noticed that each function has a corresponding function whose name ends in "VB". When I put these functions together I did not realize that Visual Basic couldn't handle a string returned from a function; it can only deal with a string as a parameter. Therefore, to use these functions from Visual Basic, you must pass in your own string buffer. Since the EXCEPTION_POINTERS-handling functions will be called in crash situations, I set them up to use a static buffer in CrashHandler.cpp. When using these functions from Visual Basic, declare a global string variable and Dim it early in the program so the memory is available.
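      The two flavors of each function (one returning a pointer to a static buffer, one filling a caller-supplied buffer) might look roughly like the following. This is a hedged sketch: FaultInfoSketch, DescribeFaultSketch, and DescribeFaultSketchVB are hypothetical names, the structure is a simplified stand-in for EXCEPTION_POINTERS, and the sketch uses snprintf for brevity where the real code avoids the runtime library.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for the data pulled out of EXCEPTION_POINTERS.
struct FaultInfoSketch
{
    unsigned long code;
    unsigned long address;
};

// Static buffer, so nothing has to be allocated at crash time.
static char g_szBuff[256];

// C/C++ flavor: returns a pointer to the static buffer.
const char* DescribeFaultSketch(const FaultInfoSketch& info)
{
    std::snprintf(g_szBuff, sizeof(g_szBuff),
                  "Exception %08lX at %08lX", info.code, info.address);
    return g_szBuff;
}

// "VB" flavor: copies the text into a buffer the caller owns.
// Returns false if the buffer is missing or too small.
bool DescribeFaultSketchVB(const FaultInfoSketch& info,
                           char* szOut, std::size_t cchOut)
{
    if (szOut == nullptr || cchOut == 0)
    {
        return false;
    }
    const char* szText = DescribeFaultSketch(info);
    std::size_t len = std::strlen(szText);
    if (len >= cchOut)
    {
        return false;
    }
    std::memcpy(szOut, szText, len + 1); // include the terminator
    return true;
}
```

      Because the static buffer is shared, each call overwrites the previous result; a caller that needs to keep the text should copy it out first.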
      The GetRegisterString function simply returns the formatted register string. The GetFaultReason function is a little more interesting in that it returns a complete description of the problem. The returned string shows the process, the exception reason, the module that caused the exception, the address of the exception, and—if symbol information is available—the function, source, and line where the crash occurred.


 CH_TESTS.EXE caused a EXCEPTION_ACCESS_VIOLATION in module
 CH_TESTS.EXE at 001B:004010FB, Baz()+64 bytes,
 CH_Tests.cpp, line 56+3 bytes
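      To make the layout of that string concrete, here is a minimal sketch that assembles the same text from already-resolved pieces. It assumes the symbol lookup has been done elsewhere; FaultDetailsSketch, ExceptionCodeNameSketch, and FormatFaultReasonSketch are hypothetical names, and only the numeric exception codes are taken from the Windows headers.

```cpp
#include <cstdio>
#include <string>

// Numeric values match the EXCEPTION_* constants in the Windows headers.
const char* ExceptionCodeNameSketch(unsigned long code)
{
    switch (code)
    {
        case 0xC0000005UL: return "EXCEPTION_ACCESS_VIOLATION";
        case 0xC0000094UL: return "EXCEPTION_INT_DIVIDE_BY_ZERO";
        case 0xC00000FDUL: return "EXCEPTION_STACK_OVERFLOW";
        case 0x80000003UL: return "EXCEPTION_BREAKPOINT";
        default:           return "an unknown exception";
    }
}

// Hypothetical bundle of everything the description needs.
struct FaultDetailsSketch
{
    std::string    process;   // name of the crashing process
    std::string    module;    // module containing the fault address
    unsigned long  code;      // exception code
    unsigned short selector;  // code segment selector
    unsigned long  offset;    // instruction pointer
    std::string    function;  // empty if no symbol was found
    unsigned long  funcDisp;  // bytes past the start of the function
    std::string    source;    // empty if no line information was found
    unsigned long  line;
    unsigned long  lineDisp;  // bytes past the start of the line
};

std::string FormatFaultReasonSketch(const FaultDetailsSketch& d)
{
    char addr[32];
    std::snprintf(addr, sizeof(addr), "%04X:%08lX",
                  static_cast<unsigned>(d.selector), d.offset);

    std::string s = d.process + " caused a " +
                    ExceptionCodeNameSketch(d.code) +
                    " in module " + d.module + " at " + addr;
    // Symbol and source information are optional.
    if (!d.function.empty())
    {
        s += ", " + d.function + "()+" +
             std::to_string(d.funcDisp) + " bytes";
    }
    if (!d.source.empty())
    {
        s += ", " + d.source + ", line " + std::to_string(d.line) +
             "+" + std::to_string(d.lineDisp) + " bytes";
    }
    return s;
}
```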
      The most interesting functions are GetFirstStackTraceString and GetNextStackTraceString. These functions, as their names indicate, let you walk the stack. Like the FindFirstFile and FindNextFile APIs, you can call GetFirstStackTraceString and then continue to call GetNextStackTraceString until it returns FALSE to walk the entire stack. In addition to the EXCEPTION_POINTERS structure, these functions take a flag option parameter that lets you control the amount of information that you want to see in the resulting string. The following string shows all the options turned on.

 001B:004018AA (0x00000001 0x008C0F90 0x008C0200 0x77F8FE94)
 CH_TESTS.EXE, main()+1857 bytes, CH_Tests.cpp,
 line 341+7 bytes
The values in parentheses are the possible parameters to the function. Figure 4 shows the options flags and what each will include in the output string.
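      The first/next calling pattern can be modeled with a small portable sketch. Everything here is an assumption for illustration: the flag names and values are invented (the real ones are in Figure 4), the walker iterates over prepared frames instead of a live stack, and the parameter values shown in parentheses above are left out.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical option flags; the real names and values are in Figure 4.
enum StackOptSketch : unsigned
{
    kOptModule  = 0x1,
    kOptSymbol  = 0x2,
    kOptSrcLine = 0x4
};

// One already-resolved stack frame.
struct FrameSketch
{
    std::string address;
    std::string module;
    std::string symbol;
    std::string srcLine;
};

class StackTextWalkerSketch
{
public:
    explicit StackTextWalkerSketch(std::vector<FrameSketch> frames)
        : m_frames(std::move(frames)), m_next(0) {}

    // Restarts the walk and produces the text for the top frame.
    bool First(unsigned opts, std::string& out)
    {
        m_next = 0;
        return Next(opts, out);
    }

    // Produces the next frame; returns false when the stack is exhausted.
    bool Next(unsigned opts, std::string& out)
    {
        if (m_next >= m_frames.size())
        {
            return false;
        }
        const FrameSketch& f = m_frames[m_next++];
        out = f.address;
        if ((opts & kOptModule) && !f.module.empty())   out += " " + f.module;
        if ((opts & kOptSymbol) && !f.symbol.empty())   out += ", " + f.symbol;
        if ((opts & kOptSrcLine) && !f.srcLine.empty()) out += ", " + f.srcLine;
        return true;
    }

private:
    std::vector<FrameSketch> m_frames;
    std::size_t              m_next;
};

// The calling pattern: one First call, then Next until it returns false.
std::vector<std::string> CollectStackSketch(StackTextWalkerSketch& walker,
                                            unsigned opts)
{
    std::vector<std::string> lines;
    std::string line;
    for (bool ok = walker.First(opts, line); ok; ok = walker.Next(opts, line))
    {
        lines.push_back(line);
    }
    return lines;
}
```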
      To see these functions in action, I included two sample test programs. The first, CH_TEST, is a C/C++ example. The second program, CrashTest, is a Visual Basic-based example. Between these two programs, you should get a pretty good idea of how to use all of the functions I've presented.
      The implementation of these functions is rather straightforward and consists mainly of string buffer manipulations. For all the symbol information, I use the IMAGEHLP.DLL symbol engine that I discussed in the April 1998 Bugslayer column. The interesting part of the implementation takes place at the end of Figure 3. When I first started testing the functions, I noticed that the source and line information for an address would appear the first time that I requested it, but that it never appeared on subsequent lookups at the same address. Several astute readers had mentioned that they had seen the same thing with the CrashFinder application, but we were never able to see what was unique about the situation. It appeared that there was a bug in the SymGetLineFromAddr function. It only finds the source and line information once for an address when looking up the information in PDB symbols. It seems to work correctly for C7 and COFF symbols.
      To work around this bug, reader Iain Coulthard figured out that SymGetLineFromAddr only seemed to find the source lines for addresses that are at the start of a line (in other words, addresses with a displacement of zero bytes). Iain found that if you look backwards from the original address until you find the nearest address that has a zero displacement, the source line lookup works. After finding the zero-displacement address, just subtract the found address from the original address to compute the displacement. Iain's solution was much quicker than mine; the only way I found to make the SymGetLineFromAddr function work was to completely shut down and restart the symbol engine on each SymGetLineFromAddr call.
      I put Iain's workaround in the InternalSymGetLineFromAddr function so the work was not scattered across the source. InternalSymGetLineFromAddr also takes care of the case where an older version of IMAGEHLP.DLL that does not support SymGetLineFromAddr is on the system. If you plan on using SymGetLineFromAddr for a debugger, you might want to wait until a fixed version of IMAGEHLP.DLL has been released. If you do not want my workaround, undefine WORK_AROUND_SRCLINE_BUG when building BugslayerUtil.DLL.
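      The backward search can be sketched as follows. This is not the InternalSymGetLineFromAddr code from Figure 3; it is a minimal model in which a caller-supplied callback stands in for a lookup that only succeeds on addresses at the exact start of a source line. LineInfoSketch, ExactLookupFn, FindLineSketch, and the maxBack limit are all hypothetical.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical result of a source line lookup.
struct LineInfoSketch
{
    const char* file;
    unsigned    line;
};

// Stand-in for a lookup that only succeeds when the address is the
// first byte of a source line (a displacement of zero).
typedef std::function<bool(std::uintptr_t, LineInfoSketch&)> ExactLookupFn;

// Walks backwards from addr, one byte at a time, until the lookup
// succeeds or maxBack bytes have been tried. On success, disp holds
// the original address minus the address that was found.
bool FindLineSketch(const ExactLookupFn& lookup,
                    std::uintptr_t addr,
                    std::uintptr_t maxBack,
                    LineInfoSketch& info,
                    std::uintptr_t& disp)
{
    for (std::uintptr_t back = 0; back <= maxBack && back <= addr; ++back)
    {
        if (lookup(addr - back, info))
        {
            disp = back;
            return true;
        }
    }
    return false;
}
```

      A bound such as maxBack keeps the search from running away when an address has no line information at all; the value to use is a judgment call.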

Wrap-up
      I hope I have been able to show you the power of exception handlers and crash handlers. If used properly, they can save you from full crashes. If you do crash, you can maximize the information that helps you solve the problem quickly. You might want to consider building your release builds with COFF symbol information. This will add to your binary's size, but using the code I presented you can get some excellent free source and line information during a crash.
      Many people have asked me where to find the version of IMAGEHLP.DLL that supports source and line lookup. It is on the November 1997 (or later) Platform SDK in the \MSSDK\bin\Win95\i386 directory. Although it is in the Win95 directory, it works perfectly well with Windows NT. Additionally, it does not look like IMAGEHLP.DLL is redistributable, so I am unable to email copies to those that need it. However, the entire Platform SDK is downloadable from http://msdn.microsoft.com/developer/sdk/bldenv.htm if you do not have the MSDN CD. For further information on redistributing IMAGEHLP.DLL and other files, check out Redist.txt in the \MSSDK\license directory on the Platform SDK.

April Bug
      There was a big bug in the April 1998 Bugslayer column! I mentioned that building your release builds with full PDB symbols would only add 1KB to the size of your binary. I failed to mention that turning on the /DEBUG flag for LINK.EXE also turns on the /OPT:NOREF flag. This means that all the unreferenced functions (COMDATS) will be included as well. Consequently, this can quickly jack up the size of your binary. Therefore, when linking with /DEBUG, make sure to specify /OPT:REF to force only referenced functions to show up in the resultant image. /OPT:REF will also turn off incremental linking, but you never want to have incremental linking turned on for release builds because it will add a ton of padding to the binary and waste all sorts of space. Incremental linking should only be used in debug builds.
      Finally, the size of the PDB record can be a bit bigger than I thought it could be. I was under the impression that it started with the NB10 at the end of the binary. Evidently, there is some header information before it that looks like offset information into the PDB preceding the NB10. This portion can vary in size depending on the size of the binary, but not enough to stop you from building your release builds with full PDB files.

Da Tips
      Got a debugging tip? Send it to john@jprobbins.com so you can bask in your two minutes and thirteen seconds of fame as you help your fellow developers!
      Tip #11 One simple thing I've learned to do is to invalidate that which has been deallocated. For example, if you delete a pointer, set that pointer to NULL immediately afterward. If you close a handle, set that handle to NULL (or INVALID_HANDLE_VALUE) immediately afterward. This is especially true when these are members of a class. By setting a pointer to NULL, you prevent double delete calls, because calling delete on a NULL pointer is valid and does nothing. (Thanks to Sam Blackburn, sblackbu@erols.com.)
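      One way to make this tip hard to forget is a small helper that deletes and clears in a single step. The sketch below is illustrative only: DeleteAndNull and CountedSketch are hypothetical names, and it uses nullptr, which requires a C++11 compiler.

```cpp
// Deletes the object and clears the caller's pointer in one step.
template <typename T>
void DeleteAndNull(T*& p)
{
    delete p;      // deleting a null pointer is a no-op
    p = nullptr;   // a second call is now harmless
}

// Hypothetical type that counts live instances, to show the effect.
struct CountedSketch
{
    static int s_live;
    CountedSketch()  { ++s_live; }
    ~CountedSketch() { --s_live; }
};

int CountedSketch::s_live = 0;
```

      Taking the pointer by reference is the design point: the helper can only clear the caller's copy if it has access to it.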
      Tip #12 In the April 1998 column, I presented a tip about automatically initializing structures that require a size field to be filled out. Reader Simon Salter (simon@chersoft.co.uk) offered an even better way to accomplish this using C++ templates:


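 // memset is declared in <string.h>.
 template <typename T>
 class SWindowStruct : public T
 {
 public  :
     SWindowStruct()
     {
         // Zero the whole structure, then fill in its size field.
         memset ( this , 0 , sizeof ( T ) ) ;
         // this-> is needed on conforming compilers because cbSize
         // lives in the dependent base class T.
         this->cbSize = sizeof ( T ) ;
     }
 } ;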
Using this class, you just need to declare a structure like the following and it is taken care of automatically:
SWindowStruct<REBARBANDINFO> stRBBI ;

Have a tricky issue dealing with bugs? Send your questions or bug slaying tips via email to John Robbins: john@jprobbins.com

From the August 1998 issue of Microsoft Systems Journal.