Bugslayer, MSJ, December 1999

Code for this article: Dec99BugSlayer.exe (31KB)

John Robbins is a consultant and teaches Windows debugging courses (http://www.solsem.com). He's working on a book for Microsoft Press, tentatively titled Debugging Microsoft Windows Applications. Reach John at http://www.jprobbins.com.

This month, I want to start the column with a mind-reading trick. As I relax and start concentrating, I need you to think of a situation where you were recently having difficulty. Ah, it is coming to me... You had some sort of problem. You had a program that crashed, but only at a few customer sites, and no matter how hard you tried, you could not duplicate the problem. It is getting very clear now; you desperately needed to see the state of your program to see where the problem was located. I also see that you have worked on other operating systems. You were begging and pleading for a core dump so you could load the crashed process into the debugger and see what was going on. Ta-da! I just read your mind.
      What if I were to tell you that Windows NT® and Windows® 2000 can give you a core dump of the process? You might have heard it called a user-mode memory dump (or a crash dump, in Win32® terms). Your old friend Dr. Watson, which ships with most copies of Windows NT-based operating systems, can write a crash dump of the exact state of the process at crash time. There are only three problems. First, crash dump files can get quite large because they are a complete memory image of your process. Fortunately, with today's faster Internet connections, you should not have much trouble getting them from the sites experiencing the crashes. Second, while Dr. Watson does write crash dumps, it seems to do a better job on Windows 2000 than on Windows NT 4.0. Third, the debugger everyone uses, the Visual C++® debugger, does not read them. The only debugger that reads crash dumps is WinDBG.
      WinDBG is the debugger that comes with the Platform SDK; most developers stopped using it quite a while ago. Microsoft did not actively maintain WinDBG, so it took a while before it started supporting the latest debugging symbols produced by the latest compilers. Another problem is that many people thought that WinDBG was only for debugging device drivers and did not support user-mode programs. The biggest problem with WinDBG is that it is much harder to use than the Visual C++ debugger.
      Fortunately, Microsoft has started actively maintaining WinDBG again. While it is still harder to use, WinDBG is much more powerful than the Visual C++ debugger. In fact, by the time you finish with this column, you will wonder how you ever lived without it for your nastiest debugging problems. Additionally, with the ability to read the crash dumps from your customers' machines, WinDBG becomes mandatory on all developer machines.
      As you will find, WinDBG is lightly documented. This month, it will take the entire column to cover WinDBG so that you can get an idea how to use it. If you do not understand the ins and outs of WinDBG, trying to read a crash dump file can be an exercise in frustration. While I will not be able to cover everything, I will point out all the important commands and options that you need to get the most out of WinDBG.
      My next column will cover how you can extend WinDBG with your own commands and how to deal with crash dumps. Additionally, I will present a small utility that I have found invaluable in my own development: DbgChooser. This utility allows you to pick the debugger you would like to run when you have a crash on your local machine. After you see the power of WinDBG, you will definitely want the option to jump into different debuggers based on where you think the problem is in your code.

WinDBG Myths and Legends
      WinDBG has actually been around for quite a while; it was the first GUI debugger that shipped with the Win32 SDK back in the alpha and beta days of Windows NT 3.1. What makes WinDBG kind of odd is that it mixes an old text-based kernel debugger (i386KD) and an old text-based user-mode debugger (NTSD), with a GUI portion grafted on. As a kernel debugger for device drivers, it works pretty well and allows you to debug your driver using many of the features that user-mode programmers have become accustomed to, such as watch windows and selectable call stack displays.
      I would be remiss in my columnist duties if I did not mention that WinDBG has a few problems. I have already mentioned that it is harder to use than the Visual C++ debugger, but there are also more bugs in WinDBG. Fortunately, most of the bugs do not crash or lock up the debugger, and many of them can be worked around. If something does not display correctly, try it again and it will probably display. As I discuss WinDBG, I will try to point out the problems that I have encountered.
      WinDBG seems to be a debug build because you might encounter an occasional assertion. Now that Microsoft is maintaining WinDBG again, these have become quite rare. However, if you do get one, it is probably a good idea to stop debugging and close WinDBG. As with your own assertions, there is no telling what will happen after the assertion message box goes away.
      Before I can talk about user-mode debugging with WinDBG, you need to get the latest and greatest version. Fortunately, WinDBG is provided on the Platform SDK. For this column, I used the Windows 2000 Post-RC1 version from http://www.microsoft.com/hwdev/driver/ntdebugging.htm. The version stamp on WINDBG.EXE for the version I used is 5.0.2080.1. It works just fine on Windows NT 4.0 and Windows 2000 RC1. By the time you read this column, the final Platform SDK for Windows 2000 should be out, so you should use that version.
      After you install WinDBG, you should use it to do some simple debugging to see how all the various windows work. Figure 1 shows WinDBG debugging a very simple crash with many of the informational windows open. If you have any experience at all with the Visual C++ debugger, WinDBG should not be that hard to work your way around. The WinDBG help file is fairly complete for the informational windows, so you should look at it as well. I do not want to dwell on the GUI part of WinDBG because all the power is in the text Command window—as well as many of the WinDBG oddities.

Dealing with the Command Window
      As I discuss the commands, you might want to get your program loaded into WinDBG and start debugging it. The commands are not that intuitive, and the only way to learn them is to see them in action.
      The first thing you need to know about the Command window is that there are three types of commands: regular, dot (.), and bang (!). Most of the normal debugging work is done with the regular commands and consists of things like setting breakpoints, dumping memory, and viewing call stacks. If you ever used the old CodeView® debugger, the regular commands should look very familiar.
      Since Win32 is multithreaded, and WinDBG can debug multiple processes spawned by the main process, you need to know how to view the currently active processes and threads with the regular commands. The tilde command, ~, will show you the state of threads in the current process. The pipe command, |, shows you the states of all processes being debugged. When you use the thread and process commands, you will see that WinDBG numbers them from zero to n, where n is one less than the number of threads or processes. The list will show the current thread or process marked with an asterisk.
      The following output shows the thread information for a multithreaded program. Notice that thread 4 is stopped. The different fields from left to right are thread number, thread handle, running state, thread priority, Thread Environment Block (TEB), and current address.


 0  111 Running  9 0x7ffde000 0x0000000074FF0C24
 1  73  Running  9 0x7ffdd000
       ?ReceiveLotsaCalls@WMSG_ADDRESS@@AAEXXZ
 2  86  Running  8 0x7ffdc000 0xFFFFFFFFF7028B7F
 3  265 Running  7 0x7ffdb000 0x000000006055343D *
 4  261 Stopped  9 0x7ffda000 _DbgBreakPoint@0

      The only way to switch to a different thread context is to use the tilde command followed by the thread number. Using the previous output, to shift to the initial thread, the command is ~0. If you have a Calls window open to show the call stack, you will see the window update each time you switch threads. Switching processes is similar, except you use the pipe command followed by the process number to switch to that process.
      You can apply the process and thread modifiers to some of the commands so that you can specify exactly on which thread or process you want the command to work. For example, the R command displays the registers. To see the registers for the first thread and the third thread in your process, you would use the commands ~0r and ~2r, respectively. If you have multiple processes, you can modify the command with the process and thread combined. Because numbering starts at zero, |0~3r shows the registers for the fourth thread in the first process.
      There are a good number of regular commands, which you can see by reading the help, but Figure 2 shows the most important ones. The breakpoint commands are important enough that I need to discuss them after the rest of the commands.
      One regular command I do want to spend a few moments on is Evaluate Expression (?). Do not get too frustrated with it, as it seems to have some problems. It does just fine for global variables and when you evaluate locals in the current function. However, I have had trouble getting it to evaluate local variables in frames farther up the stack. Fortunately, if you do need to view locals in a different stack frame, you can open the Calls window and double-click on the line that interests you. If you have the Locals window open, it will show the evaluated locals for the selected stack frame.
      The dot commands let you control WinDBG itself. Many of the commands are accessible through the toolbar or menu options. However, WinDBG allows you to script the debugger, and the dot commands let you control the debugger in your script. I've described the most important dot commands in Figure 3.
      Scripting the debugger is rather simple. You just need to have a text file with the various commands you want to execute. If you step back and think about it, the ability to script the debugger can help you to get quickly past preliminaries and speed up your problem solving. The following small script example shows how to wait for a particular string to come through the OutputDebugString API and have the debugger perform some actions.


 * Wait for the bad OutputDebugString
 .waitforstr i = 2
 * This is the problem area.  Now display the call stack
 * to see how I got here. 
 kv 
 * Break into the debugger. 
 .break

Note that the asterisks in this code indicate comments.
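      To try a script like this one, you need a target program that emits the string the script waits for. The following is a minimal sketch of such a target in C, under stated assumptions: FormatTrace, EmitTrace, and TraceLoop are hypothetical names, and the sketch assumes the script matches the text passed to OutputDebugString. On Windows the text goes through OutputDebugStringA; elsewhere it falls back to stderr so the code still compiles.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: formats the trace text "i = <value>" into buf.
   Returns snprintf's result, the length of the formatted text. */
int FormatTrace(char *buf, size_t size, int i)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "i = %d", i);
}

/* Hypothetical helper: sends text to the debugger on Windows,
   or to stderr elsewhere so the sketch stays portable. */
void EmitTrace(const char *text)
{
#ifdef _WIN32
    OutputDebugStringA(text);
#else
    fprintf(stderr, "%s\n", text);
#endif
}

/* Emits one trace string per iteration, starting at i = 0.
   Returns the number of strings emitted. */
int TraceLoop(int count)
{
    char buf[32];
    int emitted = 0;
    int i;

    for (i = 0; i < count; i++) {
        FormatTrace(buf, sizeof(buf), i);
        EmitTrace(buf);
        emitted++;
    }
    return emitted;
}
```

      Calling TraceLoop(5) from the target's main function emits "i = 2" on the third pass through the loop, which is the string the script above waits for.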
      One interesting aspect of scripting the debugger is that the script runs in the Command window. This means you can continue to use the debugger as normal. You can start running a script as soon as you start WinDBG by using the -r command-line option.
      When scripting, I've noticed that the > prompt in the Command window sometimes goes away after a script has finished executing. Without the prompt, nothing you enter is accepted. You can work around this by pressing Enter a few times. If that does not work, typing the Help command seems to get it back.
      The bang commands are where the real power of WinDBG becomes obvious. WinDBG is designed to be extensible. This extensibility is through WinDBG extension DLLs that have complete access to the debugging engine, which means you can build extensions that have knowledge of your program so you can do special debugging. Additionally, since you have full access to the Win32 API, you can make commands that can view and manipulate anything you want.
      WinDBG comes with a number of extensions supplied by Microsoft. They are in subdirectories of the directory that contains WINDBG.EXE, and correspond to Windows NT 4.0 free build (.\NT4fre), Windows NT 4.0 checked build (.\NT4chk), Windows 2000 free build (.\W2KFre), and Windows 2000 checked build (.\W2KChk). For Windows NT 4.0 there are eight extensions, and for Windows 2000 there are nine extensions. WinDBG loads the appropriate versions for the operating system you are debugging on, so you don't need to worry about loading the correct version of the extensions yourself. Figure 4 lists the different WinDBG extensions shipped with WinDBG. The key thing to note is that any extension with "KD" is only for debugging kernel-mode device drivers.
      WinDBG has a number of built-in bang commands that you use to get the WinDBG extensions loaded. Figure 5 lists the important commands. WinDBG will allow you to load multiple extensions. The problem is that it can get confusing—both for you and WinDBG—to determine which one is active in the debugger. For that reason, I find it best to always explicitly indicate which extension command I want to run by using the !extension.command format.
      NTSDEXTS is the WinDBG extension you will use the most because it has the most common informational commands. After you look over the selected commands in Figure 6, you will wonder how you ever lived without them. I encourage you to try out the various commands on your own programs so you can see what you have been missing. The !handle command alone is enough to warrant using WinDBG for at least 30 percent of your debugging.
      Now let's get back to breakpoints. In general, setting breakpoints in WinDBG is nearly identical to setting breakpoints in the Visual C++ debugger. However, WinDBG breakpoints are more powerful. First, WinDBG is much smarter about handling breakpoints in DLLs that are not loaded. Where the Visual C++ debugger makes you specifically add which dynamically loaded DLLs you want to set breakpoints in, WinDBG handles it as a normal part of its processing.
      Second, WinDBG makes it much easier to set per-thread breakpoints. While you can fake a per-thread breakpoint in Visual C++ by using an expression breakpoint (DW(@TIB+24)==thread id), with WinDBG you just specify the thread number in the breakpoint dialog.
      The third area where WinDBG breakpoints shine is that you can associate commands to execute when the breakpoint triggers. You can specify any type of command, including WinDBG extensions, and you can specify multiple commands by separating them with semicolons.
      One of the nastiest bugs I ever worked on was in a core module that kept the main data structures straight. It was so difficult to solve because the bug seemed to be data stream-dependent and I could not narrow that stream. To find the data stream that caused the problem, I set up a breakpoint that used the ? (Evaluate Expression) command to dump the data. The last command the breakpoint executed was a G (Go) command, so the breakpoint continued as soon as it finished dumping the variables. I turned on logging and ran the application until I got the crash. The log contained the data stream that caused the problem. Armed with the data stream, it was much easier to start rigging up test cases that reproduced it. In the end, the command breakpoints probably saved weeks of floundering around for the problem.
      If you have already looked in the help for the various breakpoint commands, you might have come across the BA (Break on Access) command. This tantalizing command will break when the memory specified is read from, written to, or executed. This sounds like an excellent command for tracking down nasty memory problems. There is only one problem: it only works in kernel mode and is not available for user-mode programs.
      Now that you have an idea what the commands are and how you can deploy them, you are ready to put WinDBG to work on your own programs. Writing your own extensions will have to wait for my next column.

Wrap-up

As you can see, WinDBG is a little different. The wonderful informational commands you can access through the Command window certainly make WinDBG something that all developers should get familiar with so they can tackle the nastiest problems. In debugging, information is power, and WinDBG certainly provides information you've never seen before. In my next column, I will discuss WinDBG extensions and how to deal with crash dumps, so stay tuned!

Da Tips!

What! It is the end of the millennium and you haven't sent in your debugging tips? You had better do it soon. Send them on to http://www.jprobbins.com.
Tip 27 Since I talked about WinDBG this month, here's a WinDBG tip. The WinDBG ? command can also be used to call a debugging function. Back in Tip 21 (MSJ, June 1999) John Maver discussed how to do this with the Visual C++ debugger. For WinDBG, just type in the function with parentheses. For example, if your debugging function prototype is


 void CheckMyMem ( void );

you would just type in


 ? CheckMyMem()

to have WinDBG execute your function.
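What such a debugging function does is up to you. As a minimal sketch, assuming a program that places guard bytes around the buffers it cares about, CheckMyMem could walk a table of tracked blocks and report any whose guards were overwritten. Only the CheckMyMem prototype comes from the tip; TrackBlock, GuardIntact, the guard pattern, and the g_corruptBlocks counter are hypothetical names chosen for illustration, and the report goes to stderr to keep the sketch portable.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical guard layout: GUARD_SIZE bytes of GUARD_BYTE
   before and after each tracked buffer. */
#define GUARD_BYTE  0xFD
#define GUARD_SIZE  4
#define MAX_TRACKED 16

typedef struct TrackedBlock {
    unsigned char *storage;   /* start of the leading guard */
    size_t         userSize;  /* bytes between the two guards */
} TrackedBlock;

static TrackedBlock g_blocks[MAX_TRACKED];
static int g_blockCount = 0;

/* Result of the most recent CheckMyMem call, easy to inspect afterward. */
int g_corruptBlocks = 0;

/* Registers caller-supplied storage of GUARD_SIZE + userSize + GUARD_SIZE
   bytes, fills both guards, and returns the usable region
   (NULL if the table is full). */
unsigned char *TrackBlock(unsigned char *storage, size_t userSize)
{
    assert(storage != NULL);
    if (g_blockCount >= MAX_TRACKED)
        return NULL;
    memset(storage, GUARD_BYTE, GUARD_SIZE);
    memset(storage + GUARD_SIZE + userSize, GUARD_BYTE, GUARD_SIZE);
    g_blocks[g_blockCount].storage = storage;
    g_blocks[g_blockCount].userSize = userSize;
    g_blockCount++;
    return storage + GUARD_SIZE;
}

/* Returns 1 if all GUARD_SIZE bytes still hold GUARD_BYTE, else 0. */
static int GuardIntact(const unsigned char *guard)
{
    int i;
    for (i = 0; i < GUARD_SIZE; i++) {
        if (guard[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

/* The debugging function: it takes no parameters, so it is easy to
   type into the debugger. It checks both guards of every block. */
void CheckMyMem(void)
{
    int i;
    g_corruptBlocks = 0;
    for (i = 0; i < g_blockCount; i++) {
        const TrackedBlock *b = &g_blocks[i];
        if (!GuardIntact(b->storage) ||
            !GuardIntact(b->storage + GUARD_SIZE + b->userSize)) {
            fprintf(stderr, "CheckMyMem: block %d is corrupt\n", i);
            g_corruptBlocks++;
        }
    }
    fprintf(stderr, "CheckMyMem: %d of %d blocks corrupt\n",
            g_corruptBlocks, g_blockCount);
}
```

Keeping the function parameterless and storing its result in a global means one short command runs the check, and the outcome can be examined afterward like any other global variable.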
Tip 28 By default, Windows NT and Windows 2000 always bring up the crash dialog to tell you that you had a crash and make you press Cancel to start the debugger. If you want to automatically jump into the debugger (or bring up the DbgChooser utility I'll describe in my next column), set the Auto value to 1 in HKLM\Software\Microsoft\Windows NT\CurrentVersion\AeDebug. The default value of zero brings up the initial crash dialog.
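One way to make the change is with a registry script. The fragment below is a sketch that assumes Auto is stored as a string (REG_SZ) value; check the value's type in the Registry Editor before importing it, and back up the key first.

```
REGEDIT4

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug]
"Auto"="1"
```

Setting the value back to "0" restores the initial crash dialog.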

Have a tricky issue dealing with bugs? Send your questions via email to John Robbins at http://www.jprobbins.com.

From the December 1999 issue of Microsoft Systems Journal.