Bugslayer, MSJ, August 1999

Code for this article: Aug99BugSlayer.zip

John Robbins is a software engineer at NuMega Technologies Inc. who specializes in debuggers. He can be reached at www.jprobbins.com.

The other day I was debugging a nasty problem using some advanced breakpoint techniques, and I ran into a problem with the Visual C++® debugger. The debugger had the information I desperately needed, but because of a bug in the debugger (which I will elaborate on later in this column), it was not able to show it to me. This was especially frustrating because the bug that I was looking for was extremely hard to duplicate. I then went to another machine to whip up a program I could run against the active debugger to get the hidden information. I was rather frustrated, but quickly realized that I had just stumbled across material for my next Bugslayer column.

    This month, I'll start out by talking about advanced breakpoints so that you can learn some techniques to maximize your debugging and minimize time spent in the debugger. After a quick overview, I will discuss the previously mentioned advanced breakpoint problem and how I worked around it. In coming up with the solution, I also thought of some ways you could extend the breakpoint handling so that you can do even more with breakpoints. Implementing the solution also meant that I needed to use the lesser-known remote debugging technique, which is an excellent trick to solve some very tough situations. For the hardcore hackers out there, I had to do some major hacking to make parts of my solution work. Finally, I will wrap up with some new things out on the debugging and symbol engine front that you should know about. So hang onto your debuggers, it's time for advanced breakpoints!

Advanced Breakpoints
      The whole idea behind advanced breakpoints is to let the debugger do the work for you. The Visual C++ debugger supports six different types of breakpoints, and with a little ingenuity you can use the debugger to find hard-to-locate problems. Some of these problems include memory overwrites, wild writes, and which iteration through a loop caused the problem. The first step to advanced breakpoint nirvana is achieved through reading the Visual C++ documentation about setting advanced breakpoints. I won't duplicate it all here; I'll concentrate on specific usage scenarios. The advanced breakpoint topics can be found by searching for "Using Breakpoints: Additional Information" on MSDN™ or MSDN Online (http://msdn.microsoft.com). Also, make sure to read the section on "Using Advanced Breakpoint Syntax" so you can learn about setting the context and scope for your breakpoints.

    Although not explicitly mentioned in the documentation, the breakpoints on messages only work with straight C window procedures. If you are using MFC, the only way to break on a Windows message is to put in a handler using Class Wizard and then just set a location breakpoint on the handler.

    The first way I use advanced breakpoints is for memory overwrites. While you should always catch these in debug builds by using the built-in CRT memory debugging, you might occasionally have some slip through in your release builds. (For more information about CRT memory debugging, see the October 1997 Bugslayer column.) Using a memory data breakpoint, you just need to type the variable and its size into the Breakpoints dialog Data page.

    One thing to keep in mind is that the debugger can run very slowly in certain data breakpoint scenarios. If the memory that you are watching is a stack variable, the debugger cannot use a hardware debug register, so it must single-step each assembler instruction and check the memory location after each step to see if it has changed. Consequently, you should work hard to narrow down the condition to a small area of your code so that you can quickly execute up to the point of the error.

    Another advanced breakpoint technique that I use frequently is to monitor memory when a wild write occurs. These bugs are notoriously hard to track down because you never know when they occur. One of the worst ones I ever saw was when someone accidentally wrote to an uninitialized stack pointer and corrupted memory five or six DLLs over and 20 minutes later.

    Once you can get the wild write narrowed down so that you can duplicate it 20 percent of the time, you can set a data memory breakpoint and have it look at larger sections of memory. On the Data page of the Breakpoints dialog, set the number of elements to watch and the complete size of the structure or buffer. While this will usually result in the debugger using the single-step method, it is one of the only ways you can find this type of error. To monitor the progress, I let it run on my second machine and continue looking for it on my main machine. If you do not have a second development machine, this is an excellent reason to get one. It pays for itself the first time you can find the wild write.

    The expression breakpoints are your secret weapon for stopping when a certain condition evaluates to true. The documentation is not explicit about saying it, but the only expressions that are valid are those using C-style evaluations: ==, !=, &&, and so on. Additionally, the expressions can include variables, but are restricted to DWORD-length (32-bit) and less data evaluations. For example, if you need to check a string variable for a specific character you have to create an expression like the following one:

((szBuff[0]=='M')&&(szBuff[1]=='S')&&(szBuff[2]=='J'))

      Another thing to keep in mind is that the breakpoint expression evaluator does not handle anything other than variables. Thus, if you want to check enumeration values or constants, you will need to put the exact value in the expression.
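As an illustration only, the sketch below writes the same test as an ordinary C++ function, which makes it easy to see why each character needs its own comparison. It also shows an enumeration whose numeric value you would have to type into a breakpoint expression by hand. The names startsWithMSJ, ParseState, eDone, and the variable state mentioned in the comment are hypothetical and are not part of this column's code.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical enumeration. The breakpoint expression evaluator does not
// know the name eDone, so a breakpoint condition would have to use its
// numeric value (2) directly, for example: state==2
enum ParseState { eIdle = 0, eReading = 1, eDone = 2 };

// The same test the breakpoint expression performs, one character at a time.
bool startsWithMSJ(const char* szBuff)
{
    assert(szBuff != nullptr);
    // Check the length first so that short strings are never read past
    // their terminator.
    if (std::strlen(szBuff) < 3)
        return false;
    return (szBuff[0] == 'M') && (szBuff[1] == 'S') && (szBuff[2] == 'J');
}
```

Calling startsWithMSJ("MSJ rules") returns true, while "MS" and "msj" both return false because the comparison is character by character and case sensitive.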

    The final advanced breakpoint trick—which is the one that I seem to use most—is setting a skip, or pass, count for the breakpoint. This allows you to let your program run up to the point where you think the problem resides, without stopping each time the line executes. The great thing about the Visual C++ debugger is that the Breakpoints dialog (see Figure 1) will show you how many times the breakpoint has been skipped. This comes in very handy if you are crashing in a loop. You can set the skip count to a very high value and run until your program crashes. Then open the Breakpoints dialog; in the listbox at the bottom, the skip count will be displayed for the breakpoint. Keep in mind that the count remaining does not update if you are single-stepping over the breakpoints. The count only decrements when your application is running and not stopping at the breakpoint.

Figure 1 Breakpoints Dialog

       Figure 1 shows the Breakpoints dialog with a breakpoint having been executed four times. If you've ever used skip counts before on your breakpoints, the fact that the Breakpoints dialog shows you the remaining count might be a surprise.
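The arithmetic behind reading that remaining count is simple enough to sketch. Assuming you typed in a skip count and the program stopped (for example, by crashing) before the count ran out, the number of times the breakpoint line executed is the original count minus the remaining count shown in the dialog. The function name below is hypothetical.

```cpp
#include <cassert>

// Returns how many times the breakpoint was passed, given the skip count
// that was typed in and the remaining count the Breakpoints dialog reports.
unsigned long passesBeforeStop(unsigned long skipCount, unsigned long remaining)
{
    assert(remaining <= skipCount);   // the remaining count only counts down
    return skipCount - remaining;
}
```

For example, with a skip count of 1000 and 996 remaining, the line ran four times. Assuming the crash happens after the breakpoint line within that same pass, resetting the skip count to 3 and restarting should stop on the pass that leads to the crash.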

    Take a hard look at the listbox at the bottom of the dialog. It looks like an MFC CCheckListBox, and if you have ever worked with that class you know that it never has a horizontal scrollbar associated with the listbox. What do you think happens if you have a slightly more complicated breakpoint with a slightly longer file name? You guessed it; the breakpoint count goes off the right side of the listbox into the never-never land of clipped text, so the skip count can be essentially useless.

    Recall that at the beginning of this column I talked about the bug in the debugger that I ran into. You guessed it—this is the problem. I immediately worked around it by writing a program that found the Breakpoints dialog and got all the strings out of the listbox so I could see what iteration I crashed on. I wanted to solve this problem so that you could see all the information in the listbox when the Breakpoints dialog is active, and you do not have to rely on a separate program. It's time to talk about how I solved this and added more capabilities to the breakpoints themselves.

Fixing and Extending the Breakpoints Dialog
      In order to fix the clipped text problem, I figured that whatever I did would require me to get into the address space of the debugger so I could take over the Breakpoints dialog listbox. I poked around a bit with a Bugslayer's best friend, MSDN, and I found that the Visual C++ add-ins would certainly do the trick. As I read up on what you could do in add-ins, I saw that I could also do some more to make breakpoints even more powerful because the add-in model fires a BreakpointHit event when a breakpoint goes off. For a complete discussion of Visual C++ add-ins, see the Visual Studio® documentation and Steve Zimmerman's article, "Extend Developer Studio 97 With Your Own Add-ins, Macros, and Wizards" (MSJ, September 1997).

    When debugging, I have noticed that I am always doing the same things repeatedly when breakpoints go off, so I wanted to add the ability to automate those actions. I refer to these actions as command breakpoints. The first action is rather simple; I wanted to do some trace statements so that I could see variable values and such. By having a trace command attached to the breakpoint, I could just use it in a pinch without having to insert one in the code and recompile it.

    The second command breakpoint is one that would allow you to run a Visual C++ macro when the breakpoint goes off. Many times, I want to get the environment to a state where I can concentrate on the bug at hand, and this might mean that I close down all extraneous windows and open others with a macro.

    The last command breakpoint is one to get the memory window open and show the value of a specific address or variable. I am always poking in the memory window when stopped at a breakpoint, so I wanted to automate that as much as possible. As I get down and discuss the implementation, you will see that the memory dump breakpoint command was much harder to implement than I ever expected.

Figure 2 Command Breakpoints

Using command breakpoints is straightforward. When you add the AdvancedBreakpoints.DLL add-in to Visual C++, it adds a new toolbar button and a single new command, AdvancedBreakpointsDialog. When you click on the button, the dialog in Figure 2 pops up. I tried to follow a usage model similar to that of the normal Breakpoints dialog, so it should be familiar to you. The upper two-thirds of the dialog contains the commands that you associate with a breakpoint. As you can see, the names match up with their function. Also, the highlighted command in the Commands listbox shows that I even allow multiple commands to be associated with a single breakpoint.

Figure 3 Adding a Breakpoint Command

To add a new command, click on the New button and fill in the command edit controls, as shown in Figure 3. For the Memory Dump and Macro commands, you just need to type the single parameter in the respective edit control. The Trace command follows the formatting style first pioneered by printf, but the only acceptable conversion specifier is "%s". The New and Modify dialogs do the proper checking to ensure that you entered the Trace parameters correctly. That's all there is to using AdvancedBreakpoints.DLL.
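The check those dialogs perform is not reproduced here, but a validation of that kind could look like the following minimal sketch. It assumes every % must be followed by s, and it returns the number of "%s" specifiers so the caller can compare that with the number of parameters supplied. The name countTraceSpecifiers is hypothetical, and the real add-in's rules (for instance, whether "%%" is accepted) may differ.

```cpp
#include <cassert>
#include <string>

// Returns the number of "%s" specifiers in fmt, or -1 if fmt contains a '%'
// that is not immediately followed by 's'.
int countTraceSpecifiers(const std::string& fmt)
{
    int count = 0;
    for (std::string::size_type i = 0; i < fmt.size(); ++i)
    {
        if (fmt[i] != '%')
            continue;
        // A '%' at the very end, or followed by anything but 's', is rejected.
        if (i + 1 >= fmt.size() || fmt[i + 1] != 's')
            return -1;
        ++count;
        ++i;   // skip the 's' that was just consumed
    }
    return count;
}
```

With this sketch, "Value of i is %s" yields 1, "%s and %s" yields 2, and "bad %d" yields -1.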

Figure 4 Fixing the Clipped Items

      The neat thing is that just loading AdvancedBreakpoints.DLL will fix the clipped text problems with the Breakpoints dialog! With AdvancedBreakpoints.DLL loaded, you just need to roll your mouse over the clipped item in the Breakpoints list to see the remaining count (see Figure 4). While the checkbox is not visible in the tooltip, feel free to add it if you like; you can still check and uncheck as normal. Now that you have an idea how to use the command breakpoints and have seen the free bug fix, I want to talk about how you can debug the add-in with ease.

Remote Debugging
      One of the better kept secrets about the Visual C++ debugger, and WinDBG as well, is the fact that they both offer good support for remote debugging. For those of you not familiar with remote debugging, it means that your program and a tiny debug stub run on one machine (the remote machine), and the debugger runs on another (the target machine), communicating with one another through TCP/IP.

    If you are doing interesting drawing code, DirectX® games, or applications where you cannot afford to have the overhead of the full-blown IDE environment, remote debugging is almost mandatory. Many times, I have seen UI developers with the debugger scrunched down in one corner and their application scrunched into another, trying to single-step through paint code or activation code with no luck at all. With remote debugging, all of those problems go away—you will wonder how you ever lived without it.

    Another trick that remote debugging allows you to do is to debug from one operating system to another. I find it much more stable to run the main debugger on my Windows NT-based machine and remote debug down to a Windows® 9x-based machine.

    Remote debugging also comes in handy when debugging your Visual C++ Add-ins. While you might be able to debug the Add-in on the same machine with two instances of the debugger, I take the approach that anything that could destabilize the debugger I'm using is not a good thing.

    If you are a manager, you need to stop now and go get your developers a second machine so that they can save time debugging the drawing code. Equipment is very cheap compared to their salaries!

    Setting up and starting remote debugging is rather easy. Again, I do not want to duplicate the Visual C++ documentation here, so I encourage you to search for "Debugging Remote Applications" on MSDN and read it before continuing with the rest of this column. The documentation says what files you need to copy, but doesn't tell you where you can find them, so I listed their locations in Figure 5. I will assume that you read the documentation because I want to concentrate on the tricks to make your remote debugging a pleasurable experience.

    The first thing about remote debugging that the documentation does not explicitly tell you is to share all of the drives on the remote machine. When you start remote debugging, the target debugger will prompt you to locate the DLLs that the target machine needs so that it can match the ones on the remote machine. By sharing all the drives on the remote machine, you can just substitute the UNC names for the drive letters and ensure that you are getting the exact binaries that are loaded on the remote machine. The only drawback is that it does slow down the debugging session because the target machine will have to load the information from the remote machine's files across the network. You can have the target machine point to the files locally, but you must make sure that they match.

    The location of the remote files is stored in the option (.OPT) file for the project. If you are going to use the same project on the target machine to connect to a different machine, you will need to delete the .OPT file for the project. If you accidentally click off the checkbox that tells the debugger to stop looking to match files, you need to delete the .OPT file and start again. I have not found a way to turn that back on.

    While the individual files are stored on a per-project basis, the fact that you told the target machine that you were remote debugging is a global Visual C++ setting. Personally, I consider this a usage bug because it is certainly a project setting. I always forget to go back to the Build | Debugger Remote Connection menu item and set it to local debugging.

    If you are building a network application, be aware that remote debugging by nature transfers a good deal of data across the wire. If you need to minimize the traffic, another remote debugging alternative is to use WinDBG, which ships with the Platform SDK. WinDBG supports serial and TCP/IP debugging. Many of the same remote debugging techniques apply to WinDBG, so if you get it working for the IDE you can do the same for WinDBG. Now that you have an idea of what I did to debug AdvancedBreakpoints.DLL, I want to turn to some of the implementation highlights.

Hacking...I Mean Developing AdvancedBreakpoints.DLL
      First, I must issue a warning. If you are an object-oriented COM purist and believe in the sanctity of objects, you might not want to read this section. I piggybacked off the COM Add-in model to hack into the IDE so that I could fix the clipped text bug in the Breakpoints dialog and get the memory dump breakpoint command to work. I must be a sick guy because I thought it was a lot of fun!

    The first thing I tackled was the clipped text bug. Since the add-in model could get me into the IDE's address space, I just needed to figure out how to get a hold of the Breakpoints dialog's listbox. After a little poking around, I saw that I needed to be notified when the Breakpoints window is created. Notice that I use the word window, not dialog, as the Breakpoints dialog is really a window, like many MFC dialogs. This was screaming for me to implement a CBT hook to get the creation and subclass the window in order to subclass the listbox. If you look in the AdvancedBreakpoint.CPP file included with this month's source code distribution, you can see that creating and destroying the hook is handled in the usual application InitInstance and ExitInstance methods.

    The CBT hook subclasses the actual Breakpoints dialog and the simple replacement window procedure only cares about two messages, WM_CREATE and WM_DESTROY. Originally, I had planned to resize the Breakpoints dialog so that the text would fit. However, based on some of the breakpoints that I have seen, all of us would have had to run at 1600x1200 resolution to make the dialog wide enough. That was clearly a dead end, so I started thinking that it would be nice to use a tooltip.

    When I started looking at MSDN to see what it would take to do tooltips, I found the article, "Tiptoe Through the Tooltips With Our All-Encompassing Tooltip Programmers' Guide" by Roger Jack (MSJ, April 1997). Not only did Roger do a great job explaining tooltips, he implemented exactly what I needed: tooltips for listboxes. I made some minor modifications to Roger's code to allow me to derive a class handling CCheckListBox, and that was about it. The only thing my window procedure for the Breakpoints dialog does is subclass the listbox on WM_CREATE and free some allocated memory on WM_DESTROY. Roger's code does all the hard work of making the tooltip happen. The best code in the world is code you can borrow! Thanks, Roger.

    After I got the clipped text bug fixed I had a dummy add-in, but my column was a little too light to call it quits just yet. As I mentioned earlier, I took the opportunity to start adding the command breakpoints. In the big scheme of things, the hardest part I had with the command breakpoints was coming up with a dialog that made sense and was usable. If you have a better way of doing the job, I would love to hear about it.

    The Macro and Trace commands are so simple they are not even worth mentioning, but the Memory Dump is certainly interesting. The IDE Object Model is getting better, but I would still like to see more capabilities built in. There is a command, ActivateMemoryWindow, that activates the memory window, but it just shows the memory window. Using the Application object's ExecuteCommand method allowed me to call that command. After that, it was hack city. All I wanted to do was get a variable or address into the memory window edit control so it could be dumped. It was an interesting process, so you might want to follow along with the code in Figure 6, which is Commands.cpp from AdvancedBreakpoints.DLL, starting with the CCommands::AdvancedBPCmd_memdump function.

    The first thing that I tried was to find the memory window edit control, set focus to it, and see if I could just set the text. Alas, the simple approach did not work. While I was able to set the text, it did not update the data portion of the memory window. Additionally, it was screwing up the caret focus because the memory window was doing some sort of special handling. I also found that I had to rethink how I went about finding the memory window itself.

    As you probably know, there are three states for the memory window: the first is docked on the toolbar; the second is an MDI child window; and the third is floating outside the IDE altogether. Using one of a hacker's best friends, Spy++, I looked at the parentage of all these states and came up with a scheme to always find the edit window.

    The first step is to find the main IDE window itself so I can account for the two cases, the MDI child and docked on the toolbar, where the memory edit control is a descendant of the main IDE window. To find those cases, it is easy enough to enumerate all child windows under the main window, looking for the edit control that has a sibling window entitled "&Address:". The EnumChildWins function handles this. If I do not find the edit control, then I can assume the memory window is free-floating and will appear as a top-level window beneath the desktop. In that case, the LookForUndockedMemWindow function looks for the edit control. The comment in the function shows the window hierarchy for the memory window itself.
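The search itself can be modeled without any Windows calls. The sketch below stands in for the window tree with a plain struct and looks for an edit control that has a sibling titled "&Address:". It is only a model under stated assumptions: Window, findAddressEdit, and the "Edit" class-name test are hypothetical, and the add-in's own EnumChildWins and LookForUndockedMemWindow functions work on real window handles instead.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for a window: a class name, a caption, and children.
struct Window
{
    std::string className;
    std::string title;
    std::vector<Window> children;
};

// Depth-first search for an "Edit" window that shares a parent with a
// window titled "&Address:". Returns nullptr when no such window exists,
// which in the scheme described above means "go look at the top-level
// windows instead".
const Window* findAddressEdit(const Window& parent)
{
    bool hasAddressLabel = false;
    const Window* edit = nullptr;
    for (const Window& child : parent.children)
    {
        if (child.title == "&Address:")
            hasAddressLabel = true;
        else if (child.className == "Edit" && edit == nullptr)
            edit = &child;
    }
    if (hasAddressLabel && edit != nullptr)
        return edit;

    // Not at this level, so descend into each child in turn.
    for (const Window& child : parent.children)
    {
        if (const Window* found = findAddressEdit(child))
            return found;
    }
    return nullptr;
}
```

Requiring the sibling label, rather than taking the first edit control found, keeps the search from picking up unrelated edit controls elsewhere in the IDE.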

    After I found the memory window edit control, it was time to figure out a way to get the caret to the edit control. At first I thought I could just use the SendKeys API that I discussed in the April 1999 Bugslayer column, but it did not work. The problem relates to what happens when a breakpoint triggers in the debugger. When my code is executing, the operating system is in the middle of transferring the focus from the debuggee to the debugger, and there is no telling which window will get the input generated by SendInput. If I could not use SendInput, at least I had the edit control window handle so that I could do the next best thing: force a mouse click by posting the messages myself. That worked like a champ.
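For readers who have not faked a click before, the sketch below shows how the two messages could be put together. It is a portable model, not the add-in's code: the message values are the standard Win32 constants, while PostedMessage, packPoint, and buildLeftClick are hypothetical names. On Windows, each entry would be handed to PostMessage along with the edit control's window handle.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Standard Win32 values, repeated here so the sketch builds anywhere.
const std::uint32_t  kWmLButtonDown = 0x0201;  // WM_LBUTTONDOWN
const std::uint32_t  kWmLButtonUp   = 0x0202;  // WM_LBUTTONUP
const std::uintptr_t kMkLButton     = 0x0001;  // MK_LBUTTON

// One message as it would be passed to PostMessage (minus the window handle).
struct PostedMessage
{
    std::uint32_t  msg;
    std::uintptr_t wParam;
    std::intptr_t  lParam;
};

// Packs client coordinates the way MAKELPARAM does: x in the low word,
// y in the high word.
std::intptr_t packPoint(std::uint16_t x, std::uint16_t y)
{
    return static_cast<std::intptr_t>(
        (static_cast<std::uint32_t>(y) << 16) | x);
}

// Builds the button-down/button-up pair that simulates a left click at
// client position (x, y). The button is down only in the first message.
std::array<PostedMessage, 2> buildLeftClick(std::uint16_t x, std::uint16_t y)
{
    const std::intptr_t pos = packPoint(x, y);
    return {{ { kWmLButtonDown, kMkLButton, pos },
              { kWmLButtonUp,   0,          pos } }};
}
```

Posting directly to a known window handle sidesteps the question of which window currently has the input focus, which is the problem described above.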

    Once I had the focus properly set I figured all I needed to do was to set the text, fake the Enter key, and I was finished. Not so fast, Bugslayer boy! It almost worked, but there were cases where I would force my text in, but the IDE would change the value. I experimented with it a bit and saw that the text was being changed when the previous value in the memory window was a local variable. The IDE was doing the right thing and converting the local variable into its address so I could continue to see the correct memory no matter where I stopped next. Using Spy++ to watch the message stream, I saw that my forced text code would go through and the very next message was a WM_SETTEXT where the IDE was doing its thing. I needed to override that WM_SETTEXT and get my values in there. The only way to do that was to subclass the edit control. You can see the subclass code in the EditSubClass function.

    When I first started fiddling with the conditions that caused the IDE to change the text, I thought I could make my code smart enough to only do the subclass when it needed to. Unfortunately, I never quite found the right conditions. The only thing that worked consistently in all cases was to force my text into the edit control and still subclass the edit control. The only issue was that I needed to know when to stop the subclass if the IDE did not subsequently change the text. Based on the Spy++ message capture, I saw that the IDE always sent an EM_SETSEL to select all the text in the edit control before doing anything else. I added that to my subclass procedure and everything started working.

    Up to this point, I had been testing with simple location breakpoints, so I started testing with the other types. Since the others throw up a message box to tell you that the breakpoint was hit, I thought that they might cause some trouble with my focus fake-out attempts. Sure enough, they did. At this point, the column's due date was getting close, so I decided to leave this as an exercise for the reader. I fixed up the code so that when you try to associate anything other than a location breakpoint to a command set that has a memory dump in it, you will be warned. Solving the message box problem should not be that hard. You will obviously need to hook MessageBoxA out of MFC42.DLL, eat the text, and then show the message box when appropriate. My guess is that you could do it in EditSubClass after you unhook the subclass.

    Another thing that I did not hook up was saving and restoring the command breakpoints on a per-project basis. If anyone adds either of these two enhancements, I would be happy to post the code on my Web site so that others could enjoy it as well.

    One thing that I need to warn everyone about is that there seems to be a bug in the breakpoint object in the IDE. It seems that the breakpoint location and the source line are not updated when you edit the source. Consequently, you might have set the breakpoint on line 26, edited the code so you moved the breakpoint, and Advanced Breakpoints will still report the location as line 26. The breakpoint is still valid and will go off, but there does not seem to be any way to get the current, correct line number.

New Bugslaying in Windows 2000
      At the time I wrote this, Windows 2000 beta 3 was about to head out the door, and I wanted to update you on a cool new thing from Microsoft to help all the bugslayers out there. With Windows 2000 they moved the symbol engine out of IMAGEHLP.DLL and into DBGHELP.DLL. The nice thing with DBGHELP.DLL is that it will offer the ability to specify exactly which symbol reader to use so that you can be assured that you will get the right PDB reader. Now you'll no longer have to replace a system DLL in order to read symbols! In fact, IMAGEHLP.DLL is off limits to updating from any programs other than the Windows 2000 updaters.

    Microsoft recommends that everyone start using DBGHELP.DLL instead of IMAGEHLP.DLL as soon as possible. DBGHELP.DLL is redistributable, and you can place it in the same directory as your main EXE. By default, DBGHELP.DLL will use the system-supplied MSDBI.DLL, which reads Visual Studio 97 and Visual Studio 6.0 symbols. If, in the future, you do need to use a different PDB reader, you can set the registry key HKLM\Software\Microsoft\Windows\CurrentVersion\dbgHelp with the REG_SZ value, PDBHandler, to point to the complete path to the desired PDB reader. If you need to set this in the future, I recommend that you set it as your program starts up and clear it when your program ends. That way, everyone else can ensure that they get the proper version of the PDB readers that their applications need.

Wrap-up
      I hope that you find AdvancedBreakpoints.DLL useful for your debugging. It certainly has been for mine, and I have noticed that when I go to a debugger where it is not installed, I keep looking for the tooltips in the Breakpoints dialog as well as the additional commands. If you think of any other features that you would like to see in the debugger, please feel free to email me. If I can hack it into the IDE, I will!

Da Tips!
      The summer is almost over and you have been meaning to send me your debugging tips. This is just a reminder! Send them to john@jprobbins.com.

Tip 23: If you are buried way down deep in your application during a debugging session and you want to quickly go back up the call stack, open the Call Stack window, highlight the function you want to return to, and hit your breakpoint key. The Visual C++ debugger allows you to set breakpoints in the Call Stack window.

Tip 24: Now that Visual C++ 6.0 is able to read export symbols as well as the operating system symbols, a few folks have written me asking how they can set a breakpoint on the first instruction of an exported system function. The trick is to check if the symbols have been loaded in the debugger by checking the debugger output window. If it says "Loaded symbols for XXX", where XXX is the name of the DLL you are interested in, the symbols are loaded.

    If the symbols have not been loaded, open the Breakpoints dialog and click on the arrow next to the Break At edit control to bring up the Advanced Breakpoints dialog. In the Location edit control, type in the name of the function as it is exported from the DLL. (You can check the name with the "DUMPBIN /EXPORTS" command.) Tab down to the Executable File edit control and type in the name of the DLL that exports the function. For example, if you do not load the symbols to KERNEL32.DLL, you can break on LoadLibrary with the following breakpoint syntax:

{,,KERNEL32.DLL}LoadLibraryA

      If the symbols are loaded, you will need to do a little more work. You will still fill out the same information in the Advanced Breakpoints dialog, except that you must manually calculate the symbol name for the location field. If you do load the symbols for KERNEL32.DLL, the correct symbol for LoadLibrary is _LoadLibraryA@4. The breakpoint syntax is

{,,KERNEL32.DLL}_LoadLibraryA@4

The number after the @ sign indicates that the function is a __stdcall function and represents the number of bytes that are in the parameter list. Calculating the byte count is easy; it is the sum of the sizes of all parameters, with each size rounded up to a multiple of four bytes. Keep in mind that the @ sign and number are still needed, even for those functions that do not have parameters. For example, the correct translation for TlsAlloc is _TlsAlloc@0.
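If you would rather not do that calculation by hand, it can be sketched in a few lines. The function below is a hypothetical helper, not part of this column's code. It assumes a 32-bit __stdcall function and takes the size in bytes of each parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds the decorated name of a 32-bit __stdcall function: a leading
// underscore, the name, '@', and the parameter byte count, where each
// parameter's size is rounded up to a multiple of four bytes.
std::string decorateStdcall(const std::string& name,
                            const std::vector<std::size_t>& paramSizes)
{
    assert(!name.empty());
    std::size_t total = 0;
    for (std::size_t size : paramSizes)
        total += (size + 3) / 4 * 4;   // round up to the next multiple of 4
    return "_" + name + "@" + std::to_string(total);
}
```

LoadLibraryA takes a single 4-byte pointer, so decorateStdcall("LoadLibraryA", {4}) gives _LoadLibraryA@4, and an empty parameter list gives _TlsAlloc@0 for TlsAlloc. A made-up function taking a 1-byte char and an 8-byte double would come out as @12, because the char is rounded up to four bytes.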

Have a tricky issue dealing with bugs? Send your questions or bug slaying tips via email to John Robbins: john@jprobbins.com.

From the August 1999 issue of Microsoft Systems Journal.