June 1998
Download Jun98Bugslayercode.exe (2KB)
John Robbins is a software engineer at NuMega Technologies Inc. who specializes in debuggers. He can be reached at john@jprobbins.com.
This month let's talk about one of the most important tools in your bugslaying arsenal: your debugger. Many engineers just use their debuggers in a passive manner, to set a breakpoint or two and look at a couple of local variables. With a small investment of time, you can really spelunk through your bugs and learn how your application is interacting with the operating system. Since you probably spend more time debugging your code than writing it, it's very important to be totally comfortable with the debugger.
To ensure that everyone is on the same sheet of music, I will start with a couple of basic issues and work up to some advanced debugger tricks. All of the samples are part of this month's code, so you can work along if you download them. I encourage you to follow any tangents that the sample code suggests, since I can cover only a few tricks in this column and there are enough out there to fill a whole book!
For this column, I will be using the Visual C++® debugger for several reasons. First, almost everyone has it. Second, when your program crashes, no matter whether it's written in C, C++, Visual Basic®, or even Java, you need to debug it at the native code level. Even if C or C++ is not your main development language, you need to be familiar with the one debugger that works for all. If you are using another vendor's debugger, you should not have any trouble following along.
While most of the techniques mentioned in this column work on any Win32®-based operating system, several of them work only on Windows NT®. When debugging on Windows® 95 and Windows 98, I always use a kernel debugger like WDEB386. In my opinion, GUI debuggers are too limiting there because Windows 95 and Windows 98 do not support copy-on-write page updating. Any address above 2GB, where most of the operating system resides, is shared across all processes. If a debugger sets a breakpoint there in one process, the breakpoint is shared by all processes, thus crashing the operating system. Windows NT gets around this by copying the page where a breakpoint is set and giving the debuggee process its own view of that page.
If you are debugging problems inside your application's logic, GUI debuggers work fine. Unfortunately, half the problems you encounter in your application are bugs that come from your app's interaction with the operating system. By using a kernel debugger on Windows 95 and Windows 98, you get the advantage of being able to step into the operating system, just as you can under Windows NT.
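The copy-on-write point above can be pictured with a toy model. This is only a sketch: the ProcessView type and its member names are invented for illustration and say nothing about how either operating system actually manages pages. The only hard facts it leans on are that a software breakpoint on x86 is the one-byte INT 3 opcode, 0xCC, and that 55 8B EC encodes PUSH EBP / MOV EBP,ESP.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

constexpr std::uint8_t kInt3 = 0xCC;  // x86 breakpoint opcode (INT 3)

using Page = std::vector<std::uint8_t>;

// One process's view of a page of code. Hypothetical model for illustration.
struct ProcessView {
    std::shared_ptr<Page> page;

    std::uint8_t Read(std::size_t i) const { return (*page)[i]; }

    // No copy-on-write: the write lands in the page every process maps.
    void WriteShared(std::size_t i, std::uint8_t b) { (*page)[i] = b; }

    // Copy-on-write: if any other view maps this page, switch to a private
    // copy first, so the write is visible only to this process.
    void WriteCopyOnWrite(std::size_t i, std::uint8_t b) {
        if (page.use_count() > 1) {
            page = std::make_shared<Page>(*page);
        }
        (*page)[i] = b;
    }
};
```

With two views of the same page, WriteShared(0, kInt3) from one view changes what the other reads at offset 0, while WriteCopyOnWrite(0, kInt3) leaves the other view's byte untouched.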
Before you jump right into my debugging tips, I strongly recommend reading Matt Pietrek's "Under The Hood" columns in this issue and the February 1998 issue. In these columns Matt gives an excellent introduction to the Intel assembler instructions. Since only a few people in Redmond have the source code to something like KERNEL32.DLL on their system, the rest of us need to look at disassembly. In this column, I will assume that you have at least an understanding of the assembler which Matt covered. While it would be really nice if all crashes occurred in places where you have full source code, 94.6 percent of the time your application crashes in the middle of a ton of assembler code.
Endians, Calling Conventions, and Symbols
My COMCTL32.DLL was built at 20:57:56 Tuesday, November 18, 1997, while COMCTL32.DBG was built at 15:33:18 Friday, April 25, 1997.
Setting Breakpoints Inside a System DLL
Looking Up Things on the Stack
I do not know what some of the flags to C2.EXE mean, but it doesn't really matter; it's not something that you can use directly. The value 0x08000000 for dwCreationFlags is an undocumented flag defined in WINBASE.H as CREATE_NO_WINDOW. This seems to be the way to spawn another application, in this case a console application, without having any window show up at all. I will leave it as an exercise for you to look up the parameters to LINK.EXE.
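When a raw dwCreationFlags value such as 0x08000000 shows up on the stack, it helps to turn it back into names. The following is a minimal sketch of such a decoder. DecodeCreationFlags and the FlagName table are hypothetical helpers, not part of any SDK; the numeric values are the process creation flags from WINBASE.H, repeated here so the sketch builds without the Windows headers, and the table is not exhaustive.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

struct FlagName {
    std::uint32_t value;
    const char* name;
};

// A subset of the process creation flags from WINBASE.H.
static const FlagName kCreationFlags[] = {
    {0x00000001u, "DEBUG_PROCESS"},
    {0x00000002u, "DEBUG_ONLY_THIS_PROCESS"},
    {0x00000004u, "CREATE_SUSPENDED"},
    {0x00000008u, "DETACHED_PROCESS"},
    {0x00000010u, "CREATE_NEW_CONSOLE"},
    {0x00000020u, "NORMAL_PRIORITY_CLASS"},
    {0x00000040u, "IDLE_PRIORITY_CLASS"},
    {0x00000080u, "HIGH_PRIORITY_CLASS"},
    {0x00000100u, "REALTIME_PRIORITY_CLASS"},
    {0x00000200u, "CREATE_NEW_PROCESS_GROUP"},
    {0x00000400u, "CREATE_UNICODE_ENVIRONMENT"},
    {0x04000000u, "CREATE_DEFAULT_ERROR_MODE"},
    {0x08000000u, "CREATE_NO_WINDOW"},
};

// Returns the name of every known bit set in flags, lowest bit first.
// Bits missing from the table are reported together as one hex string.
std::vector<std::string> DecodeCreationFlags(std::uint32_t flags) {
    std::vector<std::string> names;
    for (const FlagName& f : kCreationFlags) {
        if (flags & f.value) {
            names.push_back(f.name);
            flags &= ~f.value;
        }
    }
    if (flags != 0) {
        char buf[32];
        std::snprintf(buf, sizeof buf, "unknown bits 0x%08X",
                      static_cast<unsigned>(flags));
        names.push_back(buf);
    }
    return names;
}
```

Passing 0x08000000 yields the single name CREATE_NO_WINDOW, matching the value discussed above.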
Figure 3 Debugger Memory Window
There are a couple of things that you need to keep in mind when looking at items on the stack. All local variables for a function are created on the stack. If you see an instruction like SUB ESP,54h, the function is reserving 0x54 (84) bytes of stack space for its local variables. For the most part, the reservation of local variables occurs at the start of a scope. As you are looking at the stack, make sure to account for these local variables.
In conjunction with the stack pointer, the base pointer, EBP, is the register used to access both local variables and parameters to functions. EBP is referred to as the frame pointer. If the memory access is a positive offset from EBP, like |
|
then the code is accessing a parameter. Negative offsets from EBP, like |
|
are accessing local variables. If the code has been highly optimized, then EBP can be used as a general register. If the start of the function has a PUSH EBP followed by a MOV EBP,ESP instruction, then frame pointers are being used and it is much easier to find the data.
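As a quick reference for reading EBP-relative operands, here is a small sketch that labels an offset. It assumes a 32-bit x86 function that starts with PUSH EBP / MOV EBP,ESP and whose parameters each occupy 4 bytes; DescribeEbpOffset is a hypothetical helper, and the parameter numbering is wrong for functions that take larger arguments by value.

```cpp
#include <string>

// Labels an offset from EBP in a standard 32-bit x86 stack frame:
//   [EBP+0]  saved EBP of the caller (from PUSH EBP)
//   [EBP+4]  return address (pushed by CALL)
//   [EBP+8]  first parameter, then +0Ch, +10h, ...
//   [EBP-n]  space reserved for local variables
std::string DescribeEbpOffset(int offset) {
    if (offset < 0) {
        return "local variable area";
    }
    if (offset < 4) {
        return "saved EBP";
    }
    if (offset < 8) {
        return "return address";
    }
    // Assumption: every parameter is one 4-byte stack slot.
    return "parameter " + std::to_string((offset - 8) / 4 + 1);
}
```

For example, an operand of [EBP+0Ch] comes back as "parameter 2" and [EBP-4] as "local variable area".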
When looking at values in the stack, it helps to know what is data and what are addresses. As I mentioned in my April 1998 column, it is very important to know where your DLLs load into memory. If you know the starting addresses for your DLLs, then you can quickly find various return addresses in the data. Also, you might want to start getting familiar with the load addresses of important system DLLs like KERNEL32.DLL, NTDLL.DLL, and OLE32.DLL so that you can look them up at a glance. I keep a simple text file around with all the load addresses for the DLLs so I can look them up when I am debugging. To find the load address for a DLL, run |
|
and look for the image base field. After a little work in the debugger and looking at your cheat file, you will become about 70 percent accurate at guessing if a value on the stack is an address or data.
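The cheat-file lookup can also be sketched in code. The Module record and FindModule are hypothetical, and the base addresses and sizes in SampleCheatSheet are invented for illustration; real values come from your own modules and system DLLs. A value that falls inside a known module range is a good candidate for a code or return address, and anything else is more likely data.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// One line of the load-address cheat file.
struct Module {
    std::string name;
    std::uint32_t base;  // image base (load address)
    std::uint32_t size;  // size of the loaded image in bytes
};

// Returns the name of the module whose range contains value, or an empty
// string if no listed module contains it.
std::string FindModule(const std::vector<Module>& modules,
                       std::uint32_t value) {
    for (const Module& m : modules) {
        if (value >= m.base && value - m.base < m.size) {
            return m.name;
        }
    }
    return "";
}

// Invented sample entries; not the real addresses on any system.
std::vector<Module> SampleCheatSheet() {
    return {
        {"MYAPP.EXE", 0x00400000u, 0x00020000u},
        {"MYHELPER.DLL", 0x10000000u, 0x00010000u},
    };
}
```

With the sample table, 0x00401035 resolves to MYAPP.EXE, while a small value such as 0x00000054 matches nothing and is probably data.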
Skipping and Changing Code
As I step through the disassembly twice, I need to make sure that I let the ADD instruction at address 0x0040103F execute to keep the stack aligned. As my earlier discussion of the different calling conventions indicated, the assembler snippet shows a call to a __cdecl function because of the ADD instruction right after the call. To reexecute the function, I would set the instruction pointer to 0x00401035 to ensure that the PUSH occurs properly.
The simplest way to change the instruction pointer is to right-click in the Disassembly window on the instruction you next want to execute and select Set Next Statement from the popup menu. The other way is to show the Register window, click on the address next to EIP, and type in the address. If you want to return to a different address from a call, you can open the Memory window, view the memory at ESP/EBP, and change the return address directly. Since swapping around the instruction pointer can lead to crashes, you might want to practice on a simple program to see the effects. (You can use the Calling program that is part of this month's code distribution.)
While it is useful to change the instruction pointer, it can get tedious to set a breakpoint and change EIP each time you want to avoid a function. In these cases I actually change the code to skip the function altogether. Fortunately, Intel has an instruction that is perfect as a placeholder: NOP. The NOP instruction means exactly what the name implies (no operation), and it will not change anything in your program.
To change the code at debug time, you need to become a miniassembler. Since you cannot assemble directly into memory with the Visual C++ debugger, you need to poke the opcode for the instruction into memory yourself. In the case of a NOP instruction, the opcode is 0x90. If you know other opcodes, you can poke those in as well. If you are curious, the Intel CPU manuals list all the opcodes for each instruction. The steps are pretty simple and I will walk through them using the previous code snippet.
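The same poke can be modeled on a plain byte buffer to see what the Memory window edit amounts to. This is only a sketch: NopOut is a hypothetical helper, and the byte sequence in the usage note is a made-up PUSH/CALL/ADD pattern, not the column's snippet. The point it illustrates is that every byte of the instruction must be replaced, since a 5-byte CALL needs five NOPs and a partial overwrite would leave the leftover bytes to be decoded as garbage instructions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kNop = 0x90;  // opcode for the x86 NOP instruction

// Overwrites length bytes of code starting at offset with NOPs.
// Returns false and changes nothing if the range is outside the buffer.
bool NopOut(std::vector<std::uint8_t>& code, std::size_t offset,
            std::size_t length) {
    if (offset > code.size() || length > code.size() - offset) {
        return false;
    }
    std::fill(code.begin() + offset, code.begin() + offset + length, kNop);
    return true;
}
```

Given the bytes 68 2A 00 00 00 (PUSH imm32), E8 10 00 00 00 (CALL rel32), and 83 C4 04 (ADD ESP,4), calling NopOut(code, 5, 5) blanks only the CALL. The PUSH and the ADD still execute, so the stack stays balanced, but anything the skipped function would have returned in EAX is simply whatever EAX held before.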
The Debugger is a Tool
Wrap-up and Update
Da Tips!
Have a tricky issue dealing with bugs? Send your questions or bug slaying tips via email to John Robbins: john@jprobbins.com
From the June 1998 issue of Microsoft Systems Journal.