Mike Blaszczak
Have you ever wondered why Mike writes our debugging column? Here's a hint: it's not because his programs never have bugs in them. In this installment, Mike shows you the Set Next Statement command, which makes your program do what you meant, instead of what you typed. Take a deep breath and take a look.
While writing software, I code bugs. I can't help it; I do it all the time. I think that, psychologically, I'm motivated to do it on a subconscious level: by ruining perfectly good code, I set myself up to have more work to do later, when the bug is finally noticed and reported. It's not just a matter of job security; I get a real kick from looking like a hero when I finally get off my duff to fix the bug in question.
There's a wide variety of mistakes that I make, and like a big-league pitcher, I try to change up my delivery to keep the Quality Assurance team guessing. My go-to pitch is occasionally screwing up a whole function. I'll code it so that calling it does no good, or doesn't do what I had intended. Another favorite is coding a bad if statement that tests the wrong condition or branches the wrong way. Heck, the controlling expression is either zero or non-zero, and I figure that I'm an outstanding developer just because I get it right a bit more than half of the time.
When working around problems that involve a broken flow of execution, I use the Set Next Statement command in the debugger. In my opinion, this is one of the most underused tools in the debugger. Using Set Next Statement requires an intimate knowledge of the way the compiler and the language work, so it's easy to understand why most folks don't like to play with it. My treatment of this command won't be for the faint of heart, though I strongly encourage you to learn the feature. While you'll spend a little time and sort through some confusion as you begin, you'll probably find that the command really helps get your debugging work done faster. Set Next Statement saves me so much time during a week of solid debugging, in fact, that I don't even come into the office on Fridays anymore.
What's a statement?
The C++ language allows programmers to develop applications that execute a step at a time. That step, as far as the language is concerned, is a single, atomic unit of work I'll call a statement. Maybe my definition of "statement" doesn't match what you'll find in the ANSI C++ Standard document, but it will serve us for this discussion of the Set Next Statement command. As far as I'm concerned, each statement in the C++ language is separated from the next by a semicolon. Some blocks of statements may be grouped together using curly brackets. Each of these lines of code is a single statement:
int n = 30;
printf("%d\n", n);
n++;
Statements have a smaller cousin, which is the expression. Some statements, like the printf() call I made in the example, contain multiple expressions. The for construct uses three expressions separated by semicolons:
for(expr1; expr2; expr3)
The expressions aren't statements by themselves, but they're dangerously close: they're evaluated and executed based on the rules that the language dictates. But they're all a part of a statement.
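To make those rules a little more concrete, here's a small sketch that records the order in which the three expressions of a for construct run. The helpers Init(), Test(), and Step() and the function TraceForLoop() are names I made up for illustration; each one appends a letter to a trace string and then does the job of the expression it stands in for.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers: each one records that it ran, then behaves like
// the expression it stands in for.
static std::string g_trace;

static int  Init()       { g_trace += 'I'; return 0; }
static bool Test(int n)  { g_trace += 'T'; return n < 2; }
static void Step(int& n) { g_trace += 'S'; ++n; }

// Runs a two-iteration loop and returns the order in which the pieces ran.
std::string TraceForLoop()
{
    g_trace.clear();
    for (int n = Init(); Test(n); Step(n))
        g_trace += 'B';     // B marks the loop body
    return g_trace;
}
```

Calling TraceForLoop() returns "ITBSTBST": the first expression runs once, the second runs before every pass through the body, and the third runs after each pass.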
As you step through your code in the debugger, you'll normally step through a line of your source code at a time. Each time you step, you execute all of the statements on a single line. There's no rule that says you can't have more than one statement on a line. It's just not a very commonly used practice, especially since most people feel that the extra statements make the code harder to read. Some developers like to use more than one statement on a single line to more conveniently initialize structures, though. For example, I often code the following idiom, as it makes for a more compact representation of code that's trivially tested:
WINDOWINFO info; memset(&info, 0, sizeof(info));
When you issue the step command in the debugger, you only watch statements that actually cause code to execute. And you only step a line at a time, no matter how many statements are on the line.
A statement implies lots of interesting things. After a statement is done executing, all of the side effects of the code have been realized. If a statement involved postfix and prefix operators, we know they've done their thing by the time the statement is over. Also, by the end of the statement, any temporary objects that the expression might have needed have been created, used, and destroyed. And being at the end of a statement also means that any assignments or calls are complete, too.
If the statement causes a scope change, the side effects might be very considerable. If the statement causes a function call, by the time the statement is done executing, any locals that were in the function (and parameters that are actually temporary objects) have been created and destroyed.
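Here's a minimal sketch of those guarantees, assuming a C++11 compiler. The Tracer type and the RunTwoStatements() function are hypothetical; Tracer exists only to log its own construction and destruction so you can see when the temporary goes away.

```cpp
#include <cassert>
#include <string>

static std::string g_log;   // records constructor and destructor calls

// Hypothetical type that logs its own lifetime.
struct Tracer
{
    Tracer()  { g_log += "ctor "; }
    ~Tracer() { g_log += "dtor "; }
    int Value() const { return 30; }
};

// Returns a log of what happened while two single statements executed.
std::string RunTwoStatements()
{
    g_log.clear();
    int n = Tracer().Value();   // temporary is created and destroyed here
    g_log += "next ";           // by now the temporary is already gone
    int m = n++;                // m receives 30; n is 31 once the statement ends
    g_log += std::to_string(m) + " " + std::to_string(n);
    return g_log;
}
```

RunTwoStatements() returns "ctor dtor next 30 31". The destructor entry lands before "next" because the temporary doesn't outlive the statement that created it, and the postfix increment has done its thing by the time the following statement looks at n.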
Whatever will you say next?
The next statement, then, is the statement you're about to execute. When you see your source in the debugger, the statement that's about to execute is indicated by a yellow arrow in the left-hand margin of your source file window. The next statement you're about to execute is dictated simply by the flow of control in your application; if you're in a loop, the last statement of the loop will pass control to the first statement of the loop. If you're evaluating a conditional, the outcome of the conditional expression will determine which statement executes next. The debugger follows the flow of your program's execution by watching the instruction pointer register in the CPU. The debugger uses the debug information from your executable image to determine which source code line is represented by the address in the instruction pointer and reflects that information to you as you debug.
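The address-to-line mapping can be pictured as a lookup in a sorted table. What follows is only a sketch of the idea and not a description of how the Visual C++ debugger is actually built; the LineEntry structure, the LineForAddress() function, and the addresses used with them are all invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical line-table entry: the code for `line` starts at `address`.
struct LineEntry
{
    std::uintptr_t address;
    int            line;
};

// Returns the source line whose code contains `ip`, or -1 if `ip` falls
// before the first entry. `table` must be sorted by address.
int LineForAddress(const std::vector<LineEntry>& table, std::uintptr_t ip)
{
    // Find the first entry that starts after ip; the one before it owns ip.
    auto it = std::upper_bound(
        table.begin(), table.end(), ip,
        [](std::uintptr_t value, const LineEntry& e) { return value < e.address; });
    if (it == table.begin())
        return -1;
    return std::prev(it)->line;
}
```

With a made-up table of { {0x1000, 54}, {0x1010, 55}, {0x1024, 56} }, an instruction pointer of 0x1013 maps to line 55. Moving the next statement amounts to the reverse trip: pick a line, find the address where its code starts, and put that address in the instruction pointer.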
You can coerce the debugger into executing a different statement next. If you've stopped execution while single-stepping through your code, or if you've just stopped at a breakpoint, you're welcome to use the Set Next Statement command to move the flow of execution at your whim.
Let's examine ways to use Set Next Statement; we can talk about the command's caveats as we examine some of its more interesting applications.
Testing with Set Next Statement
You might use the Set Next Statement command to force execution to a position in the code that doesn't match a conditional. For example, if you call malloc() and test the return value for NULL to detect an out-of-memory condition, you might have code that looks like this:
char* pstr = (char*) malloc(1024);
if (pstr == NULL)
{
    printf("You're out of memory.\n");
    FormatTheUsersDrive();
}
This fragment of code, while trivial and commonplace, isn't easy to test. You might have a utility that eats all of your system resources and forces things like malloc() into their failure case. But you doubtless have many, many code fragments like this in your program, and getting them to fail selectively so they might be individually tested isn't very convenient, regardless of what utility you're using. (You do trace through code you think is mostly working to perform testing, don't you? Tracing through code that works is far more educational than tracing through code that doesn't!)
Set Next Statement is a handy tool for making the code within the error-handling if statement work. Simply step over the malloc() call so that your next statement is the if. Since you know the if will evaluate to zero and the following block won't execute, simply force it to by clicking on the first line of code in the block and using the Set Next Statement command. You can make sure that the error message is printed and that your drive gets formatted.
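For comparison, here's roughly what it takes to force the same branch from code instead of from the debugger. This is a sketch under my own assumptions: TestableAlloc(), g_failNextAlloc, and AllocateBuffer() are hypothetical names, and your program would have to route its allocations through such a wrapper for the trick to work at all.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical test hook: when set, the next allocation reports failure.
static bool g_failNextAlloc = false;

// Stand-in for malloc() that can be told to fail exactly once.
void* TestableAlloc(std::size_t size)
{
    if (g_failNextAlloc)
    {
        g_failNextAlloc = false;
        return NULL;
    }
    return std::malloc(size);
}

// Returns true if the buffer was allocated, false if the error path ran.
bool AllocateBuffer()
{
    char* pstr = (char*) TestableAlloc(1024);
    if (pstr == NULL)
    {
        // This is the block you would otherwise reach with Set Next Statement.
        return false;
    }
    std::free(pstr);
    return true;
}
```

Setting g_failNextAlloc to true before calling AllocateBuffer() sends execution through the error block once, and the next call succeeds again. That takes a wrapper, a flag, and a rebuild; with Set Next Statement you get the same coverage with a click and no changes to the program.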
Sherman, set the wayback machine for line 20
The debugger in Visual C++, unfortunately, doesn't support an "undo" mechanism. Once you've stepped past a line of code, you've stepped past the line of code, and that's that. You can, however, use the Set Next Statement command to get back to a line of code that might let you watch what you missed. Such an operation is a bit dicey, though: you might end up causing some interesting and unwanted side effects by taking such an action.
Let's consider a strange little program so we have some fodder for our discussion. My sample program simply allocates some memory, puts an interesting string into it, and then calls a function to reverse the characters in the string while making a new copy of the string.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char* ReverseAndCopy(const char* pstrInput)
{
    if (pstrInput == NULL)
        return NULL;
    size_t nLength = strlen(pstrInput);
    const char* pstrSource = pstrInput;
    char* pstrDuplicate = (char*) malloc(nLength + 1);
    if (pstrDuplicate == NULL)
        return NULL;
    // Start at the terminator's slot and fill the copy from back to front.
    char* pstrTarget = pstrDuplicate + nLength;
    *pstrTarget = '\0';
    while (*pstrSource != '\0')
        *--pstrTarget = *pstrSource++;
    return pstrDuplicate;
}

int main()
{
    char* pstrBackwards = ReverseAndCopy(".tseb si yekcoH");
    printf("%s\n", pstrBackwards);
    free(pstrBackwards);
    return 0;
}
Say you load this program (not out loud; people will stare) and run it. But you get overanxious and step over the call into ReverseAndCopy(). Unfortunately, that's where all of the action (and probably a bug or two!) is, but you're no longer able to trace into the function since the next line to execute is actually the call to printf().
You can use Set Next Statement to move the instruction pointer back to the call to ReverseAndCopy(). Again, simply click on the line with the call, and use the Set Next Statement command to move the instruction pointer to the line that has the cursor. You can then use the Step Into command to trace into the call to ReverseAndCopy(), and this time you can carefully watch what happens while the function executes.
But there's a side effect of using Set Next Statement in this case, and you'll notice it quickly if you let the program run to completion. You'll leak some memory! Set Next Statement does exactly what its name says: it moves execution to a given line in your program. It does not undo any of the effects of the code you're skipping back over, and it doesn't assure that nothing bad will happen when code executes twice. The first time your program called ReverseAndCopy(), it allocated some memory. The second time ReverseAndCopy() executed, you allocated some more memory and overwrote the pointer you'd been using to handle the first block of memory. Since the pointer value was simply overwritten, the free() statement executed later on only releases one of the two blocks of memory. And that means the other block ends up leaking. Even Mr. Peabody can't help you.
The behavior exhibited in such a scenario isn't a bug: The debugger is doing what you ask, and your program is behaving accordingly. (Of course, while it's not a bug, the behavior leaves room for a feature request: a smarter, semantically aware undo command for the debugger. Such a feature request, of course, would require many, many man-years of work!)
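If you'd like to see the leak without firing up the debugger, the following sketch imitates the double call in plain code. Everything in it is a stand-in that I've assumed for illustration: CountedAlloc() and CountedFree() are hypothetical wrappers that count outstanding blocks, and ReverseCopyCounted() is a simplified cousin of ReverseAndCopy() that allocates through them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

static int g_outstanding = 0;   // blocks allocated but not yet freed

// Hypothetical counting wrappers around malloc() and free().
static void* CountedAlloc(std::size_t size)
{
    void* p = std::malloc(size);
    if (p != NULL)
        ++g_outstanding;
    return p;
}

static void CountedFree(void* p)
{
    if (p != NULL)
        --g_outstanding;
    std::free(p);
}

// Simplified stand-in for ReverseAndCopy(): copies the string backwards.
static char* ReverseCopyCounted(const char* pstrInput)
{
    std::size_t nLength = std::strlen(pstrInput);
    char* pstrDuplicate = (char*) CountedAlloc(nLength + 1);
    if (pstrDuplicate == NULL)
        return NULL;
    for (std::size_t i = 0; i < nLength; ++i)
        pstrDuplicate[i] = pstrInput[nLength - 1 - i];
    pstrDuplicate[nLength] = '\0';
    return pstrDuplicate;
}

// Mimics stepping over the call and then using Set Next Statement to run
// it again. Returns the number of blocks still allocated at the end.
int SimulateRepeatedCall()
{
    g_outstanding = 0;
    char* pstrBackwards = ReverseCopyCounted(".tseb si yekcoH");
    pstrBackwards = ReverseCopyCounted(".tseb si yekcoH");  // first pointer lost
    CountedFree(pstrBackwards);
    return g_outstanding;
}
```

As long as both allocations succeed, SimulateRepeatedCall() returns 1. Two blocks were allocated, only the second was freed, and nothing in the program still holds the address of the first.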
When not to use Set Next Statement
Because of the semantics of the C++ language, there are a few situations where the Set Next Statement command is dangerous. Those largely involve situations where you expect to keep executing your application after making great leaps across scope boundaries. If we consider the preceding application, for example, you might be tempted to trace into main() and then use Set Next Statement to jump straight into the implementation of the ReverseAndCopy() function. That's not a good idea, for a couple of obvious reasons. First, you're skipping the call to the function. When the compiler emits code to call into the function, it sets up a stack frame for local variables and might create temporary objects in order to handle parameters and return values. Second, because the call itself never ran, no return address was pushed, so the function has nowhere sensible to return to when it finishes. Since Set Next Statement doesn't know or implement any of these rules, you'll find that jumping across function boundaries ends up causing terribly unpredictable results.
If you know assembly language, you might find it worthwhile to open the debugger's Disassembly window. There, you'll find that you can set the next statement to an exact instruction rather than a single source code statement. If you have the time and knowledge, you might find that the approach can be pretty rewarding. On the other hand, even if you don't know assembly and aren't particularly sure of the semantics surrounding the constructs you're debugging, you might find applications of Set Next Statement that cross construct scope to be a valuable tool in testing a hunch while you're debugging. What if executing a statement or two out of turn ends up being a solution to your problem? If you've made such a hypothesis and the statements in question are nearby, you might be able to use Set Next Statement to prove your theory: Just use Set Next Statement to jump to the interesting expressions and see whether they work.
When aggressively using this debugger feature, you might find that your results are unpredictable, particularly if you're breaking the code you've written by causing it to execute badly out of order or without setting up the context it needs. If you've used Set Next Statement, I'd strongly encourage you to doubt yourself before doubting the tools. By skipping around in the execution order of your application, you're running a very great risk of breaking the code as it was meant to be run. If you think you've found a bug in your application and you've found that bug in a debugging session where you've applied Set Next Statement, you should try to confirm the bug in a normal run of the application. The cause of the problem is far more likely to be that you've left something uninitialized or undone by causing execution to skip from place to place, particularly if you're not using the command judiciously.
The Set Next Article command
That covers the Set Next Statement command. Again, let me underscore that the best advice I can leave you with is to be courageous with your investigation of the command. Its use is a little scary at first. But as your knowledge and understanding of the library grow, you'll find little trouble in using the command when it can help the most. The only way to learn is to try and make mistakes.
We've covered a great deal of territory in this column. If there's some debugging technique that you've always been curious about and I've not yet covered for you, please don't hesitate to write to me and suggest it for a future column. Until then, I'll work on some other column ideas and write more about debugging later.
Mike Blaszczak is a software design engineer for the Home Automation team at Microsoft. Mike, a Microsoft employee for more than seven years, enjoys long-distance motorcycling, ice hockey, and writing. He has written for several magazines, and the fourth edition of his book about Windows programming with MFC will be published soon by WROX Press. mikeblas@nwlink.com.