Let's make a small change to the program. Rewrite the DataNode::Insert()
method to look like the code below. It is the highlighted line that has changed:
Node * DataNode::Insert(ValueObject * theData)
{
int result = pData->Compare(*theData);
Node * pNode;
if ( result == IS_LARGER)
{
pNode = new DataNode(theData, this);
}
else
{
pNextNode = Insert(theData);
pNode = this;
}
return pNode;
}
In a large body of code, this looks fairly reasonable, and it will compile and link. Don't examine it too closely, after all, in a real debugging situation you would not know that this is the area of code with a problem. Run the program and enter the following values: 18, 9, 3, 0. It should run to completion without a problem. Now try it again with 18, 3, 9, 0. Oops! An exception from the operating system:
Apparently the program has a bug, and this bug depends on the order in which values are added. Let's use the debugger to find out what is going on. This is a nasty bug to find, because the exception it throws may stop the debugger altogether. If that happens, the best way to get started is a binary search of the code. How far into the code can we go before we see the bug?
Let's start by putting a breakpoint half way through main()
:
If we can get to this breakpoint without crashing, then we know that we can, at least, create an empty linked list. This should work, because it would seem that the bug has to do with processing the values. It is safer, however, to leave nothing to chance.
Try running the code to this point. Sure enough, we get to this line of code without a problem. Time to stop and think. If the two sets of values you entered were 18, 9, 3, 0 and 18, 3, 9, 0, the earliest problem may be with entering 3.
Let's try putting a breakpoint in the line in main()
where we attempt to enter the input value into the linked list:
theLinkedList.Insert(pData)
Clear all the breakpoints and set a breakpoint on this line. Now run the program. You will be prompted to enter the first value, 18. Enter 18, hit return and you are back in the code. Hit Run again and it runs just fine, prompting for the second value, 3. Enter it, hit Run and, once again, that works. Now enter the third value, 9 and run it. Aha! This time it crashes:
Okay, now we know that running this line of code with the value 9 will crash the program. It is time we tried to close in on the problem. Let's do this carefully, one step at a time. Remember — we know that trying to insert the value 9 into the list is causing the program to crash — what we need to do now is find out exactly where the problem is.
Stop debugging and restart with the breakpoint on the same line. Start the program running and enter 18 when prompted, as before. Repeat this for the 3 value, but when you have entered 9, then hit return to get back to the breakpoint, but do not start the program running again. Rather, you should press the Step Into button — you should reach this line in the LinkedList::Insert()
method:
pHeadNode->Insert.(pData);
Stop debugging, clear all the breakpoints and set a new one at this line. Run the program as before, entering each value as prompted until you reach the breakpoint again with the value 9. No problems so far, so try stepping in again. This time you should reach the line,
pNextNode = pNextNode->Insert(pData);
in the TopOfListNode::Insert()
method. Again, stop debugging, reset the breakpoint to this line and repeat the while process, until you return to this breakpoint with the value 9. Step into the code once more. This brings you to the DataNode::Insert()
method.
Note, this is an interesting piece of code, because it has a conditional. We are looking for a conditional, because we know this crash only happens with some values and not with others. At this point, let's clear all the breakpoints and set two new ones, as shown below:
Now, let's start over. Run the code again with the new breakpoints. The very first thing to notice is that we don't hit either breakpoint when we enter 18. Entering 3 brings us to the if
statement, and running that works fine (no crash). Entering 9 brings us to the else
clause. That makes sense from the logic of the program. Run it to see if it crashes.
It does not crash, but brings us back into the else. We appear to be in the next node. Run it again. Again we're back in the else clause the next node. Keep running it and notice that you are never exiting from this list. That is odd because there are only three values in the list, so where are we?
Take a look at the value of pNextNode
, shown in the variables window:
The value of this pointer isn't NULL
, so we can step into the Insert()
call. Stepping in seems to work — we're back at the top of the method — must be in the next node. Continue stepping though the code — something appears to be wrong. We're stepping through node after node, but the list isn't that long. Take a look at the call stack (you can open this from the Debug toolbar). This window shows the recent set of function calls:
This shows that from TopOfListNode::Insert()
, we called into DataNode::Insert()
, and from there into another instance of into DataNode::Insert()
and so on and so on. This list of recursive calls has begun to build up. Go back and take another look at that error message box that pops up when the program fails. "Unhandled Exception in SinglyLinkedList.exe:0xC00000FD:Stack Overflow." A stack overflow seems consistent with what we're seeing here.
You should begin to smell a rat. Look at the address of pNextNode
— each time you call Insert()
on the next node, it's own pNextNode
has the same address:
{bmc 0733.bmp
You appear to be calling Insert()
on yourself! You shouldn't be calling Insert()
on yourself, you should be calling it on the node pointed to by your pNextNode
member variable.
Finding this bug without a debugger would have been a Herculean task. With the debugger, it was fairly straightforward. There are a number of other windows worthy of exploration, including:
I strongly recommend spending a good bit of time exploring your debugger and coming to understand how it works before you need it.
The process of writing software starts with the initial idea, is nurtured through requirements analysis, given form in the design phase, and realized in the implementation. All of that will be worthless, however, if the product is not thoroughly tested. The rule which must never be violated is that schedule yields to quality, and features yield to schedule. Shipping a product which does not work will kill you in the market.
That said, let's be honest. The biggest danger to software is not a bug here or there, and certainly not a missing feature. It is that you will hold up the schedule for so long that you miss the market opportunity. We know of many wildly successful products (and even operating systems) which were not 100% bug-free before release, but which established their market niche even while the development team worked feverishly to get a second, fixed release out the door.
There are simple market realities that may force you to go out with a product with a few minor known bugs. If your core functionality works, if your product has something to offer which outweighs its limitations, then you may be better off fixing the minor problems in a second release. It isn't pretty, and few will say it out loud, but that is the reality of the market place.