Stunt Debugging: Understanding Debug Builds, Part 2

Mike Blaszczak

Last month, Mike showed you the differences between a stock release build and a stock debug build, using the settings provided by AppWizard (see “Stunt Debugging: Understanding Debug Builds” in the December 1997 issue). He saved a few aspects of the discussion for this month: optimizer and linker settings, and versions of the libraries you’re likely to be using.

In a debug build, the compiler is generally asked to perform no optimizations whatsoever, while in a release build it’s set to a very aggressive optimization mode. The optimizer can be directed to favor one of several different goals as it generates code. In a project produced by AppWizard, the compiler is asked to do no optimization at all in a debug build; in a release build, it’s asked to optimize code for speed.

While these settings sound obvious, they have some subtler facets that deserve explanation. Optimizations are turned off in debug builds for several reasons. First, the compiler takes longer to run when the optimizer is engaged; the optimizer is essentially an extra analysis pass over the code the compiler is about to emit. That analysis takes a considerable amount of time, especially for complicated source code. Because developers (like you and me) perform debug builds repeatedly as they make changes and incrementally test an application, build time is of the essence.

The optimizer is also disabled in debug builds because it’s capable of significantly rearranging your code. As long as the optimizer doesn’t actually break your code, it can do whatever it needs to do in order to make your code run faster. Because the compiler in Visual C++ 5.0 is incredibly advanced, it might do some very shocking things. I once talked with a developer who reached the conclusion that his CWinApp::InitInstance() function wasn’t executed in release builds and was executed in debug builds of his application. I asked him how he’d reached this conclusion, and he said that he’d added this code to his InitInstance() implementation:

int nNumerator = 35;
int nDenominator = 0;
int nResult = nNumerator / nDenominator;

When he made a debug build with the optimizer off, his program immediately raised a divide-by-zero exception. When he made a release build, the program didn’t raise the exception. The code above obviously does perform a divide by zero. But since nothing ever used the result of that code, the optimizer removed it: it realized that nResult was never referenced, so it never emitted code for the division, and it then removed the now-unused nNumerator and nDenominator variables as well. Unfortunately, this developer didn’t realize there were much better ways of showing whether or not InitInstance() was being executed (setting a breakpoint, using a temporary diagnostic message box, or looking for other evidence of the function’s effects), or that the lack of an exception message was actually a side effect of the optimizer.

Debug builds are built without optimization simply because debugging, even at the source code level provided by the IDE debugger, is much easier when the underlying executable code matches the source code exactly and consistently. If you make an optimized build, the compiler might rearrange the machine code that’s actually implementing your program. If you debug through a release build (by adding debug symbols as I showed you last month), you’ll find many situations where stepping to the next statement takes you to a statement that you didn’t expect. You might even find that the compiler has completely neglected to emit any code for a few statements in your program. As in the previous example, the compiler might detect that the code is completely unnecessary and omit it. Or, the compiler might realize that your code can be combined with the side effect of another statement and put all of the results on one line.

In a debug build, the compiler is asked to emit debugging information. Debug information is huge: It describes many subtle aspects of your application’s use of the language. Debug information includes descriptions (including names) of all of the data types and structures your application uses. (And if you’ve written a Windows application with or without MFC, you’ll find that you’re using an alarming number of symbols and data structures!) Debug information also connects code that lives at a particular address with a particular line of source code in a particular source file in your application.

The table that expresses this relationship lets the debugger know which statement in your source matches the breakpoint (or single-step stopping point) it has most recently hit. The table can’t track relationships at any level more granular than a line of code, however. The compiler’s optimizer, particularly while performing global optimizations, is very likely to combine many lines of code into a few more efficient instructions, or to rearrange the order in which a few lines execute, in order to deliver the improvements you’ve requested. Because the debugging information can’t efficiently express these transformations, you’re left to your own devices when debugging code that’s been optimized; things might not be as linear and predictable as you’d expect.

A typical release build will use the “optimize for speed” option provided by the compiler. This option, set in the “Optimizations” category of the “C/C++” tab in the Project Settings dialog box for your project, is tantamount to giving the compiler the /O2 option on the command line. /O2, in turn, turns out to be shorthand for these options: /Og (global optimizations), /Oi (intrinsic functions), /Ot (favor fast code over small code), /Oy (frame-pointer omission), /Ob1 (expand functions marked inline), /Gs (stack probe control), /Gf (string pooling), and /Gy (function-level linking).

For debug builds, AppWizard generates a project that uses surprisingly few code generation options when compared to a release build: essentially just /Od, which disables optimization outright. As you can see, the debug build has the compiler do very, very little work to produce more efficient code.
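In terms of raw command lines, the difference boils down to something like the following sketch (the exact flag set varies with the project type, so treat these as illustrative):

```shell
# release: aggressive speed optimization, assertions and TRACE compiled out
cl /c /O2 /DNDEBUG myapp.cpp

# debug: optimization disabled, full debug info, debug diagnostics enabled
cl /c /Od /Zi /D_DEBUG myapp.cpp
```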

Optimizer bugs

I drive aggressively. I have a fast car, and my other car is even faster. I like to go skidding around tracks and zipping over twisty roads in the mountains. Sometimes, I spin out. It’s my own fault: I can’t blame the car. While I fret when one of my beasts is parked in the lot at the local computer store, I realize that the greatest danger my car faces every day of its life is sitting right in the driver’s seat: It’s me. While a vandal might smash a window to take my Alice in Chains Unplugged CD, or an inconsiderate driver might put a big scratch in my baby’s skin, it’s far more likely that I’ll miss a shift, tromp on the accelerator, lighten up the back wheels, and spin out.

In the same way, it’s possible for the optimizer to add bugs to your code. Sometimes the optimizer will make an optimization that’s actually bad and causes your program to fail when you run it. Sometimes the failures are subtle: perhaps the results of a computation made by a loop are off. In other instances, the optimizer might emit bad code that causes your application to crash outright while running.

Optimizer bugs are extremely rare, and I can’t underscore this enough. They certainly do exist, but I’d say that only one or two of every thousand crashing problems I answer for customers have anything to do with a code generation problem. Odds are that you’ve coded a bug into your application, and you need to figure out why the bug appears only in one build type or the other.

It’s my intent to use this article as a vehicle to provide the background that you need to make that diagnosis. But the least likely suspect on your list of possible causes should be a compiler bug. The greatest danger to your code isn’t the optimizer—the greatest danger to your program is sitting right in front of your own monitor. It’s you!

Side effects of optimization

Since the optimizer ends up examining your code very carefully during an optimized build, it turns out that release builds might end up generating an extra warning or two. For example, you might write code like this:

#include <stdio.h>

int main()
{
   int n;
   int x;
   int y;

   for (n = 0, x = 0; n < 100; n++)
      x += n;
   printf("%d\n", x);
   return 0;
}

In a debug build, the above code compiles cleanly. But in a release build, with optimizations turned on, the compiler’s extra flow analysis notices that the variable y is declared but never initialized or used, and it warns you about it. The same analysis will also complain about variables that are used before they’re initialized, among a few other situations.

Build changes managed by the linker

The linker is responsible for finding out which libraries an application will use, possibly by referring to special records in the application’s OBJ files that tell it a particular library should be referenced by default. While the linker is thus responsible for making sure that the debug or non-debug library variant is used in the build, as the user requests, it doesn’t actually manage the setting that causes the behavior. The compiler does that.

But it’s possible that the linker is the tool that notices there’s a problem. For example, you might write an application that’s very, very large. To manage that application’s development, you could set up your build process so that some modules are built with debug information and some aren’t. But you shouldn’t change the /M option that’s given to the compiler for different source modules that are to be linked for use within the same executable module.

Keeping the /M family of options homogeneous across your module’s build is paramount, because not doing so can cause different modules to reference different libraries at runtime. Different libraries might have slightly different implementations of the same function, and those differing implementations might not work well together. I’ll cover those details in the next section; suffice it to say for now that the linker can notice such mismatches in most cases, and it will issue a warning message alerting you to the problem.
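As a sketch of the kind of mismatch involved (the flags are real, but the exact warning text here is from memory and may vary from one linker version to another):

```shell
cl /c /MDd moduleA.cpp    # debug DLL runtime (references MSVCRTD.LIB)
cl /c /MT  moduleB.cpp    # static release runtime (references LIBCMT.LIB)
link moduleA.obj moduleB.obj
# the linker notices the conflicting default libraries and warns, roughly:
#   LNK4098: defaultlib "LIBCMT" conflicts with use of other libs;
#            use /NODEFAULTLIB:library
```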

The linker’s other build-mode responsibility involves debug information. The compiler emits debug information for your source code—if you so request—while the compilation is happening. That information is made a part of the OBJ file emitted by the compiler. The linker, if you ask it to, will coalesce that information from all of the OBJ files and libraries in your application and format it correctly so that it’s available to the debugger at runtime. To put a finer point on it: The linker will be told to gather and format debug information in a debug build, but it will be asked to discard that information in a release build of the program.

The linker’s generation of debug symbols is governed by the /DEBUG, /PDBTYPE, and /PDB options. These options can be set with the “Generate debug info” check box in the “General” category on the “Link” tab in the “Project Settings” dialog box for your project. The “Debug” category on the same tab can be used to set the format and symbol collection rules the linker will use.

The linker’s main role is to peruse the OBJ files and libraries offered to it and emit an executable image that takes only what’s necessary from those images. The linker does a swell job of this, although the operation can take much longer for larger libraries and bigger applications. Again, since the edit-debug-test cycle should be able to iterate as quickly as possible in the interests of developer (that’s you, pal!) productivity, the linker doesn’t do such an aggressive job of finding and eliminating unused code in a debug build. That lack of effort contributes a great deal to the swollen executable size that most debug applications realize.

In a release build, the compiler will emit a separate, individually linkable record in the OBJ file for each function it compiles. The linker will aggressively hunt down unused code and eliminate it from the resulting image on a function-by-function level. Even if you use all the functions of a class but one, the linker will detect that the one unused function is never referenced and will eliminate it from your executable. But because each and every function in the application must be checked, the algorithm the linker uses to detect dead code has quite a bit of work to do.

On the other hand, the linker does no such analysis in a debug build and won’t aggressively pursue the removal of dead code from the application’s executable image. This makes the application somewhat bloated when compared to the release build, but it can also leave the resulting program artificially dependent on an extra library or two. For example, if you link your application to a library containing code that in turn references a DLL, a debug build of the application is likely to reference that DLL even if it never uses the features of the library that need it. If the linker were boldly removing dormant code, that dependency would disappear. As such, without care and attention, it’s possible that a debug build of your application will require some additional files for a proper installation compared to a release build of the same code.

The linker’s dead code elimination can be activated by the /OPT:REF option, or turned off by the /OPT:NOREF option. The flag isn’t directly available in the IDE because the option is automatically set to /OPT:NOREF when the /DEBUG flag is given, and automatically defaulted to /OPT:REF when the /DEBUG flag isn’t present.
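The interplay between function-level linking and dead code elimination looks roughly like this on the command line (a sketch, not a complete build):

```shell
cl /c /Gy myapp.cpp               # package each function as its own record
link /OPT:REF myapp.obj           # release-style: discard unreferenced functions
link /DEBUG /OPT:NOREF myapp.obj  # debug-style: keep everything, gather symbols
```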

Changes in the libraries

The libraries you might use from the Visual C++ package—MFC and the runtimes—are built with the same linker, librarian, and compiler that you might use when building your own applications. As such, the libraries are made available in both debug and release builds, and they bear all of the side effects that I’ve talked about here and last month. If you link with the release version of MFC, for example, you can expect that it doesn’t have any ASSERT() tests compiled into it and won’t have any TRACE() messages, either. The most important fact to remember about the libraries you use when building your application is that they’re different between the release build and debug builds for all of the reasons I’ve discussed so far.

As I mentioned in the preceding section, it turns out that debug builds and non-debug builds of a library might implement the same function in a slightly different way. The most notable example of this disparity is in the memory manager. In the release builds of the runtime libraries, the memory manager does very little checking for bad memory blocks, corrupted heap management data structures, or overrun buffers. On the other hand, the debug build of the library does extensive checking for such situations—I outlined many of the functions and messages involved in that process back in the October 1997 issue (see “Stunt Debugging: Diagnosing Memory Leaks”).

While the libraries do a swell job of tracking memory leaks, the implementation of the leak-checking memory manager isn’t compatible with that of the regular memory manager. Those functions are available only in debug builds of the library. Furthermore, the release build of the library can’t destroy memory blocks allocated by the debug build, and vice versa. The data that each implementation associates with each allocated block of memory is different, depending on the build of the library, and that’s the source of the incompatibility.

The compatibility issue can be further complicated if the size of library-provided classes changes between debug and release modes. Many MFC implementation classes, for example, are smaller in release mode because, like the runtime heap manager, they don’t maintain any data to perform sanity checks during execution.

When the linker is used against the libraries, it can link the library’s functions individually only if the library was built to enable that granularity. Again, to improve performance in the development environment, MFC and the C runtime libraries are built with function-level linking turned off in their debug versions, and turned on in their release counterparts. You can’t change this unless you rebuild the libraries yourself.

The material I’ve presented about the optimizer, libraries, and linker leads to these tips for tackling release-only problems:

- Build and test a release build regularly, not just when you’re about to ship.
- Pay attention to the extra warnings an optimized build produces; a variable used before it’s initialized is a classic source of release-only failures.
- Keep the /M runtime-library option consistent across every module you link into a single executable, and heed the linker’s mismatch warnings.
- Never let memory allocated by one build of the runtime be freed by the other, and remember that some library classes change size between builds.
- Suspect your own code long before you suspect the optimizer; genuine code generation bugs are extremely rare.

Remember, the best way to find a bug that manifests only in a release build is to do release builds regularly, so that you can catch the bug while your application is still being built. You’ll have to look through less code to find your bug, and you’ll have more time to do it. Now that you understand all the things that differ between builds, and how to configure your own builds, you’ll have a much easier time tracking down those build-dependent anomalies.

Mike Blaszczak is the development lead for the MFC&T team and sets the technical direction of the libraries while still participating in their implementation. Mike, a Microsoft employee for more than five years, enjoys long-distance motorcycling, ice hockey, and writing. He’s written for several magazines, and the third edition of his book about Windows programming with MFC was recently published by WROX Press. mikeblas@nwlink.com.