April 1999

Bugslayer

Code for this article: April99BugSlayer.exe (146KB)

John Robbins is a software engineer at NuMega Technologies Inc. who specializes in debuggers. He can be reached at john@jprobbins.com.

The hardest thing for me—and for 99.46 percent of the other engineers out there—is effective testing. Not only do most of us consider it the least enjoyable of our tasks, it is usually the first thing cut when schedule pressures start, so we have even less time to get it right.

    In this month's column, I want to concentrate on the first half of the testing equation, the developer's unit tests. The unit test means different things to different people, but I am talking about the testing that goes on as you are developing your code and before you report that a feature is completed. Some developers may think a unit test simply consists of getting the code to compile, but the unit test is probably the most important aspect of the overall quality of your project.

    In a future column, I will talk about Quality Assurance (QA), the second half of testing. While teamwide QA has some things in common across all environments, it also varies widely between teams—even in the same company. Before I write that column, I would really like to hear what aspects you would like me to cover. Better yet, if you have a team environment that really works (or really does not work), please take a few minutes to summarize your environment in email. I promise to protect the innocent as well as the guilty!

    First, I will discuss a few of the techniques I have learned over the years that made my unit testing more effective. You might not agree with them all, but I can assure you that they work for me. After the techniques, it is time for some code. This is MSJ after all, and what's MSJ without code? One of the things that I always found lacking was a relatively easy-to-use testing tool to automate my application so I can easily reproduce my tests. Since I am Mr. Originality when it comes to naming the utilities I develop for this column, I am calling the tool Tester.

Effective Unit Testing Prerequisites

    I want to break down my concepts of effective unit testing into four distinct areas: prerequisites, before coding, during coding, and after coding. The key is to tackle the unit tests as part of your normal development. While most people do a pretty good job, they sometimes start too late in the cycle to make their tests as effective as possible. To me, unit test code is just as much a deliverable as the main code itself.

    The first prerequisite consists of two pieces of software that are vital to any effective testing effort: version control and a bug tracking system. While I hope I am preaching to the converted, unfortunately I keep running into teams that have not yet started using these tools. I guess you have to know where you have been to know where you are going, and these two tools are the only clean way to learn that lesson. Additionally, they are the only effective way to judge whether you are getting any results from changes you implement to your development cycle. If you adopt some of my suggestions, you should see a rise in your bug counts.

    When you bring a new developer to your team, these tools can pay for themselves in a single day. While I am positive that all development shops have extremely detailed and completely up-to-date design documents, a few might not. When the new developer starts, have him sit down with the version control and bug tracking software and start working his way through the changes. Good design documents would be better, but at least this way the new developer can get an idea of the trouble spots in the code and how the code evolved. If your bug tracking system does not automatically keep in sync with the version control system, make sure your version control check-in comments always include the bug number when checking in the fix.

    The second prerequisite actually involves bug tracking. The bug tracking system makes an excellent reminder and to-do list, especially as you are in the process of developing the code. While some developers like to keep notes and to-do lists in notebooks, a notebook might not be the best place for them. Many times the necessary information gets lost between random hexadecimal number streams from a debugging session and the pages upon pages of doodling that you used to keep yourself awake in the last management status meeting. By putting these notes into the bug tracking system and assigning them to yourself, you consolidate them in one place and they're easier to find.

    Additionally, while you probably like to think that you "own" the code you work on, it really belongs to the team. With your to-do list in the bug tracking system, other team members who have to interface with your code can check your list to see what you have or have not done.

    One other benefit to including to-dos and notes in the bug tracking system is that you have fewer things falling through the cracks at the last minute because you forgot about a problem or a feature. I find myself always running the bug tracking system so that I can quickly jot down key notes and to-dos right when I think about them. I like to have the lowest-priority bug code in the system available for just notes and to-dos. This makes it easier to keep them separate from the more important things, but at the same time you can quickly raise the priority on the note or to-do.

Before Coding

    Before jumping into coding, there are two ideas that you should try to keep in mind. The first is to start writing your unit tests as soon as you start writing your code because they need to be developed in parallel. To start figuring out the interface for a module, write the stub functions for that module and immediately write a test program/harness to call those interfaces. As you add a piece of functionality, you add new test cases to the test harness. This way you can test each incremental change in isolation and the test harness development is spread out over the development cycle. If you leave all the test harness development until after you have implemented the main code, you generally do not have enough time to do a good job on the harness, so the resulting tests are less thorough and less effective.
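
    As a rough illustration of growing the harness alongside the code, here is a minimal sketch. It is written in Python only for brevity, and the module, function names, and return values are all hypothetical; none of them come from this month's source code. The stub defines the interface first, and the harness is a table of cases that gains rows as each piece of functionality is filled in.

```python
# Hypothetical module under development. Names are illustrative only.

def open_document(path):
    """Stub: pretend every non-empty path opens successfully."""
    return bool(path)


def count_words(text):
    """First real piece of functionality, added after the stub stage."""
    return len(text.split())


# The harness is written alongside the code. Each new feature adds rows.
CASES = [
    ("open accepts a path", lambda: open_document("readme.txt") is True),
    ("open rejects an empty path", lambda: open_document("") is False),
    ("count_words on two words", lambda: count_words("unit test") == 2),
    ("count_words on empty text", lambda: count_words("") == 0),
]


def run_harness(cases):
    """Run every case and return the names of the ones that failed."""
    return [name for name, check in cases if not check()]


if __name__ == "__main__":
    print("failures:", run_harness(CASES))
```

    Because each case is one row, rerunning the whole table after every small change costs almost nothing.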

    Second, you have to think about how you are going to test something before you write it. Try not to fall into the trap of thinking that your code requires the entire application before you can test it. If you start realizing that you're approaching this pitfall, you need to step back and break your testing down. I realize that sometimes you must rely on important functionality from another developer to compile your code. In those cases, your test code will consist of stubs for the interfaces that you can compile against. At a minimum, have the interfaces hardcoded to return appropriate data so you can compile and run your code.
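
    A minimal sketch of stubbing out someone else's interface follows, again in Python for brevity. The quote feed class, its method, the symbols, and the prices are all made up for illustration. The point is only that the code under test depends on an interface, so a stub with hardcoded data is enough to compile and run it.

```python
class StubQuoteFeed:
    """Stand-in for another developer's unfinished component (hypothetical)."""

    def latest_price(self, symbol):
        # Hardcoded data: just enough for the calling code to run.
        prices = {"AAA": 10.0, "BBB": 2.5}
        return prices[symbol]


def portfolio_value(feed, holdings):
    """Code under test. It relies only on the feed's interface."""
    return sum(feed.latest_price(symbol) * shares
               for symbol, shares in holdings.items())


if __name__ == "__main__":
    print(portfolio_value(StubQuoteFeed(), {"AAA": 3, "BBB": 4}))
```

    When the real component arrives, it replaces the stub without any change to the code under test or to the test cases.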

    One benefit to ensuring that your design is testable is that you quickly run into things that you can fix to make your code more reusable and extensible. Since reusability is the Holy Grail of software, anything you can do to improve it is great.

During Coding

    While you are coding, you should be running your unit tests all the time. I seem to think in an isolated functionality unit of about 50 lines of code. Each time I add or change a feature, I rerun the unit test to see if I broke anything. I do not like surprises, so I try to keep them to a minimum.

    The key to the most effective unit tests can be summed up in two words: code coverage. If you take nothing else away from this column except those two words, I will consider it a success. Code coverage means executing as many lines of your code as possible. A line not executed is a line waiting to crash.
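
    To make "lines executed" concrete, here is a small sketch that records which lines of one function run for a given test input. It uses Python's standard sys.settrace hook purely as an illustration; it is not the debugger technique from the June 1998 column, and the function under test is hypothetical. Lines are numbered relative to the def line.

```python
import sys


def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


def lines_executed(func, *args):
    """Call func(*args) and return the set of body lines that ran."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace any other function
        # Record body lines only; the def line itself is skipped.
        if event == "line" and frame.f_lineno > code.co_firstlineno:
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return seen


if __name__ == "__main__":
    covered = set()
    for value in (5, -1):
        covered |= lines_executed(classify, value)
    print("body lines executed:", sorted(covered))
```

    With only the inputs 5 and -1, the line that returns "zero" never runs, which is exactly the kind of gap a coverage number exposes.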

    Personally, I do not check in any code to the master sources until I have executed at least 85 to 90 percent of the lines in my code. I know some of you are groaning right now. Yes, getting good code coverage is not easy. Sometimes you need to do far more testing than you ever considered, and it can take a while. However, your job is to write solid code, and, in my opinion, code coverage is about the only way you will get it during the unit test phase.

    Nothing is worse than having your QA staff sitting on their hands when they are stuck with builds that crash. If you get 90 percent code coverage in the unit test, they can spend their time testing your application on different platforms and ensuring that the interfaces between subsystems work. QA's job is to test the product as a whole and to sign off on the quality as a whole. Your job is to test a unit and to sign off on the quality of that unit. When both sides do their jobs, the result is a high-quality product.

    Granted, I do not expect that developers will be able to test on each of the four different Win32®-based operating systems that customers may be using, or even handle all the possible cases. However, if engineers can get 90 percent coverage on at least one operating system, then 66 percent of the battle for quality is won. Several third-party products are available to help you with code coverage. Additionally, in the June 1998 Bugslayer, I discussed how your debugger makes a free code coverage tool.

    If you follow my recommendations, you will have some very effective unit tests at the end of your development—but the work does not stop there. If you have ever looked at the BugslayerUtil.DLL code that is distributed each time I write a column, you will see a directory called Tests under the main source code directory. That directory holds my unit tests. I like to keep my unit tests as part of the codebase because it is easier for others to find them. In addition, when I make a change to the source code, I can easily test to see if I broke anything. I highly recommend that you check your tests into your master sources. Finally, while most unit tests are self-explanatory, make sure that you document any key assumptions so that others do not waste their time wrestling with your tests.

The Bane of Unit Testing: User Interfaces

    Until now, I have been discussing the unit test as a nebulous magic thing that does a whole bunch of work for you. It really falls into two distinct parts: internal tests and UI tests. The internal tests are generally much easier to do than UI tests.

    If you want to see an example of one of my test harnesses for internals code, look in the Bugslayer\SourceCode\TInputHlp\Tests directory of this month's source code distribution. TInputHlp.DLL is part of the Tester utility and handles parsing an input string and playing keystrokes to another application. I will go into much more detail later in the column. In the Tests directory are two test harnesses: PPKTest tests some of the input string parsing, and PlayKeys tests the actual external interfaces. I set up both tests to read their input from a file. This way I can just add input files that specify the parameters to ensure that the input string parsing works on all sorts of different cases.
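
    The idea of driving a harness from an input file can be sketched as follows. This is not the format that PPKTest or PlayKeys actually reads; the "input => expected" layout, the function names, and the sample cases are assumptions made up for the illustration, and Python is used only for brevity.

```python
import io


def parse_case(line):
    """Split 'input => expected'; return None for blank lines and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    given, expected = line.split("=>")
    return given.strip(), expected.strip()


def run_cases(stream, function_under_test):
    """Feed each case in the stream to the function and collect mismatches."""
    failures = []
    for number, line in enumerate(stream, start=1):
        case = parse_case(line)
        if case is None:
            continue
        given, expected = case
        actual = function_under_test(given)
        if actual != expected:
            failures.append((number, given, expected, actual))
    return failures


# A real harness would open a file; StringIO stands in for one here.
SAMPLE_CASES = io.StringIO(
    "# input => expected\n"
    "abc => ABC\n"
    "\n"
    "Tester => TESTER\n"
)

if __name__ == "__main__":
    print("failures:", run_cases(SAMPLE_CASES, str.upper))
```

    Adding a new case is then a one-line edit to a data file rather than a change to the harness itself.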

    UI developers have a much tougher time developing automated unit tests. Without a separate tool that will automate mouse and keyboard input, their tests just consist of a list of steps that affect the test. This is extremely tedious and very error-prone because it is manual testing. If possible, you should try to automate your unit testing. Unfortunately, the Recorder application that used to ship with Windows® 3.0 and 3.1 no longer comes with the 32-bit operating system versions. For those of you newer to Windows, Recorder recorded your mouse and keyboard interactions into a file so that you could play them back. While there are third-party products that will automate your application and a whole lot more, I needed something that was a little more lightweight and geared toward the development engineer. Thus, my Tester application was born.

    When I first started thinking about creating an automation utility, I spent some time considering exactly what I expected from such a tool. At first I thought about doing something like the old Recorder application. Back in the Windows 3.0 days, I used to have a complete set of .REC files to drive my tests. However, the big problem was that there was no way to do conditional tests. If there was a problem with my application, Recorder just went along its merry way, playing the keystrokes and mouse clicks with abandon. One time I even wiped out half of my System directory with Recorder. So my new automation tool simply had to have some sort of if...then...else construct.

    To get conditional constructs, I was going to need some sort of language. It would have been interesting to develop my own testing language, but I don't think anyone wants me to spend the next two years worth of Bugslayer columns discussing the gyrations of designing a language and dealing with YACC and FLEX. It took all of two seconds to realize that I should do the unique portions of Tester as a COM object. That way I can just concentrate on the good parts, and anyone who wants to can use their language of choice. Personally, I am partial to something that uses VBScript or JScript because the testing scripts do not require compiling. However, there are a few limitations to the different script host implementations, and I will discuss some of them in a moment.

Tester Requirements

    Now it is time to talk about the specifics of Tester. A tool that records and plays keystrokes and mouse input is pretty ambitious for a single column, so I want to break it down into two distinct stages for implementation. The first version, covered by this month's source code, is fully functional and does the following:

  • Given an input string of keystrokes formatted like what you would pass to the Visual Basic® SendKeys function, Tester can play the keystrokes to the active window.
  • Tester can find any top-level or child window by title or class.
  • Given any arbitrary HWND, Tester can get all the properties of the window.
  • Tester must notify the user's script of specific window creation or destruction so it can handle potential error conditions or do advanced window handling.

    Future versions of Tester will support more high-level requirements. Tester will have specific objects that map each of the common window classes so that information can be retrieved easily out of the controls. For example, a TTreeControl class would allow you to get the third root node out of a tree control. Also, Tester will have a recorder application that will record keystrokes and mouse actions to produce a script.

    I did not just go ahead and implement the future version requirements in the first version because some serious issues need to be thought through first. If I just recorded mouse movements and clicks, I would get all the positions in pixels. That means the resolution in which the script was recorded must be the same as the resolution it's played back in. If you look around the average development shop, I can assure you that not everyone runs the same resolution. With a hardcoded resolution solution, the scripts are basically worthless if you want to share them among your team.

    Another problem is that the script breaks easily if you move a control over even a pixel or two. Unless you record your scripts only after the UI is frozen, a recorded script is far too fragile. And if you wait until the UI is frozen, you are not doing very good testing.

    These problems will take some additional time to solve. Since I might need the predefined objects to do playback correctly, I left the work to define them until the next step as well.

Using Tester

    Using Tester is relatively simple. First, you create a couple of Tester objects, either start or find your application's main window, pump some keystrokes to your application, check the results, and end. Figure 1 shows a sample Windows Scripting Host (WSH) file that starts NOTEPAD.EXE, types a few lines to it, and closes it.

    Figure 1 shows the three objects most commonly used with Tester. The TSystem object allows you to find top-level windows, start applications, and do the all-important pause. The TWindow class is the main workhorse class for this version. It is a wrapper around an HWND and has all sorts of properties that tell you everything about the window. Additionally, it allows you to enumerate all child windows that belong to that parent. The last object in Figure 1 is the TInput class. Right now it only supports the single method PlayKeys to get keystrokes over to the application with focus.

    Figure 2 shows the TNotify class in a WSH script. One of the hardest things when developing automation scripts is handling the case where an unexpected window, like an ASSERT box, pops up. The TNotify class makes it a snap to provide an emergency handler for those cases. The simple script in Figure 2 just watches for any windows with "Notepad" in their caption bars. While you might not use the TNotify class much, when you do need it, you really need it.

    You need to call the TNotify CheckNotification method every once in a while; I will discuss the reasons in the implementation section. Calling it ensures that the notification messages can get through, as you might not have a message loop in your language of choice. While the code in Figure 2 uses a message box in the notification event handlers, you probably do not want to do that in your real scripts. A message box can change which application has the focus, so later keystrokes may go to the wrong window.

    Also keep in mind that there is a limited number of notifications that I allow you to set, so you should not use TNotify for general scripting things like waiting for the File Save dialog to appear. Depending on how you set up your notification handlers and how they search for the text in the caption, you can easily get notified of windows that you might not be interested in. This is the case when you have a generic title like "Notepad" and you specify that it can appear anywhere in the title. You should try to be as specific as possible when specifying the notifications that you want when calling the TNotify AddNotification method. Your creation event handlers should also look at the TWindow passed in so that you can verify that it is the window you are interested in. For destruction handlers that are on the generic side, you should search the windows to ensure that the window does not exist.

    Included with this month's code are two other samples that you might want to look at to see how to use Tester. The first sample, NPad_Test.vbs, is a more complete WSH script and has some reusable routines. The other sample, TT (or Tester Tester), is the main unit test for Tester. It is a Visual Basic-based application, and it should give you an idea how to use Tester with Visual Basic. Additionally, these two samples show the TWindows class, which is the usual collection class for TWindow objects.

    While I am partial to using WSH for my tests, it takes more thought up front to make it work correctly. Since everything is untyped and there is no magic editor like Visual Basic for WSH scripts, you are back to the old run and crash style of debugging. The main reason that I like using WSH is that you do not need to rely on compiling your tests. However, if you have a strong build environment, you might want to consider using Visual Basic so that you can build your tests as you build your application. Of course, if you are more comfortable in C or MASM, you can use those as well.

    While the objects in Tester look deceptively simple, the real work is planning your tests. You also want to keep your tests as focused and simple as possible. One problem that I always had when writing my tests was trying to make them do a little too much. Now I just have the scripts do a single operation. A good example is to limit the script to just sending the keystrokes to open a file. You can chain the scripts together in various ways to maximize usage. Once you have the script to open a file, you can use it in three different tests: one to see if you can open a valid file, one to open an invalid file, and one to open a corrupt file. Like your normal development, you should avoid any hardcoded strings if possible. Not only will this make internationalizing your script a piece of cake, but it will certainly help when you change your menu system and accelerators for the hundredth time.
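
    One way to keep keystroke strings out of the individual tests is sketched below. The accelerator table and helper names are hypothetical, and the Alt+F, O sequence is only an assumed menu layout. The escaping rule is the one the Visual Basic SendKeys format uses: the characters + ^ % ~ ( ) { } [ ] must be wrapped in braces to be sent literally. Python is used only for brevity; the same shape works in a WSH script.

```python
# Characters that the SendKeys format treats specially.
SPECIAL = set("+^%~(){}[]")


def escape_keys(text):
    """Wrap each special character in braces so it is typed literally."""
    return "".join("{" + ch + "}" if ch in SPECIAL else ch for ch in text)


# Hypothetical accelerator table: one place to edit when the menus change.
KEYS = {
    "file_open": "%fo",  # assumed Alt+F, then O
    "confirm": "~",      # Enter
}


def open_file_keys(path):
    """Build the keystroke string for the single 'open a file' operation."""
    return KEYS["file_open"] + escape_keys(path) + KEYS["confirm"]


if __name__ == "__main__":
    # The same operation serves the valid, invalid, and corrupt file tests.
    for name in ("good.txt", "missing.txt", "corrupt(1).txt"):
        print(open_file_keys(name))
```

    Only the file name differs between the three tests, so a menu or accelerator change touches the table and nothing else.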

    Another thing to think about when designing your Tester scripts is how you will validate that the script actually worked. If you are bored and have the time, I guess you could just sit there and watch them all run to see if you get the same results as the previous run. Probably a better idea is to log states and key points in your script so you can compare the output to previous runs automatically. If you use the CSCRIPT.EXE WSH executable, you can use WScript.Echo and redirect the output to a file. After the script finishes, you can run a difference utility on the output and, if anything is different, flag the run so you can check whether the script executed correctly. Keep in mind that you will want to keep run-specific information out of the log, or normalize it, so that it does not show up as a false difference. For example, if you are writing an application that downloads stock quotes, you will not want to include the last price update time in the logging output.
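
    A minimal sketch of the normalize-then-compare step, using Python's standard difflib module in place of a separate difference utility. The log lines and the time and date formats are assumptions for the example; a real script would match whatever run-specific fields its own log contains.

```python
import difflib
import re

# Assumed log format: times look like 14:05:09 and dates like 04/01/1999.
VOLATILE = [
    (re.compile(r"\b\d{2}:\d{2}:\d{2}\b"), "<TIME>"),
    (re.compile(r"\b\d{2}/\d{2}/\d{4}\b"), "<DATE>"),
]


def normalize(line):
    """Replace run-specific fields with fixed placeholders."""
    for pattern, placeholder in VOLATILE:
        line = pattern.sub(placeholder, line)
    return line


def compare_runs(baseline_lines, current_lines):
    """Return unified-diff lines; an empty list means the runs match."""
    old = [normalize(line) for line in baseline_lines]
    new = [normalize(line) for line in current_lines]
    return list(difflib.unified_diff(old, new, "baseline", "current",
                                     lineterm=""))


if __name__ == "__main__":
    baseline = ["Opened file at 10:00:01", "Title: Untitled - Notepad"]
    current = ["Opened file at 11:30:45", "Title: readme.txt - Notepad"]
    for line in compare_runs(baseline, current):
        print(line)
```

    Here the differing times are ignored, while the changed window title is reported as a genuine difference to investigate.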

    What about debugging your Tester scripts? Since Tester is not completely integrated into its own debugger, you need to be careful that you do not stop your debugger on a TInput PlayKeys method call. If you do, then the keystrokes will obviously go to the wrong window. To work around this, I generally force the window to which I'm sending keystrokes to the top of the Z-order by calling the TWindow SetForegroundTWindow method before each PlayKeys call. This way I can break on the SetForegroundTWindow call, check the state of the application, and get the keystrokes to the correct window.

Implementing Tester

    Now that you have an idea of how to use Tester, I want to discuss some of the high points of its implementation. I first started to implement Tester using C++ and ATL, but then I realized that Visual Basic was a far better choice. Much of what I was going to implement in Tester is simple, so I just wanted to get the job done quickly. However, as I will point out later, using Visual Basic did require some odd gyrations.

    The first object I started implementing was TInput, which is responsible for all of the input that you need to send to another window. Initially, I thought that the keyboard input was going to be simple—I just thought I would wrap the Visual Basic SendKeys statement. This worked fine when I pumped some keys to Notepad, but when I tried to send keys to Outlook® 98 I noticed that some did not get through. I never did get it to work, so I had to implement my own function called PlayKeys. As I started my research, I noticed a neat new SendInput function had been added to Windows 98 and Windows NT® 4.0 Service Pack 3. This is part of Microsoft® Active Accessibility (MSAA) and replaces all the previous low-level functions like keybd_event. SendInput is the one function that does keyboard, mouse, and hardware events. The neat thing about the function—especially from Tester's perspective—is that it guarantees the input information is placed in the keyboard or mouse input stream as a contiguous unit. This ensures that your input is not interspersed with any extraneous user input. A quick test proved that SendInput worked when sending input to Outlook 98.

    Once I knew how to get the keystrokes played properly, I needed some way to get the keys to play. Since the Visual Basic SendKeys statement already provides a nice input format, I thought I would just go ahead and duplicate it. I did everything but the repeat key code, "{h 42}". There is nothing exciting about the parsing code, and you can see it in the TInputHlp directory. When I started working on the TInput object I was still intending to do Tester in C++, so the parsing code is done in a C++ DLL. The Visual Basic TInput PlayKeys method is just a wrapper to call the DLL. Eventually, I should rewrite everything in Visual Basic so you do not have to drag another DLL around with the main Tester DLL.
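
    For readers who have not used the SendKeys format, the sketch below parses a small subset of it. This is not the TInputHlp parsing code, which is a C++ DLL; it is an independent Python illustration with made-up names. It handles literal characters, the + (Shift), ^ (Ctrl), and % (Alt) prefixes, ~ for Enter, and braced names such as {ENTER} or {+}. It rejects parenthesized groups and repeat counts, and it does not check braced names against a list of valid keys.

```python
MODIFIERS = {"+": "SHIFT", "^": "CTRL", "%": "ALT"}


def parse_keys(text):
    """Return a list of (modifiers, key) tuples for a SendKeys-style string."""
    events = []
    pending = []  # modifiers waiting for the next key
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in MODIFIERS:
            pending.append(MODIFIERS[ch])
            i += 1
            continue
        if ch == "~":
            key = "ENTER"
            i += 1
        elif ch == "{":
            # Search from i + 2 so that "{}}" (a literal right brace) works.
            end = text.find("}", i + 2)
            if end == -1:
                raise ValueError(f"unterminated brace at position {i}")
            name = text[i + 1:end]
            if " " in name:
                raise ValueError("repeat counts are not handled in this sketch")
            key = name.upper() if len(name) > 1 else name
            i = end + 1
        elif ch in "()":
            raise ValueError("grouping is not handled in this sketch")
        else:
            key = ch
            i += 1
        events.append((tuple(pending), key))
        pending = []
    if pending:
        raise ValueError("modifier at end of string has no key")
    return events


if __name__ == "__main__":
    print(parse_keys("Hi~^s%{F4}"))
```

    Each tuple could then be turned into the key-down and key-up events that are handed to the input function.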

    The TWindow, TWindows, and TSystem objects are very straightforward, and you should be able to understand them right from the source code. All of those classes are implemented in Visual Basic and are just wrappers around some Windows API functions. The interesting part is in the TNotify class. When I first started thinking about what it would take to determine if a window with a specific caption was created or destroyed, I did not think it would be that difficult. I found out that it is only moderately difficult, but the creation part is not foolproof.

    My first thought was to implement a systemwide Computer Based Training (CBT) hook since from the SDK documentation that seemed to be the method for determining when windows are created and destroyed. I whipped up a quick sample and soon ran into a problem. When my hook got the HCBT_CREATEWND notification, I could not get the window title consistently. After I thought about it a bit, this made sense; the CBT hook is probably called as part of the WM_CREATE processing and very few windows have set their title at that point. The only windows I could get reliably with the HCBT_CREATEWND notification were dialogs. Watching the window destruction always worked with the CBT hook.

    After looking through all the other types of hooks (which have grown in number since I last checked), I extended my quick sample to try them all. As I suspected, just watching WM_CREATE was not going to tell me the title reliably. A friend suggested that I just watch the WM_SETTEXT messages. Almost every window eventually uses WM_SETTEXT to set the text in its caption bar. Of course, if you are doing your own non-client painting and bit blitting, you will not use the WM_SETTEXT message. One interesting thing I did see was that some technologies, Microsoft Internet Explorer in particular, send WM_SETTEXT with the same text many times in a row.

    Having figured out that I needed to watch WM_SETTEXT, I took a harder look at the different hooks I could use. In the end, the call window procedure hook (WH_CALLWNDPROCRET) was the best one. It allows me to watch WM_CREATE and WM_SETTEXT easily. Additionally, I can watch the WM_DESTROY messages. At first it seemed like I might have some trouble with WM_DESTROY because I thought that the window title might have been deallocated at that point. Fortunately, the text is valid until the WM_NCDESTROY message.

    I decided to just treat WM_SETTEXT processing like the case where the WM_CREATE or WM_INITDIALOG have the title set and accessible. While I could write a state machine to keep track of created windows and when they get their captions set, it sounded pretty error-prone and difficult to implement. The drawback is that if you set a TNotify handler for windows with "Notepad" anywhere in them, you will get a notification when NOTEPAD.EXE opens a new file. This is one of those times where I felt it was better to have a little side effect in the implementation than spend days and days debugging the solution. Also, just getting the hook done was only about a quarter of the problem with implementing the final TNotify class. The other three-fourths was how to let the user know that the window was created or destroyed.

    I made the decision to implement Tester in Visual Basic before I wrote the TNotify class. Earlier, I mentioned that using TNotify is not totally free and that you have to call the CheckNotification method every once in a while. The reason is that Visual Basic cannot be multithreaded, so I need a way to get the news that a window was created or destroyed safely into the same thread in which the rest of Tester is running. I wrote a couple of articles almost two years ago about multithreading Visual Basic, but the techniques I presented in those articles no longer work since Microsoft changed the Visual Basic internals.

    After sketching out some ideas, I was down to the following basic facts about the situation. The actual WH_CALLWNDPROCRET hook has to be systemwide, so it must be implemented in its own DLL. The Tester DLL obviously could not be that DLL because I did not want to drag the entire Visual Basic Tester DLL into each address space. This meant that the hook DLL probably had to set a flag or something that the Tester DLL could read to know that a condition was met. Since Tester cannot be multithreaded, I needed to do the processing in the same thread.

    If you ever did 16-bit Windows-based development, the previous paragraph lists some restrictions that you had to deal with all the time. In those cases, I would use SetTimer to call a timer procedure. For TNotify, the timer procedure checks the systemwide hook flags. The timer procedure solution seemed like the answer, but in reality it only almost works in the TNotify case. Depending on the length of the script and whether your language of choice implements a message loop, the WM_TIMER message might not get through, so you will need to call the CheckNotification method, which does the flag checking as well. In an attempt to make the checking automatic, I set up the TSystem.Pause method to call DoEvents for the amount of time specified. However, this was a major performance drag on the scripts, so I settled on just asking users to call the CheckNotification method every once in a while.
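
    The control flow described here, where the hook side only records that something happened and the script's own thread later polls and fires the events, can be sketched in a few lines. This is a single-process Python illustration with hypothetical names. The real implementation shares flags between a systemwide hook DLL and the Visual Basic Tester DLL, which this sketch does not attempt to model.

```python
from collections import deque


class NotificationQueue:
    """Hypothetical stand-in for the flags shared between hook and script."""

    def __init__(self):
        self._pending = deque()
        self._handlers = {"created": [], "destroyed": []}

    def on(self, kind, handler):
        """Register a script handler for 'created' or 'destroyed'."""
        self._handlers[kind].append(handler)

    def post(self, kind, caption):
        """Hook side: record what happened; never run script code here."""
        self._pending.append((kind, caption))

    def check_notification(self):
        """Script side: fire handlers for everything recorded so far."""
        fired = 0
        while self._pending:
            kind, caption = self._pending.popleft()
            for handler in self._handlers[kind]:
                handler(caption)
            fired += 1
        return fired


if __name__ == "__main__":
    queue = NotificationQueue()
    queue.on("created", lambda caption: print("created:", caption))
    queue.post("created", "Untitled - Notepad")  # nothing is printed yet
    queue.check_notification()                   # the handler runs here
```

    Nothing reaches the script until it polls, which is why a long script with no message loop has to call the check itself every so often.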

    If all of this seems confusing, you will be surprised that it really does not take that much code to implement. The hook function, written in C, is in the TNotifyHlp DLL. On the Tester side, TNotify.bas is the module where the timer proc resides, and the actual class is implemented in TNotify.cls. The TNotify class has a couple of hidden methods and properties that the TNotify module can access to get the events fired and to determine what types of notifications the user wanted. If you look at the hook code, the interesting thing is the globally shared data segment, .HOOKDATA, that holds the array of notification data. When looking at the code, keep in mind that the notification data is global and all the rest of the data is on a per-process basis.

    While the TNotify implementation was a little bit of a brainteaser, I was pleased at how few troubles I experienced implementing it. If you do want to extend the hook code, be aware that debugging systemwide hooks is not the easiest thing in the world. While you can do it with the Visual C++® debugger, I have never tried it. I just use a third-party kernel debugger. The other thing you can do to debug systemwide hooks is to resort to "printf debugging." Using a tool like DBMON, you can watch the output of all OutputDebugString calls to see the state of your hook.

    I did have one annoying problem when developing Tester that only appeared on Windows 98. All my test code worked just fine on Windows NT 4.0 and Windows 2000 beta 3, but on Windows 98 I could not get the TWindows collection filled. I was checking whether the HWND passed into the Add method was valid with IsWindow. A quick read of the documentation said that IsWindow returns a BOOL. My mistake was assuming that a TRUE result is always exactly 1, when a BOOL is FALSE only if it is zero and TRUE for any nonzero value. I also like to use the positive form of conditionals, so I was using "1 = IsWindow(hWndT)," which obviously did not work. As you can guess, the operating systems do not return the same nonzero value. The safe test is to compare the result against zero. It was a small problem, but I thought you could learn from it.
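
    The difference between the two comparisons is easy to see with simulated return values. The sketch below is plain Python and does not call IsWindow; the nonzero values are arbitrary examples rather than what any particular operating system returns, and the function names are made up.

```python
def is_true_fragile(result):
    """The comparison from the column: only the value 1 counts as TRUE."""
    return result == 1


def is_true_robust(result):
    """Win32 BOOL convention: zero is FALSE, any nonzero value is TRUE."""
    return result != 0


if __name__ == "__main__":
    # Simulated BOOL results; the nonzero values are arbitrary.
    for value in (0, 1, 2, -1, 0x1234):
        print(value, is_true_fragile(value), is_true_robust(value))
```

    The two checks agree for 0 and 1 and disagree for every other value, which is why the bug only showed up on one operating system.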

Wrap-up

    Now that you are armed with Tester, maybe your UI unit testing can be almost as easy as some of your internals unit testing. I hope that the unit testing suggestions are helpful as well. If you have any unit testing techniques that work well for you, I would be interested in hearing about them. Also, if you have any ideas or comments for Tester's future, please let me know. Finally, I would like to thank Bob Meagher for some excellent suggestions and ideas for Tester.

Tips

    I know you have tips out there. Send them to me so fellow developers can benefit from your extraordinary wisdom.

    Tip 19 If you have a user who is reporting that your application is running out of memory or resources, you can have them use the Windows NT Task Manager to check memory consumption for you. Have them add the Memory Usage Delta column (select the Processes tab, then Select Columns from the View menu). Right after they start your application, have them set the update speed to Paused. As soon as they get the out-of-memory problem, they can select Refresh Now on the View menu. The Mem Delta column will tell you how much memory has been used by your application between startup and the memory problem.

    Tip 20 I always have trouble deciding which properties, methods, and events to support and how to name them when designing a COM object. In MSDN™, Dave Stearns from Microsoft has written an excellent article called "The Basics of Programming Model Design." If you do anything with any sort of COM control, I highly recommend that you read this article.

Have a tricky issue dealing with bugs? Send your questions or bug slaying tips via email to John Robbins: john@jprobbins.com.

From the April 1999 issue of Microsoft Systems Journal