The Windows API Profiler, affectionately known as WAP, is useful for determining which Windows 32-bit API calls are taking up time. WAP can effectively profile any number of processes and threads concurrently. You can run it on a program without having to recompile the program. WAP intercepts the calls from the application to the system and counts and times them. WAP is available in the Windows NT SDK.
WAP modifies the executable image to point to a set of measurement DLLs that sandwich themselves between the application and the system DLLs. See Figure 10.5. If your application performs a checksum on its executable, you must disable the checksum to run WAP.
Figure 10.5 Application interface to the system before and after running apf32cvt
WAP sets the client-server batch size to one before taking any measurements. This ensures that the proper API call gets billed for its time. If WAP did not do this, the time for all the API calls in a batch would be counted against the last call in that batch, totally confusing the data (not to mention confusing you). Setting the batch size to one is a good idea, but you may notice that the application runs more slowly because there are many more client-server transitions. Set another plate: Heisenberg invited himself to the party again.
If you are concerned about the impact of setting the batch size to one for your application, you can get an idea of the cost of a client-server transition on your computer by looking in the WAP data for a call to SetWindowLong. It's a pretty common call. If your application never calls SetWindowLong, use WAP to find such a call in another application, such as WinHlp32.
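To see why batching would distort the figures, here is a toy simulation of the billing problem described above. It is only a sketch: the function `bill_calls`, the API names, and the per-call costs are made up for illustration, and it does not model how the real client-server batching works.

```python
from collections import defaultdict

def bill_calls(calls, batch_size):
    """Attribute simulated server time to API names.

    calls: list of (api_name, server_cost_us) in call order.
    With batch_size > 1 the work is deferred until the batch is full,
    so the whole batch's cost lands on the call that triggers the flush.
    """
    billed = defaultdict(int)
    pending = []  # costs of calls queued on the client side, not yet executed
    for name, cost in calls:
        pending.append(cost)
        if len(pending) >= batch_size:
            billed[name] += sum(pending)  # the flushing call pays for everything
            pending.clear()
    # In this toy model, any unflushed remainder is charged to the final call.
    if pending:
        billed[calls[-1][0]] += sum(pending)
    return dict(billed)

# Hypothetical workload: names and costs are illustrative, not measured.
calls = [("MoveToEx", 10), ("LineTo", 30), ("TextOutW", 50)] * 2

batched = bill_calls(calls, batch_size=3)    # all time lands on the last call
unbatched = bill_calls(calls, batch_size=1)  # each call is billed for itself
```

With a batch size of three, every microsecond is charged to the call that happens to close each batch. With a batch size of one, each call carries its own cost, which is the behavior WAP forces.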
The Win32 APIs are contained in the following dynamic-link libraries: KERNEL32.DLL, ADVAPI32.DLL, GDI32.DLL, USER32.DLL, and CRTDLL.DLL. The profiler takes the form of five DLL files, one for each DLL to be profiled. As shown in Figure 10.5, each profiling DLL sits between an application and the Win32 DLL it profiles, intercepts the application's API calls to that DLL, and then makes and times the call to the real Win32 API. The profiling DLL records the following information for each API:
All result times are in microseconds.
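The intercept-then-time pattern can be sketched in a few lines. This is not WAP's implementation, which works at the DLL level. It is a minimal Python analogy, assuming a hypothetical `profiled` wrapper and a stand-in function `get_tick_count`. It records only a call count and a total time, in microseconds.

```python
import time
from collections import defaultdict

# Per-API counters: number of calls and total elapsed microseconds.
stats = defaultdict(lambda: {"calls": 0, "total_us": 0.0})

def profiled(name, real_func):
    """Return a stand-in for real_func that counts and times each call."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter_ns()
        try:
            # Forward the call to the real function, as a profiling DLL
            # forwards to the system DLL behind it.
            return real_func(*args, **kwargs)
        finally:
            elapsed_us = (time.perf_counter_ns() - start) / 1000.0
            entry = stats[name]
            entry["calls"] += 1
            entry["total_us"] += elapsed_us
    return wrapper

def get_tick_count():
    """Hypothetical stand-in for a system call."""
    return 42

# Interpose the wrapper; callers keep using the same name, unchanged.
get_tick_count = profiled("get_tick_count", get_tick_count)
results = [get_tick_count() for _ in range(3)]
```

The caller is unchanged and unaware of the wrapper, which is the same property that lets WAP profile a program without recompiling it.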
The profiler determines timer overhead by reading the timer 2000 times when the profiling DLL is initialized. The smallest read time observed during this calibration is subtracted from the time measured for each API call, which removes most of the timer overhead from the final results. For accurate timing, it is important that the system be otherwise idle during calibration.
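The calibration step might look something like the following sketch. It assumes that "reading the timer" means taking back-to-back readings and keeping the smallest difference. The function names are hypothetical, and Python's `time.perf_counter_ns` stands in for whatever timer WAP actually uses.

```python
import time

CALIBRATION_READS = 2000  # the count given in the text

def calibrate_timer_overhead(reads=CALIBRATION_READS):
    """Return the smallest observed cost (ns) of two back-to-back timer reads."""
    minimum = None
    for _ in range(reads):
        t0 = time.perf_counter_ns()
        t1 = time.perf_counter_ns()
        delta = t1 - t0
        if minimum is None or delta < minimum:
            minimum = delta
    return minimum

def timed_call(func, overhead_ns):
    """Time func() and subtract the calibrated overhead, clamping at zero."""
    t0 = time.perf_counter_ns()
    result = func()
    elapsed = time.perf_counter_ns() - t0
    return result, max(0, elapsed - overhead_ns)

# Calibrate once at startup, then reuse the figure for every measurement.
overhead_ns = calibrate_timer_overhead()
value, net_ns = timed_call(lambda: sum(range(1000)), overhead_ns)
```

Taking the minimum rather than the mean keeps a stray interruption during calibration from inflating the overhead estimate. A busy system can still push even the minimum upward, which is why calibration should run on an idle machine.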