The next steps are to build functions and data structures into your application to collect and store performance data, and to provide a mechanism for making the data available to the performance DLL.
The method you use to collect the data can be as simple as incrementing a counter each time a particular routine in the application is called, or it can involve time-consuming calculations. Counters and timers should only increment; never clear them. It's all right for a counter to wrap, as long as it does not wrap twice between two Performance Monitor snapshots. If it might, use a 64-bit counter instead of a 32-bit counter. Counter types are defined in Table 12.4 in Chapter 12. Your program can collect and store data during the normal course of application operations, provided the collection is cheap enough that it does not noticeably affect the application's performance. The sample performance code at the end of this chapter shows a performance counter in a VGA application that uses this method.
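The wrapping rule can be illustrated with a short sketch. It is not taken from the sample code; the function names are hypothetical, and it assumes the consumer computes the difference between two snapshots with unsigned arithmetic. Under that assumption a single wrap cancels out, but a counter that advances by a full 32-bit range or more between snapshots produces a misleadingly small difference.

```c
#include <stdint.h>

/* Hypothetical helper: how far a 32-bit counter advanced between two
 * snapshots. Unsigned subtraction is performed modulo 2^32, so the
 * result is still correct if the counter wrapped once in between. */
uint32_t counter_delta32(uint32_t previous, uint32_t current)
{
    return (uint32_t)(current - previous);
}

/* Hypothetical 64-bit variant for counters that could advance by
 * 2^32 or more between two snapshots. */
uint64_t counter_delta64(uint64_t previous, uint64_t current)
{
    return current - previous;
}
```

For example, a 32-bit counter that reads 0xFFFFFFF0 at one snapshot and 0x00000010 at the next has wrapped once, and the helper still reports an advance of 0x20. If the same counter had actually advanced by 0x100000005 events, the 32-bit difference would be only 5, which is why the wider counter is the safer choice for fast-moving values.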
For some types of data, it may be more efficient or appropriate to collect the data on demand. In this situation, the performance DLL must communicate to the application that the data has been requested. For data that is expensive to collect (in terms of processor time or memory usage), consider collecting it only when the performance monitoring program requests Costly data. A custom performance monitoring program can then routinely request data for all counters that are not costly and request the costly data only when it is needed. Windows NT Performance Monitor does not collect Costly data.
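A minimal sketch of this on-demand pattern follows. It is illustrative only: all type and function names are hypothetical, the expensive calculation is a stand-in, and it assumes that the request arrives as a wide-character string such as "Global" or "Costly" and that a null or empty string should be treated as a Global request. For simplicity, the inexpensive counter is always returned, and the expensive value is computed only for a Costly request.

```c
#include <stdbool.h>
#include <wchar.h>

/* Hypothetical classification of the request string. */
typedef enum { QUERY_GLOBAL, QUERY_COSTLY, QUERY_OTHER } QueryType;

QueryType classify_query(const wchar_t *value_name)
{
    /* Assumption: a missing or empty name means a Global request. */
    if (value_name == NULL || value_name[0] == L'\0') return QUERY_GLOBAL;
    if (wcscmp(value_name, L"Global") == 0) return QUERY_GLOBAL;
    if (wcscmp(value_name, L"Costly") == 0) return QUERY_COSTLY;
    return QUERY_OTHER;
}

/* Hypothetical sample: one cheap counter and one expensive value. */
typedef struct {
    unsigned long calls;       /* maintained during normal operation */
    unsigned long fragments;   /* expensive to compute */
    bool fragments_valid;      /* true only when fragments was computed */
} PerfSample;

/* Counts how often the expensive path ran (for illustration only). */
unsigned long scan_invocations = 0;

/* Stand-in for an expensive calculation; returns a fixed value. */
unsigned long expensive_scan(void)
{
    scan_invocations++;
    return 42;
}

void collect_sample(const wchar_t *value_name, unsigned long calls,
                    PerfSample *out)
{
    out->calls = calls;
    out->fragments = 0;
    out->fragments_valid = false;

    /* Pay for the expensive value only when it is explicitly requested. */
    if (classify_query(value_name) == QUERY_COSTLY) {
        out->fragments = expensive_scan();
        out->fragments_valid = true;
    }
}
```

With this arrangement, a monitoring program that polls with "Global" never triggers the expensive scan, so routine monitoring stays cheap.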
Communication between an application and its performance DLL differs for user-mode and privileged-mode applications. The application's performance DLL executes in user mode. Because of this, user-mode applications, such as print and display applications, can use any of the Win32 techniques for interprocess communication, such as named file mapping or RPC. For example, the DDK's sample performance counter code shows a user-mode VGA application that uses a file mapping object to create shared memory mapped into the address space of both the application and the performance DLL. The shared memory provides both storage and interprocess communication. If you use shared memory, consider protecting it with a named mutex object so that the application does not change the data while the performance DLL is collecting it.
Privileged-mode applications must provide an IOCTL interface that returns the performance data to the performance DLL.