Your Right to Know, Part II

More on Windows NT Performance Counters

Rick Anderson
Microsoft Corporation

March 1999

Summary: Follow-up article on Microsoft® Windows NT® performance counters. (11 printed pages) In "Your Right to Know: Finding Leaks and Bottlenecks with a Windows NT PerfMon COM Object" we developed the R1PHDmod component—an ATL wrapper around my PHD C++ class (re-christened CPhd here).

Review
This Time…
Making the Component Easier to Use from Visual Basic
Using the initLog from Visual Basic
Using R1PHDMod from Visual J++
Using R1PHDMod from VBScript
R1PHDMod Internals
Important PDH functions
Using PdhBrowseCounters to Get the Right Path String
Choose Your Weapon

Review

As you'll recall, Microsoft Windows NT's PerfMon is an extremely useful tool for tuning applications and systems so they fire on all cylinders. Application developers can use it to tune their specific program. System administrators use it to tune a single computer and/or a network.

PerfMon can be used to monitor paging, database connections, network I/O, and many other system objects that directly impact overall performance. PerfMon can help you discover the bottlenecks in your system or application so you can increase performance. Without the ability to monitor programs and system resources, tuning is a hit-and-miss proposition.

Knowing what's causing bottlenecks enables you to solve the correct problem and improve performance. If a slow Web server shows high paging activity, adding more physical memory can increase performance. If the same server has low paging activity but is constantly waiting for remote database connections, adding memory won't speed things up but implementing connection pooling will.

This Time…

…we're going to make a modification or two to make the component easier to use from Visual Basic and demonstrate how to use it from Microsoft Visual Basic®, Microsoft Visual J++®, and a Web page using Visual Basic Scripting Edition (VBScript). Finally, we'll discuss the Windows NT APIs used by our component so you can understand more clearly what is going on underneath the hood.

Making the Component Easier to Use from Visual Basic

To complete the component, we will add an initLog method that takes an array of counter strings. By passing in the array of counter strings we can specify any counters on any network computer (within our security context) and not be limited to the default local counters. We previously added the IDL for initLog; the C++ implementation is as follows:

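STDMETHODIMP Crx4Leaks::initLog(VARIANT *ArrayIn, BSTR name, VARIANT bDate, VARIANT bState){
    USES_CONVERSION;
    HRESULT hr; LONG  LBnd, UBnd;
    SAFEARRAY *psa = V_ARRAY(ArrayIn);      // safe array carried in the VARIANT
    SafeArrayGetLBound(psa, 1, &LBnd);
    SafeArrayGetUBound(psa, 1, &UBnd);      // UBnd is the last valid index (inclusive)
    typedef char  *rcp;
    // Elements LBnd..UBnd-1 are read; the element at UBnd itself is not.
    m_nc = UBnd - LBnd;
    m_strCntrs = new rcp[m_nc+1];
    
    CComBSTR   bs; long idx;
    for(int i=LBnd;i<UBnd;i++){
        idx = i;
        bs.Empty();                         // release the previous element's copy
        hr = SafeArrayGetElement(psa, &idx, &bs);
        CK_HR_MSG(hr,"ind=" << idx << endl); 
        // Store at a zero-based slot so a nonzero lower bound also works.
        m_strCntrs[i-LBnd] = new char[bs.Length() + 1];
        strcpy(m_strCntrs[i-LBnd],W2A(bs));
    }
    
    hr=initPHD(W2A(name),bgetBool(bDate),bgetBool(bState));   // pass both flags
    return hr;
}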

We must first extract the safe-array pointer using the V_ARRAY macro (from oleauto.h). We then get the array bounds with the safe-array functions SafeArrayGetLBound and SafeArrayGetUBound. Once we know how many strings were passed, we allocate an array of pointers to store them.

I use the function SafeArrayGetElement, which calls SafeArrayLock and SafeArrayUnlock automatically. Because you are unnecessarily locking and unlocking the array each time you access an element, calling SafeArrayGetElement in a loop is very inefficient. Because initLog is a one-time call and the performance hit is negligible, I decided to go for simplicity. See the Knowledge Base article (search by ID number Q131086) "SAMPLE: SAFEARRAY: Use of Safe Arrays in Automation ID" for a more efficient algorithm to extract elements.
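
The bounds arithmetic is easy to get wrong, so here is a minimal portable sketch of it. It uses nothing from the Windows headers: elementCount and copyCounterStrings are hypothetical helpers, and the getElement callback merely stands in for a per-element accessor such as SafeArrayGetElement. Because both bounds are inclusive, a dimension with bounds 0 and 2 holds three elements. Note that this sketch copies every element up to and including the upper bound, whereas the initLog listing above sizes its array as UBnd - LBnd.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of R1PHDMod): number of elements in a
// dimension whose lower and upper bounds are both inclusive.
inline std::size_t elementCount(long lowerBound, long upperBound) {
    return upperBound < lowerBound
        ? 0
        : static_cast<std::size_t>(upperBound - lowerBound) + 1;
}

// Hypothetical helper: copy elements lowerBound..upperBound into zero-based
// storage that owns its strings. getElement stands in for a per-element
// accessor; std::vector and std::string replace the raw new[] arrays.
template <typename GetElement>
std::vector<std::string> copyCounterStrings(long lowerBound, long upperBound,
                                            GetElement getElement) {
    std::vector<std::string> out;
    out.reserve(elementCount(lowerBound, upperBound));
    for (long i = lowerBound; i <= upperBound; ++i) {
        out.push_back(getElement(i));   // lands in slot i - lowerBound
    }
    return out;
}
```

Letting the container own the strings also removes the need to delete each char array by hand when the component shuts down.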

Using the initLog from Visual Basic

The following snippet shows how to use the safe-array arguments from a Visual Basic client:

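Dim rx As R1PHDMODLib.rx4Leaks   ' module level, shared by both handlers
Dim G_cnt As Long                ' running count passed to LogCnt

Private Sub Form_Load()
    Dim vx As Variant
    Dim v(2) As String           ' indexes 0 to 2; only 0 and 1 are filled

    v(0) = "\\ricka5\Process(cpuHog)\% Processor Time"
    v(1) = "\\ricka5\Process(cpuHog)\Private Bytes"
    Set rx = New R1PHDMODLib.rx4Leaks
    vx = v                       ' wrap the String array in a Variant
    rx.initLog vx, "c:\temp\a"
End Sub

Private Sub CmdLog_Click()
    G_cnt = G_cnt + 1
    rx.LogCnt G_cnt
End Sub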

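Notice that to pass a safe array, you must declare both a String array and a Variant, and then assign the String array to the Variant. The array is declared one element larger than the number of counters because initLog, as written, does not read the element at the upper bound. Running the preceding program (vbMon.exe) produces the file c:\temp\a_vbMon_Perf.log, with the following contents: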

LoopCnt	% Processor time	Private bytes
1	66	380928
2	18	57327616

Using R1PHDMod from Visual J++

One of the advantages of making R1PHDMod a COM object is that we can use it from any language that supports COM, including Visual J++. To use this object, use the Visual J++ wizard to create a Windows application and add two buttons, Log and Leak. From the Project menu select Add COM Wrapper.... In the list box, check R1PHDmod 1.0 type Library. This will create the r1phdmod folder in your project, which contains Irx4leaks.java and rx4leaks.java.

Add the following code so your class looks like:

import r1phdmod.rx4Leaks;
import com.ms.com.*;     // for Variants 

public class Form1 extends Form{
   r1phdmod.rx4Leaks m_rx= new r1phdmod.rx4Leaks();
   private int m_cnt=1;
   private int[] arLeak; 
   
   public Form1()   {
      initForm();      
      Variant vt = new Variant(true);
      Variant vf = new Variant(false);
      m_rx.initEzLog("C:\\temp\\2\\",vf,vt);
   }

Add the following to the Leak button callback:

arLeak = new int[m_cnt++ * 1024];
arLeak[arLeak.length -2] = 1;

and to the Log button callback add:

m_rx.logCnt(m_cnt);

The Visual J++ environment makes it very easy to use COM objects. When I need to get system information in a Java app, I grab a C/C++ snippet that solves the problem, modify the code for my specific needs, and then wrap it in a wizard-generated ATL component. By leveraging the vast base of C/C++ code, I can quickly expose the information to Visual Basic and Java. I could use Microsoft J/Direct™ to get system information, but wrapping C++ code in a COM object is often quicker and easier. If you don't need blazing speed, use the ATL approach and you'll end up with a component that's easy to extend, debug, and share with other languages.

Using R1PHDMod from VBScript

The major difference between Visual Basic and VBScript when using a COM control is that VBScript only supports late binding, so your control must support the IDispatch interface.

In my previous sample, I used the following to declare and create the component:

Dim rx As R1PHDMODLib.rx4Leaks
Set rx = New R1PHDMODLib.rx4Leaks

In VBScript these must be replaced with:

Dim rx
Set rx = CreateObject("R1PHDMOD.rx4Leaks")

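Notice that the "Lib" suffix is dropped: CreateObject takes the component's programmatic identifier (ProgID), "R1PHDMOD.rx4Leaks", rather than the type-library name R1PHDMODLib. VBScript variables are always Variants, so the Dim statement carries no type. After these two changes, method calls in VBScript look exactly the same as they do in early-bound Visual Basic.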

The joys of late binding

When I'm developing a new ATL component I typically write a Visual Basic test client in parallel. If you're simultaneously working on the VB test client and a COM server, it's easier to use the late binding that CreateObject gives you. To use early binding in the Visual Basic IDE, you must first add a reference to the component (from the Project menu). Because the Visual Basic IDE holds a reference to your component, Visual C++ won't be able to rebuild it while VB is running. You must either manually clear the reference to your component or exit VB. Because of this nuisance, I generally opt for the late-binding approach.

Keeping state in a singleton

If you'd like to use the component with Internet Explorer on an HTML page, you must first convert it to a singleton and modify the DllCanUnloadNow so the component is not destroyed when you leave the HTML page (just have DllCanUnloadNow always return S_FALSE). Even though DllCanUnloadNow always returns S_FALSE, Windows NT is smart enough to unload the DLL when the process that loaded it exits. If you use a non-singleton component in an HTML page, it will be unloaded each time you leave the page and a new one will be created when you return. The R1PHDMODLib server needs to stay in memory so it can retain its state between Web pages.

R1PHDMod Internals

As you'll recall from Part I of this article in the last issue of MSDN News, the R1PHDMod ATL template class is very simple: the ATL wizard did most of the work. The component merely forwards counter strings and logging requests on to the CPhd class. The CPhd class is built with the Performance Data Helper (PDH) functions. To understand the CPhd class, you must first understand how the PDH API works.

Stepping backward into the CPhd class

Last time we used the CPhd class as a black box; this time we'll dissect it and explain the PDH API it wraps. CPhd simplifies getting performance data from the PDH functions. The task of getting performance data with the PDH functions maps nicely onto a C++ class: the PDH open-query and close-query functions are a natural fit for a constructor and destructor.

You can simply plug CPhd into your existing C++ application and programmatically get performance data. You might want to do this to prove/disprove a memory or other resource leak, or you may just want programmatic control over logging of performance data. Because the source is included you can easily modify it if you have particular needs.

The default CPhd constructor adds three counters ("Private Bytes," "Working Set," and "Handle Count" of the running process). The other CPhd constructor lets you add an arbitrary number of counters.

The two constructors and a logging method were covered last time, so you can refer back to Part I or download the complete sample from my Knowledge Base article Q215496, "SAMPLE: Find memory leaks and monitor your Java, VB and ASP Objects" (search by article ID number).

All the way back: A PDH review

Applications, services, drivers, and Windows NT OS objects expose performance data to the Windows NT registry. In ancient times, programmers had to make calls to the registry to get this information (using RegQueryValueEx). While you can still use the registry interface to access this data (as the PerfMon application does), the complexity is like trying to write a GUI application in assembly. The PDH API greatly simplifies collecting this data (compared to the registry approach). The API is documented in the Platform SDK.

PDH samples are found in the Platform SDK.

A subset of the C API is also directly supported for Visual Basic and is documented in the Platform SDK.

PDH architecture

PDH functions work with queries and counters. A query is a set of performance counters that are grouped together so that you can collect their data at the same time. The CPhd C++ class has one query member (m_hQuery), which is initialized in the constructor, used in all the PDH calls (that CPhd makes), and freed in the destructor.
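
The lifetime rule just described (open in the constructor, close in the destructor) can be sketched in portable C++. This is only an illustration of the pattern, not the CPhd source: fakeOpenQuery and fakeCloseQuery are stand-ins invented here so the sketch compiles without pdh.h, and QueryOwner is a hypothetical class name.

```cpp
// Stand-ins for PdhOpenQuery / PdhCloseQuery so the sketch compiles on any
// platform. They only count open handles; they are NOT the real PDH calls.
typedef int FakeQueryHandle;
static int g_openQueries = 0;

inline long fakeOpenQuery(FakeQueryHandle* handle) {
    *handle = 1;          // pretend handle value
    ++g_openQueries;
    return 0;             // 0 plays the role of ERROR_SUCCESS
}

inline long fakeCloseQuery(FakeQueryHandle) {
    --g_openQueries;
    return 0;
}

// Hypothetical class with the shape described above: one query handle,
// opened in the constructor and closed in the destructor.
class QueryOwner {
public:
    QueryOwner() : m_hQuery(0), m_ok(fakeOpenQuery(&m_hQuery) == 0) {}
    ~QueryOwner() { if (m_ok) fakeCloseQuery(m_hQuery); }
    bool ok() const { return m_ok; }

private:
    QueryOwner(const QueryOwner&);             // non-copyable: a handle has
    QueryOwner& operator=(const QueryOwner&);  // exactly one owner
    FakeQueryHandle m_hQuery;
    bool m_ok;
};
```

Tying the handle to an object's lifetime means the query is closed on every exit path, including early returns, without an explicit cleanup call.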

To get data from the PDH functions, you must supply a fully qualified counter string. Counter strings (called counter paths in the documentation) are similar to network path names. For example, "\\RICKA4\Memory\Pages/sec" is the counter used to request the current paging rate (pages read from or written to disk per second) on the computer RICKA4. As with file paths, the computer name is optional: "\Memory\Pages/sec" defaults to the local computer. Note that the '/' in "Pages/sec" is not a delimiter but just part of the counter name. Read it as "per," not as a path separator.
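
To make the layout of a counter path concrete, here is a minimal hand-rolled sketch that splits one into its parts. The CounterPath struct and parseCounterPath function are hypothetical names used only for illustration, and the sketch assumes the simple shape [\\machine]\object[(instance)]\counter with no backslashes inside instance names.

```cpp
#include <cstddef>
#include <string>

// Hypothetical struct for illustration only.
struct CounterPath {
    std::string machine;   // empty means the local computer
    std::string object;    // e.g. "Memory" or "Process"
    std::string instance;  // e.g. "cpuHog"; empty if there is no instance
    std::string counter;   // e.g. "Pages/sec" (the '/' is part of the name)
};

// Splits a path of the form [\\machine]\object[(instance)]\counter.
// Returns false if the string does not have that shape.
inline bool parseCounterPath(const std::string& path, CounterPath& out) {
    out = CounterPath();
    std::size_t pos = 0;
    if (path.compare(0, 2, "\\\\") == 0) {             // optional \\machine prefix
        std::size_t end = path.find('\\', 2);
        if (end == std::string::npos || end == 2) return false;
        out.machine = path.substr(2, end - 2);
        pos = end;
    }
    if (pos >= path.size() || path[pos] != '\\') return false;
    std::size_t sep = path.find('\\', pos + 1);        // backslash before the counter name
    if (sep == std::string::npos) return false;

    std::string object = path.substr(pos + 1, sep - pos - 1);
    out.counter = path.substr(sep + 1);
    if (object.empty() || out.counter.empty()) return false;

    std::size_t open = object.find('(');                // optional (instance) suffix
    if (open != std::string::npos && object[object.size() - 1] == ')') {
        out.instance = object.substr(open + 1, object.size() - open - 2);
        object.erase(open);
    }
    out.object = object;
    return !out.object.empty();
}
```

Only backslashes are treated as separators, so "Pages/sec" survives intact as the counter name.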

The PDH documentation mentions the use of wildcards when specifying counter paths, but the Windows NT 4 versions of PDH do not support them. Wildcards are being considered for Windows® 2000.

Most counters require at least two data collections (because they report an average value over time, not the absolute value of the counter). For example, "% CPU usage" needs at least two samplings to figure out the average CPU usage over the time between two calls. It doesn't make sense to ask what the CPU usage of a process is at a specific time. If the process has a running thread, its CPU usage is 100 percent. If all the process threads are waiting to be scheduled when the CPU usage is queried, usage is zero. You don't really care if a process is running or preempted at a specific time, but what percent of the CPU time the process was using in the last n seconds.
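
The arithmetic behind a two-sample counter can be sketched as follows. This is an illustration of the idea rather than what PDH does internally: RawSample, ratePerSecond, and percentBusy are hypothetical names, and the sketch assumes each raw reading is a running total paired with a timestamp in the same tick units.

```cpp
#include <cstdint>

// Hypothetical raw reading: a running total plus the time it was taken.
struct RawSample {
    std::uint64_t value;      // e.g. total pages, or accumulated busy ticks
    std::uint64_t timeTicks;  // timestamp, in ticks
};

// Average rate of change between two readings, in units per second.
inline double ratePerSecond(const RawSample& older, const RawSample& newer,
                            double ticksPerSecond) {
    if (newer.timeTicks <= older.timeTicks) return 0.0;   // no elapsed time
    double elapsedSeconds = (newer.timeTicks - older.timeTicks) / ticksPerSecond;
    double delta = static_cast<double>(newer.value) - static_cast<double>(older.value);
    return delta / elapsedSeconds;
}

// Percentage of the interval spent busy, where value holds accumulated
// busy ticks measured in the same units as timeTicks.
inline double percentBusy(const RawSample& older, const RawSample& newer) {
    if (newer.timeTicks <= older.timeTicks) return 0.0;
    double busy = static_cast<double>(newer.value) - static_cast<double>(older.value);
    return 100.0 * busy / static_cast<double>(newer.timeTicks - older.timeTicks);
}
```

With a single reading there is no interval to divide by, which is why the first collection of such a counter cannot produce a value.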

Important PDH functions

While the PDH API has 24 functions, we'll discuss only the 6 that the sample calls. Other useful PDH functions include PdhEnumObjects and PdhEnumObjectItems, which are used to enumerate threads, processes, and other system objects. Windows 2000 will add several new functions.

All of the PDH functions return status as a PDH_STATUS (typedef long). The following list details the PDH functions CPhd uses:

PdhOpenQuery

PDH_STATUS PdhOpenQuery(
   LPVOID pReserved,  // reserved, must be NULL
   DWORD dwUserData,  // Not used in CPhd
   HQUERY *phQuery    // pointer to query handle used in all PDH calls
);

PdhOpenQuery is a simple one-time call that initializes the query handle that all of the remaining methods use. It is called in the CPhd constructor. Visual C++ 6.0 shipped with a newer pdh.h file that has ANSI and Unicode prototypes for each function. During compilation the preprocessor replaces PdhOpenQuery with PdhOpenQueryA or PdhOpenQueryW, depending on whether you're doing an ANSI or Unicode build. (The "A" specifies ANSI and "W" indicates "wide," or Unicode). Unless you've recently downloaded the Platform SDK, your pdh.dll won't include either the ANSI or Unicode version of PdhOpenQuery, so Visual C++ 6.0 requires the following hack:

#undef PdhOpenQuery   //       PdhOpenQueryA or PdhOpenQueryW
extern "C" long __stdcall 
PdhOpenQuery (LPCSTR  szD, DWORD  dw, HQUERY  *phQ );

If you don't want to bother with this hack, you can download the latest Platform SDK from http://msdn.microsoft.com/developer/sdk/—but you'll have to install the newer pdh.dll on all machines you want to run your Visual C++ 6.0-built application on. If you just need the pdh.dll, you can download my Knowledge Base article Q194655, "SAMPLE: Using the PHD Class to Isolate Memory Leaks" (search by article ID number). The sample also contains the complete source for CPhd.

PdhAddCounter

PDH_STATUS PdhAddCounter(
   HQUERY hQuery,               // handle to the query
   LPCTSTR szFullCounterPath,   // path of the counter
   DWORD dwUserData,            // not used in CPhd
   HCOUNTER *phCounter          // counter handle buffer
);

PdhAddCounter is called once for each counter path (for example, "\Memory\Pages/sec"). This function trades the counter path in for a counter handle (returned through phCounter), which we will use in PdhGetFormattedCounterValue to get useful counter data. PdhAddCounter is called in the CPhd constructor, once for each counter specified.

PdhCollectQueryData

PDH_STATUS PdhCollectQueryData(  
   HQUERY hQuery  // handle of the query
);

PdhCollectQueryData collects the raw performance data for all the counters we have specified. We call this method each time we want new data. One call collects data for all the counters initialized with PdhAddCounter. This method is called by the logData method of CPhd and in the CPhd constructors (recall that many counters require two query data calls prior to reporting a counter value). While the CPhd class is ready to report data the first time logData is called, the PerfMon application is not as clever. Counters that require two PdhCollectQueryData calls will not display data on the first update.

PdhGetFormattedCounterValue

PDH_STATUS PdhGetFormattedCounterValue(
   HCOUNTER hCounter,            // handle of the counter
   DWORD dwFormat,               // format long | double | 64 bit-int
   LPDWORD lpdwType,             // counter type, not used in CPhd
   PPDH_FMT_COUNTERVALUE pValue  // counter value
);

This function is called once for each counter. The default format for the CPhd class is PDH_FMT_LONG (long), but you can set it to double or int64 (a 64-bit integer). We call PdhCollectQueryData in the CPhd constructor so we are prepared to get counter data the first time the CPhd logData method is called. PdhCollectQueryData itself is called once per sample (in the constructors and in the logData method), because a single call covers every counter in the query.

PdhCloseQuery

PDH_STATUS PdhCloseQuery(
   HQUERY hQuery  // handle to close.
);

PdhCloseQuery is called when you've finished collecting performance data. Not only does it close the query handle, it also closes all the counters associated with this handle. The CPhd class calls PdhCloseQuery in its destructor.

PdhBrowseCounters

PDH_STATUS PdhBrowseCounters(
   PPDH_BROWSE_DLG_CONFIG pBrowseDlgData  // pointer to struct 
);

PdhBrowseCounters displays a dialog box that allows you to browse all the counters on your system (including all other Windows NT systems on the network where you have permission). The Browse Counters dialog box is very similar to the Add to Chart dialog box that PerfMon displays for selecting a counter. PdhBrowseCounters takes only one argument, a pointer to a PDH_BROWSE_DLG_CONFIG structure. Although this structure has 19 elements, we only need to set the path buffer (where the complete counter path string will be stored) and tell it we want only one counter string.

Using PdhBrowseCounters to Get the Right Path String

PdhBrowseCounters is not used in the CPhd class, but is used by the sample main.cpp included with CPhd. The most frequent problem programmers have when using the PDH API is correctly specifying the counter path string. For example, it's not obvious whether you should use "\Processor(0)\% Processor Time" or "\Processor(0)\%Processor Time". You must spell the counter string exactly or PdhAddCounter will fail.

The code that follows allows you to browse all the counters on your network, print the counter path in the console window, and copy it to the clipboard. The string can then be pasted into your source, guaranteeing a valid string. Just be sure to double each backslash in C++. The sample brings up the Browse dialog box and prints the selected counter until the Cancel button is selected.

I'm using the static modifier to initialize the structure to all zeros (ANSI C/C++ guarantees this). If you don't trust PdhBrowseCounters to not change any of the structure elements (besides the counter path) between calls, you could use memset inside the loop. One of the structure elements is a pointer to a callback function, so if you don't zero out at least that member, you're almost certain to crash with an access violation.

#include "stdafx.h"
#include "rkLeak.h"

// Print the counter path and place a copy of it on the clipboard.
void paste2clip(char *p){
    printf("Perf String: \n \"%s\" \n",p);
    char   *pCopy;     HGLOBAL hglbCopy; 
    if (!OpenClipboard(NULL))         
        return ;     
    EmptyClipboard(); 
    // Clipboard data must live in movable global memory.
    hglbCopy = GlobalAlloc(GMEM_MOVEABLE,  strlen(p)+1);
    if (hglbCopy == NULL) {
        CloseClipboard();
        return;
    }
    pCopy = (char *) GlobalLock(hglbCopy); 
    strcpy(pCopy,p); 
    GlobalUnlock(hglbCopy);
    SetClipboardData(CF_TEXT, hglbCopy);    // hand the memory handle to the clipboard
    CloseClipboard(); 
}

int main(){
    static PDH_BROWSE_DLG_CONFIG brwsDlg;   // static storage is zero-initialized
    char cntrPath[1024];
    brwsDlg.bSingleCounterPerDialog = 1;    // return one counter per dialog
    brwsDlg.szReturnPathBuffer = cntrPath;  // receives the selected counter path
    brwsDlg.dwDefaultDetailLevel= PERF_DETAIL_WIZARD;
    brwsDlg.cchReturnPathLength = sizeof(cntrPath);
    // Loop until the user cancels the dialog (or the call fails).
    while(1){
        if ( ERROR_SUCCESS != PdhBrowseCounters( &brwsDlg ) )
            return 0;
        paste2clip(cntrPath);
    }
}

I have specified a counter detail level of wizard (PERF_DETAIL_WIZARD). There are four possible levels: novice, advanced, expert, and wizard. Supposedly, each level up would offer more counters, but in reality there are two modes: novice and not novice (that is, advanced, expert, and wizard all give you the same counters).
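
Doubling the backslashes by hand is error-prone, so it can be automated. The following is a minimal sketch, not part of the sample: toCppLiteral is a hypothetical helper that turns a counter path as displayed (single backslashes) into the text of a C++ string literal.

```cpp
#include <string>

// Hypothetical helper: produce the source text of a C++ string literal for
// the given counter path, doubling backslashes and escaping double quotes.
inline std::string toCppLiteral(const std::string& path) {
    std::string out = "\"";
    for (std::string::size_type i = 0; i < path.size(); ++i) {
        char c = path[i];
        if (c == '\\' || c == '"') out += '\\';   // escape character first
        out += c;
    }
    out += '"';
    return out;
}
```

Given the path \Memory\Pages/sec, the helper yields the literal text "\\Memory\\Pages/sec", quotes included, ready to paste into source code.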

Choose Your Weapon

Most programmers won't need to write applications using the PDH API, but eventually you'll need to monitor your running app or pinpoint a resource leak. You may want to automate measuring the memory footprint of successive versions of your product (to prevent creeping code bloat). If you choose to do this programmatically, the CPhd class or the R1PHDmod component may be all you need, or at least a good start.

If you don't need to control performance data collection directly from within your application, PerfMon is the tool to use, as described last time.

Get better performance

This article (and Part I) provides you with a good understanding of performance data and a framework to collect this information. In a related article, "Alerts Are Cheap Insurance," I examine using PerfMon's alert mechanism to indicate that a problem is developing.