

November 1996


Matt Pietrek is the author of Windows 95 System Programming Secrets (IDG Books, 1995). He works at NuMega Technologies Inc., and can be reached at 71774.362@compuserve.com.


In my August 1996 column, I described some of the APIs in PSAPI.DLL. This DLL is the closest thing that Windows NT® has to true system-level information APIs. For example, PSAPI.DLL lets you obtain system information like the list of running processes. While you can get much of this information from the performance data in the Windows NT registry, it's messy and complicated. PSAPI.DLL is much simpler and faster to use than the registry. PSAPI.DLL isn't a standard part of Windows NT, but it is a redistributable component from the new Win32® SDK that supports Windows NT 4.0.

This month, let's go over the remaining APIs in PSAPI.DLL. These APIs fall into two categories that are related enough for me to incorporate them all into a demo program. The first category obtains information about the working set of a process, while the second category relates to retrieving the names of memory-mapped files. First, I'll examine all of these APIs, then finish by walking through the demonstration program.

Have you ever wondered about the "Mem Usage" column in the Windows NT 4.0 Task Manager? Where does it get those numbers? They are the actual working set of each process. What exactly is a working set? Alas, the term has at least two different meanings. In the traditional computer science sense, the working set of a process is the smallest amount of physical RAM the process needs to continue executing without incurring any page faults. A process with a working set of 12KB would need only 12KB of physical RAM to hold all the code and data that's being accessed at the moment.

The other meaning of the term "working set" is the amount of physical RAM that a process is currently using. This second interpretation is what Microsoft uses. To quote from the Win32 SDK documentation:

The working set of a process is the set of memory pages currently visible to the process in physical RAM memory. These pages are resident and available for an application to use without triggering a page fault. The size of the working set of a process is specified in bytes. The minimum and maximum working set sizes affect the virtual memory paging behavior of a process.

PSAPI.DLL provides access to process working-set information in two ways. The first is via the QueryWorkingSet API. This API fills a buffer with information about every page that's currently part of the working set of the specified process. The only memory pages reported are those physically present at the exact moment you call QueryWorkingSet. The other type of working set information comes from the GetWsChanges API. This routine reports just the pages that have been mapped into memory since monitoring began (you begin monitoring with the InitializeProcessForWsWatch API, which I will discuss later). This is useful for situations such as finding out how much additional RAM a particular operation (for instance, saving a file) takes.

Let's look at the QueryWorkingSet API first, since it's conceptually simpler.

 BOOL WINAPI QueryWorkingSet( HANDLE hProcess, PVOID pv, 
                             DWORD cb );

The first parameter is a process handle for a running process. The easiest way to get a process handle for an arbitrary process is with the Win32 OpenProcess API. If you're not up on the OpenProcess API and process IDs, my August column on PSAPI.DLL describes them.

Returning to the QueryWorkingSet API, the second parameter is a pointer to a memory buffer that QueryWorkingSet writes a series of DWORDs to. The third parameter tells QueryWorkingSet how big the buffer is so that QueryWorkingSet won't write past the end of the buffer. Unfortunately, there's no way to find out ahead of time how big a buffer QueryWorkingSet will need. You have to pass a buffer that's hopefully big enough and be prepared to handle the case where it's not (the API will return FALSE, and you need to try again with a bigger buffer).
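The try-again-with-a-bigger-buffer pattern described above might look something like the following. This is a minimal sketch in portable C, not the code from the demo program: query_fn is a hypothetical callback standing in for QueryWorkingSet bound to one process handle, demo_query is a simulated query that exists only for illustration, and the starting size and doubling policy are my own assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QueryWorkingSet bound to one process:
   returns nonzero on success, 0 if a buffer of cb bytes is too small. */
typedef int (*query_fn)(void *buffer, uint32_t cb);

/* Calls query with progressively larger buffers until it succeeds.
   Returns a malloc'ed buffer that the caller must free, or NULL if no
   size up to max_cb worked. The doubling policy is an assumption. */
uint32_t *query_with_retry(query_fn query, uint32_t initial_cb, uint32_t max_cb)
{
    assert(query != NULL);
    for (uint32_t cb = initial_cb; cb != 0 && cb <= max_cb; ) {
        uint32_t *buffer = malloc(cb);
        if (buffer == NULL)
            return NULL;
        if (query(buffer, cb))
            return buffer;      /* big enough: hand the data back */
        free(buffer);           /* too small: grow and try again */
        if (cb > max_cb / 2)
            break;              /* doubling would exceed the limit */
        cb *= 2;
    }
    return NULL;
}

/* Simulated query, for illustration only: it "needs" 16 bytes and
   writes a count of 3 followed by three page entries. */
int demo_query(void *buffer, uint32_t cb)
{
    static const uint32_t data[4] = { 3, 0x00400103u, 0x00480101u, 0x00500004u };
    if (cb < sizeof data)
        return 0;
    memcpy(buffer, data, sizeof data);
    return 1;
}
```

In a real program, the callback would wrap QueryWorkingSet and the process handle, and a failure for any reason other than a too-small buffer should end the loop rather than trigger another attempt.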

The meanings of the DWORDs that QueryWorkingSet writes to the buffer are slightly strange, as I found out the hard way! The first DWORD contains the number of valid DWORD values that follow the first DWORD in the buffer. Each remaining DWORD represents one page in the process working set, and is composed of a linear address combined with various flag values.

To decode these DWORDs, you need to separate the high 20 bits from the low 12 bits (the low portion is actually 13 bits on the DEC Alpha, which has 8KB pages). It's easy to do this with the bitwise AND operator. The high 20 bits (obtained by doing a bitwise AND with 0xFFFFF000) contain the linear address of a page of memory mapped into the specified process. The bottom 12 bits are flag values that define operating system attributes for the page.

The exact meanings of the bit values in the low 12 bits aren't defined in PSAPI.H or anywhere else that I'm aware of. Some experimentation showed the following bit interpretations to be consistent:

0x001   The page is read-only (if bit 0x004 is not set)
0x002   The page is executable (code)
0x004   The page is writeable (if bit 0x001 is not set)
0x005   The page is copy-on-write (bits 0x001 and 0x004 are both set)
0x100   The page can be shared across processes, given the right conditions

As an example of what QueryWorkingSet returns, consider the following DWORDs:

 0x00000003
0x00400103
0x00480101
0x00500004

Breaking apart the bits, these DWORDs would be interpreted like this:

0x00000003   3 DWORDs to follow
0x00400103   Linear address 0x00400000, read-only, executable, shared
0x00480101   Linear address 0x00480000, read-only, shared
0x00500004   Linear address 0x00500000, writeable
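The decoding just shown can be expressed in a few lines of portable C. Treat this as a sketch under assumptions: the WS_ constant names and the ws_page structure are mine, not anything from PSAPI.H, the flag meanings are the experimentally derived ones listed earlier, and the mask assumes 4KB pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Names below are hypothetical; the values come from experimentation. */
#define WS_PAGE_MASK   0xFFFFF000u   /* high 20 bits: page address (4KB pages) */
#define WS_FLAG_MASK   0x00000FFFu   /* low 12 bits: attribute flags */
#define WS_READONLY    0x001u
#define WS_EXECUTABLE  0x002u
#define WS_WRITEABLE   0x004u
#define WS_SHARED      0x100u

typedef struct {
    uint32_t address;
    int read_only, executable, writeable, copy_on_write, shared;
} ws_page;

/* Splits one working-set DWORD into its address and attribute flags. */
ws_page decode_ws_entry(uint32_t entry)
{
    ws_page p;
    uint32_t flags = entry & WS_FLAG_MASK;
    int ro = (flags & WS_READONLY) != 0;
    int rw = (flags & WS_WRITEABLE) != 0;

    p.address       = entry & WS_PAGE_MASK;
    p.copy_on_write = ro && rw;     /* 0x001 and 0x004 both set */
    p.read_only     = ro && !rw;
    p.writeable     = rw && !ro;
    p.executable    = (flags & WS_EXECUTABLE) != 0;
    p.shared        = (flags & WS_SHARED) != 0;
    return p;
}

/* Decodes a whole buffer: buffer[0] is the number of entries that follow.
   Writes at most max_out pages to out and returns how many were written. */
size_t decode_working_set(const uint32_t *buffer, ws_page *out, size_t max_out)
{
    assert(buffer != NULL && out != NULL);
    size_t count = buffer[0];
    if (count > max_out)
        count = max_out;
    for (size_t i = 0; i < count; i++)
        out[i] = decode_ws_entry(buffer[i + 1]);
    return count;
}
```

Feeding the four example DWORDs to decode_working_set yields three pages at 0x00400000, 0x00480000, and 0x00500000 with the attributes given in the list above.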

Now that you know how to get the entire working set of a process using QueryWorkingSet, let's examine the other working set APIs. The primary API is GetWsChanges.

 BOOL WINAPI GetWsChanges(HANDLE hProcess,
   PPSAPI_WS_WATCH_INFORMATION lpWatchInfo, DWORD cb );

The first parameter is a process handle specifying which process you want information for. The last two parameters specify the buffer GetWsChanges writes its information to.

As mentioned earlier, GetWsChanges returns information on what changes to the working set have occurred since monitoring began. The format of the data returned by GetWsChanges is quite a bit different from what QueryWorkingSet returns. Luckily, PSAPI.H tells you exactly what the format of the return data is:

 typedef struct _PSAPI_WS_WATCH_INFORMATION {
    LPVOID FaultingPc;
    LPVOID FaultingVa;
} PSAPI_WS_WATCH_INFORMATION...

GetWsChanges fills in an array of these structures, one structure for each new page added to the working set of the process. The second LPVOID in the structure (FaultingVa) contains the linear memory address of the page that was added to the working set. The first LPVOID in the structure (FaultingPc) is the address of the instruction that caused the page fault referred to by the FaultingVa address. In simpler terms, each structure tells you which page was brought into memory and which instruction caused it to be paged in.

Before racing out and experimenting with GetWsChanges, there's another API, InitializeProcessForWsWatch, that you need to know about. Before you can use GetWsChanges, you need to first pass the process handle to InitializeProcessForWsWatch. Not all processes let you read their working set information due to the security in Windows NT, so be sure that the API returns TRUE.

The last working set API in PSAPI.DLL is EmptyWorkingSet. It takes one parameter, (you guessed it) a process handle. Calling the API causes Windows NT to remove as many pages as possible from the process working set. Why would you want to do this? Primarily for testing and tuning. One note on EmptyWorkingSet: the API is, in essence, obsolete. The Win32 SetProcessWorkingSetSize API does the same thing if you pass it 0xFFFFFFFF for the minimum and maximum sizes.

With the working-set APIs out of the way, the only remaining APIs to cover in PSAPI.DLL are GetMappedFileNameA and GetMappedFileNameW. As you can probably guess, the version ending in A is the ANSI version, while the version ending in W is the Unicode version. The ANSI version of GetMappedFileName is prototyped like this:

 DWORD WINAPI GetMappedFileNameA(HANDLE hProcess, 
                                LPVOID lpv,
                                LPSTR lpFilename, 
                                DWORD nSize );

The hProcess and lpv parameters specify a linear address in a specific process. If this address is somewhere within a memory-mapped file, the lpFilename buffer is filled with the name of that memory-mapped file. The nSize parameter tells the API how big the lpFilename buffer is. It's interesting that the filenames returned by GetMappedFileName don't use drive letters. Rather, they're in their device form. For example:

 \Device\Harddisk0\Partition1\WINNT\System32\ctype.nls
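If you would rather show a drive letter, the device prefix has to be translated. The following is a minimal sketch of that translation in portable C, assuming you already have a table of device names and their drive letters (on Windows NT, the Win32 QueryDosDevice API is one way to build such a table). The device_map type, the function name, and the table contents are hypothetical, and the comparison is case-sensitive for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table entry: one device name and its drive letter. */
typedef struct {
    const char *device;   /* for example "\\Device\\Harddisk0\\Partition1" */
    char        drive;    /* for example 'C' */
} device_map;

/* Rewrites a device-form path into drive-letter form.
   Returns 1 on success, 0 if no table entry matches or out is too small. */
int device_path_to_dos(const char *path, const device_map *map, size_t count,
                       char *out, size_t out_size)
{
    assert(path != NULL && map != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(map[i].device);
        /* The prefix must end on a path separator (or the end of the
           string) so that Partition1 does not match Partition10. */
        if (strncmp(path, map[i].device, len) == 0 &&
            (path[len] == '\\' || path[len] == '\0')) {
            int n = snprintf(out, out_size, "%c:%s", map[i].drive, path + len);
            return n >= 0 && (size_t)n < out_size;
        }
    }
    return 0;
}
```

With a table that maps \Device\Harddisk0\Partition1 to C, the example path above comes out as C:\WINNT\System32\ctype.nls.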

The PSAPIWorkingSetDemo Program

To pull together all of the APIs that I've described here, I wrote the PSAPIWorkingSetDemo program (see Figure 1). This program was a real stretch for me, at least the user interface was. Normally, for a demo program, I would create an app with a dialog as the main window. For PSAPIWorkingSetDemo, I used two, count 'em, two dialogs. I hope you appreciate my extra labor.

The main window of PSAPIWorkingSetDemo is shown in Figure 2. The top list box contains a list of all processes (and their IDs) that I obtained using the PSAPI APIs described in my earlier column. Whenever you click on a process in the top list box, the bottom list box updates with detailed information about the working set of the selected process.

Figure 2 PSAPI Working Set Demo

On the top-right side, you'll find a summary of the working set information shown in the bottom list box. The "total" field is exactly how much RAM is used for the selected process (including RAM used by shared system DLLs like KERNEL32.DLL). This number should always be the same value that you see reported from the Windows NT 4.0 Task Manager. I didn't invest the time to make PSAPIWorkingSetDemo update these fields automatically. Instead, I'll drag out the old "Left as an exercise for the reader" ploy.

The Private field shows the amount of memory used by pages that cannot be shared with other processes in the system. Examples of this would be stack and heap pages. The Shared field shows how much memory is used by pages that could theoretically be shared with other processes. For example, the code pages of EXEs and DLLs can usually be shared across processes. The Page Tables field shows how much RAM is taken up by the page mapping tables that translate between virtual addresses and the physical addresses that go out on the computer's bus.

Along the bottom row of the main PSAPIWorkingSetDemo window are four buttons. When pressed, the Empty working set button calls EmptyWorkingSet on the process that's selected in the top list box. The Start Delta and End Delta buttons are used in tandem. I'll come back to them later when I describe the other dialog.

Let's zoom in on the bottom list box (the working set details) and see what it's all about. The list box is populated by calling QueryWorkingSet and analyzing the output before sending it on to the list box. All of this is done by the AddWorkingSetInfo routine in PSAPIWorkingSetDemo.CPP. As I describe what happens, refer to that routine's code to see my implementation.

The AddWorkingSetInfo routine starts by calling OpenProcess on the specified process ID to get back a process handle. Next, the code calls QueryWorkingSet to get an array of all the pages in the process. Since the addresses in the array aren't sorted, I call qsort to put them into ascending address order. Once sorted, the bulk of the routine just iterates through every page and adds information about the page to the bottom list box. Actually, this isn't strictly true. Whenever I see two or more adjacent memory pages with the same attributes (for example, readonly, shared, and so on), I combine them into one line in the output. This is why the second column in every output line is a size that's a multiple of 4KB.
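The sort-then-combine step can be sketched as follows. This is an illustration rather than the AddWorkingSetInfo source: the ws_range type and function names are my own, the input is assumed to be the page DWORDs with the leading count DWORD already skipped, and pages are assumed to be 4KB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE_BYTES 0x1000u   /* 4KB pages, as on x86 */

typedef struct {
    uint32_t start;   /* address of the first page in the run */
    uint32_t size;    /* run length in bytes, a multiple of 4KB */
    uint32_t flags;   /* low 12 bits shared by every page in the run */
} ws_range;

/* qsort callback: ascending order. Each page appears only once, so
   comparing whole DWORDs orders the entries by page address. */
static int compare_entries(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a;
    uint32_t y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Sorts entries in place, then merges adjacent pages that have identical
   flags into one range. Returns the number of ranges written to out,
   which never exceeds max_out. */
size_t coalesce_pages(uint32_t *entries, size_t count,
                      ws_range *out, size_t max_out)
{
    assert(entries != NULL && out != NULL);
    size_t n = 0;

    qsort(entries, count, sizeof entries[0], compare_entries);
    for (size_t i = 0; i < count; i++) {
        uint32_t addr  = entries[i] & 0xFFFFF000u;
        uint32_t flags = entries[i] & 0x00000FFFu;

        if (n > 0 && out[n - 1].flags == flags &&
            out[n - 1].start + out[n - 1].size == addr) {
            out[n - 1].size += PAGE_SIZE_BYTES;   /* extend the current run */
        } else {
            if (n == max_out)
                break;                            /* no room for a new run */
            out[n].start = addr;
            out[n].size  = PAGE_SIZE_BYTES;
            out[n].flags = flags;
            n++;
        }
    }
    return n;
}
```

Three consecutive read-only, executable, shared pages starting at 0x00400000 would collapse into a single 12KB range, which is why every size in the output is a multiple of 4KB.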

For each reported range of pages in the output list box, the function emits the starting address of the range, the size of the range, the attributes of the pages and, if possible, where the page came from. The attributes of the page are extracted from the bottom 12 bits of the page's DWORD description, and are exactly the attributes I described earlier (readonly, executable, and so on). In the list box, I've abbreviated the attributes as follows:

0x001   RO (read-only)
0x002   E (executable)
0x004   RW (readable, writeable)
0x005   CW (copy-on-write)
0x100   S (shared)
0x000   P (private, if 0x100 bit not set)
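Turning the flag bits into these abbreviations is a simple chain of tests. The sketch below shows one way to do it; the function name, the space-separated layout, and the ordering of the abbreviations are my assumptions and may not match the demo program's exact output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Builds a string such as "RO E S" from the low 12 bits of a working-set
   DWORD. out must hold at least 16 bytes; the longest result here is
   "CW E S" (six characters plus the terminator). Returns out. */
const char *format_ws_flags(uint32_t entry, char *out, size_t out_size)
{
    assert(out != NULL && out_size >= 16);
    uint32_t f = entry & 0xFFFu;

    out[0] = '\0';
    if ((f & 0x005u) == 0x005u)
        strcat(out, "CW ");            /* read-only and writeable bits both set */
    else if (f & 0x001u)
        strcat(out, "RO ");
    else if (f & 0x004u)
        strcat(out, "RW ");
    if (f & 0x002u)
        strcat(out, "E ");
    strcat(out, (f & 0x100u) ? "S" : "P");   /* shared, otherwise private */
    return out;
}
```

The copy-on-write test has to come first, because a page with both 0x001 and 0x004 set would otherwise be reported as read-only.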

The last column for each page is the owner of the page (if I was able to determine an owner). The primary means of identifying who owns a page is the GetModuleNameAndSectionInfo function, a routine I wrote, which I'll describe later. If GetModuleNameAndSectionInfo didn't find an owner, the page may be from a memory-mapped file. I check for this possibility by calling GetMappedFileNameA. If the page is from a memory-mapped file, the API returns a nonzero value, and my code emits the memory-mapped file's name as the page's owner.

If a working-set page doesn't come from an EXE or DLL, and if it's not from a memory-mapped file, its owner field will be blank. There are numerous ways that this can happen. For starters, all pages with addresses above 2GB are from the ring 0 (kernel mode) portion of Windows NT. I wasn't able to come up with a good way to identify the owners of these pages while working within the confines of ring 3 user-mode code. As for pages below 2GB in memory without a listed owner, they could be part of a stack or a heap, or belong to system data structures like the thread information block. See my May 1996 column for details on the thread information block.

Having explained the working-set information in the main dialog, let's now turn to those Start Delta and End Delta buttons. These two buttons demonstrate the other portion of PSAPI.DLL's working set functionality, the GetWsChanges API. When you press Start Delta, PSAPIWorkingSetDemo begins collecting working set additions to the currently selected process. The End Delta button causes PSAPIWorkingSetDemo to display a second dialog with information about each new page added since the Start Delta button was pressed. This dialog, entitled Working Set Delta, is shown in Figure 3.

Figure 3 Working Set Delta

When you press the Start Delta button, the StartWorkingSetDelta function in PSAPIWorkingSetDelta.CPP gets control. It begins by opening a handle to the selected process, and then passes that handle to InitializeProcessForWsWatch. Next, the function calls GetWsChanges. The code ignores the results of this call. Why bother to call GetWsChanges and then ignore the results? Calling this API clears out all of the information about working-set pages up to the moment that the Start Delta button was pressed.

The End Delta button causes the EndWorkingSetDelta function, also in PSAPIWorkingSetDelta.CPP, to take over. This function calls GetWsChanges to get all the working set additions into an array of PSAPI_WS_WATCH_INFORMATION structures. The FillDeltaListbox function is where all processing of the working-set additions occurs. The outermost loop of this function walks through every valid PSAPI_WS_WATCH_INFORMATION and throws out all working set changes made by ring 0 system code above 2GB. The end of the PSAPI_WS_WATCH_INFORMATION array is indicated by an entry with a NULL linear address for either the FaultingVa or FaultingPc field.
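The shape of that outer loop can be sketched in portable C. This is not the FillDeltaListbox source: ws_watch_entry is a stand-in for PSAPI_WS_WATCH_INFORMATION that holds the two 32-bit addresses as integers so the code compiles anywhere, and I'm assuming that "made by ring 0 code" means a FaultingPc at or above 2GB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable stand-in for PSAPI_WS_WATCH_INFORMATION (32-bit addresses). */
typedef struct {
    uint32_t faulting_pc;   /* instruction that caused the page fault */
    uint32_t faulting_va;   /* address of the page that was faulted in */
} ws_watch_entry;

#define USER_MODE_LIMIT 0x80000000u   /* 2GB: kernel addresses start here */

/* Counts the entries caused by user-mode code. The walk stops at the
   terminating entry (either field zero); max_entries bounds the walk in
   case the terminator is missing. */
size_t count_user_mode_faults(const ws_watch_entry *entries, size_t max_entries)
{
    assert(entries != NULL);
    size_t kept = 0;

    for (size_t i = 0; i < max_entries; i++) {
        if (entries[i].faulting_pc == 0 || entries[i].faulting_va == 0)
            break;                      /* end of the valid entries */
        if (entries[i].faulting_pc >= USER_MODE_LIMIT)
            continue;                   /* fault caused by ring 0 code: skip */
        kept++;
    }
    return kept;
}
```

A real version would do its per-entry work (decoding the addresses and adding a line to the list box) where this sketch just increments a counter.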

For every remaining PSAPI_WS_WATCH_INFORMATION, FillDeltaListbox adds a line to the Working Set Delta list box. Each line contains the FaultingVa and FaultingPc addresses at a minimum. In addition, the code attempts to decode those addresses to something more meaningful. Once again, I fall back on my GetModuleNameAndSectionInfo function. In the best case, my code can give information about where the new page is from and who forced it into memory. For example, the line below says that the page encompassing the address 10190275 was faulted in by the instruction at 1017FFAF.

 10190275 1017FFAF CWDLL32.DLL!.data(4) via CWDLL32.DLL!.text(1)

The remainder of the line gives more information. The page at 10190275 is in the CWDLL32.DLL .data section (section 4). The faulting instruction at address 1017FFAF is in the .text section (section 1) of CWDLL32.DLL.

The last part of the PSAPIWorkingSetDemo code to describe is the GetModuleNameAndSectionInfo function from PSAPIHELPER.CPP. This function takes a process handle, a linear address, and output buffers to write its results to. Using the process handle and the linear address, the function tries to determine which EXE or DLL the address falls within. Not content to stop there, the function burrows down another level and attempts to determine the specific code or data section within the EXE or DLL. If everything goes well, the function could tell you (for instance) that a particular address is within the .rsrc section of KERNEL32.DLL, and that the .rsrc section is the fourth section within KERNEL32.DLL.

The GetModuleNameAndSectionInfo code starts with a call to VirtualQueryEx. VirtualQueryEx fills in a MEMORY_BASIC_INFORMATION structure with information about the input address. It seems to be a little known fact that after a VirtualQueryEx call, one of the MEMORY_BASIC_INFORMATION fields (AllocationBase) contains the load address of the EXE or DLL that the input address belongs to. For instance, if you pass an address within USER32.DLL to VirtualQueryEx, upon return the AllocationBase field will contain USER32.DLL's load address.

Once you know that a particular address falls within an EXE or DLL, a little more work yields the specific section within the module. To figure out which section the address falls within, you need to look at the module's IMAGE_SECTION_HEADER table. The IMAGE_SECTION_HEADER and other executable file data structures are defined in WINNT.H. I won't attempt to describe the intricacies of locating the IMAGE_SECTION_HEADER here. The code in PSAPIHELPER.CPP is the best description.
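Once the section table is in hand, the lookup itself is a short loop. The sketch below assumes the table has already been located and copied into local memory; section_info is a simplified stand-in for the few IMAGE_SECTION_HEADER fields that matter here (the section name, its relative virtual address, and its size in memory), and the function name is mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the IMAGE_SECTION_HEADER fields used here. */
typedef struct {
    char     name[9];           /* section name, NUL-terminated for convenience */
    uint32_t virtual_address;   /* offset of the section from the load address */
    uint32_t virtual_size;      /* size of the section in memory */
} section_info;

/* Returns the 1-based index of the section containing address, or 0 if
   the address is outside every section. module_base plays the role of
   the AllocationBase value returned by VirtualQueryEx. */
size_t find_section(uint32_t module_base, uint32_t address,
                    const section_info *sections, size_t count)
{
    assert(sections != NULL);
    if (address < module_base)
        return 0;

    uint32_t rva = address - module_base;   /* offset within the module */
    for (size_t i = 0; i < count; i++) {
        uint32_t start = sections[i].virtual_address;
        if (rva >= start && rva - start < sections[i].virtual_size)
            return i + 1;
    }
    return 0;
}
```

An address that lands in the module's headers, before the first section, returns 0, so the caller can still report the module name without a section.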

One key point about GetModuleNameAndSectionInfo is that the code for traversing the module's data structures can't just access the data using pointers. Remember, the module being examined is in another process. The function has to use ReadProcessMemory to get at the module's data structures. That's why the code may seem a little more complex than it needs to be.

This ends my little tour of PSAPI.DLL. This month, I described the working set and memory-mapped file APIs. Be sure to refer back to my August column for a description of the process, module, and memory-management APIs. It would be nice if there were a unified set of Win32 APIs to retrieve this information on both Windows NT and Windows® 95. But, given all of the information that PSAPI.DLL provides, at least there's no reason for Windows NT programmers to be envious of the Windows 95 TOOLHELP32 APIs.

Have a question about programming in Windows? Send it to Matt at 71774.362@compuserve.com.

From the November 1996 issue of Microsoft Systems Journal.