Win32 Q&A, MSJ October 1999

Code for this article: Oct99Win32.exe (35KB)

Jeffrey Richter wrote Advanced Windows, Third Edition (Microsoft Press, 1998) and Windows 95: A Developer's Guide (M&T Books, 1995). Jeff is a consultant and teaches Win32 programming courses (www.solsem.com). He can be reached at www.JeffreyRichter.com.

In my July 1999 column, I discussed how you should create threads by calling the C runtime library's _beginthreadex function instead of the operating system's CreateThread function. That column apparently hit a hot button for many readers, since I received a lot of email about it.
      The concern centers on freeing the thread's _tiddata structure. Most readers agree that if you link to the C runtime's static library, you should create your threads using _beginthreadex or the _tiddata structure will leak when the thread terminates. However, many readers indicated that the DLL version of the C runtime library traps DLL_THREAD_DETACH notifications and automatically destroys the _tiddata structure. These readers then conclude that it's fine to use CreateThread instead of _beginthreadex. For the most part they are correct, but there are some remaining issues. The following discussion summarizes what you need to know.
      Calling CreateThread in an EXE or a DLL linked to the single-threaded, static C runtime library does not create a _tiddata structure, and C runtime functions will not work correctly. If you call _beginthreadex to create your threads (as I suggest), then you would discover this bug at compile time instead of runtime.
      Calling the CreateThread function in an EXE linked to the multithreaded, static C runtime library does leak the _tiddata structure. Calling CreateThread in a DLL linked to the multithreaded, static C runtime library frees the _tiddata structure. However, if your DLL calls DisableThreadLibraryCalls, then your DLL will not receive the DLL_THREAD_DETACH notification and the _tiddata block will leak.
      Calling the CreateThread function in an EXE or a DLL linked to the multithreaded, dynamic C runtime library frees the _tiddata structure. If you always link to the multithreaded, dynamic C runtime library, then the _tiddata structure will never leak. However, _beginthreadex does more than just guarantee freeing of the _tiddata structure when the thread terminates. Calling _beginthreadex also ensures that the signal function and floating point exceptions all work correctly.
      I still strongly recommend that you avoid CreateThread in favor of _beginthreadex. There is absolutely no disadvantage to calling _beginthreadex, and your code is much safer. Besides, there are several advantages: compile-time checking if you link to an improper C runtime library; guaranteed freeing of the _tiddata block regardless of whether you're creating an EXE or a DLL and regardless of which multithreaded C runtime library you're using; signal support; and floating point exception support. And who knows what _beginthreadex might do for you in the future? C'mon now, it's not that hard to do a global search and replace!
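      As a rough illustration of what the search and replace involves, here is a minimal sketch, not code from the column. The names SumJob, RunSumJob, SumThread, and RunSumJobOnThread are hypothetical. The Win32-specific part is wrapped in #ifdef _WIN32 so that the work function can be compiled and checked on its own. The differences from CreateThread are the thread function's signature, the cast of the return value to a HANDLE, and the 0 return value on failure.

```cpp
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
#include <process.h>   // declares _beginthreadex
#endif

// Hypothetical unit of work, kept free of Win32 types so the logic can be
// exercised without creating a thread.
struct SumJob {
   const int*  pnValues;
   std::size_t cValues;
   long        lResult;
};

void RunSumJob(SumJob* pJob) {
   long lSum = 0;
   for (std::size_t i = 0; i < pJob->cValues; ++i)
      lSum += pJob->pnValues[i];
   pJob->lResult = lSum;
}

#ifdef _WIN32
// Thread function in the form _beginthreadex expects. A CreateThread thread
// function would instead be declared as DWORD WINAPI Fn(PVOID).
unsigned __stdcall SumThread(void* pvParam) {
   RunSumJob(static_cast<SumJob*>(pvParam));
   return 0;
}

// Runs the job on a new thread and waits for it to finish. Returns false
// if the thread could not be created.
bool RunSumJobOnThread(SumJob* pJob) {
   unsigned uThreadId = 0;
   // _beginthreadex returns 0 on failure; on success the value is the
   // thread handle, which the caller must close.
   HANDLE hThread = reinterpret_cast<HANDLE>(
      _beginthreadex(NULL, 0, SumThread, pJob, 0, &uThreadId));
   if (hThread == NULL)
      return false;
   WaitForSingleObject(hThread, INFINITE);
   CloseHandle(hThread);
   return true;
}
#endif
```

      Keeping the work in a plain function and making the thread function a thin wrapper means that only the wrapper and the creation call change when moving from CreateThread to _beginthreadex.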

Address Windowing Extensions
      For the remainder of this month's column, I'd like to introduce a new memory management feature offered in Windows® 2000. I know that this feature addresses issues about which many readers have emailed me.
      Over time, applications require more and more memory. This is especially true of server applications; as an increasing number of clients make requests of the server, the server's performance diminishes. To improve performance, the server application needs to keep more of its data in RAM and reduce disk paging. Other classes of applications, such as database, engineering, and scientific apps, also require the ability to manipulate large blocks of storage. For all of these applications, a 32-bit address space is just not enough.
      To help these applications, Windows 2000 offers a new feature called Address Windowing Extensions (AWE), which addresses two issues: it allows applications to allocate RAM that is never swapped by the operating system to or from disk, and it allows an app to access more RAM than fits within the process's address space.
      Basically, AWE provides a way for an application to allocate one or more blocks of RAM. When allocated, these blocks are not visible in the process's address space. Then the application reserves a region of address space (using VirtualAlloc), which becomes the address window. The application calls a function that assigns one RAM block at a time to the address window. Assigning a RAM block to the address window is extremely fast (usually on the order of a few microseconds).
      Obviously, only one RAM block at a time can be accessed via a single address window. This makes your code more difficult to implement since you must explicitly call functions within your code to assign different RAM blocks to the address window as you need them. Figure 1 shows how to use AWE. As you can see, AWE is very simple to use. Now, let me point out a few interesting things about this code.
      The call to VirtualAlloc reserves a 1MB address window. Usually the address window is much bigger. You must select a size that is appropriate for the size of the RAM blocks your application requires. Of course, the largest contiguous free block that's available in your address space determines the largest window you can create. The MEM_RESERVE flag indicates that I am just reserving a region of addresses. The MEM_PHYSICAL flag (new for Windows 2000) indicates that this region will eventually be backed by physical RAM storage. One limitation of AWE is that all storage mapped to the address window must be readable and writable—hence PAGE_READWRITE is the only valid protection that can be passed to VirtualAlloc. In addition, you cannot use the VirtualProtect function to alter this protection.
      Allocating physical RAM is simply a matter of calling AllocateUserPhysicalPages:
BOOL AllocateUserPhysicalPages(
   HANDLE hProcess,
   PULONG_PTR pulRAMPages,
   PULONG_PTR aRAMPages);
This function allocates the number of RAM pages specified in the value pointed to by the pulRAMPages parameter and then assigns these pages to the process identified by the hProcess parameter.
      Each page of RAM is assigned a page frame number by the operating system. As the system selects pages of RAM for the allocation, it populates the array—pointed to by the aRAMPages parameter—with each RAM page's page frame number. The page frame numbers themselves are not useful in any way to your application; you should not examine the contents of this array, and you most definitely should not alter any of the values in it. Note that you neither know which pages of RAM were allocated to this block, nor should you care. When the address window shows the pages in the RAM block, they appear as a contiguous block of memory. This makes the RAM easy to use and frees you from having to understand exactly what the system is doing internally.
      When the function returns, the value in pulRAMPages indicates the number of pages that the function allocated successfully. This will usually be the same value that you passed to the function, but it can also be a smaller value.
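      The page count and the "fewer pages than requested" case can be sketched as follows. This is a minimal sketch under stated assumptions, not the column's code: PagesNeeded and AllocateRamBlock are hypothetical names, the page size is taken from GetSystemInfo, and treating a partial allocation as a failure is just one possible policy. The Win32-specific part is wrapped in #ifdef _WIN32 so the arithmetic can be compiled on its own.

```cpp
#include <cstddef>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#endif

// Number of whole pages needed to hold cbBytes, rounding up.
std::size_t PagesNeeded(std::size_t cbBytes, std::size_t cbPage) {
   return (cbBytes + cbPage - 1) / cbPage;
}

#ifdef _WIN32
// Tries to allocate enough RAM pages to hold cbBytes. On success, aRAMPages
// holds one page frame number per page; the values are opaque to the caller.
bool AllocateRamBlock(std::size_t cbBytes, std::vector<ULONG_PTR>& aRAMPages) {
   SYSTEM_INFO si;
   GetSystemInfo(&si);
   const ULONG_PTR ulRequested = PagesNeeded(cbBytes, si.dwPageSize);
   if (ulRequested == 0)
      return false;

   aRAMPages.assign(ulRequested, 0);
   ULONG_PTR ulRAMPages = ulRequested;   // in: pages wanted, out: pages allocated
   if (!AllocateUserPhysicalPages(GetCurrentProcess(), &ulRAMPages,
                                  &aRAMPages[0])) {
      aRAMPages.clear();
      return false;
   }
   if (ulRAMPages < ulRequested) {
      // Partial allocation: this sketch gives the pages back and reports
      // failure rather than working with a smaller block.
      FreeUserPhysicalPages(GetCurrentProcess(), &ulRAMPages, &aRAMPages[0]);
      aRAMPages.clear();
      return false;
   }
   return true;
}
#endif
```

      An application that can work with a smaller block could instead keep the pages it received and shrink the array to the count the function reports.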
      Only the owning process can use the allocated RAM pages; AWE does not allow the RAM pages to be mapped into another process's address space. Therefore, you cannot share RAM blocks between processes.
      Of course, physical RAM is a very precious resource and an application can only allocate whatever RAM has not already been dedicated. You should use AWE sparingly or your process and other processes will excessively page storage to and from disk, severely hurting overall performance. In addition, less available RAM adversely affects the system's ability to create new processes, threads, and other resources. An app can use the GlobalMemoryStatusEx function to monitor physical memory use.
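      One way to act on this advice is to check available physical memory before asking for a RAM block. The following is a minimal sketch with a made-up policy: LeavesHeadroom and CanAffordRamBlock are hypothetical names, and the idea of keeping a fixed reserve of free RAM is an assumption, not something the column prescribes. Only the GlobalMemoryStatusEx call and the MEMORYSTATUSEX fields come from the Win32 API.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical policy: allow a request only if, after it is granted, at
// least cbReserve bytes of physical RAM would still be available.
bool LeavesHeadroom(unsigned long long cbAvailPhys,
                    unsigned long long cbWanted,
                    unsigned long long cbReserve) {
   return cbWanted <= cbAvailPhys && cbAvailPhys - cbWanted >= cbReserve;
}

#ifdef _WIN32
// Queries the system's current physical memory status and applies the
// policy above. Returns false if the query itself fails.
bool CanAffordRamBlock(unsigned long long cbWanted,
                       unsigned long long cbReserve) {
   MEMORYSTATUSEX ms;
   ms.dwLength = sizeof(ms);   // must be set before calling the function
   if (!GlobalMemoryStatusEx(&ms))
      return false;
   return LeavesHeadroom(ms.ullAvailPhys, cbWanted, cbReserve);
}
#endif
```

      The answer is only a snapshot; other processes can consume RAM between the check and the allocation, so the allocation call can still fail or return fewer pages.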
      To help protect the allocation of RAM, the AllocateUserPhysicalPages function requires the caller to have the Lock Pages in Memory user right granted and enabled or the function fails. By default, this right isn't assigned to any user or group. The right is given to the Local System account, which is typically used for services. If you want to run an interactive application that calls AllocateUserPhysicalPages, an administrator must grant you this right before you log on and run the application. The sidebar "Setting User Rights" explains how to turn this privilege on in Windows 2000.
      Now that I've created the address window and allocated a RAM block, I assign the block to the window by calling MapUserPhysicalPages:
BOOL MapUserPhysicalPages(
   PVOID pvAddressWindow,
   ULONG_PTR ulRAMPages,
   PULONG_PTR aRAMPages);
The first parameter, pvAddressWindow, indicates the virtual address of the address window. The next two parameters, ulRAMPages and aRAMPages, indicate how many and which pages of RAM to make visible in this address window. If the window is smaller than the number of pages you're attempting to map, the function fails. The function is designed to execute extremely fast. Typically, MapUserPhysicalPages is able to map the RAM block in just a few microseconds.
      Note that you can also call MapUserPhysicalPages to unassign the current RAM block by passing NULL for the aRAMPages parameter:
// Unassign the RAM block from the address window
MapUserPhysicalPages(pvWindow, ulRAMPages, NULL);
Once the RAM block has been assigned to the address window, you can easily access the RAM storage simply by referencing virtual addresses relative to the address window's base address (pvWindow in my example code).
      When you no longer need the RAM block, you should free it by calling FreeUserPhysicalPages:
BOOL FreeUserPhysicalPages(
   HANDLE hProcess,
   PULONG_PTR pulRAMPages,
   PULONG_PTR aRAMPages);
The first parameter, hProcess, indicates which process owns the RAM pages you're attempting to free. The next two parameters indicate how many pages and the page frame numbers of those pages that are to be freed. If this RAM block is currently mapped to the address window, it is unmapped and then freed.
      Finally, to completely clean up, I free the address window by calling VirtualFree, passing the base virtual address of the window, 0 for the region's size, and MEM_RELEASE.
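      The steps walked through above can be put together roughly as follows. This is a sketch written for illustration, not the Figure 1 listing: WriteTextToWindow and AweRoundTrip are hypothetical names, the code assumes the Lock Pages in Memory right has already been granted and enabled for the caller, and it treats a partial allocation as a failure. The Win32-specific part is wrapped in #ifdef _WIN32 so the helper can be compiled on its own.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#endif

// Once a RAM block is mapped, the window is ordinary read/write memory
// starting at its base address. Copies pszText (with its terminator) there.
bool WriteTextToWindow(void* pvWindow, std::size_t cbWindow,
                       const char* pszText) {
   const std::size_t cb = std::strlen(pszText) + 1;
   if (cb > cbWindow)
      return false;
   std::memcpy(pvWindow, pszText, cb);
   return true;
}

#ifdef _WIN32
// Reserves a window, allocates a RAM block, maps it, uses it, and then
// undoes each step in reverse order.
bool AweRoundTrip() {
   SYSTEM_INFO si;
   GetSystemInfo(&si);
   const SIZE_T cbWindow = 1024 * 1024;                  // 1MB address window
   const ULONG_PTR ulWanted = cbWindow / si.dwPageSize;  // pages that fill it

   // 1. Reserve the address window.
   PVOID pvWindow = VirtualAlloc(NULL, cbWindow,
      MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);
   if (pvWindow == NULL)
      return false;

   // 2. Allocate the RAM block.
   std::vector<ULONG_PTR> aRAMPages(ulWanted);
   ULONG_PTR ulRAMPages = ulWanted;
   bool fOk = false;
   if (AllocateUserPhysicalPages(GetCurrentProcess(), &ulRAMPages,
                                 &aRAMPages[0])) {
      // 3. Assign the block to the window (only if every page was granted).
      if (ulRAMPages == ulWanted &&
          MapUserPhysicalPages(pvWindow, ulRAMPages, &aRAMPages[0])) {
         // 4. Access the storage through the window's virtual addresses.
         fOk = WriteTextToWindow(pvWindow, cbWindow, "Text in Storage 0");
         // 5. Unassign the block from the window.
         MapUserPhysicalPages(pvWindow, ulRAMPages, NULL);
      }
      // 6. Free the RAM block.
      FreeUserPhysicalPages(GetCurrentProcess(), &ulRAMPages, &aRAMPages[0]);
   }
   // 7. Release the address window.
   VirtualFree(pvWindow, 0, MEM_RELEASE);
   return fOk;
}
#endif
```

      Step 5 is optional here, since freeing a mapped RAM block unmaps it first, but spelling it out keeps the sketch symmetrical.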
      My simple example creates a single address window and a single RAM block. This allows my application to access RAM that will not be swapped to or from disk. However, an application can create several address windows and can allocate several RAM blocks. These RAM blocks can be assigned to any of the address windows, but the system does not allow a single RAM block to appear in two address windows simultaneously.
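      The rule that a RAM block can appear in only one address window at a time can be modeled without any Win32 calls. The following toy model is only bookkeeping for illustration: WindowMap and kNoStorage are hypothetical names, RAM blocks are identified by a plain index, and nothing here touches real memory or reflects how the system enforces the rule internally.

```cpp
#include <cstddef>
#include <vector>

// Marks an address window that currently shows no RAM block.
const int kNoStorage = -1;

// Toy model: remembers which RAM block (by index) each address window
// shows and refuses to show one block in two windows at once.
class WindowMap {
public:
   explicit WindowMap(std::size_t cWindows)
      : m_blockInWindow(cWindows, kNoStorage) {}

   // Assigns block nBlock to window nWindow, replacing whatever the window
   // showed before. Returns false, changing nothing, if the block is
   // already visible in a different window.
   bool Map(std::size_t nWindow, int nBlock) {
      for (std::size_t i = 0; i < m_blockInWindow.size(); ++i) {
         if (i != nWindow && nBlock != kNoStorage &&
             m_blockInWindow[i] == nBlock)
            return false;
      }
      m_blockInWindow[nWindow] = nBlock;
      return true;
   }

   // Leaves the window showing no storage.
   void Unmap(std::size_t nWindow) { m_blockInWindow[nWindow] = kNoStorage; }

   // Returns the block shown in the window, or kNoStorage.
   int BlockIn(std::size_t nWindow) const { return m_blockInWindow[nWindow]; }

private:
   std::vector<int> m_blockInWindow;
};
```

      With two windows and two blocks, mapping block 0 to window 0 and block 1 to window 1 succeeds, while a later attempt to map block 0 to window 1 is refused until window 0 is unmapped.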
      64-bit Windows 2000 fully supports AWE. Porting a 32-bit application that uses AWE is easy and straightforward. AWE is less useful for a 64-bit application since a process's address space is so large, but it's still useful because it allows the application to allocate physical RAM that is not swapped to or from disk.

The AWETest Sample Application
      The AWETest application (see Figure 2) demonstrates how to create multiple address windows and assign different storage blocks to these windows. When you start the program, it internally creates two address window regions and allocates two RAM blocks.
      Initially, the first RAM block is populated with the string "Text in Storage 0", and the second RAM block is populated with the string "Text in Storage 1". Then, the first RAM block is assigned to the first address window and the second RAM block is assigned to the second address window. The application's window reflects this (see Figure 3). Using this window, you can perform some experiments. First, you assign RAM blocks to address windows using each address window's combobox. The combobox also offers a No Storage option that unmaps any storage from the address window. Second, editing the text updates the RAM block currently selected in the address window.

       Figure 3: The AWETest Interface

      If you attempt to assign the same RAM block to the two address windows simultaneously, the message box shown in Figure 4 appears, since AWE doesn't support this.

      Figure 4: Warning!

      The source code for this sample application is clear-cut. To make working with AWE easier, I created three C++ classes contained in the AddrWindow.h file. The first class, CSystemInfo, is a very simple wrapper around the GetSystemInfo function. The other two classes each create an instance of the CSystemInfo class.
      The second C++ class, CAddrWindow, encapsulates an address window. Basically, the Create method reserves an address window, the Destroy method destroys the address window, the UnmapStorage method unmaps any RAM block currently assigned to the address window, and the PVOID cast operator method simply returns the virtual address of the address window.
      The third C++ class, CAddrWindowStorage, encapsulates a RAM block that may be assigned to a CAddrWindow object. The Allocate method enables the Lock Pages in Memory user right, attempts to allocate the RAM block, and then disables the user right. The Free method frees the RAM block. The HowManyPagesAllocated method returns the number of pages allocated successfully. The MapStorage and UnmapStorage methods map and unmap the RAM block to or from a CAddrWindow object.
      Using these C++ classes made implementing the sample application much easier. The app creates two CAddrWindow objects and two CAddrWindowStorage objects. The rest of the code is just a matter of calling the correct method for the proper object at the right time.

Have a question about programming in Win32? Contact Jeffrey Richter at http://www.JeffreyRichter.com

From the October 1999 issue of Microsoft Systems Journal.