Poking Around Under the Hood: A Programmer's View of Windows NT 4.0

August 1996

Matt Pietrek

Matt Pietrek is the author of Windows 95 System Programming Secrets (IDG Books, 1995). He works at NuMega Technologies Inc., and can be reached at 71774.362@compuserve.com.

This article assumes you're familiar with Win32.

Windows NT® 4.0 (also known as the Shell Update Release) has been garnering much attention for its new usability and performance improvements. But there's much more to Windows NT 4.0 than just the pretty Windows® 95 user interface. In fact, many of the features originally slated for Cairo, the next major revision of Windows NT, have made their way into Windows NT 4.0. From a programming standpoint, Windows NT 4.0 offers a slew of new APIs and COM interfaces, as well as the APIs and interfaces that originated with Windows 95. In addition, the basic architecture of the Win32® subsystem (essentially USER32 and GDI32) has been changed. While this shouldn't affect the average programmer, take a look at the "How'd They Speed It Up Like That?" sidebar if you're interested in how the architecture improves performance. With all its new features and improved performance, Windows NT 4.0 should be a no-brainer upgrade. I believe the existing Windows NT user base will move to this new version relatively quickly, just as the migration from Windows NT 3.1 to 3.5 didn't take long at all.

In this article, I'll give a programmer's perspective of what's new and exciting in Windows NT 4.0 relative to Windows NT 3.51 and Windows 95. I'll describe some new architectural additions to Windows NT 4.0. I'll also describe new functionality in the core system DLLs (such as KERNEL32, USER32, OLE32, and so on) and several new DLLs. I'll finish up by talking about my pet topic, getting at system information, and some cool new Win32 SDK tools. There's so much that's new in Windows NT 4.0 that I can't hope to cover every subject in detail. Rather, I'll attempt to give the big picture view and leave the details to other articles and the Win32 documentation.

This article was written based on information from the beta 2 version of the Windows NT 4.0 SDK, so some things may have changed by the time you read this.

Programming For the New Shell

The good news about the new Windows NT 4.0 shell and user interface (EXPLORER.EXE) is that they're not a radical departure from Windows 95. Judging from the most recent help files, all of the Windows 95 OLE interfaces for the shell are now supported in Windows NT 4.0. Thus, if you've written code specific to the Windows 95 shell (such as shell extensions), this code should work unmodified in Windows NT 4.0. Windows NT 4.0 supports these interfaces from SHLOBJ.H: IContextMenu, ICopyHook, IEnumIDList, IExtractIcon, IFileViewer, IFileViewerSite, IShellExtInit, IShellFolder, IShellLink, and IShellPropSheetExt. In addition to the standard shell interfaces, Windows NT 4.0 also includes the Windows 95 Briefcase interfaces from RECONCIL.H, including INotifyReplica, IReconcilableObject, and IReconcileInitiator. Later on, I'll touch on the namespace extensions. For now, the important thing to know is that you can write common shell code for both Windows 95 and Windows NT 4.0.

Of course, if you've never programmed for the Windows 95 user interface or written shell extensions, there's much to learn. To get you up to speed, I'd suggest reading Jeff Prosise's March 1995 MSJ article, "Integrate Your Applications with the Windows 95 User Interface Using Shell Extensions."

Any similarity between the Windows 95 and Windows NT 4.0 user interfaces shouldn't be too surprising. An examination of the Windows 95 and Windows NT 4.0 EXPLORER.EXE files shows that they use the same basic set of Win32 functions, have very similar resources, and therefore most likely use the same code base. Obviously, the Windows NT Explorer will use a few more functions because of the security and administrative needs of Windows NT. Also, while the Windows 95 Explorer uses the ANSI Win32 functions, the Windows NT 4.0 Explorer uses the equivalent Unicode versions.

Additions to the Core Architecture

Windows NT 4.0 has numerous additions and refinements to its basic architecture. If you consider only those elements that programmers can use directly, the list is small but significant.

Perhaps the most publicized architectural feature in Windows NT 4.0 is Distributed COM (DCOM). DCOM extends COM so that COM objects and their clients can reside on different machines on a network. Thus, a server machine can serve up COM objects to clients running on other machines. The beauty of DCOM is that you have to do very little to support it. Let's say you already have your COM/OLE setup working across processes with proxy and stub interfaces. There's very little difference between communicating with an object in another process on the same machine and communicating with an object on another machine entirely. The proxies and stubs make these differences transparent to application code. For more in-depth information on DCOM, see Don Box's May 1996 MSJ article, "Introducing Distributed COM and the New OLE Features in Windows NT 4.0."

Another big addition to Windows NT 4.0 is DirectX™ support: DirectDraw™, DirectSound™, and DirectPlay™. If you go to the \WINNT\SYSTEM32 directory in Windows NT 4.0, you'll see DDRAW.DLL, DSOUND.DLL, and DPLAY.DLL. Their inclusion is interesting because DirectX allows programs to access hardware devices directly. Many programmers, myself included, consider accessing the hardware directly verboten because of stability concerns. The fact that Windows NT 4.0 includes DirectX support is ample evidence that Microsoft sees Windows NT 4.0 as much more of a mass market operating system than prior versions of Windows NT.

Fibers are a new architectural addition on the KERNEL32 side of things. To be completely accurate, fibers were introduced with the Windows NT 3.51 service pack 3, but Windows NT 4.0 is the first major release that includes them. Fibers are lightweight threads that need to be scheduled manually. I'll talk more about them later when I go through the new KERNEL32 functions.

If you've done multithreaded programming with Win32 before, you're probably familiar with the four basic synchronization methods: critical sections, events, semaphores, and mutexes. Windows NT 4.0 adds a fifth, the waitable timer, a synchronization object whose handle you can pass to functions such as WaitForMultipleObjects. After the specified time elapses, the timer object is set to the signaled state and an optional callback routine is invoked.

The final architectural addition I'll mention here is something that you don't access programmatically. Rather, it's new functionality built into the Windows NT 4.0 loader. In WINNT.H, three new bits can be set in a Win32 EXE or DLL image (you only care about these defines if you are writing a linker or file browser):

#define IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP  0x0400
#define IMAGE_FILE_NET_RUN_FROM_SWAP        0x0800
#define IMAGE_FILE_UP_SYSTEM_ONLY           0x4000

How do you set these bits? Linkers will be updated with new options for them. For Microsoft® LINK, the options will be /swaprun:cd, /swaprun:net, and /driver:up, respectively.

If an EXE or DLL has the IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP bit set, the loader copies the disk image to the swap file and builds the in-memory representation out of the swap file when it loads the module. Why would you want this? One very good reason is for programs such as uninstallers that need to delete their own EXE files. This was difficult to do in the past because the operating system would put a lock on any running executable file. IMAGE_FILE_NET_RUN_FROM_SWAP does the same thing, but it's for situations where the executable is on a network rather than a disk on the local machine. The remaining new flag, IMAGE_FILE_UP_SYSTEM_ONLY, tells the loader that this executable should be run only on a single-processor system.
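For anyone writing the kind of file browser mentioned earlier, testing these bits comes down to masking the Characteristics word from the image's file header. Here's a minimal sketch in portable C rather than code from the SDK. The three constants are copied from the WINNT.H excerpt; the helper names (runs_from_swap and is_uniprocessor_only) are hypothetical, and reading the Characteristics word out of an actual file is left out.

```c
#include <stdint.h>

/* Values copied from the WINNT.H excerpt in the article. */
#define IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP  0x0400
#define IMAGE_FILE_NET_RUN_FROM_SWAP        0x0800
#define IMAGE_FILE_UP_SYSTEM_ONLY           0x4000

/* Hypothetical helper: nonzero if either run-from-swap bit is set. */
int runs_from_swap(uint16_t characteristics)
{
    return (characteristics & (IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP |
                               IMAGE_FILE_NET_RUN_FROM_SWAP)) != 0;
}

/* Hypothetical helper: nonzero if the image is marked uniprocessor-only. */
int is_uniprocessor_only(uint16_t characteristics)
{
    return (characteristics & IMAGE_FILE_UP_SYSTEM_ONLY) != 0;
}
```

Both helpers ignore every other bit in the word, so they can be applied to the raw value as it appears in the header.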

Into the Trenches

Having described in general terms what's new with Windows NT 4.0, let's now drop down a level and look at some new DLLs and some new functions in existing DLLs.

For the most part, Windows NT 4.0 implements the Windows 95 functions that didn't make it into Windows NT 3.51 due to time constraints. In the interest of keeping this article focused, I won't dwell on these functions. But keep in mind that just because Windows NT now has the same major version number as Windows 95 doesn't mean the Windows NT 4.0 API is identical to the Windows 95 API. For example, there are still some groups of functions, such as the Image Color Matching and TOOLHELP32 routines, that are in Windows 95 but not in Windows NT 4.0 (at least as of beta 2).

Many of the new APIs are related to data integrity and security. For instance, there's the new Cryptography API, new security-related functions in OLE, and even a DLL to help you verify the authenticity of files. See Mary Kirtland's article, "Safe Web Surfing With the Internet Component Download Service," in the July 1996 MSJ for more about code signing and digital signatures. This emphasis on security is mostly due to the Internet.

One issue with portability of these new functions between Windows NT and Windows 95 is a #define in the system header files. All of the new goodies I'll describe below are only defined if the preprocessor symbol _WIN32_WINNT is greater than or equal to 0x0400. In the header files, information that is guarded by

  #if _WIN32_WINNT >= 0x0400

is implemented only in Windows NT version 4.0 and later, not in Windows 95. This allows you to do compile-time checking for platform differences.

The value of _WIN32_WINNT is set in win32.mak depending on the platform you chose to target. When building for Windows NT 3.51-Japanese version, _WIN32_WINNT is defined to be 0x0351. When building for Windows NT 4.0, _WIN32_WINNT is defined to be 0x0400. If you are building an application to run on Windows 95 and you want compile-time notification of compatibility issues that you will need to code around, do not define _WIN32_WINNT.

If you do not include win32.mak in your makefile, you will need to define _WIN32_WINNT to get some of the new Windows NT 4.0-specific material from the header files.
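As a sketch of that compile-time check, the fragment below defines _WIN32_WINNT by hand, as you would when not using win32.mak, and picks a code path with the same #if test the headers use. It's an illustration rather than SDK code: the function name is hypothetical, and a real program would define the symbol before including WINDOWS.H.

```c
/* Assumption: targeting Windows NT 4.0 without win32.mak, so the symbol
   is defined by hand (normally before including WINDOWS.H). */
#ifndef _WIN32_WINNT
#define _WIN32_WINNT 0x0400
#endif

/* Hypothetical helper: reports which code path was compiled in.
   Returns 1 when the NT 4.0-only declarations would be visible. */
int has_nt4_apis(void)
{
#if _WIN32_WINNT >= 0x0400
    return 1;   /* safe to reference NT 4.0-only functions here */
#else
    return 0;   /* stick to the functions common to older targets */
#endif
}
```

Because the decision is made by the preprocessor, a call to an NT 4.0-only function placed in the wrong branch shows up as a compile error rather than a failure at run time.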

New KERNEL32 Features

Memory management is the bedrock of any operating system. In this category, KERNEL32 has two new functions: VirtualAllocEx and VirtualFreeEx. As with the other virtual memory management functions that carry it, the "Ex" suffix means that these functions can operate on processes other than your own.

Why allocate memory in another process? When combined with the SetThreadContext and WriteProcessMemory functions, you can force another process to do your bidding. For example, you can allocate some memory in another process, write whatever code and data you want into it, then change the instruction pointer of a thread in the other process to point at your new code. If Nashville also has these VirtualXXXEx functions, there will be a relatively easy way to do things such as injecting a DLL into another process in a portable manner. You might be thinking that a function like VirtualAllocEx would compromise the security of a process, but that isn't the case. A system with properly set up access rights won't allow a rogue process to get a handle to other processes in the first place.

Windows NT 4.0 has several new extended file system routines, each in both ANSI and Unicode versions. The CopyFileEx function is a superset of CopyFile, and provides a way of getting called back as the file copy occurs. In the callback function, you're passed information such as the number of bytes copied, the total number of bytes to be copied, and file handles for the source and destination files.

The FindFirstFileEx function provides much more flexibility when searching for a file or directory. An info level parameter gives you control over how much information the function returns when it finds a matching file. A search op parameter lets you limit the search to just directory or device names. A related function, GetFileAttributesEx, reports the attributes on a specified file while giving you control over how much it reports.

The GetDiskFreeSpaceEx function deviates a fair amount from its predecessor. Instead of returning information in units of sectors, clusters, and bytes per sector, it just returns the total and free sizes of the disk in bytes. On the other hand, it returns this information in 64-bit unsigned integers, which can be a problem for some of us bit-challenged programmers with only 32 bits to play with. The function also differentiates between free space on disk and the space the caller would be allowed to use; a program wouldn't be allowed to use all of the remaining disk space if the system administrator set up disk usage quotas.
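For compilers without a 64-bit integer type, one way to cope is to treat each 64-bit result as two 32-bit halves, which is how the ULARGE_INTEGER type exposes it through its LowPart and HighPart members. The sketch below is portable C rather than a call to the API itself. The struct and helper names are hypothetical stand-ins, and the helper assumes the size still fits in 32 bits once it's expressed in megabytes.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two halves of a ULARGE_INTEGER. */
typedef struct {
    uint32_t low_part;
    uint32_t high_part;
} size64_halves;

/* Convert a 64-bit byte count to whole megabytes using only 32-bit math.
   Dividing by 2^20 shifts the value right by 20 bits, so the low half
   contributes its top 12 bits and the high half moves up by 12 bits.
   Any high-half bits above bit 19 would overflow and are discarded. */
uint32_t bytes_to_megabytes(size64_halves bytes)
{
    return (bytes.high_part << 12) | (bytes.low_part >> 20);
}
```

A high half of 1 with a low half of 0 is exactly 4GB, so the helper reports 4096MB for it.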

Fiber Makes Your Code Go

Turning now to execution-related functions, the big addition to Windows NT 4.0 is fibers. While I could attempt to describe what a fiber is myself, the Windows NT 3.51 service pack documentation does such a good job that I'll defer to it here:

"A fiber is a lightweight thread that an application must manually schedule. Fibers run in the context of the threads that schedule them. Each thread can schedule multiple fibers. In general, fibers do not provide advantages over a well-designed multithreaded application. However, using fibers can make it easier to port applications that were designed to schedule their own threads.

"From a system standpoint, a fiber assumes the identity of the thread that created it. For example, if a fiber accesses thread local storage (TLS), it is accessing the thread local storage of the thread that created it. In addition, if a fiber calls the ExitThread function, the thread that created it exits. However, a fiber does not have all the same state information associated with it as that associated with a thread. The only state information maintained for a fiber is its stack, a subset of its registers, and the fiber data provided during fiber creation. The saved registers are the set of registers typically preserved across a function call."

Fibers are like threads stripped of their ability to be scheduled automatically by the thread scheduler. If you've used coroutines on other platforms, fibers should be second nature to you. A thread can create as many fibers as it wants, but each fiber isn't a separate thread in and of itself. Each fiber that a thread creates will share thread resources with other fibers created by the thread. Just as every thread shares its process's resources (such as memory and file handles), fibers share their thread's resources (such as the last error value retrieved by calling GetLastError). Fibers aren't something that you want to use if you're writing brand-new code that's only intended to run on Win32 platforms. Fibers are a way of getting applications written for other platforms onto Windows NT.

If you do forge ahead and use fibers, the CreateFiber routine creates a new fiber and returns its address. This is somewhat like calling CreateThread with the CREATE_SUSPENDED flag: by itself, a new fiber isn't going anywhere. You must goad it into action by calling SwitchToFiber, which switches from the current fiber to the fiber whose address you specify.

Making things a little more interesting, only a fiber can switch to another fiber. How do you become a fiber if you're a thread? This is where the ConvertThreadToFiber function comes in handy. Typically, you'll create whatever fibers you need, then convert the initial thread to a fiber. When you're all finished with your fibers, you need to clean up their memory by calling DeleteFiber.

How does this tie into the overall multithreading in Windows NT? The key point to remember is that fibers share a thread's resources, including its CPU timeslices. Thus, a thread that owns fibers should be scheduled just like any other thread. When a thread that has created fibers is executing, only one of the fibers within the thread will execute. The fiber will execute until it switches to another fiber or the thread scheduler ends the owning thread's timeslice.

Other Thread Improvements

Aside from fibers, there are other changes in store for threads. If you paid attention to early architectural discussions about Windows NT, you might recall something called priority boosting. That is, when the scheduler takes a thread out of a waiting state and lets it run, it juices up the thread's priority just a bit to make the system more responsive. In Windows NT 4.0, the SetProcessPriorityBoost and SetThreadPriorityBoost functions let you disable this priority boosting on a single thread or all threads in a process. The corresponding functions, GetProcessPriorityBoost and GetThreadPriorityBoost, return the current boost setting.

The big news on the thread synchronization front is waitable timers, which have handles and names, and act like other synchronization objects (such as events). You can pass a waitable timer handle to functions like WaitForMultipleObjects. To get a handle to a waitable timer, you can either create a timer via CreateWaitableTimer or you can get the handle of an existing timer by calling OpenWaitableTimer. To start the timer, you call SetWaitableTimer with the handle. SetWaitableTimer takes parameters that specify the period of the timer, whether it will be triggered just once or repeatedly, and an optional pointer to a callback function that's invoked when the timer becomes signaled. To deactivate a timer, use the CancelWaitableTimer function. When you're all done with the timer, you should pass the timer handle to CloseHandle as you do with other handle-based objects.

Have you ever been in a multithreading situation involving critical sections and wished that you could try to acquire a critical section but not block if some other thread owned it? Windows NT 4.0 has just the answer for that. A new function, TryEnterCriticalSection, acts very much like EnterCriticalSection. The difference is that it will return FALSE if some other thread already owns the critical section. This lets you do something else while some other thread owns the critical section.

The final two synchronization functions are relatively obscure. InterlockedCompareExchange and InterlockedExchangeAdd have been added to the other Interlocked Win32 functions, which ensure safe access to memory variables on multiprocessor machines. What's significant is that these new functions are implemented through instructions (CMPXCHG and XADD) that first appeared on the 486. The somewhat startling conclusion here is that the Intel version of Windows NT 4.0 requires at least a 486 to run.
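To make the semantics of the two new functions concrete, here's a minimal sketch that mimics them with standard C11 atomics instead of the Win32 calls themselves. The function names and the use of long operands are assumptions made for illustration. The sketch only shows that a compare-exchange stores the new value when the destination matches the comparand, and that both operations hand back the value the destination held beforehand.

```c
#include <stdatomic.h>

/* Sketch of compare-exchange: if *dest equals comparand, store exchange.
   Either way, return the value *dest held before the call. */
long sketch_compare_exchange(atomic_long *dest, long exchange, long comparand)
{
    long expected = comparand;
    /* On failure, expected is overwritten with the current value of *dest.
       On success it still equals comparand, which was the prior value. */
    atomic_compare_exchange_strong(dest, &expected, exchange);
    return expected;
}

/* Sketch of exchange-add: add value to *dest and return the prior value. */
long sketch_exchange_add(atomic_long *dest, long value)
{
    return atomic_fetch_add(dest, value);
}
```

A caller can tell whether the compare-exchange took effect by checking whether the returned value equals the comparand it passed in.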

There are a few other new KERNEL32 functions worth mentioning. GetCurrentHwProfile is for the limited Plug-and-Play support in Windows NT 4.0. ObjectDeleteAuditAlarm and DuplicateTokenEx are new security-related functions. In the area of digital signatures and Internet security, there are three new functions: WinSubmitCertificate, WinLoadTrustProvider, and WinVerifyTrust.

New USER32 Features

Compared to the frenetic additions to KERNEL32, USER32 is relatively sedate in its new functions. Other than adding in support for Windows 95 functions like MessageBoxIndirect, there are only a handful of truly new functions in Windows NT 4.0's USER32.

MsgWaitForMultipleObjectsEx (as its name implies) extends the capability of MsgWaitForMultipleObjects, which is a form of WaitForMultipleObjects that also returns if a window message is received. The basic MsgWaitForMultipleObjects function has one parameter that specifies whether just one or all of the waiting handles have to be signaled before it returns. In MsgWaitForMultipleObjectsEx, this single-or-all flag has been replaced with a bit flags parameter that encompasses the old functionality and also lets the function return if an asynchronous procedure call (APC) has been queued for the waiting thread.

Finally, the Windows NT USER32 introduces two new resource types. The RT_ANICURSOR and RT_ANIICON resources (animated cursors and animated icons) will be joining other illustrious resources such as menus, bitmaps, and string tables in an RC file near you. The binary format for a resource file is the same as for Windows 95.

New GDI32 Features

Like USER32, there's not much that's terribly new in the Windows NT 4.0 GDI32 component. In fact, comparing the exports of the Windows NT 3.51 GDI32.DLL to the Windows NT 4.0 beta 2 version shows that some DCI functions, such as GdiDciCreateOffscreenSurface, have been removed. DCI, you might recall, was an early standard for giving applications direct access to the video display hardware for increased performance. DCI has been supplanted these days by the DirectDraw API.

There are eight new OpenGL functions in Windows NT 4.0: wglCreateLayerContext, wglDescribeLayerPlane, wglSetLayerPaletteEntries, wglGetLayerPaletteEntries, wglRealizeLayerPalette, wglSwapLayerBuffers, wglCopyContext, and GetEnhMetaFilePixelFormat. Most of these functions deal with layer planes, the overlay and underlay planes that some video boards provide above and below the main color plane. In Windows NT 4.0, these new OpenGL functions let you create rendering contexts for those layers, manage their palettes, and swap their buffers.

Tales from the Crypt

The big news in ADVAPI32.DLL is the addition of the Cryptography (or Crypto) API. Once Windows NT made its way into mission-critical applications at places such as financial institutions, it needed to acquire additional security mechanisms. A large part of this comes in the form of the cryptographic functions.

The Crypto APIs were also introduced because of the Internet. On the Internet, people access data and programs that may have been tampered with. Cryptography is a way of ensuring that you get what you want. One form of cryptography, public key encryption, is especially well suited to the Internet, because one of the two keys in a key pair can be made public with no loss in security.

One of the core concepts of the Crypto API is the Cryptographic Service Provider (CSP), a plug-in module (a DLL) that provides functions for encryption and decryption. Think of a CSP as being like a printer driver; a CSP must export a standard set of functions that is called by the operating system. How it implements those functions is up to the individual CSP.

Each CSP is responsible for performing whatever verification is needed to ensure that the user is allowed to use its facilities. This verification can range from nothing to a scan of the corneas in your eyeballs (the documentation refers to this as "biometric"). The point is, organizations using the Crypto APIs can provide their own CSP with whatever level of security they require. Microsoft provides a default CSP called RSABASE.DLL so that everybody will have at least a basic level of cryptographic functionality. RSA is a widely used public-key cipher (named after its developers, Ron Rivest, Adi Shamir, and Leonard Adleman). Not surprisingly, the exported functions in RSABASE.DLL match up nearly identically with the Win32 Crypto APIs.

Let's take a quick tour of the Crypto API. As I mentioned earlier, the cryptography functions (shown in Figure 1) are implemented in ADVAPI32.DLL. However, unlike KERNEL32.DLL or USER32.DLL, ADVAPI32.DLL has never really had its own distinct header file that encompasses all of its functions. Thus, the cryptography functions come from a new header file, WINCRYPT.H. To link to them, you'll use ADVAPI32.LIB. Also, a nice feature of the Crypto API is that the functions are consistently named: they all start with "Crypt."

The context functions are the first functions used when working with the Crypto API. To start with, you'll need an HCRYPTPROV (handle to cryptography provider), which you acquire through CryptAcquireContext. When acquiring an HCRYPTPROV, you can either request a specific CSP or use a default CSP. Applications can set the default CSP with the CryptSetProvider function. After using a CSP, clean up by calling CryptReleaseContext.

After hooking up to a CSP, the next step is to generate the keys used to encrypt and decrypt the data. CryptGenKey creates random keys, and CryptDeriveKey creates keys based on data that you supply. When creating a key, one of the parameters that must be specified is the encryption algorithm. The algorithms available depend on the CSP in service at the moment. It's important to note that these functions don't let you access the actual key data. Rather, both functions return handles to the keys, known as HCRYPTKEYs. After generating your HCRYPTKEYs, you can get down to the business of making your data indecipherable (perhaps hiding those Beavis and Butthead WAV files from your boss). The CryptEncrypt function takes an HCRYPTKEY and a buffer to encrypt. The corresponding function, CryptDecrypt, reverses the process. Pretty straightforward stuff.

Up to this point, I haven't mentioned anything about the actual key data itself. At some point it becomes necessary to get at the key data (known as a blob). For example, you may be working with public key encryption and need to retrieve the public key data to make it available to others. (It would be kind of pointless if you didn't!) The CryptExportKey function takes an HCRYPTKEY as input and spits out a key blob. The program that wants to decrypt something passes the blob to CryptImportKey, which adds the key to its CSP and returns an HCRYPTKEY. These importing and exporting functions, along with a handful of other functions, make up the key exchange functions. Key exchanging is the act of importing and exporting keys between the secure environment of the CSP and your application's address space.

The final set of functions in the Crypto API are the hashing and digital signature functions. Unlike encryption, hashing data involves a loss of information. The result of hashing any amount of data is a hash value that's a fixed length (typically 16 or 20 bytes). Given just a hash value, there's no way to get back to the original data. Hashing data is the basis of digital signatures. The hashing functions in the Crypto API are where things like passwords come into play.

Digital signatures let you provide assurance that your data wasn't altered before it reached the recipient. To sign something, you hash your data (for instance, an email message), and then attach the hash value to the data. The hash value is known as a message digest in the Crypto API. To verify that the message wasn't altered, the recipient can rehash the message and compare the result to the message digest that was sent along with the data. How does the recipient know that the original data wasn't replaced by different data and a corresponding message digest? Public key encryption is applied to the message digest before it's sent. That is, the message digest is encrypted with your private key, so the recipient, who decrypts it with your public key, can verify that the digest came from you. If the digest is known to be good, it can be trusted as the basis for rehashing the message and verifying its source.
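The rehash-and-compare step can be sketched in a few lines of portable C. This is only an illustration of the flow, not Crypto API code. The toy 32-bit FNV-1a checksum stands in for a real hash algorithm and is not cryptographically secure, the function names are hypothetical, and the step that encrypts the digest with a private key is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy 32-bit FNV-1a checksum standing in for a real hash algorithm.
   It is NOT cryptographically secure; it only illustrates the flow. */
uint32_t toy_digest(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;   /* FNV offset basis */
    size_t i;
    for (i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;         /* FNV prime */
    }
    return h;
}

/* Recipient's check: rehash the received message and compare the
   result with the digest that accompanied it. */
int message_unaltered(const unsigned char *data, size_t len,
                      uint32_t sent_digest)
{
    return toy_digest(data, len) == sent_digest;
}
```

In a real exchange the digest would come from one of the CSP's hash algorithms, and the comparison would only be trusted after the signature on the digest had been verified.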

New Shell Features

Something as heavily used and complex as a user interface is always subject to changes and extensions, and Windows NT 4.0 is no exception. Figure 2 shows the new Windows NT 4.0 shell interfaces defined in SHLOBJ.H.

Some of the new interfaces in this list are just ANSI and Unicode versions of earlier shell interfaces. When the Windows 95 shell team defined the interfaces, they apparently defined only one interface that implicitly used ANSI strings. In Windows NT 4.0, this issue is cleaned up by separating the interfaces into ANSI and Unicode versions. For backwards compatibility, the ANSI versions use the same IIDs (interface IDs) as the original interfaces.

Among the new interfaces, the two most worth checking out are IShellBrowser and IShellView. You don't implement IShellBrowser yourself; it's implemented by Explorer and represents the outermost frame of an Explorer window. (See David Campbell's article, "Extending the Windows Explorer With Name Space Extensions," in the July 1996 MSJ.) By implementing IShellView, you can create your own custom views in the Explorer. Your IShellView implementation calls out to IShellBrowser to have it perform actions such as setting up toolbars and adding menu items. By implementing an IShellView interface and a handful of other interfaces, you can create your own name spaces in Explorer and use Explorer to browse objects specific to your application.

Another new interface, IShellExecuteHook, appears to have been added to let the Start/Run dialog work with items defined in your name space. It's a safe bet that these new extensions to the shell are all part of integrating the Internet seamlessly into the Windows user interface.

A Richer RichEdit 2.0

Windows NT 4.0 has two versions of the RichEdit control. There's RICHEDIT.DLL, as in Windows 95, but there's also a RICHED20.DLL, which implements RichEdit 2.0 windows. While the only documentation currently available for the RichEdit 2.0 specification is the header files, several nifty new features are discernible.

In the new RICHEDIT.H, the presence of window messages like EM_SETUNDOLIMIT and EM_REDO implies that RichEdit 2.0 windows have undo and redo support. Another set of window messages, EM_SETLANGOPTIONS and EM_GETLANGOPTIONS, indicates that international issues like keyboard layouts and fonts will be customizable. Also worth noting is that there's a new window class name for RichEdit controls, RichEdit20A. If you want these new features in your existing RichEdit-enabled code, you'll have to go back and modify the class names where you've specified a RichEdit window.

New OLE Features

Any discussion of OLE in Windows NT 4.0 has to start with DCOM. The new COM/OLE functionality falls into four categories. First are extensions required to use objects on remote machines. Second are new functions and APIs relating to security. (Once you get machines communicating data back and forth, security becomes a much bigger issue than on a single machine.) Third, OLE now supports free threading; any object can be serviced by any thread, as opposed to the slower (but easier to program) apartment model. The last new category is property storage and property sets.

Any OLE program needs to call CoInitialize or OleInitialize to set up the OLE system before it can be used. In Windows NT 4.0, a new function, CoInitializeEx, gives the programmer more flexibility over how OLE will work. These new flags can be passed to CoInitializeEx:

typedef enum tagCOINIT
{
    COINIT_MULTITHREADED     = 0x0,   // OLE calls objects
                                      // on any thread.
    COINIT_APARTMENTTHREADED = 0x2,   // Apartment model
    COINIT_DISABLE_OLE1DDE   = 0x4,   // Don't use DDE for
                                      // Ole1 support.
    COINIT_SPEED_OVER_MEMORY = 0x8,   // Trade memory for
                                      // speed.
} COINIT;

Note the flag that specifies apartment model threading instead of free threading. One way to instantiate a new instance of a class is to get a class factory interface pointer by calling CoGetClassObject, then invoke CreateInstance on the class factory. The second parameter to CoGetClassObject, dwClsCtx, indicates whether the object needs to be in-process or can be in another process. To support DCOM, Windows NT 4.0 accepts a new flag for this parameter, CLSCTX_REMOTE_SERVER. If other CLSCTX flags are specified in this parameter, OLE will try to use the least expensive method available because COM calls over a network will almost always be slower than in-process or local server calls.
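One wrinkle in the COINIT values is worth a quick sketch. Because COINIT_MULTITHREADED is zero, free threading is really the absence of the apartment bit, so code that inspects a flags value has to test for COINIT_APARTMENTTHREADED rather than mask with COINIT_MULTITHREADED. The fragment below is portable C with the values copied from the listing; the helper name is hypothetical and isn't part of OLE.

```c
/* Flag values copied from the COINIT listing in the article. */
enum {
    COINIT_MULTITHREADED     = 0x0,
    COINIT_APARTMENTTHREADED = 0x2,
    COINIT_DISABLE_OLE1DDE   = 0x4,
    COINIT_SPEED_OVER_MEMORY = 0x8
};

/* Hypothetical helper: nonzero if the flags select the apartment model.
   Free threading is the absence of this bit, so it cannot be detected
   by masking with COINIT_MULTITHREADED (which is zero). */
int wants_apartment_model(unsigned long flags)
{
    return (flags & COINIT_APARTMENTTHREADED) != 0;
}
```

The other two flags are independent bits, so they can be combined with either threading choice using the bitwise OR operator.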

A more direct way to create a new class instance is to use CoCreateInstance. One performance problem with OLE is that you often need to obtain multiple interface pointers and this can take time, especially if the object is in another process or on another machine. To remedy this problem, Windows NT 4.0 has the CoCreateInstanceEx function, which has two advantages over its CoCreateInstance predecessor. First, it lets you fill an array with pointers to interface IIDs. CoCreateInstanceEx then calls QueryInterface on the new object for each of the IIDs passed to it. This lets you get multiple interface pointers with a single call. The other new capability in CoCreateInstanceEx is that you can specify a source location for instantiating the new object (for instance, on a specific machine on a network).

Regarding the new security features in OLE, the first new function you'll come to is CoInitializeSecurity. As its name implies, this function initializes the security layer. Its primary parameters are two Access Control Lists (ACLs). The ACLs define who is allowed to use OLE services in the process that called CoInitializeSecurity, and who is denied from using OLE services.

Beyond initialization, OLE security has both client and server object support. On the client side is a new interface, IClientSecurity. The IClientSecurity::QueryBlanket method allows the client to obtain information that authenticates the server. A regular API function, CoQueryProxyBlanket, serves the same purpose. Another interface method, IClientSecurity::SetBlanket, lets the client specify how the server object will be authenticated. This method is functionally similar to the new CoSetProxyBlanket function.

On the server side of OLE security is a new interface, IServerSecurity. The QueryBlanket method lets the server object identify the client that invoked it. This method corresponds to the new CoQueryClientBlanket API. The IServerSecurity interface has the ImpersonateClient method, which makes the server object execute with the same privilege level as the calling client. After the server object impersonates the client, it returns to its normal security parameters via the RevertToSelf method. Both of these methods have regular API equivalents, CoImpersonateClient and CoRevertToSelf.

Native Winsock 2.0 Support

In the area of networking, Windows NT 4.0 will be the first operating system with native support of the Winsock 2.0 specification. I'll be honest and confess that I've done only minimal work with sockets. I'm pretty confident that if you're using Winsock now, you're already much more aware of all the things that Winsock 2.0 offers than I could hope to be. However, I'm duty bound to at least mention at a cursory level what's new.

The primary additions to the Winsock 2.0 specification are access to protocols other than TCP/IP, overlapped I/O, and the ability to query and request specific qualities of service. In the sockets sense, quality of service refers to things such as how quickly data can be piped over a connection and how long you may have to wait for data to flow (latency).

In the new system header files, WINSOCK.H remains for backwards compatibility. A new header file, WINSOCK2.H, has been added as a superset of WINSOCK.H. In addition, you'll need to switch to the WS2_32.LIB import library to use the Winsock 2.0 functions. This import library corresponds to WS2_32.DLL, which is the system DLL that implements Winsock 2.0.

Since WS2_32.DLL provides a superset of the functions in WSOCK32.DLL, you might think that WS2_32.DLL contains just the new Winsock 2.0 functions and passes along 1.x functions to WSOCK32.DLL. Just the opposite is true; through the magic of forwarders, many of the functions in WSOCK32.DLL are really implemented in WS2_32.DLL. Forwarders are a barely documented feature of Portable Executable files. When exporting a function from a DLL, it's possible to indicate to the Win32 loader that the function's code really resides in some other DLL. Unfortunately, the linker switches for forwarding aren't described anywhere. I am investigating forwarding and may do a future MSJ column on this subject. In any event, Figure 3 shows the Winsock 1.x functions that are forwarded to other DLLs in Windows NT 4.0.

Notice that the Microsoft extensions to Winsock have been moved out into their own separate DLL. In prior Win32 implementations, the three Microsoft-specific functions were part of WSOCK32.DLL. In Windows NT 4.0, a new DLL named MSWSOCK.DLL implements these functions and WSOCK32.DLL contains forwarders to these routines.

Internet Support

There's currently a lot of interest in integrating the Internet into apps and operating systems, and Windows NT 4.0 is certainly on the cutting edge. All of the ActiveX™ functionality that was the topic of the Microsoft Internet Professional Developer's Conference in San Francisco can be found in Windows NT 4.0.

A large part of Microsoft's Internet support on the client side is in a DLL called WININET.DLL. This DLL supports the HTTP, FTP, and Gopher protocols, and makes it fairly easy to write Internet client applications without knowing anything about TCP/IP, Winsock, and so on. If you're curious, WinInet is built on the Winsock functions, which I mentioned earlier.

The WinInet functions divide up nicely into four categories: general purpose functions that apply to all protocols, HTTP, FTP, and Gopher functions (see Figure 4). For a more in-depth description, see Jeffrey Richter's article, "Microsoft's Internet Extensions for Win32" in the Spring 1996 Microsoft Interactive Developer.

Most of the WinInet functions work with or return HINTERNETs (handles to Internet). While your code sees all HINTERNETs as the same type, one HINTERNET can mean something completely different from another. This makes the WinInet functions a bit confusing at first. The first type of HINTERNET that you obtain comes when you initialize WININET.DLL by calling InternetOpen. This is the first of 13 possible subtypes of HINTERNETs. Figure 5 shows all 13 varieties of HINTERNETs. You can query the subtype of a particular handle by calling InternetQueryOption and sending it the HINTERNET with the INTERNET_OPTION_HANDLE_TYPE parameter.

After setting up WININET.DLL, the initial HINTERNET is usually passed to InternetConnect, which takes a parameter that indicates the type of connecting server (HTTP, FTP, or Gopher). InternetConnect returns another subtype of HINTERNET, which is then passed to the appropriate HTTP, FTP, or Gopher functions. An alternative to calling InternetConnect is to call InternetOpenUrl. This function parses the URL passed to it and connects to the appropriate type of server automatically.

After connecting to the desired server, you turn to the protocol-specific functions, which I'll describe momentarily. Regardless of what you connect to or how you connect to it, you'll probably want to read data from the server. The InternetReadFile function works with HINTERNETs for any of the three supported protocols. Before exiting the program, all of the HINTERNETs should be closed by calling InternetCloseHandle.

A nice feature of the WinInet functions is that they can be set up to act synchronously or asynchronously. The default is synchronous operation; when you call a WinInet function, your thread blocks until the function succeeds or fails. To get asynchronous operation, call InternetSetStatusCallback. A callback function will be installed, and you'll be kept informed of the status of your request. More importantly, the original WinInet call returns right away instead of blocking until the operation completes.

There are just a few HTTP functions in WinInet. You open an HTTP request (which returns yet another type of HINTERNET), add any HTTP style headers, then call HttpSendRequest. The InternetReadFile function retrieves the data from the request. Remember, the HTTP request will probably return raw HTML. It's up to you to parse it as appropriate for your application.

Of all the protocols supported by WinInet, FTP is the lowest level, so there are quite a few FTP functions. All of the basics of interacting with an FTP server are provided, including commands to read, write, rename, delete, and enumerate files. You can also create and delete directories and change the current working directory. In short, WinInet has just about everything you'd need to write your own version of the venerable FTP program.

Finally, there are the WinInet Gopher functions. GopherFindFirstFile and InternetFindNextFile enumerate through the available files. To retrieve a file, you open it with GopherOpenFile (which returns yet another type of HINTERNET), and pass that HINTERNET to InternetReadFile.

The Licensing API

A new set of functions in Windows NT 4.0 that hasn't garnered much attention (at least not yet) is the License Service API (LSAPI), which was developed by Microsoft in cooperation with other companies. The seven LSAPI functions provide basic software-metering capabilities. For example, let's say you wanted to sell a site license for 50 copies of your product to some other company. The LSAPI functions would let your program ensure that no more than 50 instances run at a single time. An LSAPI-enabled application differs from other applications that use licensing in that it's not tied to a single proprietary license server. Your application can work with any license server that supports LSAPI.

The functions in the LSAPI are located in LSAPI32.DLL. These are LSEnumProviders, LSFreeHandle, LSGetMessage, LSQuery, LSRelease, LSRequest, and LSUpdate. To use LSAPI functions, include LSAPI.H in your code and link to the LSAPI32.LIB import library.

LSEnumProviders lets you enumerate through all installed LSAPI servers and returns a unique string for each server. Once you've found a suitable server, you pass that string (along with a boatload of other parameters) to LSRequest. Alternatively, you can let LSRequest try to find a license server for you. In both cases, LSRequest either returns an LS_HANDLE, meaning your application can continue executing, or fails. If LSRequest fails, error codes tell you why. When your application is finished, it calls LSRelease to indicate that it's done.

Of course, a determined hacker can bypass something as simple as I've described above. Thus, LSAPI supports multiple "challenge" protocols to determine if your code has been tampered with. There's even an interesting section in the LSAPI documentation that describes steps your code should take to ensure that it's not being hacked. Not surprisingly, these steps are applicable to many scenarios (such as game passwords), not just application licensing.

IMAGEHLP

A new feature in Windows NT 4.0 that I'm particularly happy with is the addition of IMAGEHLP.DLL to the base operating system. IMAGEHLP isn't really a new DLL. Rather, it used to be buried in the Win32 SDK. However, it was an integral part of tools like REBASE and BIND, and the SDK even included the source for the Windows NT 3.51 version. In Windows NT 4.0, IMAGEHLP has been shined up and extended quite a bit. Unfortunately, newer Win32 SDKs no longer include source for IMAGEHLP.DLL.

The "image" part of IMAGEHLP comes from the fact that EXEs and DLLs are called images by the handful of people who actually work on their inner guts. The fundamental purpose of IMAGEHLP is to remove some of the complexity of working with EXEs and DLLs by providing a library of commonly used functions for low-level hackers (like yours truly).

The real reason for including IMAGEHLP.DLL in the base operating system is the new image integrity functions. Other IMAGEHLP functions let you modify images, extract useful nuggets of information from images, access debug information, and perform debugger-like tasks such as walking a call stack. Figure 6 shows the IMAGEHLP functions broken down by categories.

The image integrity functions in IMAGEHLP work with what's known as a certificate. A certificate is a relatively small blob of bytes created by processing all of the data in a file to effectively create a multiple-byte checksum. Certificates put a personal imprint on a file (as in digital signatures). Applications can use the certificate data to verify that an executable file came from the correct source and that it hasn't been modified. The image integrity functions provided by IMAGEHLP are for working with the WIN_CERTIFICATE structure, a security related structure also used by the WinSubmitCertificate function.

The important IMAGEHLP image modification functions center around basing and binding of images. Basing an image sets its preferred load address; if the loader is able to load the file at that address, it doesn't have to do any base fixups. Binding an image effectively does the work that the Win32 loader does at load time to look up the address of each imported function. Like basing an image properly, binding it causes less work for the Win32 loader at load time, so the EXE or DLL loads faster. A top-notch install program would base and bind the EXEs and DLLs it installs. If you look closely (using any file dumper), all of the images that ship with Windows NT are based and bound. There are also functions to strip off symbol table information and compute image checksums, both things that the typical programmer doesn't need to worry about.
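Why basing saves load time is easier to see in code. The following is a minimal sketch in portable C of what a base fixup pass amounts to; it is not the Win32 loader's implementation and not part of IMAGEHLP. The function name apply_base_fixups and the flat array of fixup offsets are assumptions made for illustration (a real image stores its fixups in a packed relocation section).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Images store addresses in little-endian byte order. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Hypothetical helper: patches every 32-bit address listed in fixup_rvas by
   the difference between the actual and preferred load addresses.
   Returns the number of locations that had to be patched. */
size_t apply_base_fixups(uint8_t *image, size_t image_size,
                         const uint32_t *fixup_rvas, size_t fixup_count,
                         uint32_t preferred_base, uint32_t actual_base)
{
    uint32_t delta = actual_base - preferred_base;  /* wraps modulo 2^32 */
    size_t i;

    if (delta == 0)
        return 0;  /* loaded at the preferred address: nothing to patch */

    for (i = 0; i < fixup_count; i++) {
        uint32_t rva = fixup_rvas[i];
        assert(image_size >= 4 && rva <= image_size - 4);
        store_le32(image + rva, load_le32(image + rva) + delta);
    }
    return fixup_count;
}
```

When the image lands at its preferred address, the function returns immediately without touching a single page of the image. That skipped work is what a well-chosen base address buys.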

The image access functions in IMAGEHLP primarily give access to information in files without having them loaded into memory by the Win32 loader. For example, the ImageLoad function will map the specified file into memory and return a pointer to a structure containing informational flags, how much memory the image would take if loaded by the Win32 loader, and so on. Once the file is mapped into memory by ImageLoad, functions like ImageNtHeader and ImageDirectoryEntryToData provide even more information about the image. These functions are primarily intended for use by developers of programming tools such as debuggers and file dumpers.
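To give a feel for the kind of lookup that ImageNtHeader saves you from writing, here is a minimal sketch in portable C that works on a buffer holding the raw file contents. It relies only on the documented Portable Executable layout (the MS-DOS header stores the file offset of the "PE" signature at offset 0x3C). The helper names find_nt_headers and get_preferred_base are hypothetical, and nothing here is meant to reflect how IMAGEHLP itself is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read little-endian values from a byte buffer (PE files are little-endian). */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: finds the "PE\0\0" signature in a mapped file.
   Returns 1 and stores its offset on success, 0 if this is not a PE image. */
int find_nt_headers(const uint8_t *image, size_t size, size_t *nt_offset)
{
    uint32_t e_lfanew;

    assert(image != NULL && nt_offset != NULL);
    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return 0;                              /* no MS-DOS header */
    e_lfanew = read_le32(image + 0x3C);        /* file offset of NT headers */
    if (e_lfanew > size || size - e_lfanew < 4)
        return 0;
    if (memcmp(image + e_lfanew, "PE\0\0", 4) != 0)
        return 0;
    *nt_offset = e_lfanew;
    return 1;
}

/* Hypothetical helper: reads the preferred load address (ImageBase) of a
   32-bit image. Layout: 4-byte signature, 20-byte file header, then the
   optional header, whose ImageBase field sits 28 bytes in. */
int get_preferred_base(const uint8_t *image, size_t size, uint32_t *base)
{
    size_t nt;

    assert(base != NULL);
    if (!find_nt_headers(image, size, &nt) || size - nt < 4 + 20 + 32)
        return 0;
    if (read_le16(image + nt + 24) != 0x10B)   /* PE32 optional header magic */
        return 0;
    *base = read_le32(image + nt + 24 + 28);
    return 1;
}
```

A file dumper would read the file into memory (or map it), call get_preferred_base, and print the result. With IMAGEHLP the same information comes from ImageLoad and ImageNtHeader without any hand-computed offsets.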

The IMAGEHLP symbol table functions will be meaningless to nearly everybody except debugger writers. Of course, anybody who has to deal with symbol tables will be overjoyed that there's one set of routines to access symbol names. In theory, these routines will be kept up to date as Microsoft evolves and changes their debug formats. Currently, CodeView® (also known as PDB) and COFF formats are supported.

One of the symbol-table functions that's useful to a wider audience is StackWalk. As its name implies, this routine walks the stack of the specified thread. One scenario where this could be useful is if you installed a try/catch handler within your WinMain function to catch all program faults yourself, rather than letting the operating system throw up its ugly fault dialog box.

Another function that you may find useful is UnDecorateSymbolName. All you C++ programmers have no doubt seen those horribly long decorated function names (such as "?bar@foo@@AAEXH@Z") that look like complete garbage. UnDecorateSymbolName turns those symbols into something readable by mere mortals.

Process And Thread Information

Although there were hints of it in the Windows NT 3.5 SDK, PSAPI.DLL has been relatively unknown until now. PSAPI.DLL is a convenient way to get system information such as process lists without reading in and parsing the Windows NT performance data in all of its convoluted glory. From PSAPI.DLL you can get a list of processes and list every module within the process. PSAPI.DLL also has functions for enumerating running device drivers. Finally, PSAPI.DLL has functions for learning about the working set of a process by querying demand load page faults. For more information on PSAPI.DLL, see this month's "Under the Hood" column.

New SDK Tools

With the tour of new functions, interfaces, and DLLs out of the way, I'll finish up with a quick overview of some new tools from the Win32 SDK. Some of these will no doubt end up in some future version of Visual C++®. The first thing you'll notice is that there are quite a few new programs in the BIN directory of the SDK. A good number of them are programs from the TAPI and MAPI SDKs that have finally been bundled into the Win32 SDK, so I won't mention them further.

If you're curious what exported functions a program calls, the APIMON program could be just what you're looking for. For a given run of a program, APIMON shows which functions were called, how many times each was called, and how much time was spent in each function. For instance, running CLOCK.EXE briefly showed that it called GetDC 28 times, and spent a total of 1.092 seconds in the GetDC code. APIMON can optionally show each API call and the first four parameters on the stack. It is also flexible in letting you decide which modules to monitor calls from. APIMON works on any Win32 EXE, and doesn't require any modifications to the program to monitor it.

The WPERF program is a companion to PERFMON. It displays information such as the number of page faults per second or the current thread count, but provides a smaller selection of system metrics. However, each statistic WPERF displays gets its own separate chart area, rather than showing all the charts in the same window like PERFMON does. Also, WPERF gets its data directly out of NTDLL.DLL, rather than querying the registry performance data.

The MEMSNAP program provides a convenient way to do before-and-after comparisons of process resources. Each time you run MEMSNAP, it writes out a log file with columns of statistics about each running process. For each process you get the process name, process ID, paged and nonpaged pool sizes, amount of memory reserved in the pagefile, and total committed bytes. In addition, MEMSNAP reports the number of handles and threads each process is using. By comparing MEMSNAP snapshots, you can tell if a particular process is leaking memory or other resources.

UCONVERT is a new tool for converting files from ANSI, OEM, or other code pages into Unicode. The program has a rudimentary dialog-based user interface where you specify the source file and any conversion options. UCONVERT can also accept text from the clipboard and emit its converted text back to the clipboard.

Without any documentation, it's hard to say exactly what WINOBJ is, but it's way cool, regardless. I think it's a browser for Windows NT's objects. If you've read Helen Custer's Inside Windows NT (Microsoft Press 1993), you'll recognize most of them as Windows NT Executive objects. The two exceptions are the desktop and WindowStation objects, which are Win32 objects. WINOBJ shows the objects in a hierarchy using TreeView controls (see Figure 7). For certain types of object instances (such as SymbolicLinks), you can double click to get details for the specific object.

Figure 7 WINOBJ

The last cool tool isn't even an SDK program, it's the Windows NT task manager, which is the first task manager in Windows worthy of the name. You start TASKMGR.EXE either by the Ctrl-Alt-Delete sequence or by right clicking on the system tray. TASKMGR is a three-tabbed dialog. The first tab, Applications, is the traditional "Top Level Windows only" view of the system that you've come to know and hate. The middle tab, Processes, is much more interesting (see Figure 8). It's a true process list, and it includes useful statistics such as memory usage for each process. The third tab, Performance, provides charts and statistical information on CPU and memory usage, the number of processes, threads, and handles, and other assorted information. Whenever it's running, TASKMGR puts a small CPU usage bar chart in the icon area of the tray.

Figure 8 Task Manager Process List

Summary

Calling Windows NT 4.0 the Shell Update Release just doesn't do it justice. While the new shell is nice (especially if you've been working with Windows 95), improvements that aren't immediately visible should make Windows NT 4.0 a mass-market operating system. Microsoft has addressed several areas that were problems in the past: the performance of the GUI and hardware compatibility. It's also nice to see that with Windows NT 4.0 you'll get the long awaited DCOM and Internet extensions. In addition, the Windows NT 4.0 team wasn't content to focus solely on new APIs. New functions like CopyFileEx prove that they're still tuning and tweaking even the most basic areas of the operating system. Finally, as a tool developer, I'm especially happy to see DLLs like IMAGEHLP and PSAPI become part of the base operating system. My opinion on Windows NT 4.0 is summed up by a single fact: I haven't booted any other operating system since I've installed it.

HOW'D THEY SPEED IT UP LIKE THAT?

If you're a fan of Windows NT, you've probably heard that Windows NT 4.0 is faster than previous versions (and Windows NT 3.51 was pretty well tuned). If you're really an operating system nut (like me), you've probably also heard that USER and GDI have been moved into the kernel. While the USER and GDI components work differently in Windows NT 4.0, this is an overly simplified description. Although what I'll describe here won't directly affect most programmers, it's interesting to see what's really going on.

Page 124 of Helen Custer's Inside Windows NT describes how USER and GDI were implemented in Windows NT 3.1 (and subsequently in Windows NT 3.5 and 3.51). What we think of as USER and GDI in Windows NT is really a protected subsystem. "Each protected subsystem runs in a process with a private address space. For an application to gain access to a subsystem, it must send a message. The server receives the message, validates all parameters, executes the required functions, and returns the results to the caller."

Custer then goes on to describe some performance concerns with this approach, and has this to say: "For the server to get the message and execute it, a context switch must occur; that is, the Windows NT executive must perform the following sequence:

Save the client thread's context (volatile machine state).
Select a server thread for execution and load the server thread's context.
Execute the Win32 API routine using the server's thread.
Save the server thread's context.
Reload the client thread's context and process the results of the API routine."

As you might imagine, these operations in real life can be on the order of thousands of clock cycles for a single call to the Win32 subsystem (USER and GDI). If you've ever seen the program CSRSS.EXE and wondered what it was, there's your answer: it's the EXE that the Win32 subsystem process is created from.

At this point, it's helpful to dig into some details. Note that I'm using Intel-specific terminology in the following discussion. For example, when I say Ring 0, I mean "kernel mode" as defined by Windows NT. Let's see how a KERNEL32 call such as PulseEvent works without using any client-server calls. In KERNEL32.DLL, the code for PulseEvent begins like this:

PulseEvent proc
    PUSH    00
    PUSH    DWORD PTR [ESP+08]
    CALL    DWORD PTR [NtPulseEvent]

All KERNEL32.DLL does is grab the single parameter off the stack and pass it as a parameter to an NTDLL.DLL function. In NTDLL.DLL, the code for NtPulseEvent looks like this:

NtPulseEvent proc
    MOV     EAX,0000005C
    LEA     EDX,[ESP+04]
    INT     2E
    RET     0008

All that NTDLL.DLL does is load EAX with a dispatch number (0x5C in this case) and EDX with a pointer to the parameters on the stack, then invoke an INT 2Eh. Keep in mind that everything I've described so far happened at Ring 3. On Intel 80x86 processors, any INT instruction causes the CPU to transition to Ring 0 before jumping to the address corresponding to the interrupt number in the Interrupt Descriptor Table (IDT). It's also worth noting here that in the transition from Ring 3 to Ring 0, the CPU also switches the stack to whatever Ring 0 stack is specified in the TSS.

Now, let's take a look at Ring 0, or kernel mode. At Ring 0, the INT 2Eh handler is called _KiSystemService, which is located in NTOSKRNL.EXE. _KiSystemService takes the dispatch number (placed in EAX by NTDLL.DLL) and uses it as an index into a dispatch table that each thread has a pointer to. Just before jumping to the designated handling code, _KiSystemService copies the parameters from the Ring 3 stack (which EDX points to) onto the Ring 0 stack. Altogether, this takes about 60 instructions to accomplish, which is significantly less than the overhead imposed by the two thread context switches that USER and GDI go through with the Win32 subsystem process.

In Windows NT 4.0, the mechanism for getting from Ring 3 to Ring 0 for NTDLL.DLL functions was extended to include USER and GDI functions. Windows NT 4.0 essentially dispenses with the original vision for a client-server architecture in exchange for increased performance.

The new method for USER and GDI to transition to Ring 0 is nearly identical to what NTDLL.DLL does. That is, the code sets up EAX and EDX appropriately before invoking an INT 2Eh. The only difference is that the bit value 0x00001000 is set in the dispatch code for USER and GDI functions. Put another way, dispatch codes less than 0x1000 come from Ring 3 kernel-type code like that provided by NTDLL.DLL. Dispatch codes of 0x1000 or greater are for the Win32 subsystem (USER and GDI).

Consider the following GDI32.DLL implementation of GetTextCharsetInfo in Windows NT 4.0:

GetTextCharsetInfo proc
    MOV       EAX,0000106E
    LEA       EDX,[ESP+04]
    INT       2E
    RET       000C

Inside _KiSystemService, the code checks the 0x00001000 bit in the dispatch code and uses it to select one of two dispatch tables. Dispatch codes less than 0x1000 are handled by a table that routes nearly everything to code elsewhere in NTOSKRNL.EXE. Dispatch codes of 0x1000 or greater use a table that sends nearly everything to routines in WIN32K.SYS.
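The table selection just described boils down to a one-bit test. Here is a minimal sketch in portable C, not a disassembly of _KiSystemService. The names decode_dispatch_code, DispatchTarget, and the two TABLE_ constants are hypothetical, and treating the low 12 bits as the index within the selected table is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define WIN32K_DISPATCH_BIT 0x1000u  /* the bit set for USER and GDI calls */

typedef enum {
    TABLE_NTOSKRNL = 0,  /* services implemented in NTOSKRNL.EXE */
    TABLE_WIN32K   = 1   /* services implemented in WIN32K.SYS */
} DispatchTable;

typedef struct {
    DispatchTable table;  /* which dispatch table handles the call */
    uint32_t      index;  /* assumed: low 12 bits index into that table */
} DispatchTarget;

/* Hypothetical helper: splits the value loaded into EAX before INT 2Eh
   into a table selector and an index within that table. */
DispatchTarget decode_dispatch_code(uint32_t code)
{
    DispatchTarget target;

    assert(code < 0x2000u);  /* this sketch only models the two tables above */
    target.table = (code & WIN32K_DISPATCH_BIT) ? TABLE_WIN32K : TABLE_NTOSKRNL;
    target.index = code & (WIN32K_DISPATCH_BIT - 1u);
    return target;
}
```

Feeding it the two dispatch codes from the listings, 0x5C (NtPulseEvent) selects the NTOSKRNL table with index 0x5C, while 0x106E (GetTextCharsetInfo) selects the WIN32K table with index 0x6E.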

What's WIN32K.SYS? In Windows NT 4.0, it's the Ring 0 device driver implementation of the Win32 subsystem. In some ways, WIN32K.SYS is the replacement for WINSRV.DLL, which was the core of the Win32 subsystem in Windows NT 3.x. A quick glance at the file sizes bears this out. The Windows NT 3.51 version of WINSRV.DLL is nearly 1.4MB, while in Windows NT 4.0 beta 2 it's shrunk down to around 160KB. In its place is WIN32K.SYS, which weighs in at over 1.2MB. The key difference is that WINSRV.DLL ran at Ring 3, while WIN32K.SYS runs at Ring 0. By running at Ring 0, the Win32 subsystem can avoid the overhead of thread context switches.

One side effect of moving the Win32 subsystem to kernel mode (Ring 0) may not be obvious immediately. GDI calls are now implemented at Ring 0, and GDI communicates to your output devices (for example, the display and printers) through device drivers. If you guessed that graphics drivers need to be ported to kernel mode code, your guess is correct.

To test the new Win32 subsystem architecture, I wrote the SETCURS program (see Figure A). SETCURS calls the SetCursor function 5000 times in a loop and uses QueryPerformanceCounter to time the loop. I selected SetCursor because under Windows NT 4.0 the function almost immediately invokes an INT 2Eh. Under Windows NT 3.51, SetCursor takes a much longer code path involving thread switching to the Win32 subsystem. In an informal test, I found that Windows NT 4.0 beta 1 was about three times as fast as Windows NT 3.51. It's unlikely that the internals of the SetCursor function changed much between 3.51 and 4.0, so the speed-up is almost certainly due to implementing SetCursor in kernel mode, thereby avoiding thread switches.

Some people have raised concerns about stability and the shifting of the Win32 subsystem into Ring 0 kernel code. Yes, a buggy video driver that overwrites other kernel data structures can crash the system. On the other hand, a buggy video driver would kill the Win32 subsystem process in prior versions of Windows NT, so the net effect is the same. Regardless of the architecture, robust video and printer drivers are essential.

As for buggy applications overwriting USER or GDI data, there's no need to worry. All of the Ring 0 code executes out of reach of normal application code and uses a different stack. Besides, if it's been good enough for NTDLL.DLL for all these years, it's good enough for USER and GDI code.

USING FIBERS

Since fibers aren't exactly intuitive when you first encounter them, I wrote a small sample application, FIBER (see Figure B). Within main(), the code creates three fibers by calling CreateFiber. Each fiber is created with the same start address, specifically the FiberRoutine function. As each fiber is created, the code stores the fiber's address into a global variable.

At this point, FIBER.CPP has its original thread and three fibers. To make any of the fibers execute, main() calls ConvertThreadToFiber, thereby making a fourth fiber. Just to prove that fibers can't be run without direct intervention, the original thread sleeps for one second. The idea is that none of the three created fibers executes during this time period. After the Sleep function returns, the fiber created using the ConvertThreadToFiber is still running and will continue to run until it calls SwitchToFiber to let some other fiber execute. The fiber that main() selects is the first fiber it created (Fiber1). Fiber1 begins execution at the beginning of FiberRoutine and emits a message indicating which fiber it is.

After displaying its "In Fiber n" message, Fiber1 calls SwitchToFiber. This suspends Fiber1 and executes Fiber2. Since Fiber2's start address is also at the beginning of the FiberRoutine code, it prints out a similar message. This process repeats when Fiber2 switches to Fiber3.

The last thing Fiber3 does is call SwitchToFiber, specifying the address of the fiber created from the original thread. This causes execution to resume in main() where the first SwitchToFiber call occurred. main() cleans up by calling DeleteFiber on the three fibers it created and exits. Figure C shows the end results of a run of FIBER.EXE.

Figure C FIBER

VERSIONITIS, OR "WHAT VERSION AM I?"

One of the problems inherent with having mostly compatible operating systems (Windows NT, Windows 95, and Win32s® under Windows 3.1) is in making a single program work on all platforms. In some cases, this involves determining which operating system you're running on and invoking code specific to that operating system. Sometimes the hardest part is just figuring out what system you're running on. As you might imagine, Windows NT 4.0 adds its own twist to the story.

Back when there was just Windows NT and Win32s, a program could call GetVersion and check the high bit (bit 31) of the returned version DWORD. If the bit was set, the system was Win32s, otherwise you were on Windows NT. When Windows 95 arrived on the scene, the value it returned from GetVersion also had the high bit set. To differentiate between Win32s and Windows 95, you had to look at the version number in the lower bits of the GetVersion return value. If it was 4.0 or greater, you could assume Windows 95, otherwise it was Win32s. I always thought this a bit cheesy, as version numbers are subject to change. In my September 1994 MSJ article, "Investigating the Hybrid Windowing and Messaging Architecture of Chicago," I pointed out that the second highest bit (bit 30) was set for Windows 95 and not for Win32s. All Microsoft documentation, however, suggested relying on the major version number field rather than on bit 30.

As you might imagine, many programmers got their version-checking code wrong and simply checked for the operating system major version number in the low WORD. Until Windows NT 4.0, these programs got away with it because the Windows NT major version number was 3. With Windows NT 4.0, though, both Windows 95 and Windows NT report the same major version number (4). Much to my surprise, the lead story in the March/April 1996 Microsoft Developer Network News suggested that bit 30 could be used to tell the difference between Windows 95 and Win32s. I guess what's old is new again.
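The difference between the correct check and the broken one is easy to show with plain bit tests on the DWORD that GetVersion returns. The following is a minimal sketch in portable C that takes the value as a parameter rather than calling GetVersion. All of the function and type names are hypothetical, and make_version builds illustrative sample values (with the build number field left at zero) rather than values captured from real systems.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    PLATFORM_WINDOWS_NT,
    PLATFORM_WINDOWS_95,
    PLATFORM_WIN32S
} Platform;

/* Hypothetical helper: packs a sample version DWORD. The major version goes
   in the lowest byte, the minor version in the next byte, and high_bits
   carries bit 31 and/or bit 30. */
uint32_t make_version(unsigned major, unsigned minor, uint32_t high_bits)
{
    assert(major <= 0xFFu && minor <= 0xFFu);
    assert((high_bits & 0x3FFFFFFFu) == 0);  /* only bits 31 and 30 allowed */
    return high_bits | ((uint32_t)minor << 8) | (uint32_t)major;
}

/* The check described in the text: test bit 31 first, and only then use the
   major version to separate Windows 95 from Win32s. */
Platform classify_version(uint32_t version)
{
    unsigned major = version & 0xFFu;

    if ((version & 0x80000000u) == 0)
        return PLATFORM_WINDOWS_NT;
    return (major >= 4) ? PLATFORM_WINDOWS_95 : PLATFORM_WIN32S;
}

/* The broken shortcut: looks only at the major version number, so it also
   answers "Windows 95" on Windows NT 4.0. */
int naive_is_windows_95(uint32_t version)
{
    return (version & 0xFFu) == 4;
}
```

With a sample value for Windows NT 4.0, make_version(4, 0, 0), classify_version reports Windows NT while naive_is_windows_95 returns 1, which is exactly the mistake that went unnoticed as long as the Windows NT major version was 3.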

Of course, the real solution to this versioning problem is the GetVersionEx function. It returns explicit bitfields that indicate which platform you're running on. The problem with GetVersionEx is that it's not available on Windows 3.x or in Windows NT 3.1. Thus, if you call it directly from within your code, your programs won't run on those platforms.

Another related problem is the expected Windows version number that the linker puts in EXEs and DLLs. Over time, the meaning of the major version number in Win32 has essentially come to indicate which version of the shell is running. That is, major version 3 means the old ProgMan user interface, while major version 4 is the new Explorer interface. Because the meaning of the version number was contorted beyond its originally intended meaning, strange idiosyncrasies popped up. For instance, Windows NT 3.51 is able to run Windows 95-based applications marked as requiring version 4.0, but it doesn't give these applications the new Windows 95 look. With the release of Windows NT 4.0, these issues should be behind us.

From the August 1996 issue of Microsoft Systems Journal.