Digital Equipment Corporation
Maynard, Massachusetts
October 1994
This article describes how to port Windows NT™ applications on the Intel® X86 platform to DEC® Alpha AXP™. Issues of interest to Windows NT version 3.5 developers, such as compiler descriptions, porting methodologies, and Alpha AXP optimization, are discussed in this article.
This article contains the following chapters and appendixes:
This chapter provides an overview of porting applications from Microsoft® Windows NT™ version 3.5 on Intel® X86 to Windows NT version 3.5 on DEC® Alpha AXP™ machines. This chapter briefly describes Alpha AXP and helps you assess the effort needed to port your applications from Intel X86 to Alpha AXP.
Alpha AXP is a high-performance, 64-bit, operating-system-neutral architecture developed by Digital Equipment Corporation ("DEC," or "Digital"). Alpha AXP has a 64-bit physical and virtual address space and processes 64-bit integers and floating-point numbers.
The Alpha AXP architecture is a scalable architecture that will support faster and faster chip designs over its long design lifetime. First-generation chips (DECchip 21064 family) are available with clock speeds from 133MHz to 200MHz. Second-generation chips (DECchip 21064A family) are available with clock speeds from 225MHz to 275MHz. These chips provide industry-leading performance using dual-issue designs, meaning that two instructions can be processed each clock cycle.
Digital is currently shipping client systems for Windows NT for Alpha AXP that use a 150MHz 21064, and server systems that support up to 4-way SMP and 190MHz 21064 processor boards. Both give excellent performance, and because they have the same byte ordering as Intel X86, they interoperate very efficiently with Intel systems.
Windows NT is a Microsoft operating system that has been ported to and optimized for the Alpha AXP architecture through collaborative engineering work between Digital and Microsoft. It is one of three operating systems (the others being OSF/1 UNIX® and OpenVMS) currently shipping on the Alpha AXP platforms.
Windows NT is a modern, 32-bit, multitasking operating system that exploits some advanced features, such as 64-bit registers, of the Alpha AXP architecture. Windows NT supports a 32-bit virtual address space on Alpha AXP, as it does when implemented on Intel X86 architectures.
Part of Windows NT is a subsystem or facility that allows MS-DOS® and Win16 Intel executables to run on Alpha AXP and interoperate with native Win32® applications. These applications run more slowly than native applications for many reasons, but for personal productivity tools, mail, and many other low-performance applications, performance may be very satisfactory.
In Windows NT version 3.5, OLE 2.01 allows enhanced interoperability between Win16 (16-bit Microsoft Windows®–based applications) and Win32 (32-bit Windows-based) applications. Each Win16 application can run in its own address space, improving system robustness. Performance of these Intel X86 executables on RISC microprocessors such as Alpha AXP is also significantly improved.
The features described above are important to your clients. Your clients will want to interoperate between native, high-performance Win32 Alpha AXP applications and legacy X86 Win16 executables, lower-performance applications, and personal productivity tools. Developers may also find some value in this feature; your support staff will therefore need to understand how Win16 and Win32 applications interoperate and share data on both Intel X86 and Alpha AXP platforms.
In contrast to the support for Intel Win16 executables in Windows NT, native Win32 Intel programs will not execute on Alpha AXP. Recompilation of these applications is the proper method to provide support for Alpha AXP. This recompilation is usually very easy; some developers port significant applications in less than one working day.
Many popular CASE tools and compilers, including the Microsoft SDK, are supported on both Intel X86 Windows NT and on Alpha AXP Windows NT. These common tools ease migration and allow Alpha AXP to be a development platform for deployment on Intel NT, and vice versa.
The following table lists the basic porting steps:
Step | Procedure |
1 | Set up the source and make files on Alpha AXP. |
2 | Ensure the components that make up the development environment are available on Alpha AXP. |
3 | Modify the make files as necessary, so they work in the Alpha AXP nmake environment. |
4 | Compile and link your application. |
5 | Correct any errors and re-compile and/or re-link your application. |
6 | Test your application. |
7 | Optimize your application. |
This chapter describes the CLAXP compiler, as well as other Digital compilers available on Alpha AXP. With the exception of the compiler, the Win32 SDK is the same on both Intel X86 and Alpha AXP. The compiler on the Win32 SDK for Alpha AXP is called CLAXP, not CL386. An alias, CL, is now used on all platforms, making make files more uniform.
CLAXP is Digital's and Microsoft's compiler on Windows NT for Alpha AXP. The CLAXP compiler generates Alpha AXP–specific optimized machine code. The compiler front end, or code interpreter, looks very similar to Microsoft's CL386 compiler. The interpreter accepts and parses most of the CL386 switches. The back end is Digital’s code optimizer that generates code optimized specifically for the Alpha AXP architecture.
For further details on the CLAXP compiler, see the CLAXP Compiler Specifications Version 8.00 document. This document is located in \mstools\bin\claxp.txt. For more information on designing your application to take advantage of the Alpha AXP architecture, see the Windows NT for Alpha AXP Calling Standard document.
The DEC FORTRAN compiler for Alpha AXP is an implementation of full language FORTRAN-77 conforming to American National Standard FORTRAN, ANSI X3.9-1978. The DEC FORTRAN compiler will compile any source code pool that strictly adheres to ANSI FORTRAN-77.
The compiler also supports extensions to the ANSI standard, including a number of extensions defined by the DEC FORTRAN compiler that runs on OpenVMS, DEC OSF/1, and ULTRIX systems. The following list describes some of the more significant extensions:
Other vendors' FORTRAN compiler extensions may differ. However, because Digital's FORTRAN extensions have often become de facto standards, compatibility is likely even when syntax extensions are used.
Certain FORTRAN extensions specific to Microsoft FORTRAN that are not yet supported by DEC FORTRAN are:
These extensions (except the INTERFACE TO statement) will be supported in the next major release of DEC FORTRAN.
Beyond language extensions, the DEC FORTRAN run-time library provides a number of built-in utility routines that supplement the ANSI-defined intrinsic functions. Other compilers are likely to differ in what utility routines are available.
The development kit provided with DEC FORTRAN supports a command line interface and the nmake utility. Source code debugging with a graphical user interface is provided via the windbg utility.
DEC FORTRAN uses GEM as its back-end on all Alpha AXP platforms. The DEC FORTRAN compiler provides a multiphased optimizer that is capable of performing optimizations across entire programs. Although builds may take more time and memory compared to compilers that optimize less thoroughly, the improved performance of highly optimized code at run time is worth the added time.
DEC FORTRAN is run-time compatible with CLAXP. This capability allows you to mix and match FORTRAN and C/C++ modules to meet your application needs. Run-time compatibility with other compilers has not been tested, though it may work if proper calling standards are followed.
The DEC FORTRAN kit includes dynamic-link library (DLL) versions of its Run-Time Libraries (RTLs). These RTLs can be reproduced and distributed royalty-free worldwide.
ASAXP compiles source files written in Alpha AXP assembly language. You need to be familiar with both the Alpha AXP 21064 DECchip architecture and the Alpha AXP assembly language. For assembler syntax, see the file \mstools\bin\asaxp.txt.
In most cases, porting your application simply means recompiling and relinking the application on Alpha AXP. If you use the Win32 SDK to develop your application, you only have to make minor changes to your development environment. This chapter describes how to port your application to Alpha AXP.
If you use the nmake utility on your Intel X86 system, your application's build environment will need very few changes to work on Alpha AXP. If you use another build system, be sure that the system is available on Alpha AXP. Otherwise, you need to modify your make file(s) to work with nmake. The following sections describe the changes you need to be aware of.
The _ALPHA_ symbol defines the Alpha AXP target environment. The make file should pass -D_ALPHA_=1 to the CLAXP compiler. If the make file includes <ntwin32.mak>, this is automatically ensured.
Additionally, ensure that the symbols _MIPS_ and _X86_ are not defined.
Most of the compiler options are the same on both Intel X86 and Alpha AXP. There are two compiler flag changes on Alpha AXP to be aware of:
See Appendix A for a list of compiler options. The \mstools\bin\claxp.txt file provides detailed information about the compiler flags.
If you use other build environments and your build software is not ported to Alpha AXP, you must rebuild your dependency files. The nmake compiler, linker, and resource macros are predefined in a file named \mstools\h\ntwin32.mak. Use this file as a start for rebuilding your application. The Building Applications chapter in the Windows NT SDK Programming Techniques document provides further information about the build process.
In most cases, you do not need to modify the header files. Compile the application first and investigate any header file errors generated by the compiler. The following sections describe some common header file changes that may be required.
Applications developed on multiple platforms may have preprocessing operators defined for different machine types—for example,
#ifdef _X86_, #ifdef _MIPS_. These statements must be updated to apply to Alpha AXP. This is accomplished by adding an #ifdef _ALPHA_ section and adding the appropriate architecture-specific code. These symbols are defined in ntwin32.mak.
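As a minimal sketch of such a conditional section, the fragment below selects an architecture-specific value. The function name target_architecture and the returned strings are hypothetical and exist only for illustration; only the _X86_, _MIPS_, and _ALPHA_ symbols come from the text above.

```c
/* Hypothetical helper: returns a label for the architecture this file
   was compiled for, based on the symbols passed in by the make file. */
const char *target_architecture(void)
{
#if defined(_ALPHA_)
    return "Alpha AXP";     /* section added for the Alpha AXP port */
#elif defined(_MIPS_)
    return "MIPS";
#elif defined(_X86_)
    return "Intel X86";
#else
    return "unknown";       /* none of the three symbols was defined */
#endif
}
```

Keeping a final #else branch makes a missing symbol visible at run time instead of silently selecting code meant for another platform.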
Once the necessary changes have been made to the build environment, executing the nmake utility will compile your application. You can also compile the application from the command line by invoking the CLAXP compiler directly. Most of the CL386 options are available on the CLAXP compiler; Appendix A contains a list of Alpha AXP compiler options.
The link command on Alpha AXP is the same as on Intel X86 and understands the same flags.
There are three debuggers currently available on Alpha AXP systems to help you debug code at the application and kernel levels: windbg, NTSD, and KD. The following section provides a brief introduction to these debuggers.
The WinDebug debugger, windbg, is a GUI-based debugging tool. It is located in \mstools\bin or in the Win32 SDK Tools program group. The debugger allows you to set breakpoints; examine values of local variables, registers, and assembly-language instructions; and so on.
To debug your application, compile the code with the /Zi and /Od options. Then link with debug:full and debugtype:cv (CodeView®-style debugging information), or with debug:full and debugtype:coff (for global only, non-static symbols). The windbg debugger also has full 64-bit register support on Alpha AXP NT systems.
You can use the NT Symbolic Debugger (NTSD) for assembler programs that have been compiled and linked with the debugtype:coff and debug:partial options using the link32 command. There are two commands (rL, rF) available on Windows NT for AXP that are not available on Intel X86. These commands enable you to examine large integers and floating point registers.
For information on commands and how to use NTSD, see the Tools User's Guide in Microsoft Win32 SDK.
The Kernel Debugger (KD) allows you to debug kernel-mode executables and device drivers. KD can also be used to perform remote driver debugging between different architectures. For example, you can use KD to debug an Alpha AXP driver from an Intel machine and vice versa.
This section briefly discusses the architectural issues you need to consider when porting applications from Windows NT for Intel X86 to Windows NT for Alpha AXP.
For a detailed description of the architectural considerations that need to be addressed when porting your application, see the AXP Notes document. You can locate this document in the Win32 SDK Tools program group. The \mstools\bin\claxp.txt file is also a good reference for compiler-specific features.
For defining variable argument lists, the CLAXP compiler supports two header files:
Use the macros va_start(list,v), va_arg(list,mode), and va_end(list) as defined in the header file <stdarg.h>. All programs that properly use the varargs macros for variable argument list processing will port unchanged to Windows NT on Alpha AXP.
The following example illustrates the proper use of the varargs macros:
// Example: VARARGS.C
/* VARARGS.C illustrates passing a variable number of
 * arguments using the following macros:
 *      va_start        va_arg          va_end
 *
 * Also the ANSI and UNIX type:
 *      va_list
 * and the UNIX types:
 *      va_alist        va_dcl
 */
#include <stdio.h>
#include <stdarg.h>
int average( int first, ... );
int main( void )
{
    /* Call with 3 integers (-1 is used as terminator). */
    printf( "Average is: %d\n", average( 2, 3, 4, -1 ) );
    /* Call with 4 integers. */
    printf( "Average is: %d\n", average( 5, 7, 9, 11, -1 ) );
    /* Call with just -1 terminator. */
    printf( "Average is: %d\n", average( -1 ) );
    return 0;
}
/* Returns the average of a variable list of integers. */
int average( int first, ... )
{
    int count = 0, sum = 0, i = first;
    va_list marker;
    va_start( marker, first ); /* Initialize variable arguments */
    while( i != -1 )           /* Stop at the -1 terminator */
    {
        sum += i;
        count++;
        i = va_arg( marker, int );
    }
    va_end( marker ); /* Reset variable arguments */
    return( sum ? (sum / count) : 0 );
}
An error message, "Memory Access Errors", is displayed for uninitialized variables due to inappropriate values on the high order bits of those variables. Use the Disassembly option in windbg to locate the improperly initialized variables.
When using the Disassembly option, a listing from the compiler is very helpful; create a .COD file using the /FAcs CLAXP compiler option. For optimized builds, specify the -O* -Zi options and use the listing file created.
If you notice behavior such as a floating point exception, invalid access, and so on, examine the variables involved and investigate the call stack until you find the offending variable.
The Alpha AXP memory architecture naturally aligns and references data in 2-, 4-, or 8-byte quantities. In Windows NT version 3.5, there is a switch to enable alignment fault exceptions. (By default, the operating system fixes up alignment faults automatically.) Turning off automatic fixups during development is desirable because it allows Alpha AXP application developers to locate alignment faults in their own applications.
For maximum performance, align your data structure components on natural boundaries. Also, avoid using byte and word length integers in favor of longwords or quadwords wherever possible. (For example, avoid using chars and shorts; use ints or int64s instead.)
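The effect of member ordering on natural alignment can be sketched as follows. The structure names, member names, and the padding_bytes helper are hypothetical. On a typical compiler that aligns double on 8-byte boundaries, the first layout occupies 24 bytes and the second 16, but the exact sizes depend on the compiler and target.

```c
#include <stddef.h>

/* Hypothetical record: the 8-byte member sits between two 4-byte
   members, so the compiler pads after id and after count. */
struct Interleaved {
    int    id;
    double total;
    int    count;
};

/* Same members, largest first: every member falls on its natural
   boundary without padding between members. */
struct Ordered {
    double total;
    int    id;
    int    count;
};

/* Bytes of padding in a structure whose members occupy member_bytes. */
size_t padding_bytes(size_t struct_bytes, size_t member_bytes)
{
    return struct_bytes - member_bytes;
}
```

Comparing sizeof(struct Interleaved) with sizeof(struct Ordered) on the target compiler shows how much space the reordering saves.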
If unaligned access of data is required, you can use the UNALIGNED pointer qualifier defined in the Windows NT system include files. Performance with UNALIGNED is better than letting the kernel handle alignment faults (though not as good as using aligned data). When instructed, the compiler inserts routines to handle the unaligned data and thus avoids operating system traps, resulting in significantly improved performance.
The UNALIGNED pointer is portable across all Windows NT platforms, regardless of whether the machine has alignment restrictions or not.
The following is a table of UNALIGNED pointer examples:
Declaration | Comment |
UNALIGNED int *ip; | // pointer to unaligned int |
int UNALIGNED *ip; | // pointer to unaligned int |
int * UNALIGNED ip; | // wrong |
typedef struct _FOO UNALIGNED *PFOO; | // pointer to unaligned struct |
typedef SYMBOLS UNALIGNED *psymbols; | // pointer to unaligned symbol |
UNALIGNED PLONG NextEntry; | // wrong |
PLONG UNALIGNED NextEntry; | // wrong |
LONG UNALIGNED *NextEntry; | // correct |
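The following sketch shows one way a function might accept data that is not guaranteed to be naturally aligned. The function sum_longs is hypothetical. The empty fallback definition of UNALIGNED is an assumption made only so that the fragment compiles when the Windows NT include files are not present; it does not make unaligned access safe on such a compiler.

```c
#include <stddef.h>

/* UNALIGNED normally comes from the Windows NT system include files.
   Assumption: the empty fallback below is for building elsewhere only. */
#ifndef UNALIGNED
#define UNALIGNED
#endif

/* Hypothetical helper: sums n long values through a pointer that the
   caller does not promise to be naturally aligned. The qualifier is
   placed before the '*', matching the "correct" forms in the table. */
long sum_longs(const long UNALIGNED *values, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += values[i];
    return total;
}
```

Because the qualifier is part of the parameter type, every access inside the function is compiled as a potentially unaligned access, whether or not a particular caller passes aligned data.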
There are routines available to help locate the source of alignment traps. Performance gains of up to 20 percent have been observed when the programmer instructs the compiler in such a way that an alignment trap will not occur.
See Appendix C for details on data type natural alignment.
The #pragma pack directive can be used to pack structure members together more tightly than the default packing the compiler would use. In some cases this is necessary to map structures to preexisting data. In other cases, it is to reduce memory use. Packing results in structure members that no longer have their natural alignment; thus CLAXP treats access to these structure members as unaligned. The code generated is bigger and slower than aligned access because the compiler will load the 16-bit, 32-bit, or 64-bit object byte by byte, or by checking the alignment dynamically. This is still much faster than a hardware trap. /ZP8 is the default packing CLAXP would use.
Use the #pragma pack directive only when necessary, such as when compiling a data structure that will be read from a disk file. For example:
#pragma pack(1)
typedef struct
{
...
}
#pragma pack() //resume default
If you use the #pragma pack directive, it is necessary to appropriately declare pointers as UNALIGNED. Note that you may incur a large performance penalty for UNALIGNED access.
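A small sketch of the size difference that packing produces is shown below. The structure names, member names, and the read_length helper are hypothetical; the layout of the unpacked structure depends on the compiler's default packing.

```c
#include <stddef.h>

#pragma pack(1)
/* Hypothetical on-disk header: no padding between members, so
   length starts at byte offset 1 and is not naturally aligned. */
typedef struct {
    char tag;
    int  length;
} PackedHeader;
#pragma pack()   /* resume default packing */

/* The same members with default packing, for comparison. */
typedef struct {
    char tag;
    int  length;
} NaturalHeader;

/* Access by member name is safe: the compiler knows the structure is
   packed and generates code for a possibly unaligned member. */
int read_length(const PackedHeader *header)
{
    return header->length;
}
```

With 1-byte packing, sizeof(PackedHeader) equals sizeof(char) + sizeof(int), while NaturalHeader is normally larger because of the padding inserted after tag.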
On Alpha AXP platforms, LARGE_INTEGER types are treated as one naturally aligned quadword. CLAXP adds 64-bit integer support. CLAXP handles type long double as a 64-bit floating point type (rather than 80-bit). See Appendix B for a data types comparison list.
By default, integer division by zero is reported as an exception on Alpha AXP. The exception may occur on Alpha AXP but not on Intel X86. This is considered a latent bug in the original code and should be corrected.
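One simple way to correct such code is to test the divisor before dividing. The helper below is a hypothetical sketch, not a routine from the SDK.

```c
/* Hypothetical helper: stores dividend / divisor in *quotient and
   returns 1, or returns 0 without dividing when divisor is zero. */
int checked_divide(int dividend, int divisor, int *quotient)
{
    if (divisor == 0)
        return 0;               /* caller decides how to recover */
    *quotient = dividend / divisor;
    return 1;
}
```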
All Windows NT platforms use identical IEEE floating point formats. For finite floating point values, Alpha AXP floating point behavior is identical to that of MIPS® and Intel X86. However, for non-finite floating point values (for example, infinity, denormals, and NaNs) the Alpha AXP will raise a floating point exception when such values are encountered. The floating point exceptions are:
STATUS_FLOAT_DIVIDE_BY_ZERO | 0xC000008E |
STATUS_FLOAT_INVALID_OPERATION | 0xC0000090 |
STATUS_FLOAT_OVERFLOW | 0xC0000091 |
STATUS_FLOAT_UNDERFLOW | 0xC0000093 |
STATUS_FLOAT_INEXACT_RESULT | 0xC000008F |
Note that if you use default options, floating point exceptions that are not reported under Intel X86 may be raised on Alpha AXP.
At the present time, the CLAXP compiler supports the following IEEE related options:
IEEE floating point NaNs, Infinities, and denormals are not supported in the compiled code. Underflows are quickly forced to zero, and the use of a NaN or Infinity raises an exception. This is the default value, and should be used for all applications except those that require IEEE-floating point exception behavior, because it produces the fastest execution speed.
Run-time library routines may still produce NaNs and denormals, however, so the use of the _matherr routine to handle those situations is recommended. If an application does require support for IEEE NaNs and denormals, use the /QAieee option (equivalently, /QAieee1).
IEEE floating point NaNs, Infinities, and denormals are supported. Use this value for applications that expect IEEE-compliant, masked response handling of non-finite operands.
Same as /QAieee1, but IEEE Inexact Operation exceptions are also enabled. Use this value only for applications requiring the IEEE inexact operation exception to be raised (this is almost never needed).
For normal compile modes, if one of the exceptions listed in the previous section does occur, it is likely that the exception PC does not point to the instructions that actually cause the trap. If using a debugger, look for the offending floating point instruction a few instructions prior to the Fir (continuation address). If you are looking near the beginning of a function, look at the last few instructions in the calling frame.
Compile your code using the /QAieee1 option if you want the exceptions to be precise.
The CONTEXT structure is an architecture-dependent data structure that contains register data. If you have code that accesses fields in this structure, you will need to modify the application for Alpha AXP.
The page size is architecture-dependent (4 KB on Intel and 8 KB on Alpha AXP) and should not be hard-coded into applications. If the application assumes the page size to be 4 KB, it will not work correctly on Windows NT on Alpha AXP.
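Instead of hard-coding the value, an application can ask the system for the page size at run time. The sketch below uses the Win32 GetSystemInfo call; the POSIX sysconf branch is an assumption added only so that the fragment also builds outside Windows NT, and the function names query_page_size and round_up_to_pages are hypothetical.

```c
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>   /* assumption: POSIX fallback for other systems */
#endif

/* Returns the page size reported by the running system. */
size_t query_page_size(void)
{
#ifdef _WIN32
    SYSTEM_INFO info;

    GetSystemInfo(&info);
    return (size_t)info.dwPageSize;
#else
    return (size_t)sysconf(_SC_PAGESIZE);
#endif
}

/* Hypothetical helper: rounds a byte count up to whole pages. */
size_t round_up_to_pages(size_t bytes, size_t page_size)
{
    return ((bytes + page_size - 1) / page_size) * page_size;
}
```

A buffer sized with round_up_to_pages(bytes, query_page_size()) adapts to the 4 KB and 8 KB page sizes without source changes.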
Although the LARGE_INTEGER data type is a 64-bit integer, it is not a quadword. A quadword is a 64-bit integer data type and is not supported on Windows NT for Alpha AXP. LARGE_INTEGER is created from an array of two longwords. The LARGE_INTEGER data type can only be used in conjunction with a set of run-time library functions. (For example, LargeIntegerAdd, and so on.)
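To illustrate why a value built from two longwords needs helper routines, the sketch below adds two such values with an explicit carry. The TwoLongwords type and the two_longwords_add function are hypothetical stand-ins; this is not the LARGE_INTEGER definition from the system headers or the implementation of the run-time library functions.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit value stored as two longwords. */
typedef struct {
    uint32_t low;    /* least significant longword */
    int32_t  high;   /* most significant longword, carries the sign */
} TwoLongwords;

/* Adds two values longword by longword, propagating the carry. */
TwoLongwords two_longwords_add(TwoLongwords a, TwoLongwords b)
{
    TwoLongwords sum;
    uint32_t carry;

    sum.low  = a.low + b.low;                 /* wraps modulo 2^32 */
    carry    = (sum.low < a.low) ? 1u : 0u;   /* a wrap means carry out */
    sum.high = (int32_t)((uint32_t)a.high + (uint32_t)b.high + carry);
    return sum;
}
```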
Ideally, loads and stores of shared data are atomic. An atomic load or store requires a single instruction. On the Intel X86 platform, data of 1, 2, or 4 bytes can be loaded or stored atomically, but on Alpha AXP the size is 4 or 8 bytes.
To ensure portability across platforms, it is necessary to protect multithreaded access to shared data structures with locks (for example, EnterCriticalSection and LeaveCriticalSection).
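A minimal sketch of this locking pattern follows. The counter and the function names are hypothetical. The Win32 branch uses the critical section calls named above; the POSIX mutex branch is an assumption added only so that the fragment also builds outside Windows NT.

```c
#ifdef _WIN32
#include <windows.h>
static CRITICAL_SECTION counter_lock;
#else
#include <pthread.h>  /* assumption: POSIX fallback for other systems */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
#endif

static long shared_counter = 0;   /* hypothetical shared data */

/* Call once, before any thread touches the counter. */
void counter_init(void)
{
#ifdef _WIN32
    InitializeCriticalSection(&counter_lock);
#endif
    shared_counter = 0;
}

/* Increments the counter under the lock and returns the new value. */
long counter_increment(void)
{
    long value;

#ifdef _WIN32
    EnterCriticalSection(&counter_lock);
#else
    pthread_mutex_lock(&counter_lock);
#endif
    value = ++shared_counter;     /* the only access to shared data */
#ifdef _WIN32
    LeaveCriticalSection(&counter_lock);
#else
    pthread_mutex_unlock(&counter_lock);
#endif
    return value;
}
```

Because every access goes through the lock, the code does not depend on which data sizes a given processor can load or store atomically.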
The jump buffer is compiler-specific and version-specific. If you use setjmp/longjmp, do not link objects produced by various compilers or versions.
The CLAXP compiler does not produce any assembler source files.
This chapter lists some of the common utilities that are publicly available on the Internet for Alpha AXP.
You can access Digital’s Alpha AXP Developer Support home page on the World Wide Web by specifying the following Universal Resource Locator (URL):
http://www.digital.com/www-swdev/
From the Alpha AXP Developer Support home page, follow the steps below to access the public domain tools:
Step | To reach... | Click on... |
1 | Alpha AXP Technical Support page | TECH button (on the Alpha AXP Support home page) |
2 | Microsoft Windows NT Software page | Microsoft Windows NT (in the Software Area) |
3 | Windows NT Public Domain page (this is where the tools are stored) | Unsupported Software Tools Built for Alpha AXP (in the Public Domain Software area) |
Most of the public domain utilities also contain the executables for Intel-based Windows NT machines. Note that all of these tools were collected from the Internet and are copyrighted as per the agreements in the individual source code. Digital makes no warranties, either written or implied, concerning this software. The following table lists the public domain tools available on the Internet.
Tool | Description |
bsdcmpat.lib | Contains a library of routines to help in porting to NT. Routines include bcmp, bcopy, bstring, bzero, getopt, index, and isctype. Headers include ctype, getopt, paths, string, strings, and unistd. |
cal.exe | Prints a calendar. If you specify a number between 1 and 12 for month, only that month is printed for that year. Year can be between 1 and 9999. |
cat.exe | Reads each file in sequence and displays it on the standard output. |
cmp.exe | Compares two files. With no options, cmp makes no comment if the files are the same. If they differ, it reports the byte and line number at which the difference occurred to the standard output. |
color.exe | Changes the color of the foreground and background. The available colors are black, blue, green, cyan, red, and magenta. |
comm.exe | Compares sorted data. |
compress.exe | Uses modified Lempel-Ziv compression. This command is compatible with the compress and decompress programs used on UNIX systems. This is version 4 and supports up to 16-bit compression. |
egrep.exe | Searches a file for regular expressions. Egrep patterns are full regular expressions. |
grep.exe | Searches a file for regular expressions. Grep patterns are limited regular expressions in the style of 'ex'. |
flex.exe | Runs the Fast Lexical Analyzer Generator (FLEX), which generates C source code for programs that recognize lexical patterns in text. |
fold.exe | Folds the contents of the specified files, or the standard input if none are specified, breaking the lines to have a maximum of 80 characters. |
head.exe | Gives the first n lines of the specified files or the standard input. |
ls.exe | Acts as a UNIX ls work-alike. Some options are different or absent. Use the -? command line argument for help. The executable is compiled for I386 and should be UNIX-code–compatible, but hasn't been tested. |
mawk.exe | Implements the AWK programming language. |
mewinnt.exe | Invokes the MicroEMACS windows editor. |
par.exe | Copies, by way of a filter, its input to its output, changing all white characters (except newlines) to spaces, and reformatting each paragraph. Paragraphs are delimited by vacant lines, which are lines containing no more than a prefix, suffix, and intervening spaces. |
perl.exe, perlglob.exe | Combines, using the Perl language, some of the features of C, sed, awk, and shell. |
sed.exe | Uses regular-expression routines from EMACS (may not be fast). GNU sed is a batch stream editor. For speed, use Perl. |
soss.exe | Runs SOSS, a file server conforming to SUN Microsystems' NFS protocol version 2. |
tar.exe | Executes a version of the tape archiver command available on most UNIX machines based upon the GNU tar utility. It does not yet utilize the tape drive available. |
uniq.exe | Reports repeated lines in a file. The repeated lines must be adjacent in order to be found. |
unshar.exe | Extracts files from the SHELL archive. |
viewps.exe | PostScript text extractor. |
win100.exe | Invokes Kermit and terminal emulator. |
winvn.exe | Runs the Visual Usenet news reader for Microsoft Windows. |
xstr.exe | Extracts and hashes strings in a C program. |
xvi.exe | Runs a portable multiwindow version of the UNIX editor, 'vi'. |
yacc.exe | Runs Berkeley Yacc, an LALR parser generator. It has been made as compatible as possible with AT&T® Yacc. |
zip.exe, unzip.exe | Packages and compresses (archives) files. Lists, tests, and extracts files from a ZIP archive. |
This chapter contains a summary of tips and hints for optimizing your application on Windows NT for Alpha AXP. For detailed performance analysis and optimization techniques on Windows NT, see Optimizing Windows NT (Volume 3 of the Windows NT Resource Kit) by Russ Blake, from Microsoft Press®.
An important tool in optimizing your application is the compiler. Once your application is working, recompile the application with optimization turned on (using the /Ox switch). The performance gain will vary from application to application but typically you can expect a 20–30 percent gain in performance. Also consider specifying some of the other CLAXP optimization switches summarized below for any potential gain in performance.
Since you often develop your application initially in debug mode to facilitate testing and debugging of your application (using the -Zi -Od switches), remember that the compiler has turned off optimization. You can quickly turn off debugging in nmake by passing nmake a nodebug flag—for example, >nmake nodebug=1.
Use the following options for optimization:
Option | Description |
/Ox | full optimization (except in-lining) |
/O2 | full optimization (same as /Ox /Ob2) |
/d2O3 /Ob2 /Oi | full optimization with byte vectorization |
/d2Gt64 | use of Global Pointer |
You can also use the UNALIGNED keyword for unaligned data. See the ALIGNMENT section below.
If you are using the DEC F77 compiler for your FORTRAN application, the following table lists the optimization switches that are not enabled by default:
Optimization Switch | Description | Notes |
/align:dcommons | Aligns COMMON data blocks on natural boundaries up to eight bytes. | These switches may be enabled en masse via /fast. |
/assume:noaccuracy_sensitive | Allows floating point operations to be reordered. | |
/math_library:fast | Uses versions of some intrinsics that trade a small amount of accuracy for improved performance. | |
/inline:all | Inlines every possible routine. | Use these switches carefully. By default, the compiler automatically inlines and unrolls according to its heuristics. |
/unroll:<count> | Specifies how many times loops are unrolled. | |
Key tools for performance are described below. For more detailed information on the use of these tools, see the Optimizing Windows NT book.
PerfMon (Performance Monitor)
Performance Monitor (PerfMon) analyzes the performance of your system or application. Use this tool to analyze all the key areas for potential bottlenecks—CPU usage, disk I/O activity, memory statistics, network traffic, and so forth.
WAP (Windows API Profiler)
WAP profiles the Windows API, showing how much time is spent in each Win32 API call. It can be run without recompiling your application and is part of the Windows NT SDK.
CAP (Call Attributed Profiler)
CAP profiles your entire application (how much time is spent in each function). This is used to identify the "hot spots" in your program so you can focus on those areas for optimization. CAP requires that you recompile your application with the -Gh option on the C compiler. (Currently only C programs are supported.) It is part of the Windows NT SDK.
WST (Working Set Tuner)
WST can improve the speed of your program by reducing processor cache bottlenecks. It does this by telling the linker to reorder the functions in your program in the order that most reduces paging. Using WST involves several steps. See the Optimizing Windows NT book for more details.
In both versions of Windows NT (3.1 and 3.5), the operating system automatically resolves Alpha AXP alignment faults at run time by default. In most cases this is a desirable feature because the alternative would be for an application to unexpectedly terminate with a data misalignment exception if any alignment fault occurred.
However, allowing the operating system to resolve alignment faults can degrade the performance of your application if there are hundreds or thousands of alignment fixups per second. The rate of alignment fixups can be monitored using the PerfMon or wperf performance tools.
To eliminate alignment errors in your application on Alpha AXP, change the default operating system control for alignment exceptions so that alignment faults become visible to your application. You can then use the debugger to locate the source of the alignment faults.
You can change operating system defaults in the registry by using the new SDK tool called axpalign (an easy and preferred method) or with the regedt32 method. The two methods are discussed below.
To enable alignment fault exceptions enter:
axpalign /enable
To disable alignment fault exceptions enter:
axpalign /disable
EnableAlignmentFaultExceptions : REG_DWORD : 0x1
To disable alignment fault exceptions and revert to the default operating system behavior, enter the following value:
EnableAlignmentFaultExceptions : REG_DWORD : 0x0
After using either method, you will need to reboot your system. The changes apply system-wide and affect all applications. Use this method only while locating alignment faults in your own system or application. Otherwise, older applications that still contain alignment errors may terminate with data misalignment exceptions.
If your application requires that the operating system handle alignment faults regardless of the operating system alignment exception control, insert the following statement early in your program:
SetErrorMode(SEM_NOALIGNMENTFAULTEXCEPT);
This statement must be executed before any alignment error can occur. The effect of the statement is to set a flag that causes the operating system to handle alignment faults for your program.
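As a hedged illustration of the call above, the sketch below wraps it in a helper that also preserves any error-mode flags already in effect, since SetErrorMode replaces the whole mode rather than adding to it. The helper name EnableAlignmentFixups is hypothetical, and the non-Windows branch is only a stand-in so the sketch compiles elsewhere; it is not the real API.

```c
#ifdef _WIN32
#include <windows.h>
#else
/* Hypothetical stand-ins so the sketch compiles off Windows.
   They only mimic the "returns previous mode" behavior. */
#define SEM_NOALIGNMENTFAULTEXCEPT 0x0004u
static unsigned int g_errorMode;
static unsigned int SetErrorMode(unsigned int mode)
{
    unsigned int prev = g_errorMode;
    g_errorMode = mode;
    return prev;
}
#endif

/* Hypothetical helper: ask the operating system to fix up alignment
   faults for this process while keeping any flags already set.
   Returns the error mode now in effect. */
static unsigned int EnableAlignmentFixups(void)
{
    /* The first call sets the flag and reports the previous mode. */
    unsigned int prev = SetErrorMode(SEM_NOALIGNMENTFAULTEXCEPT);
    unsigned int mode = prev | SEM_NOALIGNMENTFAULTEXCEPT;

    /* The second call restores the previous flags alongside the new one. */
    SetErrorMode(mode);
    return mode;
}
```

Calling EnableAlignmentFixups() as the first statement of main or WinMain keeps it ahead of any code that could touch misaligned data.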
Use the PerfMon or wperf tools to display the number of alignment fixups per second. If this value is zero, you do not need to debug your system or application. A rate below roughly 100 to 500 alignment fixups per second is not considered a performance problem. However, addressing alignment faults is essential for obtaining good performance on RISC architectures.
There are several techniques for addressing alignment faults:
long *ip;
...
percent = *(double *)ip / 100.0;
will cause an alignment trap if the value of pointer ip is not a multiple of 8. Since ip is merely a pointer to a 4-byte long type, there is no reason to believe it will be a multiple of 8. The compiler cannot detect this at compile time. Another example is when the address for some data is obtained from a memory allocation function that makes no guarantee about the alignment of the address it returns.
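When data may sit at an address with no alignment guarantee, one common way to avoid the trap is to copy the bytes into a properly aligned local variable instead of dereferencing a cast pointer. The article does not prescribe this approach; it is a minimal sketch, and ReadDoubleUnaligned is a hypothetical helper name.

```c
#include <string.h>

/* Hypothetical helper: read a double from an address that may not be
   a multiple of 8. memcpy makes no alignment assumption about the
   source, and the local variable is always naturally aligned. */
static double ReadDoubleUnaligned(const void *p)
{
    double value;
    memcpy(&value, p, sizeof value);
    return value;
}
```

The caller must still make sure that at least sizeof(double) valid bytes exist at the address; a pointer to a 4-byte long does not meet that requirement.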
For example, if MyAlloc() does not round up addresses to 8-byte multiples, then:
TimePtr = MyAlloc(8);
...
MyQuerySystemTime((PLARGE_INTEGER)TimePtr);
may result in an alignment trap because MyQuerySystemTime is expecting the address of a properly aligned LARGE_INTEGER type.
Note that on MIPS platforms this particular code example will work as expected even though the code is wrong, because LARGE_INTEGER types are treated as two naturally aligned longwords. On Alpha AXP platforms the code may result in an alignment trap because LARGE_INTEGER types are treated as one naturally aligned quadword.
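If you control an allocator such as MyAlloc(), rounding the addresses it returns up to an 8-byte multiple removes this class of fault. The sketch below shows only the rounding step and assumes the alignment is a power of two; AlignUp is a hypothetical name, and a real allocator would also need to reserve the extra bytes the rounding can consume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round addr up to the next multiple of alignment.
   alignment must be a nonzero power of two (8 for quadword alignment). */
static uintptr_t AlignUp(uintptr_t addr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + (alignment - 1)) & ~(uintptr_t)(alignment - 1);
}
```

An address that is already aligned is returned unchanged, so the helper is safe to apply to every allocation.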
struct _FOO {
...
long count;
};
struct _FOO UNALIGNED *pFoo;
void SetCounter(long * ip);
...
pFoo->count++;            // OK: the compiler knows the long may be unaligned.
SetCounter(&pFoo->count); // WRONG! SetCounter expects a pointer to a normal, aligned long.
One workaround in this case is to declare SetCounter to receive an UNALIGNED pointer rather than a normal pointer. Another workaround is to declare a local count variable and pass its address to SetCounter.
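The local-variable workaround can be sketched as follows. To keep the example portable, a packed structure stands in for the UNALIGNED qualifier (an assumption, not the article's code): packing leaves count at an odd offset, and the compiler knows it must access that member safely. The names Foo, tag, and BumpCounter are hypothetical.

```c
/* Packing forces count to offset 1, so it is not naturally aligned. */
#pragma pack(push, 1)
struct Foo {
    char tag;   /* hypothetical leading member that misaligns count */
    long count;
};
#pragma pack(pop)

/* Expects a pointer to a normal, aligned long. */
static void SetCounter(long *ip)
{
    *ip += 1;
}

/* Hypothetical wrapper: copy the member into an aligned local,
   pass the local's address, then store the result back. */
static void BumpCounter(struct Foo *pFoo)
{
    long local = pFoo->count;  /* compiler emits an unaligned-safe load */
    SetCounter(&local);        /* local is naturally aligned */
    pFoo->count = local;       /* unaligned-safe store */
}
```

The extra copy costs a few instructions, which is usually far cheaper than an operating system alignment fixup.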
Note that the Windows NT system services do not explicitly enforce quadword alignment of quadword pointer parameters.
CLAXP | CL386 | Description |
/?,/help | /?,/help | print compiler options |
/batch | /batch | batch compiler mode |
/Bd | /Bd | verbose, shows all default macros and include files |
/c | /c | compile only, no link |
/C | /C | preserve comments during preprocessing |
/D<name>{=|#}<text> | /D<name>{=|#}<text> | define macro or constant |
/E | /E | preprocess to stdout |
/EP | /EP | preprocess to stdout, but no #line |
/F<num> | /F<num> | set stack size |
/Fa[file] | /Fa[file] | name assembly listing file |
/FA<s|c> | /FA<a|s|c> | configure assembly listing, in CLAXP /FAa=/FAc |
/Fd[file] | /Fd[file] | specify program database .PDB file |
/Fe<file> | /Fe<file> | specify executable file |
/FI[file],/Fc[file] | /FI[file] | specify forced include file, use /FAc, /FAcs |
/Fm[file] | /Fm[file] | specify linker map file |
/Fo<file> | /Fo<file> | specify object file |
/Fp<file> | /Fp<file> | specify precompiled header file |
/Fr[file] | /Fr[file] | specify source browser file |
/FR[file] | /FR[file] | specify extended .SBR file |
not available | /G3 | optimize for 80386 |
not available | /G4 | optimize for 80486 (default) |
not available | /G5 | optimize for Pentium |
not available | /Gd | __cdecl calling convention |
/Ge | /Ge | enable stack checking calls (default) |
/Gf | /Gf | enable string pooling |
/Gh | /Gh | enable hook __penter function call |
not available | /Gr | __fastcall calling convention |
/Gs[num] | /Gs[num] | disable stack checking calls |
/Gt<n> | /Gt<n> | threshold for gp-relative data |
/Gy | /Gy | separate functions for linker |
/Gz | /Gz | __stdcall calling convention |
/H<num> | /H<num> | max external name length |
/I<dir> | /I<dir> | add to include search path |
/J | /J | default char type is unsigned |
/link | /link | linker control options |
not available | /LD | create .DLL |
/MD | /MD | link with MSVCRT.LIB |
/ML | /ML | link with LIBC.LIB |
/nologo | /nologo | suppress logo and copyright message |
/O | not available | maximum speed, /Oi /Ob2 or /O2 |
/O1 | /O1 | minimize space |
/O2 | /O2 | maximize speed (same as /O) |
not available | /Oa | assume no aliasing |
/Ob<0|1|2> | /Ob<0|1|2> | inline expansion (default n=0) |
/Od | /Od | disable optimizations (default if /Zi and no /O*) |
not available | /Og | enable global optimization |
/Oi[-] | /Oi[-] | enable intrinsic functions (default=/Oi-) |
/Op[-] | /Op[-] | improve floating-pt consistency (decrease performance) |
not available (see /O1) | /Os | minimize space |
not available (see /O2) | /Ot | maximize speed |
not available | /Ow | assume cross-function aliasing |
/Ox | /Ox | maximum opts (/Ogityb1 /Gs for CL386, /Oi /Ob2 for CLAXP) |
not available | /Oy[-] | enable frame pointer omission |
/P | /P | preprocess to file |
not available | /QmipsGx | generate MIPS-specific instructions |
/QAgl | not available | generate fetches and stores in units of longword |
/QAgq | not available | generate fetches and stores in units of quadword |
/QAieee, /QAieee1 | not available | IEEE floating point NaNs, infinities, and denormals support |
/QAieee0 | not available | disable IEEE floating point support |
/QAieee2 | not available | /QAieee1 and IEEE Inexact Operation exception support |
/Tc<source file> | /Tc<source file> | compile file as .c |
/Tp<source file> | /Tp<source file> | compile file as .cpp |
/u | /u | remove all predefined macros |
/U<name> | /U<name> | remove predefined macro |
/V<string> | /V<string> | set version string |
/vd<0|1> | /vd<0|1> | disable/enable vtordisp |
/vmb | /vmb | best case for pointers to class members |
/vmg | /vmg | full generality for pointers to class members |
/vms | /vms | define single inheritance |
/vmm | /vmm | define multiple inheritance |
/vmv | /vmv | define virtual inheritance |
/w,/W0 | /w,/W0 | disable all warnings |
/W<n> | /W<n> | set warning level (default n=1) |
/WX | /WX | treat warnings as errors |
/X | /X | ignore standard include directories |
/Yc[file] | /Yc[file] | create .PCH file |
/Yd | /Yd | put debug info in every .OBJ |
/Yu | /Yu | use .PCH file |
/YX[file] | /YX[file] | automatic .PCH |
/Z7 | /Z7 | C7 style CodeView information |
/Za | /Za | ANSI compatibility (implies /Op) |
/Zd | /Zd | debugging information |
/Ze | /Ze | enable extensions (default) |
/Zg | /Zg | generate function prototypes |
/Zh | /Zh | home arguments (low-level debugging) |
/Zi | /Zi | prepare for debugging (CodeView I information for windbg) |
not available | /Zl | omit default library name in .OBJ |
/Zn | /Zn | turn off SBRPACK for .SBR files |
/Zp[n] | /Zp[n] | pack structs on n-byte boundary |
/Zs | /Zs | syntax check only |
Data Type | Alpha AXP size (bytes) | Intel X86 size (bytes) |
char | 1 | 1 |
unsigned char | 1 | 1 |
short | 2 | 2 |
unsigned short | 2 | 2 |
int | 4 | 4 |
unsigned int | 4 | 4 |
long | 4 | 4 |
unsigned long | 4 | 4 |
void * | 4 | 4 |
char * | 4 | 4 |
float | 4 | 4 |
double | 8 | 8 |
long double | 8 | 10 |
All data type sizes are identical on the two platforms except for long double.
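To confirm these sizes with the compiler you are actually using, a short report such as the sketch below can be built on each platform and the output compared. PrintTypeSizes is a hypothetical helper, and the values it prints depend on the compiler and target, so no expected output is shown.

```c
#include <stdio.h>

/* Hypothetical helper: print the size in bytes of the types most
   likely to matter when porting. Returns the number of lines written. */
static int PrintTypeSizes(FILE *out)
{
    int lines = 0;
    lines += fprintf(out, "int         %u\n", (unsigned)sizeof(int)) > 0;
    lines += fprintf(out, "long        %u\n", (unsigned)sizeof(long)) > 0;
    lines += fprintf(out, "void *      %u\n", (unsigned)sizeof(void *)) > 0;
    lines += fprintf(out, "double      %u\n", (unsigned)sizeof(double)) > 0;
    lines += fprintf(out, "long double %u\n", (unsigned)sizeof(long double)) > 0;
    return lines;
}
```

Code that writes long double values to files or shared memory is the most likely place for a size difference to surface.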
Data Type | Alignment Starting Position |
8-bit character string | Byte boundary |
16-bit integer | Address that is a multiple of 2 (word alignment) |
32-bit integer | Address that is a multiple of 4 (longword alignment) |
64-bit integer | Address that is a multiple of 8 (quadword alignment) |
IEEE floating single S | Address that is a multiple of 4 (longword alignment) |
IEEE floating double T | Address that is a multiple of 8 (quadword alignment) |
IEEE floating extended X | Address that is a multiple of 16 (octaword alignment) |
IEEE floating single S complex | Address that is a multiple of 4 (longword alignment) |
IEEE floating double T complex | Address that is a multiple of 8 (quadword alignment) |
IEEE floating extended X complex | Address that is a multiple of 16 (octaword alignment) |
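The rule in this table is that an item's address should be a multiple of its natural alignment. A minimal sketch of that check, usable in debug assertions while hunting alignment faults, is shown below; IsAligned is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns nonzero if p is a multiple of alignment.
   Pass 2 for word, 4 for longword, 8 for quadword, and 16 for
   octaword alignment. alignment must be nonzero. */
static int IsAligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

For example, checking IsAligned(ptr, 8) before casting ptr to a pointer to a 64-bit integer or an IEEE double catches the misalignment at its source rather than at the faulting load.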
In addition to this article, the following documents will help you in porting your application from Windows NT for Intel X86 to Alpha AXP:
Microsoft Win32 Software Development Kit for Alpha AXP, on \mstools\bin\axpnotes.txt.
Microsoft Win32 Software Development Kit for Alpha AXP, on \mstools\bin\claxp.txt.
Microsoft Win32 Software Development Kit for Alpha AXP, on \mstools\bin\asaxp.txt.
Microsoft Win32 Software Development Kit for Alpha AXP, on \mstools\bin\wap.txt.
Windows NT for Alpha AXP Calling Standards, Digital Equipment Corporation, Rev 1.7, January 1994.
Russ Blake, Optimizing Windows NT, Microsoft Corporation, Summer 1993.
Microsoft Win32 SDK version 3.5, Tools User's Guide, Microsoft Corporation, 1993.
********************
Copyright 1994 Digital Equipment Corporation. All rights reserved. Restricted rights: Use, duplication, or disclosure by the U.S. Government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013.
The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation. Digital Equipment Corporation assumes no responsibility for any errors that may appear in this document.
The software described in this document is furnished under a license and may be used or copied only in accordance with the terms of such license.
No responsibility is assumed for the use or reliability of software on equipment that is not supplied by Digital or its affiliated companies.
Alpha AXP, DEC C, DEC FORTRAN, DEC OSF/1, OpenVMS, ULTRIX, and VAX are trademarks of Digital Equipment Corporation.
Intel is a trademark of Intel Corporation.
CodeView, Microsoft, Microsoft Press, MS-DOS, and Windows are registered trademarks and Windows NT and Visual C++ are trademarks of Microsoft Corporation.
All other trademarks and registered trademarks are the property of their respective holders.