In general, alignment errors can be avoided by following these rules:
char a[10];
char *p = &a[1];
long l = *(long *)p; // ERROR: attempts to load a long from an
// address that is only char-aligned.
However, the converse operation, accessing a large-aligned address with a recast pointer of smaller alignment, is perfectly safe. For example, you could use a (char *) cast to access the first byte, or any byte, of a long variable.
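As a minimal sketch of the safe direction, the following helper views a long through a character pointer. The function name sum_bytes_of_long is hypothetical and is used only for illustration.

```c
#include <stddef.h>

/* Sum the bytes of a long by viewing it through an unsigned char pointer.
   Character types have an alignment requirement of 1, so each of these
   accesses is aligned no matter where the long itself is stored. */
unsigned long sum_bytes_of_long(const long *value)
{
    const unsigned char *bytes = (const unsigned char *)value;
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < sizeof *value; i++) {
        total += bytes[i];
    }
    return total;
}
```

Because only single bytes are read, the result does not depend on the byte order of the machine.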
In programs that use large arrays of structures, it may be advantageous to use packed structures to conserve space, or you may be required to use structure packing to read a pre-existing data format. In such cases, you can still use packed structures, provided that you carefully “unpack” the members of a structure before using the data in the program. This technique may involve copying the data in a structure member by member, or element by element, or field by field, into a temporary location that is correctly aligned.
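The unpacking technique might be sketched as follows, assuming a hypothetical record format in which a 1-byte tag is followed immediately by an int with no padding. The names record and unpack_record are illustrative and are not part of any existing header.

```c
#include <string.h>

/* Naturally aligned in-memory form of the hypothetical record. */
struct record {
    char tag;
    int  value;
};

/* Copy each field out of the packed bytes into an aligned structure.
   memcpy moves the data as bytes, so the misaligned source address
   (raw + 1) is never dereferenced as an int. */
void unpack_record(const unsigned char *raw, struct record *out)
{
    out->tag = (char)raw[0];
    memcpy(&out->value, raw + 1, sizeof out->value);
}
```

After the copy, the program can work with out->value freely, because the destination structure is laid out with natural alignment.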
Alpha versions of the compiler automate safe access to unaligned data by providing the __unaligned keyword. If you are writing code meant to compile in both Alpha and x86 environments, you should define a macro named UNALIGNED as __unaligned and use the macro in your declarations, because headers are designed to remove the __unaligned keyword from x86 builds. For example:
#if defined(_M_ALPHA) || defined(_M_MRX000) || defined(_M_PPC)
#define UNALIGNED __unaligned
#else
#define UNALIGNED // NULL
#endif
Syntactically, UNALIGNED (or __unaligned) is a type qualifier like const and volatile; it controls the meaning of what the pointer points to. However, __unaligned has meaning only when used in a pointer declaration. Some sample UNALIGNED pointer declarations are shown in the following example:
#define UNALIGNED __unaligned
UNALIGNED int *p1; // p1 is a pointer to unaligned int
UNALIGNED struct {int i;} *p2; // p2 is a pointer to unaligned struct
(UNALIGNED int *) q; // cast q to an unaligned int*
As an example of use, suppose that p1 is a pointer to unaligned int, as above, and that the packsize is set to 1. A packsize of 1 means that each member of a structure is aligned on the next available byte, regardless of alignment considerations. The following code illustrates the correct and incorrect use of __unaligned and #pragma pack in conjunction with integer operations. In the first section, a fault is generated because the __unaligned qualifier is not used, whereas in the second section, the __unaligned qualifier is used correctly.
#pragma pack (1)
struct s {
char c; // offset 0
int i; // offset 1!
} ss;
#pragma pack ()
void f(int *p)
{
*p = 23; // will generate a fault
}
void g(__unaligned int *p) // OK
{
*p = 42;
}
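int main(void)
{
f(&ss.i); // passes a misaligned address as a plain int*
g(&ss.i); // same address, but accessed as unaligned data
return 0;
}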
The following example shows the machine code output from the sample above. Function f shows the code generated by the improper handling of unaligned data, and function g shows the extra code generated when __unaligned is used.
Note that in function g, even though approximately 18 extra instructions are generated to handle the unaligned data, that code will execute much faster than if the operating system is left to perform the alignment fixup, as in function f.
f::
mov 23, t0
stl t0, (a0) // This instruction will get an alignment fault.
ret ra // 300X slowdown
nop
g::
and a0, 3, t4
mov 42, t0
bne t4, L$3
stl t0, (a0)
ret ra
L$3:
bic a0, 3, t2 // Code to safely handle an unaligned address.
and a0, 3, t5 // 3X slowdown. This is much faster than letting
ldl t3, (t2) // the OS handle it.
insll t0, t5, t4
mskll t3, t5, t3
addq t5, 4, t5
bis t3, t4, t3
ldl t4, 4(t2)
stl t3, (t2)
inslh t0, t5, t3
msklh t4, t5, t4
bis t4, t3, t4
stl t4, 4(t2)
ret ra
There is a performance penalty for accessing data through an __unaligned pointer. To guarantee that there are no alignment errors, the compiler must access the dereferenced data as a series of smaller pieces. If the data is already properly aligned, this technique for accessing data is not necessary. Therefore, use the __unaligned keyword only when needed.
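To make the idea of "a series of smaller pieces" concrete, here is a portable sketch that stores and loads a 32-bit value one byte at a time. It only illustrates the principle; it is not the code the compiler emits, and the function names are hypothetical. Little-endian byte order is assumed.

```c
/* Store the low 32 bits of value at an arbitrary address, one byte at a
   time, least significant byte first (little-endian order assumed). */
void store_u32_bytewise(unsigned char *dst, unsigned long value)
{
    dst[0] = (unsigned char)(value & 0xFF);
    dst[1] = (unsigned char)((value >> 8) & 0xFF);
    dst[2] = (unsigned char)((value >> 16) & 0xFF);
    dst[3] = (unsigned char)((value >> 24) & 0xFF);
}

/* Reassemble a 32-bit value from four individual byte loads. */
unsigned long load_u32_bytewise(const unsigned char *src)
{
    return (unsigned long)src[0]
         | ((unsigned long)src[1] << 8)
         | ((unsigned long)src[2] << 16)
         | ((unsigned long)src[3] << 24);
}
```

Each 4-byte access becomes four byte accesses plus shifts and masks, which is why this style of access is worth paying for only when the address really may be misaligned.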
The problem here is that the structure actually requires 12 bytes, when the programmer may have intended it to take up only 8 bytes. If space requirements are critical, you can reorder the members of the structure so that the same-sized elements are next to each other and pack tightly. Usually, you start with the largest members and work your way down to the smallest ones. The previous example could be reorganized as:
struct x_
{
int b; // 4 bytes
short c; // 2 bytes
char d; // 1 byte
char a; // 1 byte
} foo;
Now all of the members end up on naturally aligned boundaries, and the actual size of the structure is 8 bytes instead of 12.
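The effect of reordering can be checked with sizeof and offsetof. In the sketch below, the member order of struct padded is an assumption inferred from the 12-byte figure quoted above, and the struct and function names are hypothetical. The exact sizes depend on the compiler and target, so the helper prints them rather than asserting fixed values.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed original ordering: the int and the trailing alignment
   requirement each force padding bytes into the structure. */
struct padded {
    char  a;
    int   b;
    short c;
    char  d;
};

/* Largest members first, so each member falls on its natural boundary. */
struct reordered {
    int   b;
    short c;
    char  d;
    char  a;
};

/* Print the size of each layout and where the int member lands. */
void print_layouts(void)
{
    printf("padded:    size %lu, b at offset %lu\n",
           (unsigned long)sizeof(struct padded),
           (unsigned long)offsetof(struct padded, b));
    printf("reordered: size %lu, b at offset %lu\n",
           (unsigned long)sizeof(struct reordered),
           (unsigned long)offsetof(struct reordered, b));
}
```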
There are two problems with this approach. The first problem is that reordering assumes that the user has full control over how the data structure is defined and laid out. If, for example, the data structure represents the layout of fields in a file on disk, the user may not have the freedom to rearrange the members. Yet in order to read the data from disk efficiently, the user would like to have no padding between the members.
The other problem occurs where the structure size requires padding at the end to make array elements have the same alignment, and the user needs to make sure that there is no padding between the array elements (either for memory savings or, again, for reading data from a fixed-format source). Here the use of the compiler directive #pragma pack(1) may be necessary. This will, however, cause elements of the structures to be unaligned, and it will require the use of the __unaligned qualifier on Alpha to generate the special code to access this data without taking alignment traps to the operating system. For example:
# pragma pack (1)
struct x_
{
char a; // 1 byte
int b; // 4 bytes
short c; // 2 bytes
} foo;
# pragma pack ()
void bar()
{
__unaligned struct x_ *px = &foo;
. . . .
px->b = 5;
}
This code tells the compiler that the pointer px points to data that is not naturally aligned, and tells the compiler to generate the appropriate sequence of load/merge/store operations to do the assignment efficiently.
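The space effect of #pragma pack(1) on array elements can be examined with a small sketch like the one below. It declares the same three members with and without packing; the struct and function names are hypothetical, and the sizes reported depend on the compiler and target.

```c
#include <stddef.h>

#pragma pack(1)
struct packed_x {       /* no padding between or after members */
    char  a;
    int   b;
    short c;
};
#pragma pack()

struct natural_x {      /* same members, default alignment */
    char  a;
    int   b;
    short c;
};

/* Distance in bytes between consecutive array elements of each type. */
size_t packed_stride(void)  { return sizeof(struct packed_x); }
size_t natural_stride(void) { return sizeof(struct natural_x); }

/* Offset of the int member in the packed layout; it follows the char
   directly, so it is not on a natural int boundary. */
size_t packed_offset_of_b(void) { return offsetof(struct packed_x, b); }
```

With packing in effect, the stride equals the sum of the member sizes, so an array of struct packed_x has no padding between elements, at the cost of a misaligned b in every element.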
Note that __unaligned should only be used as a last resort, because the generated code is less efficient than accessing naturally aligned data. However, __unaligned is preferable to the performance penalty incurred with alignment faults.
If at all possible, arrange the members of a data structure to preserve alignment and minimize space at the same time, as outlined above.