ID Number: Q44463
5.10 6.00 6.00a 6.00ax 7.00 | 5.10 6.00 6.00a 6.00ax
MS-DOS | OS/2
Summary:
The following example shows a common mistake in which array and
pointer declarations are confused:
A program is divided into several modules. One module defines an
array with the following declaration:
signed char buffer[100];
In another module, access the variable with one of the following:
extern signed char *buffer; /* FAILS */
extern signed char buffer[]; /* WORKS */
CodeView reveals that the program is using the wrong address
for the array in the first case. The second case works correctly.
More Information:
The following declarations are NOT the same:
char *pc;
char ac[20];
The first declaration sets aside memory for a pointer; the second sets
aside memory for 20 characters.
A picture of pc and ac in memory might appear as follows:
pc +--------+
   |  ???   |
   +--------+

ac +-----+-----+-----+-----+     +-----+
   |  ?  |  ?  |  ?  |  ?  | ... |  ?  |
   +-----+-----+-----+-----+     +-----+
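As a minimal sketch of the difference in storage, sizeof reports how much memory each declaration sets aside. The helper names pointer_storage and array_storage are made up for this illustration and are not part of any library:

```c
#include <assert.h>
#include <stddef.h>

char *pc;      /* storage for one pointer only */
char ac[20];   /* storage for 20 characters */

/* Bytes reserved for the pointer itself: 2 or 4 under the compilers
   discussed here, often 8 on current 64-bit systems. */
size_t pointer_storage(void)
{
    return sizeof pc;
}

/* Bytes reserved for the array: one per element, 20 in total. */
size_t array_storage(void)
{
    assert(sizeof ac == 20 * sizeof ac[0]);
    return sizeof ac;
}
```

The pointer's size depends on the memory model or platform; the array's size does not.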
The same is true for the following:
extern char *pc;
extern char ac[];
The first declaration says that there is a pointer to char called pc
(which occupies 2 or 4 bytes) defined in another module; the second
says that there is an actual array of characters called ac.
Thus, to access the array ac from another module, the correct
declaration is as follows:
extern char ac[];
In your case, the correct declaration is the following:
extern signed char buffer[];
The addressing for pc[3] and ac[3] is done differently. There are some
similarities; specifically, in most expressions "ac" evaluates to a
constant pointer to char whose value is &ac[0]. The similarity ends
there, however.
To evaluate pc[3], the program first loads the value of the pointer pc
from memory and then adds 3. Finally, it loads the character that is
stored at this location (pc + 3) into a register. The MASM code might
appear as follows (assuming the small memory model):
MOV BX, pc ; move *CONTENTS* of pc into BX
; BX contains 1234
MOV AL, [BX + 3] ; move byte at pc + 3 (1237) into AL
; ==> AL contains 'd'
A picture might appear as follows, provided that pc is properly set to
point to an array at location 1234 and that the array contains "abcd"
as its first four characters:
address:    1000               1234  1235  1236  1237
      pc +--------+--->>>>>------v-----v-----v-----v-----+
         |  1234  |    *pc    |  a  |  b  |  c  |  d  | ...
         +--------+           +-----+-----+-----+-----+
                               pc[0] pc[1] pc[2] pc[3]
                                *pc *(pc+1)  etc.
Note: Using pc without properly initializing it (a simple way to
initialize it is "pc = malloc(4);" or "pc = ac;") causes the program to
access memory that you did not intend to access, which produces the
strange behavior.
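The two steps of pointer-type addressing can also be spelled out in C. This is only a sketch: load_via_pointer is a hypothetical helper, and a compiler is free to generate different instructions for it:

```c
#include <assert.h>
#include <stddef.h>

/* Evaluate p[i] in the same order as the two-MOV sequence. */
char load_via_pointer(const char *p, size_t i)
{
    const char *address;

    assert(p != NULL);       /* the pointer must have been initialized */
    address = p;             /* step 1: fetch the value stored in the pointer */
    address = address + i;   /* step 2: add the index */
    return *address;         /* load the byte at that location */
}
```

With p pointing at "abcd", load_via_pointer(p, 3) returns 'd', matching pc[3] in the picture above.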
Because the address of ac is a constant, it can be built into the
final MOV instruction, eliminating the need for two MOVs. The MASM
code might appear as follows:
MOV AL, [offset ac + 3] ; move byte at ac + 3 into AL
; offset ac is 1100, so move
; byte at 1103 into AL
; ==> AL contains 'd'
The picture might appear as follows:
address:  1100  1101  1102  1103        1119
      ac +-----+-----+-----+-----+     +-----+
         |  a  |  b  |  c  |  d  | ... |  \0 |
         +-----+-----+-----+-----+     +-----+
          ac[0] ac[1] ac[2] ac[3]       ac[19]
           *ac *(ac+1)  etc.
Note: If you first initialize pc to point to ac (by saying "pc =
ac;"), then the end effect of the two statements is exactly the same.
(This change can be shown in the picture by changing pc so it contains
the address of ac, which is 1100.) However, the instructions used to
produce these effects are different.
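A short sketch of this equivalence follows. The two function names are invented for the illustration. Once pc holds the address of ac, both expressions name the same byte, yet pc remains a separate object with its own storage:

```c
#include <assert.h>

char ac[20] = "abcd";   /* remaining elements are zero-filled */
char *pc;

/* Returns 1 if pc[3] and ac[3] name the same byte once pc points at ac. */
int same_element_after_assignment(void)
{
    pc = ac;                                /* pc now holds &ac[0] */
    assert(pc[3] == 'd' && ac[3] == 'd');
    return &pc[3] == &ac[3];
}

/* Returns 1 if the pointer variable lives somewhere other than the array. */
int pointer_is_separate_object(void)
{
    return (void *)&pc != (void *)ac;
}
```

Both functions are expected to return 1: the results agree, but the pointer still has to be loaded from its own location first.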
Note: If you declare ac in another module as follows, the compiler will
generate code to do pointer-type addressing rather than array-type
addressing:
extern char *ac; /* WRONG! */
The compiler will use the first few bytes of the array as an address
(rather than as characters) and access the memory at that location,
which is why the problems result.
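To see what that mistaken "address" looks like without actually dereferencing it, the first bytes of the array can be copied into an integer. This is a sketch under stated assumptions: it needs a C99 compiler that provides uintptr_t, it assumes uintptr_t is the same size as char * (true on typical platforms), and bytes_read_as_address is a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

char ac[20] = "abcdefghijklmnopqrs";   /* 19 characters plus the terminating \0 */

/* Copy the leading bytes of the array into a pointer-sized integer.
   Code compiled against "extern char *ac;" would treat these bytes as
   the address to read from. The value is only inspected here and is
   never dereferenced. */
uintptr_t bytes_read_as_address(void)
{
    uintptr_t bogus = 0;

    assert(sizeof bogus <= sizeof ac);
    memcpy(&bogus, ac, sizeof bogus);
    return bogus;
}
```

The returned value is built from the character codes of 'a', 'b', 'c', and so on, not from the location of the array, so reading through it touches unrelated memory.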
Additional reference words: 5.00 5.10 6.00 6.00a 6.00ax 7.00