Based Pointers

Based pointers combine the advantages of near and far pointers. Based pointers are two bytes in size, like near pointers, but their range is not limited to the default data segment. Like far pointers, they can refer to any available memory location. Based pointers provide a more efficient way to represent addresses outside the default data segment by exploiting the commonality among multiple pointers.

This is possible because a based pointer contains only the offset portion of an address. To use such a pointer, you must define a “base” for it. A base consists of the segment portion of an address and is stored separately from the pointer itself. If many based pointers refer to locations within the same segment, they can all share the same base. The offset and segment values are combined whenever a based pointer is used to access a memory location.

By comparison, every far pointer contains both an offset and a segment value, which can result in wasted space if many far pointers refer to locations within one segment. Near pointers contain only an offset, but they always use the DS register for their segment value, so they are restricted to addressing the default data segment.
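As a rough illustration of how one stored base and several two-byte offsets combine, the following portable C sketch models real-mode 8086 addressing, in which the physical address is the segment value times 16 plus the offset. The names linear_address, shared_base, and resolve are hypothetical and are not part of the Microsoft extensions; for real based pointers the compiler performs this combination itself. The arithmetic is an assumption that holds only in real mode; in protected mode the segment value is a selector.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode 8086 address formation: segment * 16 + offset.
   Assumption: real mode only; protected-mode selectors differ. */
static uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Hypothetical model of three based pointers that share one base. */
struct shared_base {
    uint16_t segment;     /* stored once, like the base of a based pointer */
    uint16_t offsets[3];  /* each "pointer" keeps only its 2-byte offset */
};

/* Combine the shared segment with the i-th offset at the point of use. */
static uint32_t resolve(const struct shared_base *b, int i)
{
    assert(i >= 0 && i < 3);
    return linear_address(b->segment, b->offsets[i]);
}
```

With a base of 0x2000 and an offset of 0x0100, resolve yields 0x20100. Changing the single segment field moves all three references at once, which is the property that based pointers exploit.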

Summary: Using based instead of far pointers makes your program smaller.

The use of based pointers instead of far pointers makes your program smaller by saving two bytes for each pointer that shares a base with another. Under certain conditions, based pointers can also be faster than far pointers. If your program has many based pointers that are all based on the same segment, and if those pointers are used consecutively, the compiler does not need to load a new segment value each time a pointer is used. If you enable full optimizations in such circumstances, based pointers can be almost as fast as near pointers.
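The space saving can be estimated with simple arithmetic. The sketch below assumes four-byte far pointers, two-byte based pointers, and a base kept in one two-byte segment variable; a fixed base such as a named segment needs no variable at all, so the real saving can be slightly larger. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { FAR_PTR_BYTES = 4, BASED_PTR_BYTES = 2, SEGMENT_VAR_BYTES = 2 };

/* Storage for `count` far pointers: each carries its own segment. */
static size_t far_storage(size_t count)
{
    return count * FAR_PTR_BYTES;
}

/* Storage for `count` based pointers plus one shared segment variable. */
static size_t based_storage(size_t count)
{
    return count * BASED_PTR_BYTES + SEGMENT_VAR_BYTES;
}

/* Bytes saved by switching from far to based pointers. */
static size_t bytes_saved(size_t count)
{
    assert(count >= 1);  /* with no pointers there is nothing to compare */
    return far_storage(count) - based_storage(count);
}
```

Under these assumptions, 100 pointers take 400 bytes as far pointers and 202 bytes as based pointers. A single pointer saves nothing, which matches the statement that the saving applies to each pointer that shares a base with another.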

Define a pointer's base using the __based keyword, followed by a base expression in parentheses, where you might otherwise place __near, __far, or __huge. For example,

void __near *np;
void __based(base) *bp;

There are several types of base that you can specify for a based pointer:

A fixed base

A variable base

The __self keyword

The void keyword

These types of base are described in the following sections.

Pointers with a Fixed Base

Pointers based on a fixed segment are restricted to accessing locations in a single segment. This segment is specified when the based pointers are declared. You can make assignments to the based pointers themselves, which changes the offset portion of the address. Making assignments in this way causes the pointers to refer to different locations within the segment. However, you cannot change the base that the based pointers use.

There are two ways to specify a fixed base for based pointers: by using a named segment or by using the segment in which a variable is stored.

Using a Named Segment

You can specify a named segment as the base for your pointers by using the __segname keyword and a string literal. For example, the following declaration creates a pointer based in the default code segment:

void __based(__segname("_CODE")) *bp;

The pointer bp can address any location in the default code segment. There are four segments accessible through predefined strings:

Segment	Definition
_CODE	Current code segment
_CONST	Constant segment
_DATA	Default data segment
_STACK	Stack segment

The following example declares a pointer based in the default data segment:

char __based(__segname("_DATA")) *bp;

This is equivalent to a near pointer.

You can also specify user-defined segments, as long as the segment is allocated somewhere else in the program. For example,

char __based(__segname("MYSEG")) *bp;

You can define MYSEG with an assembly-language file or by allocating data in a named segment. See “Data Stored in a Named Segment” for more information.

Using the Segment of a Variable

You can also base your pointers on the segment in which another variable is stored. Specify this type of base by casting the address of a variable to the __segment data type, as follows:

int i;
void __based((__segment)&i) *bp;

This declaration allows bp to access any location in the segment in which i is stored. If i is declared as __near, or if the program is compiled in tiny, small, or medium model, this is equivalent to declaring bp as a near pointer.

Pointers with a Variable Base

Pointers with a variable base can access any available memory locations. When you make assignments to the based pointers themselves, you change the offset portion of the address, which allows you to refer to various locations within one particular segment. You can also make assignments to the base itself. The compiler uses the updated value of the base whenever one of these based pointers is used. In this way, changing a single base value effectively changes the locations referenced by all the based pointers using that base.

There are three ways to specify a variable base for based pointers: by using the segment value of another pointer, by using a variable of type __segment, or by using another pointer.

Using the Segment Value of Another Pointer

You can give a based pointer the segment of another pointer as its base value. To do this, cast a pointer to the __segment data type, as follows:

char __near *np;
char __far *fp;
void __based((__segment)np) *bnp;
void __based((__segment)fp) *bfp;

Notice that this syntax is similar to that used to base a pointer on the segment in which a variable is stored. The difference is that you cannot change where a variable is allocated, but you can change the value of a pointer.

Because np is a near pointer, it uses the DS register as its segment value. Accordingly, bnp uses DS as its base and is equivalent to a near pointer.

Because fp is a far pointer, it contains a segment value, and bfp uses that segment as its base. If you change the segment portion of fp, bfp will refer to a location in the new segment. (Remember that far pointer arithmetic is performed only on the offset portion, so incrementing fp won't affect the base of bfp. However, if you make an assignment to fp that changes its segment, the base of bfp will be similarly modified.)
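The way bfp follows the segment of fp, but not its offset, can be modeled in portable C. In this sketch the structures far_ptr and based_on_far and the function based_address are hypothetical stand-ins, and the address arithmetic assumes real mode (segment times 16 plus offset). The real compiler reads the segment of fp each time bfp is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a far pointer: a segment and an offset. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Hypothetical model of `void __based((__segment)fp) *bfp`:
   only the offset is stored; the base is read from fp when used. */
struct based_on_far {
    const struct far_ptr *base_source;
    uint16_t offset;
};

/* Form the address from fp's current segment and bfp's own offset.
   Assumption: real-mode arithmetic. */
static uint32_t based_address(const struct based_on_far *bp)
{
    assert(bp != NULL && bp->base_source != NULL);
    return ((uint32_t)bp->base_source->segment << 4) + bp->offset;
}
```

If fp holds 0x3000:0x0010 and bfp holds the offset 0x0200, based_address gives 0x30200. Adding to the offset of fp leaves that result unchanged, while assigning 0x4000 to the segment of fp moves it to 0x40200.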

Using a Segment Variable

In addition to using a cast to the __segment data type, you can define variables of type __segment. You can then base your pointers on such a segment variable, as follows:

__segment videomem; // define a segment variable
char __based(videomem) *vidptr;

videomem = 0xB800; // use video memory as segment
// move to row 10, column 40
vidptr = (char __based(videomem) *)(2 * ((80 * 9) + 39));
*vidptr = 'A'; // write an A there

In this example, videomem is a segment variable that contains the segment in which video memory resides. Because vidptr is based on videomem, any value assigned to vidptr is interpreted as an offset into video memory. A cast is used in the assignment to vidptr to prevent a compiler warning. If videomem were assigned a new value, vidptr would act as an offset from that new value and evaluate to an entirely different address.
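The offset arithmetic in the example can be written as a small helper. This sketch assumes an 80-column text mode in which each screen cell occupies two bytes (a character byte followed by an attribute byte), and it counts rows and columns from 1, as the comment in the example does. The name text_cell_offset is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TEXT_COLUMNS   80u  /* assumption: 80-column text mode */
#define BYTES_PER_CELL 2u   /* character byte plus attribute byte */

/* Offset into video memory of the cell at a 1-based row and column. */
static uint16_t text_cell_offset(unsigned row, unsigned column)
{
    assert(row >= 1 && column >= 1 && column <= TEXT_COLUMNS);
    return (uint16_t)(BYTES_PER_CELL *
                      (TEXT_COLUMNS * (row - 1) + (column - 1)));
}
```

For row 10, column 40 the helper returns 1518, the same value as the expression 2 * ((80 * 9) + 39) in the example. That value would be assigned to vidptr as the offset, with videomem supplying the segment.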

You cannot base a pointer on a constant that is cast to the __segment type, as in the following example:

unsigned __based((__segment)0xB800) *vidptr; // error

You must use a segment variable that is defined separately.

Pointers based on a segment variable are especially useful in conjunction with based heaps. Microsoft C/C++ lets you define a special heap that resides in a segment. You can use such a based heap to allocate objects dynamically, just as you would with a traditional heap. These dynamically allocated objects can all be referenced with pointers based on that segment.

The following program demonstrates the creation of a based heap:

/* Compile in Small Model */
#include <malloc.h>
#include <stdio.h>
#include <string.h>

__segment segvar;
char __based(segvar) *b_string;

void main()
{
    if( (segvar = _bheapseg( 1000 )) != _NULLSEG )
    {
        if( (b_string = _bmalloc( segvar, 20 )) != _NULLOFF )
        {
            _fstrcpy( (char __far *)b_string, (char __far *)"This is a test.\n" );
            printf( "%Fs", (char __far *)b_string );
            printf( "Size = %d\n", sizeof b_string ); /* Always 2 */
            _bfree( segvar, b_string );
        }
        else
            puts( "_bmalloc failed" );
        _bfreeseg( segvar );
    }
    else
        puts( "_bheapseg failed" );
}

First, the program calls the library function _bheapseg and requests 1,000 bytes in a new based heap:

if( (segvar = _bheapseg( 1000 )) != _NULLSEG )

If it cannot allocate the amount of memory requested, _bheapseg returns _NULLSEG (null segment). Otherwise, the function returns the valid address of a segment, which is assigned to segvar.

Next, the program calls _bmalloc and requests 20 bytes of memory from the based heap. The variable segvar is passed to identify the based heap that _bmalloc should use. Just as malloc returns a pointer to a block of memory, _bmalloc returns an offset to a block of memory. This offset is assigned to the based pointer b_string:

if( (b_string = _bmalloc( segvar, 20 )) != _NULLOFF )

The value _NULLOFF means “null offset” and indicates the failure of _bmalloc. If the allocation succeeds, the program continues with this code:

_fstrcpy( (char __far *)b_string, (char __far *)"This is a test.\n" );
printf( "%Fs", (char __far *)b_string );
printf( "Size = %d\n", sizeof b_string ); /* always 2 */

The standard strcpy function won't work because this is a small-model program that expects all pointers to be near. The _fstrcpy function accepts far pointers, and it is possible to cast a based pointer to a far pointer. Then the string and its size are printed.

Finally, the block of memory and the based heap are freed:

_bfree( segvar, b_string );
_bfreeseg( segvar );

The run-time library provides a complete set of memory-management functions that work with based heaps.
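The idea of a heap that hands out offsets rather than full pointers can be sketched in portable C. The following is a deliberately simplified model, not the library's implementation: toy_heap, toy_alloc, toy_resolve, and the sentinel TOY_NULLOFF are hypothetical names, the heap never frees or reuses blocks, and a plain byte array stands in for the heap's segment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_HEAP_BYTES 1000u
#define TOY_NULLOFF    0xFFFFu  /* hypothetical "null offset" sentinel */

struct toy_heap {
    unsigned char bytes[TOY_HEAP_BYTES]; /* stands in for the heap's segment */
    uint16_t next_free;                  /* offset of the first unused byte */
};

/* Start with an empty, zero-filled heap. */
static void toy_heap_init(struct toy_heap *heap)
{
    memset(heap->bytes, 0, sizeof heap->bytes);
    heap->next_free = 0;
}

/* Return the offset of a new block, or TOY_NULLOFF if it does not fit. */
static uint16_t toy_alloc(struct toy_heap *heap, uint16_t size)
{
    uint16_t offset;

    if (size == 0 || size > TOY_HEAP_BYTES - heap->next_free)
        return TOY_NULLOFF;
    offset = heap->next_free;
    heap->next_free = (uint16_t)(heap->next_free + size);
    return offset;
}

/* Combine the base (the heap) with an offset to get a usable pointer. */
static void *toy_resolve(struct toy_heap *heap, uint16_t offset)
{
    assert(offset < TOY_HEAP_BYTES);
    return heap->bytes + offset;
}
```

Each value returned by toy_alloc occupies two bytes regardless of where the heap itself lives, which mirrors the way sizeof b_string is always 2 in the program above. The explicit toy_resolve call corresponds to the combination that the compiler generates automatically when a based pointer is dereferenced.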

Using Another Pointer

You can also base your pointers on the complete address of another pointer, instead of using only the segment portion of its address. In this case, a based pointer acts as an offset from the pointer itself, instead of simply sharing the segment with that pointer. For example,

int *ip;
int __based(ip) *bp;

Whenever bp is used, the compiler adds together the offset of ip and the offset stored in bp, and uses the segment of ip to find the address.

The following example illustrates pointers based on a pointer:

#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
#include <string.h>

int *ip; /* int pointer */
int __based(ip) *bp; /* based on ip */
char __based(ip) *cp;

void main()
{
    int *mem1, *mem2;

    bp = (int __based(ip) *)0; /* bp is offset 0 from ip */
    cp = (char __based(ip) *)2; /* cp is byte offset 2 from ip */

    if( (mem1 = (int *)malloc( 100 )) != NULL )
    {
        if( (mem2 = (int *)malloc( 100 )) != NULL )
        {
            ip = mem1; /* ip points to mem1 */
            *bp = 5;
            strcpy( (char *)cp, "String stored in mem1." );

            ip = mem2; /* ip now points to mem2 */
            *bp = 12345;
            strcpy( (char *)cp, "String stored in mem2." );

            ip = mem1; /* point to mem1, which still holds its previous values */
            printf( "%s *bp= %i\n", (char *)cp, *bp );

            ip = mem2; /* point to mem2 and display the values there */
            printf( "%s *bp= %i\n", (char *)cp, *bp );

            free( mem2 );
        }
        else
            puts( "Second malloc failed." );
        free( mem1 ); /* released whether or not the second malloc succeeded */
    }
    else
        puts( "First malloc failed." );
}

Two calls to malloc provide two sections of memory, whose addresses are stored in the variables mem1 and mem2. When ip is assigned one of these addresses (mem1), the pointers based on ip point somewhere within that piece of memory. When ip is assigned the address in mem2, the effective addresses of bp and cp also change.
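The same behavior can be approximated in standard C by storing byte offsets and adding them to a base pointer at each use. In this sketch, based_group and its helper functions are hypothetical; the base field plays the role of ip, and the two offset fields play the roles of bp and cp. The integer is copied with memcpy so that the sketch does not depend on the alignment of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a group of pointers based on one pointer. */
struct based_group {
    unsigned char *base; /* plays the role of ip */
    size_t int_offset;   /* plays the role of bp (a byte offset) */
    size_t text_offset;  /* plays the role of cp (a byte offset) */
};

/* Store an int at base + int_offset. */
static void store_int(const struct based_group *g, int value)
{
    assert(g->base != NULL);
    memcpy(g->base + g->int_offset, &value, sizeof value);
}

/* Load the int found at base + int_offset. */
static int load_int(const struct based_group *g)
{
    int value;

    assert(g->base != NULL);
    memcpy(&value, g->base + g->int_offset, sizeof value);
    return value;
}

/* Address of the text found at base + text_offset. */
static char *text_at(const struct based_group *g)
{
    assert(g->base != NULL);
    return (char *)(g->base + g->text_offset);
}
```

Setting base to one buffer, storing values, and then switching base to a second buffer leaves the first buffer's contents intact. Switching base back makes load_int and text_at return the original values, just as reassigning ip does in the program above.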

Note:

Pointers based on pointers are the only form of based pointer that can be used in a 32-bit program, because they are the only form that works in a flat (that is, nonsegmented) address space.

If you have a group of pointers that all refer to locations within a buffer of memory, you can define them as offsets from a pointer that references the start of the buffer. If you relocate that buffer, you can update the entire group of pointers by modifying just the pointer that acts as their base. If you write the buffer to disk, you can also write the based pointers to disk. Once you reload the buffer into memory, you can make the based pointers valid again by updating their base.
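The relocation property can be demonstrated in standard C with a buffer whose internal reference is an offset from the start of the buffer. In this sketch the record layout and the functions build_buffer and record_name are hypothetical, and a memcpy to a second array stands in for writing the buffer to disk and reading it back at a different address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUFFER_BYTES 64u

/* Record kept at the start of the buffer; it refers to its name
   by offset from the buffer start, never by absolute address. */
struct record {
    uint16_t name_offset;
};

/* Write a record at offset 0 and the name string directly after it. */
static void build_buffer(unsigned char *buffer, const char *name)
{
    struct record rec;
    size_t length = strlen(name);

    assert(sizeof rec + length + 1 <= BUFFER_BYTES);
    rec.name_offset = (uint16_t)sizeof rec;
    memcpy(buffer, &rec, sizeof rec);
    memcpy(buffer + rec.name_offset, name, length + 1);
}

/* Resolve the stored offset against wherever the buffer is now. */
static const char *record_name(const unsigned char *buffer)
{
    struct record rec;

    memcpy(&rec, buffer, sizeof rec);
    assert(rec.name_offset < BUFFER_BYTES);
    return (const char *)(buffer + rec.name_offset);
}
```

After the buffer is copied to a new array, record_name on the copy still finds the name, because only the base passed to it changed. A buffer that stored absolute addresses would need every stored pointer patched after such a move.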

Pointers Based on the __self Keyword

You can base a pointer on the segment in which the pointer itself is stored. To do this, use the __self keyword, cast to the __segment type. Consider the following example:

typedef struct node NODE;

struct node
{
    int name;
    NODE __based((__segment)__self) *left;
    NODE __based((__segment)__self) *right;
};

This example declares a structure named NODE for use in a binary tree. Each node in the tree contains pointers to its two child nodes. These pointers are self-based, so they refer to locations within the segment in which the node itself is stored. This is possible only when an entire tree can fit in a single segment. Based pointers provide an advantage over far pointers in such a data structure by reducing the size of each node by four bytes.

If you want to build a tree out of nodes that contain self-based pointers, do not use malloc to allocate the nodes, because it may return memory in different segments. Instead, use a based heap along with pointers based on a segment variable. The following example assumes the type declaration given above:

void main()
{
    __segment segvar;
    NODE __based(segvar) *nodeptr;

    // error checking is omitted for this example
    segvar = _bheapseg( 30000 );
    nodeptr = _bmalloc( segvar, sizeof(NODE) );
    nodeptr->left = _bmalloc( segvar, sizeof(NODE) );
    nodeptr->right = _bmalloc( segvar, sizeof(NODE) );

    nodeptr->name = 1;
    nodeptr->left->name = 2;
    nodeptr->right->name = 3;
}

This program first allocates a based heap of 30,000 bytes and uses segvar to store the heap's segment. Then the program allocates NODE objects from that based heap, so all the nodes in the tree reside in the segment specified by segvar. Note that nodeptr is based on segvar, instead of being self-based. A self-based pointer declared as a local variable in a function is stored on the stack and therefore uses the SS register as its base, and the stack segment may not be the segment held in segvar.
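A tree whose links are two-byte offsets into one block of memory can be modeled in standard C. In the sketch below, tree_arena, new_node, node_at, and the sentinel NO_CHILD are hypothetical; a fixed array stands in for the single segment that holds every node, and each link is a byte offset from the start of that array.

```c
#include <assert.h>
#include <stdint.h>

#define ARENA_NODES 16u
#define NO_CHILD    0xFFFFu  /* hypothetical "no child" sentinel */

struct tree_node {
    int name;
    uint16_t left;   /* byte offset of the left child within the arena */
    uint16_t right;  /* byte offset of the right child within the arena */
};

struct tree_arena {
    struct tree_node nodes[ARENA_NODES]; /* one "segment" for the whole tree */
    uint16_t used;                       /* number of nodes handed out */
};

static void arena_init(struct tree_arena *arena)
{
    arena->used = 0;
}

/* Hand out the next node and return its offset, or NO_CHILD if full. */
static uint16_t new_node(struct tree_arena *arena, int name)
{
    uint16_t index;

    if (arena->used >= ARENA_NODES)
        return NO_CHILD;
    index = arena->used++;
    arena->nodes[index].name = name;
    arena->nodes[index].left = NO_CHILD;
    arena->nodes[index].right = NO_CHILD;
    return (uint16_t)(index * sizeof(struct tree_node));
}

/* Turn an offset back into a node pointer relative to the arena. */
static struct tree_node *node_at(struct tree_arena *arena, uint16_t offset)
{
    assert(offset != NO_CHILD);
    assert(offset % sizeof(struct tree_node) == 0);
    assert(offset / sizeof(struct tree_node) < arena->used);
    return &arena->nodes[offset / sizeof(struct tree_node)];
}
```

Because every link is relative to the arena, the links stay meaningful only while all nodes live in that one block. This is the same restriction that requires a tree of self-based nodes to fit in a single segment.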

Pointers Based on the void Keyword

The final way to declare a based pointer is to base it on void. Such a pointer is not based on any particular segment. It is an offset that can be combined with any segment to form a full address. You can combine a segment value and a void-based pointer using the “base operator,” which consists of a colon and a greater-than symbol (:>). That is,

segment:>offset

Such an expression denotes a complete address and can be dereferenced with the indirection operator (*). You can use the base operator only with pointers based on void, not with other types of based pointers.

The segment value can be a variable of type __segment, or it can be an integer cast to type __segment. For example,

__segment videomem = 0xB800; // use video memory as segment
char __based(void) *offptr;

// set offset to row 10, col 40
offptr = (char __based(void) *)(2 * ((80 * 9) + 39));
*(videomem:>offptr) = 'A'; // write an A there

offptr += 2; // move to col 41
*(((__segment)0xB800):>offptr) = 'A'; // do it again

The pointer offptr can be used with any segment variable. If you have many segments organized in the same way, you can use one void-based pointer to access the same relative location in each of them.
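Using one offset with several identically organized segments can be modeled in standard C. In this sketch the function combine is a hypothetical counterpart of the :> operator, and each row of a two-dimensional array stands in for one segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEGMENT_BYTES 32u

/* Hypothetical counterpart of `segment:>offset`: pair a base with an
   offset that is not tied to any particular base. */
static unsigned char *combine(unsigned char *segment_base, uint16_t offset)
{
    assert(segment_base != NULL && offset < SEGMENT_BYTES);
    return segment_base + offset;
}

/* Write `value` at the same relative location in every segment. */
static void write_to_all(unsigned char segments[][SEGMENT_BYTES],
                         size_t count, uint16_t offset, unsigned char value)
{
    size_t i;

    for (i = 0; i < count; i++)
        *combine(segments[i], offset) = value;
}
```

Calling write_to_all with three segments and an offset of 5 stores the value at position 5 of each one. The offset is computed once and reused, which is the main convenience that a void-based pointer offers.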