Kaare Christian
Kaare Christian is a research associate at Rockefeller University. He is a frequent contributor to PC Magazine and is the author of the forthcoming Microsoft Guide to C++ Programming (Microsoft Press).
Q I’m currently using a C subroutine library to access records from a simple mailing list database. Can I continue to use this library, even though I’m rewriting most of my software in C++ using Microsoft® C/C++ 7.0?
A One of C++’s most important features is its compatibility with C. Yes, you can use C library functions in a C++ program. In fact, you’ll probably do it all the time.
Often, the only thing you need to do is to let your C++ compiler know that the functions are C functions. I assume that your C subroutine library already has an include file with full function prototypes. If you are missing a few prototypes, or if some are incomplete, repair that first. (If you don’t have a set of function prototypes, but you have the library source, Microsoft C/C++ 7.0 can easily generate the prototypes using the /Zg option.)
Given a C header file with accurate and complete function prototypes, include it in your C++ source file using the following syntax:
extern "C"
{
#include "dblib.h"
}
The extern "C" construct is called a linkage specification. It identifies a region of the source in which C linkage is in force. Above, the linkage specification makes it possible to access the C functions in DBLIB.H from C++ code.
If you need to use extern "C" in a header that has to be included in both C and C++ programs, you can use the #ifdef conditional and the __cplusplus predefined identifier. During a C compile, the __cplusplus identifier is undefined; during a C++ compile, it is defined.
#ifdef __cplusplus
extern "C" {
#endif
BOOL __cdecl ReadDBFRecord(int key, dbfRecord *rec,
                           int maxrecordlen);
BOOL __cdecl WriteDBFRecord(int key, dbfRecord *rec,
                            int recordlen);
#ifdef __cplusplus
}
#endif
In fact, the #ifdef __cplusplus test is useful in many places where you want to segregate C++ code from C code.
You should also make sure that your C database routines have been compiled using the C component of Microsoft C/C++ 7.0. This ensures that compiler helper functions are all in sync.
Let me make one final point. Yes, it is possible to access a C function from a C++ function. But that doesn’t make it a good idea. C++ is about using object-oriented design and programming to boost your productivity. In the short term, keep using your C library, but also start thinking about a more object-oriented approach to your software challenges.
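One way to start moving toward that more object-oriented approach is to hide the C calls behind a small C++ class. The sketch below is only an illustration under stated assumptions: the dbfRecord layout, the int return type (standing in for BOOL), the toy in-memory versions of ReadDBFRecord and WriteDBFRecord, and the MailingList class itself are all hypothetical, included so the example is self-contained. It is also written for a present-day C++ compiler rather than C/C++ 7.0. In real code the declarations would come from your library's header and the definitions from the library.

```cpp
#include <cstring>

// Hypothetical stand-ins for the declarations that would come from dblib.h.
extern "C" {
    struct dbfRecord { char name[32]; };
    int ReadDBFRecord(int key, dbfRecord *rec, int maxrecordlen);
    int WriteDBFRecord(int key, dbfRecord *rec, int recordlen);
}

// Toy in-memory "database" so the sketch links without the real library.
static dbfRecord g_store[4];

extern "C" int ReadDBFRecord(int key, dbfRecord *rec, int maxrecordlen)
{
    if (key < 0 || key >= 4 || maxrecordlen < static_cast<int>(sizeof(dbfRecord)))
        return 0;                       // bad key or buffer too small
    *rec = g_store[key];
    return 1;
}

extern "C" int WriteDBFRecord(int key, dbfRecord *rec, int recordlen)
{
    if (key < 0 || key >= 4 || recordlen != static_cast<int>(sizeof(dbfRecord)))
        return 0;                       // bad key or unexpected record size
    g_store[key] = *rec;
    return 1;
}

// Hypothetical wrapper: C++ callers see member functions and references,
// while the C functions keep doing the actual work.
class MailingList
{
public:
    bool read(int key, dbfRecord &rec)
    {
        return ReadDBFRecord(key, &rec, static_cast<int>(sizeof rec)) != 0;
    }

    bool write(int key, const dbfRecord &rec)
    {
        dbfRecord copy = rec;           // the C API takes a non-const pointer
        return WriteDBFRecord(key, &copy, static_cast<int>(sizeof copy)) != 0;
    }
};
```

A caller would create a MailingList object and use list.write(1, rec) and list.read(1, rec) without touching the C functions directly, which leaves room to change the storage layer later.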
Q I’m interested in using polymorphism (virtual functions) in my software, but I’m afraid that things will run slowly. To me, performance is very important--some of my customers still use original IBM® PCs! I have a friend who has been using C++ for a while who says that I should never use virtual functions because of the overhead. Just exactly what is the overhead? Should they be avoided, or is my friend exaggerating?
A As you point out, virtual functions in C++ are the key to polymorphism. They select, at run time, a member function that corresponds to a specific object, an object whose type is not known during compilation. Virtual functions are dispatched using a compiler-generated table of function addresses. Each class that contains virtual functions has one of these tables, which are commonly called vtables. The calling overhead, optimally, is an extra MOV instruction to pull in the address of the table.
Although I know that maximizing the efficiency of virtual function calls was a major design goal for C/C++ 7.0, like many programmers, I prefer to be convinced. I wrote a tiny example, shown in Figure 1, to illustrate the overhead of virtual member function calls. Here is the slightly cleaned-up mixed source/object listing (/Fc option) produced by Microsoft C/C++ 7.0, small model, maximum optimization (/Ox option).
00000c 8b 76 04 mov si,WORD PTR [bp+4] ;load this into si
; pb->fn(); // direct function call
00000f 56 push si ;push this
000010 e8 ed ff call ?fn@Base@@QACXXZ ;call directly
; pb->vfn(); // function call via vtable
000013 56 push si ;push this
000014 8b 1c mov bx,WORD PTR [si] ;load addr of
;vtable into bx
000016 ff 17 call WORD PTR [bx] ;call indirect
;via vtable
The source/object listing above shows a direct function call and a function call via a vtable. The second case, the virtual function call, requires an extra MOV instruction that loads the vtable address into BX. Of course the above example just shows virtual function calling overhead for one case. Results will vary based on many things, including optimization level, memory model, single or multiple inheritance, function arguments, and so on. But the overhead is remarkably low.
The other way to look at the overhead of virtual function calling is to consider the alternatives. Historically, similar problems have been handled using manually built tables of function pointers, cascaded if statements, and case statements. None of these options even approaches the speed and convenience of C++’s virtual functions.
But the low overhead of a C/C++ 7.0 virtual function call doesn’t mean that virtual functions should be overused. When virtual functions are needed, don’t worry about the extra instruction. But don’t make every member function a virtual function "just in case." Use virtual functions only when appropriate.
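To make the comparison with the hand-built alternatives concrete, here is a small sketch that solves the same toy problem twice: once with a virtual function and once with a type tag and a switch statement. The Shape, Triangle, Square, and TaggedShape names are invented for this illustration, and the code assumes a present-day C++ compiler. It says nothing about timing; it only shows how much bookkeeping each style asks of the programmer.

```cpp
#include <vector>

// Polymorphic version: the compiler builds and consults the vtable.
class Shape
{
public:
    virtual ~Shape() {}
    virtual int sides() const = 0;
};

class Triangle : public Shape
{
public:
    int sides() const override { return 3; }
};

class Square : public Shape
{
public:
    int sides() const override { return 4; }
};

int totalSides(const std::vector<const Shape *> &shapes)
{
    int total = 0;
    for (const Shape *s : shapes)
        total += s->sides();    // selected at run time from the object's type
    return total;
}

// Hand-built alternative: a type tag plus a switch statement.
// Every new shape means editing this function (and any others like it).
enum ShapeKind { KindTriangle, KindSquare };

struct TaggedShape { ShapeKind kind; };

int sidesOf(const TaggedShape &s)
{
    switch (s.kind) {
    case KindTriangle: return 3;
    case KindSquare:   return 4;
    }
    return 0;                   // unknown tag
}
```

Adding a new shape to the first version means writing one new class; the second version requires a new tag and a new case in every switch that inspects the tag.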
Figure 1 Virtual Member Function Calls
class Base
{
public:
    void fn();
    virtual void vfn(void);
};
void funk(Base *pb)
{
    pb->fn();  // direct function call
    pb->vfn(); // function call via vtable
}
Q I have been advised by the C++ guru at work that I shouldn’t use both malloc and new in the same program. He says that in C++ I should only use new. Is this true? (I have thousands of lines of C, and I would hate to have to change all my mallocs to news.)
A Your C++ guru sounds more like a C++ fanatic. You can use malloc and new in the same program, but you have to be careful, and you have to realize that you are treading on implementation-specific turf.
The first two rules are simple. First, you must never use the delete operator to free memory that you allocated using malloc. Second, you must never use the free function to release memory that you allocated using new.
To be complete, let me mention two other things about allocation. You must never realloc memory that you have allocated with new. Once allocated with new, the size is fixed. If you need to be able to change the size of a memory block, you must use malloc, and the memory block must not contain classes that have constructors or destructors.
You must use new when you allocate objects that have constructors or destructors. That’s because new, unlike malloc, is known to the compiler; it is a keyword. When the compiler encounters the new operator, it determines whether a constructor exists and, if so, arranges for it to be invoked automatically.
If you malloc your classes, they won’t be properly constructed.
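The pairing rules can be seen in a short sketch. The Tracker class and the helper functions below are hypothetical, written for a present-day C++ compiler; Tracker simply counts how many of its objects have been constructed and not yet destroyed, which makes it easy to see that new runs the constructor and malloc does not.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical class that counts its live (constructed) objects.
class Tracker
{
public:
    static int live;
    Tracker()  { ++live; }
    ~Tracker() { --live; }
};
int Tracker::live = 0;

// new/delete: the constructor and destructor both run.
int liveDuringNew()
{
    Tracker *t = new Tracker;
    int seen = Tracker::live;   // the constructor has run
    delete t;                   // memory from new goes back through delete
    return seen;
}

// malloc/free: raw bytes only, no constructor call.
int liveDuringMalloc()
{
    void *raw = std::malloc(sizeof(Tracker));
    int seen = Tracker::live;   // unchanged, nothing was constructed
    std::free(raw);             // memory from malloc goes back through free
    return seen;
}

// Resizable storage for plain ints stays with malloc/realloc/free.
// On failure realloc returns a null pointer and leaves the old block valid.
int *growInts(int *block, std::size_t newCount)
{
    return static_cast<int *>(std::realloc(block, newCount * sizeof(int)));
}
```

Each allocation is released by the matching routine, and realloc is applied only to a malloc'd block of a type with no constructor or destructor.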
In my newer code I usually use new, not malloc, for the following reasons. The first is type checking. When you use new the compiler knows exactly what you are allocating, and it makes sure that you are using an appropriate pointer type. The second reason is that I can define a new_handler using the _set_new_handler function, so that allocation failures are handled automatically.
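As a rough illustration of the second point, the sketch below installs an allocation-failure handler using std::set_new_handler from the standard header <new>. This is a deliberate substitution: the Microsoft-specific _set_new_handler named above is a different function and its handler signature may not match, so treat this as a sketch of the idea rather than C/C++ 7.0 code. The outOfMemory and installOutOfMemoryHandler names are invented for the example.

```cpp
#include <new>

// Set by the handler so callers can tell that it ran.
static bool g_handlerRan = false;

// Hypothetical handler: called by operator new when an allocation fails.
// A handler must free some memory, throw, or terminate; this one throws.
void outOfMemory()
{
    g_handlerRan = true;
    throw std::bad_alloc();
}

// Installs the handler and returns the one that was previously in effect,
// so the caller can restore it later.
std::new_handler installOutOfMemoryHandler()
{
    return std::set_new_handler(outOfMemory);
}
```

With the handler installed once at startup, individual calls to new do not each need their own failure check.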
In C/C++ 7.0, Microsoft has extended the new operator so that it can handle memory models. For example, you can write the following to allocate an array of one thousand ints in far memory:
int __far *fpi = new __far int[1000];
The other memory models that can be specified with new are __near, __based, and __huge. This is a C/C++ 7.0-specific feature at the moment, but it is a good idea that is likely to be picked up by other vendors.
Also, notice that I’ve written __far (two underscores), not far or _far. In C/C++ 7.0 all three forms are accepted, but the __far form is preferred, because words starting with a double underscore are earmarked for implementation-dependent features of the compiler and standard libraries. Only __far is listed in the table of C/C++ 7.0’s reserved words; future versions of Microsoft C++ might not support _far or far.
Q Over the past few years, with much effort, I’ve learned to use C declarations. In my opinion, the difficulty of C declarations is one of the language’s worst features. I’ve been reading up on C++ and I’m a bit stumped by const and const pointer declarations. All this stuff seems counterintuitive.
A I certainly sympathize--C declarations are tough. C++ makes declarations even trickier in a number of ways, such as the introduction of const. Const data is just that, constant. It doesn’t change. But you have to be very careful when using the const keyword to declare pointers. Is it the pointer itself that is const, which means the pointer can only point at one thing? Or is it the pointed-at thing that’s const, meaning you can’t use the pointer to modify what it points at? And then there is the combination, a const pointer to const data--about as constant as you can get.
Let’s start with const pointers.
int x; // x is an int
int *pi; // pi is a pointer to an int
int *const cpi = &x; // cpi is a const pointer to an int
// cpi is initialized to point at x
The "inside out" method of reading C/C++ declarations explains the declaration of cpi. Working from the inside out, it reads "cpi is a const pointer to an int." (And it must be initialized, because it can’t be retooled to point at anything else.) A const pointer can be used to change the value of the thing it points at, but it can never point at anything else.
Thus
*cpi = 10; // okay, change the value stored at the
// location cpi points to
is okay, but
cpi++; // NG, can’t change cpi itself, cpi is a
// const pointer
is not allowed.
Now for a pointer to a const.
const int *pci; // pci is a pointer to an int const
The declaration of pci reads "pci is a pointer to an int const." Another way of saying "int const" is "const int," so the more pleasant way to read the declaration is "pci is a pointer to a const int." A "pointer to a const int" is nearly the opposite of a "const pointer to an int." A pointer to a const int can point at any int, const or not, and it can be re-aimed at another one, but it can’t be used to alter whatever it points at.
Thus
*pci = 10; // NG, pci points at a const, and we
// can’t change a const
is not allowed, but
pci++; // okay, point pci at the next const int
is okay.
The combination of the above two rules should explain const pointer to const:
const int cx = 0; // cx is a const int; it must be initialized
const int *const cpci = &cx; // cpci is a const pointer
// to a const int
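The three kinds of declaration show up most often as function parameters, where they tell the caller what the function may do with the pointer. The sketch below uses invented function names (sumInts, storeThrough, readThrough) and is only meant to show which operations the compiler permits in each case; the statements it would reject are left as comments.

```cpp
// Pointer to const int: the pointer may move, the data may not be written.
int sumInts(const int *pci, int count)
{
    int total = 0;
    for (int i = 0; i < count; ++i)
        total += *pci++;        // okay, advancing pci is allowed
    // *pci = 0;                // NG, can't write through pci
    return total;
}

// Const pointer to int: the data may be written, the pointer may not move.
void storeThrough(int *const cpi, int value)
{
    *cpi = value;               // okay, change what cpi points at
    // cpi++;                   // NG, cpi itself is const
}

// Const pointer to const int: neither the pointer nor the data may change.
int readThrough(const int *const cpci)
{
    // *cpci = 0;               // NG, the data is const
    // cpci++;                  // NG, the pointer is const
    return *cpci;               // reading is all that is left
}
```

Note that sumInts accepts the address of an ordinary, non-const int array as well; the const in its parameter restricts the function, not the caller's data.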