Steven Sinofsky
Q I am writing an application for the Microsoft® Windows™ operating system in C++. Until now I've always used malloc and free from the C run-time library, and I would like to take advantage of the operators new and delete in C++. I recall that my compiler’s implementation of malloc and free in large model used up a selector for each allocation, and sometimes I need to control the allocation flags, for example when allocating shared memory. Can I do this?
A In both Microsoft C/C++ 7.0 (C/C++ 7.0) and Borland® C++ (BC++), the current default implementation of new and delete uses the C run-time library functions malloc and free, though the language does not require this (so it is not a good idea to rely on such behavior across compiler implementations). Both have been optimized for mixed-model programming in Windows. In both implementations, near allocations for small or medium model programs are made in DGROUP as expected. All far allocations are implemented using a suballocation scheme based on GlobalAlloc memory blocks. (Far allocations can be obtained explicitly by using farmalloc in BC++, _fmalloc or “_far operator new” in C/C++ 7.0, or implicitly by using the large or compact models.)
Unfortunately, neither C++ implementation gives you the option of easily modifying the allocation flags for a particular class object when the object is dynamically allocated, that is, when it’s allocated using the new operator. It is possible for you to write your own implementation of new and delete for a particular class. Both C/C++ 7.0 and BC++ 3.0 permit the addition of any number of user-defined parameters to the new operator, with the restriction that the first parameter must be a size_t. In addition, C/C++ 7.0 permits overloading the return value of the new operator and the delete operator based on the size of the pointer (near, far, based, huge). The use of the _far keyword in both the class declaration and in the declaration of the allocation operators is required, since the objects must be far pointers.
A major limitation to keep in mind is that each allocated object requires a call to GlobalAlloc, so you should not use this for all objects in your application, but only for those that require special memory allocation flags. If you need finer-grained control, then you will need to implement a suballocation scheme on top of this. For more information on how to do this, see “Improve Windows Application Memory Use with Subsegment Allocation and Custom Resources,” MSJ (Vol. 6, No. 1).
In Figure 1, in the call to ::GlobalAlloc, GPTR is explicitly OR’ed in with the flags passed in by the user. Although the C++ default argument mechanism could have been used, this interface is easier to use since there is no need to add to the default arguments to customize the flags. If you implement a new operator, you should always implement a matching delete operator. In CGlobalObject::operator delete note the check for NULL in the delete operator. Deleting a NULL pointer is legal in C++ so you must account for that possibility. The CDDEObject class is derived from CGlobalObject to take advantage of the custom allocation. Other objects in your application that require this custom memory allocation should also derive from CGlobalObject.
Although this might seem like a good occasion for multiple inheritance, problems arise when two base classes have both overloaded the new operator. The programmer is required to explicitly specify the class name on the call. Also notice the use of the global scope specifier, the :: in front of calls to the Windows API. Although not required, they distinguish calls to the native Windows API from calls to my application framework member functions with the same name. Finally, note the use of a debug message when an allocation fails. Usually these facilities are provided at a higher level; for example, the Microsoft Foundation Classes (MFC) support diagnostic messages (TRACE) as well as an exception handling mechanism for such fatal conditions.
Figure 1 Using New and Delete
#include <windows.h>
#include <stdlib.h>

class _far CGlobalObject
{
public:
    void _far* operator new(size_t cbSize, UINT fuAlloc);
    void operator delete(void _far* p);
};

void _far* CGlobalObject::operator new(size_t cbSize, UINT fuAlloc)
{
    HANDLE h;

    // GPTR is always OR'ed in with the flags supplied by the caller
    if ((h = ::GlobalAlloc(fuAlloc | GPTR, cbSize)) == NULL)
    {
#ifdef _DEBUG
        ::OutputDebugString("CGlobalObject::operator new failed\r\n");
#endif
        return NULL;
    }
    else
        return (void _far*)::GlobalLock(h);
}

void CGlobalObject::operator delete(void _far* p)
{
    // deleting a NULL pointer is legal, so check before freeing
    if (p != NULL)
        ::GlobalFree(LOWORD(::GlobalHandle(HIWORD(p))));
#ifdef _DEBUG
    else
        ::OutputDebugString("CGlobalObject::operator delete passed NULL\r\n");
#endif
}

class CDDEObject : public CGlobalObject
{
protected:
    HWND m_hwnd;    // handle to window for conversation
    // other member data specific to this object type
    // ...
};

CDDEObject _far* pDDE = new(GMEM_DDESHARE) CDDEObject;
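Figure 1 depends on the 16-bit Windows API and the _far keyword, so it cannot be tried with other compilers. The following is a minimal portable sketch of the same technique, not the code from Figure 1: std::malloc and std::free stand in for ::GlobalAlloc and ::GlobalFree, and the names AllocFlags, CFlaggedObject, CSharedThing, and DemoFlaggedNew are hypothetical, invented only for illustration. Because standard C++ expects an ordinary operator new to report failure by throwing, the sketch throws std::bad_alloc instead of returning NULL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical flag values; they stand in for the GMEM_* flags.
enum AllocFlags { ALLOC_DEFAULT = 0, ALLOC_SHARED = 1 };

class CFlaggedObject
{
public:
    static int s_live;              // blocks currently allocated
    static AllocFlags s_lastFlags;  // flags seen by the last allocation

    // The first parameter must be size_t; the flags follow it.
    void* operator new(std::size_t cbSize, AllocFlags fuAlloc)
    {
        void* p = std::malloc(cbSize);  // stand-in for ::GlobalAlloc
        if (p == NULL)
            throw std::bad_alloc();
        s_lastFlags = fuAlloc;          // a real allocator would act on these
        ++s_live;
        return p;
    }

    // Matching form, used only if a constructor throws after allocation.
    void operator delete(void* p, AllocFlags) { Release(p); }

    // Ordinary form, used by a delete expression.
    void operator delete(void* p) { Release(p); }

private:
    static void Release(void* p)
    {
        if (p != NULL)                  // deleting NULL is legal
        {
            std::free(p);               // stand-in for ::GlobalFree
            --s_live;
        }
    }
};

int CFlaggedObject::s_live = 0;
AllocFlags CFlaggedObject::s_lastFlags = ALLOC_DEFAULT;

// Derived classes pick up the custom allocation automatically.
class CSharedThing : public CFlaggedObject
{
public:
    CSharedThing() : m_value(42) {}
    int m_value;
};

int DemoFlaggedNew()
{
    CSharedThing* p = new(ALLOC_SHARED) CSharedThing;
    int value = p->m_value;
    delete p;
    return value;
}
```

The flags parameter is an enumeration rather than an unsigned integer so that the two-argument operator delete cannot be mistaken for the sized form of operator delete on platforms where size_t is unsigned int.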
Q I design my classes with small accessor functions (functions that just return a copy of some internal state) and I want them to be inline, since the base class implementation usually just returns a member variable. But I would also like the functions to be virtual so that I can redefine their implementation in a derived class. When I compile my program and look at the generated code, I notice that these functions are never inlined, even with maximum optimizations. Why is that?
A The C++ language definition does not have anything against inlining virtual functions, but this is one optimization that is very difficult for compilers to perform. The inline keyword is only a hint to the compiler and there is no requirement that any function be expanded inline. Different compilers do varying levels of inlining. Usually there are some restrictions on what functions may be inlined and the compiler will warn you if a function is not being expanded inline. For example, BC++ 3.0 will not inline functions with looping constructs, whereas C/C++ 7.0 will inline arbitrary code, even recursive functions. C/C++ 7.0 automatically inlines functions based on optimization heuristics if you specify the /Ob2 switch on your compiles.
Here the problem is that the compiler cannot be sure which virtual function to call when its virtual function calling mechanism tries to call the function. Consider parts of the following two classes and a global function that uses them:
class CLine
{
protected:
    int m_nLength;
public:
    // const, so that it can be called through a const CLine*
    inline virtual int GetLength() const { return m_nLength; }
};

class CArrow : public CLine
{
protected:
    int m_nArrowLength;
public:
    inline virtual int GetLength() const
        { return m_nLength + m_nArrowLength; }
};

void Render(const CLine* pLine)
{
    // ...
    // code that calls GetLength
    int n = pLine->GetLength();
    // ...
}
Since CArrow is derived from CLine, Render can take a pointer to either a CLine or a CArrow class object. Therefore a call to GetLength in Render can result in either CArrow::GetLength or CLine::GetLength getting called because the member function is being called polymorphically. To use an inline function, the compiler must know the exact type of the object pLine refers to, for all possible calls to the Render function. This is not possible without a very sophisticated analysis of your entire application where the exact type of every object is known at the call site. This is the kind of analysis that current environments and tools are unable to do. All is not lost, however, since the compiler often knows the exact type of an object. For example, if you declare a class object of type CLine as a local variable on the stack, subsequent calls to GetLength in the same scope are optimized to nonvirtual (and if possible, inlined) calls. Both BC++ and C/C++ 7.0 are able to perform this optimization.
void LocalRender()
{
    CLine line;
    CArrow arrow;
    // ...
    int n1 = line.GetLength();   // Fcn expanded inline
    int n2 = arrow.GetLength();  // Fcn expanded inline
    Render(&line);               // Subsequent call not
                                 // expanded inline
    // ...
}
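The difference between the two kinds of call can be seen in a small self-contained sketch. It reuses the CLine and CArrow names from above but adds constructors, a virtual destructor, and return values (all assumptions made so that the example compiles on its own). It demonstrates only which function is selected; whether a given compiler actually expands a call inline can be confirmed only by inspecting the generated code.

```cpp
#include <cassert>

class CLine
{
protected:
    int m_nLength;
public:
    CLine(int nLength) : m_nLength(nLength) {}
    virtual ~CLine() {}
    virtual int GetLength() const { return m_nLength; }
};

class CArrow : public CLine
{
protected:
    int m_nArrowLength;
public:
    CArrow(int nLength, int nArrowLength)
        : CLine(nLength), m_nArrowLength(nArrowLength) {}
    virtual int GetLength() const { return m_nLength + m_nArrowLength; }
};

// Only the static type (CLine) is known here, so the call must be
// resolved at run time through the virtual function mechanism.
int Render(const CLine* pLine)
{
    return pLine->GetLength();
}

int LocalRender()
{
    CLine line(10);
    CArrow arrow(10, 3);

    int n1 = line.GetLength();          // exact type known: CLine::GetLength
    int n2 = arrow.GetLength();         // exact type known: CArrow::GetLength
    int n3 = Render(&arrow);            // run-time dispatch: CArrow::GetLength
    int n4 = arrow.CLine::GetLength();  // qualified name: never virtual

    return n1 + n2 + n3 + n4;           // 10 + 13 + 13 + 10
}
```

The last call shows one more case in which the compiler always knows the target: a call that names the class explicitly bypasses virtual dispatch altogether.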
In general, function inlining is an optimization that should be done only after your program is working correctly. A useful guideline is to inline only accessor functions and wrapper functions (that map directly to other API calls). All other functions should be regular out-of-line code until you begin to tune your application using a source code profiler. It is also helpful to debug your program with inlining turned off (the /Ob0 flag in C/C++ 7.0 or the -vi- flag in BC++). This makes it easy to set breakpoints in these functions, which makes debugging a class easier. (If you don’t turn inlining off, the debugger can’t tell which copy of the expanded function you wish to break in.)
Q I'd like to write some of my own classes to assist members of my team who are not well-versed in Windows and C++, but I don’t have the time to write a complete application library and we already have a large body of existing code. Any ideas for some easy helpers?
A One area where C++ can really help is in doing some of the mundane cleanup work that Windows requires. For example, how many times have you left off the ::EndPaint when handling a WM_PAINT message? MFC can be easily integrated with existing C or C++ code and addresses many of these mundane programming issues. It lets you concentrate on the code specific to your application. An abbreviated version of the CPaintDC class in MFC can be found in Figure 2. The constructor takes care of calling ::BeginPaint, the member functions make use of the hDC that is a member variable, and the destructor calls ::EndPaint.
Although only TextOut is implemented as a member function, you can see how this could be extended to other GDI calls. MFC implements member functions for all of GDI in a common base class called CDC. CPaintDC is derived from CDC (other helper classes—CWindowDC, CMetaFileDC, and CClientDC—follow this same logic). You can also see the possibilities for giving your coworkers many helper functions that simplify common cases (for example, you might not require the cbString parameter to CPaintDC::TextOut and let the implementation call the Windows API lstrlen automatically). It is best to start with a foundation such as MFC and determine which helpers you need by experience; otherwise you might find your class difficult to maintain. Also, since the hDC is a public member variable, you can use your existing drawing code by just passing the m_hDC to a draw routine. Similarly, it is also possible to access the PAINTSTRUCT. It is a reasonable practice to make such system data public, since the implementation of low-level Windows functions, such as BeginPaint, won’t change in the future.
Figure 2 An Abbreviated CPaintDC
class CPaintDC : public CDC
{
public:
    CPaintDC(HWND hWnd);    // BeginPaint
    ~CPaintDC();            // EndPaint

    HDC m_hDC;              // HDC for painting
    HWND m_hWnd;            // HWND of window being repainted
    PAINTSTRUCT m_ps;       // paint struct required by Windows

    // member functions
    BOOL TextOut(int x, int y, LPCSTR lpszString, int cbString);
};

inline CPaintDC::CPaintDC(HWND hWnd)
{
    m_hWnd = hWnd;
    m_hDC = ::BeginPaint(m_hWnd, &m_ps);
}

inline CPaintDC::~CPaintDC()
{
    ::EndPaint(m_hWnd, &m_ps);   // must be the same PAINTSTRUCT passed to BeginPaint
}

inline BOOL CPaintDC::TextOut(int x, int y, LPCSTR lpszString, int cbString)
{
    return ::TextOut(m_hDC, x, y, lpszString, cbString);
}
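The idea behind Figure 2, pairing an acquiring call in the constructor with the releasing call in the destructor, does not depend on GDI. The following is a minimal portable sketch of the pattern, assuming a pair of hypothetical functions, BeginResource and EndResource, that stand in for ::BeginPaint and ::EndPaint; CResourceGuard and PaintSomething are likewise invented names.

```cpp
#include <cassert>

// Hypothetical paired calls; g_nOpen counts begins without a matching end.
static int g_nOpen = 0;
void BeginResource() { ++g_nOpen; }
void EndResource()   { --g_nOpen; }

class CResourceGuard
{
public:
    CResourceGuard()  { BeginResource(); }   // acquire on construction
    ~CResourceGuard() { EndResource(); }     // release on destruction
private:
    // Copying a guard would release twice, so copying is disallowed.
    CResourceGuard(const CResourceGuard&);
    CResourceGuard& operator=(const CResourceGuard&);
};

// Returns how many resources were open while the guard was alive.
int PaintSomething(bool bEarlyExit)
{
    CResourceGuard guard;

    if (bEarlyExit)
        return g_nOpen;     // the destructor still runs on this path

    // ... drawing code would go here ...
    return g_nOpen;
}
```

Whichever return statement is taken, the guard's destructor runs when the function exits, so the releasing call cannot be forgotten.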
Q I wrote a simple String class and I included an overloaded assignment operator (operator=) so that I can copy one string to another. My program (see Figure 3) crashes, though, and I’m not sure why. The strange thing is that I set a breakpoint at the first line of my operator=, but the function never gets called. Is C++ doing something behind my back?
Figure 3 Incomplete String Class
#include <string.h>

class String
{
    char* pszString;
    int len;
public:
    String();
    String(const char* pszInitial);
    ~String();
    String& operator=(const String& source);
    // other member functions
    // ...
};

String::String()
{
    pszString = NULL;
    len = 0;
}

String::String(const char* pszInitial)
{
    if (pszInitial != NULL)
    {
        len = strlen(pszInitial);
        pszString = new char[len + 1];
        strcpy(pszString, pszInitial);
    }
    else
    {
        len = 0;
        pszString = NULL;
    }
}

String::~String()
{
    delete [] pszString;    // may be NULL; array form matches new char[]
}

String& String::operator=(const String& source)
{
    delete [] pszString;
    if (source.pszString != NULL)
    {
        len = strlen(source.pszString);
        pszString = new char[len + 1];
        strcpy(pszString, source.pszString);
    }
    else
    {
        len = 0;
        pszString = NULL;
    }
    return *this;
}

void main()
{
    String s1("Hello World");
    String s2 = s1;     // assign s1 to s2
}   // my program crashes when I exit
A C++ does lots of things behind your back, and this is a good example. Although you planned ahead and accounted for assigning one string to another via the operator=, you need to consider another member. This member, the copy initializer or copy constructor, is invoked whenever you initialize a new object with another object, and if you don’t supply one C++ will give you one. The following code would execute correctly with your implementation:
    String s1("Hello World");
    String s2;
    s2 = s1;    // assign s1 to s2
This is because it uses the overloaded assignment operator. Your program is causing the copy constructor to be invoked, not the assignment operator. A copy construction takes place when the assignment occurs along with a variable definition. As with the default implementation of operator=, the copy constructor function does a member-wise assignment. Since your class has pointers to allocated data, this will result in two objects pointing to the same allocation. Since both objects will eventually be destroyed, one will delete an allocation that has already been deleted, causing the crash you are seeing. For your string class the compiler will generate the following function:
String::String(const String& source)
{
    // generated copy constructor
    pszString = source.pszString;
    // two objects will point to the same buffer!
    len = source.len;
}
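One way to see which member a statement invokes is to count the calls. The following minimal sketch uses a hypothetical Tracer class (not part of the String class) whose copy constructor and assignment operator do nothing but increment counters.

```cpp
#include <cassert>

class Tracer
{
public:
    static int s_nCopyCtor;     // copy constructor calls so far
    static int s_nAssign;       // assignment operator calls so far

    Tracer() {}
    Tracer(const Tracer&) { ++s_nCopyCtor; }
    Tracer& operator=(const Tracer&) { ++s_nAssign; return *this; }
};

int Tracer::s_nCopyCtor = 0;
int Tracer::s_nAssign = 0;

void DemoCopyVersusAssign()
{
    Tracer t1;

    Tracer t2 = t1;     // a definition with an initializer:
                        // copy constructor, despite the '=' sign

    Tracer t3;
    t3 = t1;            // t3 already exists: assignment operator
}
```

After DemoCopyVersusAssign runs once, each counter holds 1: the statement that looks like an assignment inside a definition went to the copy constructor.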
Usually your implementation of a copy constructor looks a lot like the code for an overloaded assignment operator. The only difference is that the assignment operator needs to take into account the case where the right side of the assignment is the same as the left side, which is missing from the code you submitted. A better implementation is found in Figure 4.
There are still other optimizations that you could make to Figure 4. For example, the current assignment operator always forces an allocation. If the destination (the “this” pointer) string is longer than the source string, an allocation isn’t necessary and the assignment can be optimized to just a strcpy.
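The buffer-reuse optimization can be sketched as follows. This is one possible variant under stated assumptions, not the column's own code: the class is a cut-down String with hypothetical Get and Length accessors added so that the result can be inspected, and it relies on the invariant that the buffer always holds at least len + 1 characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class String
{
    char* pszString;
    int len;            // buffer always holds at least len + 1 chars
public:
    String() : pszString(NULL), len(0) {}
    String(const char* pszInitial) : pszString(NULL), len(0)
    {
        if (pszInitial != NULL)
        {
            len = (int)std::strlen(pszInitial);
            pszString = new char[len + 1];
            std::strcpy(pszString, pszInitial);
        }
    }
    String(const String& source) : pszString(NULL), len(source.len)
    {
        if (source.pszString != NULL)
        {
            pszString = new char[len + 1];
            std::strcpy(pszString, source.pszString);
        }
    }
    ~String() { delete [] pszString; }

    String& operator=(const String& source)
    {
        if (this == &source)            // assignment to self: s = s;
            return *this;

        if (source.pszString == NULL)
        {
            delete [] pszString;
            pszString = NULL;
        }
        else if (pszString != NULL && source.len <= len)
        {
            // the source fits in the existing buffer: no allocation
            std::strcpy(pszString, source.pszString);
        }
        else
        {
            // allocate first, so a failed new leaves *this unchanged
            char* pszNew = new char[source.len + 1];
            std::strcpy(pszNew, source.pszString);
            delete [] pszString;
            pszString = pszNew;
        }
        len = source.len;
        return *this;
    }

    const char* Get() const { return pszString != NULL ? pszString : ""; }
    int Length() const { return len; }
};
```

Because len records the current string length rather than the buffer size, the test is conservative: after a short string has been copied into a long buffer, a later assignment may allocate even though the old buffer would have been large enough. Tracking the capacity in a separate member would remove that limitation.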
As a reminder, a brief table of member functions that a C++ compiler will (silently!) generate for you is shown in Figure 5. There is also a brief description of the default implementation. This table is based on information from a very useful book by Scott Meyers called Effective C++: 50 Specific Ways to Improve Your Programs and Designs (Addison-Wesley, 1992).
When overriding the default constructor and destructor, the construction and destruction of class members and base classes will always take place. In contrast, the copy constructor and assignment operator are completely replaced by your implementation. A default constructor will not be generated if you provide any other type of constructor for your class, such as one that takes argument(s). If you do not provide a default constructor, you will not be able to create arrays of objects. And don’t forget that access protection applies to these special members as well, so you can, for example, make it impossible for users to assign your class’s objects by making the assignment operator private.
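Two of these rules can be checked mechanically. The sketch below is a hedged illustration, not something available to the compilers discussed in this column: it assumes a compiler that supports the later C++11 standard, whose <type_traits> header and static_assert let the compiler report whether a class can be default constructed or assigned. CNoAssign and DemoNoAssign are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

class CNoAssign
{
public:
    // Providing this constructor suppresses the generated default constructor.
    CNoAssign(int n) : m_n(n) {}
    int Value() const { return m_n; }
private:
    // Declared private and never defined: users cannot assign.
    CNoAssign& operator=(const CNoAssign&);
    int m_n;
};

// No default constructor, so "CNoAssign array[3];" would not compile.
static_assert(!std::is_default_constructible<CNoAssign>::value,
              "a default constructor is not generated");

// The private assignment operator is inaccessible to users of the class.
static_assert(!std::is_copy_assignable<CNoAssign>::value,
              "assignment is not accessible");

// The copy constructor was not declared, so the compiler still supplies one.
static_assert(std::is_copy_constructible<CNoAssign>::value,
              "a copy constructor is still generated");

int DemoNoAssign()
{
    CNoAssign a(7);
    CNoAssign b(a);     // uses the generated memberwise copy constructor
    // b = a;           // error: operator= is private
    return b.Value();
}
```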
Figure 4 Improved Code
String& String::operator=(const String& source)
{
    if (source.pszString == pszString)
        // check for assignment to self: s = s;
        return *this;

    delete [] pszString;    // array form matches new char[]
    if (source.pszString != NULL)
    {
        len = strlen(source.pszString);
        pszString = new char[len + 1];
        strcpy(pszString, source.pszString);
    }
    else
    {
        len = 0;
        pszString = NULL;
    }
    return *this;
}

String::String(const String& source)
{
    // no delete here: a new object has no previous buffer to release
    if (source.pszString != NULL)
    {
        len = strlen(source.pszString);
        pszString = new char[len + 1];
        strcpy(pszString, source.pszString);
    }
    else
    {
        len = 0;
        pszString = NULL;
    }
}
Figure 5 Member Functions Generated by Default
Function type | Declaration | Default implementation |
Default constructor | Class() | Does nothing |
Default destructor | ~Class() | Does nothing |
Copy constructor | Class(const Class&) | Memberwise copy |
Assignment operator | Class& operator=(const Class&) | Memberwise copy |
Address of operator (const) | const Class* operator&() const | Returns this |
Address of operator (non-const) | Class* operator&() | Returns this |