Implementing Automation Collections

Charlie Kindel
Program Manager/OLE Evangelist, Microsoft Developer Relations Group

Created: March 10, 1994
Revised: October 18, 1994


Abstract

This article describes how to design and implement Automation collections using the Microsoft® Foundation Classes (MFC). A precise definition of collections is given, followed by the definition of an Automation collection.

Throughout this article, code fragments are given in both C++/MFC and Visual Basic®. It is assumed that the reader is familiar with both. The XRTFRAME sample application is an MFC application that illustrates the Automation collection concepts presented in this article.

Definition of an Automation Collection

The Microsoft® Excel version 5.0 Visual Basic Programmer's Guide defines Automation collections as follows:

"A group of objects. An object's position in the collection can change whenever a change occurs in the collection. Therefore, the position of any specific object in the collection is unpredictable. This unpredictability distinguishes a collection from an array."

This article and the associated sample code use this definition.

The defining characteristic of a collection is the ability to iterate over the items contained in the collection. An item is anything that can be accessed via an Automation interface. Examples of items in a typical Microsoft Windows®-based application are multiple-document interface (MDI) child windows, a cell in a spreadsheet, and a button on the toolbar. Examples of collections are the cells in a worksheet, the open worksheets, the windows containing those worksheets, and the buttons on the toolbars.

Collections vs. Arrays

Initially, it seems to make sense to think of the set of open windows in an application as an array of windows, where it is possible to access any given window via an index. This implies that each window has a fixed index relative to some starting point (the first window created?). However, a typical Windows-based application changes the z-order of existing windows and may dynamically create and destroy other windows. While it is not difficult to implement an array whose upper bound can grow and shrink, it is awkward to remove items from the middle of an array. A set of items with no guaranteed order or size is therefore better modeled as a collection than as an array. A collection can be implemented as an array, but in many (if not most) cases, it is easier to implement it as a linked list, where items can be added, removed, and moved around at will. It is still useful to be able to access individual windows in the list by name or index, and even to iterate over all of the windows. Collections provide a mechanism for doing just this. In a sense, the term collection is a non-computer-science way of describing a linked list.

Typically (but not always), an arbitrary item or object in an application can be identified through both a human-readable name and some sort of indexing. For example, a user could find a window in a collection using its name (typically a string representing its caption) or by using a number (representing its position in the z-order).
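
The idea can be sketched in portable C++ without any OLE or MFC types. The following is a minimal illustration, not code from the XRTFRAME sample: the Window and WindowCollection names are hypothetical, and a std::list stands in for whatever linked list an application would really use. It shows lookup by name and by 1-based position, and how a z-order change alters which item a given position refers to.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <string>

// Hypothetical stand-in for an application window; not an MFC type.
struct Window {
    std::string caption;
};

// A collection kept as a linked list so that items can be added,
// removed, and reordered without shifting the other elements.
class WindowCollection {
public:
    void Add(const std::string& caption) { m_items.push_back(Window{caption}); }

    // Moves the named window to the front, as a z-order change would.
    bool BringToFront(const std::string& caption) {
        for (auto it = m_items.begin(); it != m_items.end(); ++it) {
            if (it->caption == caption) {
                m_items.splice(m_items.begin(), m_items, it);
                return true;
            }
        }
        return false;
    }

    // Lookup by human-readable name; returns nullptr if not found.
    const Window* Item(const std::string& caption) const {
        for (const Window& w : m_items)
            if (w.caption == caption) return &w;
        return nullptr;
    }

    // Lookup by 1-based position. The item found at a given position can
    // change whenever the collection changes.
    const Window* Item(std::size_t index) const {
        if (index < 1 || index > m_items.size()) return nullptr;
        auto it = m_items.begin();
        std::advance(it, index - 1);
        return &*it;
    }

    std::size_t Count() const { return m_items.size(); }

private:
    std::list<Window> m_items;
};
```

After BringToFront is called, Item(1) refers to a different window than before, which is exactly the unpredictability that distinguishes a collection from an array.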

Automation Collections

Automation collections are collections that are exposed through a standard OLE interface.

A collection is exposed from Automation through a collection object. There is no "collection object" type. Collection objects can be pseudo-objects; that is, it is reasonable for collection objects to exist only while a client is iterating over the collection.

The following table shows the standard properties and methods of a collection object. Note that some are optional:

Member              Description                                                                             Optional?
Add method          Adds the indicated item to the collection.                                              Yes
Count property      Returns the number of items in the collection.                                          No
Item method         Returns the indicated item in the collection, or VT_EMPTY if the item does not exist.   No
_NewEnum property   Returns an OLE object that supports IEnumVARIANT. This property is not visible to users. No
Remove method       Removes the specified item from the collection.                                         Yes

The Six Commandments of Automation Collections

There are basically six rules that an Automation collection must abide by in order to be called a collection. These six "commandments" define what methods and properties a collection must support and the semantics of the object.

The First Commandment

A property or method that returns a collection must be named with the plural name for the items in the collection. If the plural name for the item is the same as the singular name, "Collection" should be appended to the singular name to obtain the name for the property.

Examples of this first commandment of collections are given below:

Name of an item in the collection   Type of the item in the collection   Name of the collection object
Word                                Word object                          Words
Item                                Item object                          Items
Document                            Document object                      Documents
Foot                                Foot object                          Feet
Vertex                              Point object                         Vertices
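
The naming rule is simple enough to state as a function. The sketch below is only an illustration of the rule: CollectionPropertyName is a hypothetical helper, and the caller is assumed to supply both the singular and the plural form, because the sketch makes no attempt to pluralize English words.

```cpp
#include <string>

// Returns the name for a collection property under the first commandment.
// Both names are supplied by the caller; nothing here pluralizes words.
std::string CollectionPropertyName(const std::string& singular,
                                   const std::string& plural) {
    if (plural == singular)
        return singular + "Collection";   // plural and singular are the same
    return plural;
}
```

For an item whose plural matches its singular (for example, a hypothetical Sheep object), the function yields SheepCollection.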

The Second Commandment

All collection objects must have an _NewEnum property.

_NewEnum property

Name: _NewEnum. (Both NewEnum and _NewEnum are frequently used. However, the name of this property is not important because it is always accessed via a special DISPID: DISPID_NEWENUM.)

Description: Points to an object that supports IEnumVARIANT.

Parameters: None.

Return type: VT_UNKNOWN. (The Automation Reference indicates that the return type should be VT_DISPATCH. However, this does not make sense because the return is a pointer to an IEnumVARIANT interface, not an IDispatch interface.)

Usage example: None. End users do not know about the _NewEnum property.

Note: _NewEnum will not be accessible to users. The _NewEnum property must have a special DISPID: DISPID_NEWENUM.

The defining characteristic of a collection is that a user can iterate over the items in it. The _NewEnum property is the primary mechanism for doing this. Some languages do not have built-in support for collections (Visual Basic 3.0 and the DispTest tool are examples). In languages that do, such as Visual Basic® for Applications, the _NewEnum property is used internally by the implementation to support constructs that iterate over collections, as in the following example:

For Each w In words
   MsgBox w.Definition
Next

Automation controllers that support "for each"-like constructs for iterating over collections retrieve the _NewEnum property from the collection object and then call QueryInterface on the result to get the IEnumVARIANT pointer. The Automation controller then uses IEnumVARIANT::Next to iterate over the collection.

The _NewEnum property is "restricted". Restricted properties and methods are not accessible to end users. (Type library creators can specify that a property or method is restricted by using the "restricted" attribute for the property or method in the ODL specification that is used by MKTYPLIB.EXE to generate a type library.) The underscore in the name of the property indicates that it will not be visible to users of a type library browser, such as the Object Browser in Microsoft Excel 5.0. In MFC, the DISP_PROPERTY_EX_ID macro should be used to define the dispatch map entry for the _NewEnum property:

DISP_PROPERTY_EX_ID(CWords, "_NewEnum", DISPID_NEWENUM, _NewEnum, SetNotSupported, VT_UNKNOWN)

Note that ClassWizard is not capable of automatically generating DISP_PROPERTY_EX_ID entries in a class dispatch map; you must add this entry manually.

The _NewEnum property should be declared in a collection's type information as follows:

#define DISPID_NEWENUM -4

properties:
      [id(DISPID_NEWENUM)] IUnknown* _NewEnum ;

The name of the _NewEnum property need not be localized because end users will never see it.

The Third Commandment

All collection objects must have a Count property.

Count property

Name: Count.

Description: The count of items in the collection.

Access: Read-only.

Data type: VT_I4.

Usage example:

Print Words.Count
For i = 1 To Words.Count
   Print Words(i).Definition
Next i

The Count property provides a second means for iterating over the objects in a collection. It is most useful in languages that do not directly support collections (that is, languages that do not support the _NewEnum property) such as Visual Basic 3.0.

The fact that the position of an item in a collection may change whenever items are added or removed from the collection can cause some confusion when the Count property is used to iterate over a collection. For example, the following code will not work correctly with most collection objects:

For i = 1 To Words.Count
   Words.Remove(i)
Next i

However, the following code will work just fine:

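For i = 1 To Words.Count
   Words.Remove(0)
Next i

or:

For i = Words.Count To 1 Step -1
   Words.Remove(0)
Next i

(Both examples assume that the collection object Words implements the optional Remove method and that its first item has index 0; for a collection indexed from 1, remove item 1 instead.)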

Note that some collections are zero-indexed and some are indexed starting at 1. It is recommended that new collections be designed such that indexing starts at 1.
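
To see why the forward loop fails, it helps to trace it against a small model. The following portable C++ sketch is an assumption-laden illustration, not part of any sample: the Words class here is a hypothetical 1-based collection whose Remove shifts later items down, and it returns false for an out-of-range index where a real collection would more likely raise an error.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical 1-based collection; Remove(i) deletes the item at position i
// and shifts the later items down by one.
class Words {
public:
    explicit Words(std::vector<std::string> items) : m_items(std::move(items)) {}
    std::size_t Count() const { return m_items.size(); }
    bool Remove(std::size_t index) {             // false if index is out of range
        if (index < 1 || index > m_items.size()) return false;
        m_items.erase(m_items.begin() + (index - 1));
        return true;
    }
private:
    std::vector<std::string> m_items;
};

// Mirrors "For i = 1 To Words.Count ... Words.Remove(i)". Visual Basic
// evaluates Count once, so the bound is captured before the loop starts.
std::size_t RemoveForward(Words& words) {
    const std::size_t n = words.Count();
    for (std::size_t i = 1; i <= n; ++i)
        words.Remove(i);                         // positions shift after each call
    return words.Count();                        // number of items left behind
}

// Removes from the last position down to the first, so no remaining item
// changes position before it is visited.
std::size_t RemoveBackward(Words& words) {
    for (std::size_t i = words.Count(); i >= 1; --i)
        words.Remove(i);
    return words.Count();
}
```

With four items, the forward loop removes the first and third and then runs off the end, leaving two items behind; the backward loop empties the collection.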

The Fourth Commandment

All collection objects must support at least one form of indexing by implementing the Item method.

Item method

Name: Item.

Description: Returns the indicated item in the collection, or VT_EMPTY if the item does not exist.

Parameters: Varies.

Return type: VT_DISPATCH.

Usage example:

Print Words(3).Definition
Print Words("fox").Definition
Print Words.Item(3).Definition
Print Words.Item("fox").Definition
Print Cells.Item(1,1).Text
Print Cells.Item("R1C1").Text

Note   The Item method must be the default member for the object. In MFC, the dispatch map macro DISP_DEFVALUE(CMyObject, "Item") denotes the Item method as the default member for the CMyObject object by assigning to the dispatch map the DISPID of 0.

Most collections support one or more forms of indexing. The following code shows how a collection object can support indexing using both an index and a string. This example prints the definition of the third word in the Words collection and then prints the definition of "dog".

Print Words(3).Definition
Print Words("dog").Definition

The statements above are equivalent to:

Print Words.Item(3).Definition
Print Words.Item("dog").Definition

because Item is the default member (DISPID_DEFVALUE) of the Words collection object.

Collections that support indexing must use the Item method to implement indexing. Collections that support several types of indexing should implement an Item method that takes one or more VARIANTs as index parameters.

In MFC, the Item method should have a dispatch map entry that looks like this:

DISP_FUNCTION(CDataItems, "Item", GetItem, VT_DISPATCH, VTS_VARIANT)
DISP_DEFVALUE(CDataItems, "Item")

The ODL statement for the Item should look like this:

methods:
   [id(0)] IDispatch* Item(VARIANT Index);
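
The way a single Item method can serve several kinds of index is easier to see in a self-contained model. The sketch below is portable C++ rather than OLE code, and every name in it is hypothetical: the Index struct is a minimal stand-in for a VARIANT (a real implementation would inspect VARIANT::vt), and a null pointer plays the role of the VT_EMPTY result.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical item type, modeled on the Words examples above.
struct Word {
    std::string letters;
    std::string definition;
};

// Minimal stand-in for a VARIANT index: a type tag plus one value per
// supported type.
struct Index {
    enum Kind { Number, Name } kind;
    long number;
    std::string name;
    Index(int n) : kind(Number), number(n) {}
    Index(const char* s) : kind(Name), number(0), name(s) {}
};

class Words {
public:
    void Add(const Word& w) { m_items.push_back(w); }

    // One entry point for both forms of indexing. Returns nullptr when no
    // item matches.
    const Word* Item(const Index& index) const {
        if (index.kind == Index::Number) {
            if (index.number < 1 ||
                static_cast<std::size_t>(index.number) > m_items.size())
                return nullptr;
            return &m_items[static_cast<std::size_t>(index.number) - 1];
        }
        for (const Word& w : m_items)
            if (w.letters == index.name) return &w;
        return nullptr;
    }

private:
    std::vector<Word> m_items;
};
```

Both words.Item(2) and words.Item("dog") go through the same method, which is the shape the VTS_VARIANT dispatch map entry gives to GetItem.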

The Fifth Commandment

Collections that support manual addition of objects should consider doing so using the standard Add method.

Add method

Name: Add.

Suggested syntax: Add obj [, index][, before][, after]

where index is a string property that can later be used as a string index, and before and after (one or the other, not both) control the placement of the object in the collection.

Description: Adds the indicated item to the collection. If an object is created as a result of the addition, that object should be returned.

Parameters: Varies.

Return type: VARIANT, the type of which varies with the implementation. If the Add method cannot add the item to the collection, it should raise an exception. If an object is created as a result of the Add, the return value should be of type VT_DISPATCH.

Here is an example that illustrates many of the ways in which the Add method can be used:

Set MyDict = CreateObject("Diction.Application")

Set dog = CreateObject("Diction.Word")
dog.Letters = "dog"
dog.Definition = "Man's best friend."
MyDict.Add dog, Index := dog.Letters

Set cat = CreateObject("Diction.Word")
cat.Letters = "cat"
cat.Definition = "Dog's best friend."
MyDict.Add cat, Index := cat.Letters, After := dog

The above example assumes that the implementation of Add follows the suggested syntax given above.

The Add method is not appropriate for all collections, so it is not required. For many application-created collections, objects are automatically added to the collection for the user. If a collection object's Add method sometimes creates an object, the return type of the method can be IDispatch*. When an object is created, a pointer to the object is returned; when an object is not created, NULL is returned.
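
As a rough model of the suggested syntax, the following portable C++ sketch shows an Add that takes a string index and an optional placement. It is a simplification under stated assumptions: Dictionary and Word are hypothetical, the neighbor is identified by its string index rather than by object, only the "after" placement is modeled, and failure is reported with a bool where an Automation implementation would raise an exception.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <utility>

// Hypothetical item type, modeled on the Diction.Word object above.
struct Word {
    std::string letters;
    std::string definition;
};

class Dictionary {
    typedef std::list<std::pair<std::string, Word> > Entries;

public:
    // Adds 'word' under the string index 'index'. When 'after' is non-empty,
    // the word is placed immediately after the entry that has that index;
    // otherwise it is appended. Returns false when the item cannot be added.
    bool Add(const Word& word, const std::string& index,
             const std::string& after = std::string()) {
        if (Find(index) != m_entries.end())
            return false;                        // index already in use
        Entries::iterator pos = m_entries.end();
        if (!after.empty()) {
            pos = Find(after);
            if (pos == m_entries.end())
                return false;                    // no entry to place it after
            ++pos;                               // insert() places before 'pos'
        }
        m_entries.insert(pos, std::make_pair(index, word));
        return true;
    }

    std::size_t Count() const { return m_entries.size(); }

    // Returns the string index of the entry at a 1-based position, or an
    // empty string if the position is out of range.
    std::string IndexAt(std::size_t position) const {
        if (position < 1 || position > m_entries.size())
            return std::string();
        Entries::const_iterator it = m_entries.begin();
        std::advance(it, position - 1);
        return it->first;
    }

private:
    Entries::iterator Find(const std::string& index) {
        for (Entries::iterator it = m_entries.begin(); it != m_entries.end(); ++it)
            if (it->first == index) return it;
        return m_entries.end();
    }

    Entries m_entries;
};
```

A linked list is used so that placing an item after an existing one does not require moving the other entries.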

The Sixth Commandment

Collections that support manual removal of objects should consider doing so using the standard Remove method.

Remove method

Name: Remove.

Description: Removes the specified item from the collection.

Parameters: Varies.

Return type: VT_EMPTY.

Example:

MyDict.Remove("fox")
Set x = MyDict.Item(5)
MyDict.Remove(5)
Print x.Definition

The object is not deleted; it is simply removed from the collection.

Remove should support the same kinds of indexing as the Item() method for the same collection.

The Remove method is not appropriate for all collections, so it is not required. For many application-created collections, objects are automatically removed from the collection for the user.
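
The point that Remove takes an item out of the collection without destroying it can be modeled in portable C++. In this sketch, which is an illustration only, std::shared_ptr stands in for COM reference counting, and the Dictionary and Word names are hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical item type.
struct Word {
    std::string letters;
    std::string definition;
};

class Dictionary {
public:
    void Add(std::shared_ptr<Word> w) { m_items.push_back(std::move(w)); }

    // Returns a counted reference to the item at a 1-based position,
    // or an empty pointer if the position is out of range.
    std::shared_ptr<Word> Item(std::size_t index) const {
        if (index < 1 || index > m_items.size()) return nullptr;
        return m_items[index - 1];
    }

    // Drops the collection's own reference to the item. The item itself
    // survives for as long as someone else still holds a reference.
    bool Remove(std::size_t index) {
        if (index < 1 || index > m_items.size()) return false;
        m_items.erase(m_items.begin() + (index - 1));
        return true;
    }

    std::size_t Count() const { return m_items.size(); }

private:
    std::vector<std::shared_ptr<Word> > m_items;
};
```

This mirrors the Visual Basic example above: the variable x keeps the object alive after it has been removed, so x.Definition can still be printed.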

Implementing Collections in MFC

The Automation support found in MFC version 2.5 and higher greatly reduces the amount of code you need to write to support Automation in your applications. This makes it much easier to implement Automation collections. This section describes how to implement collections using MFC. The COLLECT sample found in the software library (sample #S14424) shows how to implement Automation collections without using MFC.

Collection Design Decisions

In most applications, collections can be classified into two categories:

It is important to determine the category of a collection before implementing one.

Properties That Are Collections

While it is certainly possible for a top-level object (an object that can be created via an IClassFactory::CreateInstance implementation) to be a collection object, it is more likely that collection objects will be sub-objects. For example, the Documents property of the Application object in Microsoft Excel 5.0 returns a collection object that is a sub-object of Application.

Set docs = Application.Documents
For Each doc in docs
   MsgBox doc.Title
Next

We call the Documents property a collection property. Collection properties always have a type of IDispatch* (or a type derived from IDispatch*) and are usually read-only. We say "usually" because it is possible for an application to design a collection object that is passed around.

The Application object might implement the Documents property as follows:

// In APP.ODL
...
     properties:
           [id(3)] IDispatch* Documents;
...

// In APP.CPP
...
DISP_PROPERTY_EX(CApplication, "Documents", GetDocuments, SetNotSupported, VT_DISPATCH)
...
LPDISPATCH CApplication::GetDocuments()
{
   // Create the pseudo collection object on demand
   CDocuments* pDocs = new CDocuments ;
   ASSERT(pDocs) ;
   if (pDocs == NULL)
      AfxThrowOleDispatchException(0,"Out of memory") ;
   // FALSE: do not AddRef; the new object already has a reference count of 1
   return pDocs->GetIDispatch(FALSE) ;
}

This function simply creates an instance of the Documents collection object and returns its IDispatch pointer. In this case, CDocuments is a pseudo-object in that the collection is created only when a collection pointer is requested. Note the call to pDocs->GetIDispatch(FALSE). The parameter is FALSE in this function because the act of creating a new instance of CDocuments automatically gives the object a reference count of 1. When the caller releases the returned pointer (by calling Release on it), the object should delete itself because the only thing that has a pointer to the newly created pseudo collection object is the caller.

If the application were structured such that CDocuments actually was the container of all document objects, the GetDocuments method would look like this:

LPDISPATCH CApplication::GetDocuments()
{
   return m_Docs.GetIDispatch(TRUE) ;
}

In this case, we are assuming that CApplication has a member (m_Docs) of type CDocuments. Note that here we pass the GetIDispatch function a TRUE parameter to indicate that the object's reference count should be increased. This way, when the caller calls Release on the returned IDispatch pointer, m_Docs won't try to delete itself.
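
The difference between the two calls comes down to who owns the initial reference. The following portable C++ sketch models just the counting; RefCounted and GetPointer are hypothetical names, the flag plays the role of the BOOL parameter of GetIDispatch, and Release only reports the remaining count instead of deleting anything so that the effect stays observable.

```cpp
#include <cassert>

// Minimal stand-in for a reference-counted object that, like the objects
// described above, starts life with a reference count of 1.
class RefCounted {
public:
    RefCounted() : m_refs(1) {}

    // Returns the object itself and adds a reference only when addRef is
    // true, mirroring the parameter passed to GetIDispatch.
    RefCounted* GetPointer(bool addRef) {
        if (addRef) ++m_refs;
        return this;
    }

    // Drops one reference and returns the remaining count. A real object
    // would delete itself when the count reaches zero.
    unsigned long Release() {
        assert(m_refs > 0);
        return --m_refs;
    }

    unsigned long RefCount() const { return m_refs; }

private:
    unsigned long m_refs;
};
```

For a freshly created pseudo-object, GetPointer(false) followed by the caller's Release brings the count to 0, so the object goes away. For a long-lived member, GetPointer(true) followed by Release leaves the count at 1, so the member survives.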

One last point regarding properties that are collections: Remember what the first commandment of Automation collections says: "A property or method that returns a collection must be named with the plural name for the items in the collection. If the plural name for the item is the same as the singular name, "Collection" should be appended to the singular name to obtain the name for the property."

Implementing _NewEnum and IEnumVARIANT

The second commandment of Automation collections says that each collection object must implement the _NewEnum property. The whole point of the _NewEnum property is to allow Automation controllers to iterate or enumerate over the items in the collection using language features such as the For Each construct in Visual Basic.

To illustrate how _NewEnum works, consider the following Visual Basic code:

Set docs = Application.Documents
For Each doc in docs
   MsgBox doc.Title
Next

The above code enumerates through all of the open documents in the application, and for each document found, pops up a message box indicating the title. The following steps are performed by Visual Basic when it encounters a For Each clause similar to the one above:

1. Visual Basic retrieves the _NewEnum property from the collection object (docs) using DISPID_NEWENUM.
2. It calls QueryInterface on the returned object to obtain an IEnumVARIANT pointer.
3. It calls IEnumVARIANT::Next repeatedly; each item returned is assigned to doc and the body of the loop is executed, until no more items remain.

So, in short, _NewEnum points to an enumerator object that knows how to enumerate over the items in the collection. An enumerator object is really just an object that implements the IEnumVARIANT interface. See the OLE Programmer's Reference (MSDN Library, Product Documentation, SDKs, OLE 2) for details on IEnum interfaces.
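
The controller's side of a For Each loop can be sketched in portable C++. Nothing below is a real OLE type: Result, Document, DocumentEnumerator, and Documents are hypothetical stand-ins for HRESULT values, a document object, IEnumVARIANT (reduced to Next and Reset), and the collection object, and the QueryInterface step is omitted because the model has only one interface.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Play the roles of S_OK and S_FALSE.
enum Result { Ok, NoMoreItems };

struct Document {
    std::string title;
};

// Plays the role of IEnumVARIANT.
class DocumentEnumerator {
public:
    explicit DocumentEnumerator(const std::vector<Document>& docs)
        : m_docs(docs), m_pos(0) {}

    // Fetches up to 'count' items into 'out' and reports how many were
    // fetched. Returns NoMoreItems when fewer than 'count' were available.
    Result Next(std::size_t count, const Document** out, std::size_t* fetched) {
        std::size_t n = 0;
        while (n < count && m_pos < m_docs.size())
            out[n++] = &m_docs[m_pos++];
        if (fetched) *fetched = n;
        return n == count ? Ok : NoMoreItems;
    }

    void Reset() { m_pos = 0; }

private:
    const std::vector<Document>& m_docs;
    std::size_t m_pos;
};

// Plays the role of the collection object.
class Documents {
public:
    void Add(const std::string& title) { m_docs.push_back(Document{title}); }
    // Stands in for the _NewEnum property.
    DocumentEnumerator NewEnum() const { return DocumentEnumerator(m_docs); }
private:
    std::vector<Document> m_docs;
};

// Roughly what "For Each doc In docs" does on the controller side.
std::vector<std::string> ForEachTitle(const Documents& docs) {
    std::vector<std::string> titles;
    DocumentEnumerator e = docs.NewEnum();       // get the enumerator
    const Document* doc = nullptr;
    std::size_t fetched = 0;
    while (e.Next(1, &doc, &fetched) == Ok)      // fetch one item at a time
        titles.push_back(doc->title);            // run the loop body
    return titles;
}
```

The collection never exposes its internal storage; the controller only ever asks the enumerator for the next item.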

You can choose to implement IEnumVARIANT on the same object as your collection's IDispatch, or you can create a separate object that only implements IUnknown and IEnumVARIANT. In either case, if you are using MFC, you will need to implement the IEnumVARIANT interface on some object. The following section illustrates how to do this. (Note that the discussion below applies to implementing any interface on an MFC CCmdTarget-derived class.)

Adding IEnumVARIANT to a CCmdTarget-derived class

MFC's OLE implementation uses nested classes to implement Component Object Model (COM) interfaces. All objects that expose COM interfaces in MFC have classes derived from the CCmdTarget class. CCmdTarget provides a rich implementation of IUnknown (and IDispatch, but the IDispatch implementation is only enabled if CCmdTarget::EnableAutomation is called) that allows for aggregation and other COM concepts.

To add a COM interface such as IEnumVARIANT to a CCmdTarget-derived class, first use the BEGIN_INTERFACE_PART and END_INTERFACE_PART macros in your class declaration as follows.

Note   Please look at the XRTFRAME sample application for a more complete example of implementing Automation collections in MFC.

class CDocuments : public CCmdTarget
{
...
  BEGIN_INTERFACE_PART(EnumVARIANT, IEnumVARIANT)
    STDMETHOD(Next)(THIS_ unsigned long celt, VARIANT FAR* rgvar, 
                        unsigned long FAR* pceltFetched);
    STDMETHOD(Skip)(THIS_ unsigned long celt) ;
    STDMETHOD(Reset)(THIS) ;
    STDMETHOD(Clone)(THIS_ IEnumVARIANT FAR* FAR* ppenum) ;
    XEnumVARIANT() ;        // constructor to set m_posCurrent
    POSITION m_posCurrent ; // Next() requires we keep track of our current item
  END_INTERFACE_PART(EnumVARIANT)    
  
  DECLARE_INTERFACE_MAP()
...
};

These macros are expanded out by the C++ preprocessor to the following:

class CDocuments : public CCmdTarget
{
...
  class FAR XEnumVARIANT : public IEnumVARIANT
  {
  public:
    STDMETHOD_(ULONG, AddRef)(); 
    STDMETHOD_(ULONG, Release)();
    STDMETHOD(QueryInterface)(REFIID iid, LPVOID far* ppvObj); 
    STDMETHOD(Next)(THIS_ unsigned long celt, VARIANT FAR* rgvar,
                  unsigned long FAR* pceltFetched);
    STDMETHOD(Skip)(THIS_ unsigned long celt) ;
    STDMETHOD(Reset)(THIS) ;
    STDMETHOD(Clone)(THIS_ IEnumVARIANT FAR* FAR* ppenum) ;
    XEnumVARIANT() ;        // constructor to set m_posCurrent
    POSITION m_posCurrent ; // Next() requires we keep track of our current item
  } m_xEnumVARIANT ;
  friend class XEnumVARIANT ;
  
  DECLARE_INTERFACE_MAP()
...
};

In other words, by using the INTERFACE_PART macros, you are adding a nested class to your class and you are declaring a member of that class (m_xEnumVARIANT). In the example above, you can see that a pointer to m_xEnumVARIANT is a pointer to an IEnumVARIANT interface.

You must declare an interface map in your class definition by using the DECLARE_INTERFACE_MAP() macro as shown above.

In your implementation file, you need to add the actual interface map data; this is accomplished with the BEGIN_INTERFACE_MAP, INTERFACE_PART, and END_INTERFACE_MAP macros:

BEGIN_INTERFACE_MAP(CDocuments, CCmdTarget)
    INTERFACE_PART(CDocuments, IID_IEnumVARIANT, EnumVARIANT)
END_INTERFACE_MAP()

The next step is to implement the member functions of CDocuments::XEnumVARIANT. The implementation found in the MFCOLL sample is given below.

CDocuments::XEnumVARIANT::XEnumVARIANT()
{    m_posCurrent = NULL ;  }

STDMETHODIMP_(ULONG) CDocuments::XEnumVARIANT::AddRef()
{   
    METHOD_PROLOGUE(CDocuments, EnumVARIANT)
    return pThis->ExternalAddRef() ;
}   

STDMETHODIMP_(ULONG) CDocuments::XEnumVARIANT::Release()
{   
    METHOD_PROLOGUE(CDocuments, EnumVARIANT)
    return pThis->ExternalRelease() ;
}   

STDMETHODIMP CDocuments::XEnumVARIANT::QueryInterface( REFIID iid, void FAR* FAR* ppvObj )
{   
    METHOD_PROLOGUE(CDocuments, EnumVARIANT)
    return (HRESULT)pThis->ExternalQueryInterface( (void FAR*)&iid, ppvObj) ;
}   

// IEnumVARIANT::Next
// 
STDMETHODIMP CDocuments::XEnumVARIANT::Next( ULONG celt, VARIANT FAR* rgvar, ULONG FAR* pceltFetched)
{
    // This sets up the "pThis" pointer so that it points to our
    // containing CDocuments instance
    //
    METHOD_PROLOGUE(CDocuments, EnumVARIANT)

    HRESULT hr;
    ULONG   l ;
    CDocument*  pItem = NULL ;
    POSITION pos = theApp.m_templateList.GetHeadPosition() ;
    CDocTemplate* pTemplate =(CDocTemplate*)theApp.m_templateList.GetNext(pos) ;

    // pceltFetched can legally == 0
    //                                           
    if (pceltFetched != NULL)
        *pceltFetched = 0;
    else if (celt > 1)
    {   
        return ResultFromScode( E_INVALIDARG ) ;   
    }

    for (l=0; l < celt; l++)
        VariantInit( &rgvar[l] ) ;

    // Retrieve the next celt elements.
    hr = NOERROR ;
    for (l = 0 ; m_posCurrent != NULL && celt != 0 ; l++)
    {   
        pItem = pTemplate->GetNextDoc( m_posCurrent ) ;
        celt-- ;
        if (pItem)
        {
            rgvar[l].vt = VT_DISPATCH ;
            rgvar[l].pdispVal = pItem->GetIDispatch( TRUE ) ;
            if (pceltFetched != NULL)
                (*pceltFetched)++ ;
        }
        else 
            return ResultFromScode( E_UNEXPECTED ) ;
    }
    
    if (celt != 0)
       hr = ResultFromScode( S_FALSE ) ;

    return hr ;
}

// IEnumVARIANT::Skip
//
STDMETHODIMP CDocuments::XEnumVARIANT::Skip(unsigned long celt) 
{
    METHOD_PROLOGUE(CDocuments, EnumVARIANT)
    POSITION pos = theApp.m_templateList.GetHeadPosition() ;
    CDocTemplate* pTemplate=(CDocTemplate*)theApp.m_templateList.GetNext(pos) ;

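    // Decrement celt only after an item has actually been skipped. Testing
    // "celt--" in the loop condition would wrap the unsigned count past zero
    // and report S_FALSE even when every requested item was skipped.
    while (m_posCurrent != NULL && celt != 0)
    {
        pTemplate->GetNextDoc( m_posCurrent ) ;
        celt-- ;
    }
    
    return (celt == 0 ? NOERROR : ResultFromScode( S_FALSE )) ;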
}

STDMETHODIMP CDocuments::XEnumVARIANT::Reset()
{
    METHOD_PROLOGUE(CDocuments, EnumVARIANT)
    POSITION pos = theApp.m_templateList.GetHeadPosition() ;
    CDocTemplate* pTemplate=(CDocTemplate*)theApp.m_templateList.GetNext(pos);
    m_posCurrent = pTemplate->GetFirstDocPosition() ;
    return NOERROR ;
}

STDMETHODIMP CDocuments::XEnumVARIANT::Clone(IEnumVARIANT FAR* FAR* ppenum) 
{
    METHOD_PROLOGUE(CDocuments, EnumVARIANT)   
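    CDocuments* p = new CDocuments ;
    if (p)
    {
        p->m_xEnumVARIANT.m_posCurrent = m_posCurrent ;
        // Hand the new enumerator back to the caller. The new CDocuments
        // starts with a reference count of 1, which the caller now owns.
        *ppenum = &p->m_xEnumVARIANT ;
        return NOERROR ;    
    }
    else
    {
        *ppenum = NULL ;
        return ResultFromScode( E_OUTOFMEMORY ) ;
    }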
}

The OLE Programmer's Reference (MSDN Library, Product Documentation, SDKs, OLE 2) does a more than adequate job of explaining how the IEnum family of interfaces works, so we won't go into great detail about how the above code works. However, there is one very interesting point that is worth mentioning. You'll note that the first statement in each member function is:

METHOD_PROLOGUE(CDocuments, EnumVARIANT)   

The METHOD_PROLOGUE macro facilitates getting a pointer to the containing class from within a nested class. The above example is expanded by the C++ preprocessor to:

CDocuments* pThis = ((CDocuments*)((BYTE*)this - 
                           offsetof(CDocuments, m_xEnumVARIANT))); 

After this code executes, pThis is pointing to the instance of the containing class. This is a handy way of being able to access a containing class's members from a nested class.
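
The same pointer arithmetic can be tried outside MFC. The sketch below is a portable illustration with hypothetical names (Outer, Inner, m_inner, OuterValue); it assumes a standard-layout containing class, for which offsetof is well defined, and it is not the MFC macro itself.

```cpp
#include <cstddef>

class Outer {
public:
    explicit Outer(int value) : m_value(value) {}

    int m_value;

    // Nested class with a member instance, as the INTERFACE_PART macros
    // would declare.
    class Inner {
    public:
        int OuterValue() const;
    } m_inner;
};

// Recovers the containing Outer from the address of the nested member by
// subtracting the member's offset, the same arithmetic that the
// METHOD_PROLOGUE expansion above performs.
int Outer::Inner::OuterValue() const {
    const Outer* pThis = reinterpret_cast<const Outer*>(
        reinterpret_cast<const unsigned char*>(this) - offsetof(Outer, m_inner));
    return pThis->m_value;
}
```

Because the nested object stores no back pointer, this technique adds no per-object overhead; it works only because the nested object is always a member of the containing class at a known offset.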

Collection Samples

The following examples illustrate the implementation of Automation collections: