Using OLE DB and ADO 2.5 to Represent Nonlinear, Nonrelational Types of Data -- MSJ, November 1999

This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.

November 1999

Using OLE DB and ADO 2.5 to Represent Nonlinear, Nonrelational Types of Data
Bob Beauchemin

OLE DB is a COM object model that is defined as a set of cotypes such as Data Source or Rowset. a cotype is expressed in pseudo-IDL and simply lists the set of mandatory and optional interfaces. OLE DB can be extended by adding a new cotype or refining the definition of an existing one.

This article assumes you're familiar with OLE DB

Bob Beauchemin works at DevelopMentor, teaching and developing the OLE DB/ADO and MTS/COM+ curriculum. Reach Bob at bobb@develop.com or http://www.develop.com/hp/bobb. (Microsoft Press, 1998).

OLE DB was introduced in 1996 as part of the Microsoft® Universal Data Access strategy. The Universal Data Access strategy can be summarized as follows: data should be accessible through a common API without changing its format, replicating it to a common data store, or sacrificing performance for generic functionality. The original OLE DB specification defined data access in terms of the relational DBMS (RDBMS) paradigm of rectangular rowsets of rows and columns. This was consistent with the preceding APIs that dealt with relational databases (such as ODBC, dbLib, and DAO), and made it easy for developers who were already familiar with an existing data access API to transition to the world of OLE DB.
     OLE DB is a COM object model that is defined as a set of cotypes such as Data Source or Rowset. The definition of a cotype is expressed in pseudo-IDL and simply lists the set of mandatory and optional interfaces. To allow extensibility, a provider can implement additional interfaces not specified in the cotype definition. To accommodate different types of data, OLE DB can be extended by adding a new cotype or refining the definition of an existing one. Dino Esposito's article, "Exposing Your Custom Data in a Standardized Way Through ADO and OLE DB" (MSJ, June 1999), covered the existing base cotypes and how to use them. He looked at OLE DB objects from both provider (OLE DB COM server) and consumer (OLE DB COM client) points of view.
     In this article, I'll discuss in detail the new cotypes introduced by OLE DB 2.5 that accommodate certain nonrelational types of data. I'll show you how to use these new cotypes and explain how they integrate with the existing OLE DB object model. While most of this article will focus on OLE DB-level access, the impact on ADO will be discussed as well. You'll see that as the OLE DB object model morphs to encompass different types of data, it comes closer to providing truly universal data access.

Extensions for Tree-structured Data
     Not all data fits neatly into rectangular rowsets. The first nonconforming data to challenge the rectangular rowset paradigm was hierarchical data—that is, data that includes parent and child relationships. This is the native data model of hierarchical DBMS systems (like IBM's IMS), object hierarchies in object-oriented databases (OODBMS), and others. Although this data can be shoehorned into the RDBMS rectangular rowset, performance problems are encountered as repeating fields in a join are transmitted with each joined row. This problem can occur even with RDBMS data, as shown in Figure 1.

Figure 1 Repeating Fields in a Join

     Figure 1 Repeating Fields in a Join

     The OLE DB 1.5 specification addressed this problem by introducing a new type of column called a chapter column. The idea was that individual levels of the hierarchy would be represented as separate rowsets and the chapter column would contain pointers from each parent row into the corresponding set of rows in the child rowset. Figure 2 shows such a hierarchy using chapters. At the same time that chapters were introduced, Microsoft included an OLE DB service provider—which is simply a provider that gets its data from other OLE DB providers rather than a data store—called the MSDataShape provider. This provider consumes sets of rectangular rowsets and produces chaptered hierarchies, which solved the hierarchical data model problem for the time being.

Figure 2 Chapter Hierarchy

     Figure 2 Chapter Hierarchy

     Next on the agenda was multidimensional data. Rectangular rowsets have two dimensions represented by rows and columns; multidimensional data is represented as a series of points that can be organized around many dimensions at once. Although this can be modeled as rows and columns, multidimensional query results often use more than two axes, and the performance penalty of modeling them as multiple rectangular rowsets is severe. OLE DB for OLAP (introduced as part of OLE DB 2.0) introduced the Dataset cotype that more closely represents how multidimensional data is structured and used. With the new Dataset cotype and a few other minor changes, multidimensional data was integrated into the OLE DB object model.
     The chapter paradigm is useful for modeling parent/child relationships in homogenous hierarchies, where the rows at a given level of the hierarchy contain the same structure and number of columns. It is not as effective for modeling heterogeneous tree-structured hierarchies where, at a given level, each node (row) may contain varying properties (columns). Examples of tree-structured hierarchies include file systems that contain different metadata (extended attributes) for each node, and data stores like Exchange where different types of data can be stored in different folder types.
     Tree-structured data stores have some unique requirements. First, each row can have a varying number of columns. There will often be some common columns and some extended columns. The ability to add columns to an individual row without affecting other rows in the same rowset is mandatory. Second, operations may be performed that affect every row at a given hierarchical level, for example, moving or renaming a directory in a file system (and every file in it). Third, each node consists of one main source data column (the file contents in a file system or the message in a message store), as well as common columns (file name, file size, and so on) and varying columns (extended attributes). Often the requirement is just to access the source data column directly, as in a request to open the file C:\myfile.txt and read its contents.
     To accommodate tree-structured data without the performance penalty of coercing it into the rectangular model (imagine a rowset containing all the possible extended attributes as columns), OLE DB 2.5 introduces two new cotypes, the Row and the Stream. The definition of these cotypes is shown in Figure 3. The Row cotype represents an individual row in a rowset (a single file in a directory of files or the result of a singleton SELECT statement in SQL). It can be fetched in the context of a rowset or by itself. a handful of new interfaces integrate the Row cotype into the existing OLE DB object design.
     The Stream cotype represents the source data column of such a row and contains familiar interfaces like IStream and ISequentialStream to access its contents. You can determine whether a provider supports row objects by checking the DBPROPVAL_OO_ROWOBJECT bit on the property DBPROP_OLEOBJECTS. The value of this property can be obtained by using the GetProperties method on the Data Source object's IDBProperties interface as shown here:
bool SupportsRow(CDataSource *pDataSource) { CComVariant varObj; pDataSource->GetProperty( DBPROPSET_DATASOURCEINFO, DBPROP_OLEOBJECTS, &varObj); return (varObj.intVal & DBPROPVAL_OO_ROWOBJECT); }
For simplicity, this code uses the ATL CDataSource object, which encapsulates the functionality of IDBProperties:: GetProperties.
     Often what needs to be accomplished is to access the row or default stream directly, so OLE DB 2.5 also introduces a new binding mechanism based on URLs. The URL binding mechanism allows you to use a URL as a connection string without knowing which provider services the URL you are requesting. Through this mechanism, it is also possible to ask for any specific OLE DB object type directly without traversing through all the OLE DB object hierarchy (as in Data Source creates Session, which creates Rowset, which creates Row, and so on). This is accomplished by means of a new provider cotype called the Binder and a new runtime component called the Root Binder.
     The first provider to take advantage of these new features is the OLE DB Provider for Internet Publishing, MSDAIPP. This provider works by communicating with a Web server through either the Web Extender Client (WEC) protocol of Microsoft FrontPage® or the Web Distributed Authoring and Versioning (WEB-DAV) protocol. This provider first shipped with Microsoft Internet Explorer 5.0 and also ships with Office 2000 (it's used in the implementation of Web Folders) and Windows® 2000. The Internet publishing provider has unique features geared toward team development, version control, and working while offline. I'll use this provider as my vehicle to explore OLE DB 2.5 extensions. While occasionally I will address some provider-specific behavior, the overall architecture can be generalized to other OLE DB 2.5-aware providers.

Inside the Row Object
     Before going into what you can do with a Row object (see Figure 4), I'm going to explain how you get one.

Figure 4 The Row Object

     Figure 4 The Row Object

As it turns out, there are many different ways to get a Row object—this is OLE DB, after all. In previous versions of OLE DB, Rowsets were typically acquired by calling IOpenRowset::OpenRowset on the Session object or by calling ICommand::Execute on the Command object. These two methods have been overloaded to now return either Rowsets or Rows. The OLE DB Session object is often used as a factory object for the Rowsets and Rows. The Session object's implementation of IOpenRowset::OpenRowset determines which cotype to return based on the requested interface. If a Row-specific interface is requested (for example, IRow), then a Row object is returned. If a Rowset-specific interface or an interface that is part of both Rowset and Row is specified, a Rowset object is returned instead. You can also force a Row object to be returned by setting the initial property set's DBPROP_IRow property to VARIANT_TRUE.
HRESULT GetSam(IOpenRowset *pSession, IRow **ppRow) { // call OpenRowset passing URL for table name // and asking for an IRow-specific pointer DBID idTable; idTable.eKind = DBKIND_NAME; idTable.uName.pwszName = L"http://bob.com/sam.htm"; return pSession->OpenRowset(NULL, &idTable, NULL, IID_IRow, 0, NULL, (IUnknown **)ppRow; }
     You can also execute a Command that returns a Row using ICommand::Execute. As with OpenRowset, you indicate your preference for a row by asking for an interface that is specific to the Row object. This would be useful in a SQL scenario for SELECT statements that return a single row. In some DBMSs, it is much more efficient to fetch a single row than to fetch a rowset containing a single row.
     Getting a Row via IOpenRowset::OpenRowset or ICommand::Execute is useful for singleton select-style scenarios. To accommodate collections of rows, the existing Rowset object can now provide access to individual rows either via the new Row object or via the traditional accessor-style binding. You can get a Row object from a rowset using the Rowset's IGetRow interface as shown in Figure 5.
     This works, but is an awful lot of overhead to bind to a single row, especially considering the fact that code for creating and initializing the initial Data Source and Session objects is not shown here. To optimize traversal into a provider's object hierarchy, OLE DB 2.5 allows you to do direct binding to the object of your choice, rather than always explicitly instantiating all of the intermediate objects in the hierarchy. Depending on how the provider is implemented, the cost of building the intermediate object model may be bypassed. OLE DB provides a Root Binder object that can be used to bind to a provider's object via its name. The following illustrates how to use the Root Binder to attach to a Row object from the MSDAIPP provider.
HRESULT GetSam3(IRow **ppRow) { CComPtr<IBindResource> pbr; hr = CoCreateInstance(CLSID_ROOT_BINDER, NULL, CLSCTX_INPROC_SERVER, IID_IBindResource, (void**)&pbr); // Bind to an IRow return pbr->Bind(NULL,L"http://bob.com/sam.htm", DBBINDURLFLAG_READWRITE, DBGUID_ROW, IID_IRow, NULL, NULL, NULL, (IUnknown **)ppRow); }
Much easier, yes? Now that you have a row, let's see what you can do with it.

Row Operations
     OLE DB providers that deal with rowsets read data through an accessor. Consumers ask the provider to create an accessor (using IAccessor::CreateAccessor) and receive a handle to it for use in subsequent calls against the rowset. Consumers are expected to set up an array of DBBINDING structures, one for each column, and use this as a parameter to CreateAccessor. Accessors and DBBINDINGs contain a single buffer for all the data to be fetched, and although they allow some nice data-fetching optimizations, they are tedious to set up just to retrieve a few column values. Column values from a row can be returned by using the IRow::GetColumn method, which uses a simpler DBCOLUMNACCESS structure instead of an accessor. Figure 6 shows this structure. Figure 7 shows how to set one up using the data from IColumnInfo::GetColumnInfo. Once you have this array of structures in place, it's a simple matter to call GetColumns to get column information (value, status, and length). You can also set column values using the same array with the SetColumns method. New extended columns may be added to an individual row by using IRowSchemaChange::AddColumns, as shown in Figure 8.
     Since the point of using DBCOLUMNACCESS structures rather than accessors is to squeeze every bit of performance out of the data-fetching mechanism, OLE DB 2.5 introduces two other optimizations. In addition to fetching only certain columns rather than all of the columns (this is available with accessors as well), there are options on the BINDURLFLAGS that are hints to the provider to delay fetching either the default stream or the columns themselves.
     The last optimization concerns the column information. IColumnsInfo gets information about all the columns. When there are many columns you may not even want information about all of them. The semantics of IColumnsInfo have been refined in OLE DB 2.5 so that it is up to the provider to determine whether IColumnsInfo::GetColumnInfo returns information about all columns or only columns in the base schema (the resource rowset columns in the case of Document Source providers). In addition, IColumnsInfo2 is a new interface that contains a method called GetRestrictedColumnsInfo, which allows you to get information about a subset of columns. This is an optional interface on the Rowset and Row objects.
     So far, so good. If you have a row, you can get and set the column data, and even add new columns and delete columns. But what if you don't have a row and want to create one from scratch? For this you can use the aptly named ICreateRow method. ICreateRow is available on the Session object, the Row object (provided that the data source points to a folder), or the Binder object. Based on the values of a bitmask called DBBINDURLFLAG (see Figure 9), you can use it to create a new document or folder, or even overwrite an existing document. It works as shown in Figure 10.
     It's good to be able to create new rows (documents) and read information about existing rows, but most of the time what you really want is to be able to read and write the contents, the stuff in the document. OLE DB 2.5 provides a new cotype for this, the Stream. a Row can contain 0 to n Stream columns. The provider is allowed to designate one as the default Stream. The default Stream can be accessed using a shortcut DBID, DBROWCOL_DEFAULTSTREAM. Stream columns can be accessed through a method (IRow:: Open) on the Row object. For Document Source providers, like the OLE DB Provider for Internet Publishing, the size of the default Stream is available as a column in the resource rowset (RESOURCE_STREAMSIZE).
     As shown in Figure 11, the OLE DB Stream cotype has a single mandatory interface (ISequentialStream) and six optional interfaces.

Figure 11 The Stream Object

     Figure 11 The Stream Object

While IStream is an optional interface, it is required if the Stream object is to be usable by ADO. To get a Stream object from the Row object, read its contents, and print the contents as a string, you can use the following code. The buffer is set up with ANSI documents. Stream objects can be ANSI or Unicode.
HRESULT ReadTheStream(IRow *pRow) { HRESULT hr; CComPtr<IStream> pStm; DBID idStream = DBROWCOL_DEFAULTSTREAM; hr = pr->Open(NULL, &idStream, DBGUID_STREAM, 0, IID_IStream, (IUnknown **)&pStr); if (SUCCEEDED(hr)) { // Read the first 200 bytes CHAR szContents[200]; ULONG cbRead; memset(szContents, 0, 200); hr = pStm->Read((void *) szContents, 200, &cbRead); if (SUCCEEDED(hr)) printf("Stream contents are: %s\n", szContents); } return hr; }
Folder Operations
     In a tree-structured data source, some rows may represent terminal leaf nodes in the tree, while other rows may represent intermediate nodes with subrows. Because this model maps onto so many common scenarios (such as Web sites, file systems, and MAPI message stores), OLE DB formalizes this concept by characterizing such providers as Document Sources. a Document Source provider returns rows with a set of columns known as the Resource Rowset columns. These are required columns and are listed in Figure 12. The value of the column RESOURCE_ISCOLLECTION field will be set to VARIANT_TRUE if the row represents a folder. The Internet publishing provider (MSDAIPP) is a Document Source provider.
     Given the nature of a Document Source provider, you would like to be able to do operations like rename and move an entire folder tree, as well as list the documents contained in a folder. The easiest way to list documents is to open a rowset on a Session object that points to a folder. All the documents and subfolders in that folder will be returned as rows in the rowset.
// get IOpenRowset interface on Session object • • • // call OpenRowset passing URL for table name // ask for a rowset DBID idTable; idTable.eKind = DBKIND_NAME; idTable.uName.pwszName = L"http://bob.com/myfolder/"; CComPtr<IRowset> pRowset; pOpenRowset->OpenRowset(NULL, &idTable, NULL, IID_IRowset, 0, NULL, (IUnknown **)&pRowset);
     If you happen to have an existing row that points to a folder, you can enumerate its children using the IScopedOperations interface. IScopedOperations allows you to perform operations on all the child rows in one call. The correct way to list folder contents is by calling IScopedOperations::OpenRowset.
HRESULT GetMyChildren( IRow *pRow, IRowset **ppRowset) { HRESULT hr; CComPtr<IScopedOperations> pRowScops; // pRow must point to // a row that's a folder hr = pRow->QueryInterface(IID_IScopedOperations, (void **)&pRowScops); if (SUCCEEDED(hr)) hr = pRowScops->OpenRowset(NULL, NULL, NULL, IID_IRowset, 0, NULL, (IUnknown **)ppRowset); return hr; }
     The Copy, Move, and Delete methods do what you would expect, but take a special bitmask (dwMoveFlags is shown as an example in Figure 13). This allows fine-grained control over behaviors such as whether links will be updated and whether moving or copying all the existing folders and documents will be done as an atomic operation. Since these methods perform multiple operations, Copy, Move, and Delete return arrays of status fields that allow you to determine exactly what happened. You can determine whether a provider supports scoped operations against Row objects by checking the DBPROPVAL_OO_SCOPED bit on the property DBPROP_ OLEOBJECTS.
Direct Binding
     For a given OLE DB provider, usually only the Data Source and (optional) Enumerator objects can be instantiated directly. The Session, Command, Rowset, and other objects must be created by factory methods, such as IDBCreateSession::CreateSession, which creates a Session object from a Data Source object. For example, to get the default stream contents for a given document, I'd have to do the following:
Instantiate the MSDAINITIALIZE factory object.
Call IDataInitialize::Initialize to get a Data Source object, then initialize it.
QueryInterface for IDBCreateSession and use it to create a Session object, getting an IOpenRowset pointer.
Use IOpenRowset to create a Row object and get its IRow pointer.
Call IRow::Open to open the default stream and get a
      pointer to its ISequentialStream interface.
Use IStream::Read to read the stream (file).
     Whew! That's a lot of work just to open and read a file. The developers of OLE DB 2.5 must have thought so too, as they invented an easier way, known as direct binding. The concept is that with a single URL the OLE DB runtime should be able to figure out what provider you need, invoke the correct provider, and ask it to get the specific object you desire. It does this through another new cotype called the Binder (see Figure 14).

Figure 14 The Binder Object

     Figure 14 The Binder Object

     There are two types of Binder object, one that is implemented by your provider and one that is provided by the OLE DB runtime, called the Root Binder. Similar in concept to the root enumerator (which simply lists all the providers and provider-specific enumerators that are registered), the Root Binder has the smarts to look at a URL, determine which provider services that URL, and call that provider's Binder object. The Root Binder passes along the URL (and any properties you give to it), and it is up to the provider's Binder object to retrieve a pointer to the specified interface on the object of your choice. In this release, the Root Binder knows which provider to call by looking for mappings in the registry under HKLM\Software\Microsoft\DataAccess\RootBinder. This registry key contains mappings of URL prefixes to providers. Providers claim their part of the URL namespace by calling IRegisterProvider::SetURLMapping during provider installation. Currently, each prefix can only be registered to a single provider.
     The OLE DB Provider for Internet Publishing registers three prefixes: http and https as you'd expect, and another prefix, msdaipp, which it uses for internal administrative work. For example, the URL msdaipp://editedoffline returns a rowset of items that have been edited offline. The Internet publishing provider can use this list to synchronize documents between offline storage and a Web server through a special interface that it implements on the Row object, ISynchronizeRow. Your provider registers the URL prefixes it wants to support by instantiating a Root Binder object and calling IRegisterProvider::SetURLMapping.
     Let's see how direct binding works in code. You've already seen an example of how to use the Root Binder to produce a Row object. Here, I'll use the Root Binder to directly instantiate a Stream object for the same document:
HRESULT GetSam4(IStream **ppStm) { CComPtr<IBindResource> pbr; CoCreateInstance(CLSID_ROOT_BINDER, NULL, CLSCTX_INPROC_SERVER, IID_IBindResource, (void**)&pbr); // Bind to an IStream return hr = pbr->Bind(NULL, L"http://bob.com/sam.htm", DBBINDURLFLAG_READWRITE, DBGUID_STREAM, IID_IStream, NULL, NULL, NULL, (IUnknown **)ppStm); }
Note that the only difference here is that I'm asking for a Stream object instead of a Row object. You can also use the Root Binder to create a new row with ICreateRow:
HRESULT CreateSteve(IRow **ppRow) { HRESULT hr; CComPtr<ICreateRow> pCreateRow; CoCreateInstance(CLSID_ROOT_BINDER, NULL, CLSCTX_INPROC_SERVER, IID_ICreateRow, (void**)&pCreateRow); // Create a new Row LPOLESTR *ppwszNewURL = NULL; return pCreateRow->CreateRow(NULL, L"http://bob.com/steve.htm", DBBINDURLFLAG_READWRITE, DBGUID_ROW, IID_IRow, NULL, NULL, NULL, ppwszNewURL, (IUnknown **)ppRow); }
     After reading the documentation, I originally thought that the OLE DB object type that you wanted to create was always derived from the URL. An example would be the URL bobsprovider://myspecial-row/row, with the trailing /row being the name of the object the user wanted. The Internet publishing provider does not do this, but if a provider writer had a reason for wanting that behavior, such a URL could be handled by the provider's Binder (it is the provider's implementation choice, after all).
     Providers may be required to implement and register Binder objects to be considered OLE DB 2.5-compliant. In the registry on my machine, the only other provider that appears to have implemented a Binder object is the Active Directory™ Services Interface (ADSI) provider, which has registered the prefixes for ldap, winnt, nds, and nwcompat. You can determine if a provider supports direct binding by checking the DBPROPVAL_OO_DIRECTBIND bit on the property DBPROP_OLEOBJECTS, using IDBProperties::GetProperties.

Other OLE DB 2.5 Enhancements
     In addition to the new Row, Stream, and Binder cotypes and their new interfaces, interfaces have been added to other OLE DB objects to support and interoperate with the new objects. a complete list of the new cotypes and new interfaces on existing cotypes is provided in Figure 3. One new type of provider is the Document Source provider. You can find out if a provider is a Document Source provider by checking the DBPROP_DATASOURCE_TYPE property for the value DBPROPVAL_DSTDOCSOURCE.
     OLE DB 2.5 contains enhancements to the OLE DB runtime, namely the service components and Data Links. The session pooling service has been enhanced to permit specification of the session timeout value—that is, the interval at which the pool manager performs garbage collection on the pool. Sessions that are still unused since the last garbage collection interval are released from the pool. This value is configurable on a per-provider basis and is specified as an SPTimeout DWORD value under the provider's CLSID. The default is 60 seconds. The retry wait time that the pooler uses to wait between connection attempts is also configurable. This can be specified in the registry under HKLM\Software\Microsoft\DataAccess\ Session Pooling\SPTimeout as a DWORD value. The default is 60 seconds. The whitepaper "Pooling in the Microsoft Data Access Components," by Don Willits and Leland Ahlbeck (http://msdn.microsoft.com/library/techart/pooling2.htm), describes the behaviors these timeouts control in detail.
     The Data Links component has received a number of enhancements. There are new features that make Data Links more configurable and less confusing for users. Some new dwPromptOptions flags in the PromptDataSource method of IDBPromptInitialize allow you to disable provider selection and disallow or track saving the password in the .UDL file. a provider filter may be specified so that the user sees only multidimensional providers, nonmultidimensional providers, or enumerators. Finally, you can filter the providers or .UDL files visible in the Data Links provider selection list based on an array of provider strings.
     Providers are also allowed to customize the Connection and Advanced tabs of the Data Links property sheet. To do this, the provider implements property pages for both tabs, registers their CLSIDs, and supports the IServiceProvider interface on the Data Source object. When the provider is selected, the Data Links component will QueryInterface for IServiceProvider, then call IServiceProvider::QueryService, specifying the SVC_DSLPropertyPages service and requesting the IID_ ISpecifyPropertyPages interface. If this succeeds, the Data Links component calls ISpecifyPropertyPages::GetPages to retrieve the array of property page CLSIDs. The provider and Data Links component then communicate through the IPropertyPage, IPropertyPage2, IPropertyPageSite, IPropertyBag, and IPersistPropertyBag interfaces. The default behavior is to use the standard property pages.
     Although these new options make OLE DB Data Links easier to use from a programmer's perspective, the New | Microsoft Data Link choice has been removed from all the context menus (it was available from all of them in Windows NT 4.0). There is no Data Links control panel applet, and Data Links are not available from the Administrative Tools menu, as is the ODBC Data Sources Administrator. There will be a registry script that can be run by developers or as part of a product installation to reinstate this OLE DB 2.1 behavior.

ADO 2.5
     All this talk of new features in OLE DB API terms is useful for those C++ programmers and provider writers who use the native API, but most programmers access OLE DB functionality through ADO from Visual Basic® or script. Because of this, ADO has been revised in tandem with OLE DB to support many of these new features.
     The first thing you will notice about ADO 2.5 is that there is no direct exposure of the Root Binder. Programmers using ADO can specify the appropriate provider and the URL as a connection string and call Connection.Open as before, but the much preferred method is to use URL= in the connection string. This invokes the Root Binder and allows it to invoke the correct provider, based on the URL. Here's an example of establishing a connection via the Root Binder and printing the resulting connection string.
Dim conn as ADODB.Connection Set conn = New ADODB.Connection conn.Open "URL=http://bob.com/" Debug.Print conn.ConnectionString
The previous code prints this:
Provider=MSDAIPP.DSO.1; Data Source= http://bob.com/; Bind Flags=0; Mode=Read; Lock Owner=""; Locale Identifier=1033; User ID=""; Password=""; Ignore Cached Data=False; Cache Aggressively=False; Treat As Offline=False; Mark For Offline=0; Protocol Provider={00000000-0000-0000-0000-000000000000}
     The Command or Recordset objects can still be used to open an ADO Recordset as in previous versions. When using the WEB-DAV protocol, the Internet publishing provider uses its own dialect of SQL, which supports commands like "select * from scope() where RESOURCE_ STREAMSIZE > 2048". When not using the WEB_DAV protocol, a command string is required (it cannot be blank), but any command string results in a Recordset containing all the rows. Here are two examples that get a Recordset containing the contents of a folder:
Dim strConnect as String strConnect = "URL=http://bob.com/" Dim cmd As ADODB.Command Set cmd = New ADODB.Command cmd.ActiveConnection = strConnect cmd.CommandText = "select * from scope()" Dim rs1 As ADODB.Recordset Set rs1 = cmd.Execute Dim rs2 As ADODB.Recordset Set rs2 = New ADODB.Recordset rs2.Open "select * from scope()", strConnect
     The URL must refer to a folder (directory); it will not work when pointed at an individual document—just as you cannot get an IRowset pointer to an individual document through the Root Binder in OLE DB.
     ADO defines new Record and Stream objects that encapsulate most of the functionality of the OLE DB Row and Stream cotypes. When you get a Record object, it is not necessary to know if you are pointing at a folder or a document, but you can find out by looking at the value of the RESOURCE_ISCOLLECTION field. If you have a Recordset (folder) object, you can get a Record object (representing a folder or document) directly from the current row in a Recordset. This is the equivalent of the OLE DB IGetRow:: GetRowFromHROW.
Dim strConnect as String strConnect = "URL=http://bob.com/" Dim rs As ADODB.Recordset Set rs = New ADODB.Recordset rs.Open "select * from scope()", strConnect Dim rec As ADODB.Record Set rec = New ADODB.Record rec.Open rs 'Get a Record object 'corresponding to the current row 'in the Recordset
You can also open a Record object by using the Value of the RESOURCE_ABSOLUTEPARSENAME field of the current record, like this:

Dim strConnect as String strConnect = "URL=http://bob.com/" Dim rs As ADODB.Recordset Set rs = New ADODB.Recordset rs.Open "select * from scope()", strConnect Dim rec As ADODB.Record Set rec = New ADODB.Record rec.Open rs("RESOURCE_ABSOLUTEPARSENAME").Value
     That's the hard way. If you know the name of the document that you want to access, you don't have to go through the Recordset. Using MSDAIPP, Record.Open will work with a string literal specified as the source. (Make sure that it's not prefixed by URL=.)
Dim rec As ADODB.Record Set rec = New ADODB.Record rec.Open "http://bob.com/sam.htm"
     You can also use direct binding and the URL= syntax in the connection string to invoke the Root Binder when using Record.Open, just like when using Recordset.Open and Command.Execute. Note the subtle difference between specifying the connection string using the URL= syntax (the second parameter to Record.Open) and specifying the name of the folder or document as the source (the first parameter, no URL=) on Record.Open. Columns are exposed through the Fields collection, as usual:
Dim strURL As String Dim rec As ADODB.Record Set rec = New ADODB.Record strURL = "URL=http://bob.com/sam.htm" rec.Open , strURL, adModeReadWrite, & _ adOpenIfExists, adOpenURLBind For i = 0 To rec.Fields.Count - 1 Debug.Print rec.Fields(i).Name & " = " & _ rec.Fields(i).Value Next i
This produces the output in Figure 15 as you would expect.
     New Record objects corresponding to new folders or documents can be created by using Record.Open. You specify the source name, the connection string, and a special flag that indicates whether you would like to create a folder (collection) or document (noncollection).
Dim conn As ADODB.Connection Set conn = New ADODB.Connection conn.Open "URL=http://bob.com/" Dim rec As ADODB.Record Dim rec2 As ADODB.Record Set rec = New ADODB.Record Set rec2 = New ADODB.Record ' create a new folder under the folder specified ' in the current connection rec.Open "newdir", conn, adModeWrite, & _ adCreateCollection ' create a new document under the folder specified ' in the current connection rec2.Open "newpage.htm", conn, adModeWrite, & _ adCreateNonCollection
     You can also bind to the Stream. The ADO object can read and write ASCII, Unicode, or binary streams. Unicode is the default. The only limitation is that, in order to be usable by ADO, the provider writer must include support for IStream, which is an optional interface on the OLE DB Stream object.
     Here's how to open and write and read from a stream with ADO. The following code reads and writes a Unicode stream:
Dim strURL As String Dim stm As ADODB.Stream
Set stm = New ADODB.Stream strURL = "URL= http://bob.com/ sam.htm" stm.Open strURL, adModeReadWrite stm.WriteText "Hello, World!", adWriteLine stm.Flush stm.Close stm.Open strURL, adModeReadWrite, adOpenStreamFromURL mytext = stm.ReadText() Debug.Print mytext stm.Close
To use ASCII streams, set the Stream object's charset variable to ASCII. To read and write binary streams, use Stream.Read and Stream.Write rather than the text versions.

ADO and Scoped Operations
     ADO 2.5 includes support for scoped operations, although the term scoped operations is never mentioned. It's implemented through methods on the Record object. OpenRecord, MoveRecord, and DeleteRecord work as you would expect, whether the record is a single document or a folder (see Figure 16).
     The Record object also has a GetChildren method. GetChildren is an implementation of IScopedOperations::OpenRowset, which returns a recordset of the folder's documents:
Dim rec As New ADODB.Record Dim rs As ADODB.Recordset rec.Open "http://bob.com/" Set rs = rec.GetChildren While Not rs.EOF Debug.Print rs(0).Name & " = " & rs(0).Value rs.MoveNext Wend
     There is one other limitation of using ADO with the Internet publishing provider. Because synchronization between documents that are updated in offline mode and documents on the Web server is done through ISynchronizeRow, an extended interface (an interface defined outside of the standard interfaces for the specified cotype in the OLE DB specification), this functionality cannot be programmed using ADO.

Summary
     You've seen how OLE DB 2.5 contains extensions to efficiently model tree-structured data and how to use these extensions in native OLE DB and ADO. These new refinements to the OLE DB object model provide performance-oriented mappings for relational data, homogeneous hierarchies, multidimensional data, and tree-structure data, adding more credibility to the "Universal" in Universal Data Access.

For related information see: Microsoft OLE DB Overview in the MSDN Online Library: http://msdn.microsoft.com/library/techart/adosql_1.htm.
Also check http://msdn.microsoft.com for daily updates on developer programs, resources and events.

From the November 1999 issue of Microsoft Systems Journal.