Remote Data Service in MDAC 2.0

Kamaljit Bath
Program Manager
Microsoft Data Access Group

July 1998

Summary: Explains the available OLE DB remoting solutions and how they can help developers write scalable distributed solutions. (19 printed pages) Covers:

The Microsoft® Universal Data Access strategy is based on data sources that expose their data through OLE DB providers, and applications that use ADO or OLE DB to access the data in a uniform way. As more mature applications are written to this paradigm, the need arises to access OLE DB providers from applications that run in separate processes on the same machine or on separate machines. Remote Data Service (RDS), formerly known as ADC, has provided this functionality since the early days of Universal Data Access strategy.

This article describes the OLE DB remoting solutions available today and how developers can take advantage of these to write scalable distributed applications.

Introduction

OLE DB remoting enables applications, written to consume data from OLE DB providers, to work with remote OLE DB providers. To preserve security and robustness, it may not be preferable to run applications in the same process as the OLE DB providers they are accessing. And it is neither possible nor feasible to install OLE DB providers on all client machines, especially in today's world of distributed applications and Internet commerce. Three-tier distributed computing also requires most application logic to stay at the middle tier while the client components handle just the local validation and presentation of the data.

Remote Data Service is a component that ships with the Microsoft Data Access Components (MDAC) version 2.0. RDS enables applications to access OLE DB providers, running on remote machines or in separate processes on the same machine. RDS enables a new generation of distributed applications: Components hosted in the Internet Explorer (3.0 or later) environment can get access to live data, which comes to the client from Web servers. It brings the power of the Microsoft ActiveX® Data Objects (ADO) object model to HTML pages by using Microsoft Visual Basic®, Scripting Edition (VBScript) or Microsoft JScript™.

Problem Space

Applications hosted on Internet browsers typically use scripting routines or Java applets to interact with user actions or with data embedded inside the HTML page. With Visual Basic 5.0 or later, client applications can be built to run inside the browser and to access data sources on the server directly, rather than going through scripting. Examples of such applications are ActiveX Controls and active documents.

A typical scenario for dynamically generated Web pages looks like this:

  1. The user enters the address for the page/site from which the user wants to access data.

  2. The user request is translated into a Hypertext Transfer Protocol (HTTP) request by the browser and sent to the Web server.

    The address typically points to a Common Gateway Interface (CGI)-style application on the server or to an Active Server Page (ASP). An ASP is a specific form of Internet Server API (ISAPI) application available to the servers hosting Microsoft Internet Information Server (IIS).

  3. The server application makes a connection to the data store and creates a resultset that matches some parameters included in the initial request from the user.

  4. The resultset is converted into an HTML table that is embedded in the HTML page to be sent to the user.

  5. The user receives an HTML page that contains dynamically generated data. The page may also contain some HTML input controls that can receive some data from the user.

  6. The user submits the newly entered data by using the Submit button on the page.

This approach works very well when all programming logic must stay on the server. The HTML page sent to the user contains script code to control user actions and to perform local validations. This approach is used in most Internet applications where users cannot be trusted with live data and where there is no control on the browser version or brand the user runs.

In intranets, the environment is much better defined. Because most enterprises standardize on software components such as browsers, it is becoming common to write applications that make use of specific features of software components available on client machines. Therefore, in shops where Internet Explorer is in use, HTML pages make extensive use of ActiveX Controls. RDS provides a much better client experience for such shops and also allows a new breed of distributed applications to be created by its special implementation of OLE DB rowset marshaling.

With RDS, the HTML page at the client receives a disconnected ADO recordset that can be manipulated by the script code. RDS also enables client script to invoke methods on middle-tier Automation components in an efficient manner. This is especially important when the clients connect to servers by using HTTP.

RDS Architecture

The previous section outlined the steps in a typical scenario where the client interacts with a dynamically generated Web page. When RDS is involved, the same process can be described as follows:

  1. The user enters the address for the page/site from which the user wants to access data.

  2. The user request is translated into an HTTP request by the browser and is sent to the Web server.

  3. The user is returned an HTML page in which the RDS data control is embedded. This page may also contain some server-side script (ASP), which is executed before the page is returned.

  4. The user enters all the parameters needed for the database search. The RDS data control communicates with the server to process the user request. RDS uses standard network protocols, such as HTTP and DCOM, to communicate with the server.

  5. RDS components on the server execute the user request and typically create an ADO recordset as the result. This recordset is marshaled, by value, to the client.

  6. On the client machine, the marshaled recordset is populated and re-created inside the cursor engine component of RDS, and then embedded into the HTML page. The recordset can be directly bound to DHTML and other controls on the page, and it can also be manipulated directly by any script code present in the page.

  7. Updates made to the recordset can be directly applied to the data source on the server by the RDS data control.

The main benefit of using RDS is that the client can work directly with the advanced object model of ADO rather than a relatively static HTML table. Web page developers can take advantage of their Visual Basic skills by implementing identical data access logic. The code for working with the ADO recordset is exactly the same regardless of the language and environment where it is executed.

Figure 1.

Client applications (or HTML pages that use RDS components) can use ADO or the RDS data control to work with remote OLE DB providers. All the information needed to create the recordset is available either from the connection string and command string properties of ADO recordsets, or from similar properties on the RDS data control.

These properties are converted into an HTTP request by the RDS.DataSpace object and sent to the server. At the server, the RDS component ADISAPI maps the HTTP request into a method invocation call on the RDS.DataFactory object. RDS.DataFactory is an Automation server that creates the requested recordset by invoking the specified OLE DB provider directly. The resulting recordset is marshaled by streaming into a binary stream by the OLE DB Persistence provider.

At the client, the response to the HTTP request contains a binary stream, which is handed over to the OLE DB Persistence provider. The OLE DB Persistence provider  creates a forward-only rowset and invokes the cursor engine component. The cursor engine component buffers all the data in its memory buffers and exposes it as a disconnected recordset with static-cursor implementation. This is the recordset that is handed to the client application. Consumer applications can also directly work with the RDS.DataSpace object and use it to invoke Automation components other than the default RDS.DataFactory object.

Figure 2.

RDS components include the following:

Figure 3.

Usage Scenarios

As explained in previous sections, RDS can be used to remote recordsets across process/machine boundaries. One major factor must be noted here: The remoted recordset does not exhibit exactly the same behavior as the original recordset. The consumer application on the client gets a recordset implemented by the client-side cursor engine. All the data and metadata associated with the original recordset are remoted, but only a subset of underlying rowset properties and interfaces are remoted. So if the original recordset supports a specific nonstandard interface, the remoted recordset will not support it.

The RDS client-side recordset has these characteristics and properties:

The following OLE DB rowset properties are remoted.

Table 1.

Property Description
DBPROP_IROWSETUPDATE Controls whether the remoted recordset should be batch-updatable.
DBPROP_IROWSETCHANGE Controls whether the remoted recordset should be updatable.
DBPROP_ADC_BATCHSIZE CE-specific. Controls the number of rows that can be batched together when the rowset is being updated by the CE.
DBPROP_ADC_UPDATECRITERIA CE-specific. Specifies the update criteria (PKs, time stamps, modified columns, and so on) to be used for OCC-style batch updating by the CE when updating the recordset.
DBPROP_COMMANDTIMEOUT Standard OLE DB property. Specifies the time after which the command should be timed out.
DBPROP_MAXROWS Standard OLE DB property. Specifies the maximum number of rows that can be returned in a rowset.
DBPROP_ASYNCHFETCHSIZE CE-specific. Used by the consumers of remoted recordsets for asynchronous operations.
DBPROP_ASYNCHPREFETCHSIZE CE-specific. Used by the consumers of remoted recordsets for asynchronous operations.
DBPROP_ASYNCHTHREADPRIORITY CE-specific. Used by the consumers of remoted recordsets for asynchronous operations.

There are two ways to create remoted recordsets: by working directly with a remote OLE DB provider (implicit remoting); and by using the RDS.DataSpace object to invoke middle-tier COM Automation components.

Implicit Remoting

In this scenario, RDS is used to create client/server–style applications that can run over the intranet or Internet. The client application works directly with a data source: All the logic resides either in the client application or inside the data source (such as stored procedures or triggers).

With the help of RDS, you can create applications that run over wide area networks where the client communicates to the server through standard network protocols such as HTTP and DCOM. The advantage is that all connection-related operations—such as OLE DB providers and ODBC drivers—or database-specific operations—such as DBLIBs and SQL*NET—can stay on the server. Clients can host ADO and RDS client dynamic-link libraries (DLLs) only (which ship with Microsoft Internet Explorer 4.0 and later). There is no proprietary transport protocol needed between the client and the server. RDS uses HTTP and DCOM.

There are two ways to create remoted recordsets by using the implicit remoting paradigm: invoke RDS through ADO code; and use the RDS.DataControl object.

Invoking RDS through ADO code

The following ADO code can be used anywhere ADO can run (such as VBScript, JScript, Visual Basic, Visual C++, and Visual J++). However, there can be subtle differences, depending on the host language. On the client, you need to install MDAC 2.0 client components (available as a .cab file for run-time download from a Web site or installed with Internet Explorer 4.0 and later) along with the run-time environment for the tool used to build the application.The following code snippet shows how to open ADO recordsets created by OLE DB providers running on remote machines. The connection string used to open the ADO connection object states that it wants the OLE DB Remote provider (see Footnote) to be used. ADO treats this tag in a special manner by using the RDS.DataSpace component to send an HTTP post request to the specified Web server. The server where the OLE DB provider is installed is identified by the tag "Remote Server=". The "Remote Provider=" tag is used to specify the OLE DB provider to be used to create the recordset at the specified server. If this tag is not specified, the OLE DB Provider for ODBC (MSDASQL) is used by default. The rest of the connection string contains all the information needed by the provider to create the recordset.

Dim RS As New ADODB.Recordset
Dim conn As New ADODB.Connection
            
' specify to use remote provider, and rest of the connection info
conn.Open _ 
"Provider=MS Remote;Remote Server=http://MyServer;Remote 
          Provider=SQLOLEDB;database=pubs;"
_
Set RS.ActiveConnection = conn
' specify query to execute
RS.Source = "Select * from authors"

RS.Open

Using the RDS.DataControl object

When the Refresh method is called on the RDS.DataControl object, the specified connection string (Connect property) and command string (SQL property) are used to create a recordset on the server. This recordset is marshaled to the client side, implemented inside the cursor engine, and set to the Recordset property of the data control. The RDS.DataControl object can be set to the Datasrc property of DHTML and other controls to bind the recordset directly to these controls.

RDS.DataControl can also be used to create recordsets from remote OLE DB providers, as shown in the last section. RDS.DataControl can be created by using the following Visual Basic code (or equivalent Visual C++ or Visual J++ code):

Dim RDSDc As New RDS.Datacontrol

RDS.DataControl can also be used inside HTML pages by embedding it using the <object> tag, as shown in the following code:

<OBJECT CLASSID="clsid:BD96C556-65A3-11D0-983A-00C04FC29E33" ID="DataControl" 
<PARAM NAME="Connect" VALUE="Provider=SQLOLEDB;Database=pubs;"> 
<PARAM NAME="Server" VALUE="http://Myserver"> 
<PARAM NAME="SQL" VALUE="Select * from authors"> 
</OBJECT> 

Enhancing the implicit remoting functionality

As explained above, implicit remoting functionality uses the RDS.DataFactory object to open and update recordsets. This is a very easy way to get up and running with an RDS application, but it does not create robust, scalable applications.

Typically, some validation checks are needed at the server before recordsets can be opened or updated. Some server administrators also want to control the capabilities of the RDS.DataFactory object by allowing it to access underlying data sources based on user authentication.

The RDS custom handler enables RDS.DataFactory functionality to be customized for such needs. A custom handler is a COM object that implements the IDataFactoryHandler interface and is installed on the same machine as the RDS.DataFactory object. More than one handler can be installed on the same machine. The handler objects can intercept calls made on RDS.DataFactory methods for opening and updating recordsets. Developers can implement code inside the handler objects that can validate user actions or control access to underlying data sources by modifying the connection and/or command strings. RDS 2.0 installs a default handler whose behavior is driven by an .ini file installed in the Windows directory. This can be used by server administrators to secure the data sources from Internet clients when they use RDS to access data.

Client applications must use the "Handler=" substring in the connection strings that open recordsets when a custom handler is being used on the server. Servers can be configured to always require a handler when working with the RDS.DataFactory object. Otherwise, the use of a handler is optional.

The following code shows a typical ADO connection string when the default handler, msdfmap.handler, is used:

Dim RS As New ADODB.Recordset
Dim conn As New ADODB.Connection
            
conn.Open _
"Provider=MS Remote;Remote Server=http://MyServer;Remote 
         Provider=SQLOLEDB;Handler=msdfmap.handler,msdfmap.ini;DSN=pubs;"

Set RS.ActiveConnection = conn
RS.Source = "Select * from authors"

RS.Open

For more information on using the custom handler feature, see the Data Access Web site at www.microsoft.com/data/.

Explicit Remoting

RDS can also be used to create distributed, three-tier applications. The main difference here (as compared to implicit remoting used by two-tier client/server–style applications) is that the application logic is divided into three distinct tiers. The client components of the distributed application typically contain all the presentation-related logic and local validation checks for user input and actions. Components hosted at the middle tier typically contain the business logic, integrated validation checks, and all the code to work with underlying data sources. The third tier is composed of stored procedures and triggers residing inside the data source.

There are several advantages with using the three-tier approach:

The RDS.DataSpace object can be used to invoke middle-tier components by the client-side code. Middle-tier components that can be invoked by using the RDS.DataSpace object must meet the following criteria:

The following code snippet shows how the RDS.DataSpace object can be used to invoke middle-tier Automation components. The snippet shows how RDS.DataSpace can invoke the GetRecordset method on a COM component, identified by PROGID "MyCustObj.ProgId", running on Myserver, to which the client connects over an HTTP network. The GetRecordset method takes a parameter, myfirstparam, and returns a disconnected recordset:

Dim rdsds as new RDS.Dataspace
Dim mycustobj as new object
Dim myrs as new ADOD.Recordset

Set mycustobj = rdsds.CreateObject('MyCustObj.ProgId','http://Myserver')
Set myrs = mycustobj.GetRecordset (myfirstparam)

The proxies generated by the RDS.DataSpace object on the client in this snippet are not complete COM proxies. The CreateObject method used to generate a proxy returns a variant that contains the IDispatch * for the created object. No other interface can be obtained from this proxy object. If QueryInterface() is called to get an interface other than IDispatch, E_NOINTERFACE is returned.

All calls to methods of the object created by RDS.DataSpace.CreateObject() must be made through IDispatch::Invoke(). Furthermore, you must call GetIDsOfNames() to get the DISPIDs of the methods you want to call because RDS changes the values of the DISPIDs. They may be different from the IDs specified in the Typelib of the actual object on the server.

Security Considerations

Because you can use RDS to build datacentric applications that can be hosted inside Internet Explorer in intranet/Internet environments, it is important to know the security ramifications such applications can impose. Any time an application allows access to databases over the Internet protocol, security becomes important.

Using RDS inside the browser environment

RDS 2.0 supports cross-domain security. By default, RDS can be used to access data sources on the same server from which the page containing RDS has been downloaded. Thus, if an HTML page containing embedded ADO/RDS objects is downloaded from a server called, for example, "http://server1," RDS can be used only to connect to data sources on that server. That is, in all the following properties and methods only "http://server1" is allowed as the valid value.

Table 2.

Object Property/method Restriction
RDS.DataControl Server property Can only be set to "http://server1."
RDS.DataSpace CreateObject method Second parameter identifying the server can only be set to "http://server1."
ADODB.Connection ConnectionString property If it contains substring "Provider=MS Remote," it must also contain the substring "Remote Server=http://server1."
ADODB.Recordset Open method If the ActiveConnection parameter contains substring "Provider=MS Remote," it must also contain the substring "Remote Server=http://server1."

This is the default behavior in Internet Explorer 4.0, 4.01, and 4.01 with Service Pack 1 when the security level setting is High. For a Medium security level, a prompt is displayed any time such an operation is attempted. The user can choose whether to allow the operation. The user can specifically allow such operations either by setting the security level to Low or by customizing the security settings and enabling the Initialize and script ActiveX controls not marked as safe option.

You can set security for different Internet zones in your Internet Explorer. Internet Explorer 4.0 and later supports four security zones: Trusted Sites zone, Untrusted Sites zone, Local Intranet zone, and Internet zone.

Customizing the default RDS server behavior for remote recordsets

RDS 1.5 (shipped with Windows NT® Option Pack and Internet Information Server 4.0) installs the RDS.DataFactory object by default. Because of this, you can install RDS-enabled HTML pages on the server. However, this can also become a security risk if some Internet clients can find out the System DSNs installed on the server and the USERID and passwords needed to access them. It is possible to write HTML pages that take advantage of RDS to access these data sources over HTTP networks that can pass through firewalls.

If you are concerned about this, you can disable RDS.DataFactory functionality by removing the following keys from your server's registry:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ W3SVC\Parameters\ADCLaunch\RDSServer.DataFactory

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ W3SVC\Parameters\ADCLaunch\AdvancedDataFactory]

In RDS 2.0, you can use the customization handler object to safeguard data source security on your server. Customization handlers can intercept calls to open and update recordsets through the RDS.DataFactory object. They also can be used to modify the command/connection strings used to open/update the recordset, or to validate the changes to the recordset. The logic in the handler may also be used to replace the RDS.DataFactory operation altogether.

RDS 2.0 also ships a default customization handler, Msdfmap.dll, whose behavior is driven by the Msdfmap.ini file in the Windows directory. Server administrators can configure this .ini file to control the data sources that RDS.DataFactory is allowed to access, kinds of commands that RDS.DataFactory is allowed to execute, and to set up separate configurations based on USERIDs. For details on how to use this feature, see the Microsoft Data Access Web site, www.microsoft.com/data/.

Future Enhancements

In future releases, RDS will be enhanced to do the following:

RDS Object Model

RDS.DataSpace Object

Properties

Name Access permission Description
InternetTimeout R/W Indicates the number of milliseconds to wait before a request times out.

Methods

Name Description
CreateObject Creates the proxy for the target business object and returns a pointer to it.

RDS.DataControl Object

The RDS.DataControl object binds a data query recordset to one or more controls (for example, a text box, grid control, or combo box) to display the ADOR.Recordset data on a Web page.

Properties

Name Access permission Description
Connect R/W Sets or returns the database name from which the query and update operations are run.
ExecuteOptions R/W Indicates whether asynchronous execution is enabled.
FetchOptions R/W Sets or returns the type of asynchronous fetching.
FilterColumn R/W Sets or returns the column on which to evaluate the filter criteria.
FilterCriterion R/W Sets or returns which evaluation operator to use in the filter value.
FilterValue R/W Sets or returns the value with which to filter records.
InternetTimeout R/W Indicates the number of milliseconds to wait before a request times out.
ReadyState R/O Reflects the progress of an RDS.DataControl object as it fetches data into its Recordset object.
Recordset R/O Indicates the ADO recordset bound to the RDS data control.
SourceRecordset W/O Binds a new ADO recordset to the RDS data control.
Server R/W Sets or returns the IIS name and communication protocol.
SortColumn R/W Sets or returns the column by which to sort the records.
SQL R/W Sets or returns the query string used to retrieve the recordset.

Methods

Name Description
Cancel Cancels the currently running asynchronous execution or fetch.
CancelUpdate Discards all the pending changes associated with the specified Recordset object, thus restoring the values since the last Refresh method call.
CreateRecordset Creates an empty, disconnected recordset.
MoveFirst

MoveNext

MovePrevious

MoveLast

Moves to the first, last, next, or previous record in a displayed recordset.
Refresh Requeries the OLE DB data source specified in the Connect property and updates the query results.
Reset Executes the sort or filter on a client-side recordset based on the specified sort and filter properties.
SubmitChanges Submits pending changes of the locally cached updatable recordset to the ODBC data source specified in the Connect property.

RDSServer.DataFactory Object

This default server-side business object implements methods that provide read/write data access to specified data sources for client-side applications.

Methods

Name Description
ConvertToString Converts a recordset to a MIME string that represents the recordset data.
CreateRecordset Creates an empty, disconnected recordset.
Query Uses a valid SQL query string to return a recordset.
Refresh Requeries the OLE DB data source specified in the Connect property and updates the query results.
SubmitChanges Submits pending changes of the locally cached updatable recordset to the OLE DB data source specified in the Connect property.

Summary

RDS is a feature that ships with ADO. You can use it to move data from a server to a client application or Web page, manipulate the data on the client, and return updates to the server in a single round trip.

RDS features can be summarized as follows:

The Remote Data Service can work disconnected through HTTP over your intranet, or the Internet through firewalls. RDS enables client applications to work with disconnected recordsets remoted from the servers. This is a completely stateless model, which means the server might shut down and restart between RDS client requests. The process that creates the original recordset on the server is also stateless and is not kept alive between method calls. This provides for better scalability of server applications and fits well with the MTS programming model.

Footnote

The OLE DB Remote provider ("MSRemote") is not actually an OLE DB provider. It is a special tag that can be used in the connection string for opening ADO recordsets. ADO treats this tag ("Provider=MS Remote") in a special manner. You cannot directly invoke or work with MSRemote as you can with other OLE DB providers from elsewhere in your code. In the next release of RDS, MSRemote will be implemented as a special OLE DB provider with its own properties and settings. At that time it will be possible to code directly to this provider from your applications, just like you can code to other OLE DB providers such as Index Server (Monarch) and Microsoft SQL Server™ OLE DB provider (SQLOLEDB).