Kamaljit Bath
Program Manager
Microsoft Data Access Group
July 1998
Summary: Explains the available OLE DB remoting solutions and how they can help developers write scalable distributed solutions. (19 printed pages) Covers:
The Microsoft® Universal Data Access strategy is based on data sources that expose their data through OLE DB providers, and applications that use ADO or OLE DB to access the data in a uniform way. As more mature applications are written to this paradigm, the need arises to access OLE DB providers from applications that run in separate processes on the same machine or on separate machines. Remote Data Service (RDS), formerly known as ADC, has provided this functionality since the early days of Universal Data Access strategy.
This article describes the OLE DB remoting solutions available today and how developers can take advantage of these to write scalable distributed applications.
OLE DB remoting enables applications, written to consume data from OLE DB providers, to work with remote OLE DB providers. To preserve security and robustness, it may not be preferable to run applications in the same process as the OLE DB providers they are accessing. And it is neither possible nor feasible to install OLE DB providers on all client machines, especially in today's world of distributed applications and Internet commerce. Three-tier distributed computing also requires most application logic to stay at the middle tier while the client components handle just the local validation and presentation of the data.
Remote Data Service is a component that ships with the Microsoft Data Access Components (MDAC) version 2.0. RDS enables applications to access OLE DB providers, running on remote machines or in separate processes on the same machine. RDS enables a new generation of distributed applications: Components hosted in the Internet Explorer (3.0 or later) environment can get access to live data, which comes to the client from Web servers. It brings the power of the Microsoft ActiveX® Data Objects (ADO) object model to HTML pages by using Microsoft Visual Basic®, Scripting Edition (VBScript) or Microsoft JScript™.
Applications hosted on Internet browsers typically use scripting routines or Java applets to interact with user actions or with data embedded inside the HTML page. With Visual Basic 5.0 or later, client applications can be built to run inside the browser and to access data sources on the server directly, rather than going through scripting. Examples of such applications are ActiveX Controls and active documents.
A typical scenario for dynamically generated Web pages looks like this:
The address typically points to a Common Gateway Interface (CGI)-style application on the server or to an Active Server Page (ASP). An ASP is a specific form of Internet Server API (ISAPI) application available to the servers hosting Microsoft Internet Information Server (IIS).
This approach works very well when all programming logic must stay on the server. The HTML page sent to the user contains script code to control user actions and to perform local validations. This approach is used in most Internet applications where users cannot be trusted with live data and where there is no control on the browser version or brand the user runs.
In intranets, the environment is much better defined. Because most enterprises standardize on software components such as browsers, it is becoming common to write applications that make use of specific features of software components available on client machines. Therefore, in shops where Internet Explorer is in use, HTML pages make extensive use of ActiveX Controls. RDS provides a much better client experience for such shops and also allows a new breed of distributed applications to be created by its special implementation of OLE DB rowset marshaling.
With RDS, the HTML page at the client receives a disconnected ADO recordset that can be manipulated by the script code. RDS also enables client script to invoke methods on middle-tier Automation components in an efficient manner. This is especially important when the clients connect to servers by using HTTP.
The previous section outlined the steps in a typical scenario where the client interacts with a dynamically generated Web page. When RDS is involved, the same process can be described as follows:
The main benefit of using RDS is that the client can work directly with the advanced object model of ADO rather than a relatively static HTML table. Web page developers can take advantage of their Visual Basic skills by implementing identical data access logic. The code for working with the ADO recordset is exactly the same regardless of the language and environment where it is executed.
Figure 1.
Client applications (or HTML pages that use RDS components) can use ADO or the RDS data control to work with remote OLE DB providers. All the information needed to create the recordset is available either from the connection string and command string properties of ADO recordsets, or from similar properties on the RDS data control.
These properties are converted into an HTTP request by the RDS.DataSpace object and sent to the server. At the server, the RDS component ADISAPI maps the HTTP request into a method invocation call on the RDS.DataFactory object. RDS.DataFactory is an Automation server that creates the requested recordset by invoking the specified OLE DB provider directly. The resulting recordset is marshaled by streaming into a binary stream by the OLE DB Persistence provider.
At the client, the response to the HTTP request contains a binary stream, which is handed over to the OLE DB Persistence provider. The OLE DB Persistence provider creates a forward-only rowset and invokes the cursor engine component. The cursor engine component buffers all the data in its memory buffers and exposes it as a disconnected recordset with static-cursor implementation. This is the recordset that is handed to the client application. Consumer applications can also directly work with the RDS.DataSpace object and use it to invoke Automation components other than the default RDS.DataFactory object.
Figure 2.
RDS components include the following:
Because a COM component installed on the middle tier cannot be directly accessed programmatically across the process/machine boundaries by the client components, the RDS.DataSpace object is used to implement generic proxies for all such objects. These generic proxies implement the IDispatch interface, making them Automation objects that can be called from within Microsoft Visual C++®, Microsoft Visual J++™, Microsoft Visual Basic for Applications (VBA), VBScript, and JScript applications. No type checking is done on the methods called (ITypeInfo is not implemented): It is assumed that each method supports a variable number of VARIANT arguments.
The first round trip over the network is made to instantiate the business object only when the first method call is made. The method call on the proxy is translated into an HTTP post request (if using the HTTP or HTTPS communication protocol) with the name (PROGID) of the COM component to be invoked, the method name, and all input parameters packaged as multiple-part MIME packets. Each argument is packaged into a separate Multipurpose Internet Mail Extensions (MIME) subpart. The proxy can handle all Automation data types and any OLE objects that implement IPersistStream (which can save and load themselves to/from a stream).
The output parameters and return values of the method called are transported back as the response to the HTTP post request. On the client, these are unpackaged back into Automation data types, and then handed back to the scripting code that made the initial call. The RDS.DataSpace object is automatically invoked by the RDS.DataControl object and by ADO when they are used to open remote recordsets. Consumer applications can also directly work with the RDS.DataSpace object to invoke Automation components on the servers.
Figure 3.
As explained in previous sections, RDS can be used to remote recordsets across process/machine boundaries. One major factor must be noted here: The remoted recordset does not exhibit exactly the same behavior as the original recordset. The consumer application on the client gets a recordset implemented by the client-side cursor engine. All the data and metadata associated with the original recordset are remoted, but only a subset of underlying rowset properties and interfaces are remoted. So if the original recordset supports a specific nonstandard interface, the remoted recordset will not support it.
The RDS client-side recordset has these characteristics and properties:
The following OLE DB rowset properties are remoted.
Table 1.
Property | Description |
DBPROP_IROWSETUPDATE | Controls whether the remoted recordset should be batch-updatable. |
DBPROP_IROWSETCHANGE | Controls whether the remoted recordset should be updatable. |
DBPROP_ADC_BATCHSIZE | CE-specific. Controls the number of rows that can be batched together when the rowset is being updated by the CE. |
DBPROP_ADC_UPDATECRITERIA | CE-specific. Specifies the update criteria (PKs, time stamps, modified columns, and so on) to be used for OCC-style batch updating by the CE when updating the recordset. |
DBPROP_COMMANDTIMEOUT | Standard OLE DB property. Specifies the time after which the command should be timed out. |
DBPROP_MAXROWS | Standard OLE DB property. Specifies the maximum number of rows that can be returned in a rowset. |
DBPROP_ASYNCHFETCHSIZE | CE-specific. Used by the consumers of remoted recordsets for asynchronous operations. |
DBPROP_ASYNCHPREFETCHSIZE | CE-specific. Used by the consumers of remoted recordsets for asynchronous operations. |
DBPROP_ASYNCHTHREADPRIORITY | CE-specific. Used by the consumers of remoted recordsets for asynchronous operations. |
There are two ways to create remoted recordsets: by working directly with a remote OLE DB provider (implicit remoting); and by using the RDS.DataSpace object to invoke middle-tier COM Automation components.
In this scenario, RDS is used to create client/server–style applications that can run over the intranet or Internet. The client application works directly with a data source: All the logic resides either in the client application or inside the data source (such as stored procedures or triggers).
With the help of RDS, you can create applications that run over wide area networks where the client communicates to the server through standard network protocols such as HTTP and DCOM. The advantage is that all connection-related operations—such as OLE DB providers and ODBC drivers—or database-specific operations—such as DBLIBs and SQL*NET—can stay on the server. Clients can host ADO and RDS client dynamic-link libraries (DLLs) only (which ship with Microsoft Internet Explorer 4.0 and later). There is no proprietary transport protocol needed between the client and the server. RDS uses HTTP and DCOM.
There are two ways to create remoted recordsets by using the implicit remoting paradigm: invoke RDS through ADO code; and use the RDS.DataControl object.
The following ADO code can be used anywhere ADO can run (such as VBScript, JScript, Visual Basic, Visual C++, and Visual J++). However, there can be subtle differences, depending on the host language. On the client, you need to install MDAC 2.0 client components (available as a .cab file for run-time download from a Web site or installed with Internet Explorer 4.0 and later) along with the run-time environment for the tool used to build the application.The following code snippet shows how to open ADO recordsets created by OLE DB providers running on remote machines. The connection string used to open the ADO connection object states that it wants the OLE DB Remote provider (see Footnote) to be used. ADO treats this tag in a special manner by using the RDS.DataSpace component to send an HTTP post request to the specified Web server. The server where the OLE DB provider is installed is identified by the tag "Remote Server=". The "Remote Provider=" tag is used to specify the OLE DB provider to be used to create the recordset at the specified server. If this tag is not specified, the OLE DB Provider for ODBC (MSDASQL) is used by default. The rest of the connection string contains all the information needed by the provider to create the recordset.
Dim RS As New ADODB.Recordset
Dim conn As New ADODB.Connection
' specify to use remote provider, and rest of the connection info
conn.Open _
"Provider=MS Remote;Remote Server=http://MyServer;Remote
Provider=SQLOLEDB;database=pubs;"
_
Set RS.ActiveConnection = conn
' specify query to execute
RS.Source = "Select * from authors"
RS.Open
When the Refresh method is called on the RDS.DataControl object, the specified connection string (Connect property) and command string (SQL property) are used to create a recordset on the server. This recordset is marshaled to the client side, implemented inside the cursor engine, and set to the Recordset property of the data control. The RDS.DataControl object can be set to the Datasrc property of DHTML and other controls to bind the recordset directly to these controls.
RDS.DataControl can also be used to create recordsets from remote OLE DB providers, as shown in the last section. RDS.DataControl can be created by using the following Visual Basic code (or equivalent Visual C++ or Visual J++ code):
Dim RDSDc As New RDS.Datacontrol
RDS.DataControl can also be used inside HTML pages by embedding it using the <object> tag, as shown in the following code:
<OBJECT CLASSID="clsid:BD96C556-65A3-11D0-983A-00C04FC29E33" ID="DataControl"
<PARAM NAME="Connect" VALUE="Provider=SQLOLEDB;Database=pubs;">
<PARAM NAME="Server" VALUE="http://Myserver">
<PARAM NAME="SQL" VALUE="Select * from authors">
</OBJECT>
As explained above, implicit remoting functionality uses the RDS.DataFactory object to open and update recordsets. This is a very easy way to get up and running with an RDS application, but it does not create robust, scalable applications.
Typically, some validation checks are needed at the server before recordsets can be opened or updated. Some server administrators also want to control the capabilities of the RDS.DataFactory object by allowing it to access underlying data sources based on user authentication.
The RDS custom handler enables RDS.DataFactory functionality to be customized for such needs. A custom handler is a COM object that implements the IDataFactoryHandler interface and is installed on the same machine as the RDS.DataFactory object. More than one handler can be installed on the same machine. The handler objects can intercept calls made on RDS.DataFactory methods for opening and updating recordsets. Developers can implement code inside the handler objects that can validate user actions or control access to underlying data sources by modifying the connection and/or command strings. RDS 2.0 installs a default handler whose behavior is driven by an .ini file installed in the Windows directory. This can be used by server administrators to secure the data sources from Internet clients when they use RDS to access data.
Client applications must use the "Handler=" substring in the connection strings that open recordsets when a custom handler is being used on the server. Servers can be configured to always require a handler when working with the RDS.DataFactory object. Otherwise, the use of a handler is optional.
The following code shows a typical ADO connection string when the default handler, msdfmap.handler, is used:
Dim RS As New ADODB.Recordset
Dim conn As New ADODB.Connection
conn.Open _
"Provider=MS Remote;Remote Server=http://MyServer;Remote
Provider=SQLOLEDB;Handler=msdfmap.handler,msdfmap.ini;DSN=pubs;"
Set RS.ActiveConnection = conn
RS.Source = "Select * from authors"
RS.Open
For more information on using the custom handler feature, see the Data Access Web site at www.microsoft.com/data/.
RDS can also be used to create distributed, three-tier applications. The main difference here (as compared to implicit remoting used by two-tier client/server–style applications) is that the application logic is divided into three distinct tiers. The client components of the distributed application typically contain all the presentation-related logic and local validation checks for user input and actions. Components hosted at the middle tier typically contain the business logic, integrated validation checks, and all the code to work with underlying data sources. The third tier is composed of stored procedures and triggers residing inside the data source.
There are several advantages with using the three-tier approach:
The RDS.DataSpace object can be used to invoke middle-tier components by the client-side code. Middle-tier components that can be invoked by using the RDS.DataSpace object must meet the following criteria:
Public Function ReturnRS(strConnect As String, _
strQuery As String) As ADODB.Recordset
Dim adoCon1 As New ADODB.Connection
Dim adoRs1 As New ADODB.Recordset
' Open a connection to the Database.
adoCon1.Open strConnect
' Set the CursorLocation to adUseClient
adoCon1.CursorLocation = adUseClient
' Open an ADO Recordset.
adoRs1.Open strQuery, adoCon1, adOpenStatic, _
adLockBatchOptimistic
' Return ADO Recordset object to Client.
' (Returns the actual recordset, not just a pointer to it).
Set ReturnRS = adoRs1
' Cannot close the ADO Recordset object here,
' but it can be disassociated.
Set adoRs1.ActiveConnection = Nothing
' Close ADO Connection object.
adoCon1.Close
End Function
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ W3SVC\Parameters\ADCLaunch
To manually mark your business object as safe for scripting, create a text file with a .reg extension that contains the following text. (Note that this needs to be done only on the server where the business object is installed.) The following two numbers enable the safe-for-scripting feature, where <MyActiveXGUID> is the hexadecimal globally unique identifier (GUID) number of your business object:
HKEY_CLASSES_ROOT\CLSID\<MyActiveXGUID>\Implemented Categories\{7DD95801-9882-11CF-9FA9-00AA006C42C4}
HKEY_CLASSES_ROOT\CLSID\<MyActiveXGUID>\Implemented Categories\{7DD95802-9882-11CF-9FA9-00AA006C42C4}
Save it and merge it into your registry by using the Registry Editor or by double-clicking the .reg file in the Microsoft Windows® Explorer.
Business objects created in Visual Basic can be automatically marked as "safe for scripting" with the Package and Deployment Wizard. When the wizard asks you to specify safety settings, select Safe for initialization and Safe for scripting.
HKEY_CLASSES_ROOT\<MyActiveXProgID>\Clsid\(Default) "{<MyActiveXGUID>}"
Note This can also be done by exporting the package in MTS and registering the resulting export .exe on the client.
The following code snippet shows how the RDS.DataSpace object can be used to invoke middle-tier Automation components. The snippet shows how RDS.DataSpace can invoke the GetRecordset method on a COM component, identified by PROGID "MyCustObj.ProgId", running on Myserver, to which the client connects over an HTTP network. The GetRecordset method takes a parameter, myfirstparam, and returns a disconnected recordset:
Dim rdsds as new RDS.Dataspace
Dim mycustobj as new object
Dim myrs as new ADOD.Recordset
Set mycustobj = rdsds.CreateObject('MyCustObj.ProgId','http://Myserver')
Set myrs = mycustobj.GetRecordset (myfirstparam)
The proxies generated by the RDS.DataSpace object on the client in this snippet are not complete COM proxies. The CreateObject method used to generate a proxy returns a variant that contains the IDispatch * for the created object. No other interface can be obtained from this proxy object. If QueryInterface() is called to get an interface other than IDispatch, E_NOINTERFACE is returned.
All calls to methods of the object created by RDS.DataSpace.CreateObject() must be made through IDispatch::Invoke(). Furthermore, you must call GetIDsOfNames() to get the DISPIDs of the methods you want to call because RDS changes the values of the DISPIDs. They may be different from the IDs specified in the Typelib of the actual object on the server.
Because you can use RDS to build datacentric applications that can be hosted inside Internet Explorer in intranet/Internet environments, it is important to know the security ramifications such applications can impose. Any time an application allows access to databases over the Internet protocol, security becomes important.
RDS 2.0 supports cross-domain security. By default, RDS can be used to access data sources on the same server from which the page containing RDS has been downloaded. Thus, if an HTML page containing embedded ADO/RDS objects is downloaded from a server called, for example, "http://server1," RDS can be used only to connect to data sources on that server. That is, in all the following properties and methods only "http://server1" is allowed as the valid value.
Table 2.
Object | Property/method | Restriction |
RDS.DataControl | Server property | Can only be set to "http://server1." |
RDS.DataSpace | CreateObject method | Second parameter identifying the server can only be set to "http://server1." |
ADODB.Connection | ConnectionString property | If it contains substring "Provider=MS Remote," it must also contain the substring "Remote Server=http://server1." |
ADODB.Recordset | Open method | If the ActiveConnection parameter contains substring "Provider=MS Remote," it must also contain the substring "Remote Server=http://server1." |
This is the default behavior in Internet Explorer 4.0, 4.01, and 4.01 with Service Pack 1 when the security level setting is High. For a Medium security level, a prompt is displayed any time such an operation is attempted. The user can choose whether to allow the operation. The user can specifically allow such operations either by setting the security level to Low or by customizing the security settings and enabling the Initialize and script ActiveX controls not marked as safe option.
You can set security for different Internet zones in your Internet Explorer. Internet Explorer 4.0 and later supports four security zones: Trusted Sites zone, Untrusted Sites zone, Local Intranet zone, and Internet zone.
RDS 1.5 (shipped with Windows NT® Option Pack and Internet Information Server 4.0) installs the RDS.DataFactory object by default. Because of this, you can install RDS-enabled HTML pages on the server. However, this can also become a security risk if some Internet clients can find out the System DSNs installed on the server and the USERID and passwords needed to access them. It is possible to write HTML pages that take advantage of RDS to access these data sources over HTTP networks that can pass through firewalls.
If you are concerned about this, you can disable RDS.DataFactory functionality by removing the following keys from your server's registry:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ W3SVC\Parameters\ADCLaunch\RDSServer.DataFactory
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ W3SVC\Parameters\ADCLaunch\AdvancedDataFactory]
In RDS 2.0, you can use the customization handler object to safeguard data source security on your server. Customization handlers can intercept calls to open and update recordsets through the RDS.DataFactory object. They also can be used to modify the command/connection strings used to open/update the recordset, or to validate the changes to the recordset. The logic in the handler may also be used to replace the RDS.DataFactory operation altogether.
RDS 2.0 also ships a default customization handler, Msdfmap.dll, whose behavior is driven by the Msdfmap.ini file in the Windows directory. Server administrators can configure this .ini file to control the data sources that RDS.DataFactory is allowed to access, kinds of commands that RDS.DataFactory is allowed to execute, and to set up separate configurations based on USERIDs. For details on how to use this feature, see the Microsoft Data Access Web site, www.microsoft.com/data/.
In future releases, RDS will be enhanced to do the following:
Properties
Name | Access permission | Description |
InternetTimeout | R/W | Indicates the number of milliseconds to wait before a request times out. |
Methods
Name | Description |
CreateObject | Creates the proxy for the target business object and returns a pointer to it. |
The RDS.DataControl object binds a data query recordset to one or more controls (for example, a text box, grid control, or combo box) to display the ADOR.Recordset data on a Web page.
Properties
Name | Access permission | Description |
Connect | R/W | Sets or returns the database name from which the query and update operations are run. |
ExecuteOptions | R/W | Indicates whether asynchronous execution is enabled. |
FetchOptions | R/W | Sets or returns the type of asynchronous fetching. |
FilterColumn | R/W | Sets or returns the column on which to evaluate the filter criteria. |
FilterCriterion | R/W | Sets or returns which evaluation operator to use in the filter value. |
FilterValue | R/W | Sets or returns the value with which to filter records. |
InternetTimeout | R/W | Indicates the number of milliseconds to wait before a request times out. |
ReadyState | R/O | Reflects the progress of an RDS.DataControl object as it fetches data into its Recordset object. |
Recordset | R/O | Indicates the ADO recordset bound to the RDS data control. |
SourceRecordset | W/O | Binds a new ADO recordset to the RDS data control. |
Server | R/W | Sets or returns the IIS name and communication protocol. |
SortColumn | R/W | Sets or returns the column by which to sort the records. |
SQL | R/W | Sets or returns the query string used to retrieve the recordset. |
Methods
Name | Description |
Cancel | Cancels the currently running asynchronous execution or fetch. |
CancelUpdate | Discards all the pending changes associated with the specified Recordset object, thus restoring the values since the last Refresh method call. |
CreateRecordset | Creates an empty, disconnected recordset. |
MoveFirst
MoveNext MovePrevious MoveLast |
Moves to the first, last, next, or previous record in a displayed recordset. |
Refresh | Requeries the OLE DB data source specified in the Connect property and updates the query results. |
Reset | Executes the sort or filter on a client-side recordset based on the specified sort and filter properties. |
SubmitChanges | Submits pending changes of the locally cached updatable recordset to the ODBC data source specified in the Connect property. |
This default server-side business object implements methods that provide read/write data access to specified data sources for client-side applications.
Methods
Name | Description |
ConvertToString | Converts a recordset to a MIME string that represents the recordset data. |
CreateRecordset | Creates an empty, disconnected recordset. |
Query | Uses a valid SQL query string to return a recordset. |
Refresh | Requeries the OLE DB data source specified in the Connect property and updates the query results. |
SubmitChanges | Submits pending changes of the locally cached updatable recordset to the OLE DB data source specified in the Connect property. |
RDS is a feature that ships with ADO. You can use it to move data from a server to a client application or Web page, manipulate the data on the client, and return updates to the server in a single round trip.
RDS features can be summarized as follows:
The Remote Data Service can work disconnected through HTTP over your intranet, or the Internet through firewalls. RDS enables client applications to work with disconnected recordsets remoted from the servers. This is a completely stateless model, which means the server might shut down and restart between RDS client requests. The process that creates the original recordset on the server is also stateless and is not kept alive between method calls. This provides for better scalability of server applications and fits well with the MTS programming model.
The OLE DB Remote provider ("MSRemote") is not actually an OLE DB provider. It is a special tag that can be used in the connection string for opening ADO recordsets. ADO treats this tag ("Provider=MS Remote") in a special manner. You cannot directly invoke or work with MSRemote as you can with other OLE DB providers from elsewhere in your code. In the next release of RDS, MSRemote will be implemented as a special OLE DB provider with its own properties and settings. At that time it will be possible to code directly to this provider from your applications, just like you can code to other OLE DB providers such as Index Server (Monarch) and Microsoft SQL Server™ OLE DB provider (SQLOLEDB).