MDAC 2.5 SDK - Technical Articles
August 1998
This article compares the Java JDBC interface to SQL data versus the Microsoft® Windows Foundation Classes (WFC) ActiveX® Data Objects (ADO) and also versus the OLE DB–based data solution that ships with Microsoft Visual J++™ and WFC. Specifically, the object model, programming model, and functionality of JDBC are compared with the corresponding features of ADO/WFC. The result of this comparison shows that JDBC is a limited API specifically targeted at the transfer of data from an SQL database into the Java client's buffers. JDBC does not provide a good model for interacting with the data. It does not provide access to non-SQL data and does not provide an architecture for building data components.
From the consumer point of view, JDBC defines four "objects."[1] The following table shows how the four JDBC objects relate to equivalent objects in ADO.
Note The JDBC documentation breaks the objects down into interfaces, types, and classes. However, for the sake of comparison with OLE DB and ADO, these are equivalent to objects.
JDBC object | ADO equivalent |
DriverManager | Connection |
Connection | Connection |
Statement | Command |
PreparedStatement | Prepared Command |
CallableStatement | Command |
ResultSet | Recordset |
The functionality of these JDBC objects can be described as follows:
The JDBC and ADO programming models are compared in the table below and discussed in the list following the table.
Feature | JDBC | ADO |
Programming level | Low | High |
Types of data | SQL only | Any |
Language | Language-neutral | Language-neutral |
Typing | Strongly typed | Both dynamic typing and static typing |
Data ownership | Client-owned | Shared |
This strong typing also applies to property setting and retrieval as well as to metadata retrieval. A separate method gets and sets each property, and separate methods retrieve each type of metadata.
In comparison, ADO and OLE DB support both late bindings (using variants) and early bindings to a variety of data types. (All SQL and Automation data types are supported.) ADO has been extended specifically to work well with Java. Variants used in ADO methods and properties have been converted into native data types, and the syntax for all properties has been converted to be more like Java—for example, getProperty, setProperty, isProperty.
In contrast to JDBC, ADO supports a number of different cursor type and lock types. Some of these features depend on the underlying OLE DB provider's implementation. But because ADO ships with its own cursor engine, which is implemented as an OLE DB service component, it typically enhances the cursoring functionality of the underlying provider. Data can be fetched asynchronously and can be loaded into the client cursor engine for disconnected operations. If the consumer asks for extended functionality that is not available in the underlying provider, ADO can invoke the cursor engine component transparently to provide the requested service. On disconnected recordsets, ADO also supports advanced features—such as sort, filter, and find—through the cursor engine component.
The functionality of JDBC is compared to ADO in the following table.
Feature | JDBC | ADO |
Data manipulation language (DML) | ||
Opening rowsets | SQL only | SQL SELECT statement; passing in the table name or object name; from a stored procedure; from a persisted file referred to by an URL |
Creating tables | SQL-based | DDL extensions |
Database admin | N/A | DDL extensions |
Result set functionality | ||
Scrollability | Scrollable | Scrollable |
Update/insert/delete | SQL only; JDBC 2.0 allows method-based updates; supports stored procedures through callable statements | From stored procedure or SQL; using provider-specific command; also method-based—for example, using the addNew, Update, and Delete methods on the Recordset object |
Row retrieval | Single row | Multiple rows |
Column retrieval | Single column | Multiple |
Database resync[1] | No | Yes |
Notifications | No | Yes |
Index navigation | No | Yes. ADO Recordset has an Index property, and a Seek method to move along the index. |
Null handling | Extra method | VARIANT_NULL |
Result set shape | Flat tabular only | Hierarchical recordsets are supported |
Transactions | ||
Isolation levels | All | All |
Auto-commit support | Yes | Yes |
Nested transactions | No | Yes |
Distributed transactions | Yes | Yes |
Extensibility | ||
Properties | Method-based | Extensible |
Object navigation | No | Yes |
Error model | Throw | Error object |
[1] This refers to the ability to re-retrieve the current values for a fetched row from the database. It is necessary for things like optimistic concurrency in the batched-update case.
The following list summarizes the results of the above comparison relating to JDBC functionality:
There is no common way to do database administration in JDBC or ODBC because, for example, there is no defined SQL grammar to create a database.
The standard procedure for updating in JDBC is based on SQL positioned UPDATE and DELETE statements that modify the data underlying the current row. Updating in ADO is accomplished simply by calling specific methods on the Recordset object.
JDBC 2.0 also now allows direct updates based on dynamic and keyset cursors and defines static, keyset, and dynamic cursor modes. Additionally, it allows scrollable result sets that support forward and backward scrolling.
In JDBC, data retrieval from the result set is by a single column at a time. This limitation was identified as a performance issue for ADO, which is why ADO now can retrieve a row at a time as an array of variants, or multiple rows.
JDBC does not have the concept of refetching the current values for a previously fetched row. Without locking the record when read, the user cannot determine whether the underlying record has changed. ADO, on the other hand, implements a sophisticated resynchronize mechanism. Users can customize the resync operation by specifying a stored procedure that should be called to resync the recordsets. Users can choose to resynchronize just the underlying values or all values. ADO also supports automatic resync on update operations. Users can specify what to resync, such as identity columns, conflicting rows, timestamp columns.
Because JDBC is inherently a single-user model, it has no need for notifications. Because it is based on SQL, JDBC does not expose the concept of indexes. In contrast, ADO supports and implements a rich event-handling model.
Because JDBC is strongly typed, attempting to read a null value from a column will return a zero value for that column. To distinguish between the zero value and the null value, the user must call the wasNull() method immediately after retrieving the zero value to see whether it was a null value or was actually the zero value.
ADO returns null values with a special data type of VARIANT_NULL. ADO also implements an isNull() method that can be called even before retrieving the column value.
ADO supports distributed as well as nested transactions. ADO also implements specific methods to retrieve properties of the underlying data store. Common properties are available at the connection, command, recordset, and field level. In addition to specific methods to access properties common to most providers, ADO provides a Properties collection that exposes provider specific properties.
JDBC is a low-level interface to SQL data, callable from Java. It is not a generic data access solution, it is not a high-level API, and it is not an architecture for building components. Because of its limited scrollability and query-based update semantics, it is suitable for little more than read-only, sequential retrieval of SQL data from a Java application.
OLE DB takes traditional data access to the next higher level. It defines and makes universal data access a reality. Because it is based on COM and is not just an API, OLE DB functionality is typically implemented via independent components. The concept of service components makes it easier to implement simple OLE DB providers and still provide a guaranteed level of functionality to the consumers. Some already-shipping service components and other useful OLE DB providers are the cursor engine component, the MSPersist provider, the MSDataShape provider for hierarchical cursors, the MSRemote provider, and Heterogeneous Query Processors from ISG Software and shipping with Microsoft SQL Server 7.0.
To serve component scenarios, OLE DB was designed from the ground up to be extensible so that components could build on a common set of interfaces without being limited in their ability to expose extended functionality.
The vast majority of information today does not reside in a relational database but exists as what we call "personal data"—that is, data in mail files, spread sheets, log files, directory systems, and so on. The ability to harness this information is critical to the success of any data architecture.
JDBC is designed as a call-level API to SQL data. As with ODBC, to expose your data through JDBC you must make it look like SQL data. This assumption is present in the way the programmer obtains a result set, updates values, obtains schema information, and in how the data is modeled. Therefore, exposing JDBC requires building a query processor and relational engine on top of each of those mail files, spreadsheets, log files, and so forth. The alternative is to put each of those types of data into a relational database and use content indexing to attempt to query those types of data. Neither solution allows the universal access to personal data necessary to have all types of information "at your fingertips."
OLE DB addresses this need because it is designed as a component architecture. Through a component architecture, data stores can easily expose only their native functionality, and general, reusable components can be written to implement common extensions to that native functionality. Because the core interfaces consumed by the component are the same as those exposed (that is, the rowset interfaces), the components can aggregate this functionality very efficiently. Because the functionality on each object is factored into interfaces, components can augment functionality without duplicating it. The result is a flattened component architecture, where each component is operating directly on the data, rather than an architecture that layers each component and requires calls and data to pass through and be handled at each layer.
As a drawback to its functionality, JDBC's extremely strong typing makes it difficult to perform bulk operations. Columns are set and retrieved one at a time. Properties are set or retrieved one at a time. While this may be acceptable for an end-user interface, it does not provide the type of low-level efficiency necessary to build good, optimized data components.
Another area of concern in JDBC is the lack of well-defined error cases, whereas a great deal of time and effort has gone into defining error cases for both ODBC and OLE DB. This type of information may not be important to a high-level interface but is integral to the building and debugging of low-level data providers and components.
Finally, JDBC does not provide any type of support for using, creating, or even knowing about any type of index information. Building efficient query processors over large quantities of non-SQL data requires leveraging indexes where they exist or can be created.
ADO/WFC is based on the underlying ADO/OLE DB infrastructure and hence allows access to a wide range of data. OLE DB providers can implement their own command objects. Executing the IOpenRowset::OpenRowset method on the connection object can also open recordsets, in case the provider doesn't implement the command object. This allows a recordset to be opened by just pointing to the object containing the data. Some non-SQL OLE DB providers that are already available or under development are OLE DB Simple Provider from Microsoft Corporation, Lotus Notes Provider from ISG Software, and OLE DB Provider for Microsoft Exchange.
By contrast, JDBC's mechanism for generating result sets, as in ODBC, is to execute SQL SELECT statements or stored procedures. This does not lend itself well for allowing non-SQL providers to expose result set interfaces over raw tabular data or other non-SQL based data stores.
Even though JDBC 2.0 allows scrollable cursors, very few drivers actually implement these. Similarly, very few ODBC drivers implement this optional functionality. But Microsoft implements scrolling in a "Cursor Library" in ODBC 2.0, which provides scrollability against any forward-only driver. Similarly, ADO ships with the cursor engine component, which is an OLE DB service component, that can implement static, client-side scrollable and updatable cursors against any OLE DB provider.
Bookmarks provide the mechanism for returning to a previously fetched row and are necessary for implementing a shared model where multiple clients are accessing the same result set. If supported, bookmarks are a separable but very useful extension to scrolling.
The cursor engine component shipping with ADO can implement bookmarks against arbitrary OLE DB providers. JDBC, on the other hand, must depend upon the underlying driver to implement support for bookmarks.
A critical component for efficient updating in a high-latency environment such as the Internet is the ability to batch updates. The easiest way to do this is by implementing a deferred update model, where modifications to individual rows are buffered until an explicit update call is defined.
ADO can operate in two modes of batch updating: single-row mode and batch mode. JDBC 2.0 allows batch updates, but the support for resynchronization is not clearly defined. For a good, customizable implementation of optimistic concurrency–based batch updating, it is necessary to provide methods to read the current value, the original value, and the underlying value of a column in a specific row in the result set. How this is implemented in JDBC is not clear; JDBC defines one Getxxx method for each type, and the JDBC implementation needs to be expanded to include means for reading the original and underlying values.
ADO implements recordset filtering, which makes it simple to create sub-recordsets that may contain only rows that the user is interested in. ADO also provides predefined filter enums that allow filtering for records with pending data, records that failed updates because of conflicts, and so forth.
ADO also ships with MSRemote and MSPersist OLE DB providers. These providers implement sophisticated functionality that allows users to do data-manipulation in distributed and mobile scenarios. MSPersist provider enables ADO recordsets to be persisted in the local file system and reopened. MS Remote provider allows ADO to work with OLE DB providers that are on remote machines, connected over standard network protocols like HTTP and DCOM. Thus, the users don't need to install OLE DB providers on client machines.
ADO implements hierarchical recordsets that are provided through chapters in OLE DB. The MSDataShape provider makes it easier to define hierarchies on top of flat result sets returned by the underlying OLE DB providers.
JDBC does not provide any type of support for hierarchical cursors. SQL3 has added concepts to the language for generation of hierarchical result sets but does not provide a mechanism to bind to the results.
As has already been discussed, one of JDBC's big limitations is its nonextensible type system. This is even more apparent when it comes to ADTs.
Java supports a concept of an "object," and JDBC allows obtaining a stream over a column for reading or writing an object. This method basically treats all objects as BLOBs and does not allow the type of functionality nor flexibility that object-relational databases are looking for.
With ADO\WFC, however, any COM object can exist as a column of the recordset. It is easy to obtain the actual object as a value of the field and to work with it independently in VJ++ 6.0.
JDBC continues to be a very limited interface that is targeted primarily at transferring data from an SQL database to a Java client. It is immature in terms of both the functionality it provides and the details necessary for implementing database-oriented applications based on JDBC.
ADO/WFC, on the other hand, offers a much more mature platform for building complex data-oriented applications using Visual J++ 6.0. With the component architecture of OLE DB, ADO/WFC can be used to dynamically access data from a variety of data sources, including SQL databases and non-SQL sources such as e-mail stores. ADO/WFC also offers a much richer programming model for navigation and working with the data.
Footnotes
[1] Section 2.1, "A SQL level API", JavaSoft JDBC: A Java SQL API, Graham Hamilton & Rick Cattell, Version 1.01, August 8, 1996.
[2] Section 2.2, "SQL Conformance", JavaSoft JDBC: A Java SQL API, Graham Hamilton & Rick Cattell, Version 1.01, August 8, 1996.