SQL Server 7.0 Developer's Overview

SQL Server 7.0 offers distinct advantages to developers of value-added business solutions. This section introduces SQL Server 7.0 to developers who are unfamiliar with client/server database application development or who need a quick refresher on SQL Server 7.0 and MSDE 1.0.

By reading these sections, a developer or architect can learn how to position the benefits of SQL Server as an integrated component of a value-added business solution, and educate others on the full breadth of capabilities available from a custom database application that leverages SQL Server 7.0 or MSDE 1.0. For comprehensive coverage of any of these topics, see the SQL Server 7.0 documentation or SQL Server Books Online for SQL Server 7.0.

History of Microsoft SQL Server

Microsoft SQL Server was introduced in 1990 for the Microsoft OS/2 platform in partnership with Sybase. The product was originally ported from the Sybase SQL Server 4.x for the UNIX platform. With the introduction of Microsoft Windows NT®, a second porting initiative was undertaken to build a native Windows NT version. Microsoft SQL Server version 4.2 was the first commercially available database server natively designed for the Windows NT platform. The partnership with Sybase continued, and two additional versions of SQL Server were released (version 6.0 and version 6.5).

SQL Server 6.5 made improvements in transactional performance, scalability and reliability, and became one of the most widely used client/server database products for operational systems on the Windows NT platform. As the database market evolved and the business relationship with Sybase changed, it became apparent that in order to address customer requirements and evolve SQL Server to support the next generation of database applications, the SQL Server product required a fundamental redesign.

This resulted in the termination of the business relationship between Sybase and Microsoft, and led to the development of SQL Server 7.0. SQL Server 7.0 is a modern database engine architecture implemented completely by Microsoft, and is designed to address the most demanding database application requirements for operational and decision-support systems implemented today and in the future.

Scalable Database Platform

SQL Server 7.0 and MSDE 1.0 support all 32-bit Microsoft Windows platforms (except Win32®) with a single, compatible code base that dynamically adapts to the hardware capabilities available on the system on which it is installed. This means that developers can write a single set of application source code and run it on a database that scales from a Windows 98 desktop, to a Windows NT Server, to a Windows NT Server Enterprise Edition cluster. Database files are also compatible across all SQL Server 7.0 editions. SQL Server 7.0 is available in several different editions that will be briefly introduced here.

SQL Server Standard, Developer, and Enterprise Editions

SQL Server 7.0, Standard Edition is designed for installation on servers running Windows NT Server, and is limited to 4 CPUs and 32-bit memory addressing. A special variant of SQL Server 7.0, Standard Edition, called SQL Server Developer's Edition, ships with Microsoft Visual Studio® and is licensed for application development usage with Visual Studio only.

SQL Server 7.0, Enterprise Edition is designed for installation on servers running Windows NT Server, Enterprise Edition, supports 8 CPUs and 64-bit memory addressing, and supports Active/Active clustering for environments requiring advanced fault tolerance.

SQL Server Desktop Edition

SQL Server 7.0, Desktop Edition ships on the same compact disc with SQL Server Standard, Developer, and Enterprise editions. It is not available as a stand-alone product. It is designed for installation on workstations running Windows 95, Windows 98, or Windows NT Workstation. SQL Server 7.0 Desktop Edition has no disk space limitations, is limited to a single processor and five concurrent queries, and ships with a full complement of graphical administration tools. A SQL Server Client Access License is required for each workstation on which a copy of SQL Server 7.0, Desktop Edition is installed.

SQL Server Small Business Server Edition

SQL Server 7.0, Small Business Server Edition ships on the same compact disk with Microsoft Small Business Server version 4.5. It is not available as a stand-alone product. It is designed for installation on servers running Small Business Server 4.5. SQL Server 7.0 Small Business Server Edition is limited to 10 GB of storage per database, a single processor, fifty concurrent queries, and ships with a full complement of graphical administration tools.

Microsoft Data Engine

The Microsoft Data Engine (MSDE) is a royalty free, redistributable database engine that is fully compatible with SQL Server 7.0. MSDE is designed to run as an embedded database for custom applications that require a local database engine on Windows 95, Windows 98, or Windows NT Workstation. MSDE is limited to 2 GB of storage per database, a single processor, and five concurrent queries.

MSDE is an attractive alternative to using the Microsoft Jet database, and is designed primarily to provide a low cost option for developers who need a database server that can be easily distributed and installed with a value-added business solution. Because it is fully compatible with other editions of SQL Server, developers can easily target both SQL Server and MSDE with the same core code base. This provides a seamless upgrade path from MSDE to SQL Server if the application grows beyond the storage and scalability limits of MSDE.

MSDE ships with Microsoft Office 2000 Developer Edition, and Microsoft Visual Studio. MSDE is not available as a stand-alone product. MSDE requires a SQL Server Client Access License to participate in replication with SQL Server Standard or Enterprise editions. Developers must obtain a Visual Studio or Microsoft Office 2000 Developers Edition license in order to redistribute MSDE with a custom application.

A special release of MSDE 1.0 ships with Microsoft Office 2000 for use with Microsoft Access as an alternative to the Microsoft Jet database, and is not licensed for redistribution. This document refers to this version of MSDE as the "MSDE 1.0 Office 2000 Release." It differs from MSDE 1.0 in the following ways:

Either Microsoft Access 2000 or the Office 2000 Server Extensions must be installed in order to install MSDE 1.0 Office 2000 Release.
A full complement of graphical administration tools is included. These tools are required by Microsoft Access 2000 to perform database administration.
Installation is performed by way of Setupsql.Exe, not by way of the PackageForTheWeb installation executable.

Database Server Architecture

There are many robust client/server database engine products available today. SQL Server 7.0 became commercially available in 1999, and although it is compatible with applications written for earlier releases, its core architecture is composed of a completely modern design. As such, SQL Server 7.0 represents a significant evolution in database technology. The basic principles involved in the architecture of the database server and the benefits of this new approach warrant further examination.

Client/Server Architecture

SQL Server 7.0 is an intelligent client/server database engine that differs substantially from traditional personal computer database technology that is implemented using a shared file-system approach (such as Borland dBASE, Microsoft Jet, Microsoft® Visual FoxPro®, and Pervasive Btrieve). Shared file-system databases rely on a single client process per user to manipulate data in shared data files on network file servers. This works in single user scenarios, but multiuser scenarios introduce the added complexity of concurrency control, which requires these products to use fairly primitive locking mechanisms exposed in the network layer. While thoroughly appropriate for a wide range of personal computer database applications, these systems tend to experience performance and reliability problems under more strenuous multiuser loads. Security facilities for shared file-system databases are typically limited to simple read/write network permissions, making it easy for users to access shared data files outside of the context of the application they were designed for by using a different tool. This makes it difficult to have absolute control over how users interact with a shared data file as a whole and how restricted access is implemented for certain types of operations.

Client/server database systems like SQL Server use a dedicated set of server processes to manipulate the data, and force client processes to connect to the server process using local or remote inter-process communication (IPC) mechanisms (for example, TCP/IP sockets). The server process, which is an intelligent server application that processes SQL commands, handles multiuser concurrency control using more sophisticated locking mechanisms than are available in the network layer. The server also handles security, enabling very granular control of how users interact with information in the database. After these commands are processed, any results and/or messages are sent back to the client process over the IPC mechanism. As a result, client/server systems perform better in multiuser scenarios where users are frequently manipulating the same data.

As an intelligent database server application, SQL Server supports a well-defined language for server programming and communication, called Transact-SQL, and well-defined programming interfaces that client applications use to conduct conversations with SQL Server. Transact-SQL is a superset of the industry standard ANSI Structured Query Language (ANSI-SQL). ANSI-SQL defines four basic commands for manipulating data in a relational database (SELECT, INSERT, UPDATE, DELETE), a variety of commands for defining the structure of databases, and a few other categories of commands. Transact-SQL expands upon the ANSI standard by adding functionality to the four basic data manipulation commands, and by adding control-of-flow constructs for programmability. These extensions make it possible to program SQL Server with custom business logic executed by the server that can be invoked from client applications. This custom business logic might be a standardized database operation such as a stored procedure for updating inventory data, or custom data integrity logic such as a trigger that checks current stock levels when parts are sold to determine if inventory is low. Some applications that need to support a variety of database platforms leverage only the ANSI-SQL compliant constructs of Transact-SQL, while other applications make extensive use of Transact-SQL programmability to enhance the performance and functionality of SQL Server applications.

The term "client/server" is somewhat of a misnomer and may cause the reader some concern because current trends in enterprise computing tend to cast it in somewhat negative terms. Client/Server is widely used to describe a two-tier architecture for enterprise applications that involved a "heavy" client, which implements both user interface and complex business processes, connected to an intelligent backend database such as SQL Server.

Although this architecture has been quite successful for a wide variety of applications, it has given way to a new "three tier" or "n-tiered" approach where the client implements only user interface functionality while complex business processing is centralized in special purpose application servers in a middle tier. These application servers then communicate with the database. This document does not discuss these architectures in detail, as SQL Server is well suited for either approach.

In the context of this document, and SQL Server in general, the term client/server refers to a client process connecting to an intelligent server process, regardless of which tier in which the client process resides. For example, a "client" process might be a middle-tier software component that does credit-card validation or even a Web server, and the "server" process is SQL Server. In either case, SQL Server functions as a special purpose database server application that handles database operations for a client process. Client and server processes can be co-located on the same computer, or can communicate over a network using IPC mechanisms.

Advanced Reliability and Concurrency Control

Advanced reliability and concurrency control are mandatory for many enterprise applications. With SQL Server 7.0, these same capabilities are available on everything from servers running Windows NT Server, to desktops running Windows 95, Windows 98, and Windows NT Workstation.

Transactions and Recovery

SQL Server uses the concept of transactions and transaction logging to guarantee the consistency and recoverability of information stored in databases. A transaction is a unit of work, and can be as small as a single row INSERT or UPDATE operation, or large as collection of several different data modifications bound by BEGIN or COMMIT TRANSACTION statements.

Using established techniques for transaction processing systems, SQL Server guarantees each unit of work will be applied to the database in a consistent fashion, regardless of a system, application or user error. It does this by writing before and after images of any data pages affected by a transaction to a durable transaction log. This log can then be used to restore the database to a consistent state if an application decides to roll back a transaction or to recover from a system failure by re-applying transactions not flushed to disk at the time of the failure. Because SQL Server, as well as MSDE, supports the Microsoft Distributed Transaction Coordinator (MS DTC), transactional integrity can be applied to distributed operations that affect multiple SQL Server installations using a two-phased commit approach.

This advanced reliability is common in client/server and host-based database systems. However, until recently, most personal computer database products did not offer these kinds of capabilities. Personal computer database products are typically implemented using shared file-system technology. Shared file-system databases may offer transaction-processing semantics in their query languages. However, most of them do not implement a durable transaction log for recovery, making them prone to corruption in the event of a system failure. Transaction Processing: Concepts and Techniques by Jim Gray and Andreas Reuter is an excellent resource for understanding the fundamentals of transaction processing.

Concurrency Control

SQL Server 7.0 concurrency control has improved markedly from earlier releases by implementing a concept called dynamic locking. Dynamic locking improves database performance and concurrency by analyzing the appropriate locking strategy for a particular transaction. SQL Server supports a variety of different lock types that can be used to control access to anything from an individual row, to a data page (a collection of rows), to an extent (a collection of pages), to an entire table.

Using dynamic locking, SQL Server 7.0 has the flexibility of applying a range of different lock types depending upon the operation involved. For example, transactions generated by an order entry application will most likely require row locks for inserting new rows quickly into an order table. Row locks ensure that the smallest amount of data possible is being locked, which reduces the chance that a user will obtain a lock on data that another user's transaction needs to access (called a collision). Collisions force the losing transaction to wait for locks to be released before it can do its work, and can negatively impact response time.

Another example might be a large query run by a reporting application against the order table. In this case, it is more efficient to lock larger chunks of the data than individual rows, especially when a very large number of rows are involved. SQL Server 7.0 has the flexibility of locking at the page, extent, or table level if appropriate. This allows it to place and release the locks faster than if locking had to be performed at the row level. SQL Server has the ability to dynamically determine the granularity of lock types required to yield the highest performance and maximum concurrency, hence the term dynamic locking.

High Availability

High availability, uninterrupted support, has been a hallmark of host and UNIX database products for quite some time, and is now a common feature of personal computer database server products such as SQL Server 7.0. There are two aspects of high availability that warrant further examination.

First, the operating system platform that hosts the database and the database itself must have mission critical stability. This document does not discuss this topic fully. The architecture of both Windows NT and SQL Server 7.0 provide the kind of robustness required by the most demanding of enterprise applications. Although no software is without defects, SQL Server 7.0 has been proven to hold up under the most strenuous usage scenarios without experiencing software-related failures or data corruption.

Second, the database must be continuously available. SQL Server 7.0 addresses continuous availability by providing support for online backup, by minimizing the need for down time due to resource-intensive maintenance procedures, and by supporting fail-over using clustering in case of system failure.

Earlier releases of SQL Server required that a performance-intensive maintenance utility called database consistency checker (DBCC) be run on a regular basis to ensure that the underlying data structures in a production database were intact. Unfortunately, this requirement often resulted in the need to schedule significant periods of downtime for DBCC execution, which was unacceptable for some very large database applications. Due to the new implementation of storage engine technology in SQL Server 7.0, the requirement for regular executions of DBCC has been greatly reduced and even eliminated for many environments. However, if an organization wants to continue running DBCC, its performance has been greatly enhanced showing exponential improvements in execution time.

Database Server Fail Over

Providing continuous availability for hardware and software failures requires advanced hardware clustering solutions. SQL Server 7.0, Enterprise Edition supports Microsoft Cluster Service (MSCS) in Windows NT Server, Enterprise Edition. Using Cluster Service, two active SQL Server nodes in a cluster can share a common disk I/O subsystem. If one of the SQL Server nodes fails due to a hardware or software failure, the remaining active SQL Server node can "inherit" control of the failed SQL Server's databases. Application software can be programmed to detect when a failover event occurs, and automatically re-connect clients to the available node. Any active transactions that were being processed at the time of the failure can then be resubmitted.

Online Backup

Support for online backup enables a fully recoverable backup of an active database and/or transaction log to be made even while users are actively querying and modifying a production database. SQL Server and MSDE implement a high-performance implementation of online backup, the overhead for which only imposes approximately five-percent degradation of overall system performance in online transaction processing (OLTP) environments while a backup process is active. Additionally, support for high-end backup hardware enables SQL Server to rapidly stripe large backups simultaneously across multiple devices, such as an array of tape devices.

Introduction to Database Server Backup Technology

Those who are new to database server technology may need a quick orientation on the difference between online database server backups and normal file backups. The data files associated with an application database are opened exclusively by the database server process, and therefore performing a file-based backup of the individual data files is usually not possible unless the database server is stopped to perform the file backup operation. File-based backups do not offer several options provided by database server backups, and so should be avoided.

Database server backups involve invoking a special online backup process within the database server. Once a backup process has been initiated, the database server proceeds to stream backup information in a very high performance fashion to a "backup device," which can be either an operating system file or some other kind of media such as a tape device. A database backup can be performed even if there are active user transactions in the database. Once the backup completes, the backup device will contain a snapshot of the database as it existed when the backup process was initiated. This snapshot includes any active, uncommitted transactions that may have been in progress at the time the backup was initiated.

When a database is restored from a backup device, it will be restored to the same state that existed when the backup process was initiated. When the restore operation is complete, the database will be put through a normal recovery process. Any active transactions that were committed to the transaction log but not yet written to the data files will be "rolled forward," and any active, uncommitted transactions that were running when the backup was initiated will be "rolled back." This recovery process ensures that the database will be recovered to a consistent state.

A single backup device can contain multiple "backup sets," which can be used later to restore the database. For example, a backup device can contain a full database backup as well as several transaction log backups. If a backup device is implemented as an operating system file, this file can be backed up using conventional file backup procedures instead of trying to backup the open database files themselves. For more information on database server backup and recovery strategies and techniques, see the SQL Server Books Online for SQL Server 7.0.

Online Backup Strategies

There are three database backup strategies supported by SQL Server and MSDE:

Full Database Backups.
Incremental Database Backups.
Differential Database Backups.

A full database backup strategy involves making regular, complete backups of the entire database. It is the preferred approach for small databases that require a minimum of complexity to restore the database after a failure. It is also the least flexible approach, and most production database installations will opt for an approach that offers superior performance for large online backups and more flexibility in controlling what data gets restored during a restore operation.

An incremental database backup strategy combines a full backup of the database with regular backups of the transaction log for the total backup solution. It is most useful in production environments where backups are performed at frequent intervals, and a high degree of flexibility is required in controlling which data gets restored during a restore operation. A full database backup can be performed during off-peak times; then transaction log backups that are less resource-intensive can be performed at frequent intervals during peak times. Restoring a database using this approach involves loading the full database backup first, then restoring each transaction log backup in the order it was backed up. Incremental backups make it possible to restore the database to a consistent state as of a particular moment in time by loading only the transaction log backups required to recover the database to that moment and discarding the others. When the last transaction log backup is loaded, an administrator can specify that only transactions that were committed up to a particular moment in time should be applied to the database. This is known as "point in time" recovery.

Incremental backups are powerful and flexible. The price for this power and flexibility is complexity. Managing multiple sets of incremental database backup files can be confusing and time consuming. This is not an issue for environments that can rely on a trained database administrator, but smaller database server installations may elect to avoid some of this complexity by using a simpler approach known as differential database backups.

A differential database backup strategy shares many of the same characteristics as an incremental database backup strategy, but offers greater simplicity. Differential backups can be performed after a full backup, and contain information about tables and indexes that have changed since the last full backup. A differential database backup is therefore faster than a full database backup.

Restoring a database that was backed up using a differential database backup strategy is a simple two-step process. First, the full database backup is restored, and then the most recent differential database backup is restored. Interim differential database backups can be discarded because each differential database backup contains all changed table and index information since the last full database backup.

Automatic Configuration and Tuning

SQL Server 7.0 automatically tunes database server configuration options based upon current system activity, eliminating the need to configure complex static server configuration options such as resource allocation for memory and lock buffers.

Automatic configuration and tuning alleviates one of the major barriers associated with deploying custom applications that use a robust client/server database system: maintenance overhead. Because SQL Server effectively tunes itself based upon current activity and resources, it requires minimal manual administration from trained database professionals. This makes SQL Server an ideal database for organizations with scarce resources for system maintenance.

Automatic configuration and tuning does not come at the expense of granularity of control. Advanced applications may require the ability to manually tune and configure various aspects of SQL Server. Database professionals can also manually configure most configuration options instead of relying on auto tuning.

Another major benefit of automatic configuration and tuning is that it enables SQL Server to dynamically adapt to the resources that are currently available on a given hardware platform. For example, on a dedicated database server with multiple processors and a large amount of memory, SQL Server allocates as much CPU time and memory as it can get to improve performance. In contrast, if SQL Server is running on a desktop computer with limited resources and is competing for those scarce resources with other applications, such as e-mail or word processing, SQL Server tunes itself down to a minimal working set size and returns precious memory and CPU time back to the operating system. This ensures even performance for all of the user's applications. As a result, developers no longer have to target separate incompatible databases for each different environment their application runs in, which greatly simplifies application development and deployment.

Automatic Disk Space Allocation

SQL Server 7.0 stores relational tables and indexes in database files. Additionally, it stores transaction log information used for recovery in a separate transaction log file. These files can be configured to grow automatically in pre-set increments when new space is needed. Additionally, these files can be configured to shrink automatically when large sections of the file contain no data.

Automatic disk space allocation reduces the maintenance overhead for custom database applications. SQL Server offers the ability to allocate storage manually when a high degree of control is necessary for advanced applications.

Cost-Based Query Optimization

As an intelligent database server application, the main function of SQL Server is to process database queries and transactions submitted by client applications in a high performance fashion against information stored in internal data structures it manages. Applications need only submit batches of Structured Query Language (SQL) commands, without regard for the physical implementation of the database tables affected by those commands. It is up to SQL Server to determine the most intelligent and efficient way to process those commands using a technique called Cost-Based Query Optimization.

For example, suppose an application submits a complex query using the Transact-SQL SELECT statement. This query correlates a subset of information in three different tables (called a "join"), and then aggregates some numeric values based upon key information derived from each table (called a GROUP BY operation). The results of the query are to be sorted in ascending order by date. Using its query optimizer, SQL Server generates a variety of different strategies for running this query (called "execution plans"). The execution plan determines which indexes should be used, the order in which the tables should be joined, the approach used to do the join, the most efficient way to calculate the aggregates, and the fastest method to sort the result set. It picks the strategy that will result in the fastest query execution time. It makes this decision by determining the "cost" of each execution plan based upon CPU usage and disk I/O, and picking the lowest cost plan.

In order to generate an execution plan that accurately reflects the amount and distribution of data in the database, SQL Server uses distribution statistics. Distribution statistics are a sort of histogram, which maps out how data in a particular index or column is distributed. For example, the distribution statistics for a column storing dates might indicate that 10 percent of the dates are in 1997, 30 percent of the dates are in 1998, and 60 percent of the dates are in 1999. In earlier releases of SQL Server, these distribution statistics required regular maintenance. Because of the performance-intensive nature of updating distribution statistics, most administrators opted to take the database offline, which was unacceptable for some environments.

In SQL Server 7.0, distribution statistics are maintained automatically, ensuring the highest performance for queries without costly maintenance. SQL Server automatically detects when distribution statistics are out of date, and initiates a background process to collect new statistics. By using statistical sampling techniques instead of reading the entire set of values, SQL Server 7.0 creates highly accurate statistics faster than in earlier releases. These enhancements reduce costly maintenance overhead.

Query plan re-use is a major focus for performance tuning and optimization of OLTP systems. SQL Server 7.0 provides several different mechanisms to re-use existing execution plans and even share them among multiple users. This cuts down on the overhead associated with generating plans for repetitive operations that are commonly performed by multiple clients. Client application developers can leverage shared plans by using stored procedures, prepare/execute semantics using ODBC or OLE DB, or by using an approach called "marked parameters." If a shared plan is not explicitly used by way of one of these approaches, SQL Server attempts to re-use existing plans for smaller, repetitive operations using a feature called auto-parameterization, or by searching for matching queries that have already been executed. These features can improve the performance of applications that were written for older database environments that do not leverage an intelligent query processor.

If you are new to application development using an intelligent client/server database engine such as SQL Server, it is important to take the time to learn how to leverage the full capabilities of the query processor properly. In general, it is recommended to hand off as many database-intensive operations as possible to the database server, rather than processing the data on a row-by-row basis in client application code. SQL is an incredibly powerful language. Sophisticated multi-row operations that affect large data sets can be performed using a single SQL command faster than equivalent client application code, which loops through data a row at a time. By leveraging the full capabilities of the query processor, you can greatly simplify application source code, and dramatically improve the performance of an application by giving SQL Server full reign to decide the most efficient way to execute a particular operation as well as eliminating costly round trips between the client process and SQL Server.

Server-Enforced Data Integrity

SQL Server 7.0, like other robust client/server database engines, can be configured to enforce its own data integrity. This enables a high degree of consistency and centralized control that can be enforced by the database server instead of in client application source code, which can be inconsistent from application to application and difficult to implement in a multiuser setting.

Column Properties

The most basic way of enforcing data integrity involves the use of column (for example, field) properties. The simplest column property is a column's data type and size, which controls what kind of data can be entered into a column and how large it can be. SQL Server provides a wide variety of data types for character, numeric, date/time, and binary data.

Other column properties include nullability, which controls whether a value is required for a particular column or not, and a special property called an IDENTITY. An integer column with an identity associated with it will have a new sequential integer value assigned to it automatically when new rows are added to the table. This is useful for assigning unique identifiers to rows.

Constraints

Constraints provide a mechanism to limit the domain of possible values for a particular column in a table. By placing constraints on a column, the database server can enforce a variety of conditions such as:

Guarantee uniqueness.
Guarantee referential integrity.
Supply a default value.
Require that new values be a member of a finite list.
Require that new values match a certain pattern.
Require that new values fall within a certain range.
Compare a new value to other values in the same row.

Triggers and Stored Procedures

Some applications elect to leverage the programmability of SQL Server by implementing custom business logic using stored procedures and triggers. A stored procedure is a set of instructions (written in Transact-SQL) for a particular database operation that is stored and executed by name at the server. They have both data integrity and performance benefits. By implementing standard database operations using stored procedures, and requiring client applications to use the stored procedure to perform that operation, developers can ensure that important transactions are performed in a consistent fashion regardless of differences between client applications.

Triggers are event-driven stored procedures that are attached to tables. Triggers fire after a data modification occurs, but before the associated transaction that caused the data modification commits. Inside the trigger, the before and after effects of the data modification can be examined. Using this information, the trigger can implement custom data integrity such as updating a flag in an associated table, implementing custom auditing routines, or cascading a delete to an associated table. A trigger can force the associated transaction to roll back to enforce application specific rules or complex referential integrity.

Security

SQL Server 7.0 provides the ability to apply a rich variety of security constraints to both database objects and administrative functionality, which is enforced by the database engine, not application source code. By enforcing security at the database, application designers and database administrators can ensure that a consistent set of security constraints are being enforced even if different client applications are manipulating the same shared data. This model, as previously discussed, is far superior to application dependent security models that are prevalent in shared file-system database architectures. Most SQL-based relational database engines (including SQL Server) implement the ANSI standard GRANT and REVOKE commands for controlling access to tables and stored procedures in the database. For example, an administrator can use the GRANT command to permit a particular set of users to run SELECT queries against a particular table, while denying the ability to perform updates or deletes.

While server-enforced security is a common feature of most robust client/server database engines, the security architecture of SQL Server 7.0 distinguishes itself by offering the strongest integration with the domain security model of Windows NT. SQL Server 7.0 natively supports the concept of a "trusted connection," which essentially relies upon Windows NT security to authenticate users. User's connecting using a trusted connection need only log on to Windows NT. Then they can use their pre-authenticated security context to connect to SQL Server. This eliminates the need to administer a separate set of user accounts and passwords for the database and facilities single network logon. SQL Server 7.0 provides the ability to grant or revoke permissions directly to Windows NT users and groups, eliminating the need to "synchronize" the syslogins table in SQL Server with the Windows NT user accounts database.

A "standard" security model is also supported for backward compatibility and for applications that cannot rely upon Windows NT domain security for authentication. Using this approach, a SQL Server-specific user account and password can be used for nontrusted connections.

SQL Server 7.0 also introduces the concept of a security role. Using a role, an application designer or database administrator can manage a collection of privileges, which can then be associated with Windows NT users, groups, or SQL Server standard user accounts. Varieties of system-defined roles exist for assigning a rich variety of administrative privileges to various users or groups. Administrators and application designers can add their own user-defined roles to control access to database objects in their own application. These user-defined roles can then be associated with particular users and groups at install time. In this fashion, a custom installation procedure can be used to "connect" a database role with actual users and groups at a particular installation. This topic is covered in detail in a subsequent section of this document.

A specialized type of role called an "application" role was introduced in SQL Server 7.0 to facilitate changing a user's security context based upon the type of application being used to access the database. For example, by default users may only be granted SELECT privileges against a transactional table containing invoice information. However, a specialized Visual Basic® order entry application can enable an application role at run time that permits data modifications to the transactional table, but only in the context of that particular application. Applications roles are enabled and disabled at run time in application source code by running a special system-stored procedure that requires the name of the application role being activated and a password associated with that role.

Integrated Database Services

SQL Server 7.0 includes a comprehensive set of integrated system services that provide an extremely rich environment for developing custom database applications. A brief introduction of these services is necessary to understand their usefulness.

Database Management and Administration

SQL Server 7.0 includes a vast array of database administration services that developers can leverage in custom applications. This set of services is referred to as the SQL Server Distributed Management Framework (SQL-DMF).

SQL Server Enterprise Manager

SQL Server 7.0 includes a comprehensive set of graphical administration tools. SQL Server Enterprise Manager is the primary administration environment for SQL Server and can be used to administer both local and remote SQL Server database installations. SQL Server Enterprise Manager provides a vast array of graphical tools that provide point-and-click administration, wizards, and informational views useful in administering all aspects of SQL Server. SQL Server Enterprise Manager is installed with all editions of SQL Server 7.0, but not with MSDE.

Database Maintenance Wizard

SQL Server Enterprise Manager provides a variety of extremely useful wizards that dramatically simplify a variety of different administration tasks. Several classes of wizards are provided, including: Database, Data Transformation Services, Management, and Replication.

One particularly powerful and useful wizard is the Database Maintenance Wizard, which automates the process of building a database maintenance plan. Microsoft is currently planning to provide the Database Maintenance Wizard in a service pack release update for MSDE. A database maintenance plan addresses such issues as database integrity checks, index statistics maintenance, and backup strategy and frequency. After a maintenance plan is created, it can be saved and scheduled for execution.

SQL Server Distributed Management Objects

SQL Server Enterprise Manager and other SQL Server administration tools leverage a shared software component called the SQL Server Distributed Management Objects (SQL-DMO) for performing administrative tasks. SQL-DMO exposes an extremely powerful set of COM-based interfaces for administering SQL Server installations. Developers can leverage the power of SQL-DMO in their own administrative tools and batch jobs, eliminating the need to write complex Transact-SQL script files to perform administration tasks while adding powerful integration with system services and scripting with robust development tools like Microsoft Visual Basic and Microsoft Visual C++®. SQL-DMO is installed with all editions of SQL Server as well as MSDE.

SQL Namespace

Additionally, a second software component called SQL Namespace (SQL-NS) allows developers to include custom dialog boxes that are part of SQL Server Enterprise Manager in their own custom administration environments. It also allows developers to encapsulate existing administration user interface components without costly recoding. For example, rather than writing a custom dialog box for backing up a SQL Server database, a developer can simply call the official database backup dialog box used in SQL Server Enterprise Manager from SQL-NS. SQL-NS is installed with all editions of SQL Server 7.0, but not with MSDE.

SQL Server Agent

SQL Server Agent is an integrated job scheduling and execution environment designed specifically for SQL Server. Using SQL Server Agent, database administrators can write and deploy custom batch jobs for performing a wide variety of database administration tasks from simple single server administration tasks to complex multiserver operations. Additionally, SQL Server Agent provides the ability to monitor a vast array of database events and performance characteristics, and generate alerts when specific conditions are encountered. SQL Server Agent can be administered graphically from within SQL Server Enterprise Manager, and can also be configured in an unattended fashion using SQL-DMO. SQL Server Agent is installed with all editions of SQL Server 7.0 as well as MSDE.

SQL Server Profiler and Index Tuning Wizard

SQL Server 7.0 provides two important tools: SQL Server Profiler and Index Tuning Wizard. SQL Server Profiler is used to monitor and record current SQL Server activity to assist in performance tuning and optimization. SQL Server Profiler log files (called workloads) can be used to determine the actual queries and transactions being submitted to SQL Server for processing, as well as a vast array of statistics about each query. These statistics can be used to determine cost and resource consumption of individual queries. Additionally, SQL Server Profiler workloads can be replayed against a different SQL Server installation, allowing database administrators to reproduce the performance characteristics of a particular workload.

SQL Server Profiler workloads can also be passed to the Index Tuning Wizard, which analyzes the workload and makes recommendations on indexing strategies that will provide more optimal performance. The combination of SQL Server Profiler and Index Tuning Wizard facilitate performance tuning and optimization of SQL Server database applications. SQL Server Profiler and Index Tuning Wizard are installed with all editions of SQL Server 7.0, but not with MSDE.

Distributed Database Services

SQL Server 7.0 is an ideal database platform for distributed applications because it includes a powerful array of features and services useful in constructing applications designed to leverage multiple backend servers.

Distributed Query Processing

The SQL Server 7.0 query processor natively supports the ability to reference tables from multiple databases in a single query, even if they reside on a different computer. This is accomplished by referencing remote tables in the FROM clause of a Transact-SQL SELECT, INSERT, UPDATE, or DELETE command.

Distributed queries are extremely powerful. For example, they can be used to join data from tables that reside on different servers. Additionally, remote tables can be accessed using OLE DB or ODBC, extending the benefits of distributed queries to heterogeneous systems as well.

Distributed Transactions

SQL Server 7.0 natively supports the Microsoft Distributed Transaction Coordinator (MS DTC) for conducting transactions that change data that resides on multiple SQL Server installations using a two-phased commit protocol. SQL Server can be configured to enlist remote servers automatically in a two-phased commit. Additionally, Microsoft Transaction Server automatically enlists any and all SQL Servers involved in a distributed transaction when accessing SQL Server from middle-tier components. These features dramatically simplify the development and management of distributed transactions.

Replication

SQL Server 7.0 supports a rich variety of replication models using a publish-and-subscribe replication architecture. Replication automates the process of synchronizing data among multiple SQL Server installations, greatly simplifying the development of distributed database applications.

Snapshot Replication

Snapshot replication is the most basic replication model supported by SQL Server. Using this model, a "snapshot" of the publication is taken on a scheduled basis, and distributed to all subscribers. This is useful when the information contained in the publication is fairly static and subscribers need only be refreshed periodically. MSDE is limited to subscription-only support for snapshot replication.

Transactional Replication

Transactional replication is used when all subscribers are normally active on a LAN or WAN, and changes to a published data set must be replicated to subscribing SQL Server installations on a near real-time basis. Transactional replication uses a high-performance log reader process to copy transactions marked for replication to a durable queue implemented as a SQL Server database, called a distribution database. A separate process then manages the process of applying each of the transactions in the distribution database to all subscribers in an incremental fashion that is highly reliable and consistent. This architecture is ideal when replicating among several server-based installations of SQL Server that are normally continuously available, and when changes must be seen as soon as possible at subscribing sites without the overhead of a complete resynchronization. MSDE is limited to subscription-only support for transactional replication.

Updating Subscribers

Both snapshot and transactional replication models support the concept of "updating subscribers," which allows a subscribing site to update their local copy of replicated data and rely on SQL Server to synchronize the change with the publishing system. Merge replication also supports the concept of updating subscribers, but with a different mechanism.

If a publication is enabled for updating subscribers and a user changes data on a subscriber, SQL Server automatically initiates a distributed transaction using MS DTC between the subscribing site and the publisher. This ensures that the change is applied in a consistent fashion on both sites, with no custom programming required. If the transaction fails for any reason, the modification will be rolled back on both sites. After the distributed transaction commits, the publishing site distributes the change accordingly to all other subscribing sites, and avoids sending a redundant synchronization to the subscribing site where the change was made. This happens on a near real-time basis using transactional replication, and on a scheduled basis using snapshot replication.

Merge Replication

Merge replication is used when subscribers are autonomous and are frequently operating in a disconnected fashion. For example, users on disconnected laptops can be permitted to make changes to data in their subscription while they are offline. When the user connects to the network and initiates synchronization, the subscription is then "merged" with the publication on the server on a per-row basis. Only new rows or rows that have been changed are involved in this process. This is accomplished by selecting a special column added to all merge publications using the new SQL Server 7.0 uniqueidentifier data type. In this fashion, synchronization causes changes to the subscription to be applied at the publishing site, and changes made to the publisher to be applied at the subscribing site.

Because a publication can be simultaneously available to multiple subscribers who are offline, the potential for conflicts exists when users are making changes to the same data. Merge replication can be configured to support a variety of pre-defined conflict resolution schemes that detect conflicts on a column or row level, or a developer can implement a custom conflict resolver that can be used for more sophisticated application requirements. MSDE fully supports both publication and subscription using merge replication.

Data Transformation Services

Microsoft Data Transformation Services (DTS) is an integrated data movement and transformation tool introduced in SQL Server 7.0. DTS itself is a COM object model that exposes extremely rich objects for moving data between a data source and a data destination, while optionally applying transformations en route. DTS can perform simple data movement operations such as import and export, as well as extremely complex extraction, transformation, and loading tasks common in data warehousing applications complete with multiple sources and destinations and workflow. DTS supports any OLE DB provider as a source and destination, including SQL Server, Oracle, personal computer databases, text files, and ODBC. This makes DTS extremely useful in heterogeneous environments.

The DTS object model loads and executes DTS packages. A DTS package is a collection of data movement and transformation instructions that can be stored in a file, in the SQL Server msdb system database, or in the Microsoft Repository. DTS packages can be assembled graphically using tools in SQL Server Enterprise Manager. Import and export wizards are provided for simple operations, and the DTS Designer tool is available for constructing more complex data movement operations.

The DTS object model, as well as the dtsrun utility for executing DTS packages, is installed with all editions of SQL Server 7.0 as well as MSDE. The DTS Import and Export wizards and the DTS Designer are installed as part of SQL Server Enterprise Manager, and therefore are available with all editions of SQL Server 7.0, but not with MSDE.

Decision-Support Services

SQL Server 7.0 is a multipurpose database engine that can be used for a wide variety of applications. Earlier releases of SQL Server were designed for optimal performance in an online analytical processing (OLAP) environment, making it an ideal database for operational systems. OLAP applications typically access the database with high volumes of short, pre-defined and repetitive transactions, such as a transaction for inserting a new order and its line items in an order entry system.

While SQL Server 7.0 continues to excel in OLAP environments, the architecture of this release makes it ideal for an entirely different kind of database application called decision support, also known as "data warehousing" or "business intelligence." Decision support involves the user-driven analysis of very large amounts of database information. When compared to OLAP systems, decision-support applications access the database in a very different manner. Queries are more ad-hoc, and involve correlating information in several different tables (called a "multi-table join"). Extensive sorting, grouping and data aggregation is performed, and scans of very large numbers of rows are common. This section briefly introduces some of the key features of SQL Server 7.0 that are important for decision-support applications.

Extraction, Transformation, and Loading

The databases' decision-support systems are implemented as separate read-only copies of information stored in other operational systems, and are typically referred to as a "data mart" or a "data warehouse." There are several reasons for maintaining a separate decision-support environment:

Decision-support activity can be off-loaded to a separate database, and even a separate server, to ensure that the performance and integrity of the operational systems is not compromised.
Operational database structures are typically very complicated and are optimized for transactional performance. By reorganizing the information into easy-to-use database structures, improved performance for decision-support queries can be achieved and end-users will have an easier time understanding the structure of the database when developing their own queries and reports.
Decision-support databases may represent a consolidation of a variety of different operational systems from different database, operating system, and hardware platforms. Unifying information from these disparate sources in a common decision-support environment makes it possible to perform queries and analytical activities more efficiently.

Data does not magically appear in a decision-support database. It must be moved there periodically and maintained. The data itself is often unfit for use unless it has been validated, cleansed, and transformed en-route. This process is called extraction, transformation, and loading (ETL). SQL Server 7.0 provides a set of tools and services called Data Transformation Services (DTS) that are ideal for developing custom ETL routines for populating decision-support databases. DTS is discussed in more detail in the section titled Data Transformation Services. Using DTS, a developer can construct complex ETL routines to extract data from a wide variety of operational systems using OLE DB or ODBC, apply custom validation and transformation logic to that data, and then load it in a high performance fashion into a SQL Server 7.0 decision-support database.

Query Processing for Decision-Support Environments

As mentioned earlier, SQL Server 7.0 employs a cost-based query optimizer that dynamically creates optimal execution plans for queries. The new query processor can be easily extended with new algorithms and strategies for creating execution plans. Several extensions for decision support have been implemented for SQL Server 7.0 that are quite significant, and many more are planned for future releases.

Advanced Join Techniques

A variety of different strategies can be employed for correlating information between multiple tables (called a "join"). Besides supporting the standard "nested-loop" join algorithm, SQL Server 7.0 adds support for "merge" and "hash" join techniques. In a merge join, the query processor recognizes that useful indexes exist for the keys being used to join two tables. It then performs a high-speed merge operation between the indexes to determine the intersection between the two rather than looping through one of the tables in an iterative fashion looking for matches.

A hash join is useful when no indexes can be found to speed up a join operation. Indexes can be costly in terms of disk space for very large tables, yet users may want to correlate non-indexed information whether the index exists or not. The smaller of the two tables is sequentially scanned in, and a high performance data structure called a "hash table" is created. The larger of the two tables is then sequentially scanned, and its keys are matched in a high performance fashion with entries in the hash table. Because only a single sequential scan is required for each table, costly iteration and additional disk I/O is avoided for non-indexed joins. This same type of algorithm can be used for performing aggregation or "group by" operations when no useful index can be found.

Bulk Update Processing

Because decision-support databases are often redundant, reorganized copies of operational databases, they may require periodic maintenance that involves very large quantities of data. The SQL Server 7.0 query processor employs a completely new set of techniques referred to as "bulk update processing" designed to improve the performance of these kinds of operations. For example, when updating large quantities of rows, the query processor attempts to apply all changes to the actual data pages themselves, and then resort these changes and apply them to each affected index in a single pass if possible. This is faster than updating the indexes on a row-by-row basis.

Data Partitioning

Some tables are so large that it may make sense to "partition" them into smaller tables so that queries can be run against specific partitions instead of the table as a whole. The SQL Server 7.0 query processor adds a new feature that facilitates partitioning without losing the logical structure of the table as a whole. A feature called a "partitioned view" provides the ability to create a "view" or virtual table that combines each of the partitions of a large table into a single logical view that preserves the logical structure of the table. For example, a table of sales figures could be broken down into separate partitions for each quarter, and then a partitioned view could be built, which combines them using a UNION operation so that users do not have to figure out which partition they need to query. When optimizing queries against a partitioned view, the query processor can examine constraints applied to each partition to intelligently determine which partitions will be involved in the query and avoid accessing partitions that will not be affected. More work is planned in the area of data partitioning for future releases of SQL Server to make it easy to partition large tables over multiple servers.

Parallel Query Processing

The SQL Server 7.0 query processor automatically analyzes each query to determine if it would benefit from parallel execution on hardware that supports symmetrical multi-processing (SMP). For example, a large query that aggregates information by state might be broken up into three or four smaller queries that aggregate different ranges of states in parallel. The results from each of these smaller queries are then merged into a single result set. Parallel query processing can result in exponential improvements for large queries in a decision-support environment.

64-Bit Memory Addressing

SQL Server 7.0, Enterprise Edition supports the ability to address up to 32 GB of memory for caching data and index pages on certain hardware platforms. Because decision-support environments typically involve very large databases, increasing the amount of memory available for caching database information can dramatically improve the performance of large queries.

Natural Language Query Processing

SQL Server 7.0 includes a new set of tools, called English Query, which enables natural language queries against database information. Developers use the English Query domain authoring tool to create semantic descriptions of the various entities and relationships that are represented in a database. After compilation, the domain is used by English Query to interpret questions and commands stated in the English language, and to build the corresponding Transact-SQL commands required to carry out the users' instructions.

Custom applications can then embed the custom domain and the English Query engine (implemented as a Microsoft Component Object Model software component) in decision-support applications that require a natural language query interface. For example, a Web-based application that provides ad-hoc query capabilities to analyze purchasing activities in a corporate procurement system could allow users to ask such sophisticated questions as: "Which vendors sold the most personal computers to the marketing division in the third quarter of 1998?" Such a system would allow the user to submit their question using a Web browser. The question would then be passed to the English Query COM component, which loads the custom domain file describing the database for the corporate procurement system and uses it to try to understand what the user is asking. English Query then returns a "restatement" of the question to ensure that it has properly interpreted the user's request. After the restatement has been accepted, English Query returns a Transact-SQL query to the application that can then run the query against the database and return the results to the user.

English Query employs extremely sophisticated natural language processing algorithms and can understand very complex English statements, provided the developer has taken the time to build a comprehensive domain file that describes the database. Adding a natural query interface to a decision-support application using English Query can dramatically improve the power and usability of the system.

Multidimensional Analysis

As previously stated, many queries in a decision-support environment involve extensive grouping and calculation of aggregates such as sums, minimums, maximums, averages, and so on. These types of operations take time to process regardless of how powerful the underlying database engine and hardware platform are. The need for instantaneous retrieval and interactive analysis of large quantities of aggregate information have spawned a new generation of decision-support technology called Online Analytical Processing (OLAP). Several vendors over the past several years have marketed a variety of different OLAP server technologies as well as analytical tools to address this market.

These tools and technologies have been very costly and complex to implement, making OLAP-based decision-support systems accessible only to a privileged few with the resources and expertise to implement such technology. With the advent of SQL Server 7.0, this situation is changing dramatically. SQL Server 7.0 includes an integrated OLAP Server called "SQL Server 7.0 OLAP Services," which simplifies the process of building and deploying powerful OLAP applications.

OLAP Services

OLAP Services is a new multidimensional database server that is tightly integrated with SQL Server 7.0. Using OLAP Services, developers and administrators can construct multidimensional cubes that are populated with information stored in a decision-support database. These cubes are stored on a server and accessed by desktop analytical tools or Web browsers, and enable extremely high-performance analysis of multidimensional information as described in the following section titled Multidimensional Analysis Tools.

OLAP Services dramatically simplifies the cube design process, and supports three different storage models for multidimensional information. These storage models are defined as follows:

Multidimensional OLAP, or (MOLAP), uses a proprietary database structure specific to a particular OLAP Server technology to save cube information to disk. When creating a MOLAP cube, OLAP Services creates a redundant copy of the detail information in the source database, and stores all of the cube aggregations in a highly efficient data structure that is optimized for performance and low disk space usage.
Relational OLAP, or (ROLAP), uses a standard relational database such as SQL Server 7.0 to save cube information to disk. When creating a ROLAP cube, OLAP services creates cube aggregations and stores them in relational tables. OLAP Services resolves queries against the cube by running them against the underlying database server.
Hybrid OLAP, or (HOLAP), as its name suggests, is a hybrid approach, which leverages both MOLAP and ROLAP modes. When creating a HOLAP cube, OLAP services creates cube aggregations and stores them in a proprietary multidimensional database, but leaves the lowest level detail information (called "measures" or "facts") in the underlying database server. OLAP Services resolves queries against the cube by running them against the MOLAP data structure, and only queries the underlying database server when users "drill down" to the lowest level details associated with the cube.

One of the most complicated aspects of creating cubes is defining how much data needs to be pre-aggregated. Cubes with many measures and many dimensions can suffer from data explosion. OLAP Services graphically depicts the tradeoff between disk storage space and query performance, making it easier to implement an initial cube design. After the cube is in regular use, OLAP Services can analyze query activity and suggest alternative aggregation strategies that improve query performance. This approach is called "usage-based optimization."

Some decision-support environments are more demanding than others. For this reason, the architecture of OLAP Services is designed to scale well from desktops, to single-server installations, to multi-server installations.

OLAP Services can offload much of the calculation-intensive overhead associated with analytical activities to the desktop using a client-side component called Pivot Table Services that will be discussed in the next section titled Multidimensional Analysis Tools. By caching portions of a cube locally, Pivot Table Services can perform many CPU-intensive calculations locally without impacting the performance of the OLAP Server.

By creating a virtual cube that is comprised of several smaller cubes, OLAP Services can transparently distribute the processing requirements for very large decision-support environments across several servers. This capability provides significant scalability benefits for high-end environments.

Multidimensional Analysis Tools

There are a wide variety of tools that use OLAP server technology to provide multidimensional analytical capabilities to end-users. Multidimensional analysis allows users to explore vast quantities of pre-aggregated data interactively using a structure called a cube. Cubes are comprised of measures (also called "facts") and dimensions. Measures are the core data values that are pre-aggregated for analysis. Examples of measures are costs, prices, quantities, temperatures, run rates, and so on. Dimensions are simply attributes of measures such as time, customer, location, part, and so on.

Dimensions are typically arranged in a hierarchy, enabling users to "drill down" from high level aggregates to more detailed aggregates. For example, a manager analyzing sales figures might want to start looking at them by quarter, and then drill down to weekly figures. Analytical tools also provide the ability for users to interactively combine and change the dimensions they are using to analyze data using an operation typically known as a "pivot." For example, after analyzing sales figures by time, the manager may want to add analysis by sales region so that the total sales for a particular period of time are broken down by region.

Multidimensional analysis tools can be custom-developed, or purchased from Microsoft using Microsoft Excel® 2000, or third party Independent Software Vendors (ISV). OLAP Services provides a standard application programming interface (API) that can be used by developers or ISVs building such tools. Microsoft has published a standardized COM-based API called OLE DB for OLAP that can be used for querying multidimensional servers such as OLAP Services. OLE DB for OLAP is implemented as a set of multidimensional extensions to OLE DB, combined with a specialized SQL-like query language called MDX. Developers familiar with COM and C++ can write code natively to OLE DB for OLAP. A simplified version of OLE DB for OLAP is also available called ADO/MD. ADO/MD can be used to develop applications that support multidimensional analysis using any ActiveX® compliant programming language such as Microsoft Visual Basic®.

Metadata Services

Large decision-support environments, such as a data warehouse, are difficult to manage due to the large number of tables, databases, servers, ETL logic, multidimensional databases and applications involved. Large data warehousing environments can greatly benefit from tools and technologies that help describe all of the metadata for the various components involved in a data warehouse and the inter-relationships between each component.

SQL Server 7.0 includes the Microsoft Repository 2.0, a flexible metadata repository that uses the industry standard Open Information Model to capture, store, manage, and publish metadata for all major components of a data warehouse system, including relational databases, ETL logic, multidimensional databases, and even the semantic models associated with natural language query processing. See http://msdn.microsoft.com/repository/technical/whitepapers.asp for references for additional information on the Microsoft Repository.

Application Development

SQL Server 7.0 offers an extremely rich environment for developing custom database applications. Custom applications can reside on a user's desktop, communicate with SQL Server using middle-tier components, or provide access to SQL Server data using a Web server. This section provides a brief overview of the key technologies of interest to developers who build custom database applications.

Application Programming Interfaces

Database engines such as SQL Server require a mechanism that allows software developers to build applications that communicate with the database either locally on the same machine or across a network. This mechanism, called an application programming interface (API), enables a custom application to establish a secure connection with the database server, submit query language commands to the database, and process the results of those commands as well as any associated error or informational messages. SQL Server 7.0 supports a variety of APIs that will be briefly introduced in this section.

Universal Data Access using OLE-DB

One persistent problem that confronts developers building custom database applications is the vast array of APIs required to communicate with various "data providers." In the past, it was necessary to program to proprietary, vendor-specific APIs for every different data provider. This made the task of building a single set of data access code in a particular application that could target a variety of different data providers impossible, requiring developers to re-write large portions of their applications when a different data source needed to be targeted.

Microsoft's first attempt to address this problem was the development of an industry standard data access mechanism called Open Database Connectivity (ODBC). ODBC made it possible to use a single API for data access to relational database engines that support the industry standard Structured Query Language (SQL). By writing to ODBC, developers could target any database engine that supports ODBC by installing the appropriate ODBC driver. ODBC has become widely adopted by software developers and remains an important data access mechanism that Microsoft intends to support in the future.

As enterprise software development on the Windows platform became more popular, it was apparent that the concept of a standard data access API must be expanded to address other data providers beyond SQL-based relational databases to achieve the same economies of scale. Examples of such providers include OLAP servers for multidimensional analysis, host databases that do not support SQL (for example, IMS, VSAM, and so on), flat text files, structured documents (for example, Excel spreadsheets), or even hierarchical directory services (for example, Windows 2000 Active Directory™). The concept of an industry standard data access API that provides access to a rich variety of data providers across multiple platforms is known as Universal Data Access.

Unfortunately, the architecture of ODBC was not well suited for Universal Data Access because it was designed to support only SQL databases and did not support Microsoft's Component Object Model (COM) that has become the key mechanism for assembling applications based upon a collection of software components using standardized interfaces. For this reason, it was necessary to design a new data access API that could be easily adapted to a wider variety of data providers and accessed using standard COM interfaces.

This new API is called OLE DB. OLE DB is now in its second major release, and is becoming the de-facto mechanism for Universal Data Access for applications on the Windows platform. SQL Server 7.0 supports OLE DB as a native programming interface.

Applications that utilize OLE DB as a data access API connect to SQL Server 7.0 by way of the SQL Server 7.0 OLE DB Provider. Connections can be established to a local SQL Server installation on the same machine, or a remote SQL Server installation across a network. Remote connections can be configured to support a variety of network protocols and IPC mechanisms, such as TCP/IP sockets, by way of a network provider layer beneath OLE DB known as a network library or "NetLib." This network layer is transparent to the application developer.

OLE DB providers are being made available from Microsoft and third parties for other SQL database platforms such as IBM DB/2, Oracle, and Sybase. Additionally, non-SQL data providers are accessible using OLE DB, and many more will be available in the near future. Examples include:

Text files.
VSAM, IMS, and AS/400 using Microsoft Systems Network Architecture (SNA) Server.
Multidimensional databases such as Microsoft SQL Server OLAP Services.
Windows 2000 Active Directory.

Because a wide variety of high-quality ODBC drivers already exist for several popular SQL databases, OLE DB also supports the ability to build applications that communicate with any ODBC compliant data provider using a mechanism called the OLE DB Provider for ODBC. OLE DB is an extremely powerful and flexible data access API for developers who build applications using Microsoft Visual C++ and COM.

OLE DB supports extensions for multidimensional analysis (called "OLE DB for OLAP") useful for application development against multidimensional data providers such as Microsoft Excel Pivot Table Services and Microsoft SQL Server OLAP Services. OLE DB for OLAP also provides a native multidimensional query language called MDX for building complex multidimensional queries.

Active Data Objects (ADO)

An alternative OLE DB programming model called Active Data Objects (ADO) is provided for developers using any ActiveX compliant development. ADO provides a greatly simplified data-access programming model for developers who want to leverage OLE DB for Universal Data Access in their applications. The latest editions of Microsoft Visual Basic and Microsoft Visual Basic for Applications use ADO as the standard data access mechanism for developers building custom applications. ADO fully supports the OLE DB for OLAP multidimensional extensions using corollary extensions called ADO/MD.

Open Database Connectivity (ODBC)

Prior to the release of SQL Server 7.0, Microsoft ODBC was the preferred data access API for developing custom SQL Server applications. SQL Server 7.0 fully supports ODBC and includes an updated ODBC driver that provides access to all the new features and capabilities available in the database server. Applications written to support earlier releases of the Microsoft ODBC SQL Server driver are fully compatible with SQL Server 7.0, but may not have access to a variety of new features and capabilities provided in the most recent release. Microsoft will continue to support ODBC as a native SQL Server data access interface for the foreseeable future.

DB-Library

Prior to the release of SQL Server 6.5, Microsoft DB-Library was the preferred data access API for developing custom SQL Server applications. SQL Server 7.0 fully supports DB-Library for backward compatibility; however, DB-Library applications do not have access to a variety of new features and capabilities provided in the most recent release. Microsoft will not be enhancing DB-Library and the future and it is considered to be a legacy interface supported for backward compatibility only.

Query Language Support (Transact-SQL)

A data access API is used to establish connections to a database server, send commands, and process the results of those commands. The syntax of the commands supported by a particular database is referred to as its "query language." The Structured Query Language (SQL), pioneered by IBM in the 1980s, has become the de-facto standard query language for relational database products. Although some standardization exists for SQL query languages, most database products implement their own vendor-specific SQL language syntaxes.

SQL Server 7.0 supports its own native query language called Transact-SQL. Transact-SQL is compatible with the American National Standard for Database Language SQL (ANSI X3.135-1992), which improves interoperability between SQL Server and other database products that support the same standard (for example, Oracle, Sybase, IBM DB/2).

Transact-SQL offers significant enhancements to the ANSI standard, improving the power and flexibility of custom applications developed for SQL Server 7.0. Transact-SQL supports the four basic data manipulation language (DML) commands common to all SQL dialects:

The SELECT command is used to retrieve information from relational tables. This command is incredibly rich, allowing for both simple queries and complicated multi-table queries that correlate information among multiple tables (called a "join").
The INSERT command is used to add new rows to a relational table.
The UPDATE command is used to modify existing data values stored in columns in a relational table.
The DELETE command is used to delete rows from a relational table.

Transact-SQL also supports a second category of commands known as data definition language (DDL). These commands enable developers to create new database objects such as tables and indexes, apply security permissions to these objects, and generally manage the structure of a database.

A third category of commands can be referred to as control-of-flow. Examples include IF statements and WHILE loops. These commands make it possible to add conditional logic to batches of Transact-SQL commands. Support for control-of-flow commands and variables make it possible to implement the server programmability features of SQL Server discussed previously in the section titled Triggers and Stored Procedures.

Developing Multitiered Applications (Windows DNA)

Multitiered application development architectures, such as Microsoft's Windows Distributed interNet Applications Architecture (Windows DNA) are becoming a popular paradigm for designing enterprise software applications. For more information about Windows DNA, see www.microsoft.com/dna/.

In a multitiered application, presentation, business logic and data are implemented in separate "tiers." Each tier can be physically located on a different computer over the network. This approach makes it easier to build scalable enterprise applications that are easier to maintain.

SQL Server 7.0 is an ideal database platform for building multitiered applications because of its native support for several key aspects of Windows DNA:

SQL Server supports OLE DB as its native data access mechanism.
SQL Server supports the Microsoft Distributed Transaction Coordinator (MS DTC) for distributed transactions.
Microsoft Transaction Server supports native SQL Server integration, including high-speed connection pooling.

Developing Web Applications

Most applications built for the Web or corporate intranets require a database for storage and retrieval of structured information. Microsoft SQL Server 7.0 is an ideal database platform for Web-based applications for a variety of reasons:

Native support for ADO for data access. ADO is the preferred data access interface when developing Web-based applications using Internet Information Services (IIS) and Active Server Pages (ASP).
Web data publishing and retrieval of HTML table data using the SQL Server Web Assistant.
Web data distribution using SQL Server 7.0 replication and "anonymous subscribers."
Support for TCP/IP, encryption and firewalls for applications requiring a direct connection to the database over the Internet.

Microsoft BackOffice and Small Business Server Integration

Microsoft SQL Server 7.0 is an important member of the Microsoft® BackOffice™ and Microsoft Small Business Server suites of enterprise software products, and as such offers a high degree if integration and interoperability with each of these suites. This section is useful if your organization uses or is evaluating Microsoft BackOffice or Small Business Server. For more information on the Microsoft BackOffice suite of products, see www.microsoft.com/backoffice/. For more information on Microsoft Small Business Server, see www.microsoft.com/smallbusinessserver/.

Windows NT Server Integration

SQL Server 7.0 was natively designed for the Windows NT operating system. This native design ensures that SQL Server exploits key underlying features of the Windows NT environment, including:

Trusted Connections by way of Windows NT Authentication.
Windows NT Performance Monitor Integration.
Windows NT Event Log Integration.
Microsoft Distributed Transaction Coordinator (MS-DTC) Integration.
Microsoft Transaction Server Integration.
Microsoft Internet Information Services Integration.
Microsoft Message Queue (MSMQ) Integration.

Exchange Server Integration

SQL Server 7.0 includes a support service called SQL Mail that enables SQL Server to send and receive e-mail messages by way of the simple MAPI (Messaging API) supported by Exchange Server. SQL Mail offers the following capabilities:

Generate e-mail from Transact-SQL stored procedures and triggers, which includes message strings, query result sets, and other useful content.
E-mail can be sent to standard users, or to special pager accounts.
Process e-mail messages, sent to a special database server mailbox, that contain Transact-SQL command batches, and reply with the results of the commands that were executed.

SNA Server Integration

SQL Server 7.0 relies upon SNA Server for a variety of host-connectivity and database integration features, including:

OLE DB connectivity to AS/400, VSAM and DB/2 data for application integration and host data replication services.
Microsoft Transaction Server interoperability with CICS/IMS host transactions.
Microsoft Message Queue interoperability with MQSeries message queues.

Proxy Server Integration

The TCP/IP and multi-protocol network libraries provided with the SQL Server 7.0 OLE DB and ODBC providers supports the ability to provide secure connections to SQL Server 7.0 installations on a private network by way of the public internet through a firewall implemented using Proxy Server. Additionally, the multi-protocol network library supports the ability to encrypt data sent to and received from SQL Server 7.0 over a secure connection. SQL Server 7.0 can also replicate database information to anonymous internet subscribers through a Proxy Server firewall.

Site Server Integration

Site Server 3.0 provides the ability to manage and publish full-text indexes of SQL Server 7.0 databases.

Systems Management Server Integration

Systems Management Server 2.0 can be used to perform centralized deployment of SQL Server 7.0 servers and clients. Systems Management Server uses SQL Server databases to store software inventory information on a primary site server.

Understanding SQL Server and MSDE Licensing

It is important to understand the software licensing requirements for SQL Server and MSDE before attempting to redistribute either product. This section serves as a brief introduction to licensing and contains references to other resources for more information.

If you are an organization interested in purchasing and deploying SQL Server for internal use, you can find general information on Microsoft licensing policies at www.microsoft.com/enterprise/licensing/ and specific information on SQL Server pricing and licensing policies at www.microsoft.com/SQL/productinfo/pricing.htm.

If you are an Independent Software Vendor (ISV) or Microsoft Certified Solution Provider who is interested in integrating and reselling SQL Server with a value-added business solution, you should investigate the Microsoft Product Integration Program (PIP). Details on the PIP program can be found at www.microsoft.com/enterprise/licensing/general/pip.htm.

Additional SQL Server licensing options for large ISVs are available through Microsoft's Royalty Licensing Program. To find out if your company is eligible for the Royalty Licensing Program, contact your local Microsoft account representative. If you do not have an account representative, you can investigate this option further by contacting your local Microsoft Sales Office. To locate the closest Microsoft Sales Office, see www.microsoft.com/worldwide/.

SQL Server Licensing

Both a server license and the appropriate number of client access licenses must be obtained to ensure your SQL Server installation complies with Microsoft's software licensing policies.

SQL Server Standard or Enterprise Edition Server Licenses

A server license gives you the right to install SQL Server on server editions of Windows NT or Windows 2000. You must obtain a server license either by a new product purchase, or by an upgrade from an earlier release of SQL Server. You can obtain a server license either by purchasing SQL Server individually, or by purchasing Microsoft BackOffice.

SQL Server Client Access Licenses

SQL Server operates in two different licensing modes, "per-seat" and "per-server." In per-seat mode, a SQL Server or BackOffice client access license (CAL) must be obtained for each workstation that connects to the database server.

In per-server licensing mode, each CAL that is purchased is used to increase the maximum number of simultaneous SQL Server connections, regardless of which individual workstations will connect to SQL Server.

SQL Server Desktop Edition Licensing

A SQL Server CAL is required in order to install SQL Server 7.0, Desktop Edition, on a workstation. The CAL must be applied at the workstation, and cannot be applied at a server that is running SQL Server in per-server licensing mode.

Special licensing requirements apply for distributed database applications that leverage SQL Server installations on servers combined with SQL Server installations on workstations. Per-seat licensing is required when you are connecting to SQL Server Standard or Enterprise Edition from either a SQL Server Desktop Edition or MSDE installation on a workstation using technology like replication.

MSDE Licensing

A license to redistribute MSDE can be obtained by purchasing either Microsoft Office 2000 Developer Edition or Microsoft Visual Studio. As with SQL Server 7.0 Desktop Edition, a SQL Server CAL is required if you plan to connect an MSDE installation with a SQL Server installation on a server using technology like replication.

MSDE 1.0 Office 2000 Release is not licensed for redistribution, and is intended for use with Microsoft Access 2000 as an alternative to the Microsoft Jet database engine.