Enhancing DTS Performance

There are a number of factors that can affect the performance of DTS packages. Several considerations are included here. Note that in many scenarios, thorough testing before implementation is strongly recommended.

Using ActiveX Scripts

Performance tests show that Microsoft® ActiveX® transformation scripts can slow down data pump operation by as much as two to four times compared to copy operations. The choice of scripting languages can have an effect, too. In general, Microsoft Visual Basic® Scripting Edition is faster than Microsoft JScript®, which in turn is faster than PerlScript.

Using ordinal values to refer to columns in ActiveX transformations can be much faster then referring to columns by name. For example, use:

DTSSource(1)

  

instead of:

DTSSource("CustomerID").

  

Using the Data Pump and Transformations

The Data Pump task is designed for speed and can compete with command prompt bcp if it is doing only copies. Performance effect comes when you start using custom transformations.

Performance can be improved by avoiding mapping a separate transformation function for each column involved in a data pump operation. Using the DTS Import and DTS Export wizards or DTS Designer, it is very easy to define one-to-one mappings between columns. When each column contains a transformation, the script engine has to be invoked for each transformation.

Because each transformation may have an associated script, language specification, and function, consider mapping as many like transformations into a single invocation as possible. This allows a single invocation of the transformation script and exposes all the columns to the transformation.

When a small number of columns is involved, performance differences may be negligible, but when a large number of columns is involved in many-to-many instead of one-to-one mappings, performance can be improved by approximately five percent or more.

Data-Driven Queries vs. Transformations

When Microsoft SQL Server™ is a data destination of the data pump, transformations use the IRowsetFastLoad interface. When SQL Server is not the destination, the data pump uses the IRowsetChange interface (typically, sending INSERT statements).

Data-driven queries differ from the data pump in that they use the ICommand interface on the destination using prepared insert operations. This might be faster than the IRowsetChange interface that transformations use, depending on how your destination OLE DB provider implements the interfaces.

Using Bulk Insert and bcp

The Transact-SQL BULK INSERT command, supported by the SQLOLEDB provider, is significantly faster than bcp or the data pump for performing text file import operations. Therefore, if no transformations are necessary, use the Bulk Insert task and achieve as much as double its throughput.


Note BULK INSERT is limited to file import operations.


When importing data, DTS copy and bcp operations are approximately the same speed. However, when exporting data, bcp operations can be approximately three to six times faster. Native bcp, which only applies to SQL Server data, is faster. BULK INSERT and native bcp are comparable in speed.

Using DTS Lookups

DTS lookups are best used when the input to the lookup is an external value, such as a global variable. An advantage of the lookup feature is that it can be invoked from within a transformation function, allowing you to associate a query and a connection with the lookup. In a transformation function, you can then execute the query and have a single value returned. You can achieve similar functionality using COM objects, VARIANTs, and the DTS GlobalVariables collection. However, the advantage of the DTS Lookup object is that it uses an established connection it is optimized for quick data retrieval.

If the transformation function can use an SQL statement instead of a lookup (for example, a SELECT statement with a join clause), performance will be greatly improved.

Using Connections

The ExecuteOnMainThread property is intended to be used only when necessary (for example, it is required by some drivers that are not thread-safe). Assuming you do not have precedence constraints, which may affect the order of operation of a DTS package, avoid using ExecuteOnMainThread in ActiveX scripts to achieve concurrency.

For safety, only one task can use a connection at a time. Thus, to achieve parallel execution you must set up different connections for each task. For example, a source (A) might connect to two destinations (C) and (D), but the operation occurs serially. The same source (A) could also be configured as (B), and parallelism could be achieved by (A) connecting to (C) and (B) connecting to (D).

By default, the maximum number of concurrent steps is four, and can be set using the MaxConcurrentSteps property.

Using Other SQL Server Solutions

When simply moving data between tables without transformations or validations, the fastest method is to use a SQL Server Transact-SQL query. For multiple sources, consider running a distributed query such as a SELECT INTO statement.

As an alternative to using ActiveX scripts in DTS to transform data one row at a time, the new SQL Server Query Analyzer gives excellent performance with complex joins, often reducing existing SQL Server version 6.5 processes by half or more.

In an environment using multiple packages, each DTS package must be run as a separate process using the DTS Run method in either a batch operation or using SQL Server Agent. This makes DTS a client process. If you must run a large number of packages (for example, 1,000 or more) as a server process, consider using SQL Server replication, which provides snapshot, transactional, and merge capabilities.


(c) 1988-98 Microsoft Corporation. All Rights Reserved.