Microsoft SQL Server 7.0 Developer Conversation

Microsoft Corporation

March 5, 1999

Summary: Questions and answers about Microsoft® SQL Server™ 7.0, with extensive code samples. Includes discussions of SQL Server administration, the Microsoft Data Engine, indexes, Data Transformation Services, and other topics. (52 printed pages)

Contents

Administration
Microsoft Data Engine (MSDE)
Index
Bulk Copy Program
Replication
SQL Language
Licenses
Data Transformation Services (DTS)
Other Questions

Administration

Changing Machine Names

When I change my machine name after installing SQL Server 7.0, I receive an error message "Your SQL Server installation is either corrupt or has been tampered with (unknown package id). Please rerun Setup." Can I fix this without having to rerun Setup?

Answer

This error message appears if you have renamed the Windows® 2000 Server computer on which you installed SQL Server 7.0. SQL Server 7.0 uses the Windows 2000 Server computer name internally; if you change the Windows 2000 Server computer name, SQL Server detects a different name and generates the error message. Your data has not been corrupted or damaged in any way.

To resolve this error and start SQL Server again, you need to do either of the following:

For more information, see SQL Server Books Online under "Frequently Asked Questions."

Dereferencing Error Messages

When I create a table with a unique key constraint and then try to insert a duplicate key, I receive the following error message: "Violation of PRIMARY KEY constraint 'cst_FOOValue': Attempt to insert duplicate key in object 'foo'."

I would like to take that string and put it in my audit table so that I can do error reporting on it. The trouble occurs, however, when I go to get the string out of sysmessages based on the error code; I receive this message: "Violation of %s constraint '%.*s': Cannot insert duplicate key in object '%.*s.'"

How do I get the fully dereferenced string into a local variable?

Answer

Use the built-in FORMATMESSAGE function, which is new in SQL Server 7.0. See SQL Server Books Online under "FORMATMESSAGE" for details. Keep in mind that both RAISERROR and FORMATMESSAGE have restrictions on formatting system messages by users who are not members of the sysadmin server role.

Code Sample

Use dbTOPISV
go
Execute sp_dropmessage 50001
Drop table foo
go

Create Table foo
   (fooctv char(2) constraint ffconstr primary key)
go
Execute sp_addmessage 50001,10,N'Violation of constraint %s, the key is: %s'
go
Insert into foo values ('a')
go

Insert into foo values ('a')
go
Declare @valore char(2)
Declare @messaggio char(250) 
Select @valore = 'a'
Insert into foo values (@valore)
If @@error = 2627
   Begin
   select @messaggio = Formatmessage(50001,'foo',@valore)
   Select @messaggio Messaggio
   End

Arithmetic Overflow

I perform an operation on one server that is identical, with identical tables, to an operation on another server. The operation runs fine on the second server; however, on the first server I receive the following message: "Server: Msg 8115, Level 16, State 2, Line 2/Arithmetic overflow error converting expression to data type int/The statement has been terminated."

What server settings should I check for differences?

Answer

It appears that your user options have been set differently on the two servers. You can test this with the following query:

SELECT   @@Options

The number you get back will be the sum of the options specified.

The arithmetic overflow options (ARITHABORT and ARITHIGNORE) control whether arithmetic overflows are reported explicitly or returned as NULLs. More information is available in SQL Server Books Online under "User Options."
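
For example, because @@OPTIONS is a bit mask, you can test individual options with a bitwise AND. As a sketch, using the bit values listed under "User Options" in Books Online (verify the values there before relying on them):

SELECT @@OPTIONS & 64  AS ArithAbort_Set,
       @@OPTIONS & 128 AS ArithIgnore_Set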

Code Sample

Set nocount on 
Set Ansi_Warnings on

SET ARITHIGNORE OFF
select @@options 'Arithignore_off & Ansi_Warnings_on'
select 1/0
Print ' '
SET ARITHIGNORE ON
select @@options 'Arithignore_on & Ansi_Warnings_on'
select 1/0
/*
If either SET ARITHABORT or SET ARITHIGNORE is OFF and SET ANSI_WARNINGS is ON, 
SQL Server still returns an error message when encountering divide-by-zero or overflow errors.
*/
Set Ansi_Warnings off
Print ' '
SET ARITHIGNORE OFF
select @@options 'Arithignore_off & Ansi_Warnings_off'
select 1/0
print ' '
SET ARITHIGNORE ON
select @@options 'Arithignore_on & Ansi_Warnings_off'
select 1/0

Ownership Chains and Permission Checks

I am trying to understand what is happening in the following situation: Users do not have select permission on the underlying table, but have EXECUTE permission on the stored procedure. Within the stored procedure, the first example below works, but the second does not; access is denied.

Select * from tablename             
EXEC("Select * from tablename")   

How does the EXECUTE statement change the situation so that select permission becomes a factor? Is there a good workaround for this? A view would handle the security issues, but it could not have an index and does not seem to be the best solution.

Answer

Since the statement isn't resolved until runtime, SQL Server can't know beforehand that the statement will not be a break in the ownership chain; it therefore treats the statement as a break, and permissions are checked. For more information, SQL Server Books Online contains a detailed section on ownership chains and permission checks.

Code Sample

Set nocount on 
Use dbTopIsv
Drop Table foo
go
Drop procedure prfoo1
go
Drop procedure prfoo2
go
execute sp_dropuser topisv
go
execute sp_droplogin topisv
go
Create Table foo
   (fooctv char(2) )
go
Insert into foo values ('xx')

Select * from foo          

EXEC('Select * from foo')      
go

Create procedure prfoo1 as
EXEC('Select * from foo')  
go             
Create procedure prfoo2 as
Select * from foo          
go

EXEC sp_addlogin 'Topisv', Null, 'dbTOPISV'
exec sp_adduser Topisv
Grant execute on prfoo1 to topisv
Grant execute on prfoo2 to topisv
---- Logoff and logon as user Topisv
Execute prfoo1         --        ('Select * from foo') Doesn't Work—Access is denied
Execute prfoo2         --   Select * from foo    Works

Behavior of Temporary Tables in Stored Procedures

I understand that SQL Server 7.0 has added a feature where a temporary table will force stored procedures to recompile as follows:

  1. The temp table is created.

  2. The temp table is populated.

  3. The first temp table read causes the stored procedure to recompile, using the current statistics for the temp table.

  4. Processing resumes at the step that forced the recompile.

Is this true? If so, why does my first SELECT statement from a temporary table (population 40 rows) take 0.3 seconds? This is a long time for such a small population. The SELECT statement is:

select * from #tAttr where IsAttrGroupInd = 1 and GroupBuiltInd = 0 

If it's true, is there any way to turn off this behavior?

Answer

Use of temporary tables in stored procedures may cause a recompilation of the stored procedure every time it is used. In order to avoid such recompilation, use of temporary tables in a stored procedure must meet the following requirements:

The first two requirements are the same as those for SQL Server 6.5. The third is new to SQL Server 7.0.

For instance, the following procedure will not be recompiled on every execution:

create procedure foo as 
begin 
create table #temp1 (col1 int) 
insert #temp1 values(1) 
select * from #temp1 
end 

The following procedure will be recompiled, however, as it violates the third requirement:

create procedure bar as

begin 
create table #temp1 (col1 int) 
declare bad_cursor cursor for select * from #temp1 
open bad_cursor 
fetch from bad_cursor 
close bad_cursor 
end 

Regarding the role of statistics in causing a stored procedure to be recompiled:

Auto create statistics does not cause a recompilation of a procedure. Statistics creation takes place during optimization, and optimization then continues without inducing a recompilation.

Auto update statistics may cause recompilation of a procedure that makes use of temporary tables. When a query uses a temporary table and that table has zero rows at the point it is compiled, a small recompile limit is set for the table, if auto update statistics is enabled in tempdb. If this limit is exceeded, the query triggers a recompilation when the procedure is executed. This rule is introduced to address the case in which a temporary table is created in a procedure, populated, and then used in a query. Since table size and statistics determine the query optimizer's plan choice, the rule gives the query optimizer the opportunity to select a new plan.

The user who finds that the recompilation introduced in the second case above is undesirable may either:

Code Sample

Set nocount on 
drop table foo
drop procedure foo
drop procedure bar

go
---- The following procedure will not be recompiled on every execution
create procedure foo as 
begin 
create table #temp1 (col1 int) 
insert #temp1 values(1) 
select * from #temp1 
end 
go
---- The following one will, however (violates #3) 
create procedure bar as 
begin 
create table #temp1 (col1 int) 
declare bad_cursor cursor for select * from #temp1 
open bad_cursor 
fetch from bad_cursor 
close bad_cursor 
end 
/* Open SQL Server Profiler.
On the Tools/Options menu, in the Events frame, set All event classes.
Select the SP:Recompile event to be traced.
Execute the stored procedures; you'll see the bar procedure being recompiled.
*/

exec foo 
exec bar

An Example of Parallel Backup-to-Multiple Dump Devices

Do you have an example of parallel backup-to-multiple dump devices?

Answer

Define the backup devices using sp_addumpdevice, then list them in the backup statement.

Example:

backup database MST to DUMP1, DUMP2, DUMP3
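
For completeness, the device definitions might look like the following sketch; the logical names match the example above, and the physical tape paths are assumptions:

USE master
EXEC sp_addumpdevice 'tape', 'DUMP1', '\\.\TAPE0'
EXEC sp_addumpdevice 'tape', 'DUMP2', '\\.\TAPE1'
EXEC sp_addumpdevice 'tape', 'DUMP3', '\\.\TAPE2'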

SQL Server 7.0 spawns multiple threads for multiple backup devices and multiple data files. If you have three tape drives and two data files on different physical drives, SQL Server 7.0 uses two threads to read the database data and three threads to write the data to the tape drives. The data is written to the first device that is ready, rather than in a "round-robin" fashion. If you have two fast tape drives and one slower one, the fast tape drives will write their blocks of data and be ready for another block before the slower tape drive. When the backup has written all of the data in the database to the tape drives, it moves on to the transaction log.

The transaction log is written in a "round-robin" method. Therefore, the transaction log backup is only as fast as the slowest backup device.

The Fastest Way to Recompile All Stored Procedures

I have a problem with a database running in SQL Server 6.5 (Service Pack 4). We moved the database (object transfer) from one machine to another last night, and an error (specific to a stored procedure) is cropping up. However, I can't tell which procedure is causing it. Permissions are granted in all of our stored procedures; is there a way from the isql utility to force all stored procedures to recompile?

Answer

Try the following:

SELECT   'SP_Recompile ' + name + char(10) + 'go'
FROM      Sysobjects
WHERE   Type = 'U'

This query will produce a script that you can copy and execute from isql to run sp_recompile against every user table in the database. Running sp_recompile against a table causes the stored procedures that reference it to be recompiled the next time they are called.

Detection of an Orphaned Client Connection

Assume you have a client connected to SQL Server 7.0. The client starts a transaction and locks some resources. Suddenly the client dies; the locked resources must be released as soon as possible. How and when does the release of the locks occur when the client is connected using the following?

How was the release done in SQL Server 6.x?

Answer

This is the "orphaned connections" problem. There is a description of the problem in SQL Server Books Online under "Troubleshooting."

Basically, the answer depends on how the client dies. It is independent of whether the connection is created via DB-Library, ODBC, or OLE DB.

If the client dies "gracefully" (the client application is closed or hard-killed, for example by the Windows 2000 Task Manager, or the client machine is shut down cleanly), the connection, the server-side session, and all of its resources are released immediately.

However, if the client machine loses power or undergoes cold shutdown, or the network connection is terminated, then the server-side session remains until:

KeepAliveTime in SQL Server 6.5 on Windows NT® 4 was set to two hours. KeepAliveTime in SQL Server 7.0 is set to two hours for desktop servers (because of the possibility of interaction with other applications on the system) and five minutes for standard and enterprise servers.

A low setting of about five to ten minutes for KeepAliveTime is recommended for servers used only for SQL Server 7.0.

More information is available in SQL Server Books Online under "Orphaned Sessions."

Why Expand Master for Upgrade from SQL Server 6.5 to SQL Server 7.0?

Why would I want to expand the SQL Server 6.5 Master Database by 2000 kilobytes (KB)?

Answer

Because the SQL Server Upgrade Wizard needs to run the Instcat.sql script (hotfix.sql) on the SQL Server 6.x side, and without the additional space there isn't enough room to run it successfully.

Host Data Replicator

Is SQL Server 7.0 compatible with the Microsoft Host Data Replicator (HDR)?

Answer

No, it is not. I suggest you use the SQL Server 7.0 Data Transformation Services (DTS) with the OLE DB provider for DB2 (shipping in the Microsoft SNA Server version 4.0, Service Pack 2).

Inability to Select from a Linked Server with a Hyphen in the Server Name

When a server with a hyphen in its name is linked, it doesn't seem possible to reference it in a SQL statement. I tried setting QUOTED_IDENTIFIER ON to eliminate the illegal hyphen, but that doesn't work. In fact, it appears that any quoted four-part table reference is seen as invalid. For example:

SET QUOTED_IDENTIFIER ON
EXEC sp_addlinkedserver 'DONHEMME-5', 'SQL Server'
EXEC sp_droplinkedsrvlogin  'DONHEMME-5',NULL
EXEC sp_addlinkedsrvlogin 'DONHEMME-5', 'false', NULL, 'user', 'password'

EXEC sp_addlinkedserver 'DONHEMME3', 'SQL Server'
EXEC sp_droplinkedsrvlogin  'DONHEMME3', NULL
EXEC sp_addlinkedsrvlogin 'DONHEMME3', 'false', NULL, 'user', 'password'

select count(*) from "donhemme-5.master.dbo.sysdatabases"
GO
select count(*) from "donhemme3.master.dbo.sysdatabases"
GO
select count(*) from donhemme3.master.dbo.sysdatabases
GO
select count(*) from "sysdatabases"
GO

Results (from queries only):
Server: Msg 208, Level 16, State 1, Line 1 Invalid object name 
'donhemme-5.master.dbo.sysdatabases'.  
Server: Msg 208, Level 16, State 1, Line 1 Invalid object name 'donhemme3.master.dbo.sysdatabases'.

Answer

Use brackets [ ] to enclose hyphenated names, for example:

select count(*) from [donhemme-5].master.dbo.sysdatabases

SQL Server 7.0 Profiler Output to ASCII

I have a very large .trc file that I need to write to ASCII (a SQL Server 6.5 trace log file, for example). Is there a way to do this?

Answer

There are two ways:

If you only need the SQL content, save the .trc file as a SQL script.

If you need to write the entire trace to a text file, not just the SQL content, use the Component Object Model (COM) trace reader that will come with the Microsoft BackOffice® Resource Kit. With it, you can read the .trc file and write it to a text file.

Moving File Groups to a New Disk

With SQL Server 7.0, what do I do when I run out of disk space? Can I move the file to a new location? How do I do that?

Answer

When you run out of disk space you just add another file to the file group—you don't have to move the file to a larger disk.

If you want to move a file, the easiest way to do that is to take the database offline:

  1. Take the database offline with sp_detach_db.

  2. Move the file.

  3. Restart the database with sp_attach_db specifying the path to the primary file and the moved file.
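
A minimal sketch of this offline approach, using assumed database and file names:

EXEC sp_detach_db 'Archive'
-- Move the data file to the new disk using the operating system, then reattach,
-- listing the primary file first and the moved file at its new path:
EXEC sp_attach_db 'Archive',
   'c:\mssql7\data\archdat1.mdf',
   'd:\newdisk\archdat2.ndf',
   'c:\mssql7\data\archlog1.ldf'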

If you must move the file without taking the database down, here's how to do that:

  1. Create the new file on the new disk.

  2. Shrink the original file to zero size—thus forcing data to move.

  3. Drop the original file.

Note that adding or dropping a file invalidates the log backup chain, so you must back up the entire database (using file or database backups) before taking log backups again.

Code Sample

Set nocount on 
Use Master
go
Drop database Archive
go
CREATE DATABASE Archive 
ON
PRIMARY (NAME = Arch1,
          FILENAME = 'c:\mssql7\data\archdat1.mdf',
          SIZE = 1MB,
          MAXSIZE = 200,
          FILEGROWTH = 1 ),
  (NAME = Arch2,
  FILENAME = 'c:\mssql7\data\archdat2.ndf',
  SIZE = 1MB,
  MAXSIZE = 200,
  FILEGROWTH = 1)

LOG ON 
  (NAME = Archlog1,
  FILENAME = 'c:\mssql7\data\archlog1.ldf',
  SIZE = 1MB,
  MAXSIZE = 200,
  FILEGROWTH = 20)
go
Use Archive
go
Create table foo (c1 char(7000))
Declare @giro int
Select @giro = 0
While @giro < 400
   begin
   Insert foo values (convert(char(5),@giro))
   Select @giro=@giro+1
   end
---- Have a look at the space allocated, using Enterprise Manager.
---- Then create a new file using Enterprise Manager, with a size equal to or greater than archdat2.

----- Run SHRINKFILE and look at what happens to archdat2:
----- DBCC SHRINKFILE (arch2, emptyfile)
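
To finish the online move in Transact-SQL rather than Enterprise Manager, a sketch along these lines should work (the new file name and path are assumptions):

ALTER DATABASE Archive
ADD FILE (NAME = Arch3, FILENAME = 'd:\mssql7\data\archdat3.ndf', SIZE = 1MB, FILEGROWTH = 1)
---- Empty the original file by moving its data to the other files in the filegroup
DBCC SHRINKFILE (Arch2, EMPTYFILE)
---- Drop the now-empty file
ALTER DATABASE Archive REMOVE FILE Arch2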

Shrinking a Log

According to SQL Server Books Online, I can shrink my log file, but I have not been able to do this. Yes, it is deferred, but it seems as though I can only force it by running BACKUP LOG dbname WITH TRUNCATE_ONLY and then running DBCC SHRINKDATABASE or DBCC SHRINKFILE.

Based on the discussion of virtual log files, shrinking a log file does not seem possible.

Answer

Running DBCC SHRINKFILE followed by BACKUP LOG does not guarantee a log shrink. The log will shrink as far as the current end of log and stop. If you continue to use the database, the end of the log will move out of the portion you are trying to shrink and the file will shrink eventually. However, to shrink the file immediately, you need to generate some log records into the log. This script will do what you need:

use <dbname>
go

create table t1 (c1 char(4000))
go
declare @i int
select @i = 0
while (1 = 1)
begin
while (@i < 100)
begin
insert into t1 values ('a')
select @i = @i +1
end
truncate table t1
backup log <dbname> with truncate_only
end
go

Code Sample

Set nocount on
use dbtopisv
go
drop table t1
go
create table t1 (field1 char(6000))
go
--- Back up the database, so the log starts to grow
BACKUP DATABASE [dbTopIsv] TO  DISK = N'C:\MSSQL7\BACKUP\test' WITH  NOINIT ,  
NOUNLOAD ,  NAME = N'dbTopIsv backup',  SKIP ,  STATS = 10,  NOFORMAT 
go
--- Consume log space
declare @i int
select @i = 0
while (1 = 1)
begin
   while (@i < 100)
   begin
      insert into t1 values ('a')
      select @i = @i +1
   end
   ----truncate table t1
   select @i = 0
   select count(*) from t1
end
go
truncate table t1
DBCC SHRINKdatabase (dbtopisv)
--- The database was shrunk, but the log was not shrunk yet.

--- Run this code to move the end of the log; then the log will finally be shrunk
declare @i int
while (1 = 1)
begin
   select @i = 0
   while (@i < 100)
   begin
      insert into t1 values ('a')
      select @i = @i +1
   end
   truncate table t1
   backup log dbtopisv with truncate_only --- Otherwise the log is not shrunk
end
go

--- resume the backup chain
BACKUP DATABASE [dbTopIsv] TO  DISK = N'C:\MSSQL7\BACKUP\test' WITH  NOINIT ,  
NOUNLOAD ,  NAME = N'dbTopIsv backup',  SKIP ,  STATS = 10,  NOFORMAT 

Locking Caused by Inserts on Pages with a Clustered Index

In SQL Server 7.0, what kind of lock overhead is generated by the need to move rows around on a page, when an insert occurs between existing rows on a page? If we need to move a row on a page to accommodate an insert, will we take another row lock or will we escalate to a page lock?

For example: there's a clustered index on column 1; the table contains rows with column 1 values "a" and "c," with an insert of a row with a column 1 value of "b."

Answer

With SQL Server 7.0, there is no lock overhead. You latch the page, physically reorganize it to insert or coalesce free space, and then release the latch. The lock manager isn't involved at all. (Latches are short-term synchronization primitives that maintain physical consistency on the page.)

Furthermore, no locking impact or secondary index fix-up is required when a clustered index page is split because of a new insert. This was not the case with SQL Server 6.x.

Also note that SQL Server 7.0 reclaims free space in tables that don't have a clustered index. Again, this is different from version 6.x.

Moving the Transaction Log from One Drive to Another

SQL Server Books Online gives a process for restoring or moving a database file from one drive to another. However, I didn't see anything that allowed me to move the transaction log from one disk to another. Can you give a pointer to those steps?

Answer

You can move a transaction log file or files using RESTORE with the MOVE option, in the same manner as a database file.

Alternatively, you can move the transaction log file to another drive by using the sp_detach_db and sp_attach_db stored procedures:

  1. Run sp_detach_db to detach the database, before you move the transaction log file.

  2. Move the transaction log file to the other drive.

  3. Run sp_attach_db.

Following is the sample script:

sp_detach_db 'database1'
go

/* Move the transaction log file */
sp_attach_db 'database1', 'c:\mssql7\data\datafile.mdf', 'd:\mssql7\log\logfile.ldf'
go

Using Views with Partitioned Data

I am trying to use the union view feature of SQL Server 7.0. The problem is that the "check existing data on creation" check box was cleared when the constraint was created, so the existing data in the underlying tables was not validated. Once the check constraint was verified, the query optimizer worked as described.

Is there any way to bypass the "check existing data" flag when creating the constraint, so I don't have to wait for the data to be verified?

Answer

See ALTER TABLE ADD CONSTRAINT. You can specify WITH CHECK or WITH NOCHECK to control whether the data in the table is validated against a newly added or re-enabled FOREIGN KEY or CHECK constraint. If neither is specified, WITH CHECK is assumed for new constraints, and WITH NOCHECK is assumed for re-enabled constraints.

If you do not want to verify new CHECK or FOREIGN KEY constraints against existing data, use WITH NOCHECK. (However, doing so is not recommended except in rare cases.) The new constraint will be evaluated in all future updates. Any constraint violations that are suppressed by WITH NOCHECK when the constraint is added may cause future updates to fail, if they update rows with data that does not comply with the constraint.

The query optimizer does not consider constraints defined WITH NOCHECK. These constraints are ignored until all such constraints are re-enabled using ALTER TABLE table CHECK CONSTRAINT ALL.

When you create a CHECK constraint WITH NOCHECK, you lose the ability to do an optimized union view.

Code Sample

Set nocount on 
Use dbTopisv
Drop table May1998Sales
Drop table June1998Sales
Drop table July1998Sales
Drop VIEW Year1998Sales

CREATE TABLE May1998Sales
    (OrderID    INT        PRIMARY KEY,
    CustomerID        INT            NOT NULL,
    OrderDate        DATETIME        NULL
        CHECK (DATEPART(yy, OrderDate) = 1998),
    OrderMonth        INT
        CHECK (OrderMonth = 5),
    DeliveryDate        DATETIME        NULL,
        CHECK (DATEPART(mm, OrderDate) = OrderMonth)
    )

 CREATE TABLE June1998Sales
    (OrderID    INT        PRIMARY KEY,
    CustomerID        INT            NOT NULL,
    OrderDate        DATETIME        NULL
        CHECK (DATEPART(yy, OrderDate) = 1998),
    OrderMonth        INT
        CHECK (OrderMonth = 6),
    DeliveryDate        DATETIME        NULL,
        CHECK (DATEPART(mm, OrderDate) = OrderMonth)
    )
 CREATE TABLE July1998Sales
    (OrderID    INT        PRIMARY KEY,
    CustomerID        INT            NOT NULL,
    OrderDate        DATETIME        NULL
        CHECK (DATEPART(yy, OrderDate) = 1998),
    OrderMonth        INT
        CHECK (OrderMonth = 7),
    DeliveryDate        DATETIME        NULL,
        CHECK (DATEPART(mm, OrderDate) = OrderMonth)
    )

---The application populating May1998Sales must ensure all rows have 5 
---in the OrderMonth column and the order date specifies a date in May, 1998. 
---This is enforced by the constraints defined on the table.

---A view is then defined that uses UNION ALL to select the data from all 
---three tables as a single result set:
go
CREATE VIEW Year1998Sales
AS
SELECT * FROM May1998Sales
UNION ALL
SELECT * FROM June1998Sales
UNION ALL
SELECT * FROM July1998Sales

/* Use the graphical Query Analyzer to view how the tables are accessed:
SELECT *
FROM Year1998Sales
WHERE  CustomerID = 64892

SELECT *
FROM Year1998Sales
WHERE OrderMonth IN (5,6,7) AND CustomerID = 64892

SELECT *
FROM Year1998Sales
WHERE OrderMonth IN (5,7) AND CustomerID = 64892

SELECT *
FROM Year1998Sales
WHERE OrderMonth IN (6) AND CustomerID = 64892

*/
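
If the constraints above had instead been created WITH NOCHECK, a sketch of re-enabling them with validation (per the ALTER TABLE guidance in the answer) might look like this:

ALTER TABLE May1998Sales WITH CHECK CHECK CONSTRAINT ALL
ALTER TABLE June1998Sales WITH CHECK CHECK CONSTRAINT ALL
ALTER TABLE July1998Sales WITH CHECK CHECK CONSTRAINT ALL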

Microsoft Data Engine (MSDE)

Replication between Versions of MSDE

Can transactional and merge replication be used between versions of the Microsoft Data Engine (MSDE)?

Answer

MSDE can publish merge replication to any other subscriber (MSDE or SQL Server 7.0, or Jet in Microsoft Access 2000). MSDE can be a subscriber to transactional replication, as long as the distribution server is running Per Seat licensing. It cannot publish transactional replication, as it has no log reader.

Information on MSDE

I would like information regarding the features, capabilities, and limits of the Microsoft Data Engine (MSDE). Can you help?

Answer

A good source is the white paper available at www.microsoft.com/Office/enterprise/prodinfo/MSDtaEng.htm.

Here is a simple FAQ:

Q: What platforms (NTS, NTW, Windows 98)?

A: NTS, NTW, and Win9x.

Q: Are driver updates needed for client machines?

A: No, you can use the regular ODBC SQL Server driver and SQL Server OLE DB Provider for both MSDE and SQL Server.

Q: Is an OLE DB provider available (a SQL Server driver)?

A: Yes, the same one as for SQL Server; it ships with MSDE, SQL Server, and MDAC 2.1.

Q: Can ADO be used unchanged between SQL Server 7.0 and MSDE: same types, locking, options?

A: No changes are needed anywhere.

Q: Can RDO/DAO (ODBCDirect) be used, and can Access 97 use linked tables against MSDE?

A: Yes, the same way as you would use them against SQL Server 7.0.

Q: Is there full support for SQL Server 7.0 DDL commands and all column types?

A: It is identical between MSDE and SQL Server 7.0.

Q: What security models are supported on Windows 2000 and Windows 98?

A: On Windows 2000, MSDE supports both integrated and standard security; on Win9x there is only standard security, just as with SQL Server 7.0.

Q: Regarding concurrent user support, what are the limiting factors?

A: The data portion of the database is limited to 2 gigabytes (GB) in size, and the engine is recommended for about five concurrent users.

Q: Is there support for views, triggers, stored procedures, with exactly the same syntax as SQL Server 7.0?

A: The syntax is identical.

MSDE Licensing on Replication

I understand that Microsoft Data Engine (MSDE) does not support replication to SQL Server, unless one purchases a SQL CAL. How does a paper license technically allow MSDE to replicate? When the customer gets the SQL CAL, must they then upgrade the MSDE to SQL Desktop?

Answer

No, there wouldn't be a need for this.

You can use either Microsoft SQL Server 7.0, Desktop Edition, or Microsoft Data Engine (MSDE) to access SQL Server—it's merely a choice of tools. There are three key questions to consider:

If you know prior to creating the solution that you need to optimize for replication, you may consider deploying a solution with Microsoft SQL Server 7.0 Desktop Edition (an option with Microsoft SQL Server 7.0, Standard Edition, and Microsoft SQL Server 7.0, Enterprise Edition; it is not licensed separately). Desktop Edition includes the CALs, which are legally required. Plus, the Desktop Edition includes all of the SQL Server management tools (which MSDE does not) and is unlimited in database size. MSDE is limited to 2 GB in database size.

MSDE is a data engine that will ship in Microsoft Office 2000 Professional and Microsoft Office 2000 Developer and possibly other tools, and does not include any CALs. You would consider using MSDE for the same types of individual and small group runtime applications for which you use Access/Jet today. MSDE is completely compatible with SQL Server, so you've protected your investment if you ever need to scale. MSDE does not include a user interface (Access or another custom user interface can be used).

You may not need to access a SQL Server database today, although technically you could. In this scenario, you use a custom user interface (such as Access or another tool) with MSDE and you've purchased the CALs if you do need to access SQL Server.

The Desktop Edition includes SQL Server Enterprise Manager and such tools as SQL Server Books Online.

In terms of benchmark information, the Desktop Edition and MSDE are based on SQL Server 7.0 technology and should perform about the same.

Index

Background Noise of Index Probes

On SQL Server 7.0, I sometimes notice a "background noise" of index probes.

There are only two clients (two Query Analyzer windows), which are dormant apart from my running sp_who. The SQL index probe counter is turned on in PERFMON. The SQL Server process CPU is 0.00. Periodically, about every eight seconds, there is one index probe. This behavior can last for hours.

Anyone know what causes it? I often run huge random (Rags) queries, so perhaps it's leftover from that.

Answer

It could very well be the background thread kicking in and sniffing to see if there's any work to do (such as removing ghost records from indexes, or shrinking files on the Desktop Edition).

Supporting a Bitmap Index

Does SQL Server 7.0 support a bitmap index as an option in addition to the B-tree index structure?

Answer

SQL Server 7.0 does not support bitmap indexes. However, index intersection and union, the main advantages often associated with bitmap indexes, are available.
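
As an illustration of index intersection, consider a hypothetical table with two single-column nonclustered indexes; for a query that filters on both columns, the optimizer can seek each index and intersect the resulting row sets (all names here are invented):

CREATE TABLE Orders (OrderID int, CustomerID int, ProductID int)
CREATE INDEX ix_cust ON Orders (CustomerID)
CREATE INDEX ix_prod ON Orders (ProductID)

SELECT OrderID FROM Orders WHERE CustomerID = 7 AND ProductID = 42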

Distribution Steps in a Composite Index

I've noticed that SQL Server now keeps density statistics for all levels of a composite index. For example, with three columns (C1, C2, C3), we keep density on (C1), (C1, C2), and (C1, C2, C3). However, SHOW_STATISTICS only shows the distribution of the first column in the key.

Do we keep distribution data on the other columns and just not show it? Is it true that the query optimizer may sometimes overlook a useful index where the first column is fairly uniform and dense? The combination of columns may result in a highly selective but highly skewed index: for example, a Part Number column has many more items in some categories than others, where the Part is, say, Category, ItemCode, and so on.

Does the query optimizer notice the skew and use it when looking for some part numbers and not others? Or should we just forget about composite indexes and covering indexes and use multiple single column indexes instead?

I realize that SQL Server 6.5 didn't capture the statistics on the second column either.

Answer

Multicolumn indexes continue to be very valuable; you should continue using them.

You are correct in that the statistics kept do not include all possible statistics and that the query optimizer may sometimes be fooled by missing statistics. However, there are multiple means to remedy these problems.
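
For example, one such remedy is to create additional column statistics yourself, so the query optimizer has distribution information beyond the leading index column. As a sketch, with assumed table and column names:

CREATE STATISTICS stat_ItemCode ON Parts (ItemCode)
CREATE STATISTICS stat_Category_ItemCode ON Parts (Category, ItemCode)
UPDATE STATISTICS Parts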

Index Traversal I

It seems that using a nonclustered (NC) index to access data is most efficient if the table is just a heap.

The NC index leaf stores a row locator. In a heap, the row locator is a RID (file, page, row number) and points directly to the data row. In a table with a clustered index, however, the row locator is the unique clustered index (CI) key. Thus, a lookup must traverse both the NC index and the clustered index to find the data row.

Is this correct?

Answer

Yes it is, but the difference is probably insignificant in practice because of buffering of upper index levels in the CI.

Index Traversal II

If I change the value of a clustering key, must I walk every NC index and update all leaf pointers that use this CI key to point to this data row?

Answer

Yes, you must. Obviously, this argues for a clustering key that is stable. For example, don't use account balance; use account zip code instead.

Index Traversal III

In SQL Server 6.5, there was a problem with external fragmentation occurring on online transaction processing (OLTP) systems on tables with no clustered index. This problem was due to inserts always going to the end of the heap (the last page allocated to the table). I am under the impression that in SQL Server 7.0, this is not the case. During inserts, SQL Server scans a map page and the inserts go wherever they can fit, so holes caused by deletes and certain updates are no longer a problem. Is this true? Or does "forwarded records" refer to rows that span more than one page?

Answer

SQL Server 7.0 includes free-space management. New inserts into a heap can make use of available space on any page. Since SQL Server 7.0 heaps no longer include page links, there is really no end of the table.

SQL Server 7.0 row IDs (RIDs) are stable with respect to updates. If a row is updated to make it larger and the new size is larger than the free space on the page, the content of the row moves to a new page with sufficient space, and a forwarding pointer is left at the original location. If the row is subsequently moved again, the forwarding pointer is updated, so there is always at most one level of indirection.

The locking implementation is "pure" regarding RIDs, so it's possible to implement an online background task that cleans up these forwarded rows.

Bulk Copy Program

Indexes and DROP_EXISTING vs. SORTED_DATA_REORG

With SQL Server 6.x, if I bulk copy data into a table and the data is presorted by LASTNAME, I can create a clustered index using the SORTED_DATA_REORG option. This allows me to avoid the sort routine when the new index is being built.

Is there a way to get this same functionality in SQL Server 7.0?

Answer

The SORTED_DATA_REORG option no longer exists. Instead, SQL Server 7.0 features DROP_EXISTING, which only applies to an existing index.

Currently in SQL Server 7.0, there is no mechanism to bulk copy data out and maintain the order. Bulk copying from a view doesn't solve this, because views don't allow ORDER BY.

SQL Server 7.0 adds the ability to supply a SELECT statement instead of a table name when bulk copying data out. One of the benefits of that capability is that you can preserve the order of your data in the flat file.

If you can, do the entire load as a single batch for the best performance.

Answer

In SQL Server 7.0, there are no page links between pages in a table without a clustered index. As such, you can't bulk copy into a table of this form and then build an index without re-sorting. SQL Server 7.0 pushes the notion of "sortedness" back to the bulk copy client, so you can still do what you want. Check SQL Server Books Online under "Ordered Data Files" and the ORDER hint for BULK INSERT and the bcp utility.
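
As a sketch under assumed server, database, file, and column names, you might bulk copy the data out in sorted order using the queryout option, and then tell BULK INSERT that the data file is already ordered by the clustered key so the load can skip the sort:

bcp "SELECT * FROM mydb.dbo.customers ORDER BY lastname" queryout c:\customers.dat -c -S myserver -T

BULK INSERT mydb.dbo.customers_new FROM 'c:\customers.dat'
WITH (DATAFILETYPE = 'char', ORDER (lastname))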

Memory Leak Problems with BCP

We have a job that bulk copies about 140 MB of data in 208,000 rows; it consistently consumes nearly all of the memory on a 512-MB machine. Sometimes it results in a crash, and it always reduces available system memory to less than 4 MB. The system is a Proliant 5500: 4 x P200, 512 MB RAM, 160-GB drive array. The paging file is set to 1024-2048.

Any insight is appreciated.

Answer

For SQL Server 6.x, bcp.exe will map the whole file into memory if it can. Try bcp.exe from SQL Server 7.0.

Varchar Greater than 255

When I do a bulk copy from Visual Basic 6 (I use a format file), it only allows me to bulk copy varchar columns of 255 characters maximum. However, SQL Server 7.0 now permits varchar up to 8000.

Answer

It looks like the bcp.exe being invoked is the DB-Library (SQL Server 6.5) version. If you make sure that the bcp.exe invoked is the SQL Server 7.0 version, you will be able to bulk copy varchar values greater than 255.

Using the TOP Command

The TOP command does not work. Even in SQL Server Enterprise Manager, when I right-click a table and choose Return Top, I receive an "incorrect syntax near '10'" error.

Here is an example of the command produced by the SQL Server Enterprise Manager:

SELECT TOP 10 * FROM "dbo"."Bid_Header"

Answer

You need to have your database compatibility level set to SQL Server 7.0.

To check your compatibility, run the following:

Sp_dbcmptlevel @dbname

To set your compatibility to SQL Server 7.0, run the following:

Sp_dbcmptlevel @dbname, 70

Replication

Replication over RAS

I have a product that uses Remote Access Service (RAS) for replication, and I have much experience trying to get replication working successfully over RAS. This experience has brought me to the conclusion that the only way to successfully replicate over unreliable dial-up lines is by message-based replication. I currently use Sybase SQL Anywhere, and the SQL Remote utility for replication, in particular. SQL Remote performs replication by analyzing the transaction log. The transactions marked for replication are then sent through messages (either e-mail or shared file access) among the various databases.

Do you have any information now regarding how the SQL Server 7.0 replication technologies are "tuned" for RAS scenarios? Do I restart if I lose the link, or can I continue from the point of failure?

Answer

You can resume rather than restart. The Merge Agent maintains metadata about exactly what has changed, and so moves only rows that must be either uploaded or downloaded. It also keeps track of what has been merged to date, so that if the merge process aborts, it can be resumed from essentially a recent savepoint, without reapplying the entire job. Similarly, for transactional replication, should the line fail, replication can be resumed at the point of the last successful transaction.

Timestamp and Merge Replication

I understand that in SQL Server 7.0, timestamp fields are not supported for merge publications. I have a C/S application that relies on timestamp fields for cursor concurrency, and I want to support disconnected users from the same database. What strategies should be used to support both types of users?

Answer

With SQL Server 7.0, timestamps are supported by transactional replication.

If you can disallow updating of the subscription database while disconnected, you can use the immediate updating subscribers option with snapshot or transactional replication. Effectively, though, with this option any disconnected user would have read-only data for the duration of their offline session.

Alternatively, if you can structure this situation as a roll-up scenario where the often-disconnected servers are the only ones who update the data, you can set each of these up as a publisher and your central server as a read-only subscriber.

If you can get away from requiring the timestamp, your options not only include merge but also nonpartitioned, bidirectional, transactional replication. See the entry "Nonpartitioned, Bidirectional, Transactional Replication" in SQL Server Books Online.

If the cursors are server-side, they can also use row values for cursor concurrency. In SQL Server 7.0, row value calculations are nearly as fast as timestamps and require no extra space in tempdb. With SQL Server 6.5, row value calculations take up more space in tempdb and are a bit slower, however. The only change needed would be to drop the timestamp columns from the tables; cursors will automatically use row values if optimistic concurrency is requested and no timestamp column is present.

SQL Language

Writing a Query Running on SQL Server

I'm having some trouble writing a query that will do what I need.

I have a table T1 with an integer field FIELD1 and the following records:

FIELD1 
0
1
2
3

I want to perform a query that will give me the result:

FOO
false
false
true
false

Using Microsoft Access, I can simply write the following query:

Select FIELD1=2 as FOO From T1

Using Oracle, the query would be:

Select decode(FIELD1,2,1,0) as FOO From T1

However, I just can't seem to write something like these for SQL Server 7.0. The best that I could do:

Select FIELD1-2 as FOO From T1  

This gets me the result:

FOO
-2
-1
0
1

This is very close to what I want; I just need to logically negate the values. However, I cannot find anything in SQL Server syntax that will do this. I tried the ~ operator, but since it performs a bitwise NOT operation, it doesn't give me the right answer.

Is such a query possible using SQL Server 7.0, without resorting to stored procedures and really complex queries, such as a union of two queries?

Answer

It's actually quite easy. Use the CASE function.

Select    'FOO' = 
       CASE 
            WHEN Field1 = 2 THEN 'True'
            ELSE 'False'
        END
From   T1

Returns
FOO   
----- 
False
False
True
False

Code Sample

Set nocount on 
Drop table T1
go
Create table T1 (field1 int)
go
insert T1 values(0)
insert T1 values(1)
insert T1 values(2)
insert T1 values(3)
go

Select * from T1

Select    'FOO' = 
       CASE 
            WHEN Field1 = 2 THEN 'True'
            ELSE 'False'
        END
From   T1

Shortening a Datetime Field to Date

I can't find a function that will take a datetime field value such as 11/19/98 8:16:00 PM and shorten it to 11/19/98. Do you have a solution?

Answer

Try the following:

select convert (char(20),getdate(),101)

This results in 01/08/1999.

To get the two-digit year, use 1 instead of 101.

Code Sample

Set nocount on 
Drop table t1
go
Create table T1 (c1 datetime)
go
Insert T1 values (getdate())

select * from t1
select convert (char(20),c1,101) from t1

select convert (char(20),c1,1) from t1

Calculating the Number of Days in a Month

I am trying to find a way to calculate the total number of days in each month. I am trying to avoid using a CASE statement, as it fails for February in leap years. Is there a way to do it using the date functions in SQL Server?

Answer

You can use DATEDIFF for part of it:

select datediff(dd,'Oct 1, 1998','Nov 1, 1998') ---> Result: 31
select datediff(dd,'Feb 1, 1998','Mar 1, 1998') ---> Result: 28
select datediff(dd,'Feb 1, 2000','Mar 1, 2000') ---> Result: 29 
select datediff(dd,'Feb 1, 1900','Mar 1, 1900') ---> Result: 28

Code Sample

Set nocount on 
declare @dateInQuestion datetime,
@howManyDaysThisMonth int
select @dateInQuestion = getdate() 
select @dateInQuestion = dateadd(DAY, 1 + (-1 * datepart(d, @dateInQuestion)), @dateInQuestion)
select @howManyDaysThisMonth = datediff(DAY,@dateInQuestion,(dateadd(MONTH,1,@dateInQuestion))) 
select 'thisMonth' = datepart(m,@dateInQuestion), 'DaysInThisMonth' = @howManyDaysThisMonth 

Cross-Tab Report

How do I create a cross-tab report in SQL Server 7.0?

Answer

Here is code that is similar to what I've used to create cross-tab reports on an ASP page:

USE pubs
SELECT stor_name,
SUM(qty) AS 'Total',
SUM(CASE WHEN ord_date BETWEEN '1/1/1993' AND '3/31/1993' THEN qty ELSE 0  END) AS Qtr1,
SUM(CASE WHEN ord_date BETWEEN '4/1/1993' AND '6/30/1993' THEN qty ELSE 0  END) AS Qtr2,
SUM(CASE WHEN ord_date BETWEEN '7/1/1993' AND '9/30/1993' THEN qty ELSE 0  END) AS Qtr3,
SUM(CASE WHEN ord_date BETWEEN '10/1/1993' AND '12/31/1993' THEN qty ELSE 0  END) AS Qtr4
FROM sales s INNER JOIN stores st ON s.stor_id = st.stor_id
WHERE ord_date between '1/1/1993' AND '12/31/1993'
GROUP BY stor_name
/*

This produces:

stor_name Total Qtr1 Qtr2 Qtr3 Qtr4
Barnum's 50 0 50 0 0
Bookbeat 55 25 30 0 0
Doc-U-Mat: Quality Laundry and Books 85 0 85 0 0
Fricative Bookshop 60 35 0 0 25

       

*/

To get a total line at the end, I use a union query.

However, if the datetime column is also keeping track of times, the preceding solution won't work.

"BETWEEN '1/1/93' AND '3/31/93'" will not include anything that happened on 3/31 after midnight. Any rows that have a date of 3/31/93 anytime after midnight will not be included in any sum (and the same for 6/30/93, 9/30/93, and 12/31/93).

I suggest using the datepart qq instead for the sums, and the datepart yy for the WHERE clause.

USE pubs
SELECT stor_name,
SUM(qty) AS 'Total',
SUM(CASE datepart(qq, ord_date) WHEN 1 THEN qty ELSE 0  END)  AS Qtr1,
SUM(CASE datepart(qq, ord_date) WHEN 2 THEN qty ELSE 0  END)  AS Qtr2,
SUM(CASE datepart(qq, ord_date) WHEN 3 THEN qty ELSE 0  END)  AS Qtr3,
SUM(CASE datepart(qq, ord_date) WHEN 4 THEN qty ELSE 0  END)  AS Qtr4
FROM sales s INNER JOIN stores st ON s.stor_id = st.stor_id
WHERE datepart(yy, ord_date) = 1993
GROUP BY stor_name

Code Sample

Set nocount on 
USE pubs
SELECT stor_name,
SUM(qty) AS 'Total',
SUM(CASE datepart(qq, ord_date) WHEN 1 THEN qty ELSE 0  END)  AS Qtr1,
SUM(CASE datepart(qq, ord_date) WHEN 2 THEN qty ELSE 0  END)  AS Qtr2,
SUM(CASE datepart(qq, ord_date) WHEN 3 THEN qty ELSE 0  END)  AS Qtr3,
SUM(CASE datepart(qq, ord_date) WHEN 4 THEN qty ELSE 0  END)  AS Qtr4
FROM sales s INNER JOIN stores st ON s.stor_id = st.stor_id
WHERE datepart(yy, ord_date) = 1993
GROUP BY stor_name


Union all

SELECT 'Total',
SUM(qty) AS 'Total',
SUM(CASE datepart(qq, ord_date) WHEN 1 THEN qty ELSE 0  END)  AS Qtr1,
SUM(CASE datepart(qq, ord_date) WHEN 2 THEN qty ELSE 0  END)  AS Qtr2,
SUM(CASE datepart(qq, ord_date) WHEN 3 THEN qty ELSE 0  END)  AS Qtr3,
SUM(CASE datepart(qq, ord_date) WHEN 4 THEN qty ELSE 0  END)  AS Qtr4
FROM sales s INNER JOIN stores st ON s.stor_id = st.stor_id
WHERE datepart(yy, ord_date) = 1993

Getting a MAX Value Query

I have a table called "comments," defined as the following:

Id int identity
Comment varchar (125)
posnumId  int
ChangedAt  datetime

The user can enter an unlimited number of comments for each position. I need to extract only the last one entered. This is the record with the MAX ChangedAt value. However, when I use a query such as the following:

select MAX(ChangedAt) as MaxChangedAt, comment, 

I don't get the results I need. Is there a way to do this?

Answer

Here is one way to do it:

select top 1 * from comments
order by ChangedAt DESC

And here is another way to do it. I'm not sure which way is faster for you.

select c.*
from  comments c
join  (select MAX(ChangedAt) as MaxChangedAt from comments) AS m
on  c.ChangedAt = m.MaxChangedAt

/****
create table comments (
Id int identity
,comment varchar (125)
,posnumId  int
,ChangedAt  datetime)

insert into comments (comment,posnumId,changedAt)
values ('hi',1,'1999/03/01')
insert into comments (comment,posnumId,changedAt)
values ('bye',1,'1999/05/01')
****/

Here is a very similar looking way:

select *
from   comments 
where  ChangedAt IN (select MAX(ChangedAt) from comments)

(You can also use "=" in place of the "IN" because the subquery returns a single value.)

If you need the last comment for each position, try the following:

SELECT *
FROM Comments INNER JOIN
   (
SELECT MAX(ChangedAt) changedat,posnumID 
FROM comments
GROUP BY posnumID ) LastEntries
ON Comments.posnumId = LastEntries.posnumid and 
Comments.ChangedAt = LastEntries.changedat

Code Sample

Set nocount on 
Use dbTopisv

Drop table comments
go

Create table comments 
(
Id int identity,
comment varchar (125),
posnumId  int,
ChangedAt  datetime
)

Insert comments values ('aaaa',1,'01-01-99')
Insert comments values ('bbbb',1,'01-02-99')
Insert comments values ('cccc',3,'01-01-99')
Insert comments values ('dddd',3,'01-02-99')
Insert comments values ('eeee',2,'01-03-99')
Insert comments values ('ffff',1,'02-01-99')
Insert comments values ('gggg',1,'03-02-99')
Insert comments values ('hhhh',2,'03-01-99')
Insert comments values ('iiii',3,'04-02-99')
Insert comments values ('llll',3,'04-03-99')

SELECT  *
FROM comments
select MAX(ChangedAt) as MaxChangedAt, comment from comments  --- error
Select MAX(ChangedAt) as MaxChangedAt from comments group by posnumID

Select *
from  comments c
join  (select MAX(ChangedAt) as MaxChangedAt from comments) AS m
on  c.ChangedAt = m.MaxChangedAt

select *
from   comments 
where  ChangedAt IN (select MAX(ChangedAt) from comments)

SELECT *
FROM Comments INNER JOIN
   (
SELECT MAX(ChangedAt) changedat,posnumID 
FROM comments
GROUP BY posnumID ) LastEntries
ON Comments.posnumId = LastEntries.posnumid and Comments.ChangedAt = LastEntries.changedat

A Method to Sum a Count

In a query that returns a count using COUNT(*), is there a way to sum the counts? In SQL Server 7.0 Books Online, I read that it "Cannot perform an aggregate function on an expression containing an aggregate or a subquery."

When I use a GROUP BY clause, I get a series of counts; I would like to find a way to total them. I don't necessarily need to show the counts, just the resultant SUM.

Answer

The SUM of the counts would be the count from the entire table without grouping.

If you're using other aggregates besides COUNT, aggregating the aggregates may give a different value than just a simple aggregate on the whole table. You can use a derived table:

select sum(counter) from 
(select counter=count(*) from titles group by type) as t

To show the counts by group as well as the grand total, you can use ROLLUP:

select type, number=count(*) from titles 
group by type with rollup

Doing this will give you a NULL for the type in the row with the total, but you can change that as follows:

select case 
when grouping(type)=1 then 'TOTAL'
else type
end,
number=count(*) from titles 
group by type with rollup

Code Sample

Set nocount on 
use dbTopisv
drop table foo
go
create table foo
(counter int,type int)

Insert foo values(1,1)
Insert foo values(1,1)
Insert foo values(1,1)
Insert foo values(10,2)
Insert foo values(20,2)
Insert foo values(100 ,3)
Insert foo values(200,3)

select * from foo

select counter=count(*) from foo 

select counter=count(*) from foo group by type

select sum(counter) from 
(select counter=count(*) from foo group by type) as t

select counter=avg(counter) from foo group by type

select sum(counter) from 
(select counter=avg(counter) from foo group by type) as t

select type, number=count(*) from foo 
group by type with rollup

select case 
when grouping(type)=1 then 'TOTAL'
else convert(char(10),type)
end Type,
number=count(*) from foo
group by type with rollup

Dropping all but the MAX Values in a GROUP BY Query

I'd like to get a query that returns the last entries in a table.

I have these fields:

UniqueID (incremented by one each time)

City Name

Date

Sales

I'd like to return the last entry in the table for each of the cities that have existing sales.

My GROUP BY clause is CityName, Sales. I can't get the MAX function that is used against UniqueID to return just the last sales figure for each city. There may be weeks or days when not all cities have sales, so the entries may be spread out in the table.

Answer

Try this:

SELECT ....
FROM T INNER JOIN
   (
SELECT MAX(UniqueID) as UniqueId
FROM T 
GROUP BY CityName, Sales
) LastEntries
ON T.UniqueId = LastEntries.UniqueId

Code Sample

Set nocount on 
Use dbTopisv
drop table foo
go
Create table foo (UniqueID int,daysale datetime,CityName char(10),Sales int)
Insert into foo values (1,'1999-02-01','aaaaa',10)
Insert into foo values (2,'1999-02-02','aaaaa',10)
Insert into foo values (3,'1999-02-03','aaaaa',20)
Insert into foo values (4,'1999-02-01','bbbbb',30)
Insert into foo values (5,'1999-02-03','bbbbb',30)
Insert into foo values (6,'1999-02-04','bbbbb',40)
Insert into foo values (7,'1999-02-02','ccccc',50)
Insert into foo values (8,'1999-02-03','ccccc',50)
Insert into foo values (9,'1999-02-05','ccccc',50)
Insert into foo values (10,'1999-02-07','ccccc',60)

SELECT *
FROM foo INNER JOIN
   (
SELECT MAX(UniqueID) as UniqueId
FROM foo 
GROUP BY CityName, Sales
) LastEntries
ON foo.UniqueId = LastEntries.UniqueId

Setting the Current Date as the Default Value for a Datetime Field

How can I cause the default value in a datetime table field to be set to the current date stamp?

Answer

Use the GETDATE function as your default:

create table foo (datefield datetime default getdate(), bar varchar(10))

Then,

insert foo (bar) select 'blah'

The datetime field will be set to the current time.

Showing Unique Records

I need to be able to display the count of unique rows within a SQL Server 7.0 database. With a SELECT statement and a GROUP BY clause, I get a result set containing citynames and unique user names with duplicate citynames. Instead, I want to display the count of unique users within a city. Using DISTINCT does not seem to help.

Answer

Use the following SELECT statement:

select count(distinct users), city
from ...
group by city

Code Sample

Set nocount on
Use dbtopisv
Drop table foo
go
Create table foo (users char(10), city char(10))
go
Insert foo values ('useraa','citya')
Insert foo values ('useraa','citya')

Insert foo values ('useraa','citya')
Insert foo values ('userab','citya')
Insert foo values ('userab','citya')
Insert foo values ('userab','citya')
Insert foo values ('userac','cityb')
Insert foo values ('userad','cityb')
Insert foo values ('useraa','cityb')
Insert foo values ('useraa','cityb')

select count(distinct users) users, city
from foo
group by city

Select * from foo

A Casting Question

I have a table that stores locale information (ID varchar(20), Description varchar(50)), with such data as:

'0409', 'English'

'041D', 'Swedish'

I need to convert the data into another table, where the ID column is int. I want to convert a string '0x43' into an integer value 67. How do I do this conversion?

I tried the following:

SELECT Cast('0x'+ID as Int) FROM 

It did not work, however.

Answer

Here's how to do that:

declare @snum char(10)
select @snum = '010'

declare @dec int
exec master..xp_sscanf @snum, "%x", @dec output

select @dec

Code Sample

set nocount on
declare @snum char(10)
select @snum = '010'
declare @dec int
exec master..xp_sscanf @snum, "%x", @dec output

select @dec

Using a Stored Procedure in a Subquery

I use the following subquery:

SELECT * FROM table1 WHERE uniqueID IN (SELECT uniqueID FROM table2)

However, when I make the subquery a stored procedure, as in the following:

CREATE PROC sp1 AS
SELECT uniqueID FROM table2

and try to integrate the stored procedure in the first query statement, I get errors:

SELECT * FROM table1 WHERE uniqueID IN (EXEC sp1)

I've searched SQL Server Books Online and the SQL Server documentation, all to no avail. Am I trying to do something that's not legal? Are there workarounds?

Answer

For SQL Server 7.0, there is a roundabout way of doing this.

  1. Create a linked server (for example, MyLinkedServer) which refers back to the SQL Server you're using (Data source=MyServer).

  2. Set the catalog to be your application database.

  3. Set security to "impersonation."

  4. Use the following:
    SELECT * FROM table1 WHERE uniqueID IN (SELECT * FROM OPENQUERY(MyLinkedServer,'exec sp1'))
    

This may not be the most efficient way to do what you want, but it avoids creating temporary objects.

Code Sample

set nocount on
use dbTopIsv
go
Drop table table1
go
Drop table table2
go
Drop procedure sp1
go
Create table table1 (uniqueIDt1 int)
Create table table2 (UniqueIDt2 int)
Insert table1 values(1)
Insert table1 values(2)
Insert table1 values(3)
Insert table1 values(4)
Insert table1 values(5)
Insert table1 values(6)
Insert table2 values(2)
Insert table2 values(4)
Insert table2 values(6)
go
CREATE PROC sp1 AS
SELECT uniqueIDt2 FROM table2
go
SELECT * FROM table1 WHERE uniqueIDt1 IN (SELECT uniqueIDt2 FROM table2)

execute sp_dropserver mylinkedserver
execute sp_addlinkedserver @server = 'mylinkedserver' , 
     @provider = 'SQLOLEDB' , 
   @datasrc = 'topisv03', 
   @srvproduct = 'SQLServer',
    @catalog = 'dbtopisv'

select * from table1
select * from mylinkedserver.dbtopisv.dbo.table2
SELECT * FROM table1 WHERE uniqueIDt1 IN (SELECT uniqueIDt2 FROM table2)

SELECT * FROM table1 WHERE uniqueIDt1 IN 
(SELECT * FROM OPENQUERY(MyLinkedServer,'exec sp1'))

Narrowing a Query by a SELECT DISTINCT Statement that is not in a GROUP BY Clause

In a query I'm writing, I need to narrow down the result set by adding a SELECT DISTINCT. However, the field I'd like to apply it to is not something I'm grouping by, since that would expand the result set from a summary to a detail report. In the query below, the SELECT DISTINCT on EventLog.EventUserID is the part that I'm trying to add to a working query:

SELECT A.Year,
A.Week,
A.ScopeID,
A.NewsletterNm,
A.NewsletterID,
A.IsPromoMailInd,
A.Cnt 'subscriptions',
B.Cnt 'unsubscribes'
FROM (SELECT distinct EventLog.EventUserID, DatePart(yy,EventLog.EventDt) 'Year', DatePart(wk,EventLog.EventDt) 'Week', Newsletter.ScopeID, 
Newsletter.NewsletterNm, Newsletter.NewsletterID, CONVERT(INT,Newsletter.IsPromoMailInd) IsPromoMailInd, Cnt = count(*)
FROM EventLog, Newsletter
WHERE Newsletter.NewsletterID = EventLog.EventNewsletterID AND ((EventLog.EventType='Subscribe') AND 
(DatePart(wk,EventLog.EventDt)<=DatePart(wk,getdate())) AND (DatePart(yy,EventLog.EventDt)=DatePart(yy,getdate())))
GROUP BY DatePart(yy,EventLog.EventDt), DatePart(wk,EventLog.EventDt), Newsletter.ScopeID, 
Newsletter.NewsletterNm, Newsletter.NewsletterID, CONVERT(INT,Newsletter.IsPromoMailInd)
) A

My error: Column "EventLog.EventUserID" is invalid in the SELECT list because it is not contained in either an aggregate function or the GROUP BY clause.

Is there a way I can redo this query to add the distinct statement I need?

Answer

You need to add an additional subquery for the SELECT DISTINCT statement.

What follows would be an analogous query in pubs.

Here is what you have now:

SELECT stor_id, SUM(qty) FROM sales
GROUP BY stor_id

Here is what you want that fails:

SELECT DISTINCT ord_num, stor_id, SUM(qty) FROM sales
GROUP BY stor_id

A solution:

SELECT s.stor_id, SUM(s.qty) FROM 
(SELECT DISTINCT ord_num, stor_id, qty FROM sales) AS s
GROUP BY s.stor_id

In your case, you will have two nested subqueries.

SELECT A.year...
FROM (SELECT ...
FROM (SELECT DISTINCT eventlog.EventUserID, ...) AS e, Newsletter WHERE ...
GROUP BY ...

By the way, you should switch to the American National Standards Institute (ANSI) standard join syntax:

FROM EventLog JOIN Newsletter ON EventLog.EventNewsletterID = Newsletter.NewsletterID WHERE ...

With the old-style syntax it is easy to omit a join condition and get an unintended cross join, and the old-style outer join operators (*= and =*) can produce ambiguous results.

Code Sample

set nocount on
use pubs

SELECT s.stor_id, SUM(s.qty) FROM 
(SELECT DISTINCT ord_num, stor_id, qty FROM sales) AS s
GROUP BY s.stor_id
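
For a pattern closer to your situation, here is a minimal sketch in pubs with two levels of nesting; it assumes only the standard sales and stores tables. The innermost query applies the DISTINCT, the middle level performs the join, and the outer level does the grouping and aggregation.

SELECT x.stor_name, SUM(x.qty) AS total_qty
FROM (SELECT DISTINCT s.ord_num, s.stor_id, s.qty, st.stor_name
      FROM (SELECT DISTINCT ord_num, stor_id, qty FROM sales) AS s
      JOIN stores AS st ON s.stor_id = st.stor_id) AS x
GROUP BY x.stor_name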

Using LIKE for Underscores

When I write the following query:

select x where x like '%_%'

it returns all values for "x," whether they contain an underscore or not. How do I fix this?

Answer

You need to use brackets, like this:

 '%[_]%'

Or you can define an escape character with the ESCAPE clause. In a LIKE pattern, an unbracketed underscore is a wildcard that matches any single character. SQL Server Books Online has more information, available under "LIKE."

Code Sample

Set nocount on
Use dbtopisv
go
drop table foo
go
create table foo (c1 char(100))
go
insert foo values ('a_aaaaa')
insert foo values ('b10%cccc')
insert foo values ('b10bcccc')

select c1 as KO from foo where c1 like '%_%'           -- KO: the underscore is a wildcard, so every row matches
select c1 as OK from foo where c1 like '%[_]%'         -- OK: only rows containing a literal underscore match
select c1 as KO from foo where c1 like 'b10%c%'        -- KO: the percent sign is a wildcard, so 'b10bcccc' also matches
select c1 as OK from foo where c1 like 'b10|%c%' escape '|'   -- OK: only the row containing a literal percent sign matches

Programming Defensively

In SQL Server 7.0, what should I expect from the following, if ColumnName is all spaces:

SELECT LTRIM(RTRIM(ColumnName))

In SQL Server 6.5, I would receive a NULL. In SQL Server 7.0, it appears that I receive an empty string. Is that the case? Do routines written to expect a NULL in this statement have to be rewritten, or is this a configurable feature of SQL Server 7.0?

Answer

As good programming practice, I would recommend that you program defensively; a developer should never rely on side effects and undefined behavior, especially when there is no clear reason why one behavior should be favored over another. In this case, defensive programming would amount to testing both for NULL and for zero length.

However, you can use sp_dbcmptlevel() to return to SQL Server 6.5 behavior. For more information, search "sp_dbcmptlevel" in SQL Server Books Online.

My recommendations: If this is a system that is likely to be modified further in the future, fix the code; an editor macro can help locate and update the affected statements. For a dead or truly legacy system, use sp_dbcmptlevel() as a quick fix, but be aware that it changes many behaviors at once and will complicate work for developers going forward.

Code Sample

Set nocount on
Use dbtopisv
go
execute sp_dbcmptlevel dbtopisv,65 
select LTRIM(' ') as 'returns NULL'
execute sp_dbcmptlevel dbtopisv,70 
select LTRIM(' ') as 'returns an empty string'
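
To illustrate the defensive test described above, here is a minimal sketch that accepts either the SQL Server 6.5 behavior (NULL) or the SQL Server 7.0 behavior (an empty string); the variable names are illustrative only.

set nocount on
declare @value char(10), @trimmed varchar(10)
select @value = '          '                 -- all spaces
select @trimmed = LTRIM(RTRIM(@value))
-- Treat NULL (6.5 behavior) and zero length (7.0 behavior) the same way
if @trimmed is null or datalength(@trimmed) = 0
   select 'blank' as Result
else
   select 'not blank' as Result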

SET IMPLICIT_TRANSACTIONS vs. BEGIN TRANSACTION

What is the effect of SET IMPLICIT_TRANSACTIONS ON? It seems equivalent to the BEGIN TRANSACTION statement. In one case, SQL Server Profiler did not show a BEGIN TRANSACTION statement when my ADO program called the BeginTrans method. Instead, it issued

set implicit_transactions on 

and I was then able to do a ROLLBACK TRANSACTION with no preceding BEGIN TRANSACTION statement.

Answer

The main purpose of the SET IMPLICIT_TRANSACTIONS command is to allow Transact-SQL to be used in a way that is closer to standard SQL transaction semantics. In ANSI SQL, there is no BEGIN TRANSACTION statement; you are always in a transaction.

In implicit transaction mode, a transaction is started automatically on the server when the first statement executes (if one is not already open), and nothing is committed or rolled back until you issue the COMMIT or ROLLBACK statement yourself.

So, if you have the following:

set implicit_transactions on
go
insert 
insert 
insert

this is internally turned into

BEGIN TRAN
Insert
insert
insert
...

The transaction will not be rolled back or committed unless the user issues the correct statement.

Without IMPLICIT_TRANSACTIONS, each statement effectively runs in its own transaction (autocommit), as follows:

begin tran
insert
commit tran
begin tran
insert 
commit tran
...
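
The following minimal sketch shows the difference using @@TRANCOUNT; the temporary table name is illustrative only.

set nocount on
set implicit_transactions on
go
create table #t (c1 int)                    -- this statement implicitly opens a transaction
insert #t values (1)
select @@trancount as 'open transactions'   -- returns 1, although no BEGIN TRANSACTION was issued
rollback transaction                        -- rolls back both the CREATE TABLE and the INSERT
go
set implicit_transactions off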

Licenses

The Number of Licenses and User Connections Per Server

Does the number of licenses per server equal the number of user connections? I have SQL Server 7.0, with 10 as the number of licenses per server, and 200 user connections. I received the following error message:

"Login failed - The maximum simultaneous user count of 10 licenses for this server has been exceeded. Additional licenses should be obtained and registered via Licensing in the NT Control Panel."

At that time, I ran sp_who and saw nine users (under the same user login name) connected to the server. Is the number of licenses per server actually the same as the number of user connections?

Answer

There is no connection between the "number of licensed users" and the SQL Server configuration option called "user connections." They are two fundamentally different issues, separated by a few million lines of code. One does not affect the other.

Per-server licensing indicates the number of distinct computers that can establish a server message block (SMB) session with the machine hosting SQL Server. In other words, if I buy a 100-user per-server SQL Server license, I have the legal right for 100 different machines to connect to the machine hosting SQL Server, each using an SMB session. Therefore, if 100 different machines each open one SQL Server connection, the 100 "per-server" limit has been reached: all 100 SMB sessions are now consumed.

If each of those 100 machines is running 17 different applications, and each application opens 14 different connections to the same database (trust me, I've seen poorer application design), then SQL Server's "user connections" parameter could be set as follows: allow five connections for system functions (such as checkpoint), allow another five just for free space, plus 23,800 (14 connections/application x 17 applications/machine x 100 machines) to handle the workload. This would give you a value of 23,810 for SQL Server "user connections." This scenario is perfectly legal, and in fact is the correct configuration for supporting the user's various applications. The customer buys 100 user licenses and configures SQL Server for roughly 24,000 connections.
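
The licensing count cannot be changed from Transact-SQL, but the "user connections" configuration option can. Here is a minimal sketch using the 23,810 value computed above; note that "user connections" is an advanced option, and the new value takes effect only after SQL Server is restarted.

exec sp_configure 'show advanced options', 1
reconfigure with override
go
exec sp_configure 'user connections', 23810
reconfigure with override
go
exec sp_configure 'user connections'   -- confirm the configured value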

Data Transformation Services (DTS)

Data Transformation Services Licensing Error

I can transfer from most SQL Server 7.0 servers to most other SQL Server 7.0 servers with Data Transformation Services (DTS), but I receive a licensing error on one. Can you shed some light on this?

Answer

Your server must be licensed per seat to use replication or DTS between it and Microsoft SQL Server 7.0 Desktop Edition. Also, you must have the appropriate Client Access License (CAL). See your licensing agreement for details.

Other Questions

A Billion-Row Table

What are the issues of a one-billion-row table?

The SQL Server 6.5 documentation gives the maximum table size as one terabyte (TB), and SQL Server 7.0 documentation says that one is only limited by "available storage."

Has anyone actually created a billion-row table?

Answer

Yes. Earlier this year we used one of the large Fibre Channel disk farms in our lab (connected to an eight-way DG system) to build a database with a full terabyte of row/column data. The project was done to ferret out potential problems with databases of this size. We used our internal TPC-C database loader, since it is fast and well tested. The database build included bulk loading the data and indexing the tables. We then used the system to test a wide variety of administrative operations and queries. While we did find a few minor problems during the load, all were addressed and none were serious. We continue to maintain and test with this system in our lab. This database contains almost five billion rows of data. The largest individual table, order_line, contains three billion rows. The actual cardinality of the database tables is as follows:

Table        Rows
warehouse    10,000
district     100,000
customer     300,000,000
history      300,000,000
orders       300,000,000
new_order    90,000,000
order_line   3,000,000,000
stock        1,000,000,000
item         100,000

Total Rows:  4,990,210,000

More than One Result Set

I am using CDatabase and CRecordset in Visual C++® 6.0 to access SQL Server 6.5. My queries require several statements, including multiple SELECT and SET statements. I have not been able to create a stored procedure for these queries because the text of the SELECT statements depends on the number of parameters to my search, as well as the values of those parameters. To get around both these issues, I created a simple stored procedure, ExecText:

Create Proc ExecText (@param text) As
exec(@param)
return
GO

This procedure will execute an entire script that I send to the procedure as the sole parameter.

Here is the kind of script I am writing:

ExecText
'Select ID
Into #TempTbl1
From MyTable

Select ID
Into #TempTbl2
From #TempTbl1

Select ID
From #TempTbl2'

I know that this sample involves a lot of legwork for a simple query. The point is, I have to populate some temporary tables using a SELECT INTO statement twice before returning an actual record set. This doesn't work. After executing the SQL, CRecordset::IsEOF returns true, telling me that no records were returned. I know that there are records, and when the above is executed in ISQL_w, I do see records returned. The SELECT INTO approach does work, however, when I "Select Into" only a single time before returning a recordset.

I notice in ISQL_w that when the query is executed, I get back a rowcount for the second statement, then the results of the third statement, then the rowcount of the third statement. I think this is key to the problem. I believe that the rowcount for the second Select, which is a SELECT INTO statement (and when executed by itself will return no data or rows), is evidence of the opening of a cursor when that does not need to happen. I think that CRecordset is being returned this cursor, instead of the "real" cursor returned by the third Select statement.

How can I execute the above script so that the actual results of the query are returned to my CRecordset? The operation works if there is only one SELECT INTO statement before the final Select, but not if there are two.

Answer

You may be generating more than one result set.

Try rs.NextRecordset after you read the first, where rs is the recordset. This method applies to ADO.

When you are returning recordsets through Open Database Connectivity (ODBC), it is recommended that you use SET NOCOUNT ON in your stored procedure, so that the "number of rows affected" message is not sent to the client. If you are doing a relatively large number of INSERT, UPDATE, and SELECT INTO statements, these messages can fill the client-side network buffers and make it look as if no records are being sent. If you are using the ODBC API, you should call SQLMoreResults until you get SQL_NO_DATA_FOUND, to make sure that there are no remaining result sets.

The best solution in any case is to use SET NOCOUNT ON in the stored procedure.
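
Applied to the ExecText procedure shown in the question, that recommendation amounts to one extra line; this is a sketch only, and the rest of the procedure is unchanged:

Create Proc ExecText (@param text) As
set nocount on   -- suppress "rows affected" messages so only the real result sets reach the client
exec(@param)
return
GO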

Slow-Performing Batch Insert

The following simple batch took almost five hours to run on SQL Server 7.0.

print "Inserting 1 million line items"
GO
set nocount on
declare @ctr int
select @ctr = 1
while @ctr <= 1000000 --1 million line items
begin
insert into LineItem (Attribs) values ('LineitemAttribute'+convert(char(8), @ctr))
select @ctr = @ctr + 1
end
GO

The same batch took only 1.5 hours on SQL Server 6.5. Is this to be expected? If yes, what can be done to improve insert performance, besides dropping constraints? The table had no insert triggers and looked like this:

CREATE TABLE LineItem
(
  LineItemID     int IDENTITY (1, 1) NOT NULL,
  Attribs        char(32) NOT NULL,

CONSTRAINT PK_LineItem PRIMARY KEY CLUSTERED (LineItemID)
)

Answer

Create a stored procedure, or use sp_executesql, so that the INSERT statement is not recompiled on every iteration:

set nocount on
declare @ctr int, @p1 char(32)
declare @stmt nvarchar(100), @params nvarchar(100)
select @ctr = 1
select @stmt = N'insert into LineItem (Attribs) values (@p1)'
select @params = N'@p1 char(32)'
while @ctr <= 1000000 --1 million line items
begin
   select @p1 = 'LineitemAttribute' + convert(char(8), @ctr)
   exec sp_executesql @stmt, @params, @p1 = @p1
   select @ctr = @ctr + 1
end
GO

When you have done that, batch them up:

set nocount on
declare @ctr int, @stmt nvarchar(1000), @params nvarchar(500)
declare @p1 char(32), @p2 char(32), /* ... */ @p10 char(32)
select @ctr = 1
select @stmt = N'
insert into LineItem (Attribs) values (@p1)
insert into LineItem (Attribs) values (@p2)
      ....
insert into LineItem (Attribs) values (@p10)'
select @params = N'@p1 char(32), @p2 char(32), ..., @p10 char(32)'
while @ctr <= 1000000 --1 million line items
begin
   select @p1 = 'LineitemAttribute' + convert(char(8), @ctr + 1)
   select @p2 = 'LineitemAttribute' + convert(char(8), @ctr + 2)
      ...
   select @p10 = 'LineitemAttribute' + convert(char(8), @ctr + 10)
   exec sp_executesql @stmt, @params, @p1, @p2, /* ... */ @p10
   select @ctr = @ctr + 10
end

If the data inserted is as simple as your example, create a table with 1,000 values in it (say Digits, with a single int column col1 holding the values 1 through 1,000) and do a loop of:

declare @thou_ctr int, @stmt nvarchar(500)
select @thou_ctr = 1
select @stmt = N'insert LineItem (Attribs)
   select ''LineitemAttribute'' + convert(char(8), @thou_ctr * 1000 + col1)
   from Digits'
while @thou_ctr <= 1000 --1 million line items
begin
   exec sp_executesql @stmt, N'@thou_ctr int', @thou_ctr
   select @thou_ctr = @thou_ctr + 1
end
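
The Digits helper table is not defined in the answer. Here is a minimal sketch that builds it, assuming a single int column named col1 (as referenced above) holding the values 1 through 1,000:

create table Digits (col1 int not null)
go
set nocount on
declare @i int
select @i = 1
begin transaction      -- wrap the small loop in one transaction to reduce commit overhead
while @i <= 1000
begin
   insert Digits values (@i)
   select @i = @i + 1
end
commit transaction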