Common SQL Server Questions from SQL Developers

Rick Vicik and Greg Crafts

Created: March 20, 1992

ABSTRACT

This article contains common programming questions received from MicrosoftÒ SQL Server developers.

Question:

I see that SQL Server provides built-in functions such as object_id and object_name. I don’t understand how these functions can be useful.

Answer:

Built-in functions can greatly simplify writing queries against SQL Server catalog tables. For example, to find all objects that depend on a particular object, you can issue a query against the sysdepends table. But if you want object names rather than object IDs, you must join with the sysobjects table twice. Doing so introduces the complication of correlation names on the twice-named sysobjects and in the join conditions.

SELECT t3.name

FROM sysdepends t1, sysobjects t2, sysobjects t3

WHERE t2.name = "foo"

AND t2.id = t1.id

AND t3.id = t1.depid

Using the object_id built-in function to translate the input object name to an object ID and using object_name to translate the output object ID to an object name eliminates the joins and makes the query much easier to understand.

SELECT OBJECT_NAME(depid)

FROM sysdepends

WHERE id = OBJECT_ID("foo")

Question:

Is it okay to use the substring function to look at part of a column’s data and to use the results in a view?

Answer:

You can use convert, substring, concatenate, and similar functions in a view definition to convert data to the desired format. For example, if a column is used as an atomic data item in most cases but occasionally needs to be treated as a separate column for a particular application, you can define a view that substrings out the pieces (and gives them names). These view columns can then be sorted, selected on, and so on, but they cannot be updated.

CREATE VIEW decode

AS SELECT

piece1 = SUBSTRING(whole,1,2),

piece2 = SUBSTRING(whole,3,2),

piece3 = SUBSTRING(whole,5,2)

FROM table1

It is even possible to ORDER BY on the pieces:

SELECT *

FROM decode

ORDER BY piece3, piece1

You can construct a short lookup translation table in a view using substring. For example, suppose that a table contains a single-integer reply that ranges from 1 to 3. If you want to convert this to the fixed-length character strings YES, NO, and MAYBE, you can use the following query:

SELECT rtrim( SUBSTRING( "yes no maybe",(col1*5)-4,5 ) )

FROM table1

Note that the “yes no maybe” string is padded with blanks so that each choice contains exactly five characters. The substring function uses this value to determine the start position and the length of the substring.

Bitmask theta join

Suppose you want to categorize the items in the items table in multiple ways so that categories can be added dynamically without adding columns to tables. A single item may belong to many categories. One way to do this is to create a relationship table that consists of the keys of the items table and the categories table. Another approach is to assign category numbers in powers of two and to represent the membership of an item in a category by setting the corresponding bit in a category bitvector in the items table. You can use a trigger to assign the next highest power of two (or the application can do so).

ITEMS

item_name varchar(30) pkey

item_description varchar(100)

.

.

.

catvector varbinary(25) null

CATEGORIES

catcode varbinary(255) pkey

category_name

You can define a view that joins the items table to the categories table to translate the bit codes to their respective character descriptions.

CREATE VIEW categorize

AS SELECT item_name, item_description, ..., category_name

FROM items, categories

WHERE CONVERT( INT, catvector )

& CONVERT( INT, catcode )

This view returns all categories for each item. Transact-SQL requires the conversion to integer before the bitwise AND is allowed. This limits the number of categories to 32, although a varbinary column longer than 4 bytes can be substringed into multiple integers.

Expressions in the select list

Andrew Warden’s Test Number 1 for relational systems (see C.J. Date’s Relational Database Writings, 1985-1989, page 462) states:

Does the system allow you to phrase, in a single expression, the query ‘For each employee, show his or her name, salary, and average salary for his or her department’?

Standard SQL fails this test because it does not permit aggregates to appear in rows along with detail data (actually this can be accomplished through some extremely convoluted standard SQL). Transact-SQL passes the test in a simple way because it allows complete SELECT expressions in place of column names in the select list.

Suppose the emp table contains an employee’s name, salary, and department number. The following is not legal in standard SQL because name and salary are neither in the GROUP BY clause nor are they aggregates.

SELECT name, salary, AVG(salary)

FROM emp

GROUP BY dept

The Transact-SQL version works because the average salary for the department is computed in the expression for the third item in the select list. That expression is recomputed for each row in the emp table because a column from the outer table is referenced within the subselect WHERE clause. The average is computed for each row in emp, which is not the most efficient way to find the answer.

SELECT

name,

salary,

( SELECT AVG(salary)

FROM emp

WHERE dept = t1.dept )

FROM emp t1

Question:

I’m writing a double-entry accounting package using SQL Server. Are there any tricks to maintaining a transaction table that tracks all my debits and credits?

Answer:

Suppose an account-tracking system is designed as follows:

TRANS

id int pkey

account int

amount money

type char(2)

ACCT

account int pkey

.

.

.

The trans table contains a record of every transaction for each account. Each record contains the amount and the transaction type. Some transaction types are credits, and some are debits. The effective balance for each account can be computed as follows:

SELECT

( SELECT SUM(amount)

FROM trans

WHERE account = acct.account

AND type = 'CR' )

- ( SELECT SUM(amount)

FROM trans

WHERE account = acct.account

AND type = 'DB' )

FROM acct

The first expression sums up the credit transactions for a particular account, and the second sums up the debits. The sums are computed for each account because the outer table acct is referenced inside the subselects.

Although this solution is interesting, a simpler solution is more efficient. It requires a small table to translate transaction codes to either +1 or –1, depending on whether they are credit or debit transactions (the example above assumes only one type of each, but the solution can handle multiple transaction codes).

LOOKUP

code char(2)

sign tinyint

SELECT account, SUM(amount*sign)

FROM trans, lookup

WHERE type = code

GROUP BY account

Question:

Transact-SQL and standard SQL seem to differ when DELETE or UPDATE is used with a join. Can you clarify the differences?

Answer:

Transact-SQL allows a join to be specified on DELETE and UPDATE statements. This doesn’t delete or update rows in multiple tables. The join is used only to specify which rows to delete or update in a target table.

Standard SQL syntax vs. Transact-SQL syntax for DELETE:

DELETE FROM table/view WHERE condition

DELETE table/view [FROM table/view,...] WHERE condition

Standard SQL syntax vs. Transact-SQL syntax for UPDATE:

UPDATE table/view SET col=value,... WHERE condition

UPDATE table/view SET col=value,...

[FROM table/view...] WHERE condition

The classic “employees who make more than their bosses” query demonstrates the differences. Suppose the emp table contains empno, salary, and boss. In standard SQL, the statement to delete all employees who make more than their bosses is:

DELETE

FROM emp

WHERE empno IN

( SELECT t1.empno

FROM emp t1, emp t2

WHERE t1.boss = t2.empno

AND t1.salary > t2.salary )

To do the same thing in Transact-SQL:

DELETE emp

FROM emp t1, emp t2

WHERE t1.boss = t2.empno

AND t1.salary > t2.salary

Standard SQL uses a subselect to accomplish what the join does in Transact-SQL. If the requirement was to change the salary of all employees who make more than their bosses to a value that depends on the value of their boss’s salary, standard SQL can’t do it because columns retrieved in the subselect cannot be used in expressions in the SET clause. In Transact-SQL, the value of the boss’s salary can be used in the SET clause because it was obtained from a join, not from a subselect.

UPDATE emp

SET t1.salary = t2.salary

FROM emp t1, emp t2

WHERE t1.boss = t2.empno

AND t1.salary > t2.salary