Rick Vicik and Greg Crafts
Created: March 20, 1992
ABSTRACT
This article contains common programming questions received from MicrosoftÒ SQL Server developers.
Question:
I see that SQL Server provides built-in functions such as object_id and object_name. I don’t understand how these functions can be useful.
Answer:
Built-in functions can greatly simplify writing queries against SQL Server catalog tables. For example, to find all objects that depend on a particular object, you can issue a query against the sysdepends table. But if you want object names rather than object IDs, you must join with the sysobjects table twice. Doing so introduces the complication of correlation names on the twice-named sysobjects and in the join conditions.
SELECT t3.name
FROM sysdepends t1, sysobjects t2, sysobjects t3
WHERE t2.name = "foo"
AND t2.id = t1.id
AND t3.id = t1.depid
Using the object_id built-in function to translate the input object name to an object ID and using object_name to translate the output object ID to an object name eliminates the joins and makes the query much easier to understand.
SELECT OBJECT_NAME(depid)
FROM sysdepends
WHERE id = OBJECT_ID("foo")
Question:
Is it okay to use the substring function to look at part of a column’s data and to use the results in a view?
Answer:
You can use convert, substring, concatenate, and similar functions in a view definition to convert data to the desired format. For example, if a column is used as an atomic data item in most cases but occasionally needs to be treated as a separate column for a particular application, you can define a view that substrings out the pieces (and gives them names). These view columns can then be sorted, selected on, and so on, but they cannot be updated.
CREATE VIEW decode
AS SELECT
piece1 = SUBSTRING(whole,1,2),
piece2 = SUBSTRING(whole,3,2),
piece3 = SUBSTRING(whole,5,2)
FROM table1
It is even possible to ORDER BY on the pieces:
SELECT *
FROM decode
ORDER BY piece3, piece1
You can construct a short lookup translation table in a view using substring. For example, suppose that a table contains a single-integer reply that ranges from 1 to 3. If you want to convert this to the fixed-length character strings YES, NO, and MAYBE, you can use the following query:
SELECT rtrim( SUBSTRING( "yes no maybe",(col1*5)-4,5 ) )
FROM table1
Note that the “yes no maybe” string is padded with blanks so that each choice contains exactly five characters. The substring function uses this value to determine the start position and the length of the substring.
Bitmask theta join
Suppose you want to categorize the items in the items table in multiple ways so that categories can be added dynamically without adding columns to tables. A single item may belong to many categories. One way to do this is to create a relationship table that consists of the keys of the items table and the categories table. Another approach is to assign category numbers in powers of two and to represent the membership of an item in a category by setting the corresponding bit in a category bitvector in the items table. You can use a trigger to assign the next highest power of two (or the application can do so).
ITEMS
item_name varchar(30) pkey
item_description varchar(100)
.
.
.
catvector varbinary(25) null
CATEGORIES
catcode varbinary(255) pkey
category_name
You can define a view that joins the items table to the categories table to translate the bit codes to their respective character descriptions.
CREATE VIEW categorize
AS SELECT item_name, item_description, ..., category_name
FROM items, categories
WHERE CONVERT( INT, catvector )
& CONVERT( INT, catcode )
This view returns all categories for each item. Transact-SQL requires the conversion to integer before the bitwise AND is allowed. This limits the number of categories to 32, although a varbinary column longer than 4 bytes can be substringed into multiple integers.
Expressions in the select list
Andrew Warden’s Test Number 1 for relational systems (see C.J. Date’s Relational Database Writings, 1985-1989, page 462) states:
Does the system allow you to phrase, in a single expression, the query ‘For each employee, show his or her name, salary, and average salary for his or her department’?
Standard SQL fails this test because it does not permit aggregates to appear in rows along with detail data (actually this can be accomplished through some extremely convoluted standard SQL). Transact-SQL passes the test in a simple way because it allows complete SELECT expressions in place of column names in the select list.
Suppose the emp table contains an employee’s name, salary, and department number. The following is not legal in standard SQL because name and salary are neither in the GROUP BY clause nor are they aggregates.
SELECT name, salary, AVG(salary)
FROM emp
GROUP BY dept
The Transact-SQL version works because the average salary for the department is computed in the expression for the third item in the select list. That expression is recomputed for each row in the emp table because a column from the outer table is referenced within the subselect WHERE clause. The average is computed for each row in emp, which is not the most efficient way to find the answer.
SELECT
name,
salary,
( SELECT AVG(salary)
FROM emp
WHERE dept = t1.dept )
FROM emp t1
Question:
I’m writing a double-entry accounting package using SQL Server. Are there any tricks to maintaining a transaction table that tracks all my debits and credits?
Answer:
Suppose an account-tracking system is designed as follows:
TRANS
id int pkey
account int
amount money
type char(2)
ACCT
account int pkey
.
.
.
The trans table contains a record of every transaction for each account. Each record contains the amount and the transaction type. Some transaction types are credits, and some are debits. The effective balance for each account can be computed as follows:
SELECT
( SELECT SUM(amount)
FROM trans
WHERE account = acct.account
AND type = 'CR' )
- ( SELECT SUM(amount)
FROM trans
WHERE account = acct.account
AND type = 'DB' )
FROM acct
The first expression sums up the credit transactions for a particular account, and the second sums up the debits. The sums are computed for each account because the outer table acct is referenced inside the subselects.
Although this solution is interesting, a simpler solution is more efficient. It requires a small table to translate transaction codes to either +1 or –1, depending on whether they are credit or debit transactions (the example above assumes only one type of each, but the solution can handle multiple transaction codes).
LOOKUP
code char(2)
sign tinyint
SELECT account, SUM(amount*sign)
FROM trans, lookup
WHERE type = code
GROUP BY account
Question:
Transact-SQL and standard SQL seem to differ when DELETE or UPDATE is used with a join. Can you clarify the differences?
Answer:
Transact-SQL allows a join to be specified on DELETE and UPDATE statements. This doesn’t delete or update rows in multiple tables. The join is used only to specify which rows to delete or update in a target table.
Standard SQL syntax vs. Transact-SQL syntax for DELETE:
DELETE FROM table/view WHERE condition
DELETE table/view [FROM table/view,...] WHERE condition
Standard SQL syntax vs. Transact-SQL syntax for UPDATE:
UPDATE table/view SET col=value,... WHERE condition
UPDATE table/view SET col=value,...
[FROM table/view...] WHERE condition
The classic “employees who make more than their bosses” query demonstrates the differences. Suppose the emp table contains empno, salary, and boss. In standard SQL, the statement to delete all employees who make more than their bosses is:
DELETE
FROM emp
WHERE empno IN
( SELECT t1.empno
FROM emp t1, emp t2
WHERE t1.boss = t2.empno
AND t1.salary > t2.salary )
To do the same thing in Transact-SQL:
DELETE emp
FROM emp t1, emp t2
WHERE t1.boss = t2.empno
AND t1.salary > t2.salary
Standard SQL uses a subselect to accomplish what the join does in Transact-SQL. If the requirement was to change the salary of all employees who make more than their bosses to a value that depends on the value of their boss’s salary, standard SQL can’t do it because columns retrieved in the subselect cannot be used in expressions in the SET clause. In Transact-SQL, the value of the boss’s salary can be used in the SET clause because it was obtained from a join, not from a subselect.
UPDATE emp
SET t1.salary = t2.salary
FROM emp t1, emp t2
WHERE t1.boss = t2.empno
AND t1.salary > t2.salary