You can summarize data by using the GROUP BY and HAVING clauses. GROUP BY organizes data into groups; HAVING sets conditions on which groups to include in the results. These clauses should be used together; HAVING used without GROUP BY can produce confusing results.
Aggregate functions return summary values, either for the whole table or for groups within the table. For this reason they are used with GROUP BY. Aggregate functions can appear in a select list or in a HAVING clause, but not in a WHERE clause.
You can use the following aggregate functions with GROUP BY (expression in this table is almost always a column name):
Aggregate function | Result |
---|---|
SUM([ALL | DISTINCT] expression) | Total of the values in the numeric expression, either all or distinct |
AVG([ALL | DISTINCT] expression) | Average of the values in the numeric expression, either all or distinct |
COUNT([ALL | DISTINCT] expression) | Number of values in the expression, either all or distinct |
COUNT(*) | Number of selected rows |
MAX(expression) | Highest value in the expression |
MIN(expression) | Lowest value in the expression |
SUM and AVG work only with numeric values. SUM, AVG, COUNT, MAX, and MIN ignore null values, but COUNT(*) does not.
Note You cannot use GROUP BY or HAVING on columns that have the text or image datatype.
The following query finds the total year-to-date sales of each publisher in the database:
SELECT pub_id, total = SUM(ytd_sales) FROM titles GROUP BY pub_id
pub_id |
total |
|
------ |
----- |
|
0736 |
28286 |
|
0877 |
44219 |
|
1389 |
24941 |
|
(3 row(s) affected) |
Because of the GROUP BY clause, only one row is returned for each publisher, and it contains the sum of all sales for that publisher.
Important The exact results of a GROUP BY clause depend on the sort order chosen during installation. For information about the effects of different sort orders, see Microsoft SQL Server Setup.
The HAVING clause sets conditions on the GROUP BY clause similarly to the way the WHERE clause interacts with the SELECT clause. The HAVING clause syntax is exactly like the WHERE clause syntax except that it can contain aggregate functions.
If, for example, you want to see the publishers who have had year-to-date sales greater than $40,000, use this query:
SELECT pud_id, total = SUM(ytd_sales) FROM titles GROUP BY pub_id HAVING SUM(ytd_sales) > 40000
If, for example, you want to make sure that there are at least six books involved in the calculations for each publisher, use HAVING COUNT(*) > 5 to eliminate the publishers that return totals for fewer than six books. The query looks like this:
SELECT pub_id, total = SUM(ytd_sales) FROM titles GROUP BY pub_id HAVING COUNT(*) > 5
pub_id |
total |
------ |
----- |
0877 |
44219 |
1389 |
24941 |
(2 row(s) affected)
With this statement, two rows are returned. New Moon Books (0736) is eliminated.
For a detailed description of GROUP BY, HAVING, and aggregate functions, see Displaying Totals with GROUP BY or COMPUTE.