Grouping Rows with GROUP BY

When used without a GROUP BY clause, aggregate functions report only one aggregate value for a SELECT statement. The GROUP BY clause is used to produce aggregate values for each row in the result set.

This example returns the number of units sold for each product in category 2:

SELECT OrdD.ProductID AS ProdID,

       SUM(OrdD.Quantity) AS AmountSold

FROM [Order Details] AS OrdD JOIN Products as Prd

     ON OrdD.ProductID = Prd.ProductID

     AND Prd.CategoryID = 2

GROUP BY OrdD.ProductID

  

Here is the result set:

ProdID      AmountSold 

----------- -----------

3           328        

4           453        

5           298        

6           301        

8           372        

15          122        

44          601        

61          603        

63          445        

65          745        

66          239        

77          791        

  

(12 row(s) affected)

  

The GROUP BY keywords are followed by a list of columns, known as the grouping columns. The GROUP BY clause restricts the rows of the result set, there is only one row for each distinct value in the grouping column or columns. Each result set row contains summary data related to the specific value in its grouping columns.

There are restrictions on the items that can be specified in the select list when a SELECT statement contains a GROUP BY. The only items allowed in the select list are:

For example, TableX contains:

ColumnA ColumnB ColumnC
------- ------- -------
1 abc 5
1 def 4
1 ghi 9
2 jkl 8
2 mno 3

If ColumnA is used as the grouping column, there will be two rows in the result set, one summarizing the information for the value 1, and the other summarizing the information for value 2.

When ColumnA is the grouping column, the only way ColumnB or ColumnC can be referenced is if they are parameters in an aggregate function that can return a single value for each value in ColumnA. It is legal for the select list to include expressions such as MAX(ColumnB), SUM(ColumnC), or AVG(ColumnC):

SELECT ColumnA,

       MAX(ColumnB) AS MaxB,

       SUM(ColumnC) AS SumC

FROM TableX

GROUP BY ColumnA

  

This select returns two rows, one for each unique value in ColumnA:

ColumnA     MaxB SumC       

----------- ---- -----------

1           ghi  18         

2           mno  11         

  

(2 row(s) affected)

It is not legal, however, to have just the expression ColumnB in the select list:

SELECT ColumnA,

       ColumnB,

       SUM(ColumnC) AS SumC

FROM TableX

GROUP BY ColumnA

  

Because the GROUP BY can return only one row with a value of 1 in ColumnA, there is no way to return the three values of ColumnB (abc, def, and ghi) associated with the value 1 in ColumnA.


Note You cannot use GROUP BY or HAVING on ntext, text, image, or bit columns unless they are in a function that returns a value having another data type, such as SUBSTRING or CAST.


See Also
CAST and CONVERT SUBSTRING
SELECT  

  


(c) 1988-98 Microsoft Corporation. All Rights Reserved.