Eliminating Duplicates with DISTINCT

The DISTINCT keyword eliminates duplicate rows from the results of a SELECT statement. If DISTINCT is not specified, all rows are returned, including duplicates. For example, if you select all the author IDs in titleauthor without DISTINCT, the following rows are returned (with some duplicate listings):

USE pubs

SELECT au_id

FROM titleauthor

Here is the result set:

au_id

-----------

172-32-1176

213-46-8915

238-95-7766

267-41-2394

274-80-9391

409-56-7008

427-17-2319

472-27-2349

486-29-1786

648-92-1872

672-71-3249

712-45-1867

722-51-5454

724-80-9391

756-30-7391

807-91-6654

846-92-7186

899-46-2035

998-72-3567

(25 row(s) affected)

With DISTINCT, you can eliminate duplicates and see only the unique author IDs:

USE pubs

SELECT DISTINCT au_id

FROM titleauthor

Here is the result set:

au_id

-----------

172-32-1176

213-46-8915

238-95-7766

267-41-2394

274-80-9391

409-56-7008

427-17-2319

472-27-2349

486-29-1786

648-92-1872

672-71-3249

712-45-1867

722-51-5454

724-80-9391

756-30-7391

807-91-6654

846-92-7186

899-46-2035

998-72-3567

(19 row(s) affected)

Important The output for statements involving DISTINCT depends on the sort order chosen during installation and an ORDER BY clause. For information about the effects of different sort orders, see Sort Order.

For the DISTINCT keyword, null values are considered to be duplicates of each other. When DISTINCT is included in a SELECT statement, only one NULL is returned in the results, regardless of how many null values are encountered.

Note For compatibility with the SQL-92 standard and other implementations of Microsoft® SQL Server™, the ALL keyword can explicitly ask for all rows. However, there is no need to specify ALL because it is the default.