Eliminating Duplicates: DISTINCT

The DISTINCT keyword eliminates duplicate rows from the results of a SELECT statement. If you don't specify DISTINCT, you see all rows, including duplicates. For example, if you select all the author IDs in the titleauthor table without DISTINCT, you see the following rows:

SELECT au_id
FROM titleauthor

au_id       
----------- 
172-32-1176 
213-46-8915 
213-46-8915 
238-95-7766 
267-41-2394 
267-41-2394 
274-80-9391 
409-56-7008 
427-17-2319 
472-27-2349 
486-29-1786 
486-29-1786 
648-92-1872 
672-71-3249 
712-45-1867 
722-51-5454 
724-80-9391 
724-80-9391 
756-30-7391 
807-91-6654 
846-92-7186 
899-46-2035 
899-46-2035 
998-72-3567 
998-72-3567 

(25 row(s) affected)

Notice that there are some duplicate listings. With DISTINCT, you can eliminate duplicates and see only the unique author IDs:

SELECT distinct au_id
FROM titleauthor

au_id       
----------- 
172-32-1176 
213-46-8915 
238-95-7766 
267-41-2394 
274-80-9391 
409-56-7008 
427-17-2319 
472-27-2349 
486-29-1786 
648-92-1872 
672-71-3249 
712-45-1867 
722-51-5454 
724-80-9391 
756-30-7391 
807-91-6654 
846-92-7186 
899-46-2035 
998-72-3567 

(19 row(s) affected)

Important The output for statements involving DISTINCT depends on the sort order chosen during installation. For information about the effects of different sort orders, see Microsoft SQL Server Setup.

For the DISTINCT keyword, null values are considered to be duplicates of each other. When DISTINCT is included in a SELECT statement, only one NULL is returned in the results, no matter how many null values are encountered.

Note For compatibility with other implementations of SQL, the ALL keyword can explicitly ask for all rows. However, there is no need to use ALL because displaying all rows is the default.