ACC1x: "Unique Values Only" Select Query Shows Duplicates
ID: Q93692
|
The information in this article applies to:
SYMPTOMS
The result of a query includes duplicate values even though you
selected the Unique Values Only check box (SELECT DISTINCT) in the
Query Properties dialog box.
CAUSE
Duplicate values occur when both of the following conditions are met:
- The query input data set contains more than approximately 64K of data
and the duplicate values are widely scattered in the input data.
For example, a SELECT DISTINCT query on a text column with an average
length of 20 characters will not display duplicate values unless the
query input data set included more than 3,300 values (65536 / 20 =
3276.8. Note that if the query is based on a table that has 10,000
rows, but the specified criteria retrieves only 20 rows, it is the
set of 20 rows that determines the size of the query input data set.
In other words, it is the WHERE clause that determines the size of
the query input data set.
- The values that have duplicates appear close (maybe in the first
dozen values) to what would normally be the beginning of the query
output. By default, a SELECT DISTINCT query displays data sorted
implicitly. For example, the following displays the result sorted by
Order ID:
SELECT DISTINCT [Order Details].[Order ID]
FROM [Order Details];
However, in a SELECT DISTINCT query that uses more than one column
and sorts by a column other than the first one, the duplicates could
appear anywhere in the output. To see if there are duplicates, you
need to remove the sort.
NOTE: This problem also appears in a Crosstab query as duplicated
columns with names like Field1. Use Fixed Column heading to prevent
the problem.
RESOLUTION
In addition to SELECT DISTINCT queries, you can use GROUP BYs in a
totals query to eliminate duplicates in the output. Both SELECT
DISTINCT queries and GROUP BYs in a totals query are limited to
10 columns. Usually, GROUP BY is faster when the resulting data set
is much smaller than the input data set, and SELECT DISTINCT is
faster when there are only a few duplicates.
To ensure the correct results, use GROUP BYs in a totals query instead
of using the Unique Values Only query property. For example, the
following two select queries give identical results:
SELECT DISTINCT [Order Details].[Order ID]
FROM [Order Details];
SELECT DISTINCTROW [Order Details].[Order ID]
FROM [Order Details]
GROUP BY [Order Details].[Order ID];
STATUS
Microsoft has confirmed this to be a problem in Microsoft Access version
1.0. This problem was corrected in Microsoft Access version 1.1.
REFERENCES
Microsoft Access "User's Guide," version 1.0, Chapter 6, "Designing Select
Queries," pages 128-130 and 142-148
Keywords : kbusage QryOthr
Version : 1.0
Platform : WINDOWS
Issue type : kbbug