INF: Case-Sensitive Query Results w/ Case-Insensitive Sort OrderLast reviewed: August 27, 1997Article ID: Q171299 |
The information in this article applies to:
SUMMARYA sort order is a set of rules that determines how SQL Server collates and presents data in response to database queries. The sort order determines the order in which data is presented in response to SQL Server statements involving GROUP BY, ORDER BY, and DISTINCT clauses. The sort order also determines how certain queries are resolved, such as queries involving WHERE and DISTINCT clauses. During installation, SQL Server allows you to select a character set and a sort order that the server will use. Several sort orders and character sets are available. The default character set is ANSI (ISO 8859-1), and the default sort order is dictionary order, case-insensitive. If a case-insensitive sort order is chosen, SQL Server considers character values that differ only in case as equal in value. That is, 'John' = 'john'. With a case-insensitive sort order, it may often be necessary toobtain query results with case-sensitivity considered. This article explains techniques to accomplish this.
MORE INFORMATIONTo examine the ramifications of a case-insensitive sort order on query results, consider the following scenario:
create table T1(col1 char(5) NOT NULL, col2 int NOT NULL) go insert T1 values('John',1) insert T1 values('John',1) insert T1 values('JOhn',1) insert T1 values('JOhn',1) insert T1 values('JoHn',1) insert T1 values('JoHn',1) insert T1 values('John',1) insert T1 values('JohN',1) insert T1 values('JohN',1) insert T1 values('john',1) insert T1 values('john',1) insert T1 values('john',1) goExecuting the following query on a server with a case-insensitive sort order returns the following results:
select DISTINCT(col1) from T1 go col1 ----- John (1 row(s) affected)However, executing the same query on a server with a case-sensitive sort order returns the following results:
col1 ----- John JOhn JoHn JohN john (5 row(s) affected)Regarding the usage of GROUP BY, ORDER BY, and WHERE clauses, observe the results of these queries that are run on a server with a case-insensitive sort order, and compare them with the modified queries below. Note that in all cases, all of the values in col1 are treated as one distinct value:
SELECT sum(col2)'col2 Group Totals' from T1 group by col1 go col2 Group Totals ----------------- 12 (1 row(s) affected) SELECT * from T1 order by col1 go col1 col2 ----- ----------- John 1 John 1 JOhn 1 JOhn 1 JoHn 1 JoHn 1 John 1 JohN 1 JohN 1 john 1 john 1 john 1 (12 row(s) affected) SELECT * from T1 where col1='john' go col1 col2 ----- ----------- John 1 John 1 JOhn 1 JOhn 1 JoHn 1 JoHn 1 John 1 JohN 1 JohN 1 john 1 john 1 john 1 (12 row(s) affected)The ability to obtain case-sensitive results with the DISTINCT function and the GROUP BY, ORDER BY, and WHERE clauses relies on the fact that the actual data is stored in binary form, with each character represented by a unique hexadecimal value. You can use the CONVERT function to convert the data from character format to a string of hexadecimal values. The results of the conversion can be operated on by the DISTINCT function and the GROUP BY, ORDER BY, and WHERE clauses without regard to case-sensitivity. Consider the results of the following query, which displays the original character data and its corresponding hexadecimal representation:
select col1,'=',CONVERT(binary(5),col1)'hex values' from T1 col1 hex values ----- - ------------ John = 0x4a6f686e20 John = 0x4a6f686e20 JOhn = 0x4a4f686e20 John = 0x4a4f686e20 JoHn = 0x4a6f486e20 JoHn = 0x4a6f486e20 John = 0x4a6f686e20 JohN = 0x4a6f684e20 JohN = 0x4a6f684e20 john = 0x6a6f686e20 john = 0x6a6f686e20 john = 0x6a6f686e20 (12 row(s) affected)Note that in the results, each unique value of col1 has a distinct hexadecimal representation. To get the distinct values in character format, run the next set of statements. It produces the same results as a query using the DISTINCT function on a case-sensitive server.
select DISTINCT(CONVERT(binary(5),col1))'col1' into #t1 from T1 go select CONVERT(char(5),col1)'distinct character results' from #t1 go (5 row(s) affected) ----- JOhn JoHn JohN John john (5 row(s) affected)For queries using GROUP BY and/or ORDER BY clauses, consider the following examples:
select DISTINCT(CONVERT(binary(5),col1))'col1',sum(col2)'col2' into #t1 from T1 group by CONVERT(binary(5),col1) go select CONVERT(char(5),col1)'distinct character results',col2'Group Totals' from #t1 go (5 row(s) affected) distinct character results Group Totals -------------------------- ------------ JOhn 2 JoHn 2 JohN 2 John 3 john 3 (5 row(s) affected) SELECT * from T1 order by CONVERT(binary(5),col1) go col1 col2 ----- ----------- JOhn 1 JOhn 1 JoHn 1 JoHn 1 JohN 1 JohN 1 John 1 John 1 John 1 john 1 john 1 john 1 (12 row(s) affected)For a query using the WHERE clause, use the CONVERT function to convert the constant and/or column values to comparable hexadecimal strings, as in the following example:
SELECT * from T1 where CONVERT(binary(5),col1)=CONVERT(binary(5),'john ') go col1 col2 ----- ----------- john 1 john 1 john 1 (3 row(s) affected)It is important to understand that the results returned from queries can be dramatically affected by the choice of sort order. Due consideration must be given to the importance of case-sensitivity when planning the installation and configuration of SQL Server. For further information on this topic, see "Character Sets And Sort Orders" in the SQL Server Books Online and the Administrator's Companion.
|
Additional query words: hex
© 1998 Microsoft Corporation. All rights reserved. Terms of Use. |