INF: How the SQL SOUNDEX Algorithm Works
Last reviewed: April 25, 1997
Article ID: Q67754
The information in this article applies to:
- Microsoft SQL Server version 4.2 for OS/2
SUMMARY
This article describes how the SQL SOUNDEX algorithm works and gives examples of its abilities and limitations in distinguishing between words of similar length or phonetic structure.
MORE INFORMATION
The SOUNDEX function returns a four-character code that describes the phonetic characteristics of the word passed as its argument. The first character of this code is the first letter of the word, and the remaining three characters are single digits that describe the phonetic "value" of the first three syllables of the word. For example, the SOUNDEX function might return the value "A123". This is translated as follows:
- "A" is the first letter of the word.
- "1" is the phonetic value of the first syllable.
- "2" is the phonetic value of the second syllable.
- "3" is the phonetic value of the third syllable.

Unfortunately, the SOUNDEX algorithm suffers from some serious limitations, imposed by the fact that it can record only nine possible phoneme patterns. The SOUNDEX algorithm was not designed to distinguish between such sound-alike words as "string" and "sing" (and it doesn't), but it also cannot register differences between vowel sounds. For example, the value "B300" is returned for all of the following words: "bit," "bite," "bat," "bait," "boat," and "beet."
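
The vowel limitation can be illustrated with a minimal Python sketch of the classic, published Soundex rules. This is not the SQL Server 4.2 implementation, and whether that product follows exactly these rules is an assumption. The function name soundex and the CODES table are hypothetical names chosen for this illustration. Classic Soundex assigns digits to consonant letters and skips vowels, which is one way to see why words that differ only in their vowels collapse to the same code.

```python
# Hypothetical sketch of classic Soundex; NOT the SQL Server 4.2 source.
# The digit groups below are the standard published consonant classes.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return a four-character code: first letter plus three digits."""
    letters = [ch for ch in word.upper() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")  # vowels, H, W, and Y have no digit
        if digit and digit != previous:
            result += digit
        # H and W do not separate repeated codes; vowels and Y do.
        if ch not in "HW":
            previous = digit
        if len(result) == 4:
            break
    # Pad short codes with zeros, e.g. "B3" becomes "B300".
    return result.ljust(4, "0")
```

Under these rules, soundex("bit") and soundex("boat") both return "B300", matching the example above: only the "t" contributes a digit, and the two remaining positions are zero padding.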
© 1998 Microsoft Corporation. All rights reserved.