ID Number: Q67754
1.10 1.11 4.20
OS/2
Summary:
This article describes how the SQL SOUNDEX algorithm works and gives
examples of its abilities and limitations in distinguishing between
words of similar length or phonetic structure.
The SOUNDEX function returns a four-character code that describes the
phonetic characteristics of the word passed as its argument. The first
character of the code is the first letter of the word, and the
remaining three characters are single digits that describe the
phonetic "value" of the consonant sounds that follow it; vowels are
not coded, and unused positions are filled with zeros. For example,
SOUNDEX might return the value "A123", which is read as follows:
"A" is the first letter of the word.
"1" is the phonetic value of the first coded consonant sound.
"2" is the phonetic value of the second coded consonant sound.
"3" is the phonetic value of the third coded consonant sound.
Unfortunately, the SOUNDEX algorithm has some serious limitations,
which stem from the fact that it can record only nine possible
phoneme patterns. The SOUNDEX algorithm was not designed
to distinguish between such soundalike words as "string" and "sing"
(and it doesn't), but it also cannot register differences between
vowel sounds. For example, the value "B300" is returned for all of the
following words: "bit," "bite," "bat," "bait," "boat," "beet."
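The vowel limitation can be reproduced outside the server with a small
sketch. The Python function below is not the SQL implementation; it
assumes the classic Soundex letter groups (which this article does not
list) and the usual rule that repeated codes are collapsed. The names
soundex, _GROUPS, and _CODES are illustrative only, and the server's
own function may differ in details.

```python
# Letter groups of the classic Soundex scheme. This table is an
# assumption: the article does not list the groups, and the SQL
# implementation may use different ones.
_GROUPS = {
    "BFPV": "1",
    "CGJKQSXZ": "2",
    "DT": "3",
    "L": "4",
    "MN": "5",
    "R": "6",
}
_CODES = {letter: digit for letters, digit in _GROUPS.items() for letter in letters}


def soundex(word: str) -> str:
    """Return a four-character classic Soundex code for an ASCII word."""
    letters = [ch for ch in word.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    code = letters[0]                      # first character: the initial letter
    previous = _CODES.get(letters[0], "")  # digit of the initial letter, if any
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")         # vowels, H, W and Y have no digit
        if digit and digit != previous:
            code += digit
        # H and W do not break a run of identical digits; vowels do.
        if ch not in "HW":
            previous = digit
        if len(code) == 4:
            break
    return code.ljust(4, "0")              # pad unused positions with zeros


words = ["bit", "bite", "bat", "bait", "boat", "beet"]
print({w: soundex(w) for w in words})      # every value is "B300"
```

Because the vowels contribute no digit, each of the six words reduces
to "B" followed by the code for "t" and two padding zeros.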
Additional reference words: 1.10 1.11 4.20