Fuzzy Queries

Index Server provides "fuzzy" query support, in which the system generates words similar to the ones the user enters. The system supports simple DOS-like wildcards and UNIX-like regular expression matching against textual properties. Content queries support simple prefix matching, so a query for "dog*" returns documents containing words such as "dogmatic" and "doghouse". The system also provides linguistic stemming support that matches the various inflected forms of query words. For example, the word "swim" expands to include "swimming", "swam", "swum", and other related words. This expansion is called inflection. Index Server performs this linguistic analysis on the words entered, as well as the reverse operation, called stemming, which reduces an inflected word to its base form.
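The matching behaviors described above can be illustrated with a minimal Python sketch. This is not Index Server's implementation or query syntax. The word list, the INFLECTIONS table, and all function names are hypothetical, chosen only to show how prefix matching, wildcard matching, regular expression matching, inflection, and stemming differ from one another. A real linguistic engine derives word forms from language rules rather than from a hand-written table.

```python
import fnmatch
import re

# Hypothetical inflection table (assumption for illustration only).
INFLECTIONS = {
    "swim": ["swim", "swims", "swimming", "swam", "swum"],
}

# Hypothetical set of words found in an index.
VOCABULARY = ["dog", "doghouse", "dogmatic", "cat", "swam", "swimming", "swum"]


def prefix_match(pattern, words):
    """Return the words that start with the text before a trailing '*'."""
    prefix = pattern.rstrip("*")
    return [w for w in words if w.startswith(prefix)]


def wildcard_match(pattern, words):
    """DOS-like wildcards: '*' matches any run of characters, '?' exactly one."""
    return [w for w in words if fnmatch.fnmatchcase(w, pattern)]


def regex_match(pattern, words):
    """Return the words that match the regular expression in full."""
    compiled = re.compile(pattern)
    return [w for w in words if compiled.fullmatch(w)]


def expand(word):
    """Inflection: expand a base word into its known forms."""
    return INFLECTIONS.get(word, [word])


def stem(word):
    """Stemming: map an inflected form back to its base word."""
    for base, forms in INFLECTIONS.items():
        if word in forms:
            return base
    return word
```

In this sketch, prefix_match("dog*", VOCABULARY) selects every word beginning with "dog", while expand("swim") and stem("swam") show the two directions of the linguistic analysis.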

The ranking mechanism is weighted so that the more highly inflected a word is relative to the form originally asked for, the lower it ranks in the result set. For example, "swims" ranks closer to "swim" than "swimmer" does, because "swim" and "swimmer" are less closely related grammatically. In other words, a simple inflection such as "swims" is considered grammatically closer to the query word than a derived form such as "swimmer". When resolving queries, the linguistic engine and ranking algorithm take these linguistic features into account. Fuzzy queries and stemming are available in all supported languages.
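The weighting idea can be sketched as follows. The distance values and the formula 1 / (1 + distance) are assumptions made purely for illustration. The source does not state how Index Server computes rank, only that forms further removed from the query word rank lower.

```python
# Hypothetical grammatical distances from the query word "swim".
# These numbers are assumptions, not values used by Index Server.
DISTANCE = {"swim": 0, "swims": 1, "swimmer": 2}


def rank_weight(distance):
    """Assumed weighting: an exact match scores 1.0, further forms score less."""
    return 1.0 / (1.0 + distance)


def rank_hits(hits, distances):
    """Order (document, matched_word) pairs from highest to lowest weight."""
    scored = [(doc, word, rank_weight(distances[word])) for doc, word in hits]
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Example input: three documents, each matched through a different word form.
hits = [("a.txt", "swimmer"), ("b.txt", "swim"), ("c.txt", "swims")]
ranked = rank_hits(hits, DISTANCE)
```

With these assumed distances, the document matched on "swim" is listed first, the one matched on "swims" second, and the one matched on "swimmer" last.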