The Types of Information Index Server Stores

Index Server doesn't just keep a list of file names (if that were all we wanted, we could open the Windows Find dialog from the Start menu and run the search there). It also stores a multitude of document details. These include many of the stored document's properties, the time and date it was created and last updated, its size, its file attribute status, etc. Plus, and here's the clever bit, it keeps an abstract of the contents.

This abstract is a selection of the text in the document, irrespective of what type of document it actually is. Index Server also contains natural language processing systems, dedicated to your own particular language, so that it really "understands" (as far as computers can understand) the file contents.

This means that you can search for a word, or group of words, as well as specifying the type of the file and other properties such as the author. The language engine can match words literally, so that catch* will include catcher and catching, and also grammatically, where catch** will include catching and caught. No doubt you're aware that in a typical file search (from the Windows command line, for example) an asterisk is a wildcard character for which any combination of characters will be substituted to perform the search. In the second example provided here, the double asterisk is also a wildcard character, but one that encompasses all grammatical variations on the word catch.
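To make the difference between the two wildcards concrete, here is a minimal Python sketch. It is an illustration only, not how Index Server is implemented: the INFLECTIONS table and the matches function are hypothetical names invented for this example, and a real language engine works out the grammatical forms itself rather than reading them from a hand-written table.

```python
# Hypothetical inflection table, assumed for this example only.
# A real language engine derives these forms for itself.
INFLECTIONS = {
    "catch": {"catch", "catches", "catching", "caught"},
}


def matches(pattern, word):
    """Return True if word satisfies a 'stem*' or 'stem**' style pattern."""
    word = word.lower()
    if pattern.endswith("**"):
        stem = pattern[:-2].lower()
        # Grammatical match: any known inflected form of the stem word.
        return word in INFLECTIONS.get(stem, {stem})
    if pattern.endswith("*"):
        # Literal match: the word only has to start with the given characters.
        return word.startswith(pattern[:-1].lower())
    # No wildcard: the word must match exactly.
    return word == pattern.lower()


words = ["catch", "catcher", "catching", "caught"]
literal = [w for w in words if matches("catch*", w)]
grammatical = [w for w in words if matches("catch**", w)]
```

With this sample list, the single asterisk picks up catcher and catching because they begin with the same letters, while the double asterisk picks up caught even though it shares only part of its spelling with catch.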

Index Server also includes a noise list file. This is an editable file that prevents words such as "and", "or", "the", etc., from being included in the index. Because these common words are left out, searches match only on the words that carry meaning, which makes Index Server a very powerful and precise way of finding information stored anywhere on your system.
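The effect of a noise list can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: NOISE_WORDS is a tiny sample set and index_terms is a hypothetical helper, whereas the real noise list is the editable file that Index Server reads.

```python
import re

# Assumed sample of noise words; the real list lives in the editable noise file.
NOISE_WORDS = {"and", "or", "the"}


def index_terms(text, noise_words=NOISE_WORDS):
    """Split text into lowercase words and drop any that are on the noise list."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [w for w in words if w not in noise_words]


terms = index_terms("The catcher and the pitcher")
```

Here only catcher and pitcher would reach the index; the two occurrences of "the" and the single "and" are discarded before indexing.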

© 1998 by Wrox Press. All rights reserved.