How Index Server Works

Before we look at how to integrate Index Server into our Active Server Pages sites, or at how we can exploit its new capabilities, let's look at what Index Server is and how it actually works. It's a surprisingly powerful search engine package that can handle complex query and search tasks with little effort.

Once installed, Index Server is, to a large extent, self-maintaining. During periods of little or no system activity, it trawls the folders in the selected areas of your system, building and maintaining catalogs of information about each document stored there. When you perform a search in a catalog, Index Server uses the contents of the index to find matching documents and, optionally, to build an HTML results page and return it to the client. The cataloging and indexing system is practically transparent as far as the user is concerned. It is also very fast, because each search reads only the index file rather than the documents themselves.
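To see why searching an index is faster than scanning every document, consider the following minimal Python sketch of an inverted index. It is an illustration of the general technique only, not Index Server's actual implementation or file format; the function names and the sample documents are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document paths that contain it."""
    index = defaultdict(set)
    for path, text in documents.items():
        for word in tokenize(text):
            index[word].add(path)
    return index


def search(index, query):
    """Return the sorted paths of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    # Only the index is consulted here; the documents are never re-read.
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Hypothetical documents, keyed by path.
documents = {
    "/acme/products.htm": "Acme widgets and gadgets",
    "/acme/support.htm": "Support for Acme widgets",
    "/other/home.htm": "Welcome to another site",
}
index = build_index(documents)
results = search(index, "acme widgets")
```

The indexing pass does the expensive work once, in the background; each search is then a handful of set lookups, however large the documents are.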

Searches that you perform in a catalog are expressed in terms of the Index Server query language, which, like the familiar SQL, combines field names and comparison and logical operators to identify the documents you want to retrieve. For more information on this language, see "The Index Server Query Language" later in this chapter.

Using Catalogs

A catalog is a collection of directories that make up a discrete unit within an Index Server search. You can think of a catalog as you would a database: it is a collection of virtual tables (directories) that contain data indexed by Index Server. When you execute queries against Index Server, you specify the catalog that you want to query.

If you do not specify a catalog, the default catalog is used. This catalog contains a number of default directories, including (most importantly) the \Inetpub\wwwroot directory, which you typically use to store your server's Web sites.

To add or delete catalogs, to add directories to an existing catalog, or to perform routine Index Server maintenance (such as directory re-scans), use the Index Server Manager. This is a snap-in for the Microsoft Management Console application, and is installed with Index Server.

Note that the index file can be up to 40% of the total size of the documents on your system, if you have full indexing features in use.
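For capacity planning, the 40% figure quoted above translates into a simple upper-bound calculation. The Python helper below is a hypothetical illustration; the function name and the 2 GB document total are assumptions chosen for the example.

```python
def max_index_size(total_document_bytes, percent=40):
    """Upper-bound estimate of the index size, in bytes.

    Uses integer arithmetic and the 'up to 40%' figure quoted in the text.
    """
    return total_document_bytes * percent // 100


# Hypothetical server holding 2 GB of documents.
corpus_bytes = 2 * 1024**3
estimate = max_index_size(corpus_bytes)
```

Here the estimate comes to a little over 800 MB of disk space to set aside for the index in the worst case.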

Using Scope

If a catalog is analogous to a database, the concept of scope encompasses one or more tables in that database. As mentioned in the section on Catalogs, a catalog can consist of any number of directories. However, it is often not in your interest to search every single directory in a catalog. This is where scope comes into play. The scope of a search defines the sub-directories within a catalog that are included in the search.

Consider that most web hosting facilities host scores of sites, commercial and personal. Chances are, if your server hosts the Acme Service Corporation, and if the virtual directory for Acme appears as a sub-directory of \Inetpub\wwwroot, Acme's directory is part of the default catalog of directories.

However, if Acme sports a search form in their pages, they'll only want to return information about Acme to users, and will specifically want to exclude information about any of the other sites you happen to be hosting.

Each of the search types that the Index Server supports provides a means by which you can limit the query's scope to a specific directory or directories. Within an ASP search, for example, you use the Utility object's AddScopeToQuery method to specify the scope. With an ADO search, you specify the query's scope within the SQL text of the command that you execute.

Within the directory to which you limit your search, you can also limit the traversal level, which defines the depth to which a search extends. If the scope of a search specifies the shallow traversal of a directory, only the files in that directory are searched. If deep traversal is specified, all the sub-directories of a given directory are included in the search.
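The difference between shallow and deep traversal can be modeled in a few lines of Python. This is a conceptual sketch only: it does not call Index Server, and the in_scope function, the virtual paths, and the second site name are hypothetical.

```python
from pathlib import PurePosixPath


def in_scope(doc_path, scope_dir, deep=True):
    """Report whether a document falls within a search scope.

    Shallow traversal matches only files directly inside scope_dir;
    deep traversal also matches files in any of its sub-directories.
    """
    doc = PurePosixPath(doc_path)
    scope = PurePosixPath(scope_dir)
    if deep:
        return scope in doc.parents
    return doc.parent == scope


# Hypothetical catalog containing two hosted sites.
catalog = [
    "/acme/index.htm",
    "/acme/products/widgets.htm",
    "/bobs-bikes/index.htm",
]

shallow_hits = [p for p in catalog if in_scope(p, "/acme", deep=False)]
deep_hits = [p for p in catalog if in_scope(p, "/acme", deep=True)]
```

With the scope set to /acme, the shallow search sees only /acme/index.htm, the deep search also picks up the page under /acme/products, and neither returns anything from the other hosted site. Comparing path components rather than raw string prefixes keeps a directory such as /acme-old from being mistaken for part of /acme.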

© 1998 by Wrox Press. All rights reserved.