Basic Indexing Features
Index Server provides the following basic indexing features:
- Indexes full text in Web pages and also indexes HTML tags as properties. This allows Index Server to make use of the structure of the document, rather than just the raw content.
- Indexes full text in formatted data such as Microsoft Excel or Word documents. Index Server also has the ability to look inside private data formats via the open standard IFilter interface. Unlike some other systems that are limited to text files, Index Server can read a variety of document formats.
- Incrementally refreshes indexes. This means that if a user changes only one document, only that document is re-indexed, conserving system resources.
- Controls indexing on an IIS virtual directory basis. This gives users control over how much of their document collection is indexed. For example, users could index and make available only their public data. It also allows users to block indexing of a directory that contains a large quantity of dynamic information that would not be useful for a search.
- Indexes file and document property values. This allows users to query for author, date, and other properties.
- Indexes text in seven supported languages. Index Server handles multi-lingual documents and can change languages on the fly as it is indexing a document. For example, the system can switch from English to German to French, and back to English again.
- Automatic index updates. The system indexes documents in the background as they are modified, so there is no need to update the index manually; the system tracks changes and indexes them as they are made. This keeps the index current with the documents.
- Performance monitoring and server administration. This allows administrators to see the number of indexers, the status of the indexing in progress, the number of documents indexed, and other operational criteria.
- Zero-Maintenance. By design, Index Server offers 7 x 24 reliability, with automatic corruption detection and recovery.
- Low overhead. A major goal has been to keep the impact of Index Server on the system as low as possible.
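The idea of indexing HTML tags as properties, alongside the full text, can be illustrated with a small sketch. This is not Index Server code: it uses Python's standard `html.parser` module, and the class name `PropertyExtractor` and the function `extract` are hypothetical names chosen for illustration. It assumes that the document title and `<meta name=... content=...>` tags are the properties of interest.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Illustrative only: separates tag-based properties from body text."""

    def __init__(self):
        super().__init__()
        self.properties = {}    # e.g. {"title": ..., "author": ...}
        self.text = []          # plain text found outside <title>
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            a = dict(attrs)
            # Treat <meta name="X" content="Y"> as property X with value Y.
            if a.get("name") and a.get("content") is not None:
                self.properties[a["name"].lower()] = a["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.properties["title"] = self.properties.get("title", "") + data
        elif data.strip():
            self.text.append(data.strip())


def extract(html):
    """Return (properties, full_text) for one HTML document."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.properties, " ".join(parser.text)
```

With properties stored separately from the text, a query such as "author is Kim" can be answered from the property values rather than by scanning the raw content.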
Indexing is controlled on a per-virtual-root basis. An index is built over a set of virtual roots and their child directories. An index can be refreshed incrementally, that is, by indexing only the files that have changed, so Index Server doesn't need to re-index all documents to pick up just a few changes. Index Server also has a zero-maintenance design: the system runs as autonomously as possible, maintaining its own statistics on the state of the indexes and optimizing them when necessary. This provides peace of mind for server administrators, who can install Index Server with the confidence that the machine can handle routine operations automatically.
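The combination of per-root scope and incremental refresh described above can be sketched as follows. This is a minimal illustration, not Index Server's implementation: the `Index` class, its fields, and the use of a modification stamp per path are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    """Toy index: remembers the modification stamp last seen per document."""
    roots: tuple                                  # indexed virtual roots (hypothetical)
    seen: dict = field(default_factory=dict)      # path -> modification stamp

    def in_scope(self, path):
        # A document is in scope if it lies under one of the indexed roots.
        return any(
            path == r or path.startswith(r.rstrip("/") + "/")
            for r in self.roots
        )

    def refresh(self, documents):
        """Re-index only new or changed in-scope documents.

        `documents` maps path -> modification stamp. Returns the paths
        that were (re-)indexed during this refresh.
        """
        changed = []
        for path, stamp in sorted(documents.items()):
            if not self.in_scope(path):
                continue                  # outside the indexed roots
            if self.seen.get(path) != stamp:
                self.seen[path] = stamp   # stand-in for real indexing work
                changed.append(path)
        # Forget documents that no longer exist.
        for path in list(self.seen):
            if path not in documents:
                del self.seen[path]
        return changed
```

A first refresh touches every in-scope document; later refreshes touch only those whose stamp differs, and documents outside the indexed roots are never touched.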
Index Server is also designed to support indexing and searching for documents in an international information environment. Full support for seven languages is built in, and additional languages can be added through an open specification.
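One way to picture language switching within a document, and the addition of new languages, is a registry of per-language word breakers. The sketch below is an assumption-laden illustration, not the open specification mentioned above: the names `breakers`, `register_language`, and `index_chunks` are hypothetical, and the built-in breaker is a simple regular expression rather than real linguistic analysis.

```python
import re


def simple_breaker(text):
    """Placeholder word breaker: lowercase runs of word characters."""
    return [w.lower() for w in re.findall(r"\w+", text)]


# Hypothetical registry mapping a language code to its word breaker.
breakers = {"en": simple_breaker, "de": simple_breaker, "fr": simple_breaker}


def register_language(code, breaker):
    """Add support for another language by supplying its word breaker."""
    breakers[code] = breaker


def index_chunks(chunks):
    """Index a document given as (language, text) chunks.

    The word breaker is switched whenever the chunk language changes.
    Returns a list of (language, word) pairs in document order.
    """
    words = []
    for lang, text in chunks:
        if lang not in breakers:
            raise KeyError(f"no word breaker registered for {lang!r}")
        words.extend((lang, w) for w in breakers[lang](text))
    return words
```

For example, `index_chunks([("en", "Hello world"), ("de", "Guten Tag")])` processes the English chunk with the English breaker and then switches to the German one for the second chunk.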