Using the IFilter Interface

[This is preliminary documentation and subject to change.]

The IFilter interface is used to extract text from objects for placement in the Microsoft® Index Server content index.

The primary purpose of the IFilter interface is to extract text, without formatting, from documents. IFilter is the foundation upon which higher-level operations, such as document indexing or application-independent viewers, can be built.

Although clients of IFilter may use the interface in any way they see fit, it was designed to meet the specific needs of full-text search engines, such as Index Server. An implementation of the IFilter interface scans objects for plain text and properties (attributes). The search engine must break the results of IFilter::GetText into words, normalize them, and store the results in an index. The search engine may use the locale identifier specified with a text chunk to perform proper language-specific word breaking and normalization.
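
The division of labor described above can be illustrated with a minimal, portable sketch. It does not call the real IFilter interface and is not how Index Server is implemented. TextChunk, BreakWords, Normalize, Index, and IndexChunk are hypothetical names introduced only for this example, and the word breaker and normalizer are deliberately simplistic ASCII-only assumptions.

```cpp
#include <cctype>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one unit of filter output: the text returned for
// a chunk plus the locale identifier reported with it. This is NOT the real
// chunk structure used by IFilter.
struct TextChunk {
    std::uint32_t locale;  // locale identifier supplied with the chunk
    std::string text;      // plain text, as the search engine would receive it
};

// Assumed word breaker: splits on every non-alphanumeric ASCII character.
// A real engine would choose a language-specific breaker from chunk.locale;
// this sketch carries the locale along but does not act on it.
std::vector<std::string> BreakWords(const TextChunk& chunk) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : chunk.text) {
        if (std::isalnum(c)) {
            current.push_back(static_cast<char>(c));
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        words.push_back(current);
    }
    return words;
}

// Assumed normalization: ASCII lowercasing only.
std::string Normalize(std::string word) {
    for (char& c : word) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return word;
}

// Toy inverted index: normalized word -> ids of the documents containing it.
using Index = std::map<std::string, std::set<int>>;

// Breaks one chunk into words, normalizes each word, and records the document.
void IndexChunk(Index& index, int docId, const TextChunk& chunk) {
    for (const std::string& word : BreakWords(chunk)) {
        index[Normalize(word)].insert(docId);
    }
}
```

In this sketch, a caller would invoke IndexChunk once for each piece of text obtained from a filter, passing the locale identifier reported with that chunk (for example, 0x0409 for U.S. English). Keeping word breaking and normalization as separate functions mirrors the point made above: both steps belong to the search engine, not to the filter.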