CHUNK_BREAKTYPE

The CHUNK_BREAKTYPE enum describes the type of break that separates the current chunk from the previous chunk.

typedef enum tagCHUNK_BREAKTYPE
{
    CHUNK_NO_BREAK = 0,
    CHUNK_EOW      = 1,
    CHUNK_EOS      = 2,
    CHUNK_EOP      = 3,
    CHUNK_EOC      = 4
} CHUNK_BREAKTYPE;
 

Elements

CHUNK_NO_BREAK
No break will be placed between the current chunk and the previous chunk. The chunks will be glued together.
CHUNK_EOW
A word break will be placed between this chunk and the previous chunk that had the same attribute. Minimize the use of CHUNK_EOW: the choice of word breaks is language dependent, so determining word breaks is best left to the search engine.
CHUNK_EOS
A sentence break will be placed between this chunk and the previous chunk that had the same attribute.
CHUNK_EOP
A paragraph break will be placed between this chunk and the previous chunk that had the same attribute.
CHUNK_EOC
A chapter break will be placed between this chunk and the previous chunk that had the same attribute.
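
As a rough illustration of how a consumer might act on these values, the sketch below joins chunk text into a plain-text buffer for debugging. The helper names break_separator and append_chunk, and the choice of separator strings, are assumptions made for this example and are not part of the IFilter interface. The enum is redeclared locally only so that the sketch compiles without the Windows SDK headers.

```c
#include <string.h>

/* Local copy of the enum from this page, so the sketch builds without
   the Windows SDK headers. */
typedef enum tagCHUNK_BREAKTYPE
{
    CHUNK_NO_BREAK = 0,
    CHUNK_EOW      = 1,
    CHUNK_EOS      = 2,
    CHUNK_EOP      = 3,
    CHUNK_EOC      = 4
} CHUNK_BREAKTYPE;

/* Hypothetical helper: the separator to emit before a chunk's text.
   The separator strings are arbitrary choices for a debug dump. */
static const char *break_separator(CHUNK_BREAKTYPE bt)
{
    switch (bt) {
    case CHUNK_NO_BREAK: return "";       /* glue to the previous chunk */
    case CHUNK_EOW:      return " ";      /* word break */
    case CHUNK_EOS:      return "\n";     /* sentence break */
    case CHUNK_EOP:      return "\n\n";   /* paragraph break */
    case CHUNK_EOC:      return "\n\n\n"; /* chapter break */
    }
    return "";
}

/* Hypothetical helper: appends the separator for bt, then text, to the
   NUL-terminated string in buf (total capacity cap bytes, cap > 0).
   Returns 0 on success, or -1 if the result would not fit; buf is left
   unchanged in that case. */
static int append_chunk(char *buf, size_t cap, CHUNK_BREAKTYPE bt,
                        const char *text)
{
    const char *sep = break_separator(bt);
    size_t used = strlen(buf);
    size_t need = strlen(sep) + strlen(text);

    if (need >= cap - used)
        return -1;

    strcat(buf, sep);
    strcat(buf, text);
    return 0;
}
```

Appending "data" and "base" with CHUNK_NO_BREAK and then "file" with CHUNK_EOW leaves "database file" in the buffer: the first two chunks are glued together and the third is set off as a separate word.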

Remarks

A change in attributes implies a word, sentence, paragraph, or chapter break.
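
One possible reading of this rule, sketched below, is that a consumer never glues a chunk to a preceding chunk that carries a different attribute. The function effective_break and the use of a plain unsigned integer as an attribute identifier are assumptions made for this example (the real interface identifies attributes differently), and picking CHUNK_EOW as the implied break is likewise an assumption: the text above does not say which of the four break types applies.

```c
/* Local copy of the enum from this page, so the sketch builds without
   the Windows SDK headers. */
typedef enum tagCHUNK_BREAKTYPE
{
    CHUNK_NO_BREAK = 0,
    CHUNK_EOW      = 1,
    CHUNK_EOS      = 2,
    CHUNK_EOP      = 3,
    CHUNK_EOC      = 4
} CHUNK_BREAKTYPE;

/* Hypothetical helper: returns the break a consumer might apply between
   two consecutive chunks. prev_attr and cur_attr are illustrative
   attribute identifiers, not the type used by the real interface. */
static CHUNK_BREAKTYPE effective_break(unsigned prev_attr,
                                       unsigned cur_attr,
                                       CHUNK_BREAKTYPE declared)
{
    /* Assumption: when the attribute changes and no break was declared,
       fall back to the weakest break (a word break). */
    if (prev_attr != cur_attr && declared == CHUNK_NO_BREAK)
        return CHUNK_EOW;

    /* Otherwise honor the break type reported with the chunk. */
    return declared;
}
```

Under these assumptions, two chunks with the same attribute and CHUNK_NO_BREAK stay glued together, while the same declared value across an attribute change is treated as at least a word break. A stronger declared break, such as CHUNK_EOP, is passed through unchanged.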

See Also

IFilter::GetChunk, STAT_CHUNK