[This is preliminary documentation and subject to change.]
The following operators describe search operations on columns containing text. These operators behave like any other Boolean operators. However, when they are used in conjunction with the DBOP_content_select operator, the resulting tables include two special columns describing the hit count and rank.
typedef struct tagDBCONTENT {
    DWORD  dwGenerateMethod; // exact, prefix, inflect
    LONG   lWeight;          // weight of node
    LCID   lcid;             // locale
    LPWSTR pwszPhrase;       // text
} DBCONTENT;
#define GENERATE_METHOD_EXACT ( 0 )
#define GENERATE_METHOD_PREFIX ( 1 )
#define GENERATE_METHOD_INFLECT ( 2 )
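As an illustration of how the structure above might be populated, the following minimal sketch fills a DBCONTENT for a prefix match. The fixed-width typedefs are stand-ins for the Windows types so that the sketch compiles outside Windows, and init_content is a hypothetical helper, not part of the documented interface. The weight value used in the usage note is arbitrary, because this text does not define a weight scale.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Stand-ins for the Windows types (assumption made for portability;
   on Windows these names come from the platform headers). */
typedef uint32_t DWORD;
typedef int32_t  LONG;
typedef uint32_t LCID;
typedef wchar_t *LPWSTR;

typedef struct tagDBCONTENT {
    DWORD  dwGenerateMethod; /* exact, prefix, inflect */
    LONG   lWeight;          /* weight of node */
    LCID   lcid;             /* locale */
    LPWSTR pwszPhrase;       /* text */
} DBCONTENT;

#define GENERATE_METHOD_EXACT   ( 0 )
#define GENERATE_METHOD_PREFIX  ( 1 )
#define GENERATE_METHOD_INFLECT ( 2 )

/* Hypothetical helper: fills a caller-owned DBCONTENT.
   The phrase buffer is borrowed, not copied, so it must outlive the node. */
void init_content(DBCONTENT *c, DWORD method, LONG weight, LCID lcid, LPWSTR phrase)
{
    assert(c != NULL && phrase != NULL);
    assert(method <= (DWORD)GENERATE_METHOD_INFLECT); /* one of the three methods above */
    assert(wcslen(phrase) > 0);                       /* an empty phrase matches nothing useful */

    c->dwGenerateMethod = method;
    c->lWeight          = weight;
    c->lcid             = lcid;
    c->pwszPhrase       = phrase;
}
```

A caller could, for example, pass GENERATE_METHOD_PREFIX, a weight of 500, the locale identifier 0x0409 (US English), and the phrase L"comput" to describe a prefix search.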
PSGUID_QUERY is the GUID of the property set for special content columns. The following are some of the more useful columns in this set.
#define PROPID_QUERY_RANKVECTOR (0x2)  // column used to return rank values of the content_vector_or operator
#define PROPID_QUERY_RANK       (0x3)  // column used to return the final rank of each row
#define PROPID_QUERY_HITCOUNT   (0x4)  // column used to return the number of content hits found in a row
#define PROPID_QUERY_ALL        (0x6)  // search in all text associated with a row
#define PROPID_STG_CONTENTS     (0x13) // search inside the contents of an object
typedef struct tagDBCONTENTPROXIMITY {
    DWORD dwProximityUnit;     // units
    ULONG ulProximityDistance; // how near is near?
    LONG  lWeight;             // node weight
} DBCONTENTPROXIMITY;
The following proximity units may be supported:
#define PROXIMITY_UNIT_WORD ( 0 )
#define PROXIMITY_UNIT_SENTENCE ( 1 )
#define PROXIMITY_UNIT_PARAGRAPH ( 2 )
#define PROXIMITY_UNIT_CHAPTER ( 3 )
The DBOP_content_proximity node takes two or more input subtrees, each of which should contain DBOP_content, DBOP_content_freetext, DBOP_content_proximity, DBOP_and, DBOP_or, or DBOP_not nodes. Its output is Boolean. It is logically similar to an AND: to produce a true value, all input subtrees must evaluate to true.
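The AND-like behavior and the role of ulProximityDistance can be pictured with a toy model. The sketch below is an assumption-laden illustration, not the provider's matching algorithm: it handles only PROXIMITY_UNIT_WORD, represents each of two terms by the word offsets at which it occurs in one row's text, and uses a hypothetical function name, terms_are_near.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the Windows types (assumption made for portability). */
typedef uint32_t DWORD;
typedef uint32_t ULONG;
typedef int32_t  LONG;

typedef struct tagDBCONTENTPROXIMITY {
    DWORD dwProximityUnit;     /* units */
    ULONG ulProximityDistance; /* how near is near? */
    LONG  lWeight;             /* node weight */
} DBCONTENTPROXIMITY;

#define PROXIMITY_UNIT_WORD ( 0 )

/* Toy model (hypothetical): a[] and b[] hold the word offsets of two terms
   in one row. Returns 1 when both terms occur and some pair of occurrences
   is at most ulProximityDistance words apart, otherwise 0. */
int terms_are_near(const ULONG *a, size_t na,
                   const ULONG *b, size_t nb,
                   const DBCONTENTPROXIMITY *prox)
{
    assert(prox != NULL);
    assert(prox->dwProximityUnit == PROXIMITY_UNIT_WORD); /* only words are modeled */

    for (size_t i = 0; i < na; i++) {
        for (size_t j = 0; j < nb; j++) {
            ULONG gap = a[i] > b[j] ? a[i] - b[j] : b[j] - a[i];
            if (gap <= prox->ulProximityDistance)
                return 1;
        }
    }
    /* Reached when either term is absent (AND-like) or no pair is close enough. */
    return 0;
}
```

In this model a row in which either term is missing never matches, which mirrors the requirement that all input subtrees evaluate to true.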
The following ranking methods may be supported by the vector node:
Minimum/AND (VECTOR_RANK_MIN)
Maximum/OR (VECTOR_RANK_MAX)
Inner product (VECTOR_RANK_INNER)
Dice coefficient (VECTOR_RANK_DICE)
Jaccard coefficient (VECTOR_RANK_JACCARD)
Cosine (VECTOR_RANK_COSINE)
There may be at most one vector node in the tree. The internal arguments of this node are a ranking method (DWORD) and a weight on the node (LONG), specified within the DBCONTENTVECTOR structure.
typedef struct tagDBCONTENTVECTOR {
    DWORD dwRankingMethod; // jaccard, cosine, etc.
    LONG  lWeight;         // weight of the vector node
} DBCONTENTVECTOR;
The vector node takes two or more children, of exactly the same types as those allowed for DBOP_content_proximity above. Its output is Boolean, and it acts as an n-ary OR of the input subtrees.
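This text names the ranking methods but does not give their formulas. The sketch below uses the textbook vector-similarity definitions, treating the child weights as one vector w and the child ranks as another vector r. It is an assumption about what the names mean, not a statement of how a provider computes rank. The enum is a placeholder for the VECTOR_RANK_* constants, whose numeric values are not given here, and combine_ranks and sketch_sqrt are hypothetical names. Ranks and weights are modeled as doubles, and weights are ignored for the minimum and maximum methods.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder for the VECTOR_RANK_* constants (numeric values are assumptions). */
enum SketchRankMethod {
    SKETCH_RANK_MIN,
    SKETCH_RANK_MAX,
    SKETCH_RANK_INNER,
    SKETCH_RANK_DICE,
    SKETCH_RANK_JACCARD,
    SKETCH_RANK_COSINE
};

/* Newton iteration, used so the sketch does not depend on the math library. */
static double sketch_sqrt(double x)
{
    double g = x > 1.0 ? x : 1.0;
    for (int i = 0; i < 60; i++)
        g = 0.5 * (g + x / g);
    return x > 0.0 ? g : 0.0;
}

/* Combines n child ranks r[] with child weights w[] into one rank value. */
double combine_ranks(enum SketchRankMethod method,
                     const double *w, const double *r, size_t n)
{
    assert(w != NULL && r != NULL);
    assert(n >= 2); /* the vector node has two or more children */

    double lo = r[0], hi = r[0];
    double dot = 0.0, ww = 0.0, rr = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (r[i] < lo) lo = r[i];
        if (r[i] > hi) hi = r[i];
        dot += w[i] * r[i]; /* sum of w_i * r_i */
        ww  += w[i] * w[i]; /* sum of w_i^2 */
        rr  += r[i] * r[i]; /* sum of r_i^2 */
    }

    switch (method) {
    case SKETCH_RANK_MIN:   return lo;  /* fuzzy AND */
    case SKETCH_RANK_MAX:   return hi;  /* fuzzy OR */
    case SKETCH_RANK_INNER: return dot;
    case SKETCH_RANK_DICE:
        return (ww + rr) > 0.0 ? 2.0 * dot / (ww + rr) : 0.0;
    case SKETCH_RANK_JACCARD: {
        double d = ww + rr - dot;
        return d > 0.0 ? dot / d : 0.0;
    }
    case SKETCH_RANK_COSINE: {
        double d = sketch_sqrt(ww * rr);
        return d > 0.0 ? dot / d : 0.0;
    }
    }
    return 0.0;
}
```

With weights (1, 2) and ranks (2, 1), these definitions give an inner product of 4, a Dice coefficient of 0.8, a Jaccard coefficient of 2/3, and a cosine of 0.8. The three normalized methods stay bounded, while the raw inner product grows with the number of children.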