2.2.1.14 CProbRestriction
A CProbRestriction structure contains parameters for probabilistic ranking.
| Bits 0-31 |
|---|
| _Property (variable) |
| ... |
| _flK1 |
| _flK2 |
| _flK3 |
| _flB |
| _cFeedbackDoc |
| _ProbQueryPid |
_Property (variable): A CFullPropSpec structure that identifies either the property to use for probabilistic ranking or the full property specification of a column group (which corresponds to the _groupPid field in the CColumnGroup structure). In the latter case, the CFullPropSpec MUST have the _guidPropSet field set to zero, the ulKind field set to PRSPEC_LPWSTR, and the Property name field set to the name of the referenced group property.
_flK1 (4 bytes): An IEEE 32-bit floating point number [IEEE754] that indicates parameter k1 in formula [1], specified below.
_flK2 (4 bytes): An IEEE 32-bit floating point number.
Note: MUST be set to 0.0.
_flK3 (4 bytes): An IEEE 32-bit floating point number that indicates parameter k3 in formula [1].
_flB (4 bytes): An IEEE 32-bit floating point number that indicates parameter b in formula [1] below.
_cFeedbackDoc (4 bytes): A 32-bit unsigned integer specifying the count of relevant documents.
_ProbQueryPid (4 bytes): A 32-bit unsigned integer.
Note: Reserved. MUST be set to 0x00000000.
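As an illustration of the fixed-size fields described above, the following minimal sketch packs and unpacks the five 4-byte fields plus the reserved field that follow _Property. It assumes little-endian byte order and no padding between these fields. The variable-length _Property (CFullPropSpec) is deliberately left out, and the helper names pack_prob_tail and unpack_prob_tail are hypothetical. The unpack-side check on the must-be-zero fields is also an assumption: the text above states only what the fields MUST be set to, not how a receiver should react.

```python
import struct

# Assumed wire layout of the fields after _Property (little-endian, unpadded):
# _flK1, _flK2, _flK3, _flB as 32-bit floats, then
# _cFeedbackDoc, _ProbQueryPid as 32-bit unsigned integers.
_TAIL = struct.Struct("<ffffII")


def pack_prob_tail(k1, k3, b, feedback_docs):
    """Pack the fixed-size fields; _flK2 and _ProbQueryPid are forced to zero."""
    return _TAIL.pack(k1, 0.0, k3, b, feedback_docs, 0)


def unpack_prob_tail(data):
    """Unpack the fixed-size fields and reject non-zero must-be-zero fields."""
    k1, k2, k3, b, feedback_docs, prob_query_pid = _TAIL.unpack(data)
    if k2 != 0.0:
        raise ValueError("_flK2 must be 0.0")
    if prob_query_pid != 0:
        raise ValueError("_ProbQueryPid must be 0x00000000")
    return {"k1": k1, "k3": k3, "b": b, "feedback_docs": feedback_docs}


# Example values chosen to be exactly representable as 32-bit floats.
blob = pack_prob_tail(1.5, 8.0, 0.75, 3)
fields = unpack_prob_tail(blob)
```

The six fields occupy 24 bytes in total. A complete encoder would still need to emit the CFullPropSpec for _Property ahead of these bytes.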
Formula [1] for probabilistic ranking is the following sum for each query term:
Figure 1: Linear equation for probabilistic ranking restrictions in search request
Where:
wtf (weighted term frequency) is the sum of term frequencies (the number of times a term occurs in a document) of a given term multiplied by weights across all properties.
dl is the document length, measured as the total number of words (including noise words).
avdl is the average document length in the corpus, measured as the number of words (including noise words).
N is the total number of documents in the corpus.
n is the number of documents in the corpus that have the given query term.
qtf is the query term frequency (the number of times the given term occurs in the query); the sum is across all query terms.
k1, k3, and b are parameters, specified in the CProbRestriction structure.
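The figure for formula [1] is not reproduced in this text, so the following sketch only shows how the listed quantities typically combine. It assumes the widely used Okapi BM25 form, including a Robertson/Sparck Jones inverse document frequency weight, and it treats qtf as the number of times a term occurs in the query. Both are assumptions rather than the formula in Figure 1, and the names TermStats and bm25_style_score are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class TermStats:
    wtf: float  # weighted term frequency of the term in the document
    n: int      # number of documents in the corpus containing the term
    qtf: int    # occurrences of the term in the query (assumed meaning)


def bm25_style_score(terms, dl, avdl, N, k1, k3, b):
    """Sum an Okapi BM25-style contribution over all query terms.

    Assumed form, not taken from the specification's figure:
      idf * ((k1 + 1) * wtf) / (K + wtf) * ((k3 + 1) * qtf) / (k3 + qtf)
    with K = k1 * ((1 - b) + b * dl / avdl).
    """
    # Length normalization: b = 0 ignores document length, b = 1 applies it fully.
    K = k1 * ((1.0 - b) + b * dl / avdl)
    score = 0.0
    for t in terms:
        idf = math.log((N - t.n + 0.5) / (t.n + 0.5))
        doc_part = ((k1 + 1.0) * t.wtf) / (K + t.wtf)
        query_part = ((k3 + 1.0) * t.qtf) / (k3 + t.qtf)
        score += idf * doc_part * query_part
    return score


# One query term in a document of exactly average length.
example = bm25_style_score(
    [TermStats(wtf=1.0, n=10, qtf=1)],
    dl=100, avdl=100, N=100, k1=1.0, k3=1.0, b=0.75,
)
```

Under this assumed form, k1 controls how quickly repeated occurrences of a term in a document saturate, k3 does the same for repeated occurrences in the query, and b controls how strongly the score is normalized by document length.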