2.2.1.14 CProbRestriction

A CProbRestriction structure contains parameters for probabilistic ranking.


0


1


2


3


4


5


6


7


8


9

1
0


1


2


3


4


5


6


7


8


9

2
0


1


2


3


4


5


6


7


8


9

3
0


1

_Property (variable)

...

_flK1

_flK2

_flK3

_flB

_cFeedbackDoc

_ProbQueryPid

_Property (variable): A CFullPropSpec structure that indicates either the property to use for probabilistic ranking or the full property specification of a column group (which corresponds to the _groupPid field in the CColumnGroup structure). In the latter case, the CFullPropSpec structure MUST have the _guidPropSet field set to zero, the ulKind field set to PRSPEC_LPWSTR, and the Property name field set to the name of the referenced group property.

_flK1 (4 bytes): An IEEE 32-bit floating point number [IEEE754] that indicates parameter k1 in formula [1], specified below.

_flK2 (4 bytes): An IEEE 32-bit floating point number.

Note: MUST be set to 0.0.

_flK3 (4 bytes): An IEEE 32-bit floating point number that indicates parameter k3 in formula [1].

_flB (4 bytes): An IEEE 32-bit floating point number that indicates parameter b in formula [1] below.

_cFeedbackDoc (4 bytes): A 32-bit unsigned integer specifying the count of relevant documents.

_ProbQueryPid (4 bytes): A 32-bit unsigned integer.

Note: Reserved. MUST be set to 0x00000000.
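As an illustration of the fixed-size fields described above, the following Python sketch packs and unpacks the 24 bytes that follow _Property. It assumes little-endian byte order, and it does not handle the variable-length _Property field or any alignment padding around it. The function names are hypothetical and are not part of the protocol.

```python
import struct

# Assumed layout of the six 4-byte fields after _Property, little-endian:
# _flK1, _flK2, _flK3, _flB (float32), then _cFeedbackDoc, _ProbQueryPid (uint32).
TAIL_FORMAT = "<ffffII"
TAIL_SIZE = struct.calcsize(TAIL_FORMAT)  # 24 bytes


def pack_prob_restriction_tail(k1, k3, b, feedback_docs):
    """Serialize the fixed-size fields; _flK2 and _ProbQueryPid are forced to zero."""
    return struct.pack(TAIL_FORMAT, k1, 0.0, k3, b, feedback_docs, 0)


def unpack_prob_restriction_tail(data):
    """Parse the fixed-size fields and check the two MUST-be-zero constraints."""
    k1, k2, k3, b, feedback_docs, query_pid = struct.unpack(
        TAIL_FORMAT, data[:TAIL_SIZE]
    )
    if k2 != 0.0:
        raise ValueError("_flK2 MUST be set to 0.0")
    if query_pid != 0:
        raise ValueError("_ProbQueryPid MUST be set to 0x00000000")
    return {"k1": k1, "k3": k3, "b": b, "feedback_docs": feedback_docs}
```

A caller would append the packed bytes to an already serialized CFullPropSpec; how that structure is encoded and padded is outside the scope of this sketch.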

Formula [1] for probabilistic ranking is the following sum, taken across all query terms:

Figure 1: Linear equation for probabilistic ranking restrictions in search request

Where:

  • wtf (weighted term frequency) is the sum of term frequencies (the number of times a term occurs in a document) of a given term multiplied by weights across all properties.

  • dl is the document length, measured as the total number of words (including noise words).

  • avdl is the average document length in the corpus, measured as the number of words (including noise words).

  • N is the total number of documents in the corpus.

  • n is the number of documents in the corpus that have the given query term.

  • qtf is the number of documents containing the given query term; the sum is across all query terms.

  • k1, k3, and b are parameters, specified in the CProbRestriction structure.
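The symbols listed above match those of the widely used Okapi BM25 ranking function. The following Python sketch assumes that formula [1] has the common BM25 shape; the figure in the specification remains authoritative, and the exact form of each factor (in particular the logarithmic term) is an assumption here. The function names and the dictionary layout are illustrative only.

```python
import math


def term_score(wtf, dl, avdl, N, n, qtf, k1, k3, b):
    """Contribution of one query term, assuming the common Okapi BM25 form."""
    # Document-length normalization controlled by b (0 disables it).
    length_norm = k1 * ((1.0 - b) + b * dl / avdl)
    # Saturating weighted-term-frequency factor controlled by k1.
    tf_part = wtf * (k1 + 1.0) / (length_norm + wtf)
    # Rarity of the term in the corpus (assumed BM25-style logarithm).
    idf_part = math.log((N - n + 0.5) / (n + 0.5))
    # Saturating qtf factor controlled by k3.
    qtf_part = qtf * (k3 + 1.0) / (k3 + qtf)
    return tf_part * idf_part * qtf_part


def rank_score(terms, dl, avdl, N, k1, k3, b):
    """Sum the per-term contributions across all query terms."""
    return sum(
        term_score(t["wtf"], dl, avdl, N, t["n"], t["qtf"], k1, k3, b)
        for t in terms
    )
```

Under this assumed form, setting b to 0 removes the effect of document length, while larger values of b penalize documents that are longer than avdl.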