2.17 Full-Text Index Component

A full-text index component is a set of files that together contain all of the index keys extracted from a set of items. Each full-text index component is identified by an index identifier.

The index identifier is a numeric value in the range from 65,537 to 65,791 (0x10001 to 0x100FF in hexadecimal). The index server assigns an index identifier to every full-text index component. The index identifier for each full-text index component MUST be unique within the search scope of a full-text index catalog.
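The range and uniqueness rules above can be checked mechanically. The following is a minimal sketch, not part of the format definition; the names `INDEX_ID_MIN`, `INDEX_ID_MAX`, `is_valid_index_id`, and `check_catalog_index_ids` are hypothetical and chosen only for illustration.

```python
# Hypothetical helper names; only the numeric bounds come from this section.
INDEX_ID_MIN = 0x10001  # 65,537
INDEX_ID_MAX = 0x100FF  # 65,791


def is_valid_index_id(index_id: int) -> bool:
    """Return True if index_id lies in the permitted identifier range."""
    return INDEX_ID_MIN <= index_id <= INDEX_ID_MAX


def check_catalog_index_ids(index_ids) -> None:
    """Raise ValueError if any identifier is out of range or repeated.

    index_ids is assumed to be the identifiers of all full-text index
    components within the search scope of one full-text index catalog.
    """
    seen = set()
    for index_id in index_ids:
        if not is_valid_index_id(index_id):
            raise ValueError(f"index identifier out of range: {index_id:#x}")
        if index_id in seen:
            raise ValueError(f"duplicate index identifier: {index_id:#x}")
        seen.add(index_id)
```

The range admits 255 distinct identifiers, so under these rules a single catalog scope can hold at most 255 full-text index components at once.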

The individual files that belong to the same full-text index component MUST be identified based on a naming convention in which all file names derive from the index identifier. The naming convention for the files that make up a full-text index component is defined in section 2.17.1.

The following input parameters are required to read or write a full-text index component. For full-text index components in full-text index catalogs, these values are defined in the corresponding CIndexRecord structures in the catalog Index Table file.

  • DocIDMax: A document identifier value that is guaranteed to be greater than or equal to any document identifier value of any document in the document set represented by the full-text index component.

  • Format version: Determines several format variations for the subcomponents of the full-text index component. MUST be 0x52, 0x53, or 0x54.
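As an illustration of how a reader or writer might carry these two parameters, the sketch below groups them in a small validated record. The class name `ComponentParams`, its field names, and the `covers` method are assumptions made for this example; they do not describe the CIndexRecord layout.

```python
from dataclasses import dataclass

# The three format versions permitted by this section.
SUPPORTED_FORMAT_VERSIONS = frozenset({0x52, 0x53, 0x54})


@dataclass(frozen=True)
class ComponentParams:
    """Hypothetical container for the input parameters of a component."""

    doc_id_max: int      # DocIDMax: upper bound on every document identifier
    format_version: int  # MUST be 0x52, 0x53, or 0x54

    def __post_init__(self) -> None:
        if self.format_version not in SUPPORTED_FORMAT_VERSIONS:
            raise ValueError(
                f"unsupported format version: {self.format_version:#x}"
            )

    def covers(self, doc_id: int) -> bool:
        """Return True if doc_id does not exceed DocIDMax."""
        return doc_id <= self.doc_id_max
```

Because DocIDMax is an inclusive bound, a document identifier equal to DocIDMax is acceptable, while any larger value indicates that the parameters do not match the component being read.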

The following table enumerates the files that make up a full-text index component.

| Component name | Format | Example |
|----------------|--------|---------|
| Content Index | Content Index File Format (section 2.3) | Section 3.1.5 |
| Content Index Extension (optional)<32> | Content Index Extension File Format (section 2.6) | Section 3.1.6.2 |
| Content Index Directory | Index Directory File Format (section 2.5) | Section 3.1.6.1 |
| Basic Scope Index | Scope Index File Format (section 2.4) | Section 3.1.4 |
| Basic Scope Index Directory | Index Directory File Format (section 2.5) | Section 3.1.3 |
| Compound Scope Index | Scope Index File Format (section 2.4) | Section 3.1.2 |
| Compound Scope Index Directory | Index Directory File Format (section 2.5) | Section 3.1.1 |
| Document Set | Document Set Files Format (section 2.7) | Section 3.1.7 |

Content Index: A content index file that contains content index keys generated from the words extracted from the properties of the indexed items. The DocIDMax and format version parameters, as specified earlier in this section, determine the representation of this component.

Content Index Extension (optional): A CIX file associated with the full-text index component. This file is not present if the version is 0x52.<33>

Content Index Directory: An index directory file associated with the full-text index component.

Basic Scope Index: A scope index file that contains records with either basic scope index keys or anchor scope index keys.

Basic Scope Index Directory: An index directory file associated with the basic scope index.

Compound Scope Index: A scope index file for which the sort keys are compound scope index keys.

Compound Scope Index Directory: An index directory file associated with the compound scope index.

Document Set: The document set files (section 2.7) associated with the full-text index component.
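The dependency between the format version and the Content Index Extension can be sketched as a simple filter over the subcomponent list. This is an illustrative example only: the function name `possible_subcomponents` is hypothetical, the strings are the component names from the table in this section, and the sketch ignores any product-specific behavior described in the endnotes.

```python
# Subcomponent names as listed in the table of this section.
SUBCOMPONENTS = (
    "Content Index",
    "Content Index Extension",
    "Content Index Directory",
    "Basic Scope Index",
    "Basic Scope Index Directory",
    "Compound Scope Index",
    "Compound Scope Index Directory",
    "Document Set",
)


def possible_subcomponents(format_version: int) -> list:
    """List the subcomponents that can appear for a given format version.

    The Content Index Extension is optional and is never present when the
    format version is 0x52, so it is filtered out in that case.
    """
    if format_version not in (0x52, 0x53, 0x54):
        raise ValueError(f"unsupported format version: {format_version:#x}")
    return [
        name
        for name in SUBCOMPONENTS
        if not (name == "Content Index Extension" and format_version == 0x52)
    ]
```

For versions 0x53 and 0x54 the extension remains optional, so a reader should treat its absence as valid rather than as a corrupt component.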