ShingleTokenFilter Class

Combines adjacent tokens into single tokens, known as shingles (word n-grams). This token filter is implemented using Apache Lucene.

All required parameters must be populated before the object is sent to Azure.

Inheritance
azure.search.documents.indexes._generated.models._models_py3.TokenFilter
ShingleTokenFilter

Constructor

ShingleTokenFilter(*, name: str, max_shingle_size: Optional[int] = 2, min_shingle_size: Optional[int] = 2, output_unigrams: Optional[bool] = True, output_unigrams_if_no_shingles: Optional[bool] = False, token_separator: Optional[str] = ' ', filter_token: Optional[str] = '_', **kwargs)
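Since `name` is the only keyword argument without a default, it can be useful to check it locally before sending an index definition. The following is a minimal sketch, assuming the naming rule documented under Parameters can be expressed as a regular expression over ASCII letters and digits. `is_valid_filter_name` is a hypothetical helper, not part of the SDK, and the service remains the final authority on what it accepts.

```python
import re

# Hypothetical helper, not part of azure-search-documents.
# Assumption: "letters" and "digits" mean ASCII alphanumerics.
# First and last characters must be alphanumeric; spaces, dashes and
# underscores are allowed only in between.
_NAME_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9 _-]*[A-Za-z0-9])?")


def is_valid_filter_name(name: str) -> bool:
    """Return True if `name` satisfies the documented naming rule."""
    return len(name) <= 128 and _NAME_RE.fullmatch(name) is not None
```

For example, `is_valid_filter_name("my-shingle_filter 1")` passes this check, while `"_bad"` and `"bad-"` fail because they do not start or end with an alphanumeric character.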

Parameters

odata_type
str
Required

Required. Identifies the concrete type of the token filter. Constant filled by server.

name
str
Required

Required. The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

max_shingle_size
int
Optional

The maximum shingle size. Default and minimum value is 2.

min_shingle_size
int
Optional

The minimum shingle size. Default and minimum value is 2. Must be less than the value of max_shingle_size.

output_unigrams
bool
Optional

A value indicating whether the output stream will contain the input tokens (unigrams) as well as shingles. Default is True.

output_unigrams_if_no_shingles
bool
Optional

A value indicating whether to output unigrams when no shingles are available. This property takes precedence when output_unigrams is set to False. Default is False.

token_separator
str
Optional

The string to use when joining adjacent tokens to form a shingle. Default is a single space (" ").

filter_token
str
Optional

The string to insert for each position at which there is no token. Default is an underscore ("_").
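
To make the effect of these parameters concrete, here is a simplified pure-Python model of shingling. It is a sketch under stated assumptions, not the Lucene implementation: `shingle` is a hypothetical function, the input is taken to be an already tokenized list with no position gaps (so `filter_token` is not modeled), and `output_unigrams_if_no_shingles` is treated as applying to the whole token stream.

```python
from typing import List


def shingle(
    tokens: List[str],
    *,
    max_shingle_size: int = 2,
    min_shingle_size: int = 2,
    output_unigrams: bool = True,
    output_unigrams_if_no_shingles: bool = False,
    token_separator: str = " ",
) -> List[str]:
    """Simplified model of a shingle filter (hypothetical, not the SDK)."""
    # Assumption: equal sizes are allowed, since both defaults are 2.
    if min_shingle_size < 2 or max_shingle_size < min_shingle_size:
        raise ValueError("need 2 <= min_shingle_size <= max_shingle_size")

    out: List[str] = []
    made_shingle = False
    for start in range(len(tokens)):
        # Emit the original token first when unigrams are requested.
        if output_unigrams:
            out.append(tokens[start])
        # Then emit each shingle starting here, smallest size first.
        for size in range(min_shingle_size, max_shingle_size + 1):
            end = start + size
            if end > len(tokens):
                break
            out.append(token_separator.join(tokens[start:end]))
            made_shingle = True

    # Fall back to unigrams only if no shingle could be formed at all.
    if not output_unigrams and output_unigrams_if_no_shingles and not made_shingle:
        return list(tokens)
    return out


print(shingle(["please", "divide", "this"]))
# ['please', 'please divide', 'divide', 'divide this', 'this']
```

With `output_unigrams=False` the same input yields only `['please divide', 'divide this']`. A single-token input yields an empty list unless `output_unigrams_if_no_shingles=True`, in which case this model returns the token itself.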