ShingleTokenFilter Class
Creates combinations of tokens as a single token. This token filter is implemented using Apache Lucene.
All required parameters must be populated in order to send to Azure.
- Inheritance
- azure.search.documents.indexes._generated.models._models_py3.TokenFilter -> ShingleTokenFilter
Constructor
ShingleTokenFilter(*, name: str, max_shingle_size: Optional[int] = 2, min_shingle_size: Optional[int] = 2, output_unigrams: Optional[bool] = True, output_unigrams_if_no_shingles: Optional[bool] = False, token_separator: Optional[str] = ' ', filter_token: Optional[str] = '_', **kwargs)
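As a rough illustration of how these keyword arguments relate to the JSON definition sent to the service, the sketch below builds the equivalent dictionary by hand. The helper name `shingle_filter_payload` is hypothetical and not part of the SDK. The camelCase keys `maxShingleSize` and `outputUnigrams` appear in the parameter descriptions on this page; the remaining keys and the `@odata.type` value are assumed from the REST API's naming and should be checked against the service reference.

```python
from typing import Any, Dict


def shingle_filter_payload(
    name: str,
    *,
    max_shingle_size: int = 2,
    min_shingle_size: int = 2,
    output_unigrams: bool = True,
    output_unigrams_if_no_shingles: bool = False,
    token_separator: str = " ",
    filter_token: str = "_",
) -> Dict[str, Any]:
    """Hypothetical helper: mirror the constructor arguments as a JSON-style dict."""
    return {
        # Assumed discriminator value for this token filter type.
        "@odata.type": "#Microsoft.Azure.Search.ShingleTokenFilter",
        "name": name,
        "maxShingleSize": max_shingle_size,
        "minShingleSize": min_shingle_size,
        "outputUnigrams": output_unigrams,
        "outputUnigramsIfNoShingles": output_unigrams_if_no_shingles,
        "tokenSeparator": token_separator,
        "filterToken": filter_token,
    }


payload = shingle_filter_payload("my-shingles", max_shingle_size=3)
```

In practice the SDK class performs this serialization itself; the dictionary is only meant to show which constructor argument corresponds to which service-side property.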
Parameters
- odata_type
- str
Required. Identifies the concrete type of the token filter. Constant filled by server.
- name
- str
Required. The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
- max_shingle_size
- int
The maximum shingle size. Default and minimum value is 2.
- min_shingle_size
- int
The minimum shingle size. Default and minimum value is 2. Must not exceed the value of max_shingle_size (maxShingleSize in the REST API); both default to 2.
- output_unigrams
- bool
A value indicating whether the output stream will contain the input tokens (unigrams) as well as shingles. Default is true.
- output_unigrams_if_no_shingles
- bool
A value indicating whether to output unigrams when no shingles are available. This property takes precedence when output_unigrams is set to false. Default is false.
- token_separator
- str
The string to use when joining adjacent tokens to form a shingle. Default is a single space (" ").
- filter_token
- str
The string to insert for each position at which there is no token. Default is an underscore ("_").
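The interaction of these parameters can be sketched in plain Python. This is a simplified illustration of shingling, not the Lucene implementation the service uses, and it does not attempt to reproduce every Lucene edge case. The function name `shingles` and the convention of using `None` to mark a position with no token are assumptions made for the example.

```python
from typing import List, Optional, Sequence


def shingles(
    tokens: Sequence[Optional[str]],
    *,
    max_shingle_size: int = 2,
    min_shingle_size: int = 2,
    output_unigrams: bool = True,
    output_unigrams_if_no_shingles: bool = False,
    token_separator: str = " ",
    filter_token: str = "_",
) -> List[str]:
    """Illustrative shingling; None marks a position with no token."""
    if not 2 <= min_shingle_size <= max_shingle_size:
        raise ValueError("expected 2 <= min_shingle_size <= max_shingle_size")

    # Inside shingles, empty positions are represented by the filler string.
    filled = [filter_token if t is None else t for t in tokens]

    out: List[str] = []
    made_shingle = False
    for start, token in enumerate(tokens):
        # Emit the original token first when unigrams are requested.
        if output_unigrams and token is not None:
            out.append(token)
        # Then emit every shingle of an allowed size starting at this position.
        for size in range(min_shingle_size, max_shingle_size + 1):
            if start + size > len(filled):
                break
            out.append(token_separator.join(filled[start:start + size]))
            made_shingle = True

    # Fall back to unigrams only when nothing could be shingled.
    if not made_shingle and not output_unigrams and output_unigrams_if_no_shingles:
        out = [t for t in tokens if t is not None]
    return out


print(shingles(["please", "divide", "this"]))
# ['please', 'please divide', 'divide', 'divide this', 'this']
```

With `output_unigrams=False`, a single-token input produces no output unless `output_unigrams_if_no_shingles=True`, in which case the lone token is returned unchanged.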