ClassicTokenizer Class
Grammar-based tokenizer that is suitable for processing most European-language documents. This tokenizer is implemented using Apache Lucene.
All required parameters must be populated in order to send to Azure.
- Inheritance
- azure.search.documents.indexes._generated.models._models_py3.LexicalTokenizer
- ClassicTokenizer
Constructor
ClassicTokenizer(*, name: str, max_token_length: Optional[int] = 255, **kwargs)
Parameters
- odata_type
- str
Required. Identifies the concrete type of the tokenizer. Constant filled by server.
- name
- str
Required. The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
- max_token_length
- int
The maximum token length. Default is 255. Tokens longer than the maximum length are split. The maximum token length that can be used is 300 characters.