EdgeNGramTokenizer Class

Tokenizes the input from an edge into n-grams of the given size(s). This tokenizer is implemented using Apache Lucene.

All required parameters must be populated in order to send to Azure.

Inheritance
azure.search.documents.indexes._generated.models._models_py3.LexicalTokenizer
EdgeNGramTokenizer

Constructor

EdgeNGramTokenizer(*, name: str, min_gram: Optional[int] = 1, max_gram: Optional[int] = 2, token_chars: Optional[List[Union[str, azure.search.documents.indexes._generated.models._search_client_enums.TokenCharacterKind]]] = None, **kwargs)

Parameters

odata_type
str
Required

Required. Identifies the concrete type of the tokenizer. Constant filled by server.

name
str
Required

Required. The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

min_gram
int
Default value: 1

The minimum n-gram length. Default is 1. Maximum is 300. Must be less than the value of maxGram.

max_gram
int
Default value: 2

The maximum n-gram length. Default is 2. Maximum is 300.

token_chars
list[str or TokenCharacterKind]
Default value: None

Character classes to keep in the tokens. Possible values include: "letter", "digit", "whitespace", "punctuation", "symbol".
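To illustrate what this tokenizer produces, here is a minimal pure-Python sketch of edge n-gram tokenization. This is not the Lucene implementation used by the service; it is a simplified model that splits on whitespace and emits n-grams anchored at the leading edge of each token, using the same defaults as EdgeNGramTokenizer (min_gram=1, max_gram=2).

```python
def edge_ngrams(text, min_gram=1, max_gram=2):
    """Emit edge n-grams (prefixes) of each whitespace-separated token.

    Simplified model of edge n-gram tokenization: for each token, yield its
    prefixes of length min_gram through max_gram (capped at the token length).
    """
    grams = []
    for token in text.split():
        for n in range(min_gram, min(max_gram, len(token)) + 1):
            grams.append(token[:n])
    return grams

print(edge_ngrams("quick fox"))            # ['q', 'qu', 'f', 'fo']
print(edge_ngrams("search", 2, 4))         # ['se', 'sea', 'sear']
```

Edge n-grams of this kind are commonly used for search-as-you-type scenarios, since each prefix of a term is indexed as its own token.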