Azure Search provides the Lucene Query Syntax for cases that require fine-grained control.
How the simple parser works with your application
As noted, any text with one or more terms is considered a valid starting point for query execution. Azure Search will match documents containing any or all of the terms, including any variations found during analysis of the text.
As straightforward as this sounds, there is one aspect of query execution in Azure Search that might produce unexpected results, increasing rather than decreasing search results as more terms and operators are added to the input string.
Whether this expansion actually occurs depends on the inclusion of a NOT operator, combined with a searchMode parameter setting that determines how NOT is interpreted in terms of AND or OR behaviors. Given the default, searchMode=any, and a NOT operator, the operation is computed as an OR action, such that "New York" NOT Seattle returns all cities that are not Seattle.
Typically, you're more likely to see these behaviors in user interaction patterns for applications that search over content, where users are more likely to include an operator in a query, as opposed to e-commerce sites that have more built-in navigation structures. See below for more detail about
Additionally, the following characters may be used to fine-tune the query:
The AND operator is a plus sign. For example, wifi+luxury will search for documents containing both wifi and luxury.
The OR operator is a vertical bar or pipe character. For example, wifi | luxury will search for documents containing either wifi or luxury or both.
The NOT operator is a minus sign. For example, wifi -luxury will search for documents that have the wifi term and/or do not have luxury (and/or is controlled by searchMode).
The searchMode option controls whether a term with the NOT operator is ANDed or ORed with the other terms in the query in the absence of a + or | operator. Recall that searchMode can be set to either any (default) or all.
If you use any, it will increase the recall of queries by including more results, and by default - will be interpreted as "OR NOT". For example, wifi -luxury will match documents that either contain the term wifi or do not contain the term luxury.
If you use all, it will increase the precision of queries by including fewer results, and by default - will be interpreted as "AND NOT". For example, wifi -luxury will match documents that contain the term wifi and do not contain the term luxury. This is arguably a more intuitive behavior for the - operator. Therefore, you should consider using searchMode=all instead of searchMode=any if you want to optimize searches for precision instead of recall and your users frequently use the - operator in searches.
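The difference between the two modes can be sketched with a small in-memory model. This is only an illustration of the OR NOT versus AND NOT logic for a query such as wifi -luxury; it does not call the service, and the function name and sample documents are hypothetical.

```python
def matches(doc_terms, required, negated, search_mode="any"):
    """Toy model of a query with one plain term and one negated term."""
    has_required = required in doc_terms
    lacks_negated = negated not in doc_terms
    if search_mode == "any":
        # "-" behaves like OR NOT: either condition is enough.
        return has_required or lacks_negated
    if search_mode == "all":
        # "-" behaves like AND NOT: both conditions must hold.
        return has_required and lacks_negated
    raise ValueError("search_mode must be 'any' or 'all'")


# Hypothetical documents, each reduced to its set of analyzed terms.
docs = {
    "hotel-a": {"wifi", "luxury"},
    "hotel-b": {"wifi", "budget"},
    "hotel-c": {"luxury", "spa"},
    "hotel-d": {"budget", "parking"},
}


def search(search_mode):
    return sorted(
        name
        for name, terms in docs.items()
        if matches(terms, "wifi", "luxury", search_mode)
    )


any_hits = search("any")  # ['hotel-a', 'hotel-b', 'hotel-d']
all_hits = search("all")  # ['hotel-b']
```

In this model, any returns three of the four documents, including one that mentions neither term, while all returns only the document that has wifi and lacks luxury.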
The suffix operator is an asterisk. For example,
lux* will search for documents that have a term that starts with
lux, ignoring case.
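As a rough sketch of what prefix matching means, the following models the suffix operator over a list of terms. It assumes a plain case-insensitive starts-with comparison and is not the service's implementation; the function name is hypothetical.

```python
def prefix_match(doc_terms, prefix_query):
    """Return True if any term starts with the prefix, ignoring case."""
    prefix = prefix_query.rstrip("*").lower()
    return any(term.lower().startswith(prefix) for term in doc_terms)


found = prefix_match(["Luxury", "hotel"], "lux*")   # True: "Luxury" starts with "lux"
missed = prefix_match(["deluxe", "hotel"], "lux*")  # False: "lux" is not at the start
```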
Phrase search operator
The phrase operator encloses a phrase in quotation marks. For example, while Roach Motel (without quotes) would search for documents containing Roach and/or Motel anywhere in any order, "Roach Motel" (with quotes) will only match documents that contain that whole phrase together and in that order (text analysis still applies).
The precedence operator encloses the string in parentheses. For example, motel+(wifi | luxury) will search for documents containing the motel term and either wifi or luxury (or both).
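The effect of the parentheses can be illustrated with set operations over a toy inverted index. The index contents and document IDs below are hypothetical, and the second expression is shown only to contrast a different grouping, not to describe how the service parses a query without parentheses.

```python
# Hypothetical inverted index: term -> IDs of documents containing it.
index = {
    "motel": {1, 2, 3},
    "wifi": {2, 4},
    "luxury": {3, 5},
}

# motel+(wifi | luxury): motel AND (wifi OR luxury)
grouped = index["motel"] & (index["wifi"] | index["luxury"])

# A different grouping for comparison: (motel AND wifi) OR luxury
regrouped = (index["motel"] & index["wifi"]) | index["luxury"]

print(sorted(grouped))    # [2, 3]
print(sorted(regrouped))  # [2, 3, 5]
```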
Guidance for using search operators
To use any of the above symbols as an actual part of the search text, escape it by prefixing it with a backslash. For example, luxury\+hotel will result in the term luxury+hotel. To keep things simple for the more typical cases, there are two exceptions to this rule where escaping is not needed:
- The NOT operator - only needs to be escaped if it's the first character after whitespace, not if it's in the middle of a term. For example, wi-fi is a single term, and GUIDs (such as 3352CDD0-EF30-4A2E-A512-3B30AF40F3FD) are likewise treated as a single token.
- The suffix operator * only needs to be escaped if it's the last character before whitespace, not if it's in the middle of a term. For example, wi*fi is treated as a single token.
Although escaping keeps tokens together, text analysis may split them up, depending on the analysis mode. See Language support (Azure Search Service REST API) for details.
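The escaping rules above could be applied to user input with a small helper like the one below. This is a minimal sketch, not part of any Azure SDK: the function name is hypothetical, it treats the start and end of the string like whitespace (an assumption), and it leaves any backslash already in the input untouched.

```python
# Symbols that are always escaped when meant literally.
ALWAYS_ESCAPE = set('+|"()')


def escape_simple_query(text):
    """Backslash-escape operator characters so they are treated as text."""
    out = []
    for i, ch in enumerate(text):
        # Assumption: string boundaries count as whitespace.
        prev_is_space = i == 0 or text[i - 1].isspace()
        next_is_space = i == len(text) - 1 or text[i + 1].isspace()
        if ch in ALWAYS_ESCAPE:
            out.append("\\" + ch)
        elif ch == "-" and prev_is_space:
            # "-" is only an operator at the start of a term.
            out.append("\\-")
        elif ch == "*" and next_is_space:
            # "*" is only an operator at the end of a term.
            out.append("\\*")
        else:
            out.append(ch)
    return "".join(out)


print(escape_simple_query("luxury+hotel"))  # luxury\+hotel
print(escape_simple_query("wi-fi"))         # wi-fi
print(escape_simple_query("-wifi"))         # \-wifi
```

A helper like this is only appropriate when the whole input should be treated as literal text; if users are expected to type operators deliberately, escaping everything would disable them.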