Service limits in Azure Cognitive Search
Maximum limits on storage, workloads, and quantities of indexes and other objects depend on whether you provision Azure Cognitive Search at Free, Basic, Standard, or Storage Optimized pricing tiers.
Free is a multi-tenant shared service that comes with your Azure subscription. Indexing and query requests execute on replicas and partitions that are used by other tenants.
Basic provides dedicated computing resources for production workloads at a smaller scale, but shares some networking infrastructure with other tenants.
Standard runs on dedicated machines with more storage and processing capacity at every level. Standard comes in four levels: S1, S2, S3, and S3 HD.
Storage Optimized runs on dedicated machines with more total storage, storage bandwidth, and memory than Standard. Storage Optimized comes in two levels: L1 and L2
As of July 1, all tiers are generally available, including the Storage Optimized tier. All pricing can be found on the Pricing Details page.
S3 High Density (S3 HD) is engineered for specific workloads: multi-tenancy and large quantities of small indexes (one million documents per index, three thousand indexes per service). This tier does not provide the indexer feature. On S3 HD, data ingestion must leverage the push approach, using API calls to push data from source to index.
A service is provisioned at a specific tier. Jumping tiers to gain capacity involves provisioning a new service (there is no in-place upgrade). For more information, see Choose a SKU or tier. To learn more about adjusting capacity within a service you've already provisioned, see Scale resource levels for query and indexing workloads.
You can create multiple services within a subscription. Each one can be provisioned at a specific tier. You're limited only by the number of services allowed at each tier. For example, you could create up to 12 services at the Basic tier and another 12 services at the S1 tier within the same subscription. For more information about tiers, see Choose an SKU or tier for Azure Cognitive Search.
Maximum service limits can be raised upon request. If you need more services within the same subscription, contact Azure Support.
|Maximum scale in search units (SU)2||N/A||3 SU||36 SU||36 SU||36 SU||36 SU||36 SU||36 SU|
1 Free is based on shared, not dedicated, resources. Scale-up is not supported on shared resources.
2 Search units are billing units, allocated as either a replica or a partition. You need both resources for storage, indexing, and query operations. To learn more about SU computations, see Scale resource levels for query and index workloads.
Storage is constrained by disk space or by a hard limit on the maximum number of indexes, document, or other high-level resources, whichever comes first. The following table documents storage limits. For maximum limits on indexes, documents, and other objects, see Limits by resource.
|Service level agreement (SLA)3||No||Yes||Yes||Yes||Yes||Yes||Yes||Yes|
|Storage per partition||50 MB||2 GB||25 GB||100 GB||200 GB||200 GB||1 TB||2 TB|
|Partitions per service||N/A||1||12||12||12||3||12||12|
|Partition size||N/A||2 GB||25 GB||100 GB||200 GB||200 GB||1 TB||2 TB|
1 Basic has one fixed partition. At this tier, additional search units are used for allocating more replicas for increased query workloads.
2 S3 HD has a hard limit of three partitions, which is lower than the partition limit for S3. The lower partition limit is imposed because the index count for S3 HD is substantially higher. Given that service limits exist for both computing resources (storage and processing) and content (indexes and documents), the content limit is reached first.
3 Service level agreements are offered for billable services on dedicated resources. Free services and preview features have no SLA. For billable services, SLAs take effect when you provision sufficient redundancy for your service. Two or more replicas are required for query (read) SLAs. Three or more replicas are required for query and indexing (read-write) SLAs. The number of partitions isn't an SLA consideration.
|Resource||Free||Basic 1||S1||S2||S3||S3 HD||L1||L2|
|Maximum indexes||3||5 or 15||50||200||200||1000 per partition or 3000 per service||10||10|
|Maximum simple fields per index||1000||100||1000||1000||1000||1000||1000||1000|
|Maximum complex collection fields per index||40||40||40||40||40||40||40||40|
|Maximum elements across all complex collections per document 2||3000||3000||3000||3000||3000||3000||3000||3000|
|Maximum depth of complex fields||10||10||10||10||10||10||10||10|
|Maximum suggesters per index||1||1||1||1||1||1||1||1|
|Maximum scoring profiles per index||100||100||100||100||100||100||100||100|
|Maximum functions per profile||8||8||8||8||8||8||8||8|
1 Basic services created before December 2017 have lower limits (5 instead of 15) on indexes. Basic tier is the only SKU with a lower limit of 100 fields per index.
2 Having a very large number of elements in complex collections per document currently causes high storage utilization. This is a known issue. In the meantime, a limit of 3000 is a safe upper bound for all service tiers. This limit is only enforced for indexing operations that utilize the earliest generally available (GA) API version that supports complex type fields (
2019-05-06) onwards. To not break clients who might be using earlier preview API versions (that support complex type fields), we will not be enforcing this limit for indexing operations that use these preview API versions. Note that preview API versions are not meant to be used for production scenarios and we highly recommend customers move to the latest GA API version.
As of October 2018, there are no longer any document limits for any new service created at any billable tier (Basic, S1, S2, S3, S3 HD) in any region. While most regions have had unlimited document counts since November/December 2017, there were a few regions that continued to impose document limits after that date. Depending on when and where you created a search service, you might be running a service that is still subject to document limits.
To determine whether your service has document limits, use the GET Service Statistics REST API. Document limits are reflected in the response, with
null indicating no limits.
Even though there aren't any SKU specific document limits, every index is still subject to a maximum safe limit to ensure stability of the service. This limit comes from Lucene. Every Azure Cognitive Search document is internally indexed as one or more Lucene documents. The number of Lucene documents per search document depends on the total number of elements in complex collection fields. Each element is indexed as a separate Lucene document. For example, a document with 3 elements in a complex collection field, will be indexed as 4 Lucene documents - 1 for the document itself and 3 for the elements. The maximum number of Lucene documents is roughly 25 billion per index.
Regions previously having document limits
If the portal indicates a document limit, your service was either created before late 2017, or it was created on a data center using lower-capacity clusters for hosting Azure Cognitive Search services:
- Australia East
- East Asia
- Central India
- Japan West
- West Central US
For services subject to document limits, the following maximum limits apply:
|10,000||1 million||15 million per partition or 180 million per service||60 million per partition or 720 million per service||120 million per partition or 1.4 billion per service||1 million per index or 200 million per partition|
If your service has limits that are blocking you, create a new service and then republish all content to that service. There is no mechanism for seamlessly reprovisioning your service onto new hardware behind the scenes.
For S3 High Density services created after late 2017, the 200 million document per partition has been removed but the 1 million document per index limit remains.
Document size limits per API call
The maximum document size when calling an Index API is approximately 16 megabytes.
Document size is actually a limit on the size of the Index API request body. Since you can pass a batch of multiple documents to the Index API at once, the size limit realistically depends on how many documents are in the batch. For a batch with a single document, the maximum document size is 16 MB of JSON.
To keep document size down, remember to exclude non-queryable data from the request. Images and other binary data are not directly queryable and shouldn't be stored in the index. To integrate non-queryable data into search results, define a non-searchable field that stores a URL reference to the resource.
Maximum running times exist to provide balance and stability to the service as a whole, but larger data sets might need more indexing time than the maximum allows. If an indexing job cannot complete within the maximum time allowed, try running it on a schedule. The scheduler keeps track of indexing status. If a scheduled indexing job is interrupted for any reason, the indexer can pick up where it last left off at the next scheduled run.
|Resource||Free 1||Basic 2||S1||S2||S3||S3 HD 3||L1||L2|
|Maximum indexers||3||5 or 15||50||200||200||N/A||10||10|
|Maximum datasources||3||5 or 15||50||200||200||N/A||10||10|
|Maximum skillsets 4||3||5 or 15||50||200||200||N/A||10||10|
|Maximum indexing load per invocation||10,000 documents||Limited only by maximum documents||Limited only by maximum documents||Limited only by maximum documents||Limited only by maximum documents||N/A||No limit||No limit|
|Minimum schedule||5 minutes||5 minutes||5 minutes||5 minutes||5 minutes||5 minutes||5 minutes||5 minutes|
|Maximum running time 5||1-3 minutes||24 hours||24 hours||24 hours||24 hours||N/A||24 hours||24 hours|
|Maximum running time for cognitive search skillsets or blob indexing with image analysis 5||3-10 minutes||2 hours||2 hours||2 hours||2 hours||N/A||2 hours||2 hours|
|Blob indexer: maximum blob size, MB||16||16||128||256||256||N/A||256||256|
|Blob indexer: maximum characters of content extracted from a blob||32,000||64,000||4 million||4 million||4 million||N/A||4 million||4 million|
1 Free services have indexer maximum execution time of 3 minutes for blob sources and 1 minute for all other data sources. For AI indexing that calls into Cognitive Services, free services are limited to 20 free transactions per day, where a transaction is defined as a document that successfully passes through the enrichment pipeline.
2 Basic services created before December 2017 have lower limits (5 instead of 15) on indexers, data sources, and skillsets.
3 S3 HD services do not include indexer support.
4 Maximum of 30 skills per skillset.
5 Cognitive search workloads and image analysis in Azure blob indexing have shorter running times than regular text indexing. Image analysis and natural language processing are computationally intensive and consume disproportionate amounts of available processing power. Running time was reduced to give other jobs in the queue an opportunity to run.
As stated in the Index limits, indexers will also enforce the upper limit of 3000 elements across all complex collections per document starting with the latest GA API version that supports complex types (
2019-05-06) onwards. This means that if you've created your indexer with a prior API version, you will not be subject to this limit. To preserve maximum compatibility, an indexer that was created with a prior API version and then updated with an API version
2019-05-06 or later, will still be excluded from the limits. Customers should be aware of the adverse impact of having very large complex collections (as stated previously) and we highly recommend creating any new indexers with the latest GA API version.
The maximum number of synonym maps allowed varies by pricing tier. Each rule can have up to 20 expansions, where an expansion is an equivalent term. For example, given "cat", association with "kitty", "feline", and "felis" (the genus for cats) would count as 3 expansions.
|Maximum synonym maps||3||3||5||10||20||20||10||10|
|Maximum number of rules per map||5000||20000||20000||20000||20000||20000||20000||20000|
Queries per second (QPS)
QPS estimates must be developed independently by every customer. Index size and complexity, query size and complexity, and the amount of traffic are primary determinants of QPS. There is no way to offer meaningful estimates when such factors are unknown.
Estimates are more predictable when calculated on services running on dedicated resources (Basic and Standard tiers). You can estimate QPS more closely because you have control over more of the parameters. For guidance on how to approach estimation, see Azure Cognitive Search performance and optimization.
For the Storage Optimized tiers, you should expect a lower query throughput and higher latency than the Standard tiers. The methodology for estimating the query performance you'll experience is the same as the Standard tiers.
Data limits (AI enrichment)
An AI enrichment pipeline that makes calls to a Text Analytics resource for entity recognition, key phrase extraction, sentiment analysis, and language detection is subject to data limits. The maximum size of a record should be 50,000 characters as measured by
String.Length. If you need to break up your data before sending it to the sentiment analyzer, use the Text Split skill.
Search query and indexing requests are throttled as the system approaches peak capacity. Throttling behaves differently for different APIs. Query APIs (Search/Suggest/Autocomplete) and indexing APIs throttle dynamically based on the load on the service. Index APIs have static request rate limits.
Static rate request limits for operations related to an index:
- List Indexes (GET /indexes): 5 per second per search unit
- Get Index (GET /indexes/myindex): 10 per second per search unit
- Create Index (POST /indexes): 12 per minute per search unit
- Create or Update Index (PUT /indexes/myindex): 6 per second per search unit
- Delete Index (DELETE /indexes/myindex): 12 per minute per search unit
API request limits
- Maximum of 16 MB per request 1
- Maximum 8 KB URL length
- Maximum 1000 documents per batch of index uploads, merges, or deletes
- Maximum 32 fields in $orderby clause
- Maximum search term size is 32,766 bytes (32 KB minus 2 bytes) of UTF-8 encoded text
1 In Azure Cognitive Search, the body of a request is subject to an upper limit of 16 MB, imposing a practical limit on the contents of individual fields or collections that are not otherwise constrained by theoretical limits (see Supported data types for more information about field composition and restrictions).
API response limits
- Maximum 1000 documents returned per page of search results
- Maximum 100 suggestions returned per Suggest API request
API key limits
API keys are used for service authentication. There are two types. Admin keys are specified in the request header and grant full read-write access to the service. Query keys are read-only, specified on the URL, and typically distributed to client applications.
- Maximum of 2 admin keys per service
- Maximum of 50 query keys per service