Indexer operations (Azure Search Service REST API)

An indexer is a resource that crawls a data source and loads documents into a target search index. Indexers cover three main scenarios:

  • Perform a one-time copy of the data to populate an index.

  • Sync an index with incremental changes from the data source on a recurring schedule. The schedule is part of the indexer definition.

  • Invoke an indexer on demand to update an index as needed.

    All of these scenarios rely on the Run Indexer (Azure Search Service REST API) operation, which loads data from supported data sources. You can invoke it as a standalone operation or have the built-in scheduler invoke it for you.
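    As a rough illustration of the recurring-sync scenario, the sketch below builds an indexer definition body that carries its own schedule. The property names (`dataSourceName`, `targetIndexName`, `schedule.interval`) reflect our understanding of the Create Indexer request body and should be checked against that reference. The helper function, the resource names, and the two-hour interval are hypothetical placeholders.

```python
import json


def build_indexer_definition(name, data_source_name, target_index_name, interval=None):
    """Return an indexer definition as a dict (hypothetical helper).

    interval is an ISO 8601 duration such as "PT2H". When it is omitted,
    no schedule is attached, so the indexer is not run on a recurring basis.
    """
    definition = {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": target_index_name,
    }
    if interval is not None:
        definition["schedule"] = {"interval": interval}
    return definition


# Placeholder names, assumed for illustration only.
scheduled = build_indexer_definition("hotels-indexer", "hotels-ds", "hotels", "PT2H")
print(json.dumps(scheduled, indent=2))
```

    Leaving out the interval yields a definition suited to the one-time copy and on-demand scenarios, where runs are triggered explicitly.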

Supportability

Version 2016-09-01 of the Service REST API adds support for the Azure Blob and Azure Table indexers. Previously, these features were only available in the preview API.

Note

Indexing of CSV blobs and blobs containing JSON arrays is still in preview and is therefore not supported in the 2016-09-01 API version.

A data source specifies what data needs to be indexed, credentials to access the data, and policies to enable Azure Search to efficiently identify changes in the data (such as modified or deleted rows in a database table). It's defined as an independent resource so that it can be used by multiple indexers.
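To make the three parts of a data source concrete (what to index, credentials, and a change detection policy), here is a minimal sketch of a definition body. The property names and the high water mark policy type are assumptions based on the Create Data Source request body; the helper function, connection string, table name, and column name are hypothetical placeholders.

```python
import json


def build_data_source_definition(name, source_type, connection_string,
                                 container_name, high_water_mark_column=None):
    """Return a data source definition as a dict (hypothetical helper)."""
    definition = {
        "name": name,
        "type": source_type,  # assumed values include "azuresql", "azureblob", "azuretable"
        "credentials": {"connectionString": connection_string},
        "container": {"name": container_name},  # table, collection, or blob container
    }
    if high_water_mark_column is not None:
        # Policy that lets the service pick up only rows changed since the last run.
        definition["dataChangeDetectionPolicy"] = {
            "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
            "highWaterMarkColumnName": high_water_mark_column,
        }
    return definition


# Placeholder values, assumed for illustration only.
hotels_ds = build_data_source_definition(
    "hotels-ds", "azuresql", "<connection string>", "Hotels", "RowVersion")
print(json.dumps(hotels_ds, indent=2))
```

Because the definition names no indexer, the same data source can be referenced from several indexer definitions.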

The following data sources are currently supported:

  • Azure SQL Database and SQL Server on Azure VMs. For a targeted walk-through, see this article.
  • Azure DocumentDB. For a targeted walk-through, see this article.
  • Azure Blob Storage, including the following document formats: PDF, Microsoft Office (DOCX/DOC, XLSX/XLS, PPTX/PPT, MSG), HTML, XML, ZIP, and plain text files (including JSON). For a targeted walk-through, see this article.
  • Azure Table Storage. For a targeted walk-through, see this article.

    We're considering adding support for additional data sources in the future. To help us prioritize these decisions, please provide your feedback on the Azure Search feedback forum.

    See Service Limits for maximum limits related to indexer and data source resources.

Typical workflow

Using an indexer is efficient because it removes the need to write code to index your data. To set this up, call the search service REST API: you create and manage indexers and data sources through simple HTTP requests (POST, GET, PUT, DELETE) against a given data source or indexer resource.

Setting up automatic indexing is typically a four-step process:

  1. Identify the data source that contains the data that needs to be indexed. Keep in mind that Azure Search may not support all of the data types present in your data source. See Supported data types (Azure Search) for the list.

  2. Create an Azure Search index whose schema is compatible with your data source.

  3. Create an Azure Search data source as described in Create Data Source (Azure Search Service REST API).

  4. Create an Azure Search indexer as described in Create Indexer (Azure Search Service REST API).

    You should plan on creating one indexer for every target index and data source combination. You can have multiple indexers writing into the same index, and you can reuse the same data source for multiple indexers. However, an indexer can only consume one data source at a time, and can only write to a single index. As the following graphic illustrates, one data source provides input to one indexer, which then populates a single index:

    Data Source, Indexer, Index chain in Azure Search

    Although an indexer can only use one data source and one index at a time, resources can be used in different combinations. The main takeaway of the next illustration is that a data source can be paired with more than one indexer, and multiple indexers can write to the same index.

    Resource combinations used in indexers

    After creating an indexer, you can retrieve its execution status using the Get Indexer Status (Azure Search Service REST API) operation. You can also run an indexer at any time (instead of or in addition to running it periodically on a schedule) using the Run Indexer (Azure Search Service REST API) operation.
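    The last two workflow steps, plus the run and status calls, can be sketched as a sequence of HTTP requests. This is only an outline under stated assumptions: it builds `urllib.request.Request` objects without sending them, it skips index creation (step 2), and the service name, key, and definition bodies are hypothetical placeholders. `build_request` and `workflow_requests` are illustrative helpers, not part of any SDK.

```python
import json
import urllib.request


def build_request(method, service, path, api_version, admin_key, body=None):
    """Build (but do not send) a request against a search service."""
    url = "https://{}.search.windows.net/{}?api-version={}".format(
        service, path, api_version)
    headers = {"api-key": admin_key}
    data = None
    if body is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def workflow_requests(service, api_version, admin_key, data_source, indexer):
    """Return the requests for: create data source, create indexer, run, status."""
    name = indexer["name"]
    return [
        build_request("POST", service, "datasources", api_version, admin_key, data_source),
        build_request("POST", service, "indexers", api_version, admin_key, indexer),
        build_request("POST", service, "indexers/{}/run".format(name), api_version, admin_key),
        build_request("GET", service, "indexers/{}/status".format(name), api_version, admin_key),
    ]


# Placeholder values, assumed for illustration only.
steps = workflow_requests(
    "my-service", "2016-09-01", "<admin key>",
    {"name": "hotels-ds"},
    {"name": "hotels-indexer", "dataSourceName": "hotels-ds", "targetIndexName": "hotels"},
)
for req in steps:
    print(req.get_method(), req.full_url)
```

    To actually execute a step, each request could be passed to `urllib.request.urlopen` against a real service with a valid admin key; the sketch stops short of that so it can be read and run offline.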

Operations on indexers

The REST API for indexers and data sources includes the operations listed below.

Create Data Source

POST https://[service name].search.windows.net/datasources?api-version=[api-version]  
PUT https://[service name].search.windows.net/datasources/[datasource name]?api-version=[api-version]  
    Content-Type: application/json  
    api-key: [admin key]  

Update Data Source

PUT https://[service name].search.windows.net/datasources/[datasource name]?api-version=[api-version]  
    Content-Type: application/json  
    api-key: [admin key]  

List Data Sources

GET https://[service name].search.windows.net/datasources?api-version=[api-version]  
    api-key: [admin key]  

Get Data Source

GET https://[service name].search.windows.net/datasources/[datasource name]?api-version=[api-version]  
    api-key: [admin key]  

Delete Data Source

DELETE https://[service name].search.windows.net/datasources/[datasource name]?api-version=[api-version]  
    api-key: [admin key]  

Create Indexer

POST https://[service name].search.windows.net/indexers?api-version=[api-version]  
PUT https://[service name].search.windows.net/indexers/[indexer name]?api-version=[api-version]  
    Content-Type: application/json  
    api-key: [admin key]  

Update Indexer

PUT https://[service name].search.windows.net/indexers/[indexer name]?api-version=[api-version]  
    Content-Type: application/json  
    api-key: [admin key]  

List Indexers

GET https://[service name].search.windows.net/indexers?api-version=[api-version]  
    api-key: [admin key]  

Get Indexer

GET https://[service name].search.windows.net/indexers/[indexer name]?api-version=[api-version]  
    api-key: [admin key]  

Delete Indexer

DELETE https://[service name].search.windows.net/indexers/[indexer name]?api-version=[api-version]  
    api-key: [admin key]  

Run Indexer

POST https://[service name].search.windows.net/indexers/[indexer name]/run?api-version=[api-version]  
    api-key: [admin key]  

Get Indexer Status

GET https://[service name].search.windows.net/indexers/[indexer name]/status?api-version=[api-version]  
    api-key: [admin key]  

Reset Indexer

POST https://[service name].search.windows.net/indexers/[indexer name]/reset?api-version=[api-version]  
    api-key: [admin key]  
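
The request templates above share one URL shape, so they can be captured in a small lookup table. The following sketch is one possible way to do that; `OPERATIONS` and `request_line` are hypothetical names, the table lists only the primary verb for each operation, and the service and resource names in the usage lines are placeholders.

```python
# Operation name -> (HTTP method, path template), taken from the list above.
OPERATIONS = {
    "Create Data Source": ("POST", "datasources"),
    "Update Data Source": ("PUT", "datasources/{name}"),
    "List Data Sources": ("GET", "datasources"),
    "Get Data Source": ("GET", "datasources/{name}"),
    "Delete Data Source": ("DELETE", "datasources/{name}"),
    "Create Indexer": ("POST", "indexers"),
    "Update Indexer": ("PUT", "indexers/{name}"),
    "List Indexers": ("GET", "indexers"),
    "Get Indexer": ("GET", "indexers/{name}"),
    "Delete Indexer": ("DELETE", "indexers/{name}"),
    "Run Indexer": ("POST", "indexers/{name}/run"),
    "Get Indexer Status": ("GET", "indexers/{name}/status"),
    "Reset Indexer": ("POST", "indexers/{name}/reset"),
}


def request_line(operation, service, api_version, name=None):
    """Return the 'METHOD url' line for an operation."""
    method, path = OPERATIONS[operation]
    if "{name}" in path:
        if name is None:
            raise ValueError("{} requires a resource name".format(operation))
        path = path.format(name=name)
    return "{} https://{}.search.windows.net/{}?api-version={}".format(
        method, service, path, api_version)


# Placeholder service and resource names, assumed for illustration only.
print(request_line("Run Indexer", "my-service", "2016-09-01", "hotels-indexer"))
print(request_line("List Data Sources", "my-service", "2016-09-01"))
```

Every request still needs the `api-key` header, and operations that send a body also need `Content-Type: application/json`, as shown in the templates.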

See also

Azure Search Service REST
Service Limits