Make indexer connections through a private endpoint

Note

You can use the trusted Microsoft service approach to bypass virtual network or IP restrictions on a storage account. You can also enable the search service to access data in the storage account. To do so, see Indexer access to Azure Storage with the trusted service exception.

However, when you use this approach, communication between Azure Cognitive Search and your storage account happens via the public IP address of the storage account, over the secure Microsoft backbone network.

Many Azure resources, such as Azure storage accounts, can be configured to accept connections from a list of virtual networks and refuse outside connections that originate from a public network. If you're using an indexer to index data in Azure Cognitive Search, and your data source is on a private network, you can create an outbound private endpoint connection to reach the data.

This indexer connection method is subject to the following two requirements:

  • The Azure resource that provides content or code must be previously registered with the Azure Private Link service.

  • The Azure Cognitive Search service must be on the Basic tier or higher. The feature isn't available on the Free tier. Additionally, if your indexer has a skillset, the tier must be Standard 2 (S2) or higher. For more information, see Service limits.

Private endpoints of secured resources that are created through Azure Cognitive Search APIs are referred to as shared private link resources. This is because you're "sharing" access to a resource, such as a storage account, that has been integrated with the Azure Private Link service.

Through its Management REST API, Azure Cognitive Search provides a CreateOrUpdate operation that you can use to configure access from an Azure Cognitive Search indexer.

You can create private endpoint connections to some resources only by using the preview version of the Search Management API (version 2020-08-01-preview or later). These resources are designated "(preview)" in the following table. Resources without a preview designation can be created with either the preview or the generally available API version (2020-08-01 or later).

The following table lists Azure resources for which you can create outbound private endpoints from Azure Cognitive Search. To create a shared private link resource, enter the Group ID values exactly as they're written in the API. The values are case-sensitive.

Azure resource                            Group ID
Azure Storage - Blob                      blob
Azure Storage - Data Lake Storage Gen2    dfs and blob
Azure Storage - Tables                    table
Azure Cosmos DB - SQL API                 Sql
Azure SQL Database                        sqlServer
Azure Database for MySQL (preview)        mysqlServer
Azure Key Vault                           vault
Azure Functions (preview)                 sites

You can also query the Azure resources for which outbound private endpoint connections are supported by using the list of supported APIs.
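Because the group ID values are case-sensitive and some of them require the preview API, it can help to validate a value before sending a request. The following is a minimal sketch, not an official client: the dictionary is transcribed from the table above, and the function name `api_version_for` is hypothetical.

```python
# Group IDs transcribed from the table above.
# True marks the rows that the table designates "(preview)".
SUPPORTED_GROUP_IDS = {
    "blob": False,
    "dfs": False,
    "table": False,
    "Sql": False,
    "sqlServer": False,
    "mysqlServer": True,
    "vault": False,
    "sites": True,
}

GA_API_VERSION = "2020-08-01"
PREVIEW_API_VERSION = "2020-08-01-preview"


def api_version_for(group_id: str) -> str:
    """Return the minimum Management API version for a group ID.

    The lookup is deliberately case-sensitive, matching the API.
    """
    if group_id not in SUPPORTED_GROUP_IDS:
        raise ValueError(
            f"Unsupported group ID {group_id!r}; values are case-sensitive"
        )
    if SUPPORTED_GROUP_IDS[group_id]:
        return PREVIEW_API_VERSION
    return GA_API_VERSION
```

For example, `api_version_for("sites")` selects the preview version, while `api_version_for("Blob")` raises an error because the expected value is `blob`.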

The remainder of this article uses a mix of the Azure portal (or the Azure CLI) and Postman (or any other HTTP client, such as curl) to demonstrate the REST API calls.

Note

Some Azure Cognitive Search data sources and other configurations require more than one shared private link to work correctly. The following list shows these configurations and the group IDs that each one needs:

  • Azure Data Lake Storage Gen2 data source - Create two shared private links: One shared private link with the groupID 'dfs' and another shared private link with the groupID 'blob'.
  • Skillset with Knowledge store configured - One or two shared private links are necessary, depending on the projections set for Knowledge store:
    • If using blob and/or file projections, create one shared private link with the groupID 'blob'.
    • If using table projections, create one shared private link with the groupID 'table'.
    • In case blob/file and also table projections are used, create two shared private links: one with groupID 'blob' and one with groupID 'table'.
  • Indexer with cache enabled - Create two shared private links: One shared private link with the groupID 'table' and another shared private link with the groupID 'blob'.
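The rules in this list can be expressed as a small helper that returns the group IDs to create. This is only an illustrative sketch: the function and parameter names are hypothetical, and it assumes that all features point at the same storage account, so a group ID that several features need is counted once.

```python
def required_group_ids(adls_gen2_source=False,
                       blob_or_file_projections=False,
                       table_projections=False,
                       cache_enabled=False):
    """Return the sorted group IDs that need a shared private link."""
    needed = set()
    if adls_gen2_source:
        # Data Lake Storage Gen2 data source: 'dfs' and 'blob'.
        needed.update({"dfs", "blob"})
    if blob_or_file_projections:
        # Knowledge store with blob and/or file projections: 'blob'.
        needed.add("blob")
    if table_projections:
        # Knowledge store with table projections: 'table'.
        needed.add("table")
    if cache_enabled:
        # Indexer with cache enabled: 'table' and 'blob'.
        needed.update({"table", "blob"})
    return sorted(needed)
```

If the features use different storage accounts, each account needs its own shared private link per group ID.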

Set up indexer connection through private endpoint

Use the following instructions to set up an indexer connection through a private endpoint to a secure Azure resource.

The examples in this article are based on the following assumptions:

  • The name of the search service is contoso-search, which exists in the contoso resource group of a subscription with subscription ID 00000000-0000-0000-0000-000000000000.
  • The resource ID of this search service is /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/contoso/providers/Microsoft.Search/searchServices/contoso-search.
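The resource ID above follows the pattern that the later REST calls build on. As a small sketch (the function name is hypothetical), it can be assembled from its three parts:

```python
def search_service_resource_id(subscription_id, resource_group, service_name):
    """Build the resource ID of a search service from its parts."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Search/searchServices/{service_name}"
    )
```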

Step 1: Secure your Azure resource

The steps for restricting access vary by resource. The following scenarios show three of the more common types of resources.

  • Scenario 1: Azure Storage

    The following is an example of how to configure an Azure storage account firewall. If you select this option and leave the page empty, it means that no traffic from virtual networks is allowed.

    Screenshot of the "Firewalls and virtual networks" pane for Azure storage, showing the option to allow access to selected networks.

  • Scenario 2: Azure Key Vault

    The following is an example of how to configure Azure Key Vault firewall.

    Screenshot of the "Firewalls and virtual networks" pane for Azure Key Vault, showing the option to allow access to selected networks.

  • Scenario 3: Azure Functions

    No network setting changes are needed for Azure Functions firewalls. When you create the shared private endpoint in a later step, the function automatically allows access only through the private link.

Step 2: Create a shared private link resource

The following section describes how to create a shared private link resource by using either the Azure portal or the Azure CLI.

Option 1: Portal

Note

The portal only supports creating a shared private endpoint using group ID values that are GA. For MySQL and Azure Functions, use the Azure CLI steps described in option 2, which follows.

To request that Azure Cognitive Search create an outbound private endpoint connection, open the Shared Private Access blade and select "Add Shared Private Access". On the blade that opens on the right, you can choose "Connect to an Azure resource in my directory" or "Connect to an Azure resource by resource ID or alias".

When you use the first option (recommended), the blade guides you to pick the appropriate Azure resource and autofills other properties, such as the group ID of the resource and the resource type.

Screenshot of the "Add Shared Private Access" pane, showing a guided experience for creating a shared private link resource.

When using the second option, you can enter the Azure resource ID manually and choose the appropriate group ID. The group IDs are listed at the beginning of this article.

Screenshot of the "Add Shared Private Access" pane, showing the manual experience for creating a shared private link resource.

Option 2: Azure CLI

Alternatively, you can make the following API call with the Azure CLI. Use the 2020-08-01-preview API version if you're using a group ID that is in preview. For example, the group IDs sites and mysqlServer are in preview and require you to use the preview API. Quote the URI so that the shell doesn't interpret the "?" character or the angle brackets in the placeholders.

az rest --method put --uri "https://management.azure.com/subscriptions/<search service subscription ID>/resourceGroups/<search service resource group name>/providers/Microsoft.Search/searchServices/<search service name>/sharedPrivateLinkResources/<shared private endpoint name>?api-version=2020-08-01" --body @create-pe.json

The following is an example of the contents of the create-pe.json file:

{
      "name": "blob-pe",
      "properties": {
        "privateLinkResourceId": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/contoso/providers/Microsoft.Storage/storageAccounts/contoso-storage",
        "groupId": "blob",
        "requestMessage": "please approve"
      }
}
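If you generate the request programmatically, the URI and the body can be assembled from the same values. The following is a hedged sketch rather than an official client: the function name `build_create_request` is hypothetical, and the code only builds the request and doesn't send it.

```python
import json


def build_create_request(search_service_id, link_name, target_resource_id,
                         group_id, request_message="please approve",
                         api_version="2020-08-01"):
    """Return (uri, body_json) for the CreateOrUpdate PUT call."""
    uri = (
        "https://management.azure.com"
        f"{search_service_id}/sharedPrivateLinkResources/{link_name}"
        f"?api-version={api_version}"
    )
    body = {
        "name": link_name,
        "properties": {
            "privateLinkResourceId": target_resource_id,
            "groupId": group_id,
            "requestMessage": request_message,
        },
    }
    return uri, json.dumps(body, indent=2)
```

You could write the returned body to create-pe.json and pass the returned URI to the az rest command shown earlier.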

A 202 Accepted response is returned on success. The process of creating an outbound private endpoint is a long-running (asynchronous) operation. It involves deploying the following resources:

  • A private endpoint, allocated with a private IP address in a "Pending" state. The private IP address is obtained from the address space that's allocated to the virtual network of the execution environment for the search service-specific private indexer. Upon approval of the private endpoint, any communication from Azure Cognitive Search to the Azure resource originates from the private IP address and a secure private link channel.

  • A private DNS zone for the type of resource, based on the groupId. By deploying this resource, you ensure that any DNS lookup to the private resource utilizes the IP address that's associated with the private endpoint.

Be sure to specify the correct groupId for the type of resource for which you're creating the private endpoint. Any mismatch results in an error response.

Step 3: Check the status of the private endpoint creation

In this step you'll confirm that the provisioning state of the resource changes to "Succeeded".

Option 1: Portal

Note

The portal shows the provisioning state for both GA group IDs and group IDs that are in preview.

The portal will show you the state of the shared private endpoint. In the following example the status is "Updating".

Screenshot of the "Add Shared Private Access" pane, showing the resource creation in progress.

Once the resource is successfully created, you will receive a portal notification and the provisioning state of the resource will change to "Succeeded".

Screenshot of the "Add Shared Private Access" pane, showing the resource creation completed.

Option 2: Azure CLI

The PUT call to create the shared private endpoint returns an Azure-AsyncOperation header value that looks like the following:

"Azure-AsyncOperation": "https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/contoso/providers/Microsoft.Search/searchServices/contoso-search/sharedPrivateLinkResources/blob-pe/operationStatuses/08586060559526078782?api-version=2020-08-01"

You can poll for the status by manually querying the Azure-AsyncOperation header value.

az rest --method get --uri "https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/contoso/providers/Microsoft.Search/searchServices/contoso-search/sharedPrivateLinkResources/blob-pe/operationStatuses/08586060559526078782?api-version=2020-08-01"
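Manual polling can be wrapped in a small loop. The sketch below rests on assumptions that the article doesn't confirm: the operation status response is taken to be JSON with a top-level "status" field whose terminal values are succeeded, failed, or canceled (compared without regard to case). The `fetch_status` callable is hypothetical and stands for whatever performs the GET request, for example a wrapper around the az rest command.

```python
import time

# Assumed terminal states; compared case-insensitively.
TERMINAL_STATES = {"succeeded", "failed", "canceled"}


def wait_for_operation(fetch_status, interval_seconds=10.0, max_attempts=30):
    """Poll until the operation reaches a terminal state.

    fetch_status: callable that returns the parsed JSON response (a dict)
    of a GET on the Azure-AsyncOperation URL.
    Returns the terminal status in lowercase.
    """
    for attempt in range(max_attempts):
        status = str(fetch_status().get("status", "")).lower()
        if status in TERMINAL_STATES:
            return status
        # Wait before the next attempt, but not after the last one.
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError("operation did not finish within the polling budget")
```

Passing the fetcher in as a parameter keeps the loop independent of the HTTP client and makes it easy to exercise with canned responses.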

Step 4: Approve the private endpoint connection

Note

In this section, you use the Azure portal to walk through the approval flow for a private endpoint to the Azure resource you're connecting to. Alternatively, you could use the REST API that's available via the storage resource provider.

Other providers, such as Azure Cosmos DB or Azure SQL Server, offer similar resource provider APIs for managing private endpoint connections.

  1. In the Azure portal, navigate to the Azure resource that you're connecting to and select the Networking tab. Then navigate to the section that lists the private endpoint connections. Following is an example for a storage account. After the asynchronous operation has succeeded, there should be a request for a private endpoint connection with the request message from the previous API call.

    Screenshot of the Azure portal, showing the "Private endpoint connections" pane.

  2. Select the private endpoint that Azure Cognitive Search created. In the Private endpoint column, identify the private endpoint connection by the name that you specified in the previous API call, select Approve, and then enter an appropriate message. The message content isn't significant.

    Make sure that the private endpoint connection appears as shown in the following screenshot. It could take one to two minutes for the status to be updated in the portal.

    Screenshot of the Azure portal, showing an "Approved" status on the "Private endpoint connections" pane.

After the private endpoint connection request is approved, traffic can flow through the private endpoint. Azure Cognitive Search then creates the necessary DNS zone mappings in the DNS zone that's created for it.

Step 5: Query the status of the shared private link resource

To confirm that the shared private link resource has been updated after approval, revisit the "Shared Private Access" blade of the search service in the Azure portal and check the "Connection State".

Screenshot of the Azure portal, showing an "Approved" shared private link resource.

Alternatively, you can obtain the "Connection state" by using the GET API.

az rest --method get --uri "https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/contoso/providers/Microsoft.Search/searchServices/contoso-search/sharedPrivateLinkResources/blob-pe?api-version=2020-08-01"

This call returns JSON in which the connection state appears as "status" under the "properties" section. Following is an example for a storage account.

{
      "name": "blob-pe",
      "properties": {
        "privateLinkResourceId": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/contoso/providers/Microsoft.Storage/storageAccounts/contoso-storage",
        "groupId": "blob",
        "requestMessage": "please approve",
        "status": "Approved",
        "resourceRegion": null,
        "provisioningState": "Succeeded"
      }
}

If the "Provisioning State" (properties.provisioningState) of the resource is Succeeded and "Connection State" (properties.status) is Approved, it means that the shared private link resource is functional and the indexer can be configured to communicate over the private endpoint.
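The two conditions can be checked in code once the GET response has been parsed. This is a minimal sketch that assumes the response has the shape shown in the example above; the function name `is_functional` is hypothetical.

```python
def is_functional(resource):
    """Return True when the shared private link resource is ready to use.

    resource: parsed JSON (a dict) returned by the GET call.
    """
    properties = resource.get("properties", {})
    return (
        properties.get("provisioningState") == "Succeeded"
        and properties.get("status") == "Approved"
    )
```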

Step 6: Configure the indexer to run in the private environment

Note

You can perform this step before the private endpoint connection is approved. Until the private endpoint connection is approved, any indexer that tries to communicate with a secure resource (such as the storage account) will end up in a transient failure state. New indexers will fail to be created. As soon as the private endpoint connection is approved, indexers can access the private storage account.

The following steps show how to configure the indexer to run in the private environment using the REST API. You can also set the execution environment using the JSON editor in the portal.

  1. Create the data source definition, index, and skillset (if you're using one) as you would normally. There are no properties in any of these definitions that vary when using a shared private endpoint.

  2. Create an indexer that points to the data source, index, and skillset that you created in the preceding step. In addition, force the indexer to run in the private execution environment by setting the indexer executionEnvironment configuration property to private.

    {
        "name": "indexer",
        "dataSourceName": "blob-datasource",
        "targetIndexName": "index",
        "parameters": {
            "configuration": {
                "executionEnvironment": "private"
            }
        },
        "fieldMappings": []
    }
    

    Following is an example of the request in Postman.

    Screenshot showing the creation of an indexer on the Postman user interface.

After the indexer is created successfully, it should connect to the Azure resource over the private endpoint connection. You can monitor the status of the indexer by using the Indexer Status API.

Note

If you already have existing indexers, you can update them via the PUT API by setting the executionEnvironment to private or using the JSON editor in the portal.
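For an existing indexer, the change amounts to setting one nested property before sending the definition back with PUT. The following sketch assumes that you have already retrieved the indexer definition as a dictionary; the function name `with_private_execution` is hypothetical, and the code doesn't call the service.

```python
import copy


def with_private_execution(indexer):
    """Return a copy of an indexer definition that runs in the private
    execution environment. The input dictionary is left unchanged."""
    updated = copy.deepcopy(indexer)
    # "parameters" or "configuration" may be missing or null.
    parameters = updated.get("parameters") or {}
    configuration = parameters.get("configuration") or {}
    configuration["executionEnvironment"] = "private"
    parameters["configuration"] = configuration
    updated["parameters"] = parameters
    return updated
```

Other settings under parameters and configuration are preserved, so the result can be sent as the body of the update request.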

Troubleshooting

  • If your indexer creation fails with an error message such as "Data source credentials are invalid," it means that either the status of the private endpoint connection is not yet Approved or the connection is not functional. To remedy the issue:

    • Obtain the status of the shared private link resource by using the GET API. If the status is Approved, check the properties.provisioningState of the resource. If the provisioning state is Incomplete, some of the underlying dependencies for the resource failed to be set up. Reissuing the PUT request to re-create the shared private link resource should fix the issue. A reapproval might be necessary. Recheck the status of the resource to verify that the issue is fixed.
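  • If you create the indexer without setting its executionEnvironment property, the creation might succeed, but its execution history will show that the indexer runs are unsuccessful. To remedy the issue:

    • Update the indexer to specify the execution environment by setting its executionEnvironment property to private.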
  • If you've created the indexer without setting the executionEnvironment property and it runs successfully, it means that Azure Cognitive Search has decided that its execution environment is the search service-specific private environment. This can change, depending on resources consumed by the indexer, the load on the search service, and other factors, and it can fail later. To remedy the issue:

    • We highly recommend that you set the executionEnvironment property to private to ensure that it won't fail in the future.
  • If you're viewing your data source's networking page in the Azure portal and you select a private endpoint that you created for your Azure Cognitive Search service to access this data source, you might receive a No Access error. This is expected. You can change the status of the connection request via the target service's portal page, but to further manage the shared private link resource, you need to view it on your search service's network page in the Azure portal.
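The first three checks above can be combined into a quick diagnostic. This is an illustrative sketch only: the function name `troubleshoot` is hypothetical, the inputs are assumed to be the parsed shared private link resource and indexer definition, and the returned messages paraphrase the remedies in this section.

```python
def troubleshoot(resource, indexer):
    """Return a list of suggested remedies; an empty list means no findings."""
    findings = []
    properties = resource.get("properties", {})
    if properties.get("status") != "Approved":
        findings.append(
            "Approve the private endpoint connection on the target resource."
        )
    elif properties.get("provisioningState") == "Incomplete":
        findings.append(
            "Reissue the PUT request to re-create the shared private link "
            "resource; a reapproval might be necessary."
        )
    configuration = (indexer.get("parameters") or {}).get("configuration") or {}
    if configuration.get("executionEnvironment") != "private":
        findings.append("Set executionEnvironment to private on the indexer.")
    return findings
```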

Quotas and limits determine how many shared private link resources can be created and depend on the SKU of the search service.

Next steps

Learn more about private endpoints: