Configure and use Azure Synapse Link for Azure Cosmos DB (preview)

Azure Synapse Link for Azure Cosmos DB is a cloud-native hybrid transactional and analytical processing (HTAP) capability that enables you to run near real-time analytics over operational data in Azure Cosmos DB. Synapse Link creates a tight, seamless integration between Azure Cosmos DB and Azure Synapse Analytics.

Important

To use Azure Synapse Link, ensure you provision your Azure Cosmos DB account and Azure Synapse Analytics workspace in one of the supported regions. Azure Synapse Link is currently available in the following Azure regions: West Central US, East US, West US 2, North Europe, West Europe, South Central US, Southeast Asia, Australia East, East US 2, UK South.

Azure Synapse Link is available for Azure Cosmos DB SQL API containers and for Azure Cosmos DB API for MongoDB collections. Use the following steps to run analytical queries with Azure Synapse Link for Azure Cosmos DB:

Azure portal

  1. Sign in to the Azure portal.

  2. Create a new Azure Cosmos DB account, or select an existing one.

  3. Navigate to your Azure Cosmos DB account and open the Features pane.

  4. Select Synapse Link from the features list.

    Find Synapse Link preview feature

  5. Next, you're prompted to enable Synapse Link on your account. Select Enable. This process can take 1 to 5 minutes to complete.

    Enable Synapse Link feature

  6. Your account is now enabled to use Synapse Link. Next, see how to create analytical store enabled containers to automatically start replicating your operational data from the transactional store to the analytical store.

Note

Turning on Synapse Link does not turn on the analytical store automatically. After you enable Synapse Link on the Cosmos DB account, enable analytical store on containers when you create them to start replicating your operational data to the analytical store.

Create an Azure Cosmos container with analytical store

You can turn on analytical store on an Azure Cosmos container while creating the container. You can use the Azure portal or configure the analyticalTTL property during container creation by using the Azure Cosmos DB SDKs.

Note

Currently, you can enable analytical store only for new containers (in both new and existing accounts). You can migrate data from your existing containers to new containers by using Azure Cosmos DB migration tools.

Azure portal

  1. Sign in to the Azure portal or the Azure Cosmos explorer.

  2. Navigate to your Azure Cosmos DB account and open the Data Explorer tab.

  3. Select New Container and enter a name for your database, container, partition key and throughput details. Turn on the Analytical store option. After you enable the analytical store, it creates a container with the AnalyticalTTL property set to the default value of -1 (infinite retention). This analytical store retains all the historical versions of records.

    Turn on analytical store for Azure Cosmos container

  4. If you have not previously enabled Synapse Link on this account, you're prompted to do so because it's a prerequisite for creating an analytical store enabled container. If prompted, select Enable Synapse Link. This process can take 1 to 5 minutes to complete.

  5. Select OK to create an analytical store enabled Azure Cosmos container.

  6. After the container is created, verify that analytical store has been enabled by selecting Settings, right below Documents in Data Explorer, and checking that the Analytical Store Time to Live option is turned on.

.NET SDK

The following code creates a container with analytical store by using the .NET SDK. Set the analytical TTL property to the required value. For the list of allowed values, see the analytical TTL supported values article:

// Create a container with a partition key and analytical TTL set to -1 (infinite retention)
string containerId = "myContainerName";
int analyticalTtlInSec = -1;
ContainerProperties cpInput = new ContainerProperties()
{
    Id = containerId,
    PartitionKeyPath = "/id",
    AnalyticalStorageTimeToLiveInSeconds = analyticalTtlInSec,
};
await this.cosmosClient.GetDatabase("myDatabase").CreateContainerAsync(cpInput);

Java V4 SDK

The following code creates a container with analytical store by using the Java V4 SDK. Set the AnalyticalStoreTimeToLiveInSeconds property to the required value:

// Create a container with a partition key and analytical TTL set to -1 (infinite retention)
CosmosContainerProperties containerProperties = new CosmosContainerProperties("myContainer", "/myPartitionKey");

containerProperties.setAnalyticalStoreTimeToLiveInSeconds(-1);

container = database.createContainerIfNotExists(containerProperties, 400).block().getContainer();

Python V4 SDK

Python 2.7 and Azure Cosmos DB SDK 4.1.0 are the minimum versions required, and the SDK is only compatible with the SQL API.

The first step is to make sure that you are using at least version 4.1.0 of the Azure Cosmos DB Python SDK:

import azure.cosmos as cosmos

print(cosmos.__version__)
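
If you would rather check the version in code than read it by eye, the comparison could be sketched as follows. The helper name `meets_minimum_version` is hypothetical and is not part of the SDK; the sketch assumes a dotted version string such as `4.1.0` and ignores any non-numeric suffix on a component.

```python
import re

MINIMUM_SDK_VERSION = (4, 1, 0)  # minimum SDK version stated in this article


def meets_minimum_version(version, minimum=MINIMUM_SDK_VERSION):
    """Return True if a dotted version string is at least `minimum`.

    Hypothetical helper, not part of azure-cosmos. Only the leading digits
    of each component are compared, so a suffix such as "b1" is ignored.
    """
    parts = []
    for piece in version.split(".")[:len(minimum)]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as "4.1" so the tuples compare component-wise.
    parts += [0] * (len(minimum) - len(parts))
    return tuple(parts) >= tuple(minimum)
```

For example, `meets_minimum_version(cosmos.__version__)` could guard the container creation code that follows.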

The next step creates a container with analytical store by using the Azure Cosmos DB Python SDK:

# Azure Cosmos DB Python SDK, for SQL API only.
# Creating an analytical store enabled container.

import azure.cosmos.cosmos_client as cosmos_client
import azure.cosmos.exceptions as exceptions
from azure.cosmos.partition_key import PartitionKey

HOST = 'your-cosmos-db-account-URI'
KEY = 'your-cosmos-db-account-key'
DATABASE = 'your-cosmos-db-database-name'
CONTAINER = 'your-cosmos-db-container-name'

client = cosmos_client.CosmosClient(HOST, KEY)
# Set up the database for this sample.
# Create it with the name specified above; if it already exists, reuse it.
try:
    db = client.create_database(DATABASE)

except exceptions.CosmosResourceExistsError:
    db = client.get_database_client(DATABASE)

# Create the container with analytical store enabled, using the name specified above.
# If a container with the same name exists, an error is returned.
#
# The 3 options for the analytical_storage_ttl parameter are:
# 1) 0, None, or not specified (analytical store is not enabled).
# 2) -1 (the data is retained in the analytical store indefinitely).
# 3) Any other number is the actual TTL, in seconds.

try:
    container = db.create_container(
        id=CONTAINER,
        partition_key=PartitionKey(path='/id', kind='Hash'),
        analytical_storage_ttl=-1
    )
    properties = container.read()
    print('Container with id \'{0}\' created'.format(container.id))
    print('Partition Key - \'{0}\''.format(properties['partitionKey']))

except exceptions.CosmosResourceExistsError:
    print('A container with id \'{0}\' already exists'.format(CONTAINER))
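
The three `analytical_storage_ttl` options listed in the comments of the sample above can be summarized in a small function. This is a minimal sketch for illustration only: `describe_analytical_ttl` is a hypothetical helper, not part of the SDK, and rejecting negative values other than -1 is an assumption that the sample's comments do not spell out.

```python
def describe_analytical_ttl(ttl):
    """Describe what an analytical TTL value means, per the options above.

    Hypothetical helper for illustration; not part of the azure-cosmos SDK.
    """
    if ttl is None or ttl == 0:
        return "analytical store not enabled"
    if ttl == -1:
        return "analytical store enabled, infinite retention"
    if ttl > 0:
        return "analytical store enabled, data expires after {0} seconds".format(ttl)
    # Assumption: negative values other than -1 are treated as invalid.
    raise ValueError("unsupported analytical TTL value: {0}".format(ttl))
```

If the `properties` dictionary returned by `container.read()` exposes the configured value (for example under a key such as `analyticalStorageTtl`, which is an assumption about the response shape), you could pass that value to this function to confirm the setting after creation.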

Update the analytical store time to live

After the analytical store is enabled with a particular TTL value, you can update it to a different valid value later. You can update the value by using the Azure portal or SDKs. For information on the various analytical TTL configuration options, see the analytical TTL supported values article.

Azure portal

If you created an analytical store enabled container through the Azure portal, it contains a default analytical TTL of -1. Use the following steps to update this value:

  1. Sign in to the Azure portal or the Azure Cosmos explorer.

  2. Navigate to your Azure Cosmos DB account and open the Data Explorer tab.

  3. Select an existing container that has analytical store enabled. Expand it and modify the following values:

  • Open the Scale & Settings window.
  • Under Setting, find Analytical Storage Time to Live.
  • Select On (no default), or select On and set a TTL value.
  • Click Save to save the changes.

.NET SDK

The following code shows how to update the TTL for analytical store by using the .NET SDK:

// Get the container and read its current properties
ContainerResponse containerResponse = await client.GetContainer("database", "container").ReadContainerAsync();
// Update the analytical store TTL to expire analytical store data in 6 months
containerResponse.Resource.AnalyticalStorageTimeToLiveInSeconds = 60 * 60 * 24 * 180;
await client.GetContainer("database", "container").ReplaceContainerAsync(containerResponse.Resource);

Java V4 SDK

The following code shows how to update the TTL for analytical store by using the Java V4 SDK:

CosmosContainerProperties containerProperties = new CosmosContainerProperties("myContainer", "/myPartitionKey");

// Update analytical store TTL to expire analytical store data in 6 months
containerProperties.setAnalyticalStoreTimeToLiveInSeconds(60 * 60 * 24 * 180);

// Update container settings
container.replace(containerProperties).block();
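
The TTL in the samples above is expressed in seconds, with six months approximated as 180 days. The arithmetic can be sketched as follows; `analytical_ttl_seconds` is a hypothetical helper name used only for illustration.

```python
SECONDS_PER_DAY = 60 * 60 * 24  # 86,400 seconds in a day


def analytical_ttl_seconds(days):
    """Convert a retention period in whole days to a TTL in seconds.

    Hypothetical helper for illustration; not part of any Azure SDK.
    """
    if days <= 0:
        raise ValueError("retention must be a positive number of days")
    return days * SECONDS_PER_DAY


# 180 days, the value used in the samples above.
six_months = analytical_ttl_seconds(180)
print(six_months)  # 15552000
```

The result is the same value as the `60 * 60 * 24 * 180` expression in the .NET and Java samples.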

Connect to a Synapse workspace

Follow the instructions in the Connect to Azure Synapse Link article to access an Azure Cosmos DB database from Azure Synapse Analytics Studio with Azure Synapse Link.

Query analytical store using Apache Spark for Azure Synapse Analytics

Follow the instructions in the Query Azure Cosmos DB analytical store article to query with Synapse Spark. That article gives examples of how to interact with the analytical store by using Synapse gestures, which are visible when you right-click a container. With gestures, you can quickly generate code and adapt it to your needs. They are also well suited to exploring data with a single click.
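
As a rough idea of what such generated code can look like, the following sketch reads a container's analytical store into a Spark DataFrame. It assumes the code runs in a Synapse notebook, where a `spark` session is available, and that a linked service to the Azure Cosmos DB account already exists. The `cosmos.olap` format name and the two option keys reflect the Synapse Spark connector as commonly documented, so verify them against the linked article. The function name and its arguments are hypothetical.

```python
def load_analytical_store(spark, linked_service, container):
    """Return a DataFrame over a container's analytical store.

    `spark` is the SparkSession that a Synapse notebook provides.
    `linked_service` and `container` are placeholder names that you
    replace with your own linked service and container.
    """
    return (
        spark.read.format("cosmos.olap")
        .option("spark.synapse.linkedService", linked_service)
        .option("spark.cosmos.container", container)
        .load()
    )
```

In a notebook, you might call `load_analytical_store(spark, "<linked-service-name>", "<container-name>")` and then inspect the resulting DataFrame. Because the read targets the analytical store, it is not expected to affect the performance of your transactional workloads.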

Query the analytical store using Synapse SQL serverless

Synapse SQL serverless (a preview feature that was previously referred to as SQL on-demand) allows you to query and analyze data in your Azure Cosmos DB containers that are enabled with Azure Synapse Link. You can analyze data in near real-time without impacting the performance of your transactional workloads. It offers a familiar T-SQL syntax to query data from the analytical store and integrated connectivity to a wide range of BI and ad-hoc querying tools via the T-SQL interface. To learn more, see the Query analytical store using Synapse SQL serverless article.

Note

Using the Azure Cosmos DB analytical store with Synapse SQL serverless is currently under gated preview. To request access, reach out to the Azure Cosmos DB team.

Use Synapse SQL serverless to analyze and visualize data in Power BI

You can build a Synapse SQL serverless database and views over Synapse Link for Azure Cosmos DB. Later, you can query the Azure Cosmos containers and then build a model with Power BI over those views to reflect that query. To learn more, see the article on how to use Synapse SQL serverless to analyze Azure Cosmos DB data with Synapse Link.

Azure Resource Manager template

The Azure Resource Manager template creates a Synapse Link enabled Azure Cosmos DB account for SQL API. This template creates a Core (SQL) API account in one region with a container configured with analytical TTL enabled, and an option to use manual or autoscale throughput. To deploy this template, click on Deploy to Azure on the readme page.

You can find samples to get started with Azure Synapse Link on GitHub. These showcase end-to-end solutions with IoT and retail scenarios. You can also find the samples corresponding to Azure Cosmos DB API for MongoDB in the same repo under the MongoDB folder.

Next steps

To learn more, see the following docs: