Monitor Azure Cosmos DB

APPLIES TO: SQL API, Cassandra API, Gremlin API, Table API, Azure Cosmos DB API for MongoDB

When you have critical applications and business processes relying on Azure resources, you want to monitor those resources for their availability, performance, and operation. This article describes the monitoring data generated by Azure Cosmos databases and how you can use the features of Azure Monitor to analyze and alert on this data.

You can monitor your data with client-side and server-side metrics. When using server-side metrics, you can monitor the data stored in Azure Cosmos DB with the following options:

  • Monitor from Azure Cosmos DB portal: You can monitor with the metrics available within the Metrics tab of the Azure Cosmos account. The metrics on this tab include throughput, storage, availability, latency, consistency, and system level metrics. By default, these metrics have a retention period of 7 days. To learn more, see the Monitoring data collected from Azure Cosmos DB section of this article.

  • Monitor with metrics in Azure Monitor: You can monitor the metrics of your Azure Cosmos account and create dashboards from Azure Monitor. Azure Monitor collects the Azure Cosmos DB metrics by default; you don’t have to configure anything explicitly. These metrics are collected with one-minute granularity, although the granularity may vary based on the metric you choose. By default, these metrics have a retention period of 30 days. Most of the metrics that are available from the previous option are also available in these metrics. The dimension values for the metrics, such as container name, are case-insensitive, so you need to use case-insensitive comparison when doing string comparisons on these dimension values. To learn more, see the Analyze metric data section of this article.

  • Monitor with diagnostic logs in Azure Monitor: You can monitor the logs of your Azure Cosmos account and create dashboards from Azure Monitor. Telemetry such as events and traces that occur at a second granularity are stored as logs. For example, if the throughput of a container is changed or the properties of a Cosmos account are changed, these events are captured within the logs. You can analyze these logs by running queries on the gathered data. To learn more, see the Analyze log data section of this article.

  • Monitor programmatically with SDKs: You can monitor your Azure Cosmos account programmatically by using the .NET, Java, Python, and Node.js SDKs, and the headers in the REST API. To learn more, see the Monitoring Azure Cosmos DB programmatically section of this article.
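
Because dimension values such as container names are case-insensitive, any code that matches them should ignore case. The following is a minimal Python sketch of that comparison. The record layout and the `TotalRequestUnits` values are hypothetical sample data, not the output format of any specific API.

```python
def same_dimension_value(a: str, b: str) -> bool:
    """Compare two dimension values (for example, container names) ignoring case."""
    return a.casefold() == b.casefold()


def filter_by_container(records, container_name):
    """Keep the records whose CollectionName matches container_name, ignoring case."""
    return [
        r for r in records
        if same_dimension_value(r["CollectionName"], container_name)
    ]


# Hypothetical metric records; the casing of CollectionName differs between them.
records = [
    {"CollectionName": "Orders", "TotalRequestUnits": 120.0},
    {"CollectionName": "orders", "TotalRequestUnits": 80.0},
    {"CollectionName": "Customers", "TotalRequestUnits": 40.0},
]
matched = filter_by_container(records, "ORDERS")
```

Here `matched` contains both the `Orders` and `orders` records, which is the behavior a case-insensitive dimension implies.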

The following image shows the different options available to monitor an Azure Cosmos DB account through the Azure portal:

Monitoring options available in Azure portal

When using Azure Cosmos DB, on the client side you can collect the request charge, activity ID, exception/stack trace information, HTTP status/sub-status code, and diagnostic string to debug any issue that might occur. This information is also required if you need to reach out to the Azure Cosmos DB support team.
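
As an illustration of collecting these client-side details, the following Python sketch pulls them out of a response's status code and headers. It assumes you already have the raw HTTP response of a Cosmos DB REST call; the header names `x-ms-request-charge`, `x-ms-activity-id`, and `x-ms-substatus` are the Cosmos DB REST response headers, while the `RequestDiagnostics` type, the `collect_diagnostics` helper, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RequestDiagnostics:
    status_code: int
    sub_status: Optional[str]
    activity_id: Optional[str]
    request_charge: float


def collect_diagnostics(status_code: int, headers: Mapping[str, str]) -> RequestDiagnostics:
    """Extract the values worth logging for every Cosmos DB response."""
    # HTTP header names are case-insensitive, so normalize them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RequestDiagnostics(
        status_code=status_code,
        sub_status=lowered.get("x-ms-substatus"),
        activity_id=lowered.get("x-ms-activity-id"),
        request_charge=float(lowered.get("x-ms-request-charge", 0.0)),
    )


# Made-up sample response for a throttled request.
sample = collect_diagnostics(
    429,
    {
        "x-ms-activity-id": "hypothetical-activity-id",
        "x-ms-request-charge": "5.43",
        "x-ms-substatus": "3200",
    },
)
```

Logging a record like `sample` alongside any exception details gives you most of what a support request typically asks for.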

Monitor overview

The Overview page in the Azure portal for each Azure Cosmos DB account includes a brief view of the resource usage, such as total requests, requests that resulted in a specific HTTP status code, and hourly billing. This information is helpful; however, only a small amount of the monitoring data is available from this pane. Some of this data is collected automatically and is available for analysis as soon as you create the resource. You can enable additional types of data collection with some configuration.

What is Azure Monitor?

Azure Cosmos DB creates monitoring data using Azure Monitor, which is a full stack monitoring service in Azure that provides a complete set of features to monitor your Azure resources in addition to resources in other clouds and on-premises.

If you're not already familiar with monitoring Azure services, start with the article Monitoring Azure resources with Azure Monitor, which describes the following concepts:

  • What is Azure Monitor?
  • Costs associated with monitoring
  • Monitoring data collected in Azure
  • Configuring data collection
  • Standard tools in Azure for analyzing and alerting on monitoring data

The following sections build on this article by describing the specific data gathered from Azure Cosmos DB and providing examples for configuring data collection and analyzing this data with Azure tools.

Cosmos DB insights

Cosmos DB insights is based on the workbooks feature of Azure Monitor and uses the same monitoring data collected for Azure Cosmos DB described in the sections below. Use Azure Monitor for a view of the overall performance, failures, capacity, and operational health of all your Azure Cosmos DB resources in a unified interactive experience, and leverage the other features of Azure Monitor for detailed analysis and alerting. To learn more, see the Explore Cosmos DB insights article.

Note

When creating containers, make sure you don’t create two containers with the same name but different casing. That’s because some parts of the Azure platform are not case-sensitive, and this can result in confusion/collision of telemetry and actions on containers with such names.
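
One way to guard against this is to check a list of container names for case-only collisions before creating a new container. The following Python sketch shows such a check; the `find_case_collisions` helper and the sample names are hypothetical, and how you obtain the list of existing names is left out.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Return groups of names that are identical except for letter casing."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # A group is a collision only if it holds at least two distinct spellings.
    return [group for group in groups.values() if len(set(group)) > 1]


collisions = find_case_collisions(["Orders", "orders", "Customers"])
```

With the sample input, `collisions` holds a single group containing `Orders` and `orders`.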

Monitoring data

Azure Cosmos DB collects the same kinds of monitoring data as other Azure resources, which are described in Monitoring data from Azure resources. See Azure Cosmos DB monitoring data reference for a detailed reference of the logs and metrics created by Azure Cosmos DB.

The Overview page in the Azure portal for each Azure Cosmos database includes a brief view of the database usage, including its request and hourly billing usage. This is useful information, but it is only a small amount of the monitoring data available. Some of this data is collected automatically and is available for analysis as soon as you create the database, while you can enable additional data collection with some configuration.

Overview page

Collection and routing

Platform metrics and the Activity log are collected and stored automatically, but can be routed to other locations by using a diagnostic setting.

Resource Logs are not collected and stored until you create a diagnostic setting and route them to one or more locations.

See Create diagnostic setting to collect platform logs and metrics in Azure for the detailed process for creating a diagnostic setting using the Azure portal and some diagnostic query examples. When you create a diagnostic setting, you specify which categories of logs to collect.

The metrics and logs you can collect are discussed in the following sections.

Analyzing metrics

Azure Cosmos DB provides a custom experience for working with metrics. You can analyze metrics for Azure Cosmos DB with metrics from other Azure services using Metrics explorer by opening Metrics from the Azure Monitor menu. See Getting started with Azure Metrics Explorer for details on using this tool. You can also check out how to monitor server-side latency, request unit usage, and normalized request unit usage for your Azure Cosmos DB resources.

For a list of the platform metrics collected for Azure Cosmos DB, see the Monitoring Azure Cosmos DB data reference metrics article.

All metrics for Azure Cosmos DB are in the namespace Cosmos DB standard metrics. You can use the following dimensions with these metrics when adding a filter to a chart:

  • CollectionName
  • DatabaseName
  • OperationType
  • Region
  • StatusCode

For reference, you can see a list of all resource metrics supported in Azure Monitor.

View operation level metrics for Azure Cosmos DB

  1. Sign in to the Azure portal.

  2. Select Monitor from the left-hand navigation bar, and select Metrics.

    Metrics pane in Azure Monitor

  3. From the Metrics pane > Select a resource > choose the required subscription, and resource group. For the Resource type, select Azure Cosmos DB accounts, choose one of your existing Azure Cosmos accounts, and select Apply.

    Choose a Cosmos DB account to view metrics

  4. Next you can select a metric from the list of available metrics. You can select metrics specific to request units, storage, latency, availability, Cassandra, and others. To learn in detail about all the available metrics in this list, see the Metrics by category article. In this example, let's select Request units and Avg as the aggregation value.

    In addition to these details, you can also select the Time range and Time granularity of the metrics. At most, you can view metrics for the past 30 days. After you apply the filter, a chart is displayed based on your filter. You can see the average number of request units consumed per minute for the selected period.

    Choose a metric from the Azure portal

Add filters to metrics

You can also filter metrics and the chart displayed by a specific CollectionName, DatabaseName, OperationType, Region, and StatusCode. To filter the metrics, select Add filter, choose the required property such as OperationType, and select a value such as Query. The graph then displays the request units consumed for the query operation for the selected period. Operations executed via stored procedures are not logged, so they are not available under the OperationType metric.

Add a filter to select the metric granularity

You can group metrics by using the Apply splitting option. For example, you can group the request units per operation type and view the graph for all the operations at once as shown in the following image:

Add apply splitting filter
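
Conceptually, splitting groups the data points of one metric by the values of a dimension. The following Python sketch mimics that grouping on hypothetical data points; the `split_by_dimension` helper, the `request_units` field, and the sample values are assumptions for illustration and do not reflect how the portal computes its charts.

```python
from collections import defaultdict


def split_by_dimension(points, dimension):
    """Sum request units per distinct value of the given dimension."""
    totals = defaultdict(float)
    for point in points:
        totals[point[dimension]] += point["request_units"]
    return dict(totals)


# Hypothetical data points for a single time window.
points = [
    {"OperationType": "Query", "request_units": 12.5},
    {"OperationType": "Read", "request_units": 1.0},
    {"OperationType": "Query", "request_units": 7.5},
]
per_operation = split_by_dimension(points, "OperationType")
```

Each key in `per_operation` corresponds to one line in a split chart, here one for `Query` and one for `Read`.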

Analyzing logs

Data in Azure Monitor Logs is stored in tables where each table has its own set of unique properties.

All resource logs in Azure Monitor have the same fields followed by service-specific fields. The common schema is outlined in Azure Monitor resource log schema. For a list of the types of resource logs collected for Azure Cosmos DB, see Monitoring Azure Cosmos DB data reference.

The Activity log is a platform log in Azure that provides insight into subscription-level events. You can view it independently or route it to Azure Monitor Logs, where you can do much more complex queries using Log Analytics.

Azure Cosmos DB stores data in the following tables.

Table | Description
AzureDiagnostics | Common table used by multiple services to store Resource logs. Resource logs from Azure Cosmos DB can be identified with MICROSOFT.DOCUMENTDB.
AzureActivity | Common table that stores all records from the Activity log.

Sample Kusto queries

Important

When you select Logs from the Azure Cosmos DB menu, Log Analytics is opened with the query scope set to the current Azure Cosmos DB account. This means that log queries will only include data from that resource. If you want to run a query that includes data from other accounts or data from other Azure services, select Logs from the Azure Monitor menu. See Log query scope and time range in Azure Monitor Log Analytics for details.

Here are some queries that you can enter into the Log search bar to help you monitor your Azure Cosmos resources. These queries work with the new language.

  • To query for all of the diagnostic logs from Azure Cosmos DB for a specified time period:

    // Adjust the time window to the period you want to inspect.
    AzureDiagnostics
    | where TimeGenerated >= ago(1d)
    | where ResourceProvider=="MICROSOFT.DOCUMENTDB" and Category=="DataPlaneRequests"
    
    
  • To query for all operations, grouped by resource:

    AzureActivity 
    | where ResourceProvider=="Microsoft.DocumentDb" and Category=="DataPlaneRequests" 
    | summarize count() by Resource
    
    
  • To query for all user activity, grouped by resource:

    AzureActivity 
    | where Caller == "test@company.com" and ResourceProvider=="Microsoft.DocumentDb" and Category=="DataPlaneRequests" 
    | summarize count() by Resource
    

Alerts

Azure Monitor alerts proactively notify you when important conditions are found in your monitoring data. They allow you to identify and address issues in your system before your customers notice them. You can set alerts on metrics, logs, and the activity log. Different types of alerts have benefits and drawbacks.

For example, the following table lists a few alert rules for your resources. You can find a detailed list of alert rules in the Azure portal. To learn more, see the how to configure alerts article.

Alert type | Condition | Description
Rate limiting on request units (metric alert) | Dimension name: StatusCode, Operator: Equals, Dimension values: 429 | Alerts if the container or a database has exceeded the provisioned throughput limit.
Region failed over | Operator: Greater than, Aggregation type: Count, Threshold value: 1 | When a single region is failed over. This alert is helpful if you didn't enable automatic failover.
Rotate keys (activity log alert) | Event level: Informational, Status: started | Alerts when the account keys are rotated. You can update your application with the new keys.
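
To make the rate-limiting condition concrete, the following Python sketch evaluates it against a set of hypothetical data points. It is only an illustration of the logic (filter on StatusCode equal to 429, then compare the count to a threshold); the `should_alert_on_rate_limiting` helper, the `count` field, and the default threshold of 0 are assumptions, and real alert rules are evaluated by Azure Monitor rather than by your own code.

```python
def should_alert_on_rate_limiting(points, threshold=0):
    """Return True when the number of 429 responses exceeds the threshold."""
    throttled = sum(
        point["count"] for point in points if str(point["StatusCode"]) == "429"
    )
    return throttled > threshold


# Hypothetical request counts per status code for one evaluation window.
window = [
    {"StatusCode": 200, "count": 950},
    {"StatusCode": 429, "count": 12},
]
alert = should_alert_on_rate_limiting(window)
```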

Monitor Azure Cosmos DB programmatically

The account level metrics available in the portal, such as account storage usage and total requests, are not available via the SQL APIs. However, you can retrieve usage data at the collection level by using the SQL APIs. To retrieve collection level data, do the following:

To access additional metrics, use the Azure Monitor SDK. Available metric definitions can be retrieved by calling:

https://management.azure.com/subscriptions/{SubscriptionId}/resourceGroups/{ResourceGroup}/providers/Microsoft.DocumentDb/databaseAccounts/{DocumentDBAccountName}/providers/microsoft.insights/metricDefinitions?api-version=2018-01-01

To retrieve individual metrics use the following format:

https://management.azure.com/subscriptions/{SubscriptionId}/resourceGroups/{ResourceGroup}/providers/Microsoft.DocumentDb/databaseAccounts/{DocumentDBAccountName}/providers/microsoft.insights/metrics?timespan={StartTime}/{EndTime}&interval={AggregationInterval}&metricnames={MetricName}&aggregation={AggregationType}&$filter={Filter}&api-version=2018-01-01

To learn more, see the Azure monitoring REST API article.
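
The two URL formats above can be assembled programmatically. The following Python sketch builds both URLs with the standard library only. It assumes UTC timestamps, an ISO 8601 interval such as `PT1M`, and a dimension filter written in the `Name eq 'value'` form; the subscription, resource group, and account names are placeholders, and the helper names are hypothetical. Sending the request also requires an Azure Resource Manager bearer token, which is not shown.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote, urlencode

BASE_URL = "https://management.azure.com"
API_VERSION = "2018-01-01"
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def account_resource_id(subscription_id, resource_group, account_name):
    """Build the resource ID segment shared by both endpoints."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DocumentDb/databaseAccounts/{account_name}"
    )


def metric_definitions_url(resource_id):
    """URL that lists the metric definitions available for the account."""
    return (
        f"{BASE_URL}{resource_id}/providers/microsoft.insights/"
        f"metricDefinitions?api-version={API_VERSION}"
    )


def metrics_url(resource_id, metric_name, start, end,
                interval="PT1M", aggregation="Total", dimension_filter=None):
    """URL that retrieves one metric for the given UTC time span."""
    params = {
        "timespan": f"{start.strftime(TIME_FORMAT)}/{end.strftime(TIME_FORMAT)}",
        "interval": interval,
        "metricnames": metric_name,
        "aggregation": aggregation,
    }
    if dimension_filter:
        params["$filter"] = dimension_filter
    params["api-version"] = API_VERSION
    # Keep "$", "/", ":" and "'" readable; spaces are percent-encoded as %20.
    query = urlencode(params, safe="$/:'", quote_via=quote)
    return f"{BASE_URL}{resource_id}/providers/microsoft.insights/metrics?{query}"


# Placeholder identifiers; replace them with your own values.
resource_id = account_resource_id("sub-id", "my-rg", "my-account")
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
definitions_url = metric_definitions_url(resource_id)
throttled_url = metrics_url(
    resource_id, "TotalRequestUnits", start, start + timedelta(hours=1),
    dimension_filter="StatusCode eq '429'",
)
```

Building the query string with `urlencode` rather than string concatenation keeps filter values that contain spaces or quotes correctly encoded.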

Next steps