Azure Cosmos DB FAQ

Azure Cosmos DB fundamentals

What is Azure Cosmos DB?

Azure Cosmos DB is a globally replicated, multi-model database service that offers rich querying over schema-free data, helps deliver configurable and reliable performance, and enables rapid development. It's all achieved through a managed platform that's backed by the power and reach of Microsoft Azure.

Azure Cosmos DB is the right solution for web, mobile, gaming, and IoT applications when predictable throughput, high availability, low latency, and a schema-free data model are key requirements. It delivers schema flexibility and rich indexing, and it includes multi-document transactional support with integrated JavaScript.

For more database questions, answers, and instructions for deploying and using this service, see the Azure Cosmos DB documentation page.

What happened to the DocumentDB API?

The Azure Cosmos DB DocumentDB API or SQL (DocumentDB) API is now known as Azure Cosmos DB SQL API. You don't need to change anything to continue running your apps built with DocumentDB API. The functionality remains the same.

If you had a DocumentDB API account before, you now have a SQL API account, with no change to your billing.

What happened to Azure DocumentDB as a service?

The Azure DocumentDB service is now a part of the Azure Cosmos DB service and manifests itself in the form of the SQL API. Applications built against Azure DocumentDB will run without any changes against Azure Cosmos DB SQL API. In addition, Azure Cosmos DB supports the Graph API, Table API, MongoDB API, and Cassandra API (Preview).

What are the typical use cases for Azure Cosmos DB?

Azure Cosmos DB is a good choice for new web, mobile, gaming, and IoT applications where automatic scale, predictable performance, fast order-of-millisecond response times, and the ability to query over schema-free data are important. Azure Cosmos DB lends itself to rapid development and supporting the continuous iteration of application data models. Applications that manage user-generated content and data are common use cases for Azure Cosmos DB.

How does Azure Cosmos DB offer predictable performance?

A request unit (RU) is the measure of throughput in Azure Cosmos DB. A 1-RU throughput corresponds to the throughput of the GET of a 1-KB document. Every operation in Azure Cosmos DB, including reads, writes, SQL queries, and stored procedure executions, has a deterministic RU value that's based on the throughput required to complete the operation. Instead of thinking about CPU, IO, and memory and how they each affect your application throughput, you can think in terms of a single RU measure.

You can reserve throughput for each Azure Cosmos DB container in terms of RUs per second. For applications of any scale, you can benchmark individual requests to measure their RU values, and provision a container to handle the total of request units across all requests. You can also scale up or scale down your container's throughput as the needs of your application evolve. For more information about request units and for help determining your container needs, see Estimating throughput needs and try the throughput calculator. The term container here refers to a SQL API collection, Graph API graph, MongoDB API collection, and Table API table.
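
For illustration, here is a minimal C# sketch (inside an async method, using the SQL API .NET SDK) that provisions a collection with a fixed RU/s budget and then reads the RU charge of a single write. The endpoint, key, and database/collection names are hypothetical placeholders.

using System;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Hypothetical endpoint and key; copy yours from the portal's Keys blade.
var client = new DocumentClient(new Uri("https://myaccount.documents.azure.com:443/"), "<primary-key>");

// Reserve 1,000 RU/s of provisioned throughput for the collection.
await client.CreateDocumentCollectionAsync(
    UriFactory.CreateDatabaseUri("mydb"),
    new DocumentCollection { Id = "mycoll" },
    new RequestOptions { OfferThroughput = 1000 });

// Every response reports the RUs consumed, which you can use for benchmarking.
var response = await client.CreateDocumentAsync(
    UriFactory.CreateDocumentCollectionUri("mydb", "mycoll"),
    new { id = "1", name = "Andy" });
Console.WriteLine($"Request charge: {response.RequestCharge} RUs");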

How does Azure Cosmos DB support various data models such as key/value, columnar, document, and graph?

Key/value (table), columnar, document, and graph data models are all natively supported because of the ARS (atoms, records, and sequences) design that Azure Cosmos DB is built on. Atoms, records, and sequences can be easily mapped and projected to various data models. The APIs for a subset of models are available right now (SQL, MongoDB, Table, and Graph APIs), and others specific to additional data models will be available in the future.

Azure Cosmos DB has a schema-agnostic indexing engine capable of automatically indexing all the data it ingests without requiring any schema or secondary indexes from the developer. The engine relies on a set of logical index layouts (inverted, columnar, tree) that decouple the storage layout from the index and query processing subsystems. Cosmos DB also has the ability to support a set of wire protocols and APIs in an extensible manner and translate them efficiently to the core data model and the logical index layouts, making it uniquely capable of supporting multiple data models natively.

Is Azure Cosmos DB HIPAA compliant?

Yes, Azure Cosmos DB is HIPAA-compliant. HIPAA establishes requirements for the use, disclosure, and safeguarding of individually identifiable health information. For more information, see the Microsoft Trust Center.

What are the storage limits of Azure Cosmos DB?

There is no limit to the total amount of data that a container can store in Azure Cosmos DB.

What are the throughput limits of Azure Cosmos DB?

There is no limit to the total amount of throughput that a container can support in Azure Cosmos DB. The key idea is to distribute your workload roughly evenly among a sufficiently large number of partition keys.
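
As a hedged sketch of that idea with the SQL API .NET SDK: choose a partition key with many distinct values when you create the container, so the workload spreads evenly. The /userId path and names here are assumptions, and client is a DocumentClient like the one shown earlier.

using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// A partitioned collection keyed on /userId; documents with different
// userId values can be distributed across many physical partitions.
var collection = new DocumentCollection { Id = "orders" };
collection.PartitionKey.Paths.Add("/userId");

await client.CreateDocumentCollectionAsync(
    UriFactory.CreateDatabaseUri("mydb"),
    collection,
    new RequestOptions { OfferThroughput = 10000 });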

Are Direct and Gateway connectivity modes encrypted?

Yes, both modes are always fully encrypted.

How much does Azure Cosmos DB cost?

For details, refer to the Azure Cosmos DB pricing details page. Azure Cosmos DB usage charges are determined by the number of provisioned containers, the number of hours the containers were online, and the provisioned throughput for each container. The term containers here refers to the SQL API collection, Graph API graph, MongoDB API collection, and Table API table.

Is a free account available?

Yes, you can sign up for a time-limited account at no charge, with no commitment. To sign up, visit Try Azure Cosmos DB for free or read more in the Try Azure Cosmos DB FAQ.

If you are new to Azure, you can sign up for an Azure free account, which gives you 30 days and a credit to try all the Azure services. If you have a Visual Studio subscription, you are also eligible for free Azure credits to use on any Azure service.

You can also use the Azure Cosmos DB Emulator to develop and test your application locally for free, without creating an Azure subscription. When you're satisfied with how your application is working in the Azure Cosmos DB Emulator, you can switch to using an Azure Cosmos DB account in the cloud.

How can I get additional help with Azure Cosmos DB?

To ask a technical question, you can post to one of these two question and answer forums:

To request new features, create a new request on Uservoice.

To fix an issue with your account, file a support request in the Azure portal.

Other questions can be submitted to the team at askcosmosdb@microsoft.com; however, this is not a technical support alias.

Try Azure Cosmos DB subscriptions

You can now enjoy a time-limited Azure Cosmos DB experience without a subscription, free of charge and commitments. To sign up for a Try Azure Cosmos DB subscription, go to Try Azure Cosmos DB for free. This subscription is separate from the Azure Free Trial, and can be used in addition to an Azure Free Trial or an Azure paid subscription.

Try Azure Cosmos DB subscriptions appear in the Azure portal next to other subscriptions associated with your user ID.

The following conditions apply to Try Azure Cosmos DB subscriptions:

  • One container per subscription for SQL, Gremlin (Graph API), and Table accounts.
  • Up to 3 collections per subscription for MongoDB accounts.
  • 10 GB storage capacity.
  • Global replication is available in the following Azure regions: Central US, North Europe, and Southeast Asia.
  • Maximum throughput of 5K RU/s.
  • Subscriptions expire after 24 hours, and can be extended to a maximum of 48 hours total.
  • Azure support tickets cannot be created for Try Azure Cosmos DB accounts; however, support is provided for subscribers with existing support plans.

Set up Azure Cosmos DB

How do I sign up for Azure Cosmos DB?

Azure Cosmos DB is available in the Azure portal. First, sign up for an Azure subscription. After you've signed up, you can add a SQL API, Graph API, Table API, MongoDB API, or Cassandra API account to your Azure subscription.

What is a master key?

A master key is a security token to access all resources for an account. Individuals with the key have read and write access to all resources in the database account. Use caution when you distribute master keys. The primary master key and secondary master key are available on the Keys blade of the Azure portal. For more information about keys, see View, copy, and regenerate access keys.

What are the regions that PreferredLocations can be set to?

The PreferredLocations value can be set to any of the Azure regions in which Cosmos DB is available. For a list of available regions, see Azure regions.
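
For example, with the .NET SDK you set PreferredLocations on the ConnectionPolicy when you create the client. This is a minimal sketch; the regions, endpoint, and key are placeholders.

using System;
using Microsoft.Azure.Documents.Client;

// Reads are served from the first available region in this list.
var connectionPolicy = new ConnectionPolicy();
connectionPolicy.PreferredLocations.Add(LocationNames.WestUS);
connectionPolicy.PreferredLocations.Add(LocationNames.NorthEurope);

var client = new DocumentClient(
    new Uri("https://myaccount.documents.azure.com:443/"), "<primary-key>", connectionPolicy);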

Is there anything I should be aware of when distributing data across the world via the Azure datacenters?

Azure Cosmos DB is present across all Azure regions, as specified on the Azure regions page. Because it is a core service, every new datacenter has an Azure Cosmos DB presence.

When you set a region, remember that Azure Cosmos DB respects sovereign and government clouds. That is, if you create an account in a sovereign region, you cannot replicate out of that sovereign region. Similarly, you cannot enable replication into other sovereign locations from an outside account.

Is it possible to switch from container-level throughput provisioning to database-level throughput provisioning, or vice versa?

Container-level and database-level throughput provisioning are separate offerings, and switching between them requires migrating data from source to destination. This means that you need to create a new database or a new collection and then migrate the data by using the bulk executor library or Azure Data Factory.

How do I create a fixed collection with a partition key?

Currently, you can create a fixed collection with a partition key by using the CreatePartitionedCollection method of the .NET SDK or by using the Azure CLI. Creating a fixed collection with a partition key by using the Azure portal is not currently supported.

Develop against the SQL API

How do I start developing against the SQL API?

First you must sign up for an Azure subscription. Once you sign up for an Azure subscription, you can add a SQL API container to your Azure subscription. For instructions on adding an Azure Cosmos DB account, see Create an Azure Cosmos DB database account.

SDKs are available for .NET, Python, Node.js, JavaScript, and Java. Developers can also use the RESTful HTTP APIs to interact with Azure Cosmos DB resources from various platforms and languages.

Can I access some ready-made samples to get a head start?

Samples for the SQL API .NET, Java, Node.js, and Python SDKs are available on GitHub.

Does the SQL API database support schema-free data?

Yes, the SQL API allows applications to store arbitrary JSON documents without schema definitions or hints. Data is immediately available for query through the Azure Cosmos DB SQL query interface.
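
As a small illustration, the .NET SDK lets you persist differently shaped JSON-serializable objects into the same collection with no schema declared up front. A sketch that reuses the client and hypothetical names from the earlier example:

// No schema or secondary-index definition is required before inserting
// documents with different shapes into the same collection.
await client.CreateDocumentAsync(
    UriFactory.CreateDocumentCollectionUri("mydb", "mycoll"),
    new { id = "1", type = "person", name = "Andy", age = 30 });

await client.CreateDocumentAsync(
    UriFactory.CreateDocumentCollectionUri("mydb", "mycoll"),
    new { id = "2", type = "device", model = "thermostat", firmware = "1.2.0" });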

Does the SQL API support ACID transactions?

Yes, the SQL API supports cross-document transactions expressed as JavaScript stored procedures and triggers. Transactions are scoped to a single partition within each container and executed with ACID semantics as "all or nothing," isolated from other concurrently executing code and user requests. If exceptions are thrown through the server-side execution of JavaScript application code, the entire transaction is rolled back. For more information about transactions, see Database program transactions.

What is a container?

A container is a group of documents and their associated JavaScript application logic. A container is a billable entity, where the cost is determined by the throughput and used storage. Containers can span one or more partitions or servers and can scale to handle practically unlimited volumes of storage or throughput.

  • For SQL and MongoDB API accounts, a container maps to a Collection.
  • For Cassandra and Table API accounts, a container maps to a Table.
  • For Gremlin API accounts, a container maps to a Graph.

Containers are also the billing entities for Azure Cosmos DB. Each container is billed hourly, based on the provisioned throughput and used storage space. For more information, see Azure Cosmos DB Pricing.

How do I create a database?

You can create databases by using the Azure portal, as described in Add a collection, one of the Azure Cosmos DB SDKs, or the REST APIs.

How do I set up users and permissions?

You can create users and permissions by using one of the Cosmos DB API SDKs or the REST APIs.
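
A minimal .NET sketch of the flow: create a user in a database, then grant it read-only access to one collection. The identifiers are hypothetical, and client is an existing DocumentClient.

using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Create a user scoped to the database.
await client.CreateUserAsync(UriFactory.CreateDatabaseUri("mydb"), new User { Id = "appUser" });

// Grant the user read-only access to one collection. The resulting permission
// carries a resource token that can be handed to a client application.
var coll = await client.ReadDocumentCollectionAsync(UriFactory.CreateDocumentCollectionUri("mydb", "orders"));
await client.CreatePermissionAsync(
    UriFactory.CreateUserUri("mydb", "appUser"),
    new Permission { Id = "readOrders", PermissionMode = PermissionMode.Read, ResourceLink = coll.Resource.SelfLink });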

Does the SQL API support SQL?

The SQL query language supported by SQL API accounts is an enhanced subset of the query functionality that's supported by SQL Server. The Azure Cosmos DB SQL query language provides rich hierarchical and relational operators and extensibility via JavaScript-based, user-defined functions (UDFs). JSON grammar allows for modeling JSON documents as trees with labeled nodes, which are used by both the Azure Cosmos DB automatic indexing techniques and the SQL query dialect of Azure Cosmos DB. For information about using SQL grammar, see the SQL Query article.

Does the SQL API support SQL aggregation functions?

The SQL API supports low-latency aggregation at any scale via the aggregate functions COUNT, MIN, MAX, AVG, and SUM in the SQL grammar. For more information, see Aggregate functions.
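
For example, a cross-partition COUNT issued through the .NET SDK might look like this sketch (the collection name and the cross-partition flag are assumptions for a partitioned collection):

using System.Linq;
using Microsoft.Azure.Documents.Client;

// SELECT VALUE returns the bare aggregate value instead of a wrapped object.
long count = client.CreateDocumentQuery<long>(
        UriFactory.CreateDocumentCollectionUri("mydb", "mycoll"),
        "SELECT VALUE COUNT(1) FROM c",
        new FeedOptions { EnableCrossPartitionQuery = true })
    .AsEnumerable()
    .First();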

How does the SQL API provide concurrency?

The SQL API supports optimistic concurrency control (OCC) through HTTP entity tags, or ETags. Every SQL API resource has an ETag, and the ETag is set on the server every time a document is updated. The ETag header and the current value are included in all response messages. ETags can be used with the If-Match header to allow the server to decide whether a resource should be updated. The If-Match value is the ETag value to be checked against. If the ETag value matches the server ETag value, the resource is updated. If the ETag is no longer current, the server rejects the operation with an "HTTP 412 Precondition failure" response code. The client then re-fetches the resource to acquire the current ETag value for the resource. In addition, ETags can be used with the If-None-Match header to determine whether a re-fetch of a resource is needed.

To use optimistic concurrency in .NET, use the AccessCondition class. For a .NET sample, see Program.cs in the DocumentManagement sample on GitHub.
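
The linked sample shows the full pattern; a condensed, hedged version of the If-Match flow (assuming a single-partition collection and the hypothetical names used earlier) looks roughly like this:

using System.Net;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

var doc = (Document)(await client.ReadDocumentAsync(
    UriFactory.CreateDocumentUri("mydb", "mycoll", "1"))).Resource;
doc.SetPropertyValue("name", "Andrew");

try
{
    // Replace the document only if nobody else changed it since we read it.
    await client.ReplaceDocumentAsync(doc.SelfLink, doc, new RequestOptions
    {
        AccessCondition = new AccessCondition { Type = AccessConditionType.IfMatch, Condition = doc.ETag }
    });
}
catch (DocumentClientException ex) when (ex.StatusCode == HttpStatusCode.PreconditionFailed)
{
    // Another writer won: re-read the document to get the current ETag, then retry.
}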

How do I perform transactions in the SQL API?

The SQL API supports language-integrated transactions via JavaScript stored procedures and triggers. All database operations inside scripts are executed under snapshot isolation. If it is a single-partition collection, the execution is scoped to the collection. If the collection is partitioned, the execution is scoped to documents with the same partition-key value within the collection. A snapshot of the document versions (ETags) is taken at the start of the transaction and committed only if the script succeeds. If the JavaScript throws an error, the transaction is rolled back. For more information, see Server-side JavaScript programming for Azure Cosmos DB.
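
Invoking such a transactional script from .NET scopes the execution to one partition key. A minimal sketch; the stored procedure name, parameters, and partition-key value are hypothetical:

using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// The script executes as an all-or-nothing transaction inside the
// partition identified by this partition-key value.
var result = await client.ExecuteStoredProcedureAsync<string>(
    UriFactory.CreateStoredProcedureUri("mydb", "mycoll", "transferPoints"),
    new RequestOptions { PartitionKey = new PartitionKey("user1") },
    "user1", 100);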

How can I bulk-insert documents into Cosmos DB?

You can bulk-insert documents into Azure Cosmos DB in one of the following ways:

  • The bulk executor library, mentioned earlier in this FAQ.
  • Azure Data Factory.

I have set up my container to use lazy indexing, and my queries do not return the expected results. Why?

As explained in the indexing section, lazy indexing can result in this behavior. You should always use consistent indexing for your applications.

Are resource links cacheable?

Yes, because Azure Cosmos DB is a RESTful service, resource links are immutable and can be cached. SQL API clients can specify an If-None-Match header for reads against any resource, such as a document or collection, and then update their local copies after the server version has changed.

Is a local instance of SQL API available?

Yes. The Azure Cosmos DB Emulator provides a high-fidelity emulation of the Cosmos DB service. It supports functionality that's identical to Azure Cosmos DB, including support for creating and querying JSON documents, provisioning and scaling collections, and executing stored procedures and triggers. You can develop and test applications by using the Azure Cosmos DB Emulator, and deploy them to Azure at a global scale by making a single configuration change to the connection endpoint for Azure Cosmos DB.

Why are long floating-point values in a document rounded when viewed from the Data Explorer in the portal?

This is a limitation of JavaScript. JavaScript uses double-precision floating-point format numbers, as specified in IEEE 754, and it can safely represent numbers only between -(2^53 - 1) and 2^53 - 1 (that is, 9007199254740991).

Where are permissions allowed in the object hierarchy?

Creating permissions by using ResourceTokens is allowed at the container level and its descendants (such as documents and attachments). Creating a permission at the database or account level is not currently allowed.

Develop against the API for MongoDB

What is the Azure Cosmos DB API for MongoDB?

The Azure Cosmos DB API for MongoDB is a compatibility layer that allows applications to easily and transparently communicate with the native Azure Cosmos DB database engine by using existing, community-supported Apache MongoDB APIs and drivers. Developers can now use existing MongoDB tool chains and skills to build applications that take advantage of Azure Cosmos DB. Developers benefit from the unique capabilities of Azure Cosmos DB, which include auto-indexing, backup maintenance, financially backed service level agreements (SLAs), and so on.

How do I connect to my API for MongoDB database?

The quickest way to connect to the Azure Cosmos DB API for MongoDB is to head over to the Azure portal. Go to your account and then, on the left navigation menu, click Quick Start. Quick Start is the best way to get code snippets to connect to your database.

Azure Cosmos DB enforces strict security requirements and standards. Azure Cosmos DB accounts require authentication and secure communication via SSL, so be sure to use TLSv1.2.

For more information, see Connect to your API for MongoDB database.

Are there additional error codes for an API for MongoDB database?

In addition to the common MongoDB error codes, the MongoDB API has its own specific error codes:

  • TooManyRequests (16500): The total number of request units consumed has exceeded the provisioned request-unit rate for the collection and has been throttled. Solution: Consider scaling the throughput assigned to a container or a set of containers from the Azure portal, or retry the operation.
  • ExceededMemoryLimit (16501): As a multi-tenant service, the operation has exceeded the client's memory allotment. Solution: Reduce the scope of the operation through more restrictive query criteria, or contact support from the Azure portal.

Example:

    db.getCollection('users').aggregate([
        {$match: {name: "Andy"}},
        {$sort: {age: -1}}
    ])

Develop with the Table API

How can I use the Table API offering?

The Azure Cosmos DB Table API is available in the Azure portal. First you must sign up for an Azure subscription. After you've signed up, you can add an Azure Cosmos DB Table API account to your Azure subscription, and then add tables to your account.

You can find the supported languages and associated quick-starts in the Introduction to Azure Cosmos DB Table API.

Do I need a new SDK to use the Table API?

No, existing storage SDKs should still work. However, we recommend that you always get the latest SDKs for the best support and, in many cases, superior performance. See the list of available languages in the Introduction to Azure Cosmos DB Table API.

Where does the Table API differ from Azure Table storage behavior?

There are some behavior differences that users coming from Azure Table storage who want to create tables with the Azure Cosmos DB Table API should be aware of:

  • The Azure Cosmos DB Table API uses a reserved-capacity model to ensure guaranteed performance, which means that you pay for capacity as soon as the table is created, even if that capacity isn't being used. With Azure Table storage, you pay only for the capacity that is actually used. This helps to explain why the Table API can offer a 10-ms read and 15-ms write SLA at the 99th percentile, while Azure Table storage offers a 10-second SLA. As a consequence, with the Table API, even empty tables without any requests cost money, so that the capacity to handle any requests to them is available at the SLA offered by Azure Cosmos DB.
  • Query results returned by the Table API are not sorted in partition key/row key order as they are in Azure Table storage.
  • Row keys can be up to only 255 bytes.
  • Batches can contain up to only 2 MB.
  • CORS is not currently supported.
  • Table names in Azure Table storage are not case-sensitive, but they are in the Azure Cosmos DB Table API.
  • Some of Azure Cosmos DB's internal formats for encoding information, such as binary fields, are currently not as efficient as one might like. Therefore, this can cause unexpected limitations on data size. For example, you currently can't use the full 1 MB of a table entity to store binary data, because the encoding increases the data's size.
  • The entity property name 'Id' is currently not supported.
  • TableQuery TakeCount is not limited to 1,000.

In terms of the REST API there are a number of endpoints/query options that are not supported by Azure Cosmos DB Table API:

  • GET, PUT on /?restype=service&comp=properties (Set Table Service Properties and Get Table Service Properties): This endpoint is used to set CORS rules, storage analytics configuration, and logging settings. CORS is currently not supported, and analytics and logging are handled differently in Azure Cosmos DB than in Azure Storage tables.
  • OPTIONS on / (Pre-flight CORS table request): This is part of CORS, which Azure Cosmos DB does not currently support.
  • GET on /?restype=service&comp=stats (Get Table Service Stats): Provides information about how quickly data is replicating between the primary and secondaries. This isn't needed in Cosmos DB, because replication is part of writes.
  • GET, PUT on /mytable?comp=acl (Get Table ACL and Set Table ACL): Gets and sets the stored access policies used to manage shared access signatures (SAS). Although SAS is supported, policies are set and managed differently.

In addition Azure Cosmos DB Table API only supports the JSON format, not ATOM.

While Azure Cosmos DB supports Shared Access Signatures (SAS), there are certain policies it doesn't support, specifically those related to management operations, such as the right to create new tables.

For the .NET SDK in particular, there are some classes and methods that Azure Cosmos DB does not currently support.

  • CloudTableClient: all methods related to *ServiceProperties* and *ServiceStats*.
  • CloudTable: the SetPermissions* and GetPermissions* methods.
  • TableServiceContext: the entire class (this class is deprecated).
  • TableServiceEntity: the entire class (this class is deprecated).
  • TableServiceExtensions: the entire class (this class is deprecated).
  • TableServiceQuery: the entire class (this class is deprecated).

If any of these differences are a problem for your project, please contact askcosmosdb@microsoft.com and let us know.

How do I provide feedback about the SDK or bugs?

You can share your feedback in any of the following ways:

What is the connection string that I need to use to connect to the Table API?

The connection string is:

DefaultEndpointsProtocol=https;AccountName=<AccountNameFromCosmosDB>;AccountKey=<FromKeysPaneOfCosmosDB>;TableEndpoint=https://<AccountName>.table.cosmosdb.azure.com

You can get the connection string from the Connection String page in the Azure portal.
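
Once you have the connection string, creating a table client from it with the .NET storage SDK looks roughly like this sketch (the table name is hypothetical):

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Paste the full connection string from the portal's Connection String page.
var account = CloudStorageAccount.Parse("<connection-string-from-portal>");
var tableClient = account.CreateCloudTableClient();
var table = tableClient.GetTableReference("people");
table.CreateIfNotExists();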

How do I override the config settings for the request options in the .NET SDK for the Table API?

For information about config settings, see Azure Cosmos DB capabilities. Some settings are handled on the CreateCloudTableClient method, and others via the app.config file in the appSettings section of the client application.

Are there any changes for customers who are using the existing Azure Table storage SDKs?

None. There are no changes for existing or new customers who are using the existing Azure Table storage SDKs.

How do I view table data that is stored in Azure Cosmos DB for use with the Table API?

You can use the Azure portal to browse the data. You can also use the Table API code or the tools mentioned in the next answer.

Which tools work with the Table API?

You can use the Azure Storage Explorer.

Tools with the flexibility to take a connection string in the format specified previously can support the new Table API. A list of table tools is provided on the Azure Storage Client Tools page.

Is the concurrency on operations controlled?

Yes, optimistic concurrency is provided via the use of the ETag mechanism.

Is the OData query model supported for entities?

Yes, the Table API supports OData query and LINQ query.

Can I connect to Azure Table Storage and Azure Cosmos DB Table API side by side in the same application?

Yes, you can connect by creating two separate instances of the CloudTableClient, each pointing to its own URI via the connection string.
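
A sketch of that side-by-side pattern, assuming you hold one Azure Table storage connection string and one Table API connection string:

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// One client per service; each connection string carries its own endpoint.
var storageAccount = CloudStorageAccount.Parse("<azure-table-storage-connection-string>");
var cosmosAccount = CloudStorageAccount.Parse("<cosmos-db-table-api-connection-string>");

CloudTableClient storageClient = storageAccount.CreateCloudTableClient();
CloudTableClient cosmosClient = cosmosAccount.CreateCloudTableClient();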

How do I migrate an existing Azure Table storage application to this offering?

AzCopy and the Azure Cosmos DB Data Migration Tool are both supported.

How is expansion of the storage size done for this service if, for example, I start with n GB of data and my data will grow to 1 TB over time?

Azure Cosmos DB is designed to provide unlimited storage via the use of horizontal scaling. The service can monitor and effectively increase your storage.

How do I monitor the Table API offering?

You can use the Table API Metrics pane to monitor requests and storage usage.

How do I calculate the throughput I require?

You can use the capacity estimator to calculate the TableThroughput that's required for the operations. For more information, see Estimate Request Units and Data Storage. In general, you can represent your entity as JSON and provide the numbers for your operations.

Can I use the Table API SDK locally with the emulator?

Not at this time.

Can my existing application work with the Table API?

Yes, the same API is supported.

Do I need to migrate my existing Azure Table storage applications to the SDK if I do not want to use the Table API features?

No, you can create and use existing Azure Table storage assets without interruption of any kind. However, if you do not use the Table API, you cannot benefit from automatic indexing, the additional consistency options, or global distribution.

How do I add replication of the data in the Table API across multiple regions of Azure?

You can use the Azure Cosmos DB portal's global replication settings to add regions that are suitable for your application. To develop a globally distributed application, you should also deploy your application in those regions, with the PreferredLocation information set to the local region, to provide low read latency.

How do I change the primary write region for the account in the Table API?

You can use the Azure Cosmos DB global replication portal pane to add a region and then fail over to the required region. For instructions, see Developing with multi-region Azure Cosmos DB accounts.

How do I configure my preferred read regions for low latency when I distribute my data?

To help read from the local location, use the PreferredLocation key in the app.config file. For existing applications, the Table API throws an error if LocationMode is set. Remove that code, because the Table API picks up this information from the app.config file. For more information, see Azure Cosmos DB capabilities.

How should I think about consistency levels in the Table API?

Azure Cosmos DB provides well-reasoned trade-offs between consistency, availability, and latency. Azure Cosmos DB offers five consistency levels to Table API developers, so you can choose the right consistency model at the table level and on individual requests while querying the data. When a client connects, it can specify a consistency level. You can change the level via the consistencyLevel argument of CreateCloudTableClient.

The Table API provides low-latency reads with "Read your own writes," with Bounded-staleness consistency as the default. For more information, see Consistency levels.

By default, Azure Table storage provides Strong consistency within a region and Eventual consistency in the secondary locations.

Does Azure Cosmos DB Table API offer more consistency levels than Azure Table storage?

Yes, for information about how to benefit from the distributed nature of Azure Cosmos DB, see Consistency levels. Because guarantees are provided for the consistency levels, you can use them with confidence. For more information, see Azure Cosmos DB capabilities.

When global distribution is enabled, how long does it take to replicate the data?

Azure Cosmos DB commits the data durably in the local region and pushes the data to other regions immediately in a matter of milliseconds. This replication is dependent only on the round-trip time (RTT) of the datacenter. To learn more about the global-distribution capability of Azure Cosmos DB, see Azure Cosmos DB: A globally distributed database service on Azure.

Can the read request consistency level be changed?

With Azure Cosmos DB, you can set the consistency level at the container level (on the table). By using the .NET SDK, you can change the level by providing the value for the TableConsistencyLevel key in the app.config file. The possible values are: Strong, Bounded Staleness, Session, Consistent Prefix, and Eventual. For more information, see Tunable data consistency levels in Azure Cosmos DB. The key idea is that you cannot set a request consistency level that is stronger than the table's setting. For example, you cannot set the consistency level for the table to Eventual and the request consistency level to Strong.

How does the Table API handle failover if a region goes down?

The Table API leverages the globally distributed platform of Azure Cosmos DB. To ensure that your application can tolerate datacenter downtime, enable at least one more region for the account in the Azure Cosmos DB portal, as described in Developing with multi-region Azure Cosmos DB accounts. You can set the priority of the regions by using the portal.

You can add as many regions as you want for the account and control where it can fail over to by providing a failover priority. Of course, to use the database, you need to deploy your application there too. When you do so, your customers will not experience downtime. The latest .NET client SDK is auto-homing; that is, it can detect the region that's down and automatically fail over to the new region. The other SDKs are not.

Is the Table API enabled for backups?

Yes, the Table API leverages the platform of Azure Cosmos DB for backups. Backups are made automatically. For more information, see Online backup and restore with Azure Cosmos DB.

Does the Table API index all attributes of an entity by default?

Yes, all attributes of an entity are indexed by default. For more information, see Azure Cosmos DB: Indexing policies.

Does this mean I do not have to create multiple indexes to satisfy the queries?

Yes, Azure Cosmos DB Table API provides automatic indexing of all attributes without any schema definition. This automation frees developers to focus on the application rather than on index creation and management. For more information, see Azure Cosmos DB: Indexing policies.

Can I change the indexing policy?

Yes, you can change the indexing policy by providing the index definition. For more information, see Azure Cosmos DB capabilities. You need to properly encode and escape the settings.

For the non-.NET SDKs, the indexing policy can only be set in the portal. In Data Explorer, navigate to the specific table you want to change, go to Scale & Settings > Indexing Policy, make the desired change, and then select Save.

From the .NET SDK, it can be submitted in the app.config file:

{
  "indexingMode": "consistent",
  "automatic": true,
  "includedPaths": [
    {
      "path": "/somepath",
      "indexes": [
        {
          "kind": "Range",
          "dataType": "Number",
          "precision": -1
        },
        {
          "kind": "Range",
          "dataType": "String",
          "precision": -1
        }
      ]
    }
  ],
  "excludedPaths": [
    {
      "path": "/anotherpath"
    }
  ]
}

Azure Cosmos DB as a platform seems to have a lot of capabilities, such as sorting, aggregates, hierarchy, and other functionality. Will you be adding these capabilities to the Table API?

The Table API provides the same query functionality as Azure Table storage. Azure Cosmos DB also supports sorting, aggregates, geospatial query, hierarchy, and a wide range of built-in functions. We will provide additional functionality in the Table API in a future service update. For more information, see SQL queries.

When should I change TableThroughput for the Table API?

You should change TableThroughput when either of the following conditions applies:

  • You're performing an extract, transform, and load (ETL) of data, or you want to upload a lot of data in a short amount of time.
  • You need more throughput from the container or from a set of containers at the back end. For example, you see that the used throughput is more than the provisioned throughput, and you are getting throttled. For more information, see Set throughput for Azure Cosmos DB containers.

Can I scale up or scale down the throughput of my Table API table?

Yes, you can use the Azure Cosmos DB portal's scale pane to scale the throughput. For more information, see Set throughput.

Is a default TableThroughput set for newly provisioned tables?

Yes, if you do not override the TableThroughput via app.config and do not use a pre-created container in Azure Cosmos DB, the service creates a table with a throughput of 400 RU/s.

Is there any change of pricing for existing customers of the Azure Table storage service?

None. There is no change in price for existing Azure Table storage customers.

How is the price calculated for the Table API?

The price depends on the allocated TableThroughput.

How do I handle any rate limiting on the tables in Table API offering?

If the request rate exceeds the capacity of the provisioned throughput for the underlying container or a set of containers, you get an error, and the SDK retries the call by applying the retry policy.
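
If the SDK's default policy isn't sufficient, you can tune the retries yourself. A hedged sketch using the storage SDK's exponential retry policy, assuming tableClient is an existing CloudTableClient:

using System;
using Microsoft.WindowsAzure.Storage.RetryPolicies;

// Retry throttled requests up to 5 times, backing off exponentially
// from a 2-second base delay.
tableClient.DefaultRequestOptions.RetryPolicy = new ExponentialRetry(TimeSpan.FromSeconds(2), 5);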

Why do I need to choose a throughput apart from PartitionKey and RowKey to take advantage of the Table API offering of Azure Cosmos DB?

Azure Cosmos DB sets a default throughput for your container if you do not provide one in the app.config file or via the portal.

Azure Cosmos DB provides guarantees for performance and latency, with upper bounds on operation. This guarantee is possible when the engine can enforce governance on the tenant's operations. Setting TableThroughput ensures that you get the guaranteed throughput and latency, because the platform reserves this capacity and guarantees operational success.

By using the throughput specification, you can elastically change it to benefit from the seasonality of your application, meet the throughput needs, and save costs.

Azure Table storage has been very inexpensive for me, because I pay only to store the data, and I rarely query. The Azure Cosmos DB Table API offering seems to be charging me even though I have not performed a single transaction or stored anything. Can you please explain?

Azure Cosmos DB is designed to be a globally distributed, SLA-based system with guarantees for availability, latency, and throughput. When you reserve throughput in Azure Cosmos DB, it is guaranteed, unlike the throughput of other systems. Azure Cosmos DB provides additional capabilities that customers have requested, such as secondary indexes and global distribution.

I never get a "quota full" notification (indicating that a partition is full) when I ingest data into Azure Table storage. With the Table API, I do get this message. Is this offering limiting me and forcing me to change my existing application?

Azure Cosmos DB is an SLA-based system that provides unlimited scale, with guarantees for latency, throughput, availability, and consistency. To ensure guaranteed premium performance, make sure that your data size and index are manageable and scalable. The 10-GB limit on the number of entities or items per partition key is to ensure that we provide great lookup and query performance. To ensure that your application scales well, even for Azure Storage, we recommend that you not create a hot partition by storing all information in one partition and querying it.

So PartitionKey and RowKey are still required with the Table API?

Yes. Because the surface area of the Table API is similar to that of the Azure Table storage SDK, the partition key provides an efficient way to distribute the data. The row key is unique within that partition. The row key needs to be present and can't be null as in the standard SDK. The maximum length of RowKey is 255 bytes, and the maximum length of PartitionKey is 1 KB.
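
For example, inserting an entity with the storage SDK requires both keys. A minimal sketch, assuming table is an existing CloudTable reference:

using Microsoft.WindowsAzure.Storage.Table;

// PartitionKey distributes the data; RowKey must be unique within the partition.
var entity = new DynamicTableEntity("Smith", "Ben");
entity.Properties["Email"] = new EntityProperty("ben@contoso.com");

table.Execute(TableOperation.Insert(entity));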

What are the error messages for the Table API?

Azure Table storage and the Azure Cosmos DB Table API use the same SDKs, so most of the errors are the same.

Why do I get throttled when I try to create a lot of tables one after another in the Table API?

Azure Cosmos DB is an SLA-based system that provides latency, throughput, availability, and consistency guarantees. Because it is a provisioned system, it reserves resources to guarantee these requirements. The rapid rate of creation of tables is detected and throttled. We recommend that you look at the rate of creation of tables and lower it to less than 5 per minute. Remember that the Table API is a provisioned system. The moment you provision it, you will begin to pay for it.

Develop against the Graph API

How can I apply the functionality of Graph API to Azure Cosmos DB?

You can use an extension library to apply the functionality of Graph API. This library is called Microsoft Azure Graphs, and it is available on NuGet.

It looks like you support the Gremlin graph traversal language. Do you plan to add more forms of query?

Yes, we plan to add other mechanisms for query in the future.

How can I use the new Graph API offering?

To get started, complete the Graph API quick-start article.

Develop with the Apache Cassandra API (preview)

What is the protocol version supported in the private preview? Is there a plan to support other protocols?

The Apache Cassandra API for Azure Cosmos DB today supports CQL version 4. If you have feedback about supporting other protocols, let us know via UserVoice feedback or send an email to askcosmosdbcassandra@microsoft.com.

Why is choosing a throughput for a table a requirement?

Azure Cosmos DB sets a default throughput for your container based on where you create the table from (the portal or CQL). Azure Cosmos DB provides guarantees for performance and latency, with upper bounds on operation. This guarantee is possible when the engine can enforce governance on the tenant's operations. Setting throughput ensures that you get the guaranteed throughput and latency, because the platform reserves this capacity and guarantees operation success. You can elastically change throughput to benefit from the seasonality of your application and save costs.

The throughput concept is explained in the Request Units in Azure Cosmos DB article. The throughput for a table is distributed across the underlying physical partitions equally.

What is the default RU/s of a table when created through CQL? What if I need to change it?

Azure Cosmos DB uses request units per second (RU/s) as a currency for providing throughput. Tables created through CQL are provisioned with 400 RU/s. You can change the RU/s from the portal.

CQL

CREATE TABLE keyspaceName.tablename (user_id int PRIMARY KEY, lastname text) WITH cosmosdb_provisioned_throughput=1200

.NET

using System.Collections.Generic;
using System.Text;
using Cassandra;

int provisionedThroughput = 400;
var simpleStatement = new SimpleStatement($"CREATE TABLE {keyspaceName}.{tableName} (user_id int PRIMARY KEY, lastname text)");
// Pass the desired RU/s to Azure Cosmos DB as a custom payload on the statement.
var outgoingPayload = new Dictionary<string, byte[]>();
outgoingPayload["cosmosdb_provisioned_throughput"] = Encoding.UTF8.GetBytes(provisionedThroughput.ToString());
simpleStatement.SetOutgoingPayload(outgoingPayload);
// Execute on an open ISession, for example: session.Execute(simpleStatement);

What happens when throughput is exceeded?

Azure Cosmos DB provides guarantees for performance and latency, with upper bounds on operation. This guarantee is possible when the engine can enforce governance on the tenant's operations. Setting the throughput ensures that you get the guaranteed throughput and latency, because the platform reserves this capacity and guarantees operation success. When you exceed this capacity, you get an overloaded error indicating that your capacity was exceeded: 0x1001 Overloaded: the request cannot be processed because "Request Rate is large". At that point, it is essential to see which operations, and their volumes, cause this issue. You can get an idea of the consumed capacity exceeding the provisioned capacity with metrics on the portal. Then you need to ensure that capacity is consumed nearly equally across all underlying partitions. If you see that most of the throughput is consumed by one partition, you have workload skew.

Metrics are available that show you how throughput is used over hours, days, and per seven days, across partitions or in aggregate. For more information, see Monitoring and debugging with metrics in Azure Cosmos DB.

Diagnostic logs are explained in the Azure Cosmos DB diagnostic logging article.

Does the primary key map to the partition key concept of Azure Cosmos DB?

Yes, the partition key is used to place the entity in the right location. In Azure Cosmos DB, it is used to find the right logical partition, which is stored on a physical partition. The partitioning concept is well explained in the Partition and scale in Azure Cosmos DB article. The essential takeaway here is that a logical partition should not exceed the 10-GB limit today.

What happens when I get a "quota full" notification indicating that a partition is full?

Azure Cosmos DB is an SLA-based system that provides unlimited scale, with guarantees for latency, throughput, availability, and consistency. Its Cassandra API, too, allows unlimited storage of data. This unlimited storage is based on horizontal scale-out of data, using partitioning as the key concept. The partitioning concept is well explained in the Partition and scale in Azure Cosmos DB article.

You should adhere to the 10-GB limit on the number of entities or items per logical partition. To ensure that your application scales well, we recommend that you not create a hot partition by storing all information in one partition and querying it. This error can come only if your data is skewed: that is, you have a lot of data for one partition key (more than 10 GB). You can find the distribution of data by using the storage portal. The way to fix this error is to re-create the table and choose a more granular primary key (partition key), which allows better distribution of data.

Is it possible to use the Cassandra API as a key/value store with millions or billions of individual partition keys?

Azure Cosmos DB can store unlimited data by scaling out the storage, independent of the throughput. Yes, you can always just use the Cassandra API to store and retrieve keys and values by specifying the right primary/partition key. These individual keys get their own logical partition and sit atop a physical partition without issues.

Is it possible to create multiple tables with Apache Cassandra API of Azure Cosmos DB?

Yes, it is possible to create multiple tables with the Apache Cassandra API. Each of those tables is treated as a unit for throughput and storage.

Is it possible to create multiple tables in succession?

Azure Cosmos DB is a resource-governed system for both data and control-plane activities. Containers, like collections and tables, are runtime entities that are provisioned for a given throughput capacity. The creation of these containers in quick succession is not an expected activity and is throttled. If you have tests that drop and create tables immediately, please try to space them out.

What is the maximum number of tables that can be created?

There is no physical limit on the number of tables. Please send an email to askcosmosdbcassandra@microsoft.com if you need to create a very large number of tables (where the total steady size exceeds 10 TB of data), rather than the usual tens or hundreds.

What is the maximum number of keyspaces that we can create?

There is no physical limit on the number of keyspaces, because they are metadata containers. Please send an email to askcosmosdbcassandra@microsoft.com if you have a very large number of keyspaces for some reason.

Is it possible to bring in a lot of data after starting from a normal table?

The storage capacity is automatically managed and increases as you push in more data. You can confidently import as much data as you need without managing and provisioning nodes.

Is it possible to supply YAML file settings to configure the behavior of the Apache Cassandra API of Azure Cosmos DB?

The Apache Cassandra API of Azure Cosmos DB is a platform service. It provides protocol-level compatibility for executing operations, and it hides away the complexity of management, monitoring, and configuration. As a developer/user, you do not need to worry about availability, tombstones, key cache, row cache, bloom filters, and a multitude of other settings. The Apache Cassandra API of Azure Cosmos DB focuses on providing the read and write performance that you require without the overhead of configuration and management.

Will the Apache Cassandra API for Azure Cosmos DB support node addition, cluster status, and node status commands?

The Apache Cassandra API is a platform service that makes capacity planning and responding to the elasticity demands for throughput and storage a breeze. With Azure Cosmos DB, you provision the throughput you need. Then you can scale it up and down any number of times through the day, without worrying about adding, deleting, or managing nodes. This means you do not need to use node and cluster management tools.

What happens with respect to the various config settings for keyspace creation, like simple/network?

Azure Cosmos DB provides global distribution out of the box for availability and low-latency reasons. You do not need to set up replicas or other settings. All writes are always durably quorum committed in any region where you write, while providing performance guarantees.

What happens with respect to the various settings for table metadata, like bloom filter, caching, read repair chance, gc_grace, compression, and memtable_flush_period?

Azure Cosmos DB provides performance for reads, writes, and throughput without the need to touch any of these configuration settings or risk accidentally manipulating them.

Is time-to-live (TTL) supported for Cassandra tables?

Yes, TTL is supported.
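
Standard CQL TTL syntax applies. A hedged sketch through the .NET driver, assuming session is an open ISession and the hypothetical keyspace/table names used here:

using Cassandra;

// Insert a row that expires after one day (86,400 seconds).
session.Execute(new SimpleStatement(
    "INSERT INTO mykeyspace.users (user_id, lastname) VALUES (1, 'smith') USING TTL 86400"));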

Previously, it was possible to monitor node status, replica status, gc, and OS parameters with various tools. What needs to be monitored now?

Azure Cosmos DB is a platform service that helps you increase productivity without worrying about managing and monitoring infrastructure. You just need to take care of throughput, which is surfaced in the portal metrics, to find whether you are getting throttled, and then increase or decrease that throughput. You can also monitor SLAs, use the metrics, and use the diagnostic logs.

Which client SDKs can work with Apache Cassandra API of Azure Cosmos DB?

In the private preview, Apache Cassandra SDK client drivers that use CQLv3 were used for client programs. If you use other drivers, or if you are facing issues, send mail to askcosmosdbcassandra@microsoft.com.

Is composite partition key supported?

Yes, you can use regular syntax to create a composite partition key.
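
For example, the usual CQL composite-key syntax works. A hedged sketch via the .NET driver; the keyspace, table, and column names are assumptions:

using Cassandra;

// (tenant_id, device_id) together form the composite partition key;
// event_time is a clustering column within each partition.
session.Execute(new SimpleStatement(
    "CREATE TABLE mykeyspace.events (" +
    "tenant_id int, device_id int, event_time timestamp, value double, " +
    "PRIMARY KEY ((tenant_id, device_id), event_time))"));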

Can I use sstable loader for data loading?

No, the sstable loader is not supported during the preview.

Can an on-premises Cassandra cluster be paired with Azure Cosmos DB's Apache Cassandra API?

At present, Azure Cosmos DB has an optimized experience for the cloud environment, without the overhead of operations. If you require pairing, please send mail to askcosmosdbcassandra@microsoft.com with a description of your scenario.

Does Cassandra API provide full backups?

Azure Cosmos DB provides two free full backups, taken at four-hour intervals, across all APIs. This ensures that you do not need to set up a backup schedule. If you want to modify retention and frequency, send an email to askcosmosdbcassandra@microsoft.com or raise a support case. Information about backup capability is provided in the Automatic online backup and restore with Azure Cosmos DB article.

How does the Cassandra API account handle failover if a region goes down?

The Azure Cosmos DB Cassandra API borrows from the globally distributed platform of Azure Cosmos DB. To ensure that your application can tolerate datacenter downtime, enable at least one more region for the account in the Azure Cosmos DB portal, as described in Developing with multi-region Azure Cosmos DB accounts. You can set the priority of the regions by using the portal.

You can add as many regions as you want for the account and control where it can fail over to by providing a failover priority. To use the database, you need to provide an application there too. When you do so, your customers will not experience downtime.

Does the Apache Cassandra API index all attributes of an entity by default?

Yes, all attributes of an entity are indexed by default by Azure Cosmos DB. For more information, see Azure Cosmos DB: Indexing policies. You get the benefits of guaranteed performance, with consistent indexing and durable, quorum-committed writes.

Does this mean I do not have to create multiple indexes to satisfy the queries?

Yes, Azure Cosmos DB provides automatic indexing of all attributes without any schema definition. This automation frees developers to focus on the application rather than on index creation and management. For more information, see Azure Cosmos DB: Indexing policies.

Can I use the new Cassandra API SDK locally with the emulator?

We plan to support this capability in the future.

Azure Cosmos DB as a platform seems to have a lot of capabilities, such as change feed and other functionality. Will these capabilities be added to the Cassandra API?

The Apache Cassandra API provides the same CQL functionality as Apache Cassandra. We plan to look into the feasibility of supporting various additional capabilities in the future.

Feature x of the regular Cassandra API is not working today. Where can feedback be provided?

Provide feedback via UserVoice.