Microsoft Azure Storage Performance and Scalability Checklist
Since the release of the Microsoft Azure Storage services, Microsoft has developed a number of proven practices for using these services in a performant manner, and this article serves to consolidate the most important of them into a checklist-style list. The intention of this article is to help application developers verify they are using proven practices with Azure Storage and to help them identify other proven practices they should consider adopting. This article does not attempt to cover every possible performance and scalability optimization — it excludes those that are small in their impact or not broadly applicable. To the extent that the application's behavior can be predicted during design, it's useful to keep these in mind early on to avoid designs that will run into performance problems.
Every application developer using Azure Storage should take the time to read this article, and check that his or her application follows each of the proven practices listed below.
This article organizes the proven practices into the following groups. Proven practices applicable to:
- All Azure Storage services (blobs, tables, queues, and files)
This section lists proven practices that are applicable to the use of any of the Azure Storage services (blobs, tables, queues, or files).
Azure Storage itself has a limit of 250 storage accounts per region per subscription. If you reach that limit, you will be unable to create any more storage accounts in that subscription/region combination.
Each of the Azure Storage services has scalability targets for capacity (GB), transaction rate, and bandwidth. If your application approaches or exceeds any of the scalability targets, it may encounter increased transaction latencies or throttling. When a Storage service throttles your application, the service begins to return "503 Server busy" or "500 Operation timeout" error codes for some storage transactions. This section discusses both the general approach to dealing with scalability targets and bandwidth scalability targets in particular. Later sections that deal with individual storage services discuss scalability targets in the context of that specific service:
Bandwidth Scalability Target for All Services
At the time of writing, the bandwidth targets in the US for a geo-redundant storage (GRS) account are 10 gigabits per second (Gbps) for ingress (data sent to the storage account) and 20 Gbps for egress (data sent from the storage account). For a locally redundant storage (LRS) account, the limits are higher – 20 Gbps for ingress and 30 Gbps for egress. International bandwidth limits may be lower and can be found on our scalability targets page. For more information on the storage redundancy options, see the links in Useful Resources below.
What to do when approaching a scalability target
If you're approaching the limit of storage accounts you can have in a particular subscription/region combination, evaluate your application and usage of storage accounts and determine if any of these conditions apply.
- Using storage accounts as unmanaged disks and adding those disks to your virtual machines. In this scenario, we recommend using managed disks, as they handle storage disk scalability for you without you having to create and manage individual storage accounts.
- Using one storage account on a per customer basis, for the purpose of data isolation. In this scenario, we recommend using storage containers for each customer rather than an entire storage account. Azure Storage now allows you to specify role-based access control on a per container basis.
- Using multiple storage accounts to shard for greater scalability of ingress/egress/iops/capacity. In this scenario, if possible, we recommend you take advantage of the increased limits of standard storage accounts to reduce the number of storage accounts required for your workload.
If your application is approaching the scalability targets for a single storage account, consider adopting one of the following approaches:
- Reconsider the workload that causes your application to approach or exceed the scalability target. Can you design it differently to use less bandwidth or capacity, or fewer transactions?
- If an application must exceed one of the scalability targets, you should create multiple storage accounts and partition your application data across those multiple storage accounts. If you use this pattern, then be sure to design your application so that you can add more storage accounts in the future for load balancing. At time of writing, each Azure subscription can have up to 100 storage accounts. Storage accounts also have no cost other than your usage in terms of data stored, transactions made, or data transferred.
- If your application hits the bandwidth targets, consider compressing data in the client to reduce the bandwidth required to send the data to the storage service. While this may save bandwidth and improve network performance, it can also have some negative impacts. You should evaluate the performance impact of this due to the additional processing requirements for compressing and decompressing data in the client. In addition, storing compressed data can make it more difficult to troubleshoot issues since it could be more difficult to view stored data using standard tools.
- If your application hits the scalability targets, then ensure that you are using an exponential backoff for retries (see Retries). It's better to make sure you never approach the scalability targets (by using one of the above methods), but this will ensure your application won't just keep retrying rapidly, making the throttling worse.
The following links provide additional detail on scalability targets:
- See Azure Storage Scalability and Performance Targets for information about scalability targets.
- See Azure Storage replication and the blog post Azure Storage Redundancy Options and Read Access Geo Redundant Storage for information about storage redundancy options.
- For current information about pricing for Azure services, see Azure pricing.
Partition Naming Convention
Azure Storage uses a range-based partitioning scheme to scale and load balance the system. The partition key is used to partition data into ranges and these ranges are load-balanced across the system. This means naming conventions such as lexical ordering (for example, msftpayroll, msftperformance, msftemployees, etc.) or using time-stamps (log20160101, log20160102, log20160102, etc.) will lend itself to the partitions being potentially colocated on the same partition server, until a load-balancing operation splits them out into smaller ranges. For example, all blobs within a container can be served by a single server until the load on these blobs requires further rebalancing of the partition ranges. Similarly, a group of lightly loaded accounts with their names arranged in lexical order may be served by a single server until the load on one or all of these accounts require them to be split across multiple partitions servers. Each load-balancing operation may impact the latency of storage calls during the operation. The system's ability to handle a sudden burst of traffic to a partition is limited by the scalability of a single partition server until the load-balancing operation kicks-in and rebalances the partition key range.
You can follow some best practices to reduce the frequency of such operations.
- Examine the naming convention you use for accounts, containers, blobs, tables, and queues, closely. Consider prefixing account names with a 3-digit hash using a hashing function that best suits your needs.
- If you organize your data using timestamps or numerical identifiers, you have to ensure you are not using an append-only (or prepend-only) traffic patterns. These patterns are not suitable for a range -based partitioning system, and could lead to all the traffic going to a single partition and limiting the system from effectively load balancing. For instance, if you have daily operations that use a blob object with a timestamp such as yyyymmdd, then all the traffic for that daily operation is directed to a single object, which is served by a single partition server. Look at whether the per blob limits and per partition limits meet your needs, and consider breaking this operation into multiple blobs if needed. Similarly, if you store time series data in your tables, all the traffic could be directed to the last part of the key namespace. If you must use timestamps or numerical IDs, prefix the ID with a 3-digit hash, or in the case of timestamps prefix the seconds part of the time such as ssyyyymmdd. If listing and querying operations are routinely performed, choose a hashing function that will limit your number of queries. In other cases, a random prefix may be sufficient.
- For additional information on the partitioning scheme used in Azure Storage, read the SOSP paper here.
While the API calls matter, often the physical network constraints of the application have a significant impact on performance. The following describe some of limitations users may encounter.
Client Network Capability
For bandwidth, the problem is often the capabilities of the client. For example, while a single storage account can handle 10 Gbps or more of ingress (see bandwidth scalability targets), the network speed in a "Small" Azure Worker Role instance is only capable of approximately 100 Mbps. Larger Azure instances have NICs with greater capacity, so you should consider using a larger instance or more VMs if you need higher network limits from a single machine. If you are accessing a Storage service from an on premises application, then the same rule applies: understand the network capabilities of the client device and the network connectivity to the Azure Storage location and either improve them as needed or design your application to work within their capabilities.
As with any network usage, be aware that network conditions resulting in errors and packet loss will slow effective throughput. Using WireShark or NetMon may help in diagnosing this issue.
In any distributed environment, placing the client near to the server delivers in the best performance. For accessing Azure Storage with the lowest latency, the best location for your client is within the same Azure region. For example, if you have an Azure Web Site that uses Azure Storage, you should locate them both within a single region (for example, US West or Asia Southeast). This reduces the latency and the cost — at the time of writing, bandwidth usage within a single region is free.
If your client applications are not hosted within Azure (such as mobile device apps or on premises enterprise services), then again placing the storage account in a region near to the devices that will access it, will generally reduce latency. If your clients are broadly distributed (for example, some in North America, and some in Europe), then you should consider using multiple storage accounts: one located in a North American region and one in a European region. This will help to reduce latency for users in both regions. This approach is easier to implement if the data the application stores is specific to individual users, and does not require replicating data between storage accounts. For broad content distribution, a CDN is recommended – see the next section for more details.
Sometimes, an application needs to serve the same content to many users (for example, a product demo video used in the home page of a website), located in either the same or multiple regions. In this scenario, you should use a Content Delivery Network (CDN) such as Azure CDN, and the CDN would use Azure storage as the origin of the data. Unlike an Azure Storage account that exists in a single region and that cannot deliver content with low latency to other regions, Azure CDN uses servers in multiple data centers around the world. Additionally, a CDN can typically support much higher egress limits than a single storage account.
For more information about Azure CDN, see Azure CDN.
Using SAS and CORS
Both of these technologies can help you avoid unnecessary load (and bottlenecks) on your web application.
For more information about SAS, see Shared Access Signatures, Part 1: Understanding the SAS Model.
For more information about CORS, see Cross-Origin Resource Sharing (CORS) Support for the Azure Storage Services.
In general, getting data from a service once is better than getting it twice. Consider the example of an MVC web application running in a web role that has already retrieved a 50-MB blob from the storage service to serve as content to a user. The application could then retrieve that same blob every time a user requests it, or it could cache it locally to disk and reuse the cached version for subsequent user requests. Furthermore, whenever a user requests the data, the application could issue GET with a conditional header for modification time, which would avoid getting the entire blob if it hasn't been modified. You can apply this same pattern to working with table entities.
In some cases, you may decide that your application can assume that the blob remains valid for a short period after retrieving it, and that during this period the application does not need to check if the blob was modified.
Configuration, lookup, and other data that are always used by the application are great candidates for caching.
For an example of how to get a blob's properties to discover the last modified date using .NET, see Set and Retrieve Properties and Metadata. For more information about conditional downloads, see Conditionally Refresh a Local Copy of a Blob.
Uploading Data in Batches
In some application scenarios, you can aggregate data locally, and then periodically upload it in a batch instead of uploading each piece of data immediately. For example, a web application might keep a log file of activities: the application could either upload details of every activity as it happens as a table entity (which requires many storage operations), or it could save activity details to a local log file, and then periodically upload all activity details as a delimited file to a blob. If each log entry is 1 KB in size, you can upload thousands in a single "Put Blob" transaction (you can upload a blob of up to 64 MB in size in a single transaction). If the local machine crashes prior to the upload, you will potentially lose some log data: the application developer must design for the possibility of client device or upload failures. If the activity data needs to be downloaded for timespans (not just single activity), then blobs are recommended over tables.
If using the .NET Framework, this section lists several quick configuration settings that you can use to make significant performance improvements. If using other languages, check to see if similar concepts apply in your chosen language.
Increase default connection limit
In .NET, the following code increases the default connection limit (which is usually 2 in a client environment or 10 in a server environment) to 100. Typically, you should set the value to approximately the number of threads used by your application.
ServicePointManager.DefaultConnectionLimit = 100; //(Or More)
You must set the connection limit before opening any connections.
For other programming languages, see that language's documentation to determine how to set the connection limit.
For more information, see the blog post Web Services: Concurrent Connections.
Increase ThreadPool Min Threads if using synchronous code with Async Tasks
This code will increase the thread pool min threads:
ThreadPool.SetMinThreads(100,100); //(Determine the right number for your application)
For more information, see ThreadPool.SetMinThreads Method.
Take advantage of .NET 4.5 Garbage Collection
Use .NET 4.5 or later for the client application to take advantage of performance improvements in server garbage collection.
For more information, see the article An Overview of Performance Improvements in .NET 4.5.
While parallelism can be great for performance, be careful about using unbounded parallelism (no limit on the number of threads and/or parallel requests) to upload or download data, using multiple workers to access multiple partitions (containers, queues, or table partitions) in the same storage account or to access multiple items in the same partition. If the parallelism is unbounded, your application can exceed the client device's capabilities or the storage account's scalability targets resulting in longer latencies and throttling.
Storage Client Libraries and Tools
Always use the latest Microsoft provided client libraries and tools. At the time of writing, there are client libraries available for .NET, Windows Phone, Windows Runtime, Java, and C++, as well as preview libraries for other languages. In addition, Microsoft has released PowerShell cmdlets and Azure CLI commands for working with Azure Storage. Microsoft actively develops these tools with performance in mind, keeps them up-to-date with the latest service versions, and ensures they handle many of the proven performance practices internally.
In some cases, the storage service may throttle your application or may simply be unable to serve the request due to some transient condition and return a "503 Server busy" message or "500 Timeout". This can happen if your application is approaching any of the scalability targets, or if the system is rebalancing your partitioned data to allow for higher throughput. The client application should typically retry the operation that causes such an error: attempting the same request later can succeed. However, if the storage service is throttling your application because it is exceeding scalability targets, or even if the service was unable to serve the request for some other reason, aggressive retries usually make the problem worse. For this reason, you should use an exponential back off (the client libraries default to this behavior). For example, your application may retry after 2 seconds, then 4 seconds, then 10 seconds, then 30 seconds, and then give up completely. This behavior results in your application significantly reducing its load on the service rather than exacerbating any problems.
Connectivity errors can be retried immediately, because they are not the result of throttling and are expected to be transient.
The client libraries are aware of which errors are retry-able and which are not. However, if you are writing your own code against the storage REST API, remember there are some errors that you should not retry: for example, a 400 (Bad Request) response indicates that the client application sent a request that could not be processed because it was not in an expected form. Resending this request will result the same response every time, so there is no point in retrying it. If you are writing your own code against the storage REST API, be aware of what the error codes mean and the proper way to retry (or not) for each of them.
For more information about storage error codes, see Status and Error Codes on the Microsoft Azure web site.
In addition to the proven practices for All Services described previously, the following proven practices apply specifically to the blob service.
Blob-Specific Scalability Targets
Multiple clients accessing a single object concurrently
If you have a large number of clients accessing a single object concurrently, you will need to consider per object and storage account scalability targets. The exact number of clients that can access a single object will vary depending on factors such as the number of clients requesting the object simultaneously, the size of the object, network conditions etc.
If the object can be distributed through a CDN such as images or videos served from a website, then you should use a CDN. See here.
In other scenarios such as scientific simulations where the data is confidential you have two options. The first is to stagger your workload's access such that the object is accessed over a period of time vs being accessed simultaneously. Alternatively, you can temporarily copy the object to multiple storage accounts thus increasing the total IOPS per object and across storage accounts. In limited testing, we found that around 25 VMs could simultaneously download a 100-GB blob in parallel (each VM was parallelizing the download using 32 threads). If you had 100 clients needing to access the object, first copy it to a second storage account and then have the first 50 VMs access the first blob and the second 50 VMs access the second blob. Results will vary depending on your applications behavior so you should test this during design.
Bandwidth and operations per Blob
You can read or write to a single blob at up to a maximum of 60 MB/second (this is approximately 480 Mbps, which exceeds the capabilities of many client-side networks (including the physical NIC on the client device). In addition, a single blob supports up to 500 requests per second. If you have multiple clients that need to read the same blob and you might exceed these limits, you should consider using a CDN for distributing the blob.
For more information about target throughput for blobs, see Azure Storage Scalability and Performance Targets.
Copying and Moving Blobs
The storage REST API version 2012-02-12 introduced the useful ability to copy blobs across accounts: a client application can instruct the storage service to copy a blob from another source (possibly in a different storage account), and then let the service perform the copy asynchronously. This can significantly reduce the bandwidth needed for the application when you are migrating data from other storage accounts because you do not need to download and upload the data.
One consideration, however, is that, when copying between storage accounts, there is no time guarantee on when the copy will complete. If your application needs to complete a blob copy quickly under your control, it may be better to copy the blob by downloading it to a VM and then uploading it to the destination. For full predictability in that situation, ensure that the copy is performed by a VM running in the same Azure region, or else network conditions may (and probably will) affect your copy performance. In addition, you can monitor the progress of an asynchronous copy programmatically.
Copies within the same storage account itself are generally completed quickly.
For more information, see Copy Blob.
The Azure Storage team has released a command-line tool "AzCopy" that is meant to help with bulk transferring many blobs to, from, and across storage accounts. This tool is optimized for this scenario, and can achieve high transfer rates. Its use is encouraged for bulk upload, download, and copy scenarios. To learn more about it and download it, see Transfer data with the AzCopy Command-Line Utility.
Azure Import/Export Service
For large volumes of data (more than 1 TB), the Azure Storage offers the Import/Export service, which allows for uploading and downloading from blob storage by shipping hard drives. You can put your data on a hard drive and send it to Microsoft for upload, or send a blank hard drive to Microsoft to download data. For more information, see Use the Microsoft Azure Import/Export Service to Transfer Data to Blob Storage. This can be much more efficient than uploading/downloading this volume of data over the network.
The blob service supports head requests, which can include metadata about the blob. For example, if your application needed the EXIF data out of a photo, it could retrieve the photo and extract it. To save bandwidth and improve performance, your application could store the EXIF data in the blob's metadata when the application uploaded the photo: you can then retrieve the EXIF data in metadata using only a HEAD request, saving significant bandwidth, and the processing time needed to extract the EXIF data each time the blob is read. This would be useful in scenarios where you only need the metadata, and not the full content of a blob. Only 8 KB of metadata can be stored per blob (the service will not accept a request to store more than that), so if the data does not fit in that size, you may not be able to use this approach.
For an example of how to get a blob's metadata using .NET, see Set and Retrieve Properties and Metadata.
To upload blobs fast, the first question to answer is: are you uploading one blob or many? Use the below guidance to determine the correct method to use depending on your scenario.
Uploading one large blob quickly
To upload a single large blob quickly, your client application should upload its blocks or pages in parallel (being mindful of the scalability targets for individual blobs and the storage account as a whole). The official Microsoft-provided RTM Storage Client libraries (.NET, Java) have the ability to do this. For each of the libraries, use the below specified object/property to set the level of concurrency:
- .NET: Set ParallelOperationThreadCount on a BlobRequestOptions object to be used.
- Java/Android: Use BlobRequestOptions.setConcurrentRequestCount()
- Node.js: Use parallelOperationThreadCount on either the request options or on the blob service.
- C++: Use the blob_request_options::set_parallelism_factor method.
Uploading many blobs quickly
To upload many blobs quickly, upload blobs in parallel. This is faster than uploading single blobs at a time with parallel block uploads because it spreads the upload across multiple partitions of the storage service. A single blob only supports a throughput of 60 MB/second (approximately 480 Mbps). At the time of writing, a US-based LRS account supports up to 20-Gbps ingress, which is far more than the throughput supported by an individual blob. AzCopy performs uploads in parallel by default, and is recommended for this scenario.
Choosing the correct type of blob
Azure Storage supports two types of blob: page blobs and block blobs. For a given usage scenario, your choice of blob type will affect the performance and scalability of your solution. Block blobs are appropriate when you want to upload large amounts of data efficiently: for example, a client application may need to upload photos or video to blob storage. Page blobs are appropriate if the application needs to perform random writes on the data: for example, Azure VHDs are stored as page blobs.
For more information, see Understanding Block Blobs, Append Blobs, and Page Blobs.
In addition to the proven practices for All Services described previously, the following proven practices apply specifically to the table service.
Table-Specific Scalability Targets
In addition to the bandwidth limitations of an entire storage account, tables have the following specific scalability limit. The system will load balance as your traffic increases, but if your traffic has sudden bursts, you may not be able to get this volume of throughput immediately. If your pattern has bursts, you should expect to see throttling and/or timeouts during the burst as the storage service automatically load balances out your table. Ramping up slowly generally has better results as it gives the system time to load balance appropriately.
Entities per Second (Account)
The scalability limit for accessing tables is up to 20,000 entities (1 KB each) per second for an account. In general, each entity that is inserted, updated, deleted, or scanned counts toward this target. So a batch insert that contains 100 entities would count as 100 entities. A query that scans 1000 entities and returns 5 would count as 1000 entities.
Entities per Second (Partition)
Within a single partition, the scalability target for accessing tables is 2,000 entities (1 KB each) per second, using the same counting as described in the previous section.
This section lists several quick configuration settings that you can use to make significant performance improvements in the table service:
Beginning with storage service version 2013-08-15, the table service supports using JSON instead of the XML-based AtomPub format for transferring table data. This can reduce payload sizes by as much as 75% and can significantly improve the performance of your application.
Nagle's algorithm is widely implemented across TCP/IP networks as a means to improve network performance. However, it is not optimal in all circumstances (such as highly interactive environments). For Azure Storage, Nagle's algorithm has a negative impact on the performance of requests to the table and queue services, and you should disable it if possible.
How you represent and query your data is the biggest single factor that affects the performance of the table service. While every application is different, this section outlines some general proven practices that relate to:
- Table design
- Efficient queries
- Efficient data updates
Tables and partitions
Tables are divided into partitions. Every entity stored in a partition shares the same partition key and has a unique row key to identify it within that partition. Partitions provide benefits but also introduce scalability limits.
- Benefits: You can update entities in the same partition in a single, atomic, batch transaction that contains up to 100 separate storage operations (limit of 4 MB total size). Assuming the same number of entities to be retrieved, you can also query data within a single partition more efficiently than data that spans partitions (though read on for further recommendations on querying table data).
- Scalability limit: Access to entities stored in a single partition cannot be load-balanced because partitions support atomic batch transactions. For this reason, the scalability target for an individual table partition is lower than for the table service as a whole.
Because of these characteristics of tables and partitions, you should adopt the following design principles:
- Data that your client application frequently updated or queried in the same logical unit of work should be located in the same partition. This may be because your application is aggregating writes, or because you want to take advantage of atomic batch operations. Also, data in a single partition can be more efficiently queried in a single query than data across partitions.
- Data that your client application does not insert/update or query in the same logical unit of work (single query or batch update) should be located in separate partitions. One important note is that there is no limit to the number of partition keys in a single table, so having millions of partition keys is not a problem and will not impact performance. For example, if your application is a popular website with user login, using the User ID as the partition key could be a good choice.
A hot partition is one that is receiving a disproportionate percentage of the traffic to an account, and cannot be load balanced because it is a single partition. In general, hot partitions are created one of two ways:
Append Only and Prepend Only patterns
The "Append Only" pattern is one where all (or nearly all) of the traffic to a given PK increases and decreases according to the current time. An example is if your application used the current date as a partition key for log data. This results in all of the inserts going to the last partition in your table, and the system cannot load balance because all of the writes are going to the end of your table. If the volume of traffic to that partition exceeds the partition-level scalability target, then it will result in throttling. It's better to ensure that traffic is sent to multiple partitions, to enable load balance the requests across your table.
If your partitioning scheme results in a single partition that just has data that is far more used than other partitions, you may also see throttling as that partition approaches the scalability target for a single partition. It's better to make sure that your partition scheme results in no single partition approaching the scalability targets.
This section describes proven practices for querying the table service.
There are several ways to specify the range of entities to query. The following is a discussion of the uses of each.
In general, avoid scans (queries larger than a single entity), but if you must scan, try to organize your data so that your scans retrieve the data you need without scanning or returning significant amounts of entities you don't need.
A point query retrieves exactly one entity. It does this by specifying both the partition key and row key of the entity to retrieve. These queries are efficient, and you should use them wherever possible.
A partition query is a query that retrieves a set of data that shares a common partition key. Typically, the query specifies a range of row key values or a range of values for some entity property in addition to a partition key. These are less efficient than point queries, and should be used sparingly.
A table query is a query that retrieves a set of entities that does not share a common partition key. These queries are not efficient and you should avoid them if possible.
Another key factor in query efficiency is the number of entities returned as compared to the number of entities scanned to find the returned set. If your application performs a table query with a filter for a property value that only 1% of the data shares, the query will scan 100 entities for every one entity it returns. The table scalability targets discussed previously all relate to the number of entities scanned, and not the number of entities returned: a low query density can easily cause the table service to throttle your application because it must scan so many entities to retrieve the entity you are looking for. See the section below on denormalization for more information on how to avoid this.
Limiting the Amount of Data Returned
Where you know that a query will return entities that you don't need in the client application, consider using a filter to reduce the size of the returned set. While the entities not returned to the client still count toward the scalability limits, your application performance will improve because of the reduced network payload size and the reduced number of entities that your client application must process. See above note on Query Density, however – the scalability targets relate to the number of entities scanned, so a query that filters out many entities may still result in throttling, even if few entities are returned.
If your client application needs only a limited set of properties from the entities in your table, you can use projection to limit the size of the returned data set. As with filtering, this helps to reduce network load and client processing.
Unlike working with relational databases, the proven practices for efficiently querying table data lead to denormalizing your data. That is, duplicating the same data in multiple entities (one for each key you may use to find the data) to minimize the number of entities that a query must scan to find the data the client needs, rather than having to scan large numbers of entities to find the data your application needs. For example, in an e-commerce website, you may want to find an order both by the customer ID (give me this customer's orders) and by the date (give me orders on a date). In Table Storage, it is best to store the entity (or a reference to it) twice – once with Table Name, PK, and RK to facilitate finding by customer ID, once to facilitate finding it by the date.
This section describes proven practices for modifying entities stored in the table service.
Batch transactions are known as Entity Group Transactions (ETG) in Azure Storage; all the operations within an ETG must be on a single partition in a single table. Where possible, use ETGs to perform inserts, updates, and deletes in batches. This reduces the number of round trips from your client application to the server, reduces the number of billable transactions (an ETG counts as a single transaction for billing purposes and can contain up to 100 storage operations), and enables atomic updates (all operations succeed or all fail within an ETG). Environments with high latencies such as mobile devices will benefit greatly from using ETGs.
Use table Upsert operations wherever possible. There are two types of Upsert, both of which can be more efficient than a traditional Insert and Update operations:
- InsertOrMerge: Use this when you want to upload a subset of the entity's properties, but aren't sure whether the entity already exists. If the entity exists, this call updates the properties included in the Upsert operation, and leaves all existing properties as they are, if the entity does not exist, it inserts the new entity. This is similar to using projection in a query, in that you only need to upload the properties that are changing.
- InsertOrReplace: Use this when you want to upload an entirely new entity, but you aren't sure whether it already exists. You should only use this when you know that the newly uploaded entity is entirely correct because it completely overwrites the old entity. For example, you want to update the entity that stores a user's current location regardless of whether or not the application has previously stored location data for the user; the new location entity is complete, and you do not need any information from any previous entity.
Storing Data Series in a Single Entity
Sometimes, an application stores a series of data that it frequently needs to retrieve all at once: for example, an application might track CPU usage over time in order to plot a rolling chart of the data from the last 24 hours. One approach is to have one table entity per hour, with each entity representing a specific hour and storing the CPU usage for that hour. To plot this data, the application needs to retrieve the entities holding the data from the 24 most recent hours.
Alternatively, your application could store the CPU usage for each hour as a separate property of a single entity: to update each hour, your application can use a single InsertOrMerge Upsert call to update the value for the most recent hour. To plot the data, the application only needs to retrieve a single entity instead of 24, making for an efficient query (see above discussion on query scope).
Storing structured data in blobs
Sometimes structured data feels like it should go in tables, but ranges of entities are always retrieved together and can be batch inserted. A good example of this is a log file. In this case, you can batch several minutes of logs, insert them, and then you are always retrieving several minutes of logs at a time as well. In this case, for performance, it's better to use blobs instead of tables, since you can significantly reduce the number of objects written/returned, as well as usually the number of requests that need made.
In addition to the proven practices for All Services described previously, the following proven practices apply specifically to the queue service.
A single queue can process approximately 2,000 messages (1 KB each) per second (each AddMessage, GetMessage, and DeleteMessage count as a message here). If this is insufficient for your application, you should use multiple queues and spread the messages across them.
View current scalability targets at Azure Storage Scalability and Performance Targets.
See the section on table configuration that discusses the Nagle algorithm — the Nagle algorithm is generally bad for the performance of queue requests, and you should disable it.
Queue performance and scalability decrease as message size increases. You should place only the information the receiver needs in a message.
You can retrieve up to 32 messages from a queue in a single operation. This can reduce the number of roundtrips from the client application, which is especially useful for environments, such as mobile devices, with high latency.
Queue Polling Interval
Most applications poll for messages from a queue, which can be one of the largest sources of transactions for that application. Select your polling interval wisely: polling too frequently could cause your application to approach the scalability targets for the queue. However, at 200,000 transactions for $0.01 (at the time of writing), a single processor polling once every second for a month would cost less than 15 cents so cost is not typically a factor that affects your choice of polling interval.
For up-to-date cost information, see Azure Storage Pricing.
You can use UpdateMessage to increase the invisibility timeout or to update state information of a message. While this is powerful, remember that each UpdateMessage operation counts towards the scalability target. However, this can be a much more efficient approach than having a workflow that passes a job from one queue to the next, as each step of the job is completed. Using the UpdateMessage operation allows your application to save the job state to the message and then continue working, instead of requeuing the message for the next step of the job every time a step completes.
For more information, see the article How to: Change the contents of a queued message.
You should use queues to make your application architecture scalable. The following lists some ways you can use queues to make your application more scalable:
- You can use queues to create backlogs of work for processing and smooth out workloads in your application. For example, you could queue up requests from users to perform processor intensive work such as resizing uploaded images.
- You can use queues to decouple parts of your application so that you can scale them independently. For example, a web front end could place survey results from users into a queue for later analysis and storage. You could add more worker role instances to process the queue data as required.
This article discussed some of the most common, proven practices for optimizing performance when using Azure Storage. We encourage every application developer to assess their application against each of the above practices and consider acting on the recommendations to get great performance for their applications that use Azure Storage.
Send feedback about: