Microsoft Azure Storage is a Microsoft-managed cloud service that provides storage that is highly available, secure, durable, scalable, and redundant. Microsoft takes care of maintenance and handles critical problems for you.
Azure Storage consists of three data services: Blob storage, File storage, and Queue storage. Blob storage supports both standard and premium storage, with premium storage using only SSDs for the fastest performance possible. Another feature is cool storage, which lets you store large amounts of rarely accessed data at a lower cost.
In this article, you learn about the following:
- the Azure Storage services
- the types of storage accounts
- accessing your blobs, queues, and files
- transferring data into or out of storage
- the many storage client libraries available.
Introducing the Azure Storage services
To use any of the services provided by Azure Storage -- Blob storage, File storage, and Queue storage -- you first create a storage account, and then you can transfer data to/from a specific service in that storage account.
Blobs are basically files like those that you store on your computer (or tablet, mobile device, and so on). They can be pictures, Microsoft Excel files, HTML files, virtual hard disks (VHDs), big data such as logs, database backups -- pretty much anything. Blobs are stored in containers, which are similar to folders.
After storing files in Blob storage, you can access them from anywhere in the world using URLs, the REST interface, or one of the Azure SDK storage client libraries. Storage client libraries are available for multiple languages, including Node.js, Java, PHP, Ruby, Python, and .NET.
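As a rough illustration of URL-based access, the sketch below builds the address of a blob from its account, container, and blob names. It assumes the default public-cloud endpoint pattern `https://<account>.blob.core.windows.net/<container>/<blob>`; the account, container, and blob names are hypothetical, and accounts with custom domains use different hosts.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the URL of a blob, assuming the default public-cloud endpoint."""
    # Percent-encode the blob name but keep "/" so virtual folders stay readable.
    encoded_name = quote(blob_name, safe="/")
    return f"https://{account}.blob.core.windows.net/{container}/{encoded_name}"


# Hypothetical names, for illustration only.
print(blob_url("mystorageaccount", "photos", "2017/summer trip.jpg"))
```

A URL like this is readable without credentials only if the container or blob is public; otherwise a SAS token or an authenticated request is needed.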
There are three types of blobs -- block blobs, append blobs, and page blobs (used for VHD files).
- Block blobs are used to hold ordinary files up to about 4.7 TB.
- Page blobs are used to hold random access files up to 8 TB in size. These are used for the VHD files that back VMs.
- Append blobs are made up of blocks like the block blobs, but are optimized for append operations. These are used for things like logging information to the same blob from multiple VMs.
For very large datasets where network constraints make uploading or downloading data to Blob storage over the wire unrealistic, you can ship a set of hard drives to Microsoft to import or export data directly from the data center. See Use the Microsoft Azure Import/Export Service to Transfer Data to Blob Storage.
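To judge when shipping drives becomes the better option, a back-of-the-envelope estimate of upload time can help. The sketch below is only arithmetic; the 80% link efficiency and the example dataset and link speed are assumptions, not figures from the service.

```python
def upload_days(size_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the days needed to move size_tb terabytes over a link_mbps link.

    efficiency is an assumed fraction of the link that the transfer can use.
    """
    size_bits = size_tb * 1e12 * 8             # decimal terabytes to bits
    usable_bps = link_mbps * 1e6 * efficiency  # megabits per second to usable bits per second
    seconds = size_bits / usable_bps
    return seconds / 86_400                    # seconds per day


# Hypothetical scenario: 100 TB over a 100 Mbps connection.
print(f"{upload_days(100, 100):.1f} days")  # 115.7 days
```

At nearly four months of continuous uploading, mailing hard drives is likely to be faster in a scenario like this one.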
The Azure Files service enables you to set up highly available network file shares that can be accessed by using the standard Server Message Block (SMB) protocol. That means that multiple VMs can share the same files with both read and write access. You can also read the files using the REST interface or the storage client libraries.
One thing that distinguishes Azure File storage from files on a corporate file share is that you can access the files from anywhere in the world using a URL that points to the file and includes a shared access signature (SAS) token. You can generate SAS tokens; they allow specific access to a private asset for a specific amount of time.
File shares can be used for many common scenarios:
- Many on-premises applications use file shares. This feature makes it easier to migrate those applications that share data to Azure. If you mount the file share to the same drive letter that the on-premises application uses, the part of your application that accesses the file share should work with minimal, if any, changes.
- Configuration files can be stored on a file share and accessed from multiple VMs. Tools and utilities used by multiple developers in a group can be stored on a file share, ensuring that everybody can find them, and that they use the same version.
- Diagnostic logs, metrics, and crash dumps are just three examples of data that can be written to a file share and processed or analyzed later.
At this time, Active Directory-based authentication and access control lists (ACLs) are not supported, but they will be at some time in the future. The storage account credentials are used to provide authentication for access to the file share. This means anybody with the share mounted will have full read/write access to the share.
The Azure Queue service is used to store and retrieve messages. Queue messages can be up to 64 KB in size, and a queue can contain millions of messages. Queues are generally used to store lists of messages to be processed asynchronously.
For example, say you want your customers to be able to upload pictures, and you want to create thumbnails for each picture. You could have your customers wait while you create the thumbnails during the upload. An alternative would be to use a queue. When a customer finishes an upload, write a message to the queue. Then have an Azure Function retrieve the message from the queue and create the thumbnails. Each part of this processing can be scaled separately, giving you more control when tuning it for your usage.
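The shape of that workflow can be sketched with an in-memory queue. This is not the Azure Queue service or its client library: Python's `queue.Queue` stands in for the real queue, and the function names and the `thumbnails/` prefix are hypothetical.

```python
import queue

MAX_MESSAGE_BYTES = 64 * 1024  # the 64 KB message limit mentioned above

# In-memory stand-in for an Azure queue; not the real service or SDK.
work_queue: "queue.Queue[str]" = queue.Queue()


def enqueue_upload(blob_name: str) -> None:
    """Called when an upload finishes: record the work instead of doing it."""
    if len(blob_name.encode("utf-8")) > MAX_MESSAGE_BYTES:
        raise ValueError("message exceeds the 64 KB queue message limit")
    work_queue.put(blob_name)


def process_pending() -> list[str]:
    """Worker side: drain the queue and 'create' one thumbnail per message."""
    thumbnails = []
    while not work_queue.empty():
        blob_name = work_queue.get()
        thumbnails.append(f"thumbnails/{blob_name}")  # placeholder for real resizing
        work_queue.task_done()
    return thumbnails


enqueue_upload("cat.jpg")
enqueue_upload("dog.jpg")
print(process_pending())  # ['thumbnails/cat.jpg', 'thumbnails/dog.jpg']
```

Because the uploader only writes a short message, it returns immediately, and the number of workers draining the queue can be changed without touching the upload path.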
Standard Azure Table Storage is now part of Cosmos DB. Also available is Premium Tables for Azure Table storage, offering throughput-optimized tables, global distribution, and automatic secondary indexes. To learn more and try out the new premium experience, please check out Azure Cosmos DB: Table API.
The Azure Storage team also owns Disks, which includes all of the managed and unmanaged disk capabilities used by virtual machines. For more information about these features, please see the Compute Service documentation.
Types of storage accounts
This table shows the various kinds of storage accounts and which objects can be used with each.
|Type of storage account|General-purpose Standard|General-purpose Premium|Blob storage, hot and cool access tiers|
|---|---|---|---|
|Services supported|Blob, File, Queue Services|Blob Service|Blob Service|
|Types of blobs supported|Block blobs, page blobs, and append blobs|Page blobs|Block blobs and append blobs|
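If you need to check these combinations in code, for example before choosing an account for a deployment script, the table can be transcribed into a small lookup. The dictionary keys and the `can_store` helper below are illustrative names, not part of any Azure library.

```python
# Blob types each kind of storage account accepts, transcribed from the table above.
SUPPORTED_BLOB_TYPES = {
    "general-purpose standard": {"block", "page", "append"},
    "general-purpose premium": {"page"},
    "blob storage": {"block", "append"},
}


def can_store(account_kind: str, blob_type: str) -> bool:
    """Return True if the given account kind accepts the given blob type."""
    return blob_type in SUPPORTED_BLOB_TYPES[account_kind.lower()]


# VHD files are page blobs, so a Blob storage account cannot hold them.
print(can_store("Blob storage", "page"))             # False
print(can_store("General-purpose Premium", "page"))  # True
```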
General-purpose storage accounts
There are two kinds of general-purpose storage accounts.
The most widely used storage accounts are standard storage accounts, which can be used for all types of data. Standard storage accounts use magnetic media to store data.
Premium storage provides high-performance storage for page blobs, which are primarily used for VHD files. Premium storage accounts use SSD to store data. Microsoft recommends using Premium Storage for all of your VMs.
Blob Storage accounts
The Blob Storage account is a specialized storage account used to store block blobs and append blobs. You can't store page blobs in these accounts, therefore you can't store VHD files. These accounts allow you to set an access tier to Hot or Cool; the tier can be changed at any time.
The hot access tier is used for files that are accessed frequently -- you pay a higher cost for storage, but the cost of accessing the blobs is much lower. For blobs stored in the cool access tier, you pay a higher cost for accessing the blobs, but the cost of storage is much lower.
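The trade-off can be made concrete with a small calculation. The prices below are invented solely to show how the comparison works and are not Azure's actual rates; see the pricing page for real figures.

```python
# Hypothetical prices per GB per month, chosen only to show the trade-off;
# they are not Azure's actual rates.
PRICES = {
    "hot": {"storage": 0.02, "access": 0.001},
    "cool": {"storage": 0.01, "access": 0.01},
}


def monthly_cost(tier: str, stored_gb: float, read_gb: float) -> float:
    """Cost of keeping stored_gb and reading read_gb in one month."""
    price = PRICES[tier]
    return stored_gb * price["storage"] + read_gb * price["access"]


def cheaper_tier(stored_gb: float, read_gb: float) -> str:
    """Return the tier with the lower monthly cost under the assumed prices."""
    return min(PRICES, key=lambda tier: monthly_cost(tier, stored_gb, read_gb))


print(cheaper_tier(1000, 100))   # rarely read archive -> cool
print(cheaper_tier(1000, 5000))  # frequently read content -> hot
```

With these assumed numbers, the cool tier wins as long as the data read each month stays small relative to the data stored.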
Accessing your blobs, files, and queues
Each storage account has two authentication keys, either of which can be used for any operation. There are two keys so you can roll over the keys occasionally to enhance security. It is critical that these keys be kept secure because their possession, along with the account name, allows unlimited access to all data in the storage account.
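The sketch below shows why having two keys makes rollover possible. It assumes the Shared Key style of request signing, where a signature is the Base64-encoded HMAC-SHA256 of a string-to-sign, keyed with the Base64-decoded account key. The keys here are made up, and the string-to-sign is a placeholder; the real format is defined by the service and is not reproduced here.

```python
import base64
import hashlib
import hmac


def sign(account_key_b64: str, string_to_sign: str) -> str:
    """Return Base64(HMAC-SHA256(string_to_sign)) keyed with the decoded account key."""
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid(signature: str, string_to_sign: str, active_keys: list[str]) -> bool:
    """Simplified service-side check: accept a signature made with any active key."""
    return any(
        hmac.compare_digest(signature, sign(key, string_to_sign))
        for key in active_keys
    )


# Made-up keys and a placeholder string-to-sign, for illustration only.
key1 = base64.b64encode(b"first-demo-key").decode("ascii")
key2 = base64.b64encode(b"second-demo-key").decode("ascii")
request = "GET\n/mystorageaccount/photos/cat.jpg"

signature = sign(key2, request)  # clients have already switched to key2
new_key1 = base64.b64encode(b"regenerated-key").decode("ascii")
print(is_valid(signature, request, [new_key1, key2]))  # True
```

Moving clients to the second key before regenerating the first means requests keep working throughout the rotation.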
This section looks at two ways to secure the storage account and its data. For detailed information about securing your storage account and your data, see the Azure Storage security guide.
Securing access to storage accounts using Azure AD
One way to secure access to your storage data is by controlling access to the storage account keys. With Resource Manager Role-Based Access Control (RBAC), you can assign roles to users, groups, or applications. These roles are tied to a specific set of actions that are allowed or disallowed. Using RBAC to grant access to a storage account only handles the management operations for that storage account, such as changing the access tier. You can't use RBAC to grant access to data objects like a specific container or file share. You can, however, use RBAC to grant access to the storage account keys, which can then be used to read the data objects.
Securing access using shared access signatures
You can use shared access signatures and stored access policies to secure your data objects. A shared access signature (SAS) is a string containing a security token that can be attached to the URI for an asset. It lets you delegate access to specific storage objects and specify constraints such as permissions and the date/time range of access. This feature has extensive capabilities. For detailed information, refer to Using Shared Access Signatures (SAS).
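To show how those constraints travel with the URI, the sketch below reads the permission (`sp`) and expiry (`se`) query parameters from a SAS URL. The URL and its signature are placeholders, and the code assumes the expiry is a full UTC timestamp; it inspects a token but does not create or validate one.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def describe_sas(url: str, now: datetime) -> dict:
    """Summarize the permissions and expiry carried in a SAS URL's query string."""
    params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    # Assumes the expiry uses the full UTC timestamp form.
    expiry = datetime.strptime(params["se"], "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    permissions = params.get("sp", "")
    return {
        "can_read": "r" in permissions,
        "can_write": "w" in permissions,
        "expired": now >= expiry,
    }


# Hypothetical SAS URL; the signature value is a placeholder, not a real token.
sas_url = (
    "https://mystorageaccount.blob.core.windows.net/photos/cat.jpg"
    "?sv=2017-04-17&sr=b&sp=r&se=2030-01-01T00:00:00Z&sig=PLACEHOLDER"
)
print(describe_sas(sas_url, datetime(2029, 6, 1, tzinfo=timezone.utc)))
```

The service enforces these constraints itself by checking the signature; client-side inspection like this is only useful for diagnostics, such as warning before a token expires.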
Public access to blobs
The Blob Service allows you to provide public access to a container and its blobs, or a specific blob. When you indicate that a container or blob is public, anyone can read it anonymously; no authentication is required. An example of when you would want to do this is when you have a website that is using images, video, or documents from Blob storage. For more information, see Manage anonymous read access to containers and blobs.
There are a couple of basic kinds of encryption available for the Storage services.
Encryption at rest
You can enable Storage Service Encryption (SSE) on either the Files service (preview) or the Blob service for an Azure storage account. If enabled, all data written to the specific service is encrypted before it is written. When you read the data, it is decrypted before it is returned.
The storage client libraries have methods you can call to programmatically encrypt data before sending it across the wire from the client to Azure. It is stored encrypted, which means it also is encrypted at rest. When reading the data back, you decrypt the information after receiving it.
Encryption in transit with Azure File Shares
See Using Shared Access Signatures (SAS) for more information on shared access signatures. See Manage anonymous read access to containers and blobs and Authentication for the Azure Storage Services for more information on secure access to your storage account.
For more details about securing your storage account and encryption, see the Azure Storage security guide.
To ensure that your data is durable, Azure Storage can keep (and manage) multiple copies of your data. This is called replication, or sometimes redundancy. When you set up your storage account, you select a replication type. In most cases, this setting can be modified after the storage account is set up.
All storage accounts have locally redundant storage (LRS). This means three copies of your data are managed by Azure Storage in the data center specified when the storage account was set up. When changes are committed to one copy, the other two copies are updated before returning success. This means the three replicas are always in sync. Also, the three copies reside in separate fault domains and upgrade domains, which means your data is available even if a storage node holding your data fails or is taken offline to be updated.
Locally redundant storage (LRS)
As explained above, with LRS you have three copies of your data in a single datacenter. This handles the problem of data becoming unavailable if a storage node fails or is taken offline to be updated, but not the case of an entire datacenter becoming unavailable.
Zone redundant storage (ZRS)
Zone-redundant storage (ZRS) maintains the three local copies of your data as well as another set of three copies of your data. The second set of three copies is replicated asynchronously across datacenters within one or two regions. Note that ZRS is only available for block blobs in general-purpose storage accounts. Also, once you have created your storage account and selected ZRS, you cannot convert it to use any other type of replication, or vice versa.
ZRS accounts provide higher durability than LRS, but ZRS accounts do not have metrics or logging capability.
Geo-redundant storage (GRS)
Geo-redundant storage (GRS) maintains the three local copies of your data in a primary region plus another set of three copies of your data in a secondary region hundreds of miles away from the primary region. In the event of a failure at the primary region, Azure Storage will fail over to the secondary region.
Read-access geo-redundant storage (RA-GRS)
Read-access geo-redundant storage is exactly like GRS except that you get read access to the data in the secondary location. If the primary data center becomes unavailable temporarily, you can continue to read the data from the secondary location. This can be very helpful. For example, you could have a web application that changes into read-only mode and points to the secondary copy, allowing some access even though updates are not available.
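A read path with this kind of fallback might look like the sketch below. It assumes the secondary endpoint follows the `<account>-secondary.blob.core.windows.net` naming pattern; the account name, path, and `fake_fetch` function are hypothetical, and `fake_fetch` simulates an outage rather than making a real HTTP request.

```python
from typing import Callable


def read_with_fallback(
    account: str, path: str, fetch: Callable[[str], bytes]
) -> tuple[bytes, str]:
    """Try the primary endpoint first, then the read-only secondary."""
    primary = f"https://{account}.blob.core.windows.net/{path}"
    secondary = f"https://{account}-secondary.blob.core.windows.net/{path}"
    try:
        return fetch(primary), "primary"
    except OSError:
        # Primary unreachable: read the copy in the secondary region instead.
        return fetch(secondary), "secondary"


def fake_fetch(url: str) -> bytes:
    """Stand-in for an HTTP GET that simulates a primary-region outage."""
    if "-secondary." not in url:
        raise ConnectionError("primary unavailable")
    return b"cached catalog"


print(read_with_fallback("mystorageaccount", "data/catalog.json", fake_fetch))
```

Returning which endpoint served the request lets the application switch its user interface into read-only mode when the data came from the secondary.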
You can change how your data is replicated after your storage account has been created, unless you specified ZRS when you created the account. However, note that you may incur an additional one-time data transfer cost if you switch from LRS to GRS or RA-GRS.
For more information about replication, see Azure Storage replication.
For disaster recovery information, see What to do if an Azure Storage outage occurs.
For an example of how to leverage RA-GRS storage to ensure high availability, see Designing Highly Available Applications using RA-GRS.
Transferring data to and from Azure Storage
You can use the AzCopy command-line utility to copy blob and file data within your storage account or across storage accounts. See one of the following articles for help:
AzCopy is built on top of the Azure Data Movement Library, which is currently available in preview.
The Azure Import/Export service can be used to import or export large amounts of blob data to or from your storage account. You prepare and mail multiple hard drives to an Azure data center, where they will transfer the data to/from the hard drives and send the hard drives back to you. For more information about the Import/Export service, see Use the Microsoft Azure Import/Export Service to Transfer Data to Blob Storage.
For detailed information about pricing for Azure Storage, see the Pricing page.
Storage APIs, libraries, and tools
Azure Storage resources can be accessed by any language that can make HTTP/HTTPS requests. Additionally, Azure Storage offers programming libraries for several popular languages. These libraries simplify many aspects of working with Azure Storage by handling details such as synchronous and asynchronous invocation, batching of operations, exception management, automatic retries, operational behavior, and so forth. Libraries are currently available for the following languages and platforms, with others in the pipeline:
Azure Storage data services
- Storage Services REST API
- Storage Client Library for .NET
- Storage Client Library for C++
- Storage Client Library for Java/Android
- Storage Client Library for Node.js
- Storage Client Library for PHP
- Storage Client Library for Python
- Storage Client Library for Ruby
- Storage Cmdlets for PowerShell
- Storage Commands for CLI 2.0
For .NET developers
- Get started with Azure Blob storage using .NET
- Get started with Azure Table storage using .NET
- Get started with Azure Queue storage using .NET
- Get started with Azure File storage on Windows
For Java/Android developers
- How to use Blob storage from Java
- How to use Table storage from Java
- How to use Queue storage from Java
- How to use File storage from Java
For Node.js developers
- How to use Blob storage from Node.js
- How to use Table storage from Node.js
- How to use Queue storage from Node.js
For PHP developers
- How to use Blob storage from PHP
- How to use Table storage from PHP
- How to use Queue storage from PHP
For Ruby developers
- How to use Blob storage from Ruby
- How to use Table storage from Ruby
- How to use Queue storage from Ruby