Planning for an Azure Files deployment

Azure Files offers fully managed file shares in the cloud that are accessible via the industry standard SMB protocol. Because Azure Files is fully managed, deploying it in production scenarios is much easier than deploying and managing a file server or NAS device. This article addresses the topics to consider when deploying an Azure File share for production use within your organization.

Management concepts

The following diagram illustrates the Azure Files management constructs:

[Figure: File Structure]

  • Storage Account: All access to Azure Storage is done through a storage account. See Scalability and Performance Targets for details about storage account capacity.

  • Share: A File Storage share is an SMB file share in Azure. All directories and files must be created in a parent share. An account can contain an unlimited number of shares, and a share can store an unlimited number of files, up to the 5 TiB total capacity of the file share.

  • Directory: An optional hierarchy of directories.

  • File: A file in the share. A file may be up to 1 TiB in size.

  • URL format: For requests to an Azure File share made with the File REST protocol, files are addressable using the following URL format:

    https://<storage account>.file.core.windows.net/<share>/<directory/directories>/<file>
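
As a minimal sketch of the URL format above, the following Python helper assembles a File REST URL from its parts. The function name `file_rest_url` and its parameters are hypothetical, introduced only for illustration; the sketch assumes each path segment should be percent-encoded individually.

```python
from urllib.parse import quote


def file_rest_url(account: str, share: str, directories: list[str], file_name: str) -> str:
    """Build a URL of the form
    https://<storage account>.file.core.windows.net/<share>/<directory/directories>/<file>.
    """
    # Percent-encode every segment on its own so that spaces or "/" inside
    # a name cannot change the path structure.
    segments = [share, *directories, file_name]
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"https://{account}.file.core.windows.net/{path}"


print(file_rest_url("myaccount", "myshare", ["reports", "2018"], "q1 summary.txt"))
```

The account, share, directory, and file names in the call are placeholders.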
    

Data access method

Azure Files offers two built-in, convenient data access methods that you can use separately or in combination to access your data:

  1. Direct cloud access: Any Azure File share can be mounted by Windows, macOS, and/or Linux with the industry standard Server Message Block (SMB) protocol or accessed via the File REST API. With SMB, reads and writes to files on the share are made directly on the file share in Azure. To mount a share from a VM in Azure, the SMB client in the OS must support at least SMB 2.1. To mount a share on-premises, such as on a user's workstation, the SMB client on the workstation must support at least SMB 3.0 (with encryption). In addition to SMB, new applications or services may directly access the file share via File REST, which provides an easy and scalable application programming interface for software development.
  2. Azure File Sync (preview): With Azure File Sync, shares can be replicated to Windows Servers on-premises or in Azure. Your users would access the file share through the Windows Server, such as through an SMB or NFS share. This is useful for scenarios in which data will be accessed and modified far away from an Azure datacenter, such as in a branch office scenario. Data may be replicated between multiple Windows Server endpoints, such as between multiple branch offices. Finally, data may be tiered to Azure Files, such that all data is still accessible via the Server, but the Server does not have a full copy of the data. Rather, data is seamlessly recalled when opened by your user.
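
The SMB version requirements for direct cloud access can be sketched as a small check. This is an illustrative sketch only: the function name and parameters are hypothetical, and it assumes that "a VM in Azure" means a client that may connect without encryption.

```python
def can_mount_over_smb(smb_version: tuple[int, int],
                       supports_encryption: bool,
                       client_in_azure: bool) -> bool:
    """Return True if a client meets the minimum SMB requirement described above."""
    if client_in_azure:
        # A VM in Azure needs at least SMB 2.1.
        return smb_version >= (2, 1)
    # An on-premises client needs at least SMB 3.0 with encryption.
    return smb_version >= (3, 0) and supports_encryption


# An SMB 2.1 client can mount from an Azure VM but not from on-premises.
print(can_mount_over_smb((2, 1), False, client_in_azure=True))
print(can_mount_over_smb((2, 1), False, client_in_azure=False))
```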

The following table illustrates how your users and applications can access your Azure File share:

|  | Direct cloud access | Azure File Sync |
|---|---|---|
| What protocols do you need to use? | Azure Files supports SMB 2.1, SMB 3.0, and File REST API. | Access your Azure File share via any supported protocol on Windows Server (SMB, NFS, FTPS, etc.) |
| Where are you running your workload? | In Azure: Azure Files offers direct access to your data. | On-premises with slow network: Windows, Linux, and macOS clients can mount a local on-premises Windows File share as a fast cache of your Azure File share. |
| What level of ACLs do you need? | Share and file level. | Share, file, and user level. |

Data security

Azure Files has several built-in options for ensuring data security:

  • Support for encryption in both over-the-wire protocols: SMB 3.0 encryption and File REST over HTTPS. By default:
    • Clients which support SMB 3.0 encryption send and receive data over an encrypted channel.
    • Clients which do not support SMB 3.0 encryption can communicate intra-datacenter over SMB 2.1 or SMB 3.0 without encryption. Note that clients are not allowed to communicate inter-datacenter over SMB 2.1 or SMB 3.0 without encryption.
    • Clients can communicate over File REST with either HTTP or HTTPS.
  • Encryption at-rest (Azure Storage Service Encryption): Storage Service Encryption (SSE) is enabled by default for all storage accounts. Data at-rest is encrypted with fully-managed keys. Encryption at-rest does not increase storage costs or reduce performance.
  • Optional requirement of encrypted data in-transit: when selected, Azure Files rejects access to the data over unencrypted channels. Specifically, only HTTPS and SMB 3.0 with encryption connections are allowed.

    Important

    Requiring secure transfer of data will cause older SMB clients not capable of communicating with SMB 3.0 with encryption to fail. See Mount on Windows, Mount on Linux, Mount on macOS for more information.

For maximum security, we strongly recommend always enabling both encryption at-rest and encryption of data in-transit whenever you are using modern clients to access your data. The exception is older clients: for example, if you need to mount a share on a Windows Server 2008 R2 VM, which only supports SMB 2.1, you need to allow unencrypted traffic to your storage account, since SMB 2.1 does not support encryption.
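
The in-transit rules listed above can be summarized as a small decision function. This is a sketch of the stated rules, not the service's implementation; the `Connection` type, its fields, and `is_accepted` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    protocol: str          # "SMB" or "REST" (hypothetical labels)
    encrypted: bool        # SMB 3.0 encryption, or HTTPS for REST
    same_datacenter: bool  # client runs in the same Azure datacenter as the share


def is_accepted(conn: Connection, secure_transfer_required: bool) -> bool:
    """Apply the in-transit encryption rules described in this section."""
    if conn.encrypted:
        return True   # HTTPS and encrypted SMB 3.0 are always allowed
    if secure_transfer_required:
        return False  # unencrypted channels are rejected when the option is on
    if conn.protocol == "SMB":
        # Unencrypted SMB is only allowed intra-datacenter.
        return conn.same_datacenter
    return True       # REST over plain HTTP is allowed by default


legacy_vm = Connection("SMB", encrypted=False, same_datacenter=True)
print(is_accepted(legacy_vm, secure_transfer_required=False))
print(is_accepted(legacy_vm, secure_transfer_required=True))
```

The `legacy_vm` case corresponds to the Windows Server 2008 R2 example: it connects only while unencrypted traffic is allowed.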

If you are using Azure File Sync to access your Azure File share, we will always use HTTPS and SMB 3.0 with encryption to sync your data to your Windows Servers, regardless of whether you require encryption of data in-transit.

Data redundancy

Azure Files supports three data redundancy options: locally redundant storage (LRS), zone redundant storage (ZRS), and geo-redundant storage (GRS). The following sections describe how these options differ:

Locally redundant storage

Locally redundant storage (LRS) is designed to provide at least 99.999999999% (11 9's) durability of objects over a given year by replicating your data within a storage scale unit. A storage scale unit is hosted in a datacenter in the region in which you created your storage account. A write request to an LRS storage account returns successfully only after the data has been written to all replicas. These replicas each reside in separate fault domains and upgrade domains within one storage scale unit.

A storage scale unit is a collection of racks of storage nodes. A fault domain (FD) is a group of nodes that represent a physical unit of failure and can be considered as nodes belonging to the same physical rack. An upgrade domain (UD) is a group of nodes that are upgraded together during the process of a service upgrade (rollout). The replicas are spread across UDs and FDs within one storage scale unit. This architecture ensures that your data is available if a hardware failure impacts a single rack or when nodes are upgraded during a rollout.

LRS is the lowest cost replication option and offers the least durability compared to other options. If a datacenter-level disaster (for example, fire or flooding) occurs, all replicas may be lost or unrecoverable. To mitigate this risk, Microsoft recommends using either zone-redundant storage (ZRS) or geo-redundant storage (GRS).

  • If your application stores data that can be easily reconstructed if data loss occurs, you may opt for LRS.
  • Some applications are restricted to replicating data only within a country due to data governance requirements. In some cases, the paired regions across which data is replicated for GRS accounts may be in another country. For more information on paired regions, see Azure regions.

Zone redundant storage

Zone Redundant Storage (ZRS) synchronously replicates your data across three (3) storage clusters in a single region. Each storage cluster is physically separated from the others and resides in its own availability zone (AZ). Each availability zone, and the ZRS cluster within it, is autonomous, with separate utilities and networking capabilities.

Storing your data in a ZRS account ensures that you will be able to access and manage your data in the event that a zone becomes unavailable. ZRS provides excellent performance and extremely low latency. In fact, ZRS has the same scalability targets as LRS.

Consider ZRS for scenarios that require strong consistency, strong durability, and high availability even if an outage or natural disaster renders a zonal data center unavailable. ZRS offers durability for storage objects of at least 99.9999999999% (12 9's) over a given year.

For more information about availability zones, see Availability Zones overview.

Support coverage and regional availability

ZRS currently supports standard, general-purpose v2 (GPv2) account types. ZRS is available for block blobs, non-disk page blobs, files, tables, and queues. Additionally, all of your Storage Analytics logs and Storage Metrics

ZRS is generally available in the following regions:

  • US East 2
  • US Central
  • North Europe
  • West Europe
  • France Central
  • Southeast Asia

Microsoft continues to enable ZRS in other Azure regions and will update this list when that occurs. We will also make such announcements through the standard channels such as the Azure Service Updates page and email notifications to Azure subscription owners and administrators.

What happens when a zone becomes unavailable?

Your data will remain resilient if a zone becomes unavailable. Microsoft recommends that you continue to follow practices for transient fault handling, such as implementing retry policies with exponential back-off. When a zone is unavailable, Azure undertakes networking updates, such as DNS repointing. These updates may affect your application if you are accessing your data before they have completed.
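
A retry policy with exponential back-off, as recommended above, might look like the following minimal sketch. The function name, parameters, and default delays are assumptions for illustration, and `OSError` merely stands in for whatever transient error your client library raises.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(operation: Callable[[], T],
                       max_attempts: int = 5,
                       base_delay: float = 0.5,
                       max_delay: float = 30.0,
                       sleep: Callable[[float], None] = time.sleep,
                       jitter: Callable[[], float] = random.random) -> T:
    """Call operation(), retrying transient failures with exponential back-off."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except OSError:  # assumed stand-in for a transient storage error
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Double the delay on every attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Scale by a random factor in [0.5, 1.0] so clients do not retry in lockstep.
            sleep(delay * (0.5 + jitter() / 2))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` and `jitter` parameters are injectable so the policy can be exercised in tests without real waiting.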

ZRS may not protect your data against a regional disaster where multiple zones are permanently affected. Instead, ZRS offers resiliency for your data in the case of temporary unavailability. For protection against regional disasters, Microsoft recommends using Geo-redundant storage (GRS): Cross-regional replication for Azure Storage.

Converting to ZRS replication

Today, changing between LRS, GRS, and RA-GRS is straightforward: you use either the portal or the API. Converting to ZRS is more involved because it requires physically moving data from a single storage stamp to multiple stamps within a region. You therefore have two primary options: manually copy or move data from your existing account to a new ZRS account, or request a live migration. We strongly recommend a manual migration because we cannot guarantee when a live migration will complete; many factors directly and indirectly affect the completion of a migration job.

To perform a manual migration, you have a variety of options:

  • Use existing tooling like AzCopy, the storage SDK, reliable third-party tools, etc.
  • If you are familiar with Hadoop or HDInsight, you can attach both the source and destination (ZRS) accounts to your cluster and use a tool such as DistCp to parallelize the data copy process.
  • Build your own tooling using one of the storage SDKs.

As mentioned earlier, we strongly recommend the manual migration route because it gives you more flexibility than a live migration and leaves you in full control of when the migration occurs.

If, however, a manual migration will result in some application downtime and you are unable to absorb that on your end, then we provide a live migration option. A live migration is an in-place migration that allows you to continue using your existing storage account while your data is migrated between source and destination storage stamps. During migration, you will still have the same level of durability and availability SLA as you do normally.

Live migration does come with certain restrictions, however. They are listed below.

  • While we will address your live migration request promptly, we cannot guarantee when the migration will actually complete. If you need your data to be in ZRS by a certain time, then you should do a manual migration. Generally, the more data you have in your account, the longer it will take to migrate that data.
  • You may only live migrate from an account with LRS or GRS replication. If you have RA-GRS, you must first change to one of these replication types before proceeding. This intermediary step ensures that the secondary read-only endpoint which RA-GRS provides is removed before the migration.
  • Your account must be non-empty.
  • Only intra-region migrations are supported. If you want to migrate your data into a ZRS account located in a region different than the source account, then you must perform a manual migration.
  • Standard storage account types only. You cannot migrate from a Premium storage account.
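
The restrictions above that can be checked up front lend themselves to a simple pre-flight check before filing a support request. This is a sketch under assumptions: the function and its parameters are hypothetical and do not correspond to any Azure API.

```python
def live_migration_blockers(replication: str,
                            used_bytes: int,
                            source_region: str,
                            target_region: str,
                            performance_tier: str) -> list[str]:
    """Return the reasons an account cannot be live migrated to ZRS (empty if none)."""
    blockers = []
    if replication not in ("LRS", "GRS"):
        blockers.append("change replication to LRS or GRS first")
    if used_bytes <= 0:
        blockers.append("account must be non-empty")
    if source_region != target_region:
        blockers.append("cross-region moves require a manual migration")
    if performance_tier != "Standard":
        blockers.append("only Standard storage accounts are supported")
    return blockers


# An RA-GRS account must first be switched to LRS or GRS.
print(live_migration_blockers("RA-GRS", 10**12, "westeurope", "westeurope", "Standard"))
```

The region names in the call are placeholders. An empty list means none of the listed restrictions apply; it says nothing about how long the migration will take.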

Live migration requests go through the Azure Support portal. From the portal, you select the storage account you want to convert to ZRS.

  1. Click New Support Request
  2. Verify the Basics. Click Next.
  3. In the Problem section:
    • Leave Severity as-is.
    • Problem Type = Data Migration
    • Category = Migrate to ZRS within a region
    • Title = ZRS account migration (or something descriptive)
    • Details = I would like to migrate to ZRS from [LRS, GRS] in the __ region.
  4. Click Next.
  5. Verify that the Contact Info is correct on the Contact Info blade.
  6. Click Submit.

A support person will then be in contact with you. That person will be available to provide any assistance you may require.

ZRS Classic: A legacy option for block blobs redundancy

Note

ZRS Classic accounts are planned for deprecation and required migration on March 31, 2021. Microsoft will send more details to ZRS Classic customers prior to deprecation. Microsoft plans to provide an automated migration process from ZRS Classic to ZRS in the future.

Note

Once ZRS is generally available in a region, you will no longer be able to create a ZRS Classic account from the portal in that same region. However, you can still create one through other means like Microsoft PowerShell and the Azure CLI, that is, until ZRS Classic is deprecated.

ZRS Classic asynchronously replicates data across data centers within one to two regions. A replica may not be available unless Microsoft initiates failover to the secondary. ZRS Classic is available only for block blobs in general-purpose V1 (GPv1) storage accounts.

ZRS Classic accounts cannot be converted to or from LRS, GRS, or RA-GRS. ZRS Classic accounts also do not support metrics or logging.

To manually migrate ZRS account data to or from an LRS, ZRS Classic, GRS, or RA-GRS account, use AzCopy, Azure Storage Explorer, Azure PowerShell, or Azure CLI. You can also build your own migration solution with one of the Azure Storage client libraries.

Geo-redundant storage

Geo-redundant storage (GRS) is designed to provide at least 99.99999999999999% (16 9's) durability of objects over a given year by replicating your data to a secondary region that is hundreds of miles away from the primary region. If your storage account has GRS enabled, then your data is durable even in the case of a complete regional outage or a disaster in which the primary region is not recoverable.
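
To put the durability figures for LRS, ZRS, and GRS side by side, the following sketch converts a number of nines into an annual loss probability. It assumes a simplified model in which the durability figure is an independent per-object survival probability over one year; because the figures are stated as "at least", the results are upper bounds under that model. All names are hypothetical.

```python
from fractions import Fraction

# Design targets quoted in this article (number of nines).
DESIGN_NINES = {"LRS": 11, "ZRS": 12, "GRS": 16}


def durability_percent(nines: int) -> str:
    """Render e.g. 11 nines as '99.999999999%'."""
    return "99." + "9" * (nines - 2) + "%"


def annual_loss_probability(nines: int) -> Fraction:
    """n nines of durability leave a loss probability of 10^-n per object per year."""
    return Fraction(1, 10 ** nines)


def expected_objects_lost_per_year(object_count: int, nines: int) -> Fraction:
    """Expected number of objects lost per year under the simplified model."""
    return object_count * annual_loss_probability(nines)


for option, nines in DESIGN_NINES.items():
    lost = expected_objects_lost_per_year(10 ** 9, nines)
    print(option, durability_percent(nines), float(lost))
```

Exact fractions are used instead of floats so that very small probabilities are not rounded away.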

If you opt for GRS, you have two related options to choose from:

  • GRS replicates your data to another data center in a secondary region, but that data is available to be read only if Microsoft initiates a failover from the primary to secondary region.
  • Read-access geo-redundant storage (RA-GRS) is based on GRS. RA-GRS replicates your data to another data center in a secondary region, and also provides you with the option to read from the secondary region. With RA-GRS, you can read from the secondary regardless of whether Microsoft initiates a failover from the primary to the secondary.

For a storage account with GRS or RA-GRS enabled, all data is first replicated with locally-redundant storage (LRS). An update is first committed to the primary location and replicated using LRS. The update is then replicated asynchronously to the secondary region using GRS. When data is written to the secondary location, it is also replicated within that location using LRS.

Both the primary and secondary regions manage replicas across separate fault domains and upgrade domains within a storage scale unit. The storage scale unit is the basic replication unit within the datacenter. Replication at this level is provided by LRS; for more information, see Locally-redundant storage (LRS): Low-cost data redundancy for Azure Storage.

Keep these points in mind when deciding which replication option to use:

  • Zone-redundant storage (ZRS) provides high availability with synchronous replication and may be a better choice for some scenarios than GRS or RA-GRS. For more information on ZRS, see ZRS.
  • Because asynchronous replication involves a delay, in the event of a regional disaster it is possible that changes that have not yet been replicated to the secondary region will be lost if the data cannot be recovered from the primary region.
  • With GRS, the replica is not available unless Microsoft initiates failover to the secondary region. If Microsoft does initiate a failover to the secondary region, you will have read and write access to that data after the failover has completed. For more information, please see Disaster Recovery Guidance.
  • If your application needs to read from the secondary region, enable RA-GRS.

Read-access geo-redundant storage

Read-access geo-redundant storage (RA-GRS) maximizes availability for your storage account. RA-GRS provides read-only access to the data in the secondary location, in addition to geo-replication across two regions.

When you enable read-only access to your data in the secondary region, your data is available on a secondary endpoint as well as on the primary endpoint for your storage account. The secondary endpoint is similar to the primary endpoint, but appends the suffix -secondary to the account name. For example, if your primary endpoint for the Blob service is myaccount.blob.core.windows.net, then your secondary endpoint is myaccount-secondary.blob.core.windows.net. The access keys for your storage account are the same for both the primary and secondary endpoints.
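
The naming rule for the secondary endpoint can be expressed in a few lines. This is a minimal sketch; the function name is hypothetical, and it assumes the host name starts with the account name followed by a dot, as in the example above.

```python
def secondary_endpoint(primary_endpoint: str) -> str:
    """Derive the RA-GRS secondary host name by appending '-secondary' to the account name."""
    account, separator, rest = primary_endpoint.partition(".")
    if not account or not separator or not rest:
        raise ValueError(f"not a storage endpoint host name: {primary_endpoint!r}")
    return f"{account}-secondary.{rest}"


print(secondary_endpoint("myaccount.blob.core.windows.net"))
# myaccount-secondary.blob.core.windows.net
```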

Some considerations to keep in mind when using RA-GRS:

  • Your application has to manage which endpoint it is interacting with when using RA-GRS.
  • Since asynchronous replication involves a delay, changes that have not yet been replicated to the secondary region may be lost if data cannot be recovered from the primary region, for example in the event of a regional disaster.
  • You can check the Last Sync Time of your storage account. Last Sync Time is a GMT date/time value. All primary writes before the Last Sync Time have been successfully written to the secondary location, meaning that they are available to be read from the secondary location. Primary writes after the Last Sync Time may or may not be available for reads yet. You can query this value using the Azure portal, Azure PowerShell, or from one of the Azure Storage client libraries.
  • If Microsoft initiates failover to the secondary region, you will have read and write access to that data after the failover has completed. For more information, see Disaster Recovery Guidance.
  • For information on how to switch to the secondary region, see What to do if an Azure Storage outage occurs.
  • RA-GRS is intended for high-availability purposes. For scalability guidance, review the performance checklist.
  • For suggestions on how to design for high availability with RA-GRS, see Designing Highly Available Applications using RA-GRS storage.
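
The Last Sync Time semantics described above can be sketched as two helper functions. The names are hypothetical, and the sketch assumes you have already obtained the Last Sync Time as a timezone-aware datetime; how you query it is outside the scope of this example.

```python
from datetime import datetime, timedelta, timezone


def guaranteed_on_secondary(write_time: datetime, last_sync_time: datetime) -> bool:
    """True only for writes made before the Last Sync Time.

    Writes after the Last Sync Time may or may not be readable from the
    secondary yet, so they are reported as not guaranteed.
    """
    return write_time < last_sync_time


def unreplicated_window(now: datetime, last_sync_time: datetime) -> timedelta:
    """Upper bound on the span of recent writes that might not be replicated yet."""
    return max(now - last_sync_time, timedelta(0))


last_sync = datetime(2018, 5, 1, 12, 0, tzinfo=timezone.utc)
now = last_sync + timedelta(minutes=10)
print(guaranteed_on_secondary(now - timedelta(minutes=30), last_sync))
print(unreplicated_window(now, last_sync))
```

The dates are placeholders. An application that reads from the secondary could use a check like the first function to decide whether a recently written file is safe to read there.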

What is the RPO and RTO with GRS?

Recovery Point Objective (RPO): In GRS and RA-GRS, the storage service asynchronously geo-replicates the data from the primary to the secondary location. In the event of a major regional disaster in the primary region, Microsoft performs a failover to the secondary region. If a failover happens, recent changes that have not yet been geo-replicated may be lost. The number of minutes of potential data loss is referred to as the RPO, and it indicates the point in time to which data can be recovered. Azure Storage typically has an RPO of less than 15 minutes, although there is currently no SLA on how long geo-replication takes.

Recovery Time Objective (RTO): The RTO is a measure of how long it takes to perform the failover and get the storage account back online. The time to perform the failover includes the following actions:

  • The time Microsoft requires to determine whether the data can be recovered at the primary location, or if a failover is necessary.
  • The time to perform the failover of the storage account by changing the primary DNS entries to point to the secondary location.

    Microsoft takes the responsibility of preserving your data seriously. If there is any chance of recovering the data in the primary region, Microsoft will delay the failover and focus on recovering your data. A future version of the service will allow you to trigger a failover at an account level so that you can control the RTO yourself.

Paired Regions

When you create a storage account, you select the primary region for the account. The paired secondary region is determined based on the primary region, and cannot be changed. For up-to-date information about regions supported by Azure, see Business continuity and disaster recovery (BCDR): Azure Paired Regions.

Data growth pattern

Today, the maximum size for an Azure File share is 5 TiB, inclusive of share snapshots. Because of this current limitation, you must consider the expected data growth when deploying an Azure File share. Note that an Azure Storage account can store multiple shares, with a total of 500 TiB stored across all shares.
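
A rough capacity-planning calculation based on these limits might look like the following sketch. It assumes the data can be split evenly across shares and grows at a constant compound annual rate; both are simplifications, and all names are hypothetical.

```python
import math

MAX_SHARE_TIB = 5      # per-share limit stated above, including snapshots
MAX_ACCOUNT_TIB = 500  # total across all shares in one storage account


def shares_needed(total_tib: float) -> int:
    """Minimum number of shares if data can be split evenly across them."""
    return max(1, math.ceil(total_tib / MAX_SHARE_TIB))


def fits_in_one_account(total_tib: float) -> bool:
    return total_tib <= MAX_ACCOUNT_TIB


def years_until_share_full(current_tib: float, annual_growth_rate: float) -> float:
    """Solve current * (1 + rate)^t = 5 TiB for t, assuming compound growth."""
    if current_tib >= MAX_SHARE_TIB:
        return 0.0
    if current_tib <= 0 or annual_growth_rate <= 0:
        return math.inf
    return math.log(MAX_SHARE_TIB / current_tib) / math.log(1 + annual_growth_rate)


print(shares_needed(12))                  # 12 TiB needs 3 shares
print(years_until_share_full(2.0, 0.25))  # 2 TiB growing 25% per year
```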

It is possible to sync multiple Azure File shares to a single Windows File Server with Azure File Sync. This allows you to ensure that older, very large file shares that you may have on-premises can be brought into Azure File Sync. Please see Planning for an Azure File Sync Deployment for more information.

Data transfer method

There are many easy options to bulk transfer data from an existing file share, such as an on-premises file share, into Azure Files. A few popular ones include (non-exhaustive list):

  • Azure File Sync: As part of a first sync between an Azure File share (a "Cloud Endpoint") and a Windows directory namespace (a "Server Endpoint"), Azure File Sync will replicate all data from the existing file share to Azure Files.
  • Azure Import/Export: The Azure Import/Export service allows you to securely transfer large amounts of data into an Azure File share by shipping hard disk drives to an Azure datacenter.
  • Robocopy: Robocopy is a well-known copy tool that ships with Windows and Windows Server. Robocopy may be used to transfer data into Azure Files by mounting the file share locally, and then using the mounted location as the destination in the Robocopy command.
  • AzCopy: AzCopy is a command-line utility designed for copying data to and from Azure Files, as well as Azure Blob storage, using simple commands with optimal performance. AzCopy is available for Windows and Linux.
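
As an illustration of the Robocopy approach, the sketch below only assembles the command line for a share that is already mounted; it does not run anything. The helper name, the paths, and the choice of switches (`/E` to include subdirectories, `/R` and `/W` to bound retries and wait time) are assumptions made for this example, not a recommended configuration.

```python
def robocopy_command(source_dir: str, mounted_share: str,
                     retries: int = 5, wait_seconds: int = 10) -> list[str]:
    """Build a Robocopy argument list that copies source_dir into a mounted share."""
    return [
        "robocopy",
        source_dir,
        mounted_share,       # the mounted Azure File share is the destination
        "/E",                # copy subdirectories, including empty ones
        f"/R:{retries}",     # number of retries on failed copies
        f"/W:{wait_seconds}",  # seconds to wait between retries
    ]


# Hypothetical paths: a local data folder and a share mounted as drive Z:.
print(robocopy_command(r"D:\data", "Z:\\"))
```

On Windows, the resulting list could be passed to `subprocess.run` to start the copy.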

Next steps