Use DataBox to migrate from Network Attached Storage (NAS) to Azure file shares
This migration article is one of several involving the keywords NAS and Azure DataBox. Check if this article applies to your scenario:
- Data source: Network Attached Storage (NAS)
- Migration route: NAS ⇒ DataBox ⇒ Azure file share
- Caching files on-premises: No, the final goal is to use the Azure file shares directly in the cloud. There is no plan to use Azure File Sync.
If your scenario is different, look through the table of migration guides.
This article guides you end-to-end through the planning, deployment, and networking configurations needed to migrate from your NAS appliance to functional Azure file shares. This guide uses Azure DataBox for bulk data transport (offline data transport).
The goal is to move the shares on your NAS appliance to Azure and have them become native Azure file shares. You can use native Azure file shares without a need for a Windows Server. This migration needs to be done in a way that guarantees the integrity of the production data and availability during the migration. The latter requires keeping downtime to a minimum, so that it can fit into or only slightly exceed regular maintenance windows.
The migration process consists of several phases. You'll need to deploy Azure storage accounts and file shares and configure networking. Then you'll migrate your files using Azure DataBox, and RoboCopy to catch-up with changes. Finally, you'll cut-over your users and apps to the newly created Azure file shares. The following sections describe the phases of the migration process in detail.
If you are returning to this article, use the navigation on the right side to jump to the migration phase where you left off.
Phase 1: Identify how many Azure file shares you need
In this step, you'll determine how many Azure file shares you need. A single Windows Server instance (or cluster) can sync up to 30 Azure file shares.
You might have more folders on your volumes that you currently share out locally as SMB shares to your users and apps. The easiest way to picture this scenario is to envision an on-premises share that maps 1:1 to an Azure file share. If you have a small enough number of shares, below 30 for a single Windows Server instance, we recommend a 1:1 mapping.
If you have more than 30 shares, mapping an on-premises share 1:1 to an Azure file share is often unnecessary. Consider the following options.
For example, if your human resources (HR) department has 15 shares, you might consider storing all the HR data in a single Azure file share. Storing multiple on-premises shares in one Azure file share doesn't prevent you from creating the usual 15 SMB shares on your local Windows Server instance. It only means that you organize the root folders of these 15 shares as subfolders under a common folder. You then sync this common folder to an Azure file share. That way, only a single Azure file share in the cloud is needed for this group of on-premises shares.
Azure File Sync supports syncing the root of a volume to an Azure file share. If you sync the volume root, all subfolders and files will go to the same Azure file share.
Syncing the root of the volume isn't always the best option. There are benefits to syncing multiple locations. For example, doing so helps keep the number of items lower per sync scope. We test Azure file shares and Azure File Sync with 100 million items (files and folders) per share. But a best practice is to try to keep the number below 20 million or 30 million in a single share. Setting up Azure File Sync with a lower number of items isn't beneficial only for file sync. A lower number of items also benefits scenarios like these:
- Initial scan of the cloud content can complete faster, which in turn decreases the wait for the namespace to appear on a server enabled for Azure File Sync.
- Cloud-side restore from an Azure file share snapshot will be faster.
- Disaster recovery of an on-premises server can speed up significantly.
- Changes made directly in an Azure file share (outside of sync) can be detected and synced faster.
If you don't know how many files and folders you have, check out the TreeSize tool from JAM Software GmbH.
A structured approach to a deployment map
Before you deploy cloud storage in a later step, it's important to create a map between on-premises folders and Azure file shares. This mapping will inform how many and which Azure File Sync sync group resources you'll provision. A sync group ties the Azure file share and the folder on your server together and establishes a sync connection.
To decide how many Azure file shares you need, review the following limits and best practices. Doing so will help you optimize your map.
A server on which the Azure File Sync agent is installed can sync with up to 30 Azure file shares.
An Azure file share is deployed in a storage account. That arrangement makes the storage account a scale target for performance numbers like IOPS and throughput.
One standard Azure file share can theoretically saturate the maximum performance that a storage account can deliver. If you place multiple shares in a single storage account, you're creating a shared pool of IOPS and throughput for these shares. If you plan to only attach Azure File Sync to these file shares, grouping several Azure file shares into the same storage account won't create a problem. Review the Azure file share performance targets for deeper insight into the relevant metrics. These limitations don't apply to premium storage, where performance is explicitly provisioned and guaranteed for each share.
If you plan to lift an app to Azure that will use the Azure file share natively, you might need more performance from your Azure file share. If this type of use is a possibility, even in the future, it's best to create a single standard Azure file share in its own storage account.
There's a limit of 250 storage accounts per subscription per Azure region.
Given this information, it often becomes necessary to group multiple top-level folders on your volumes into a new common root directory. You then sync this new root directory, and all the folders you grouped into it, to a single Azure file share. This technique allows you to stay within the limit of 30 Azure file share syncs per server.
This grouping under a common root doesn't affect access to your data. Your ACLs stay as they are. You only need to adjust any share paths (like SMB or NFS shares) you might have on the local server folders that you now changed into a common root. Nothing else changes.
The most important scale vector for Azure File Sync is the number of items (files and folders) that need to be synced. Review the Azure File Sync scale targets for more details.
It's a best practice to keep the number of items per sync scope low. That's an important factor to consider in your mapping of folders to Azure file shares. Azure File Sync is tested with 100 million items (files and folders) per share. But it's often best to keep the number of items below 20 million or 30 million in a single share. Split your namespace into multiple shares if you start to exceed these numbers. You can continue to group multiple on-premises shares into the same Azure file share if you stay roughly below these numbers. This practice will provide you with room to grow.
It's possible that, in your situation, a set of folders can logically sync to the same Azure file share (by using the new common root folder approach mentioned earlier). But it might still be better to regroup folders so they sync to two instead of one Azure file share. You can use this approach to keep the number of files and folders per file share balanced across the server. You can also split your on-premises shares and sync across more on-premises servers, adding the ability to sync with 30 more Azure file shares per extra server.
Create a mapping table
Use the previous information to determine how many Azure file shares you need and which parts of your existing data will end up in which Azure file share.
Create a table that records your thoughts so you can refer to it when you need to. Staying organized is important because it can be easy to lose details of your mapping plan when you're provisioning many Azure resources at once. Download the following Excel file to use as a template to help create your mapping.
|Download a namespace-mapping template.|
Phase 2: Deploy Azure storage resources
In this phase, consult the mapping table from Phase 1 and use it to provision the correct number of Azure storage accounts and file shares within them.
An Azure file share is stored in the cloud in an Azure storage account. Another level of performance considerations applies here.
If you have highly active shares (shares used by many users and/or applications), two Azure file shares might reach the performance limit of a storage account.
A best practice is to deploy storage accounts with one file share each. You can pool multiple Azure file shares into the same storage account if you have archival shares or you expect low day-to-day activity in them.
These considerations apply more to direct cloud access (through an Azure VM) than to Azure File Sync. If you plan to use only Azure File Sync on these shares, grouping several into a single Azure storage account is fine.
If you've made a list of your shares, you should map each share to the storage account it will be in.
In the previous phase, you determined the appropriate number of shares. In this step, you have a mapping of storage accounts to file shares. Now deploy the appropriate number of Azure storage accounts with the appropriate number of Azure file shares in them.
Make sure the region of each of your storage accounts is the same and matches the region of the Storage Sync Service resource you've already deployed.
If you create an Azure file share that has a 100-TiB limit, that share can use only locally redundant storage or zone-redundant storage redundancy options. Consider your storage redundancy needs before using 100-TiB file shares.
Azure file shares are still created with a 5-TiB limit by default. Because you're creating new storage accounts, be sure to follow the guidance to create storage accounts that allow Azure file shares with 100-TiB limits.
Another consideration when you're deploying a storage account is the redundancy of Azure Storage. See Azure Storage redundancy options.
The names of your resources are also important. For example, if you group multiple shares for the HR department into an Azure storage account, you should name the storage account appropriately. Similarly, when you name your Azure file shares, you should use names similar to the ones used for their on-premises counterparts.
Phase 3: Determine how many Azure DataBox appliances you need
Start this step only, when you have completed the previous phase. Your Azure storage resources (storage accounts and file shares) should be created at this time. During your DataBox order, you need to specify into which storage accounts the DataBox is moving data.
In this phase, you need to map the results of the migration plan from the previous phase to the limits of the available DataBox options. These considerations will help you make a plan for which DataBox options you should choose and how many of them you will need to move your NAS shares to Azure file shares.
To determine how many devices of which type you need, consider these important limits:
- Any Azure DataBox can move data into up to 10 storage accounts.
- Each DataBox option comes at their own usable capacity. See DataBox options.
Consult your migration plan for the number of storage accounts you have decided to create and the shares in each one. Then look at the size of each of the shares on your NAS. Combining this information will allow you to optimize and decide which appliance should be sending data to which storage accounts. You can have two DataBox devices move files into the same storage account, but don't split content of a single file share across 2 DataBoxes.
For a standard migration, one or a combination of these three DataBox options should be chosen:
- DataBox Disks Microsoft will send you one and up to five SSD disks with a capacity of 8 TiB each, for a maximum total of 40 TiB. The usable capacity is about 20% less, due to encryption and file system overhead. For more information, see DataBox Disks documentation.
- DataBox This is the most common option. A ruggedized DataBox appliance, that works similar to a NAS, will be shipped to you. It has a usable capacity of 80 TiB. For more information, see DataBox documentation.
- DataBox Heavy This option features a ruggedized DataBox appliance on wheels, that works similar to a NAS, with a capacity of 1 PiB. The usable capacity is about 20% less, due to encryption and file system overhead. For more information, see DataBox Heavy documentation.
Phase 4: Provision a temporary Windows Server
While you wait for your Azure DataBox(es) to arrive, you can already deploy one or more Windows Servers you will need for running RoboCopy jobs.
- The first use of these servers will be to copy files onto the DataBox.
- The second use of these servers will be to catch-up with changes that have occurred on the NAS appliance while DataBox was in transport. This approach keeps downtime on the source side to a minimum.
The speed in which your RoboCopy jobs work depend on mainly these factors:
- IOPS on the source and target storage
- the available network bandwidth between them Find more details in the Troubleshooting section: IOPS and Bandwidth considerations
- the ability to quickly process files and folders in a namespace Find more details in the Troubleshooting section: Processing speed
- the number of changes between RoboCopy runs Find more details in the Troubleshooting section: Avoid unnecessary work
It is important to keep the referenced details in mind when deciding on the RAM and thread count you will provide to your temporary Windows Server(s).
Phase 5: Preparing to use Azure file shares
To save time, you should proceed with this phase while you wait for your DataBox to arrive. With the information in this phase, you will be able to decide how your servers and users in Azure and outside of Azure will be enabled to utilize your Azure file shares. The most critical decisions are:
- Networking: Enable your networks to route SMB traffic.
- Authentication: Configure Azure storage accounts for Kerberos authentication. AdConnect and Domain joining your storage account will allow your apps and users to use their AD identity to for authentication
- Authorization: Share-level ACLs for each Azure file share will allow AD users and groups to access a given share and within an Azure file share, native NTFS ACLs will take over. Authorization based on file and folder ACLs then works like it does for on-premises SMB shares.
- Business continuity: Integration of Azure file shares into an existing environment often entails to preserve existing share addresses. If you are not already using DFS-Namespaces, consider establishing that in your environment. You'd be able to keep share addresses your users and scripts use, unchanged. You would use DFS-N as a namespace routing service for SMB, by redirecting DFS-Namespace targets to Azure file shares after their migration.
This video is a guide and demo for how to securely expose Azure file shares directly to information workers and apps in five simple steps. The video references dedicated documentation for some topics:
Phase 6: Copy files onto your DataBox
When your DataBox arrives, you need to set up your DataBox in a line of sight to your NAS appliance. Follow the setup documentation for the DataBox type you ordered.
Depending on the DataBox type, there maybe DataBox copy tools available to you. At this point, they are not recommended for migrations to Azure file shares as they do not copy your files with full fidelity to the DataBox. Use RoboCopy instead.
When your DataBox arrives, it will have pre-provisioned SMB shares available for each storage account you specified at the time of ordering it.
- If your files go into a premium Azure file share, there will be one SMB share per premium "File storage" storage account.
- If your files go into a standard storage account, there will be three SMB shares per standard (GPv1 and GPv2) storage account. Only the file shares ending with
_AzFilesare relevant for your migration. Ignore any block and page blob shares.
Follow the steps in the Azure DataBox documentation:
The linked DataBox documentation specifies a RoboCopy command. However, the command is not suitable to preserve the full file and folder fidelity. Use this command instead:
Robocopy /MT:32 /NP /NFL /NDL /B /MIR /IT /COPY:DATSO /DCOPY:DAT /UNILOG:<FilePathAndName> <SourcePath> <Dest.Path>
- To learn more about the details of the individual RoboCopy flags, check out the table in the upcoming RoboCopy section.
- To learn more about how to appropriately size the thread count
/MT:n, optimize RoboCopy speed, and make RoboCopy a good neighbor in your data center, take a look at the RoboCopy troubleshooting section.
Phase 7: Catch-up RoboCopy from your NAS
Once your DataBox reports that all files and folders have been placed into the planned Azure file shares, you can continue with this phase. A catch-up RoboCopy is only needed if the data on the NAS may have changed since the DataBox copy was started. In certain scenarios where you use a share for archiving purposes, you might be able to stop changes to the share on your NAS until the migration is complete. You might also have the ability to serve your business requirements by setting NAS shares to read-only during the migration.
In cases where you need a share to be read-write during the migration and can only absorb a small downtime window, this catch-up RoboCopy step will be important to complete before the fail-over of user access directly to the Azure file share.
In this step, you will run RoboCopy jobs to catch up your cloud shares with the latest changes on your NAS since the time you forked your shares onto the DataBox. This catch-up RoboCopy may finish quickly or take a while, depending on the amount of churn that happened on your NAS shares.
Run the first local copy to your Windows Server target folder:
- Identify the first location on your NAS appliance.
- Identify the matching Azure file share.
- Mount the Azure file share as a local network drive on your temporary Windows Server
- Start the copy using RoboCopy as described
Mounting an Azure file share
Before you can use RoboCopy, you need to make the Azure file share accessible over SMB. The easiest way is to mount the share as a local network drive to the Windows Server you are planning on using for RoboCopy.
Before you can successfully mount an Azure file share to a local Windows Server, you need to have completed Phase : Preparing to use Azure file shares!
Once you are ready, review the Use an Azure file share with Windows how-to article and mount the Azure file share you want to start the NAS catch-up RoboCopy for.
The following RoboCopy command will copy only the differences (updated files and folders) from your NAS storage to your Azure file share.
Robocopy /MT:32 /R:5 /W:5 /B /MIR /IT /COPY:DATSO /DCOPY:DAT /NP /NFL /NDL /UNILOG:<FilePathAndName> <SourcePath> <Dest.Path>
||Allows Robocopy to run multithreaded. Default for
||Maximum retry count for a file that fails to copy on first attempt. You can improve the speed of a Robocopy run by specifying a maximum number (
||Specifies the time Robocopy waits before attempting to copy a file that didn't successfully copy during a previous attempt.
||Runs Robocopy in the same mode that a backup application would use. This switch allows Robocopy to move files that the current user doesn't have permissions for.|
||(Mirror source to target.) Allows Robocopy to copy only deltas between source and target. Empty subdirectories will be copied. Items (files or folders) that have changed or don't exist on the target will be copied. Items that exist on the target but not on the source will be purged (deleted) from the target. When you use this switch, match the source and target folder structures exactly. Matching means copying from the correct source and folder level to the matching folder level on the target. Only then can a "catch up" copy be successful. When source and target are mismatched, using
||Ensures fidelity is preserved in certain mirror scenarios. For example, if a file experiences an ACL change and an attribute update between two Robocopy runs, it's marked hidden. Without
||The fidelity of the file copy. Default:
||Fidelity for the copy of directories. Default:
||Specifies that the progress of the copy for each file and folder won't be displayed. Displaying the progress significantly lowers copy performance.|
||Specifies that file names aren't logged. Improves copy performance.|
||Specifies that directory names aren't logged. Improves copy performance.|
||Writes status to the log file as Unicode. (Overwrites the existing log.)|
||Only for targets with tiered storage Specifies that Robocopy operates in "low free space mode." This switch is useful only for targets with tiered storage that might run out of local capacity before Robocopy finishes. It was added specifically for use with a target enabled for Azure File Sync cloud tiering. It can be used independently of Azure File Sync. In this mode, Robocopy will pause whenever a file copy would cause the destination volume's free space to go below a "floor" value. This value can be specified by the
||Use cautiously Copies files in restart mode. This switch is recommended only in an unstable network environment. It significantly reduces copy performance because of extra logging.|
||Use cautiously Uses restart mode. If access is denied, this option uses backup mode. This option significantly reduces copy performance because of checkpointing.|
Check out the Troubleshooting section if RoboCopy is impacting your production environment, reports lots of errors or is not progressing as fast as expected.
When you run the RoboCopy command for the first time, your users and applications are still accessing files on the NAS and potentially change them. It is possible, that RoboCopy has processed a directory, moves on to the next and then a user on the source location (NAS) adds, changes, or deletes a file that will now not be processed in this current RoboCopy run. This behavior is expected.
The first run is about moving the bulk of the churned data to your Azure file share. This first copy can take a while. Check out the Troubleshooting section for more insight into what can affect RoboCopy speeds.
Once the initial run is complete, run the command again.
A second time you run RoboCopy for the same share, it will finish faster, because it only needs to transport changes that happened since the last run. You can run repeated jobs for the same share.
When you consider the downtime acceptable, then you need to remove user access to your NAS-based shares. You can do that by any steps that prevent users from changing the file and folder structure and content. An example is to point your DFS-Namespace to a non-existing location or change the root ACLs on the share.
Run one last RoboCopy round. It will pick up any changes, that might have been missed. How long this final step takes, is dependent on the speed of the RoboCopy scan. You can estimate the time (which is equal to your downtime) by measuring how long the previous run took.
Create a share on the Windows Server folder and possibly adjust your DFS-N deployment to point to it. Be sure to set the same share-level permissions as on your NAS SMB share. If you had an enterprise-class domain-joined NAS, then the user SIDs will automatically match as the users exist in Active Directory and RoboCopy copies files and metadata at full fidelity. If you have used local users on your NAS, you need to re-create these users as Windows Server local users and map the existing SIDs RoboCopy moved over to your Windows Server to the SIDs of your new, Windows Server local users.
You have finished migrating a share / group of shares into a common root or volume. (Depending on your mapping from Phase 1)
You can try to run a few of these copies in parallel. We recommend processing the scope of one Azure file share at a time.
Speed and success rate of a given RoboCopy run will depend on several factors:
- IOPS on the source and target storage
- the available network bandwidth between source and target
- the ability to quickly process files and folders in a namespace
- the number of changes between RoboCopy runs
IOPS and bandwidth considerations
In this category, you need to consider abilities of the source storage, the target storage, and the network connecting them. The maximum possible throughput is determined by the slowest of these three components. Make sure your network infrastructure is configured to support optimal transfer speeds to its best abilities.
While copying as fast as possible is often most desireable, consider the utilization of your local network and NAS appliance for other, often business critical tasks.
Copying as fast as possible might not be desirable when there's a risk that the migration could monopolize available resources.
- Consider when it's best in your environment to run migrations: during the day, off-hours, or during weekends.
- Also consider networking QoS on a Windows Server to throttle the RoboCopy speed.
- Avoid unnecessary work for the migration tools.
RobCopy can insert inter-packet delays by specifying the
/IPG:n switch where
n is measured in milliseconds between RoboCopy packets. Using this switch can help avoid monopolization of resources on both IO constrained devices, and crowded network links.
/IPG:n cannot be used for precise network throttling to a certain Mbps. Use Windows Server Network QoS instead. RoboCopy entirely relies on the SMB protocol for all networking needs. Using SMB is the reason why RoboCopy can't influence the network throughput itself, but it can slow down its use.
A similar line of thought applies to the IOPS observed on the NAS. The cluster size on the NAS volume, packet sizes, and an array of other factors influence the observed IOPS. Introducing inter-packet delay is often the easiest way to control the load on the NAS. Test multiple values, for instance from about 20 milliseconds (n=20) to multiples of that number. Once you introduce a delay, you can evaluate if your other apps can now work as expected. This optimization strategy will allow you to find the optimal RoboCopy speed in your environment.
RoboCopy will traverse the namespace it's pointed to and evaluate each file and folder for copy. Every file will be evaluated during an initial copy and during catch-up copies. For example, repeated runs of RoboCopy /MIR against the same source and target storage locations. These repeated runs are useful to minimize downtime for users and apps, and to improve the overall success rate of files migrated.
We often default to considering bandwidth as the most limiting factor in a migration - and that can be true. But the ability to enumerate a namespace can influence the total time to copy even more for larger namespaces with smaller files. Consider that copying 1 TiB of small files will take considerably longer than copying 1 TiB of fewer but larger files. Assuming that all other variables remain the same.
The cause for this difference is the processing power needed to walk through a namespace. RoboCopy supports multi-threaded copies through the
/MT:n parameter where n stands for the number of processor threads. So when provisioning a machine specifically for RoboCopy, consider the number of processor cores and their relationship to the thread count they provide. Most common are two threads per core. The core and thread count of a machine is an important data point to decide what multi-thread values
/MT:n you should specify. Also consider how many RoboCopy jobs you plan to run in parallel on a given machine.
More threads will copy our 1-TiB example of small files considerably faster than fewer threads. At the same time, the extra resource investment on our 1 TiB of larger files may not yield proportional benefits. A high thread count will attempt to copy more of the large files over the network simultaneously. This extra network activity increases the probability of getting constrained by throughput or storage IOPS.
During a first RoboCopy into an empty target or a differential run with lots of changed files, you are likely constrained by your network throughput. Start with a high thread count for an initial run. A high thread count, even beyond your currently available threads on the machine, helps saturate the available network bandwidth. Subsequent /MIR runs are progressively impacted by processing items. Fewer changes in a differential run mean less transport of data over the network. Your speed is now more dependent on your ability to process namespace items than to move them over the network link. For subsequent runs, match your thread count value to your processor core count and thread count per core. Consider if cores need to be reserved for other tasks a production server may have.
Avoid unnecessary work
Avoid large-scale changes in your namespace. For example, moving files between directories, changing properties at a large scale, or changing permissions (NTFS ACLs). Especially ACL changes can have a high impact because they often have a cascading change effect on files lower in the folder hierarchy. Consequences can be:
- extended RoboCopy job run time because each file and folder affected by an ACL change needing to be updated
- reusing data moved earlier may need to be recopied. For instance, more data will need to be copied when folder structures change after files had already been copied earlier. A RoboCopy job can't "play back" a namespace change. The next job must purge the files previously transported to the old folder structure and upload the files in the new folder structure again.
Another important aspect is to use the RoboCopy tool effectively. With the recommended RoboCopy script, you'll create and save a log file for errors. Copy errors can occur - that is normal. These errors often make it necessary to run multiple rounds of a copy tool like RoboCopy. An initial run, say from a NAS to DataBox or a server to an Azure file share. And one or more extra runs with the /MIR switch to catch and retry files that didn't get copied.
You should be prepared to run multiple rounds of RoboCopy against a given namespace scope. Successive runs will finish faster as they have less to copy but are constrained increasingly by the speed of processing the namespace. When you run multiple rounds, you can speed up each round by not having RoboCopy try unreasonably hard to copy everything in a given run. These RoboCopy switches can make a significant difference:
/R:nn = how often you retry to copy a failed file and
/W:nn = how many seconds to wait between retries
/R:5 /W:5 is a reasonable setting that you can adjust to your liking. In this example, a failed file will be retried five times, with five-second wait time between retries. If the file still fails to copy, the next RoboCopy job will try again. Often files that failed because they are in use or because of timeout issues might eventually be copied successfully this way.
There is more to discover about Azure file shares. The following articles help understand advanced options, best practices, and also contain troubleshooting help. These articles link to Azure file share documentation as appropriate.