AzCopy – Announcing General Availability of AzCopy 3.0 plus preview release of AzCopy 4.0 with Table and File support
We are pleased to announce that AzCopy is now GA.
Starting with this release, we will publish two AzCopy series: the RTM series, which includes only the GA features, and the Pre-release series, which includes both the GA and the preview features.
You can download either AzCopy 3.0.0, which has blob copy functionality only, or AzCopy 4.0.0-preview, which includes the GA features plus the Storage Table entities copy feature that is still in preview.
AzCopy 3.0.0 - Generally Available
AzCopy GA version 3.0.0 includes the following changes:
- AzCopy now requires that the end user explicitly specify every parameter’s name. In previous releases, the source, destination and file pattern parameters did not require parameter names. Starting from 3.0.0, the command line ‘AzCopy <source> <dest> [pattern] [options]’ needs to be changed to:
AzCopy /Source:<source> /Dest:<destination> /Pattern:<pattern> [Options] …
As a result of this change, parameters such as source and destination no longer need to appear in any particular order.
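As a rough illustration of the named-parameter form, the following Python sketch assembles and re-reads an AzCopy 3.0.0-style argument list. The helper names `build_azcopy_args` and `parse_named_args` are hypothetical and not part of AzCopy; only /Source, /Dest and /Pattern come from the text above, and lower-casing the names is a choice made for this sketch only.

```python
def build_azcopy_args(source, dest, pattern=None, options=None):
    """Return an argument list in /Name:value form (hypothetical helper)."""
    args = ["AzCopy", f"/Source:{source}", f"/Dest:{dest}"]
    if pattern is not None:
        args.append(f"/Pattern:{pattern}")
    # Any further options are passed through as /Name:value pairs.
    for name, value in (options or {}).items():
        args.append(f"/{name}:{value}")
    return args


def parse_named_args(args):
    """Map parameter names to values, whatever order the arguments come in."""
    parsed = {}
    for arg in args:
        if arg.startswith("/") and ":" in arg:
            # Split on the first colon only: values such as D:\test\ or
            # https://... contain colons of their own.
            name, value = arg[1:].split(":", 1)
            parsed[name.lower()] = value
    return parsed
```

Because every value carries its own name, parsing the arguments in reverse order yields the same mapping, which is the property the new command-line format relies on.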
- We have also made the following changes to the AzCopy command line’s help messages:
- Type ‘AzCopy’ to get the short version of the help.
- Type ‘AzCopy /?’ to get detailed command line help.
- Type ‘AzCopy /?:Sample’ to get command line samples.
- Type ‘AzCopy /?:<option name>’ to get detailed help for the named AzCopy option, e.g.
- In previous versions of AzCopy, if the user chose NOT to overwrite existing files or blobs, AzCopy assigned ‘failed’ status to the files or blobs that already existed. From 3.0.0, AzCopy assigns ‘skipped’ status to such files and displays ‘Transfer skipped: <Total skipped count>’ as part of the ‘Transfer summary’ in the console window.
AzCopy 4.0.0-preview - Copy Azure Storage Table Entities (New Preview)
Besides copying blobs and Azure Files, AzCopy 4.0.0-preview also supports exporting table entities to local files or to Azure Storage block blobs, and importing the data back into a storage table. Note that an export is not a consistent snapshot of the table, since entities may change at any time before AzCopy finishes retrieving all of them.
- When exporting table entities, the user can set the parameter /Dest to a local folder or a blob container, e.g.
AzCopy /Source:https://myaccount.table.core.windows.net/myTable/ /Dest:D:\test\ /SourceKey:key
AzCopy /Source:https://myaccount.table.core.windows.net/myTable/ /Dest:https://myaccount.blob.core.windows.net/mycontainer/ /SourceKey:key1 /DestKey:key2
AzCopy will generate JSON data files in the local folder or blob container with the following naming convention:
<account name>_<table name>_<timestamp>_<volume index>_<CRC>.json
- By default AzCopy generates a single JSON data file. The user can specify /SplitSize:<split file size in MB> to generate multiple data files, e.g.
AzCopy /Source:https://myaccount.table.core.windows.net/myTable/ /Dest:D:\test\ /SourceKey:key /SplitSize:100
AzCopy uses the ‘volume index’ in the data file names to distinguish multiple files. The ‘volume index’ contains two parts, the ‘partition key range index’ and the ‘split file index’ (both starting from 0). The ‘partition key range index’ is 0 if the user does not specify the option /PKRS, which is introduced in the next section.
For instance, if AzCopy generates two data files after the user specifies the option /SplitSize, the data file names may look like the following:
Note that the minimum split size is 32 MB. If the destination is blob storage, AzCopy splits the data file once its size reaches the blob size limit (200 GB), even if the user has not specified the option /SplitSize.
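To make the naming convention and the two-part volume index more concrete, here is a minimal Python sketch that takes a data file name apart. It assumes that the volume index is written as `<partition key range index>_<split file index>`, which the text above does not spell out; the function name `parse_data_file_name` and the sample file name in the usage note are likewise made up for illustration.

```python
from typing import NamedTuple


class DataFileName(NamedTuple):
    account: str
    table: str
    timestamp: str
    pk_range_index: int  # 'partition key range index', starts at 0
    split_index: int     # 'split file index', starts at 0
    crc: str


def parse_data_file_name(name: str) -> DataFileName:
    """Parse <account>_<table>_<timestamp>_<volume index>_<CRC>.json.

    Assumption: the volume index is two integers joined by an underscore.
    """
    if not name.endswith(".json"):
        raise ValueError(f"not a JSON data file: {name!r}")
    stem = name[: -len(".json")]
    # Peel the three trailing fields off the right-hand side first.
    prefix, pk_index, split_index, crc = stem.rsplit("_", 3)
    # Storage account and table names cannot contain underscores,
    # so splitting the remainder from the left is unambiguous.
    account, table, timestamp = prefix.split("_", 2)
    return DataFileName(account, table, timestamp,
                        int(pk_index), int(split_index), crc)
```

With a hypothetical name such as `myaccount_mytable_20140103T112020_0_1_ABCD1234.json`, the sketch reports partition key range index 0 and split file index 1, i.e. the second file produced by /SplitSize when /PKRS is not used.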
- By default AzCopy exports all of a table’s entities serially. To export concurrently, the user needs to specify the option /PKRS:<partition key range split>. Use this option with caution, since Azure Table Service is a key lookup store and is not built for efficient scans. Too many scans on a table can lead to throttling of live traffic.
For instance, when the option /PKRS:"aa#bb" is specified, AzCopy will start three concurrent operations to export the three partition key ranges below:
[<first partition key>, aa)
[aa, bb)
[bb, <last partition key>]
AzCopy /Source:https://myaccount.table.core.windows.net/myTable/ /Dest:D:\test\ /SourceKey:key /PKRS:"aa#bb"
And the generated JSON data files may look like this:
Note that the number of concurrent operations is also controlled by the option /NC. When copying table entities, AzCopy uses the number of cores on the machine as the default value of /NC. When the user specifies the option /PKRS, AzCopy uses the smaller of two values, the number of partition key ranges or the value of /NC, as the number of concurrent operations. For more details about /NC, type ‘AzCopy /?:NC’.
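The relationship between /PKRS, /NC and the resulting concurrency can be sketched in a few lines of Python. This is only a model of the behaviour described above, not AzCopy's implementation: the function names are hypothetical, and it assumes that ‘#’ separates the split points exactly as in the "aa#bb" example.

```python
import os
from typing import List, Optional, Tuple

# A range is (lower, upper); None stands for the open end of the table,
# i.e. the first or the last partition key.
KeyRange = Tuple[Optional[str], Optional[str]]


def partition_key_ranges(pkrs: str) -> List[KeyRange]:
    """Turn a /PKRS value such as 'aa#bb' into consecutive key ranges.

    Each range includes its lower bound; only the final range also
    includes its upper bound (the last partition key).
    """
    split_points = pkrs.split("#")
    bounds: List[Optional[str]] = [None, *split_points, None]
    return list(zip(bounds[:-1], bounds[1:]))


def concurrent_operations(range_count: int, nc: Optional[int] = None) -> int:
    """Smaller of the range count and /NC (default: number of cores)."""
    if nc is None:
        nc = os.cpu_count() or 1
    return min(range_count, nc)
```

For "aa#bb" the sketch yields three ranges, so with /NC:2 only two of them would be exported at the same time, while /NC:8 would allow all three to run concurrently.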
- When importing the data files back into a table, the user needs to specify both the /Manifest and /EntityOperation options.
AzCopy /Source:D:\test\ /Dest:https://myaccount.table.core.windows.net/mytable1/ /DestKey:key /Manifest:"myaccount_mytable_20140103T112020.manifest" /EntityOperation:InsertOrReplace
AzCopy /Source:https://myaccount.blob.core.windows.net/mycontainer/ /Dest:https://myaccount.table.core.windows.net/mytable1/ /SourceKey:key1 /DestKey:key2 /Manifest:"myaccount_mytable_20140103T112020.manifest" /EntityOperation:InsertOrReplace
The manifest file is generated in the destination local folder or blob container when the user exports table entities using AzCopy. It is used to locate all the data files and to perform data validation during import. The manifest file uses the following naming convention:
<account name>_<table name>_<timestamp>.manifest
The option /EntityOperation is used to govern the behavior of entity importing:
- InsertOrSkip - Skips an existing entity or inserts a new entity if it does not exist in the table.
- InsertOrMerge - Merges an existing entity or inserts a new entity if it does not exist in the table.
- InsertOrReplace - Replaces an existing entity or inserts a new entity if it does not exist in the table.
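The difference between the three /EntityOperation values is easiest to see on a small example. The Python sketch below applies imported entities to a plain dictionary that stands in for a table; it is an illustration of the semantics listed above, not AzCopy code, and the function name `import_entity` is hypothetical.

```python
OPERATIONS = {"InsertOrSkip", "InsertOrMerge", "InsertOrReplace"}


def import_entity(table: dict, entity: dict, operation: str) -> None:
    """Apply one imported entity to an in-memory stand-in for a table.

    `table` maps (PartitionKey, RowKey) to a dict of entity properties.
    """
    if operation not in OPERATIONS:
        raise ValueError(f"unknown /EntityOperation value: {operation}")
    key = (entity["PartitionKey"], entity["RowKey"])
    existing = table.get(key)
    if existing is None:
        # All three operations insert an entity that is not there yet.
        table[key] = dict(entity)
    elif operation == "InsertOrSkip":
        # Leave the stored entity untouched.
        pass
    elif operation == "InsertOrMerge":
        # Overwrite the properties present in the import, keep the rest.
        existing.update(entity)
    else:
        # InsertOrReplace: discard the stored entity entirely.
        table[key] = dict(entity)
```

Starting from a stored entity with properties `a` and `b`, importing an entity that carries only `a` leaves both values alone under InsertOrSkip, updates `a` and keeps `b` under InsertOrMerge, and drops `b` under InsertOrReplace.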
Note that the option /PKRS cannot be used when importing entities. In the import scenario AzCopy starts concurrent operations by default. The default number of concurrent operations equals the number of cores on the machine, but the user can change it by specifying the option /NC. For more details, type ‘AzCopy /?:NC’.
As always, we are looking forward to your feedback.
Microsoft Azure Storage Team