Object replication for block blobs

Object replication asynchronously copies block blobs between a source storage account and a destination account. Some scenarios supported by object replication include:

  • Minimizing latency. Object replication can reduce latency for read requests by enabling clients to consume data from a region that is in closer physical proximity.
  • Increase efficiency for compute workloads. With object replication, compute workloads can process the same sets of block blobs in different regions.
  • Optimizing data distribution. You can process or analyze data in a single location and then replicate just the results to additional regions.
  • Optimizing costs. After your data has been replicated, you can reduce costs by moving it to the archive tier using life cycle management policies.

The following diagram shows how object replication replicates block blobs from a source storage account in one region to destination accounts in two different regions.

Diagram showing how object replication works

To learn how to configure object replication, see Configure object replication.

Prerequisites for object replication

Object replication requires that the following Azure Storage features are also enabled:

Enabling change feed and blob versioning may incur additional costs. For more details, refer to the Azure Storage pricing page.

Object replication is supported for general-purpose v2 storage accounts, and for premium block blob accounts in preview. Both the source and destination accounts must be either general-purpose v2 or premium block blob accounts. Object replication supports block blobs only; append blobs and page blobs are not supported.

Important

Object replication for premium block blob accounts is currently in PREVIEW. See the Supplemental Terms of Use for Microsoft Azure Previews for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.

How object replication works

Object replication asynchronously copies block blobs in a container according to rules that you configure. The contents of the blob, any versions associated with the blob, and the blob's metadata and properties are all copied from the source container to the destination container.

Important

Because block blob data is replicated asynchronously, the source account and destination account are not immediately in sync. There's currently no SLA on how long it takes to replicate data to the destination account. You can check the replication status on the source blob to determine whether replication is complete. For more information, see Check the replication status of a blob.

Blob versioning

Object replication requires that blob versioning is enabled on both the source and destination accounts. When a replicated blob in the source account is modified, a new version of the blob is created in the source account that reflects the previous state of the blob, before modification. The current version in the source account reflects the most recent updates. Both the current version and any previous versions are replicated to the destination account. For more information about how write operations affect blob versions, see Versioning on write operations.

When a blob in the source account is deleted, the current version of the blob becomes a previous version, and there is no longer a current version. All existing previous versions of the blob are preserved. This state is replicated to the destination account. For more information about how delete operations affect blob versions, see Versioning on delete operations.

Snapshots

Object replication does not support blob snapshots. Any snapshots on a blob in the source account are not replicated to the destination account.

Blob tiering

Object replication is supported when the source and destination accounts are in the hot or cool tier. The source and destination accounts may be in different tiers. However, object replication will fail if a blob in either the source or destination account has been moved to the archive tier. For more information on blob tiers, see Hot, Cool, and Archive access tiers for blob data.

Immutable blobs

Immutability policies for Azure Blob Storage include time-based retention policies and legal holds. When an immutability policy is in effect on the destination account, object replication may be affected. For more information about immutability policies, see Store business-critical blob data with immutable storage.

If a container-level immutability policy is in effect for a container in the destination account, and an object in the source container is updated or deleted, then the operation on the source container may succeed, but replication of that operation to the destination container will fail. For more information about which operations are prohibited with an immutability policy that is scoped to a container, see Scenarios with container-level scope.

If a version-level immutability policy is in effect for a blob version in the destination account, and a delete or update operation is performed on the blob version in the source container, then the operation on the source object may succeed, but replication of that operation to the destination object will fail. For more information about which operations are prohibited with an immutability policy that is scoped to a container, see Scenarios with version-level scope.

Object replication policies and rules

When you configure object replication, you create a replication policy that specifies the source storage account and the destination account. A replication policy includes one or more rules that specify a source container and a destination container and indicate which block blobs in the source container will be replicated.

After you configure object replication, Azure Storage checks the change feed for the source account periodically and asynchronously replicates any write or delete operations to the destination account. Replication latency depends on the size of the block blob being replicated.

Replication policies

When you configure object replication, you create a replication policy on the destination account via the Azure Storage resource provider. After the replication policy is created, Azure Storage assigns it a policy ID. You must then associate that replication policy with the source account by using the policy ID. The policy ID on the source and destination accounts must be the same in order for replication to take place.

A source account can replicate to no more than two destination accounts, with one policy for each destination account. Similarly, an account may serve as the destination account for no more than two replication policies.

The source and destination accounts may be in the same region or in different regions. They may also reside in the same subscription or in different subscriptions. Optionally, the source and destination accounts may reside in different Azure Active Directory (Azure AD) tenants. Only one replication policy may be created for each source account/destination account pair.

Replication rules

Replication rules specify how Azure Storage will replicate blobs from a source container to a destination container. You can specify up to 10 replication rules for each replication policy. Each replication rule defines a single source and destination container, and each source and destination container can be used in only one rule, meaning that a maximum of 10 source containers and 10 destination containers may participate in a single replication policy.

When you create a replication rule, by default only new block blobs that are subsequently added to the source container are copied. You can specify that both new and existing block blobs are copied, or you can define a custom copy scope that copies block blobs created from a specified time onward.

You can also specify one or more filters as part of a replication rule to filter block blobs by prefix. When you specify a prefix, only blobs matching that prefix in the source container will be copied to the destination container.

The source and destination containers must both exist before you can specify them in a rule. After you create the replication policy, write operations to the destination container are not permitted. Any attempts to write to the destination container fail with error code 409 (Conflict). To write to a destination container for which a replication rule is configured, you must either delete the rule that is configured for that container, or remove the replication policy. Read and delete operations to the destination container are permitted when the replication policy is active.

You can call the Set Blob Tier operation on a blob in the destination container to move it to the archive tier. For more information about the archive tier, see Hot, Cool, and Archive access tiers for blob data.

Policy definition file

An object replication policy is defined by JSON file. You can get the policy definition file from an existing object replication policy. You can also create an object replication policy by uploading a policy definition file.

Sample policy definition file

The following example defines a replication policy on the destination account with a single rule that matches the prefix b and sets the minimum creation time for blobs that are to be replicated. Remember to replace values in angle brackets with your own values:

{
  "properties": {
    "policyId": "default",
    "sourceAccount": "/subscriptions/<subscriptionId>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>",
    "destinationAccount": "/subscriptions/<subscriptionId>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>",
    "rules": [
      {
        "ruleId": "",
        "sourceContainer": "<source-container>",
        "destinationContainer": "<destination-container>",
        "filters": {
          "prefixMatch": [
            "b"
          ],
          "minCreationTime": "2021-08-028T00:00:00Z"
        }
      }
    ]
  }
}

Specify full resource IDs for the source and destination accounts

When you create the policy definition file, specify the full Azure Resource Manager resource IDs for the sourceAccount and destinationAccount entries, as shown in the example in the previous section. To learn how to locate the resource ID for a storage account, see Get the resource ID for a storage account.

The full resource ID is in the following format:

/subscriptions/<subscriptionId>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>

The policy definition file previously required only the account name, instead of the full resource ID for the storage account. With the introduction of the AllowCrossTenantReplication security property in version 2021-02-01 of the Azure Storage resource provider REST API, you must now provide the full resource ID for any object replication policies that are created when cross-tenant replication is disallowed for a storage account that participates in the replication policy. Azure Storage uses the full resource ID to verify whether the source and destination accounts reside within the same tenant. To learn more about disallowing cross-tenant replication policies, see Prevent replication across Azure AD tenants.

While providing only the account name is still supported when cross-tenant replication is allowed for a storage account, Microsoft recommends always providing the full resource ID as a best practice. All previous versions of the Azure Storage resource provider REST API support using the full resource ID path in object replication policies.

The following table describes what happens when you create a replication policy with the full resource ID specified, versus the account name, in the scenarios where cross-tenant replication is allowed or disallowed for the storage account.

Storage account identifier in policy definition Cross-tenant replication allowed Cross-tenant replication disallowed
Full resource ID Same-tenant policies can be created.

Cross-tenant policies can be created.
Same-tenant policies can be created.

Cross-tenant policies cannot be created.
Account name only Same-tenant policies can be created.

Cross-tenant policies can be created.
Neither same-tenant nor cross-tenant policies can be created. An error occurs, because Azure Storage cannot verify that source and destination accounts are in the same tenant. The error indicates that you must specify the full resource ID for the sourceAccount and destinationAccount entries in the policy definition file.

Specify the policy and rule IDs

The following table summarizes which values to use for the policyId and ruleId entries in the policy definition file in each scenario.

When you are creating the policy definition file for this account... Set the policy ID to this value Set rule IDs to this value
Destination account The string value default. Azure Storage will create the policy ID value for you. An empty string. Azure Storage will create the rule ID values for you.
Source account The value of the policy ID returned when you download the policy definition file for the destination account. The values of the rule IDs returned when you download the policy definition file for the destination account.

Prevent replication across Azure AD tenants

An Azure Active Directory (Azure AD) tenant is a dedicated instance of Azure AD that represents an organization for the purpose of identity and access management. Each Azure subscription has a trust relationship with a single Azure AD tenant. All resources in a subscription, including storage accounts, are associated with the same Azure AD tenant. For more information, see What is Azure Active Directory?

By default, a user with appropriate permissions can configure object replication with a source storage account that is in one Azure AD tenant and a destination account that is in a different tenant. If your security policies require that you restrict object replication to storage accounts that reside within the same tenant only, you can disallow replication across tenants by setting a security property, the AllowCrossTenantReplication property (preview). When you disallow cross-tenant object replication for a storage account, then for any object replication policy that is configured with that storage account as the source or destination account, Azure Storage requires that both the source and destination accounts reside within the same Azure AD tenant. For more information about disallowing cross-tenant object replication, see Prevent object replication across Azure Active Directory tenants.

To disallow cross-tenant object replication for a storage account, set the AllowCrossTenantReplication property to false. If the storage account does not currently participate in any cross-tenant object replication policies, then setting the AllowCrossTenantReplication property to false prevents future configuration of cross-tenant object replication policies with this storage account as the source or destination.

If the storage account currently does participates in one or more cross-tenant object replication policies, then setting the AllowCrossTenantReplication property to false is not permitted. You must delete the existing cross-tenant policies before you can disallow cross-tenant replication.

By default, the AllowCrossTenantReplication property is not set for a storage account, and its value is null, which is equivalent to true. When the value of the AllowCrossTenantReplication property for a storage account is null or true, then authorized users can configure cross-tenant object replication policies with this account as the source or destination. For more information about how to configure cross-tenant policies, see Configure object replication for block blobs.

You can use Azure Policy to audit a set of storage accounts to ensure that the AllowCrossTenantReplication property is set to prevent cross-tenant object replication. You can also use Azure Policy to enforce governance for a set of storage accounts. For example, you can create a policy with the deny effect to prevent a user from creating a storage account where the AllowCrossTenantReplication property is set to true, or from modifying an existing storage account to change the property value to true.

Replication status

You can check the replication status for a blob in the source account. For more information, see Check the replication status of a blob.

If the replication status for a blob in the source account indicates failure, then investigate the following possible causes:

  • Make sure that the object replication policy is configured on the destination account.
  • Verify that the destination container still exists.
  • If the source blob has been encrypted with a customer-provided key as part of a write operation, then object replication will fail. For more information about customer-provided keys, see Provide an encryption key on a request to Blob storage.

Feature support

This table shows how this feature is supported in your account and the impact on support when you enable certain capabilities.

Storage account type Blob Storage (default support) Data Lake Storage Gen2 1 NFS 3.0 1 SFTP 1
Standard general-purpose v2 Yes No No No
Premium block blobs Yes No No No

1 Data Lake Storage Gen2, Network File System (NFS) 3.0 protocol, and SSH File Transfer Protocol (SFTP) support all require a storage account with a hierarchical namespace enabled.

Billing

Object replication incurs additional costs on read and write transactions against the source and destination accounts, as well as egress charges for the replication of data from the source account to the destination account and read charges to process change feed.

Next steps