
Compare storage options for use with Azure HDInsight clusters

When creating HDInsight clusters, you can choose between a few different Azure storage services: Azure Storage, Azure Data Lake Storage Gen2, and Azure Data Lake Storage Gen1.

This article provides an overview of these storage types and their unique features.

Storage types and features

The following table summarizes the Azure Storage services that are supported with different versions of HDInsight:

| Storage service | Account type | Namespace type | Supported services | Supported performance tiers | Supported access tiers | HDInsight version | Cluster type |
|---|---|---|---|---|---|---|---|
| Azure Data Lake Storage Gen2 | General-purpose V2 | Hierarchical (filesystem) | Blob | Standard | Hot, Cool, Archive | 3.6+ | All except Spark 2.1 and 2.2 |
| Azure Storage | General-purpose V2 | Object | Blob | Standard | Hot, Cool, Archive | 3.6+ | All |
| Azure Storage | General-purpose V1 | Object | Blob | Standard | N/A | All | All |
| Azure Storage | Blob Storage** | Object | Block Blob | Standard | Hot, Cool, Archive | All | All |
| Azure Data Lake Storage Gen1 | N/A | Hierarchical (filesystem) | N/A | N/A | N/A | 3.6 only | All except HBase |

** For HDInsight clusters, only secondary storage accounts can be of type BlobStorage, and Page Blob isn't a supported storage option.
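The "Cluster type" column of the table above can be expressed as a small lookup. The following is a minimal sketch, not an official API: the service labels (`ADLSGen2`, `ADLSGen1`, `AzureStorage`) and the function name are hypothetical, and the function checks only the cluster-type restriction, not the HDInsight version column.

```python
from typing import Optional


def cluster_type_allowed(service: str, cluster_type: str,
                         spark_version: Optional[str] = None) -> bool:
    """Apply the 'Cluster type' column of the storage services table.

    The service labels are hypothetical shorthand for the table rows.
    """
    if service == "ADLSGen2":
        # All cluster types except Spark 2.1 and 2.2.
        return not (cluster_type == "Spark" and spark_version in {"2.1", "2.2"})
    if service == "ADLSGen1":
        # All cluster types except HBase.
        return cluster_type != "HBase"
    if service == "AzureStorage":
        # All cluster types.
        return True
    raise ValueError(f"unknown storage service label: {service!r}")


if __name__ == "__main__":
    print(cluster_type_allowed("ADLSGen2", "Spark", "2.2"))
    print(cluster_type_allowed("ADLSGen1", "Hadoop"))
```

Raising on an unknown label, rather than returning `False`, keeps a typo in the service name from being mistaken for an unsupported combination.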

For more information on Azure Storage account types, see Azure storage account overview.

For more information on Azure Storage access tiers, see Azure Blob storage: Premium (preview), Hot, Cool, and Archive storage tiers.

You can create clusters using combinations of services for primary and optional secondary storage. The following table summarizes the cluster storage configurations that are currently supported in HDInsight:

| HDInsight version | Primary storage | Secondary storage | Supported |
|---|---|---|---|
| 3.6 & 4.0 | General Purpose V1, General Purpose V2 | General Purpose V1, General Purpose V2, BlobStorage (Block Blobs) | Yes |
| 3.6 & 4.0 | General Purpose V1, General Purpose V2 | Data Lake Storage Gen2 | No |
| 3.6 & 4.0 | Data Lake Storage Gen2* | Data Lake Storage Gen2 | Yes |
| 3.6 & 4.0 | Data Lake Storage Gen2* | General Purpose V1, General Purpose V2, BlobStorage (Block Blobs) | Yes |
| 3.6 & 4.0 | Data Lake Storage Gen2 | Data Lake Storage Gen1 | No |
| 3.6 | Data Lake Storage Gen1 | Data Lake Storage Gen1 | Yes |
| 3.6 | Data Lake Storage Gen1 | General Purpose V1, General Purpose V2, BlobStorage (Block Blobs) | Yes |
| 3.6 | Data Lake Storage Gen1 | Data Lake Storage Gen2 | No |
| 4.0 | Data Lake Storage Gen1 | Any | No |
| 4.0 | General Purpose V1, General Purpose V2 | Data Lake Storage Gen1 | No |

* = This could be one or multiple Data Lake Storage Gen2 accounts, as long as they're all set up to use the same managed identity for cluster access.
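To check a planned configuration against the table above, the rows can be encoded as data and searched in order. This is a minimal sketch under stated assumptions: the kind labels (`GPv1`, `GPv2`, `BlobStorage`, `ADLSGen1`, `ADLSGen2`) and the function name are hypothetical shorthand, and a combination that the table does not list returns `None` rather than a guess.

```python
from typing import Optional

# Hypothetical shorthand labels for the storage kinds named in the table.
GENERAL_PURPOSE = {"GPv1", "GPv2"}
BLOB_SECONDARY = GENERAL_PURPOSE | {"BlobStorage"}
ANY_KIND = BLOB_SECONDARY | {"ADLSGen1", "ADLSGen2"}

# One tuple per table row: (versions, primary kinds, secondary kinds, supported).
ROWS = [
    ({"3.6", "4.0"}, GENERAL_PURPOSE, BLOB_SECONDARY, True),
    ({"3.6", "4.0"}, GENERAL_PURPOSE, {"ADLSGen2"}, False),
    ({"3.6", "4.0"}, {"ADLSGen2"}, {"ADLSGen2"}, True),
    ({"3.6", "4.0"}, {"ADLSGen2"}, BLOB_SECONDARY, True),
    ({"3.6", "4.0"}, {"ADLSGen2"}, {"ADLSGen1"}, False),
    ({"3.6"}, {"ADLSGen1"}, {"ADLSGen1"}, True),
    ({"3.6"}, {"ADLSGen1"}, BLOB_SECONDARY, True),
    ({"3.6"}, {"ADLSGen1"}, {"ADLSGen2"}, False),
    ({"4.0"}, {"ADLSGen1"}, ANY_KIND, False),
    ({"4.0"}, GENERAL_PURPOSE, {"ADLSGen1"}, False),
]


def is_supported(version: str, primary: str, secondary: str) -> Optional[bool]:
    """Return the table's Yes/No for a combination, or None if it is not listed."""
    for versions, primaries, secondaries, supported in ROWS:
        if version in versions and primary in primaries and secondary in secondaries:
            return supported
    return None


if __name__ == "__main__":
    print(is_supported("4.0", "GPv2", "BlobStorage"))
    print(is_supported("3.6", "ADLSGen1", "ADLSGen2"))
    print(is_supported("3.6", "GPv1", "ADLSGen1"))
```

The sketch does not model the footnote about multiple Data Lake Storage Gen2 accounts sharing one managed identity; that condition would need to be verified separately.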

Note

Data Lake Storage Gen2 primary storage is not supported for Spark 2.1 or 2.2 clusters.

Data replication

Azure HDInsight does not store customer data. The primary means of storage for a cluster are its associated storage accounts. You can attach your cluster to an existing storage account, or create a new storage account during the cluster creation process. If a new account is created, it is created as a locally redundant storage (LRS) account and satisfies in-region data residency requirements, including those specified in the Trust Center.

You can validate that HDInsight is properly configured to store data in a single region by ensuring that the storage account associated with your HDInsight cluster is LRS or another storage option mentioned on the Trust Center.
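The LRS part of that check can be sketched as a test on the storage account's SKU name. This is an illustration under assumptions, not an official validation procedure: it assumes you have already retrieved the SKU name (for example, `Standard_LRS`) by some means such as `az storage account show --query sku.name`, the function name is hypothetical, and it does not cover the other storage options mentioned on the Trust Center.

```python
def is_locally_redundant(sku_name: str) -> bool:
    """Return True if a storage SKU name such as 'Standard_LRS' denotes LRS.

    Only the replication suffix is inspected; the performance tier prefix
    (for example 'Standard' or 'Premium') is ignored.
    """
    return sku_name.strip().upper().endswith("_LRS")


if __name__ == "__main__":
    for name in ("Standard_LRS", "Standard_GRS", "Premium_LRS"):
        print(name, is_locally_redundant(name))
```

A `False` result means only that the account is not LRS; whether its replication option still keeps data in a single region has to be confirmed against the Trust Center.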

Next steps