Set up HDInsight clusters with a custom Ambari DB

Apache Ambari simplifies the management and monitoring of an Apache Hadoop cluster. Ambari provides an easy-to-use web UI and REST API. Ambari is included on HDInsight clusters, where it is used to monitor the cluster and make configuration changes.

In normal cluster creation, as described in other articles such as Set up clusters in HDInsight, the Ambari database is deployed as an S0 Azure SQL database that HDInsight manages and that users cannot access.

The custom Ambari DB feature lets you deploy a new cluster and set up Ambari in an external database that you manage. The deployment is done with an Azure Resource Manager template. This feature has the following benefits:

  • Customization: you choose the size and processing capacity of the database. If you have large clusters processing intensive workloads, an Ambari database with lower specifications could become a bottleneck for management operations.
  • Flexibility: you can scale the database as needed to suit your requirements.
  • Control: you can manage backups and security for your database in a way that fits your organization's requirements.

The remainder of this article discusses the following points:

  • The requirements for using the custom Ambari DB feature
  • The steps necessary to provision HDInsight clusters using your own external database for Apache Ambari

Custom Ambari DB requirements

You can deploy a custom Ambari DB with all cluster types and versions. Multiple clusters cannot use the same Ambari DB.

The custom Ambari DB has the following other requirements:

  • You must have an existing Azure SQL DB server and database.
  • The database that you provide for Ambari setup must be empty. There should be no tables in the default dbo schema.
  • The user account used to connect to the database should have SELECT, CREATE TABLE, and INSERT permissions on the database.
  • Turn on the option to Allow access to Azure services on the Azure SQL server where you will host Ambari.
  • Management IP addresses from the HDInsight service must be allowed in the SQL server firewall. See HDInsight management IP addresses for a list of the IP addresses that must be added to the firewall.
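
The two firewall requirements above can also be applied from the Azure CLI. The following is a minimal sketch rather than an official procedure: the function names, the rule names, and the `AZ` override variable are illustrative assumptions, and the management IP addresses themselves must come from the HDInsight management IP addresses article. It assumes that a firewall rule whose start and end addresses are both 0.0.0.0 corresponds to the Allow access to Azure services option.

```shell
#!/bin/sh
# Sketch only: helper functions for the SQL server firewall requirements.
# AZ may be overridden (for example AZ=echo) to preview the commands
# without contacting Azure. This variable is an assumption of the sketch.
AZ="${AZ:-az}"

# allow_azure_services <resource-group> <sql-server-name>
# A rule with start and end address 0.0.0.0 is the special range that
# stands for "Allow access to Azure services". The rule name is arbitrary.
allow_azure_services() {
    "$AZ" sql server firewall-rule create \
        --resource-group "$1" \
        --server "$2" \
        --name AllowAzureServices \
        --start-ip-address 0.0.0.0 \
        --end-ip-address 0.0.0.0
}

# allow_management_ip <resource-group> <sql-server-name> <ip-address>
# Adds a single-address rule for one HDInsight management IP.
# The rule name (hypothetical) is derived from the address, dots become dashes.
allow_management_ip() {
    "$AZ" sql server firewall-rule create \
        --resource-group "$1" \
        --server "$2" \
        --name "HDInsightMgmt-$(printf '%s' "$3" | tr . -)" \
        --start-ip-address "$3" \
        --end-ip-address "$3"
}
```

After sourcing the file, call `allow_azure_services <RESOURCEGROUPNAME> <SQLSERVERNAME>` once, then call `allow_management_ip` once for each management IP address that applies to your region.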

When you host your Apache Ambari DB in an external database, remember the following points:

  • You're responsible for the additional costs of the Azure SQL DB that holds Ambari.
  • Back up your custom Ambari DB periodically. Azure SQL Database generates backups automatically, but the backup retention period varies. For more information, see Learn about automatic SQL Database backups.

Deploy clusters with a custom Ambari DB

To create an HDInsight cluster that uses your own external Ambari database, use the custom Ambari DB Quickstart template.

Edit the parameters in azuredeploy.parameters.json to specify information about your new cluster and the database that will hold Ambari.

You can begin the deployment using the Azure CLI. Replace <RESOURCEGROUPNAME> with the resource group where you want to deploy your cluster.

az group deployment create --name HDInsightAmbariDBDeployment \
    --resource-group <RESOURCEGROUPNAME> \
    --template-file azuredeploy.json \
    --parameters azuredeploy.parameters.json
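
The command above uses the `az group deployment` form. More recent Azure CLI releases expose the same operation under `az deployment group`. The following is a hedged sketch that validates the template and parameters before starting the deployment. The function name and the `AZ` override variable are illustrative assumptions, not part of the Quickstart template.

```shell
#!/bin/sh
# Sketch only: validate, then deploy, using the newer command group.
# AZ may be overridden (for example AZ=echo) to preview the commands.
AZ="${AZ:-az}"

# deploy_ambari_db_cluster <resource-group>
# Expects azuredeploy.json and azuredeploy.parameters.json in the
# current directory, as in the command shown earlier.
deploy_ambari_db_cluster() {
    # Check the template and parameters first; stop if validation fails.
    "$AZ" deployment group validate \
        --resource-group "$1" \
        --template-file azuredeploy.json \
        --parameters azuredeploy.parameters.json || return 1

    # Start the deployment under the same name used above.
    "$AZ" deployment group create \
        --name HDInsightAmbariDBDeployment \
        --resource-group "$1" \
        --template-file azuredeploy.json \
        --parameters azuredeploy.parameters.json
}
```

Calling `deploy_ambari_db_cluster <RESOURCEGROUPNAME>` runs both steps. Validating first surfaces template and parameter errors before any resources are created.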

Next steps