Set up HDInsight clusters with a custom Ambari DB

Apache Ambari simplifies the management and monitoring of an Apache Hadoop cluster. Ambari provides an easy-to-use web UI and REST API. Ambari is included on HDInsight clusters, and is used to monitor the cluster and make configuration changes.

In normal cluster creation, as described in other articles such as Set up clusters in HDInsight, Ambari is deployed in an S0 Azure SQL Database that is managed by HDInsight and is not accessible to users.

The custom Ambari DB feature allows you to deploy a new cluster and set up Ambari in an external database that you manage. The deployment is done with an Azure Resource Manager template. This feature has the following benefits:

  • Customization - you choose the size and processing capacity of the database. If you have large clusters processing intensive workloads, an Ambari database with lower specifications could become a bottleneck for management operations.
  • Flexibility - you can scale the database as needed to suit your requirements.
  • Control - you can manage backups and security for your database in a way that fits your organization's requirements.

The remainder of this article discusses the following points:

  • The requirements to use the custom Ambari DB feature
  • The steps necessary to provision HDInsight clusters using your own external database for Apache Ambari

Custom Ambari DB requirements

You can deploy a custom Ambari DB with all cluster types and versions. Multiple clusters cannot use the same Ambari DB.

The custom Ambari DB has the following other requirements:

  • The name of the database cannot contain hyphens or spaces.
  • You must have an existing Azure SQL DB server and database.
  • The database that you provide for Ambari setup must be empty. There should be no tables in the default dbo schema.
  • The user used to connect to the database should have SELECT, CREATE TABLE, and INSERT permissions on the database.
  • Turn on the option to Allow access to Azure services on the server where you will host Ambari.
  • Management IP addresses from the HDInsight service need to be allowed in the firewall rule. See HDInsight management IP addresses for a list of the IP addresses that must be added to the server-level firewall rule.
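
Two of the requirements above lend themselves to a quick pre-flight check. The following is a minimal sketch, not part of the original article: the helper names `is_valid_ambari_db_name` and `print_firewall_rule_cmd`, the resource names, and the IP address `203.0.113.10` are all hypothetical placeholders. The second helper only prints an `az sql server firewall-rule create` command so that you can review it before running it yourself.

```shell
#!/bin/sh
# Sketch only: helper names, resource names, and the IP address are hypothetical.

# Succeeds (exit 0) when the database name is non-empty and contains
# no hyphens and no spaces; fails (exit 1) otherwise.
is_valid_ambari_db_name() {
    case "$1" in
        ""|*-*|*" "*) return 1 ;;
        *) return 0 ;;
    esac
}

# Prints (does not run) the Azure CLI command that would add one
# server-level firewall rule covering a single management IP address.
# $1 = resource group, $2 = SQL server name, $3 = rule name, $4 = IP address
print_firewall_rule_cmd() {
    printf 'az sql server firewall-rule create --resource-group %s --server %s --name %s --start-ip-address %s --end-ip-address %s\n' \
        "$1" "$2" "$3" "$4" "$4"
}

# Example usage (placeholder values; substitute the real management IPs):
#   is_valid_ambari_db_name "ambaridb01" && echo "name is acceptable"
#   print_firewall_rule_cmd myrg mysqlserver hdi-mgmt-1 203.0.113.10
```

Printing the command instead of executing it keeps the sketch safe to run anywhere; the actual IP addresses should come from the HDInsight management IP addresses list for your region.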

When you host your Apache Ambari DB in an external database, remember the following points:

  • You're responsible for the additional costs of the Azure SQL DB that holds Ambari.
  • Back up your custom Ambari DB periodically. Azure SQL Database generates backups automatically, but the backup retention time frame varies. For more information, see Learn about automatic SQL Database backups.

Deploy clusters with a custom Ambari DB

To create an HDInsight cluster that uses your own external Ambari database, use the custom Ambari DB Quickstart template.

Edit the parameters in the azuredeploy.parameters.json file to specify information about your new cluster and the database that will hold Ambari.

You can begin the deployment using the Azure CLI. Replace <RESOURCEGROUPNAME> with the resource group where you want to deploy your cluster.

az deployment group create --name HDInsightAmbariDBDeployment \
    --resource-group <RESOURCEGROUPNAME> \
    --template-file azuredeploy.json \
    --parameters azuredeploy.parameters.json
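
Before starting a long-running deployment, it can help to validate the template and parameters first. The wrapper below is a minimal sketch under stated assumptions: the function name `validate_then_deploy` is hypothetical, it assumes a recent Azure CLI in which the `az deployment group` commands are available, and it assumes both JSON files are in the current directory. It refuses to run while the `<RESOURCEGROUPNAME>` placeholder is still in place.

```shell
#!/bin/sh
# Sketch only: validate_then_deploy is a hypothetical helper name.
# $1 = name of the resource group to deploy into
validate_then_deploy() {
    rg="$1"

    # Guard against an empty value or the unreplaced placeholder.
    case "$rg" in
        ""|"<RESOURCEGROUPNAME>")
            echo "Set a real resource group name first." >&2
            return 2
            ;;
    esac

    # Validate the template and parameters; deploy only if validation succeeds.
    az deployment group validate \
        --resource-group "$rg" \
        --template-file azuredeploy.json \
        --parameters azuredeploy.parameters.json \
    && az deployment group create --name HDInsightAmbariDBDeployment \
        --resource-group "$rg" \
        --template-file azuredeploy.json \
        --parameters azuredeploy.parameters.json
}

# Example usage (placeholder resource group name):
#   validate_then_deploy my-hdinsight-rg
```

Validation catches template and parameter errors early, but it does not confirm that the Ambari database itself is empty or reachable, so the requirements listed earlier still need to be checked separately.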

Next steps