Ports used by Apache Hadoop services on HDInsight

This document lists the ports used by Apache Hadoop services running on HDInsight clusters. It also describes the ports used to connect to the cluster using SSH.

Public ports vs. non-public ports

Linux-based HDInsight clusters expose only three ports publicly on the internet: 22, 23, and 443. These ports are used to securely access the cluster using SSH and to reach services exposed over the secure HTTPS protocol.

Internally, HDInsight is implemented by several Azure Virtual Machines (the nodes within the cluster) running on an Azure Virtual Network. From within the virtual network, you can access ports that are not exposed over the internet. For example, if you connect to one of the head nodes using SSH, you can then access services running on the cluster nodes directly from that head node.
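
As a minimal sketch of the point above, the following Python snippet checks whether a TCP port accepts connections from the machine it runs on. Run from a head node after connecting over SSH, it could confirm that a non-public port is reachable inside the virtual network. The function name port_is_open is hypothetical and not part of any HDInsight tooling.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, port_is_open("10.0.0.11", 8080) would probe the internal Ambari port on a head node, where 10.0.0.11 is a placeholder address. A successful connection only shows that something is listening; it does not verify which service answered.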

Important

If you don't specify an Azure Virtual Network as a configuration option for HDInsight, one is created automatically. However, you can't join other machines (such as other Azure Virtual Machines or your client development machine) to this virtual network.

To join additional machines to the virtual network, you must create the virtual network first, and then specify it when creating your HDInsight cluster. For more information, see Plan a virtual network for HDInsight.

Public ports

All the nodes in an HDInsight cluster are located in an Azure Virtual Network and can't be directly accessed from the internet. A public gateway provides internet access to the following ports, which are common across all HDInsight cluster types.

| Service | Port | Protocol | Description |
| --- | --- | --- | --- |
| sshd | 22 | SSH | Connects clients to sshd on the primary head node. For more information, see Use SSH with HDInsight. |
| sshd | 22 | SSH | Connects clients to sshd on the edge node. For more information, see Use SSH with HDInsight. |
| sshd | 23 | SSH | Connects clients to sshd on the secondary head node. For more information, see Use SSH with HDInsight. |
| Ambari | 443 | HTTPS | Ambari web UI. See Manage HDInsight using the Apache Ambari Web UI. |
| Ambari | 443 | HTTPS | Ambari REST API. See Manage HDInsight using the Apache Ambari REST API. |
| WebHCat | 443 | HTTPS | HCatalog REST API. See Use MapReduce with Curl. |
| HiveServer2 | 443 | ODBC | Connects to Hive using ODBC. See Connect Excel to HDInsight with the Microsoft ODBC driver. |
| HiveServer2 | 443 | JDBC | Connects to Apache Hive using JDBC. See Connect to Apache Hive on HDInsight using the Hive JDBC driver. |
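
The SSH rows in the table can be illustrated with a short sketch that picks the public port for a given node and assembles an ssh command line. The hostname patterns CLUSTERNAME-ssh.azurehdinsight.net and EDGENAME.CLUSTERNAME-ssh.azurehdinsight.net are assumptions that do not appear in this document, and the names SSH_PORTS and ssh_command are hypothetical.

```python
from typing import List

# Public SSH ports from the table above, keyed by the node they reach.
SSH_PORTS = {"primary": 22, "secondary": 23, "edge": 22}


def ssh_command(cluster: str, user: str, node: str = "primary",
                edge_name: str = "") -> List[str]:
    """Build an ssh argument list for the requested node type."""
    if node not in SSH_PORTS:
        raise ValueError(f"unknown node type: {node}")
    # Assumed hostname pattern; adjust if your cluster uses a different one.
    host = f"{cluster}-ssh.azurehdinsight.net"
    if node == "edge":
        if not edge_name:
            raise ValueError("edge_name is required for edge nodes")
        # Assumed pattern: the edge node name is prefixed to the SSH hostname.
        host = f"{edge_name}.{host}"
    return ["ssh", "-p", str(SSH_PORTS[node]), f"{user}@{host}"]
```

With these assumptions, ssh_command("mycluster", "sshuser", "secondary") yields the arguments for connecting to the secondary head node on port 23. The list can be passed to subprocess.run or joined with spaces for display.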

The following are available for specific cluster types:

| Service | Port | Protocol | Cluster type | Description |
| --- | --- | --- | --- | --- |
| Stargate | 443 | HTTPS | HBase | HBase REST API. See Get started using Apache HBase. |
| Livy | 443 | HTTPS | Spark | Spark REST API. See Submit Apache Spark jobs remotely using Apache Livy. |
| Spark Thrift server | 443 | HTTPS | Spark | Spark Thrift server used to submit Hive queries. See Use Beeline with Apache Hive on HDInsight. |
| Storm | 443 | HTTPS | Storm | Storm web UI. See Deploy and manage Apache Storm topologies on HDInsight. |
| Kafka REST proxy | 443 | HTTPS | Kafka | Kafka REST API. See Interact with Apache Kafka clusters in Azure HDInsight using a REST proxy. |

Authentication

All services publicly exposed on the internet must be authenticated:

| Port | Credentials |
| --- | --- |
| 22 or 23 | The SSH user credentials specified during cluster creation |
| 443 | The login name (default: admin) and password that were set during cluster creation |
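
To make the port 443 row concrete, here is a minimal sketch that attaches the cluster login and password to an HTTPS request. It assumes the gateway accepts HTTP Basic authentication, which is what curl -u sends by default. The function name gateway_request is hypothetical, and the URL in the usage note is a placeholder.

```python
import base64
import urllib.request


def gateway_request(url: str, login: str, password: str) -> urllib.request.Request:
    """Return a Request carrying an HTTP Basic Authorization header."""
    # Basic auth is base64("login:password"); it is only safe over HTTPS.
    token = base64.b64encode(f"{login}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request
```

The returned object can be passed to urllib.request.urlopen, for example gateway_request("https://CLUSTER_HOST/api/v1/clusters", "admin", password), where CLUSTER_HOST stands for your cluster's public hostname. Read the password from a prompt or a secret store rather than hard-coding it.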

Non-public ports

Note

Some services are only available on specific cluster types. For example, HBase is only available on HBase cluster types.

Important

Some services run on only one head node at a time. If you attempt to connect to the service on the primary head node and receive an error, retry using the secondary head node.
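
The retry advice above can be sketched as a small helper that tries each head node in order and returns the first successful result. The helper name first_responding and the host names hn0 and hn1 in the usage note are hypothetical. The sketch treats any OSError (which includes refused connections, timeouts, and urllib's URLError) as a signal to try the next node.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def first_responding(hosts: Iterable[str], call: Callable[[str], T]) -> T:
    """Run call(host) for each host in order; return the first success."""
    last_error: Optional[OSError] = None
    for host in hosts:
        try:
            return call(host)
        except OSError as err:
            # Remember the failure and fall through to the next head node.
            last_error = err
    if last_error is None:
        raise ValueError("no hosts were given")
    # Every host failed: surface the most recent connection error.
    raise last_error
```

A caller might pass ["hn0", "hn1"] along with a function that opens a URL on the given host. Errors that are not connection related, such as an HTTP 401 handled by the callback itself, are left to the caller to interpret.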

Ambari

| Service | Nodes | Port | URL path | Protocol |
| --- | --- | --- | --- | --- |
| Ambari web UI | Head nodes | 8080 | / | HTTP |
| Ambari REST API | Head nodes | 8080 | /api/v1 | HTTP |

Examples:

  • Ambari REST API: curl -u admin "http://10.0.0.11:8080/api/v1/clusters"

HDFS ports

| Service | Nodes | Port | Protocol | Description |
| --- | --- | --- | --- | --- |
| NameNode web UI | Head nodes | 30070 | HTTPS | Web UI to view status |
| NameNode metadata service | Head nodes | 8020 | IPC | File system metadata |
| DataNode | All worker nodes | 30075 | HTTPS | Web UI to view status, logs, etc. |
| DataNode | All worker nodes | 30010 | | Data transfer |
| DataNode | All worker nodes | 30020 | IPC | Metadata operations |
| Secondary NameNode | Head nodes | 50090 | HTTP | Checkpoint for NameNode metadata |

YARN ports

| Service | Nodes | Port | Protocol | Description |
| --- | --- | --- | --- | --- |
| Resource Manager web UI | Head nodes | 8088 | HTTP | Web UI for Resource Manager |
| Resource Manager web UI | Head nodes | 8090 | HTTPS | Web UI for Resource Manager |
| Resource Manager admin interface | Head nodes | 8141 | IPC | For application submissions (Hive, Hive server, Pig, etc.) |
| Resource Manager scheduler | Head nodes | 8030 | HTTP | Administrative interface |
| Resource Manager application interface | Head nodes | 8050 | HTTP | Address of the applications manager interface |
| NodeManager | All worker nodes | 30050 | | The address of the container manager |
| NodeManager web UI | All worker nodes | 30060 | HTTP | Resource Manager interface |
| Timeline address | Head nodes | 10200 | RPC | The Timeline service RPC service |
| Timeline web UI | Head nodes | 8188 | HTTP | The Timeline service web UI |

Hive ports

| Service | Nodes | Port | Protocol | Description |
| --- | --- | --- | --- | --- |
| HiveServer2 | Head nodes | 10001 | Thrift | Service for connecting to Hive (Thrift/JDBC) |
| Hive Metastore | Head nodes | 9083 | Thrift | Service for connecting to Hive metadata (Thrift/JDBC) |

WebHCat ports

| Service | Nodes | Port | Protocol | Description |
| --- | --- | --- | --- | --- |
| WebHCat server | Head nodes | 30111 | HTTP | Web API on top of HCatalog and other Hadoop services |

MapReduce ports

| Service | Nodes | Port | Protocol | Description |
| --- | --- | --- | --- | --- |
| JobHistory | Head nodes | 19888 | HTTP | MapReduce JobHistory web UI |
| JobHistory | Head nodes | 10020 | | MapReduce JobHistory server |
| ShuffleHandler | | 13562 | | Transfers intermediate Map outputs to requesting Reducers |

Oozie

| Service | Nodes | Port | Protocol | Description |
| --- | --- | --- | --- | --- |
| Oozie server | Head nodes | 11000 | HTTP | URL for Oozie service |
| Oozie server | Head nodes | 11001 | HTTP | Port for Oozie admin |

Ambari Metrics

| Service | Nodes | Port | Protocol | Description |
| --- | --- | --- | --- | --- |
| TimeLine (Application history) | Head nodes | 6188 | HTTP | The TimeLine service web UI |
| TimeLine (Application history) | Head nodes | 30200 | RPC | The TimeLine service web UI |

HBase ports

| Service | Nodes | Port | Protocol | Description |
| --- | --- | --- | --- | --- |
| HMaster | Head nodes | 16000 | | |
| HMaster info web UI | Head nodes | 16010 | HTTP | The port for the HBase Master web UI |
| Region server | All worker nodes | 16020 | | |
| | | 2181 | | The port that clients use to connect to ZooKeeper |

Kafka ports

| Service | Nodes | Port | Protocol | Description |
| --- | --- | --- | --- | --- |
| Broker | Worker nodes | 9092 | Kafka Wire Protocol | Used for client communication |
| | ZooKeeper nodes | 2181 | | The port that clients use to connect to ZooKeeper |
| REST proxy | Kafka management nodes | 9400 | HTTPS | Kafka REST specification |

Spark ports

| Service | Nodes | Port | Protocol | URL path | Description |
| --- | --- | --- | --- | --- | --- |
| Spark Thrift servers | Head nodes | 10002 | Thrift | | Service for connecting to Spark SQL (Thrift/JDBC) |
| Livy server | Head nodes | 8998 | HTTP | | Service for running statements, jobs, and applications |
| Jupyter notebook | Head nodes | 8001 | HTTP | | Jupyter notebook website |

Examples:

  • Livy: curl -u admin -G "http://10.0.0.11:8998/". In this example, 10.0.0.11 is the IP address of the head node that hosts the Livy service.