Manage Domain-joined HDInsight clusters

Learn about the users and roles in domain-joined HDInsight, and how to manage domain-joined HDInsight clusters.

You can link a normal cluster by using an Ambari-managed username, or link a secure Hadoop cluster by using a domain username (such as user1@contoso.com).

  1. Open the command palette by pressing CTRL+SHIFT+P, and then enter HDInsight: Link a cluster.

  2. Enter the HDInsight cluster URL, the username, and the password, and then select the cluster type. A success message appears if verification passes.

    Note

    If a cluster is both available through a signed-in Azure subscription and linked, the linked username and password are used.

  3. You can see the linked cluster by using the List cluster command. You can now submit a script to this linked cluster.

  4. You can also unlink a cluster by entering HDInsight: Unlink a cluster in the command palette.

In IntelliJ, you can link a normal cluster by using an Ambari-managed username, or link a secure Hadoop cluster by using a domain username (such as user1@contoso.com).

  1. Click Link a cluster from Azure Explorer.

  2. Enter Cluster Name, User Name, and Password. If authentication fails, check the username and password. Optionally, add Storage Account and Storage Key, and then select a container from Storage Container. The storage information is used by the storage explorer in the left tree.

    Note

    If a cluster is both available through a signed-in Azure subscription and linked, the linked storage key, username, and password are used.

  3. If the information you entered is correct, the linked cluster appears under the HDInsight node. You can now submit an application to this linked cluster.

  4. You can also unlink a cluster from Azure Explorer.

In Eclipse, you can link a normal cluster by using an Ambari-managed username, or link a secure Hadoop cluster by using a domain username (such as user1@contoso.com).

  1. Click Link a cluster from Azure Explorer.

  2. Enter Cluster Name, User Name, and Password, and then click OK to link the cluster. Optionally, enter Storage Account and Storage Key, and then select Storage Container so that the storage explorer works in the left tree view.

    Note

    If a cluster is both available through a signed-in Azure subscription and linked, the linked storage key, username, and password are used.

  3. After you click OK, the linked cluster appears under the HDInsight node if the information you entered is correct. You can now submit an application to this linked cluster.

  4. You can also unlink a cluster from Azure Explorer.

Access the clusters with Enterprise Security Package

Enterprise Security Package (previously known as HDInsight Premium) provides multi-user access to the cluster, where authentication is done by Active Directory and authorization by Apache Ranger and Storage ACLs (ADLS ACLs). Authorization provides secure boundaries among multiple users and allows only privileged users to have access to the data based on the authorization policies.

Security and user isolation are important for an HDInsight cluster with Enterprise Security Package. To meet these requirements, SSH access to a cluster with Enterprise Security Package is blocked. The following table shows the recommended access methods for each cluster type:

Workload                   Scenario                                        Access Method
Hadoop                     Hive – Interactive Jobs/Queries
Spark                      Interactive Jobs/Queries, PySpark interactive
Spark                      Batch Scenarios – Spark submit, PySpark
Interactive Query (LLAP)   Interactive
Any                        Install Custom Application

Note

Jupyter is neither installed nor supported in Enterprise Security Package.

Using the standard APIs helps from a security perspective. In addition, you get the following benefits:

  1. Management – You can manage your code and automate jobs by using standard APIs such as Livy and HS2.
  2. Audit – With SSH, there is no way to audit which users connected to the cluster. This is not the case when jobs are submitted through standard endpoints, because they are executed in the context of the user.
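
As one illustration of automating jobs through a standard endpoint, the following sketch prepares a Spark batch submission for Livy without sending it. It assumes that the cluster exposes Livy at https://<clustername>.azurehdinsight.net/livy/batches and accepts basic authentication. The cluster name, credentials, application path, and class name are hypothetical placeholders.

```python
import base64
import json
import urllib.request


def build_livy_batch_request(cluster_name, user, password, app_file, class_name):
    """Prepare (but do not send) a POST that submits a Spark batch job via Livy."""
    # Assumption: Livy is reachable under /livy/batches on the cluster gateway.
    url = f"https://{cluster_name}.azurehdinsight.net/livy/batches"
    payload = json.dumps({"file": app_file, "className": class_name}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
            "X-Requested-By": user,
        },
    )


# Hypothetical values for illustration only.
request = build_livy_batch_request(
    "mycluster", "user1@contoso.com", "password",
    "wasbs:///example/jars/app.jar", "com.example.App",
)
print(request.get_method(), request.full_url)
# Sending is left to the caller: urllib.request.urlopen(request)
```

Because the request carries the submitting user's credentials, the job runs in that user's context, which is what makes the submission auditable.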

Use Beeline

If you install Beeline on your machine and connect over the public internet, use the following parameters:

- Connection string: -u 'jdbc:hive2://<clustername>.azurehdinsight.net:443/;ssl=true;transportMode=http;httpPath=/hive2'
- Cluster login name: -n admin
- Cluster login password: -p 'password'
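
The parameters above can be assembled into a single Beeline invocation. The following is a minimal sketch that builds the argument list and prints a shell-safe command line. The helper name and the sample cluster name are hypothetical, and the sketch does not start Beeline itself.

```python
import shlex


def beeline_public_args(cluster_name, user, password):
    """Build the Beeline argument list for a connection over the public internet."""
    url = (
        f"jdbc:hive2://{cluster_name}.azurehdinsight.net:443/"
        ";ssl=true;transportMode=http;httpPath=/hive2"
    )
    return ["beeline", "-u", url, "-n", user, "-p", password]


# Hypothetical cluster name and credentials.
args = beeline_public_args("mycluster", "admin", "password")
# shlex.join quotes the URL so that the semicolons survive a shell.
print(shlex.join(args))
# To launch Beeline directly: subprocess.run(args, check=True)
```

Passing the arguments as a list avoids shell quoting problems with the semicolons in the connection string.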

If you have Beeline installed locally, and connect over an Azure Virtual Network, use the following parameters:

- Connection string: -u 'jdbc:hive2://<headnode-FQDN>:10001/;transportMode=http'

To find the fully qualified domain name of a headnode, use the information in the Manage HDInsight using the Ambari REST API document.
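
As a rough sketch of the virtual network case, the following code picks head-node names out of an Ambari hosts listing and builds the matching connection string. It assumes that the listing has the shape returned by GET /api/v1/clusters/<clustername>/hosts and that head-node host names start with "hn". The sample response and host names are hypothetical.

```python
def headnode_fqdns(hosts_response):
    """Pick head-node host names out of a parsed Ambari hosts response body."""
    names = [item["Hosts"]["host_name"] for item in hosts_response.get("items", [])]
    # Assumption: head-node host names start with "hn" on HDInsight.
    return sorted(name for name in names if name.startswith("hn"))


def beeline_vnet_url(headnode_fqdn):
    """Build the JDBC URL for a connection from inside the virtual network."""
    return f"jdbc:hive2://{headnode_fqdn}:10001/;transportMode=http"


# Hypothetical response, shaped like GET /api/v1/clusters/<clustername>/hosts.
sample = {
    "items": [
        {"Hosts": {"host_name": "hn0-myclus.example.internal.cloudapp.net"}},
        {"Hosts": {"host_name": "wn0-myclus.example.internal.cloudapp.net"}},
        {"Hosts": {"host_name": "hn1-myclus.example.internal.cloudapp.net"}},
    ]
}
heads = headnode_fqdns(sample)
print(beeline_vnet_url(heads[0]))
```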

Users of Domain-joined HDInsight clusters

An HDInsight cluster that is not domain-joined has two user accounts that are created during cluster creation:

  • Ambari admin: This account is also known as Hadoop user or HTTP user. This account can be used to log on to Ambari at https://<clustername>.azurehdinsight.net. It can also be used to run queries on Ambari views, execute jobs via external tools (for example, PowerShell, Templeton, Visual Studio), and authenticate with the Hive ODBC driver and BI tools (for example, Excel, Power BI, or Tableau).

A domain-joined HDInsight cluster has three new users in addition to the Ambari admin.

  • Ranger admin: This account is the local Apache Ranger admin account. It is not an Active Directory domain user. This account can be used to set up policies and make other users admins or delegated admins (so that those users can manage policies). By default, the username is admin and the password is the same as the Ambari admin password. The password can be updated from the Settings page in Ranger.
  • Cluster admin domain user: This account is an Active Directory domain user designated as the Hadoop cluster admin, including for Ambari and Ranger. You must provide this user’s credentials during cluster creation. This user has the following privileges:

    • Join machines to the domain and place them within the OU that you specify during cluster creation.
    • Create service principals within the OU that you specify during cluster creation.
    • Create reverse DNS entries.

      Note that the other AD users also have these privileges.

      Some endpoints within the cluster (for example, Templeton) are not managed by Ranger, and hence are not secure. These endpoints are locked down for all users except the cluster admin domain user.

  • Regular: During cluster creation, you can provide multiple Active Directory groups. The users in these groups are synced to Ranger and Ambari. These users are domain users and have access only to Ranger-managed endpoints (for example, HiveServer2). All the RBAC policies and auditing apply to these users.

Roles of Domain-joined HDInsight clusters

Domain-joined HDInsight has the following roles:

  • Cluster Administrator
  • Cluster Operator
  • Service Administrator
  • Service Operator
  • Cluster User

To see the permissions of these roles:

  1. Open the Ambari Management UI. See Open the Ambari Management UI.
  2. From the left menu, click Roles.
  3. Click the blue question mark to see the permissions:


Open the Ambari Management UI

  1. Sign in to the Azure portal.
  2. Open your HDInsight cluster. See List and show clusters.
  3. Click Dashboard from the top menu to open Ambari.
  4. Log on to Ambari using the cluster administrator domain user name and password.
  5. Click the Admin dropdown menu from the upper right corner, and then click Manage Ambari.


List the domain users synchronized from your Active Directory

  1. Open the Ambari Management UI. See Open the Ambari Management UI.
  2. From the left menu, click Users. You should see all the users synced from your Active Directory to the HDInsight cluster.

List the domain groups synchronized from your Active Directory

  1. Open the Ambari Management UI. See Open the Ambari Management UI.
  2. From the left menu, click Groups. You should see all the groups synced from your Active Directory to the HDInsight cluster.
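
The synced users and groups can also be read programmatically. The following is a hedged sketch that assumes the Ambari REST API is reachable at https://<clustername>.azurehdinsight.net/api/v1 and exposes the users and groups collections. It prepares the requests without sending them and parses hypothetical sample responses instead of a live call.

```python
import base64
import urllib.request


def ambari_list_request(cluster_name, resource, user, password):
    """Prepare (but do not send) a GET for the Ambari 'users' or 'groups' collection."""
    if resource not in ("users", "groups"):
        raise ValueError("resource must be 'users' or 'groups'")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"https://{cluster_name}.azurehdinsight.net/api/v1/{resource}",
        headers={"Authorization": f"Basic {token}"},
    )


def names_from(response, resource):
    """Extract user or group names from a parsed Ambari response body."""
    keys = {"users": ("Users", "user_name"), "groups": ("Groups", "group_name")}
    section, field = keys[resource]
    return [item[section][field] for item in response.get("items", [])]


# Hypothetical cluster name, credentials, and parsed responses.
req = ambari_list_request("mycluster", "users", "admin", "password")
users = names_from({"items": [{"Users": {"user_name": "user1"}}]}, "users")
groups = names_from({"items": [{"Groups": {"group_name": "hiveusers"}}]}, "groups")
print(req.full_url, users, groups)
```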

Configure Hive Views permissions

  1. Open the Ambari Management UI. See Open the Ambari Management UI.
  2. From the left menu, click Views.
  3. Click HIVE to show the details.


  4. Click the Hive View link to configure Hive Views.
  5. Scroll down to the Permissions section.


  6. Click Add User or Add Group, and then specify the users or groups that can use Hive Views.

Configure users for the roles

To see a list of roles and their permissions, see Roles of Domain-joined HDInsight clusters.

  1. Open the Ambari Management UI. See Open the Ambari Management UI.
  2. From the left menu, click Roles.
  3. Click Add User or Add Group to assign users and groups to different roles.

Next steps