Use Apache Zeppelin to run Apache Hive queries in Azure HDInsight

HDInsight Interactive Query clusters include Apache Zeppelin notebooks that you can use to run interactive Hive queries. In this article, you learn how to use Apache Zeppelin to run Apache Hive queries in Azure HDInsight.


Before going through this article, you must have the following items:

  • An HDInsight Interactive Query cluster. See Create cluster to create an HDInsight cluster, and make sure to choose the Interactive Query cluster type.

Create an Apache Zeppelin Note

  1. Browse to the following URL:

    Replace CLUSTERNAME with the name of your cluster.

  2. Enter your Hadoop username and password. From the Zeppelin page, you can either create a new note or open existing notes. HiveSample contains some sample Hive queries.

  3. Click Create new Note.
  4. Type or select the following values:

    • Note name: enter a name for the note.
    • Default interpreter: select JDBC.
  5. Click Create Note.

  6. Run the following Hive query:

     %jdbc(hive)
     show tables

    The %jdbc(hive) statement in the first line tells the notebook to use the Hive JDBC interpreter.

    The query returns one Hive table named hivesampletable.

    The following are two more Hive queries that you can run against the hivesampletable.

     %jdbc(hive)
     select * from hivesampletable limit 10

     %jdbc(hive)
     select ${group_name}, count(*) as total_count
     from hivesampletable
     group by ${group_name=market,market|deviceplatform|devicemake}
     limit ${total_count=10}
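
    The ${...} placeholders in the grouped query are Zeppelin dynamic forms: ${total_count=10} renders a text box with a default value of 10, and ${group_name=market,market|deviceplatform|devicemake} renders a drop-down list with market as the default. As a minimal sketch of the same mechanism used to filter rows, the following query uses a hypothetical form name, market_filter, and assumes that en-US is one of the values in the market column:

     %jdbc(hive)
     -- market_filter is a hypothetical form name chosen for illustration
     -- en-US is an assumed sample value; replace it with a market that exists in your data
     select deviceplatform, count(*) as total_count
     from hivesampletable
     where market = '${market_filter=en-US}'
     group by deviceplatform
     limit ${total_count=10}

    If you change a form value, run the paragraph again to apply it.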

    Compared to traditional Hive, the query results come back much faster.

Next steps

In this article, you learned how to use Apache Zeppelin to run Apache Hive queries in Azure HDInsight. To learn more, see the following articles: