Use Apache Zeppelin to run Apache Phoenix queries over Apache HBase in Azure HDInsight

Apache Phoenix is an open source, massively parallel relational database layer built on HBase. Phoenix allows you to use SQL like queries over HBase. Phoenix uses JDBC drivers underneath to enable you to create, delete, alter SQL tables, indexes, views, and sequences. You can also use Phoenix to update rows individually and in bulk. Phoenix uses a NOSQL native compilation rather than using MapReduce to compile queries, enabling the creation of low-latency applications on top of HBase.

Apache Zeppelin is an open source web-based notebook that enables you to create data-driven, collaborative documents using interactive data analytics and languages such as SQL and Scala. It helps data developers & data scientists develop, organize, execute, and share code for data manipulation. It allows you to visualize results without referring to the command line or needing the cluster details.

HDInsight users can use Apache Zeppelin to query Phoenix tables. Apache Zeppelin is integrated with HDInsight cluster and there are no additional steps to use it. Create a Zeppelin Notebook with JDBC interpreter and start writing your Phoenix SQL queries

Prerequisites

An Apache HBase cluster on HDInsight. See Get started with Apache HBase.

Create an Apache Zeppelin Note

  1. Replace CLUSTERNAME with the name of your cluster in the following URL https://CLUSTERNAME.azurehdinsight.net/zeppelin. Then enter the URL in a web browser. Enter your cluster login username and password.

  2. From the Zeppelin page, select Create new note.

    HDInsight Interactive Query zeppelin.

  3. From the Create new note dialog, type or select the following values:

    • Note Name: Enter a name for the note.
    • Default interpreter: Select jdbc from the drop-down list.

    Then select Create Note.

  4. Ensure the notebook header shows a connected status. It's denoted by a green dot in the top-right corner.

    Zeppelin notebook status.

  5. Create an HBase table. Enter the following command and then press Shift + Enter:

    %jdbc(phoenix)
    CREATE TABLE Company (
        company_id INTEGER PRIMARY KEY,
        name VARCHAR(225)
    );
    

    The %jdbc(phoenix) statement in the frontline tells the notebook to use the Phoenix JDBC interpreter.

  6. View created tables.

    %jdbc(phoenix)
    SELECT DISTINCT table_name
    FROM SYSTEM.CATALOG
    WHERE table_schem is null or table_schem <> 'SYSTEM';
    
  7. Insert values in the table.

    %jdbc(phoenix)
    UPSERT INTO Company VALUES(1, 'Microsoft');
    UPSERT INTO Company (name, company_id) VALUES('Apache', 2);
    
  8. Query the table.

    %jdbc(phoenix)
    SELECT * FROM Company;
    
  9. Delete a record.

    %jdbc(phoenix)
    DELETE FROM Company WHERE COMPANY_ID=1;
    
  10. Drop the table.

    %jdbc(phoenix)
    DROP TABLE Company;
    

Next steps