hdinsight-connect-hive-zeppelin.md

title: Use Apache Zeppelin to run Apache Hive queries in Azure HDInsight
description: Learn how to use Apache Zeppelin to run Apache Hive queries.
keywords: hdinsight,hadoop,hive,interactive query,LLAP
services: hdinsight
author: hrasheed-msft
ms.reviewer: jasonh
ms.service: hdinsight
ms.custom: hdinsightactive,
ms.topic: conceptual
ms.date: 11/05/2018
ms.author: hrasheed

Use Apache Zeppelin to run Apache Hive queries in Azure HDInsight

HDInsight Interactive Query clusters include Apache Zeppelin notebooks that you can use to run interactive Hive queries. In this article, you learn how to use Apache Zeppelin to run Apache Hive queries in Azure HDInsight.

Prerequisites

Before you begin, you must have the following item:

  • An HDInsight Interactive Query cluster. See Create cluster to create an HDInsight cluster. Make sure to choose the Interactive Query cluster type.

Create an Apache Zeppelin Note

  1. Browse to the following URL:

     https://CLUSTERNAME.azurehdinsight.net/zeppelin
    

    Replace CLUSTERNAME with the name of your cluster.

  2. Enter your Hadoop username and password. From the Zeppelin page, you can either create a new note or open existing notes. HiveSample contains some sample Hive queries.

    [Image: HDInsight Interactive Query zeppelin]

  3. Click Create new Note.

  4. Type or select the following values:

    • Note name: enter a name for the note.
    • Default interpreter: select JDBC.
  5. Click Create Note.

  6. Run the following Hive query:

     %jdbc(hive)
     show tables
    

    [Image: HDInsight Interactive Query zeppelin runs query]

    The %jdbc(hive) statement in the first line tells the notebook to use the Hive JDBC interpreter.

    The query should return one Hive table called hivesampletable.

    The following are two more Hive queries that you can run against hivesampletable.

     %jdbc(hive)
     select * from hivesampletable limit 10
    
     %jdbc(hive)
     select ${group_name}, count(*) as total_count
     from hivesampletable
     group by ${group_name=market,market|deviceplatform|devicemake}
     limit ${total_count=10}
    

    Compared to traditional Hive, the query results come back much faster.
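
The second query above uses Zeppelin dynamic form placeholders: `${group_name=market,market|deviceplatform|devicemake}` declares a drop-down with a default of `market`, and `${total_count=10}` declares a text box with a default of `10`. The following Python sketch is only an illustrative model of how such placeholders could expand into the final HiveQL. It is not Zeppelin's actual implementation, the `render` and `collect_defaults` helpers are hypothetical names, and it handles only the three placeholder shapes that appear in this article.

```python
import re

# Matches ${name}, ${name=default}, and ${name=default,opt1|opt2|...}.
# Hypothetical, simplified grammar: only the shapes used in this article.
PLACEHOLDER = re.compile(r"\$\{(\w+)(?:=([^,}]*)(?:,([^}]*))?)?\}")

QUERY = """select ${group_name}, count(*) as total_count
from hivesampletable
group by ${group_name=market,market|deviceplatform|devicemake}
limit ${total_count=10}"""


def collect_defaults(template):
    """Return {name: default} for every placeholder that declares a default."""
    defaults = {}
    for name, default, _options in PLACEHOLDER.findall(template):
        if default and name not in defaults:
            defaults[name] = default
    return defaults


def render(template, values=None):
    """Substitute each placeholder with the chosen value, or its default."""
    chosen = collect_defaults(template)
    chosen.update(values or {})

    def substitute(match):
        name, options = match.group(1), match.group(3)
        value = chosen.get(name, "")
        # A drop-down placeholder only accepts one of its listed options.
        if options and value not in options.split("|"):
            raise ValueError(f"{value!r} is not a valid option for {name}")
        return value

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    # With no selections, every placeholder falls back to its default.
    print(render(QUERY))
    # Simulate picking "devicemake" in the drop-down and typing 5 in the text box.
    print(render(QUERY, {"group_name": "devicemake", "total_count": "5"}))
```

In this model, every placeholder with the same name shares one value, which is why the bare `${group_name}` in the select list follows whatever is chosen in the `group by` drop-down.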

Next steps

In this article, you learned how to use Apache Zeppelin to run Apache Hive queries in Azure HDInsight. To learn more, see the following articles: