Hive - Project

Hive - Project - Sentiment Analysis - Visualization

Hive - Connecting to Tableau

Tableau is a visualization tool. We can connect to Hive using Tableau and transform data stored in Hive into visually appealing and interactive visualizations.

Hive - Connecting to Tableau - Steps

Let's connect to Hive using Tableau. Download Tableau Desktop from the link displayed on the screen. Tableau is paid software, but students can get a free one-year license. After the download finishes, install Tableau Desktop.

We will connect to Hive using the Hive ODBC driver. Download the Hortonworks Hive ODBC driver for your operating system using the link displayed on the screen. After the download finishes, install it.

Hive - Connecting to Tableau - Hands-on

Let's use Tableau to visualize the top 10 stocks with the highest opening price on Dec 31, 2009.

  • Open Tableau and select "Hortonworks Hadoop Hive" as the data source.
  • Specify the server as your Hive server's hostname and the port as 10000. Select "HiveServer2" as the type and "Username and Password" as the authentication method.
  • Provide your lab username and password. Click on "Sign In" and wait for the connection to be established.
  • Type your database name in the schema field and press Enter. Now select your database.
  • Click on the search icon under "Table" to list all the tables in your database.
  • Double-click on the "NYSE" table and click on "Go to Worksheet".
  • Drag and drop "Symbol1" to Columns and "Price high" to Rows.
  • Click on "Show Me" in the top right corner and select the chart recommended by Tableau.
  • Now we will filter the data for "Dec 31, 2009".
  • Drag "Ymd" to Filters, select "2009-12-31" from the list and click on "OK".
  • Now sort the data points to see the stocks with the highest opening price.
  • Stocks CME, CEO and CLB have the highest opening price on Dec 31, 2009.
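
The drag-and-drop steps above amount to a date filter, a per-symbol aggregate and a descending sort. Here is a minimal sketch of that logic, using Python's built-in sqlite3 module as a stand-in for Hive so that it runs anywhere. The table name nyse and the column names symbol1, ymd and price_high are assumed from the field labels Tableau shows, the SUM aggregate is assumed from Tableau's default handling of measures, and the sample rows are made up purely for illustration.

```python
import sqlite3

# Assumed query shape: roughly what the Tableau worksheet asks the database for.
TOP_STOCKS_QUERY = """
    SELECT symbol1, SUM(price_high) AS price_high
    FROM nyse
    WHERE ymd = ?
    GROUP BY symbol1
    ORDER BY price_high DESC
    LIMIT ?
"""


def top_stocks(conn, day, limit=10):
    """Return (symbol, price) pairs for one trading day, highest first."""
    return conn.execute(TOP_STOCKS_QUERY, (day, limit)).fetchall()


# In-memory stand-in for the Hive table; every row here is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nyse (symbol1 TEXT, ymd TEXT, price_high REAL)")
conn.executemany(
    "INSERT INTO nyse VALUES (?, ?, ?)",
    [
        ("AAA", "2009-12-31", 120.5),
        ("BBB", "2009-12-31", 340.0),
        ("CCC", "2009-12-31", 75.25),
        ("BBB", "2009-12-30", 999.0),  # different day, removed by the filter
    ],
)

for symbol, price in top_stocks(conn, "2009-12-31"):
    print(symbol, price)
```

The same SELECT could in principle be run against Hive directly, but the exact column names in your NYSE table may differ, so check the table schema first.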

Now, we will visualize the sentiments in each country for the "Iron Man 3" movie. We'll use the "tweetsbi" table in Hive. If you haven't gone through the "Connect To Apache Hive Using Tableau" topic, please see the first video.

We'll visualize the sentiments in China, Mexico and the United States. Please see the screencast below to learn how to visualize the data using Tableau.

  • Let's visualize the sentiments in the United States, Mexico and China. Open Tableau.
  • Select "Hortonworks Hadoop Hive".
  • Enter the Hive server and authentication details and click on "Sign In".
  • Now search for your database under schema and select it.
  • Now search for "tweetsbi" table under your database and select it.
  • Click on "Sheet 1". Select "Country" and "Sentiment" from Dimensions and "Number of Records" from Measures, then click on "Show Me" to see the chart recommended by Tableau.
  • Select the recommended chart and wait for the query to get executed.
  • Hide the recommended charts. As you can see, Tableau has plotted the chart with sentiments in each country.
  • Each circle represents a country and each pie block in the circle shows the share of positive, negative and neutral sentiment in that particular country.
  • Click on "Size" and increase the size of circles.

Now, hover over each pie block to see the sentiments in that country. As you can see, in the United States 13,235 people have a positive sentiment for the "Iron Man 3" movie, 5,229 people have a negative sentiment and 25,959 people have expressed a neutral sentiment.

A majority of the people have a neutral sentiment in the United States. Let's visualize the sentiments in Mexico.

Click on the search icon on the map toolbar and search for Mexico. Increase the size of the circles. You can see that 628 people have a neutral sentiment, 172 people have a negative sentiment and 136 people have a positive sentiment.

Similarly, we can visualize the sentiments in China.
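
Each pie slice is a sentiment count divided by that country's total. As a small worked example, the sketch below turns the counts quoted above for the United States and Mexico into percentage shares. The dictionary layout and the function name sentiment_shares are illustrative choices, not something Tableau or Hive provides.

```python
# Sentiment counts as read from the Tableau tooltips in the walkthrough.
counts = {
    "United States": {"positive": 13235, "negative": 5229, "neutral": 25959},
    "Mexico": {"positive": 136, "negative": 172, "neutral": 628},
}


def sentiment_shares(by_sentiment):
    """Convert raw counts into fractions of the country's total."""
    total = sum(by_sentiment.values())
    return {name: n / total for name, n in by_sentiment.items()}


for country, by_sentiment in counts.items():
    shares = sentiment_shares(by_sentiment)
    largest = max(shares, key=shares.get)  # sentiment with the biggest slice
    summary = ", ".join(f"{name} {share:.1%}" for name, share in shares.items())
    print(f"{country}: {summary} (largest slice: {largest})")
```

For both countries the neutral slice comes out largest, which matches what the pie blocks show.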

Note -

  • Download the Hortonworks ODBC Driver for Apache Hive (v2.1.16) from here
  • When connecting to Hive, select SASL as the transport in Tableau.
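
For reference, the connection settings used in this walkthrough (HiveServer2, port 10000, username and password authentication, SASL transport) can be written as a DSN-less ODBC connection string. The sketch below only assembles the string and does not open a connection. The key names and numeric codes (HiveServerType, AuthMech, ThriftTransport) follow the conventions of Simba-based Hive ODBC drivers as best recalled and should be treated as assumptions to check against your driver's install guide. The driver name, host, user, password and schema values are hypothetical placeholders.

```python
def hive_odbc_connection_string(host, user, password, schema="default", port=10000):
    """Assemble a DSN-less ODBC connection string for HiveServer2."""
    settings = {
        # Assumed driver name; use whatever name your installation registers.
        "Driver": "Hortonworks Hive ODBC Driver",
        "Host": host,
        "Port": port,
        "HiveServerType": 2,   # assumed code for HiveServer2
        "AuthMech": 3,         # assumed code for user name and password
        "ThriftTransport": 1,  # assumed code for SASL
        "Schema": schema,
        "UID": user,
        "PWD": password,
    }
    return ";".join(f"{key}={value}" for key, value in settings.items())


# Placeholder values only; substitute your own lab details.
print(hive_odbc_connection_string("hive.example.com", "lab_user", "lab_password", schema="my_db"))
```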