Hive - Select and Aggregation Queries




Not able to play video? Try with youtube

Important Note - We recommend you to execute the given commands on Hive Console instead of Hue. The video is only for representational purposes.

To see the values of all columns of the table we can use SELECT * from table name SQL command. Type SELECT * from nyse; to see all column values of nyse table.

To see only exchange1 and symbol1 columns, type SELECT exchange1, symbol1 FROM nyse;

We can write aggregation queries in Hive in the same way we write in SQL. To find the average opening price of each stock in nyse table, type SELECT symbol1, AVG(price_open) AS avg_price FROM nyse GROUP BY symbol1;. We can see the tez session is opened. Tez is a framework for building high-performance batch and interactive data processing applications, coordinated by YARN. Tez dramatically improves the MapReduce performance. We can see the average opening price of each stock.

Please note that we can improve the performance of aggregation queries by setting top-level aggregation in the map phase.

INSTRUCTIONS

Steps:

  • Login to the web console.
  • Launch Hive by running command hive. Run the below commands in Hive.
  • Use your database. Run the following command. ${env:USER} gets replaced by your username automatically:

    use ${env:USER};
    
  • To select all columns from the table nyse

    SELECT * FROM nyse;
    
  • To select only required columns from the table nyse

    SELECT exchange1, symbol1 FROM nyse;
    
  • Find the average opening price for each stock

    SELECT symbol1, AVG(price_open) AS avg_price FROM nyse
    GROUP BY symbol1;
    

Loading comments...