Hive

6 / 18

Hive - External Tables - Hands on




Not able to play video? Try with vimeo

Let us create an external table in hive. First copy the data from /data/NYSE_daily to your home directory in hdfs. Run hadoop fs -cp /data/NYSE_daily . Now type hive and wait for the hive command-line interface to appear. Please use your database type use followed by the database name if your database name is same as your user you can just use the environment variable env:USER and press enter. The string ${env:USER} is replaced by your username automatically. The syntax for creating external table is same as manage table just that we add an external keyword and the location keyword to define the path of underlying data. Please note that for external tables the warehouse will be located at the location provided in the location keyword. To see the low-level metadata type describe formatted nyse_external, the table that you just created and then followed by enter. As you can see the warehouse is located at nysc_daily directory in your home directory in hdfs also, the table type is external when we drop an external table the underlying data does not get deleted this is the most important difference between internal table and external table.

INSTRUCTIONS

Steps:

  • Login to the web console

  • Copy NYSE data from HDFS to your local

    hadoop fs -cp /data/NYSE_daily .
    
  • Launch Hive with typing in hive on console

  • Use your own database by using the below command. ${env:USER} gets replaced by your username automatically:

    use ${env:USER};
    
  • Create an external table nyse_external using below command

    CREATE EXTERNAL TABLE nyse_external(
        exchange1 STRING,
        symbol1 STRING,
        ymd STRING,
        price_open FLOAT,
        price_high FLOAT,
        price_low FLOAT,
        price_close FLOAT,
        volume INT,
        price_adj_close FLOAT
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/user/${env:USER}/NYSE_daily';
    
  • To see more low-level details, type below command

    DESCRIBE FORMATTED nyse_external;
    
  • Check the warehouse directory in HDFS by typing below command in unix console(press ctrl+c to exit hive console).

    hadoop fs -ls NYSE_daily
    
  • To drop the external table type

    DROP TABLE nyse_external;
    

Loading comments...