Hive


Hive - ORC File Format





The ORC (Optimized Row Columnar) file format provides a highly efficient way to store Hive data. Using ORC files improves performance when Hive is reading, writing, and processing data. ORC has a built-in index, min/max values, and other aggregations. It is proven in large-scale deployments: Facebook uses the ORC file format for a 300+ petabyte deployment.

To use the ORC file format, specify the STORED AS ORC clause while defining the table. Log in to the CloudxLab Linux console, type hive, and wait for the Hive prompt to appear. Use your own database: run use followed by the database name. Here the string ${env:USER} is replaced by your username automatically; this works only if your username is the same as the name of your database. Copy the CREATE TABLE command and paste it into the Hive shell in the web console. Now insert some data using the command INSERT INTO orc_table VALUES ('John', 'Gill'); and then retrieve all the values from the table with SELECT * FROM orc_table; as a check. As you can see, rows can be inserted into a table stored in the ORC format.

INSTRUCTIONS

Steps:

  • Create ORC table
  • Log in to the web console
  • Launch Hive by typing hive in the web console. Run the below commands in Hive.
  • Use your database by using the below command. ${env:USER} gets replaced by your username automatically:

    use ${env:USER};
    
  • To create a table stored in the ORC file format:

    CREATE TABLE orc_table (
        first_name STRING,
        last_name STRING
    )
    STORED AS ORC;
    
  • To insert values in the table:

    INSERT INTO orc_table VALUES ('John','Gill');
    
  • To retrieve all the values in the table:

    SELECT * FROM orc_table;
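
To confirm that the table created in the steps above really uses ORC, a check along the following lines may help. This is a minimal sketch, not part of the original exercise: it assumes orc_table already exists in your database, and the second table, orc_table_snappy, together with its compression setting, is a hypothetical example added only to illustrate an optional ORC table property.

```sql
-- Show the table's metadata. For an ORC table, the InputFormat line
-- should mention org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.
DESCRIBE FORMATTED orc_table;

-- Hypothetical variant (name is an assumption): same columns,
-- with the ORC compression codec set explicitly to SNAPPY.
CREATE TABLE orc_table_snappy (
    first_name STRING,
    last_name STRING
)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="SNAPPY");

-- Copy the existing rows into the new table, then read them back.
INSERT INTO TABLE orc_table_snappy SELECT * FROM orc_table;
SELECT * FROM orc_table_snappy;
```

If orc.compress is not specified, Hive falls back to its default ORC compression, so the explicit property is only needed when a different codec is wanted.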
    
