Enrollments closing soon for Post Graduate Certificate Program in Applied Data Science & AI By IIT Roorkee | 3 Seats Left
Apply NowLogin using Social Account
     Continue with GoogleLogin using your credentials
To load json data in hive we use json-serde, serde represents serializer deserializer. When we load json data in hive json-serde converts the json data into the tabular format. json-serde jar is located at /data/serde
directory in hdfs.
Login into cloudxlab linux console copy /data/sample_json
from hdfs to your home folder in hdfs run command hadoop fs -cp /data/sample_json
followed by /user/$USER
, \$USER is a variable which will be substituted with your username.
Now to check the contents of the sample_json directory. Run hadoop fs -ls /user/$USER/sample_json it contains the file user_country.json. Check the last few lines of the file please note that each row in the file contains json object run command hadoop fs -tail /user/$USER/sample_json /user_country.json
.
Now launch hive by typing hive and wait for hive prompt to appear. When we load json data into hive, json-serde converts the json data into the tabular format. json-serde jar is located at /data/serde directory in hdfs, before creating the table add the serde jar in hive with add jar followed by jar path command now you can create the table using the syntax displayed on the screen please note the row format unlike previous examples fields are not terminated by tab or comma. To select the rows from user_countries table you can now simply run select * from user_countries
followed by semicolon
now this command will query the json object and print the result using a simple sql query the translation from json into this tabular form is being done by serde the jar that we have added.
Copy /data/sample_json
from HDFS to your home folder in HDFS. $USER
gets replaced by your username automatically:
hadoop fs -cp /data/sample_json /user/$USER
Check the content of the sample_json
directory.
hadoop fs -ls /user/$USER/sample_json
It contains the file user_country.json
. Check the last few lines of the file. Please note that each row in the file contains JSON.
hadoop fs -tail /user/$USER/sample_json/user_country.json
Launch Hive by typing hive
in the web console. Run the below commands in the hive.
Add JSON-SERDE JAR using below command:
ADD JAR hdfs:///data/serde/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar;
Use your database. ${env:USER}
gets replaced by your username automatically:
use ${env:USER};
To create the table use the following command.
CREATE EXTERNAL TABLE user_countries(
name string,
country string
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 'hdfs:///user/${env:USER}/sample_json';
Select rows from user_countries
table
select * from user_countries;
Taking you to the next exercise in seconds...
Want to create exercises like this yourself? Click here.
Loading comments...