Login using Social Account
     Continue with GoogleLogin using your credentials
Important Note - We recommend you to execute the given commands on Hive Console instead of Hue. The video is only for representational purposes.
To load JSON data in Hive we use JSON serde. Serde represents Serializer-deserializer. When we load JSON data in Hive, JSON serde converts the JSON data into the tabular format. JSON serde JAR is located at /data/serde
directory in HDFS. Before creating the table, add the serde JAR in Hive query editor with ADD JAR JAR_PATH
command. Now you can create the table using the syntax displayed on the screen. Please note the ROW FORMAT
. Unlike previous examples, fields are not terminated by tab or comma.
Copy /data/sample_json
from HDFS to your home folder in HDFS. $USER
gets replaced by your username automatically:
hadoop fs -cp /data/sample_json /user/$USER
Check the content of the sample_json
directory.
hadoop fs -ls /user/$USER/sample_json
It contains the file user_country.json
. Check the last few lines of the file. Please note that each row in the file contains JSON.
hadoop fs -tail /user/$USER/sample_json/user_country.json
Launch Hive by typing hive
in the web console. Run the below commands in the hive.
Add JSON-SERDE JAR using below command:
ADD JAR hdfs:///data/serde/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar;
Use your database. ${env:USER}
gets replaced by your username automatically:
use ${env:USER};
To create the table use the following command. Replace YOUR_USER_NAME
with your cloudxlab username
CREATE EXTERNAL TABLE user_countries(
name string,
country string
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 'hdfs:///user/${env:USER}/sample_json';
Select rows from user_countries
table
select * from user_countries;
Loading comments...