In this step, we will create tweets_simple
and tweets_clean
views.
Question-
From which country tweet with id 330044004693598208
was tweeted?
Launch hive console by typing the hive
command in the web console.
hive
Run below commands in the hive.
ADD JAR hdfs:///data/hive/json-serde-1.1.9.9-Hive13-jar-with-dependencies.jar;
SET hive.support.sql11.reserved.keywords=false;
Select your database. Run below commands. ${env:USER}
gets replaced by your username automatically:
CREATE DATABASE IF NOT EXISTS ${env:USER};
USE ${env:USER};
Create view tweets_simple
. The tweets_simple
view contains tweet id, the timestamp of the
tweet, tweet text, and user's time zone. Run below command in the Hive on your web console.
CREATE VIEW tweets_simple AS
SELECT
id,
cast ( from_unixtime( unix_timestamp(concat( '2013 ', substring(created_at,5,15)), 'yyyy MMM dd hh:mm:ss')) as timestamp) ts,
text,
user.time_zone
FROM tweets_raw;
Sample rows of tweets_simple
views are
Create view tweets_clean
. The tweets_clean
view maps the user's timezone to the country. Each
row of the tweets_clean
view contains tweet_id, the timestamp of the tweet, tweet text, and user's
country (which is derived from time zone). Run below command in the Hive on your web console.
CREATE VIEW tweets_clean AS
SELECT
id,
ts,
text,
m.country
FROM tweets_simple t LEFT OUTER JOIN time_zone_map m ON t.time_zone = m.time_zone;
Sample rows of tweets_simple
views are
Note - Having trouble with the assessment engine? Follow the steps listed here
No hints are availble for this assesment
Answer is not availble for this assesment
Loading comments...