Hive - Project

7 / 14
   

In this step, we will create tweets_simple and tweets_clean views.

Steps-

  1. Create view tweets_simple. tweets_simple view contains tweet id, the timestamp of the tweet, tweet text and user's time zone. Run below command in the Hive query editor in Hue

    CREATE VIEW tweets_simple AS
    SELECT          
    id,
    cast ( from_unixtime( unix_timestamp(concat( '2013 ', substring(created_at,5,15)), 'yyyy MMM dd hh:mm:ss')) as timestamp) ts,
    text,
    user.time_zone 
    FROM tweets_raw;
    

    Sample rows of tweets_simple views are

  2. Create view tweets_clean. tweets_clean view maps user's timezone to the country.Each row of the tweets_clean view contains tweet_id, the timestamp of the tweet, tweet text and user's country (which is derived from time zone). Run below command in the Hive query editor in Hue

    CREATE VIEW tweets_clean AS
    SELECT
    id,
    ts,
    text,
    m.country 
    FROM tweets_simple t LEFT OUTER JOIN time_zone_map m ON t.time_zone = m.time_zone;
    

    Sample rows of tweets_simple views are

Question-

From which country tweet with id 330044004693598208 was tweeted?