Hive - Project

7 / 14
   

In this step, we will create _tweets_simple_ and _tweets_clean_ views.

Steps-

  1. Create view _tweets_simple_. _tweets_simple_ view contains tweet id, the timestamp of the tweet, tweet text and user's time zone. Run below command in the Hive query editor in Hue

    CREATE VIEW tweets_simple AS
    SELECT          
    id,
    cast ( from_unixtime( unix_timestamp(concat( '2013 ', substring(created_at,5,15)), 'yyyy MMM dd hh:mm:ss')) as timestamp) ts,
    text,
    user.time_zone 
    FROM tweets_raw;
    

    Sample rows of _tweets_simple_ views are

  2. Create view _tweets_clean_. _tweets_clean_ view maps user's timezone to the country.Each row of the _tweets_clean_ view contains tweet_id, the timestamp of the tweet, tweet text and user's country (which is derived from time zone). Run below command in the Hive query editor in Hue

    CREATE VIEW tweets_clean AS
    SELECT
    id,
    ts,
    text,
    m.country 
    FROM tweets_simple t LEFT OUTER JOIN time_zone_map m ON t.time_zone = m.time_zone;
    

    Sample rows of _tweets_simple_ views are

Question-

From which country tweet with id 330044004693598208 was tweeted?