[Pig - Relational Operators - FOREACH]
The FOREACH operator applies a set of expressions to every record of a relation and generates a new relation from the results. It is most often used to select or transform columns.
Let's say we want to store just stock_symbol and dividends in HDFS for further processing. We can do this with the FOREACH operator: launch Pig, load the data, keep only the stock_symbol and dividends columns, and store the result in a new file in HDFS. As discussed earlier, Pig starts execution only when we issue a DUMP or STORE command. Finally, view the content of the file to verify the result.
[Pig - FOREACH - Question]
Question - Will the code displayed on the screen generate any reducer code?
Answer - No, because we are not doing any aggregation or sorting. FOREACH processes each record independently, so the job needs only a map phase.
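For contrast, here is a minimal sketch of a script that would need a reduce phase, because GROUP has to bring all records with the same stock_symbol together before they can be averaged. The aliases grouped and avg_div and the column name avg_dividend are illustrative and not part of the original exercise.

```pig
-- Same load statement as in the exercise
divs = LOAD '/data/NYSE_dividends' AS (name:chararray, stock_symbol:chararray, date:chararray, dividends:float);
-- GROUP collects all records that share a stock_symbol, which requires a shuffle and a reduce phase
grouped = GROUP divs BY stock_symbol;
-- After GROUP, the key is available as "group" and the records as a bag named after the input relation
avg_div = FOREACH grouped GENERATE group AS stock_symbol, AVG(divs.dividends) AS avg_dividend;
-- EXPLAIN prints the execution plans for avg_div without running the job
EXPLAIN avg_div;
```

When Pig runs on MapReduce, comparing the EXPLAIN output for this script with the EXPLAIN output for the plain FOREACH projection should show a reduce stage here and none there.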
Code
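-- Load the dividends file from HDFS with an explicit schema
divs = LOAD '/data/NYSE_dividends' AS (name:chararray, stock_symbol:chararray, date:chararray, dividends:float);
-- Keep only the stock_symbol and dividends columns
values = FOREACH divs GENERATE stock_symbol, dividends;
-- STORE triggers execution and writes the result to values_1 in HDFS
STORE values INTO 'values_1';
-- Print the stored output from the Grunt shell
cat values_1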
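FOREACH can also refer to fields by position and compute new columns, not only select existing ones by name. A minimal sketch, assuming the same input file and schema as the exercise; the aliases by_position and in_cents and the column name dividend_cents are made up for illustration.

```pig
divs = LOAD '/data/NYSE_dividends' AS (name:chararray, stock_symbol:chararray, date:chararray, dividends:float);
-- Fields can be referenced by zero-based position: $1 is stock_symbol, $3 is dividends
by_position = FOREACH divs GENERATE $1, $3;
-- An expression can derive a new column; AS gives the result a name
in_cents = FOREACH divs GENERATE stock_symbol, dividends * 100 AS dividend_cents;
-- DESCRIBE prints the schema of the relation without running a job
DESCRIBE in_cents;
```

Like the projection in the exercise, both statements work on one record at a time, so neither involves aggregation or sorting.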