Given below is the Scala code for counting word frequencies:
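// Read the input file from HDFS as an RDD of lines
val linesRdd = sc.textFile("/data/mr/wordcount/input/big.txt")
// Split each line on spaces and flatten the results into one RDD of words
val words = linesRdd.flatMap(x => x.split(" "))
// Pair each word with an initial count of 1
val wordsKv = words.map(x => (x, 1))
// Sum the counts per word; equivalent to passing a named function such as:
// def myfunc(x: Int, y: Int): Int = x + y
val output = wordsKv.reduceByKey(_ + _)
// Show the first 10 (word, count) pairs
output.take(10)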
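To see what each step produces without a cluster, the same pipeline can be mimicked on plain Scala collections. This is only a sketch: the sample lines are made up for illustration, `WordCountSketch` and `countWords` are hypothetical names, and `groupBy` followed by a sum stands in for Spark's `reduceByKey`, which merges the values for each key across partitions.

```scala
object WordCountSketch {
  // Local stand-in for the Spark pipeline, using ordinary Scala collections.
  def countWords(lines: Seq[String]): Map[String, Int] = {
    // Same role as linesRdd.flatMap: one flat sequence of words
    val words = lines.flatMap(line => line.split(" "))
    // Same role as words.map: pair every word with a count of 1
    val wordsKv = words.map(word => (word, 1))
    // groupBy + sum plays the role of reduceByKey(_ + _)
    wordsKv.groupBy(_._1).map { case (word, pairs) =>
      (word, pairs.map(_._2).sum)
    }
  }

  def main(args: Array[String]): Unit = {
    // Invented sample input, not taken from big.txt
    val lines = Seq("to be or not to be", "to see or not to see")
    countWords(lines).toSeq.sortBy(_._1).foreach(println)
  }
}
```

For the two sample lines, "to" is counted 4 times and each of "be", "or", "not" and "see" is counted twice.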
We can also save the output to HDFS:
output.saveAsTextFile("my_result")
Note: In the video, we used Hue to access the results in HDFS. Hue has since been deprecated, so please use the commands below in the web console to access the files.
Check the files:
hadoop fs -ls my_result
Check the content of the first part:
hadoop fs -cat my_result/part-00000 | more
Check the content of the second part:
hadoop fs -cat my_result/part-00001 | more
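Because `saveAsTextFile` writes the string form of each element, every line in a part file should look like a Scala tuple, for example `(word,count)`. If the counts are needed again later, a line can be parsed back into a pair. The sketch below is one possible way to do that; `PartLineParser` and `parse` are hypothetical names, and the sample lines are invented rather than copied from the real output.

```scala
import scala.util.Try

object PartLineParser {
  // Parses one saved line such as "(the,12)" back into (word, count).
  // Returns None when the line does not have the expected shape.
  def parse(line: String): Option[(String, Int)] = {
    val trimmed = line.trim
    if (!trimmed.startsWith("(") || !trimmed.endsWith(")")) None
    else {
      val inner = trimmed.substring(1, trimmed.length - 1)
      // Use the last comma, since the word itself may contain commas
      val comma = inner.lastIndexOf(',')
      if (comma < 0) None
      else Try(inner.substring(comma + 1).toInt).toOption
        .map(count => (inner.substring(0, comma), count))
    }
  }

  def main(args: Array[String]): Unit = {
    // Invented sample lines for illustration
    Seq("(the,12)", "(hello,,3)", "not a tuple").foreach { l =>
      println(s"$l -> ${parse(l)}")
    }
  }
}
```

Splitting on the last comma matters here because the word count above splits only on spaces, so punctuation such as a trailing comma stays attached to the word.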