Apache Spark - Counting Word Frequencies

  • Given below is the Scala code for counting word frequencies

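    // Load the input file from HDFS as an RDD of lines
    val linesRdd = sc.textFile("/data/mr/wordcount/input/big.txt")
    // Split each line on spaces and flatten the result into an RDD of words
    val words = linesRdd.flatMap(x => x.split(" "))
    // Pair each word with an initial count of 1
    val wordsKv = words.map(x => (x, 1))
    // Sum the counts per word; equivalent to reduceByKey(myfunc) with
    // def myfunc(x: Int, y: Int): Int = x + y
    val output = wordsKv.reduceByKey(_ + _)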
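
To see what each transformation produces, the same pipeline can be sketched on plain Scala collections, with no Spark cluster needed. This is a minimal sketch, not the Spark job itself: the sample lines are made up, and `groupBy` followed by a sum stands in for `reduceByKey(_ + _)`, which does not exist on ordinary collections.

```scala
// Hypothetical sample input standing in for the lines of big.txt
val lines = Seq("to be or not to be", "to see or not to see")

// flatMap: one line becomes many words, flattened into a single sequence
val words = lines.flatMap(x => x.split(" "))

// map: pair every word with a count of 1
val wordsKv = words.map(x => (x, 1))

// Stand-in for reduceByKey(_ + _): group the pairs by word, then sum the 1s
val output = wordsKv
  .groupBy(_._1)
  .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

// Print the counts in alphabetical order:
// (be,2), (not,2), (or,2), (see,2), (to,4)
output.toSeq.sortBy(_._1).foreach(println)
```

On a real RDD the per-key sums are computed across partitions, which is why saved output is split into several part files.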

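    We can also save the output to HDFS. The commands further down inspect a directory named my_result, so the call is presumably:

    output.saveAsTextFile("my_result")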


Note - In this video, we used Hue to access the results in HDFS. Hue has since been deprecated. Please use the commands below in the web console to access the files.

  • Login to the web console
  • Check the files

    hadoop fs -ls my_result
  • Check the content of the first part

    hadoop fs -cat my_result/part-00000 | more
  • Check the content of the second part

    hadoop fs -cat my_result/part-00001 | more
