
Apache Spark - More Operations - Transformations & Actions

INSTRUCTIONS
  • flatMap

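    flatMap converts each record of an RDD into zero or more records. The function returns a collection for every input record, and the elements of all those collections become the records of the new RDD.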

    var linesRDD = sc.parallelize( Array("this is a dog", "named jerry"))
    def toWords(line:String):Array[String]= line.split(" ")
    var wordsRDD = linesRDD.flatMap(toWords)
    wordsRDD.collect()
    
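  • Using map

    With the same function, map returns exactly one record per input record, so the result is an RDD of word arrays rather than an RDD of words.
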
    var linesRDD = sc.parallelize( Array("this is a dog", "named jerry"))
    def toWords(line:String):Array[String]= line.split(" ")
    var wordsRDD1 = linesRDD.map(toWords)
    wordsRDD1.collect()
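
To make the difference in result shape concrete, here is a minimal sketch that applies the same toWords function with map and with flatMap. It uses a plain Scala Seq rather than an RDD (an assumption made so that it runs without a cluster); the names nested and flat are illustrative only.

```scala
// Plain Scala collections, no Spark needed; map and flatMap on an RDD
// follow the same shape rules as shown here.
val lines = Seq("this is a dog", "named jerry")
def toWords(line: String): Array[String] = line.split(" ")

// map: one output element per input line, so we get two arrays of words.
val nested: Seq[Array[String]] = lines.map(line => toWords(line))

// flatMap: the per-line arrays are concatenated into one sequence of words.
val flat: Seq[String] = lines.flatMap(line => toWords(line))

println(nested.map(_.mkString("[", ", ", "]")).mkString(" "))
// prints: [this, is, a, dog] [named, jerry]
println(flat.mkString(", "))
// prints: this, is, a, dog, named, jerry
```

On the RDDs above, wordsRDD1.collect() correspondingly returns an array of arrays, while wordsRDD.collect() returns a flat array of words.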
    
  • flatMap as map

    If the function always returns a single-element array, flatMap behaves like map.

    val arr = 1 to 10000
    val nums = sc.parallelize(arr)
    def multiplyByTwo(x:Int) = Array(x*2)
    multiplyByTwo(5)
    

    Write the following commands in a new cell:

    var dbls = nums.flatMap(multiplyByTwo);
    dbls.take(5)
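
As a quick check of the idea, the sketch below compares flatMap with a single-element array against an ordinary map. It uses a small local Range instead of an RDD (an assumption, so that it runs without Spark); viaFlatMap and viaMap are illustrative names.

```scala
// Local-collection sketch: wrapping each result in a one-element array
// makes flatMap produce the same elements as map.
val nums = 1 to 10
def multiplyByTwo(x: Int): Array[Int] = Array(x * 2)

val viaFlatMap = nums.flatMap(x => multiplyByTwo(x))
val viaMap = nums.map(_ * 2)

// Both approaches yield the same sequence of doubled values.
assert(viaFlatMap == viaMap)
println(viaFlatMap.take(5).mkString(", "))
// prints: 2, 4, 6, 8, 10
```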
    
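  • flatMap as filter

    Returning an empty array drops a record and returning a one-element array keeps it, so flatMap can act as a filter.
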
    var arr = 1 to 1000
    var nums = sc.parallelize(arr)
    def isEven(x:Int):Array[Int] = {
      if(x%2 == 0) Array(x)
      else Array()
    }
    

    Write the following commands in a new cell:

    var evens = nums.flatMap(isEven)
    evens.take(3)
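
The equivalence with filter can be checked directly. This is a minimal sketch on a local Range rather than an RDD (an assumption, so that it runs without Spark); viaFlatMap and viaFilter are illustrative names, and Array.empty[Int] is used in place of Array() only to make the element type explicit.

```scala
// Local-collection sketch: an empty array drops the element,
// a one-element array keeps it.
val nums = 1 to 1000
def isEven(x: Int): Array[Int] =
  if (x % 2 == 0) Array(x) else Array.empty[Int]

val viaFlatMap = nums.flatMap(x => isEven(x))
val viaFilter = nums.filter(_ % 2 == 0)

// The flatMap version selects exactly the elements that filter selects.
assert(viaFlatMap == viaFilter)
println(viaFlatMap.take(3).mkString(", "))
// prints: 2, 4, 6
```

In practice filter states the intent more directly; the flatMap form is mainly useful when some records should be dropped and others expanded in the same pass.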
    
  • Transformations: union()

    Returns a new RDD that contains the elements of both RDDs.

    var a = sc.parallelize(Array('1','2','3'));
    var b = sc.parallelize(Array('A','B','C'));
    var c=a.union(b)
    c.collect();
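
One detail worth seeing in action is that union does not remove duplicates. The following is a minimal sketch, assuming Spark is on the classpath; the app name and the local[*] master are placeholders, and SparkContext.getOrCreate reuses the sc that a notebook or spark-shell already provides.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuses an already running context if there is one; otherwise starts a
// local one. App name and master are placeholder assumptions.
val conf = new SparkConf().setAppName("union-sketch").setMaster("local[*]")
val sc = SparkContext.getOrCreate(conf)

val a = sc.parallelize(Array(1, 2, 3))
val b = sc.parallelize(Array(3, 4, 5))

// union keeps every element from both RDDs, so 3 appears twice.
val c = a.union(b)
println(c.collect().sorted.mkString(", "))
// prints: 1, 2, 3, 3, 4, 5

// distinct() removes the repeated 3 when set semantics are needed.
println(c.distinct().collect().sorted.mkString(", "))
// prints: 1, 2, 3, 4, 5
```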
    
  • Actions: saveAsTextFile()

    Saves all the elements of the RDD as text files in HDFS. The argument is a directory name, and Spark writes one part file per partition inside it.

    var a = sc.parallelize(Array(1,2,3, 4, 5 , 6, 7));
    a.saveAsTextFile("myresult");
    

    Check HDFS. There should be a myresult folder in your HDFS home directory.

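  • Actions: collect()

    Brings all the elements of the RDD back to the driver as a local array, so use it only when the data fits in the driver's memory.
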
    var a = sc.parallelize(Array(1,2,3, 4, 5 , 6, 7));
    a
    

    Write the following commands in a new cell:

    var localarray =  a.collect();
    localarray
    
  • Actions: take()

    Returns the first n elements of the RDD as a local array.

    var a = sc.parallelize(Array(1,2,3, 4, 5 , 6, 7));
    var localarray =  a.take(4);
    localarray
    
  • Actions: count()

    var a = sc.parallelize(Array(1,2,3, 4, 5 , 6, 7), 3);
    var mycount =  a.count();
    mycount
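
The second argument to parallelize in the example above is the number of partitions. The sketch below shows that count() returns the total across all of them. It assumes Spark is on the classpath; the app name and the local[*] master are placeholders, and SparkContext.getOrCreate reuses an existing sc when one is already running.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuses an already running context if there is one; otherwise starts a
// local one. App name and master are placeholder assumptions.
val conf = new SparkConf().setAppName("count-sketch").setMaster("local[*]")
val sc = SparkContext.getOrCreate(conf)

// Ask for 3 partitions explicitly.
val a = sc.parallelize(Array(1, 2, 3, 4, 5, 6, 7), 3)
println(a.getNumPartitions) // prints: 3

// glom() turns each partition into one array, which shows how the
// seven elements were split up.
a.glom().collect().foreach(p => println(p.mkString("[", ", ", "]")))

// count() adds up the sizes of all partitions.
println(a.count()) // prints: 7
```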
    
