Edit: at 2:48 to 2:50, the line var dbls = nums.map(MultiplyByTwo) should use flatMap instead of map.
flatMap
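flatMap converts one record of an RDD into multiple records. The function you pass returns a collection for each input record, and flatMap flattens those collections into a single RDD.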
var linesRDD = sc.parallelize( Array("this is a dog", "named jerry"))
def toWords(line:String):Array[String]= line.split(" ")
var wordsRDD = linesRDD.flatMap(toWords)
wordsRDD.collect()
Using Map
var linesRDD = sc.parallelize( Array("this is a dog", "named jerry"))
def toWords(line:String):Array[String]= line.split(" ")
var wordsRDD1 = linesRDD.map(toWords)
wordsRDD1.collect()
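The two versions above differ in the shape of the result: map keeps one output per input line, while flatMap merges the per-line arrays into one flat sequence of words. As a minimal sketch, the same behaviour can be seen on a plain Scala List, which is used here only as a local stand-in for the RDD (an assumption for illustration, so no Spark cluster is needed):

```scala
// Local stand-in for linesRDD: a plain List, not an RDD (illustrative assumption).
val lines = List("this is a dog", "named jerry")

def toWords(line: String): Array[String] = line.split(" ")

// map: one output element per input line, so the result is a list of arrays.
val mapped: List[Array[String]] = lines.map(line => toWords(line))

// flatMap: the per-line arrays are flattened into a single list of words.
val flattened: List[String] = lines.flatMap(line => toWords(line))

println(mapped.length)             // 2
println(flattened.length)          // 6
println(flattened.mkString(", "))  // this, is, a, dog, named, jerry
```

On the RDD, wordsRDD1.collect() should likewise return an array of two arrays, whereas wordsRDD.collect() returns the six words in one array.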
flatMap as Map
val arr = 1 to 10000
val nums = sc.parallelize(arr)
def multiplyByTwo(x:Int) = Array(x*2)
multiplyByTwo(5)
Write the following commands in a new cell:
var dbls = nums.flatMap(multiplyByTwo);
dbls.take(5)
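Because multiplyByTwo always returns a one-element array, flatMap produces exactly one output per input here, which is what map would do. A minimal sketch on a local Scala List (a stand-in for the RDD, assumed for illustration; the names localNums, viaFlatMap and viaMap are hypothetical) checks that equivalence:

```scala
// Local stand-in for nums: a plain List, not an RDD (illustrative assumption).
val localNums = (1 to 10).toList

def multiplyByTwo(x: Int): Array[Int] = Array(x * 2)

// flatMap with a one-element result per input...
val viaFlatMap = localNums.flatMap(x => multiplyByTwo(x))
// ...gives the same elements as a plain map that doubles each value.
val viaMap = localNums.map(x => x * 2)

println(viaFlatMap.take(5))    // List(2, 4, 6, 8, 10)
println(viaFlatMap == viaMap)  // true
```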
flatMap as filter
var arr = 1 to 1000
var nums = sc.parallelize(arr)
def isEven(x:Int):Array[Int] = {
  if(x%2 == 0) Array(x) else Array()
}
Write the following commands in a new cell:
var evens = nums.flatMap(isEven)
evens.take(3)
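Here flatMap acts as a filter: returning an empty array drops the element, and returning a one-element array keeps it. The sketch below, again on a local Scala List standing in for the RDD (an assumption for illustration, with hypothetical names smallNums, evensViaFlatMap and evensViaFilter), compares this with an ordinary filter:

```scala
// Local stand-in for nums: a plain List, not an RDD (illustrative assumption).
val smallNums = (1 to 10).toList

// Returns the element wrapped in an array if it is even, otherwise an empty array.
def isEven(x: Int): Array[Int] =
  if (x % 2 == 0) Array(x) else Array.empty[Int]

// Empty arrays contribute nothing after flattening, so odd numbers disappear.
val evensViaFlatMap = smallNums.flatMap(x => isEven(x))
// The same selection written as a filter with a boolean predicate.
val evensViaFilter = smallNums.filter(x => x % 2 == 0)

println(evensViaFlatMap.take(3))            // List(2, 4, 6)
println(evensViaFlatMap == evensViaFilter)  // true
```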
Transformations :: Union
var a = sc.parallelize(Array('1','2','3'));
var b = sc.parallelize(Array('A','B','C'));
var c=a.union(b)
c.collect();
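Note that union simply puts the elements of both RDDs together and does not remove duplicates; in Spark you would call distinct() afterwards for that. As a rough sketch, the same idea on local Scala Lists, where ++ stands in for union (an analogy assumed for illustration; the names left, right and combined are hypothetical, and right deliberately repeats '3'):

```scala
// Local stand-ins for the two RDDs (illustrative assumption).
// The single quotes make these Char values, as in the example above.
val left  = List('1', '2', '3')
val right = List('A', 'B', '3')

// Concatenation keeps every element, including the repeated '3'.
val combined = left ++ right

println(combined)           // List(1, 2, 3, A, B, 3)
println(combined.distinct)  // List(1, 2, 3, A, B)
```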
Actions: saveAsTextFile()
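Saves all the elements of the RDD into HDFS as text files. The argument is the name of a directory, and each partition is written to it as a separate part file.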
var a = sc.parallelize(Array(1,2,3, 4, 5 , 6, 7));
a.saveAsTextFile("myresult");
Check HDFS. There should be a myresult folder in your HDFS home directory.
Actions: collect()
var a = sc.parallelize(Array(1,2,3, 4, 5 , 6, 7));
a
Write the following commands in a new cell:
var localarray = a.collect();
localarray
Actions: take()
var a = sc.parallelize(Array(1,2,3, 4, 5 , 6, 7));
var localarray = a.take(4);
localarray
Actions: count()
// The second argument, 3, is the number of partitions for the RDD.
var a = sc.parallelize(Array(1,2,3, 4, 5 , 6, 7), 3);
var mycount = a.count();
mycount