flatMap
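Use flatMap to convert each record of an RDD into zero or more records. The function returns a collection for every record, and the results are flattened into a single RDD.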
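// Two input records, one sentence each
val linesRDD = sc.parallelize(Array("this is a dog", "named jerry"))
// Split one line into its words
def toWords(line: String): Array[String] = line.split(" ")
val wordsRDD = linesRDD.flatMap(toWords)
wordsRDD.collect()  // Array(this, is, a, dog, named, jerry)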
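Using map
With map, each line becomes one array of words, so the result is an RDD of arrays rather than an RDD of words.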
val linesRDD = sc.parallelize(Array("this is a dog", "named jerry"))
def toWords(line: String): Array[String] = line.split(" ")
val wordsRDD1 = linesRDD.map(toWords)
wordsRDD1.collect()  // Array(Array(this, is, a, dog), Array(named, jerry))
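The difference between the two results is easiest to see in their shapes. The sketch below uses plain Scala collections as a stand-in for the RDD, an assumption made only so that it runs without a cluster; `lines`, `flat`, and `nested` are illustrative names that do not appear in the cells above.

```scala
// Plain Scala collections stand in for the RDD here (assumption: no SparkContext needed).
val lines = Seq("this is a dog", "named jerry")

// flatMap: every word becomes its own element of one flat collection.
val flat: Seq[String] = lines.flatMap(line => line.split(" ").toSeq)

// map: every line becomes one array, so the result has one element per line.
val nested: Seq[Array[String]] = lines.map(line => line.split(" "))

println(flat.mkString(", "))                  // this, is, a, dog, named, jerry
println(nested.map(_.length).mkString(", "))  // 4, 2
```

The flat result has six elements (one per word), while the nested result has two (one per line).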
flatMap as map
Returning a one-element array from the function makes flatMap behave like map.
val arr = 1 to 10000
val nums = sc.parallelize(arr)
def multiplyByTwo(x: Int): Array[Int] = Array(x * 2)
multiplyByTwo(5)  // Array(10)
Write the following commands in a new cell:
val dbls = nums.flatMap(multiplyByTwo)
dbls.take(5)  // Array(2, 4, 6, 8, 10)
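flatMap as filter
Returning an empty array drops a record and a one-element array keeps it, so flatMap can also act as a filter.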
val arr = 1 to 1000
val nums = sc.parallelize(arr)
def isEven(x: Int): Array[Int] = {
  if (x % 2 == 0) Array(x)
  else Array()
}
Write the following commands in a new cell:
val evens = nums.flatMap(isEven)
evens.take(3)  // Array(2, 4, 6)
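For comparison, the same even numbers can be obtained with the RDD filter transformation, which is usually the more direct choice when records are only kept or dropped. This is a minimal sketch, assuming the `sc` SparkContext that the shell or notebook provides, as in the cells above; `numsRDD` and `evensByFilter` are illustrative names.

```scala
// Assumes `sc` is the SparkContext provided by the Spark shell or notebook.
val numsRDD = sc.parallelize(1 to 1000)

// filter keeps only the records for which the predicate returns true.
val evensByFilter = numsRDD.filter(x => x % 2 == 0)

evensByFilter.take(3)   // Array(2, 4, 6)
evensByFilter.count()   // 500
```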
Transformations: union()
Returns an RDD that contains the elements of both RDDs. Duplicates are kept.
val a = sc.parallelize(Array('1', '2', '3'))
val b = sc.parallelize(Array('A', 'B', 'C'))
val c = a.union(b)
c.collect()  // Array(1, 2, 3, A, B, C)
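Because union does not remove duplicates, a value present in both inputs appears twice in the result. The sketch below illustrates this under the assumption that `sc` is the SparkContext provided by the shell; `left`, `right`, and `combined` are illustrative names.

```scala
// Assumes `sc` is the SparkContext provided by the Spark shell or notebook.
val left  = sc.parallelize(Array(1, 2, 3))
val right = sc.parallelize(Array(3, 4))

// union concatenates the two RDDs; the shared value 3 appears twice.
val combined = left.union(right)
combined.collect()                     // Array(1, 2, 3, 3, 4)

// distinct() removes duplicates; its output order is not guaranteed, so sort locally.
combined.distinct().collect().sorted   // Array(1, 2, 3, 4)
```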
Actions: saveAsTextFile()
Saves all the elements of the RDD as text files in a directory on HDFS, with one part file per partition.
val a = sc.parallelize(Array(1, 2, 3, 4, 5, 6, 7))
a.saveAsTextFile("myresult")
Check HDFS: there should be a myresult folder in your home directory.
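One way to check the saved output from the same session is to read the directory back as an RDD. This is a sketch, assuming `sc` from the shell and that the "myresult" directory written above exists; `saved` and `restored` are illustrative names.

```scala
// Assumes `sc` from the shell and that the "myresult" directory written above exists.
val saved = sc.textFile("myresult")

// Each line of the part files holds one element in its string form.
val restored = saved.map(_.toInt).collect().sorted
restored   // Array(1, 2, 3, 4, 5, 6, 7)
```

Note that saveAsTextFile fails if the target directory already exists, so remove the directory or choose another name before rerunning the cell.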
Actions: collect()
Returns all the elements of the RDD to the driver as a local array.
val a = sc.parallelize(Array(1, 2, 3, 4, 5, 6, 7))
a
Write the following commands in a new cell:
val localarray = a.collect()
localarray
Actions: take()
Returns the first n elements of the RDD as a local array.
val a = sc.parallelize(Array(1, 2, 3, 4, 5, 6, 7))
val localarray = a.take(4)
localarray
Actions: count()
Returns the number of elements in the RDD.
val a = sc.parallelize(Array(1, 2, 3, 4, 5, 6, 7), 3)
val mycount = a.count()
mycount
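The second argument to parallelize in the cell above is the number of partitions. The sketch below, again assuming the `sc` provided by the shell, checks the partition count and shows that count() adds up the elements of every partition; `partitioned` and `sizes` are illustrative names.

```scala
// Assumes `sc` is the SparkContext provided by the Spark shell or notebook.
val partitioned = sc.parallelize(Array(1, 2, 3, 4, 5, 6, 7), 3)

partitioned.getNumPartitions   // 3

// glom() turns each partition into one array, so this lists the partition sizes.
val sizes = partitioned.glom().map(_.length).collect()

// count() is the total over all partitions.
sizes.sum == partitioned.count()   // true
```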