Apache Spark Basics


What is not true about map transformations?




No hints are available for this assessment

Answer is not available for this assessment


13 Comments

This option should be true:

The number of elements in resulting RDD will always be same as original

As shown by:

var evens = nums.filter(isEven)

 1  Upvote    Share

Hi,

In the question, it is asked which statement is not true.

After a transformation, the resulting RDD can differ from its parent RDD in data type.

It can be smaller (e.g. filter(), distinct(), sample()), bigger (e.g. flatMap(), union(), cartesian()) or the same size (e.g. map()).

So, "The data type of elements in resulting RDD will always be same as original" is the false one.
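
For instance, a quick sketch (assuming a SparkContext named sc) illustrating each case:

val nums = sc.parallelize(1 to 10)   // 10 elements, RDD[Int]
nums.filter(_ % 2 == 0)              // smaller: 5 elements
nums.flatMap(x => Seq(x, x))         // bigger: 20 elements
nums.map(_ + 1)                      // same size: 10 elements
nums.map(_.toString)                 // same size, but RDD[Int] becomes RDD[String]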

 

All the best!

  Upvote    Share

Hi,

1. The resulting RDD is created from the same elements as the original RDD. Hence the data type in the original and resulting RDD should be the same, right? Hence this statement looks to be true. (last option in the above question)

2. The number of elements in the resulting RDD can be fewer than in the original RDD. So, this statement looks false. (first option in the above question)

So, shouldn't the answer to the above question be the first option? Please clarify if I am missing anything.

  Upvote    Share

> The resulting RDD is created from the same elements as the original RDD. 

Yes, but the result is a function of what you are doing in the transformation.

Check this:

var src_rdd = sc.parallelize(1 to 10)

// src_rdd is an RDD of integers

var result_rdd = src_rdd.map(x => ":" + x.toString + ":")

// result_rdd is an RDD of strings

 1  Upvote    Share

I agree; however, this statement is also false:

"The number of elements in resulting RDD will always be same as original"

sc.parallelize(1 to 10).filter(_ % 5 == 0)

...so shouldn't it be an accepted answer as well?

  Upvote    Share

> The number of elements in resulting RDD will always be same as original

This is true for map.
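
A minimal sketch (assuming a SparkContext sc) showing that map keeps the element count unchanged:

val nums = sc.parallelize(1 to 10)
nums.count                    // 10
nums.map(x => x * x).count    // still 10: map emits exactly one output element per input element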

  Upvote    Share


I think I have not come across anything so far on map transformations. Please can you give more examples of map transformations? It's required for deep learning.

  Upvote    Share
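
A few common map patterns you can try on the lab (a rough sketch, assuming a SparkContext sc is available):

val words = sc.parallelize(Seq("spark", "map", "rdd"))
words.map(_.toUpperCase)                   // RDD[String] => RDD[String], e.g. "SPARK"
words.map(w => (w, w.length))              // RDD[String] => RDD[(String, Int)], key-value pairs
sc.parallelize(1 to 5).map(x => x * 2.0)   // RDD[Int] => RDD[Double]

In every case the number of elements stays the same; only the value (and possibly the type) of each element changes.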


Please provide some examples to support the answer.

  Upvote    Share

Can you please give an example where the data type of the elements in the resulting RDD differs from the original after a map transformation?

 1  Upvote    Share

Hello,

As per the first option, "The number of elements in resulting RDD will always be same as original", here it is different. Can you please verify?

scala> val stringRdd = sc.parallelize(Array("one","two","three","four","five"))
stringRdd: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD[20] at parallelize at <console>:24

scala> stringRdd.count
res9: Long = 5

scala> val unionRdd = stringRdd.union(stringRdd)
unionRdd: org.apache.spark.rdd.RDD[String] = UnionRDD[21] at union at <console>:25

scala> unionRdd.count
res11: Long = 10

  Upvote    Share

In the question we are talking about map(), but in your example you have used union().
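
For comparison, a rough sketch of the same check done with map() (assuming the stringRdd defined above):

val mappedRdd = stringRdd.map(_.toUpperCase)
mappedRdd.count   // 5, same as stringRdd.count: map never changes the number of elements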

  Upvote    Share