Adv Spark Programming


Adv Spark Programming - Hardware Provisioning

Slides - Adv Spark Programming (2)



15 Comments

"jvm can suffer garbage collector pauses for large heap sizes" pls explain his statement i dint understand.


Hi,

The JVM Garbage Collector (GC) is responsible for tracking which objects are created and allocated space in the heap memory, which objects are in use and which are not, and deleting the unused objects so that their space can be reused. To delete unused objects, the GC may use a full or partial "stop the world" strategy (meaning no other threads can continue until its work is complete). Thus, there is a pause time involved, and larger heap sizes consequently lead to longer garbage collector pauses.
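If you want to observe these pauses yourself, the Spark tuning guide suggests turning on GC logging in the executor JVMs. A minimal sketch (the app name is hypothetical, and the flags shown are the classic pre-Java-9 ones):

import org.apache.spark.{SparkConf, SparkContext}

// Ask each executor JVM to log its GC activity, so pause times
// show up in the executor logs.
val conf = new SparkConf()
  .setAppName("GcLoggingExample") // hypothetical app name
  .set("spark.executor.extraJavaOptions",
    "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
val sc = new SparkContext(conf)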

Thanks.


How do I see the Spark UI here?


Hi,

I think the video is incomplete at the end. Please look into it.


Actually, the video ends at 50s. Please ignore the remaining part. We will be updating the video soon.


Fixed it.


Coalescing a large RDD

scala> rdd1 = rdd.filter(lambda line: line.lower().startswith('this'));
<console>:1: error: ')' expected but '(' found.
rdd1 = rdd.filter(lambda line: line.lower().startswith('this'));
                                                              ^

What is the error here?


Hi Chitra,

I think you are writing Python code in the Scala shell. In Scala, the code should be:

val rdd1 = rdd.filter(line => line.toLowerCase().startsWith("this"))
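
And since this slide is about coalescing a large RDD: after a filter that discards most of the data, you would typically coalesce to fewer partitions so the remaining tasks are not mostly empty. A sketch (the partition count of 10 is just illustrative):

// Filter first, then shrink the number of partitions without a full shuffle.
val filtered = rdd.filter(line => line.toLowerCase().startsWith("this"))
val smaller = filtered.coalesce(10) // pick a count based on the filtered data size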

Hope this helps you.

Happy coding :)


Regarding "request smaller-sized executors": please explain this with more examples, it is not clear.


Hi Amit,

The parameter spark.executor.cores sets the number of cores to use on each executor. In standalone mode, setting it allows an application to run multiple executors on the same worker, provided the worker has enough cores; otherwise, only one executor per application will run on each worker.

https://spark.apache.org/docs/1.6.1/configuration.html#execution-behavior
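
As a sketch, assuming a standalone cluster (all numbers here are illustrative, and the app name is hypothetical):

import org.apache.spark.{SparkConf, SparkContext}

// Request several small executors instead of one big one:
// with 2 cores per executor and a cap of 8 cores for the whole
// application, the master can place up to 4 small executors.
val conf = new SparkConf()
  .setAppName("SmallExecutorsExample") // hypothetical app name
  .set("spark.executor.cores", "2")    // cores per executor
  .set("spark.executor.memory", "2g")  // small heap per executor
  .set("spark.cores.max", "8")         // total cores for this application
val sc = new SparkContext(conf)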

All the best!


Hi, kindly check the end; it looks like the explanation part is missing here.


Hi,
The example at the end is not explained. It is in the slides, so we can understand it, but the video seems incomplete.


Right. There is some audio that has gone missing. We are thoroughly reviewing the entire session. Thank you again for bringing it to our notice.


This issue is still not resolved after three years!
