Let us understand what happens under the hood when a program is launched on Spark.
First, the user submits an application, for example with the spark-submit command. The same can be done with spark-shell.
The spark-submit command launches the driver program.
The driver invokes the main() method specified by the user and creates the SparkContext.
Then, the driver program contacts the cluster manager to request resources such as CPU cores and memory.
Next, the cluster manager launches executors on various nodes in the cluster. As discussed earlier, these executors run the tasks.
The driver process runs through the user application.
Based on the transformations and actions in the program, the driver sends work to the executors in the form of tasks.
The tasks run on the executor processes to compute and save results.
Once the main() method exits or the SparkContext is stopped with sc.stop(), the driver tells the cluster manager to terminate the executors and release the resources.
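The workflow above can be mimicked with a small toy model. This is not Spark code and does not use any Spark API: it is a minimal sketch, using only Python's standard library, in which a hypothetical function run_application plays the driver, a thread pool stands in for the executors launched by the cluster manager, and each slice of the input is one task. The names run_application, num_executors and num_tasks are made up for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_application(data, num_executors=2, num_tasks=4):
    """Toy 'driver': split the data into tasks, run them on a worker pool, combine results."""
    # The driver breaks the work into tasks (here: one slice of the data per task).
    tasks = [data[i::num_tasks] for i in range(num_tasks)]

    # Stand-in for the cluster manager launching executors: a pool of workers.
    # Leaving the `with` block shuts the pool down, mirroring how the executors
    # are terminated and resources released when the application ends.
    with ThreadPoolExecutor(max_workers=num_executors) as executors:
        # The driver sends each task to an executor; each task computes a partial sum.
        partial_sums = list(executors.map(sum, tasks))

    # The driver combines the partial results into the final answer.
    return sum(partial_sums)


if __name__ == "__main__":
    print(run_application(list(range(1, 101))))  # prints 5050
```

In a real Spark application the executors are separate processes on cluster nodes rather than threads in one process, but the order of events is the same: acquire workers, send tasks, collect results, release the workers.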
3 Comments
"The driver process runs through the user application": can you explain what this means?
The starting point of the user's application is the driver. When you launch pyspark or spark-shell, you are the user, and pyspark or spark-shell launches the driver.
Can you please add the video transcripts? It would be helpful.