Spark On Cluster


Apache Spark - Running On Cluster - Launching

Let us understand what happens under the hood when a program is launched on Spark.

First, the user submits an application using, say, the spark-submit command. The same can be done from spark-shell.

The spark-submit command launches the driver program.
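As a rough illustration of the submission step, the sketch below assembles a spark-submit command line in Python and prints it without running it. The helper build_submit_command, the file name my_app.py, and all default values are assumptions made for this example. The flags shown are standard spark-submit options, but --num-executors applies only on some cluster managers such as YARN.

```python
import shlex


def build_submit_command(app_path, master="yarn", deploy_mode="cluster",
                         num_executors=4, executor_cores=2,
                         executor_memory="2g"):
    """Return the argument list for a spark-submit call (hypothetical helper)."""
    return [
        "spark-submit",
        "--master", master,                      # which cluster manager to contact
        "--deploy-mode", deploy_mode,            # where the driver itself runs
        "--num-executors", str(num_executors),   # executors to request (YARN)
        "--executor-cores", str(executor_cores),
        "--executor-memory", executor_memory,
        app_path,                                # assumed application file
    ]


command = build_submit_command("my_app.py")
# Print the command instead of executing it, so no Spark installation is needed.
print(shlex.join(command))
```

Passing the list to subprocess.run on a machine with Spark installed would perform the actual submission.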

The driver invokes the main() method specified by the user and creates the SparkContext.

Then, the driver program contacts the cluster manager to request resources such as CPU cores and memory.

Next, the cluster manager launches executors on various nodes in the cluster. As discussed earlier, these executors run the tasks.
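The resource request and executor launch described above can be pictured with a small toy model. This is only a sketch: Node, ToyClusterManager, and the placement loop are hypothetical names and logic invented for illustration. They are not part of Spark, and real cluster managers use their own scheduling policies.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A worker machine with some free capacity (hypothetical model)."""
    name: str
    free_cores: int
    free_memory_gb: int
    executors: list = field(default_factory=list)


class ToyClusterManager:
    """Hypothetical stand-in for a cluster manager; not a Spark API."""

    def __init__(self, nodes):
        self.nodes = nodes

    def launch_executors(self, count, cores, memory_gb):
        """Place up to `count` executors on nodes that still have room."""
        launched = []
        progress = True
        # Keep cycling over the nodes until the request is met or nothing fits.
        while len(launched) < count and progress:
            progress = False
            for node in self.nodes:
                if len(launched) == count:
                    break
                if node.free_cores >= cores and node.free_memory_gb >= memory_gb:
                    node.free_cores -= cores
                    node.free_memory_gb -= memory_gb
                    executor_id = f"executor-{len(launched)}"
                    node.executors.append(executor_id)
                    launched.append((executor_id, node.name))
                    progress = True
        return launched


# The "driver" asks for three executors with 2 cores and 2 GB each.
nodes = [Node("node-a", 4, 8), Node("node-b", 2, 4)]
manager = ToyClusterManager(nodes)
placements = manager.launch_executors(count=3, cores=2, memory_gb=2)
print(placements)
```

If the cluster cannot satisfy the whole request, this toy manager returns fewer executors than were asked for rather than failing.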

The driver process then runs through the user application, breaking its work into tasks.

The driver sends work to the executors in the form of tasks.

Tasks run on the executor processes, which compute and save the results.

Once the main() method exits or the SparkContext is stopped with sc.stop(), the driver asks the cluster manager to terminate the executors and release the resources.
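The last few steps (the driver handing out tasks, executors computing results, and resources being released at the end) can be mimicked with Python's standard library. This is a loose analogy and not Spark code. The functions run_task and driver_main are hypothetical, and a local thread pool stands in for the executors, which in a real cluster are separate processes on other nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task(partition):
    """One task: compute the sum of squares of a single partition."""
    return sum(x * x for x in partition)


def driver_main(data, num_partitions=4, num_executors=2):
    """Toy driver: split the data into partitions and run one task on each."""
    partitions = [data[i::num_partitions] for i in range(num_partitions)]
    # The pool plays the role of the executors. Leaving the `with` block
    # shuts the pool down, loosely mirroring sc.stop() releasing executors.
    with ThreadPoolExecutor(max_workers=num_executors) as executors:
        partial_results = list(executors.map(run_task, partitions))
    # The driver combines the per-task results into the final answer.
    return sum(partial_results)


total = driver_main(list(range(1, 11)))
print(total)
```

Changing num_partitions changes how many tasks are created, while num_executors changes how many of them can run at the same time. The final result stays the same either way.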

