Spark On Cluster

Running on cluster

When we run Spark with YARN, how does it discover the IP address of the YARN ResourceManager?

  • Spark needs to be installed on the Hadoop ResourceManager node, so it can just use localhost
  • It finds the configuration files in the directory pointed to by the environment variable HADOOP_CONF_DIR (or YARN_CONF_DIR) and reads the ResourceManager address from them
  • We have to specify the IP of the YARN ResourceManager while downloading Spark
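A minimal sketch of how this works in practice: export the configuration directory and submit with `--master yarn`; Spark then reads `yarn-site.xml` from that directory (the `yarn.resourcemanager.address` property) to locate the ResourceManager, so no IP appears on the command line. The paths, Spark version, and example class below are illustrative assumptions, not values from the original notes.

```shell
# Point Spark at the YARN client-side configuration files.
# (Assumed path: /etc/hadoop/conf; adjust for your cluster.)
export HADOOP_CONF_DIR=/etc/hadoop/conf   # YARN_CONF_DIR also works

# No ResourceManager IP here -- Spark discovers it from yarn-site.xml
# found under HADOOP_CONF_DIR.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  "$SPARK_HOME/examples/jars/spark-examples_2.12-3.5.0.jar" 100
```

If neither variable is set, `spark-submit --master yarn` fails at launch because it cannot find the cluster configuration, which is a quick way to confirm that discovery really goes through these environment variables.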