Getting Started with various Tools


Purpose: Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.

Home Page: https://spark.apache.org/

Documentation: https://spark.apache.org/docs/latest/

Related resources to learn: https://cloudxlab.com/assessment/playlist-intro/17/apache-spark-basics?course_id=1&playlist_id=17


How to get started:

  1. In the web console tab on the right side of the screen, type spark-shell to run the Spark Scala interactive command line.

  2. Type pyspark to run the Python Spark interactive command line.

  3. Type sparkR to run R on Spark (/usr/spark2.6/bin/sparkR).

  4. Type spark-submit, followed by the path to your jar or Python application, to submit it for execution on the cluster.

  5. Type spark-sql to run the Spark SQL interactive shell.
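
The five steps above correspond to launcher scripts that ship in the bin/ directory of a standard Spark distribution. The following is a minimal sketch, not the platform's official setup: it assumes the launchers are on your PATH (on some installations you may need the full path, as with the sparkR path shown in step 3). The function name check_launchers and the file name my_app.py are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Launcher scripts from Spark's bin/ directory, in the order of the steps above.
# Assumption: they are on PATH; otherwise prefix each with your Spark bin directory.
launchers="spark-shell pyspark sparkR spark-submit spark-sql"

# check_launchers (hypothetical helper): report which launchers this console can find.
check_launchers() {
    for cmd in $launchers; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd: found at $(command -v "$cmd")"
        else
            echo "$cmd: not found on PATH"
        fi
    done
}

check_launchers

# Typical invocations once a launcher is available (run one at a time):
#   spark-shell              # step 1: Scala interactive shell
#   pyspark                  # step 2: Python interactive shell
#   sparkR                   # step 3: R interactive shell
#   spark-submit my_app.py   # step 4: my_app.py is a hypothetical application file
#   spark-sql                # step 5: Spark SQL interactive shell
```

Running the check first shows which of the five commands the console can resolve, so a "command not found" error in a later step can be traced to the PATH rather than to Spark itself.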

