Purpose: Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Home Page: https://spark.apache.org/
Documentation: https://spark.apache.org/docs/latest/
Related resources to learn: https://cloudxlab.com/assessment/playlist-intro/17/apache-spark-basics?course_id=1&playlist_id=17
https://cloudxlab.com/blog/running-pyspark-jupyter-notebook/
How to get started:
In the web console tab on the right side of the screen, type the following command to start the Spark Scala interactive shell:
spark-shell
Type the following command to start the Python Spark interactive shell:
pyspark
Type the following command to start the R interactive shell on Spark (/usr/spark2.6/bin/sparkR):
sparkR
Type the following command to submit a jar or Python application for execution on the cluster:
spark-submit
Type the following command to start the Spark SQL interactive shell:
spark-sql
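The commands above can also be checked and assembled from a small Python script. The following is a minimal sketch, not part of the original exercise: it uses only the Python standard library to report which of the five launchers are on the PATH, and to build a spark-submit argument list for a Python application. The file name my_app.py and the master value local[2] are assumptions chosen for illustration; on a real cluster the master setting will likely differ.

```python
import shutil

# Launcher names taken from the steps above.
LAUNCHERS = ("spark-shell", "pyspark", "sparkR", "spark-submit", "spark-sql")


def find_launchers(names=LAUNCHERS):
    """Map each launcher name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def build_submit_command(app_path, master="local[2]", app_args=()):
    """Build the argument list for submitting a Python application.

    app_path is a hypothetical application file, and the default master
    ("local[2]", two local threads) is an assumed value for illustration.
    """
    return ["spark-submit", "--master", master, app_path, *app_args]


if __name__ == "__main__":
    # Report where each launcher lives, if it is installed.
    for name, path in find_launchers().items():
        print(f"{name}: {path or 'not found on PATH'}")
    # Show the command line that would be run for a hypothetical my_app.py.
    print(" ".join(build_submit_command("my_app.py")))
```

The argument list returned by build_submit_command can be passed to subprocess.run to launch the job. Keeping it as a list rather than a single string avoids shell quoting problems when the application path or its arguments contain spaces.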