Apache Spark Basics

Apache Spark - Actions - take & saveAsTextFile

  • Action example - take()

    val arr = 1 to 1000000
    val nums = sc.parallelize(arr)
    def multipleByTwo(x:Int):Int = x*2

    Write the following commands in a new cell:

    val dbls = nums.map(multipleByTwo)

  • Action example - saveAsTextFile()

    val arr = 1 to 1000
    val nums = sc.parallelize(arr)
    def multipleByTwo(x:Int):Int = x*2

    Write the following commands in a new cell:

    val dbls = nums.map(multipleByTwo)
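
The cells above stop at the map() transformation, which is lazy and does not run anything by itself. As a minimal sketch of how the two actions named in this lesson could be invoked, the standalone program below calls take() and saveAsTextFile() on the mapped RDD. The object name, the local[2] master, the argument 5 passed to take(), and the explicit two partitions are assumptions made for illustration; in a notebook or spark-shell, sc already exists and only the lines inside main that use it are needed. The output directory name mydirectory follows the hadoop commands used later on this page.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical standalone wrapper; in a notebook, `sc` is already provided.
object TakeAndSaveSketch {
  def multipleByTwo(x: Int): Int = x * 2

  def main(args: Array[String]): Unit = {
    // Assumption: run locally with two worker threads.
    val conf = new SparkConf().setAppName("take-and-save-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Assumption: two partitions, so that two part files are written.
    val nums = sc.parallelize(1 to 1000, 2)
    val dbls = nums.map(multipleByTwo) // transformation: nothing runs yet

    // Action: take(n) brings the first n elements back to the driver as an Array.
    val firstFive = dbls.take(5)
    println(firstFive.mkString(", ")) // prints 2, 4, 6, 8, 10

    // Action: writes one part-NNNNN file per partition into the directory.
    // By default this fails if the directory already exists.
    dbls.saveAsTextFile("mydirectory")

    sc.stop()
  }
}
```

Because take() only needs the first few elements, Spark can avoid computing every partition, whereas saveAsTextFile() evaluates the whole RDD and writes it out.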

Note - In this video, we used Hue to access the results in HDFS. Hue has since been deprecated. Please use the commands below in the web console to access the files instead.

  • Log in to the web console
  • Check the files

    hadoop fs -ls mydirectory

  • Check the content of the first part

    hadoop fs -cat mydirectory/part-00000 | more

  • Check the content of the second part

    hadoop fs -cat mydirectory/part-00001 | more

If you run into issues related to the disk quota, see https://discuss.cloudxlab.com/t/the-diskspace-quota-of-user-is-execeeded/5156
