Action example - take()
val arr = 1 to 1000000
val nums = sc.parallelize(arr)
def multipleByTwo(x:Int):Int = x*2
Write the following commands in a new cell:
val dbls = nums.map(multipleByTwo)
dbls.take(5)
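The distinction between the transformation and the action in this example can be sketched as follows. This is a minimal illustration, assuming a Spark shell or notebook session where `sc` (the SparkContext) is already defined; the name `firstFive` is introduced here only for illustration.

```scala
// Assumes a Spark shell or notebook where `sc` (SparkContext) already exists.
val nums = sc.parallelize(1 to 1000000)
def multipleByTwo(x: Int): Int = x * 2

// map is a transformation: it only records the computation, nothing runs yet.
val dbls = nums.map(multipleByTwo)

// take is an action: it triggers the job and returns an Array to the driver.
// `firstFive` is a hypothetical name used for this sketch.
val firstFive: Array[Int] = dbls.take(5)
println(firstFive.mkString(", "))  // 2, 4, 6, 8, 10
```

Because take(5) only needs the first five elements, Spark does not have to process the whole million-element RDD to answer it.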
Action example - saveAsTextFile()
val arr = 1 to 1000
val nums = sc.parallelize(arr)
def multipleByTwo(x:Int):Int = x*2
Write the following commands in a new cell:
val dbls = nums.map(multipleByTwo)
dbls.saveAsTextFile("mydirectory")
Note - In the video, we used Hue to access the results in HDFS. Hue has since been deprecated. Please use the commands below in the web console to access the files.
Check the files
hadoop fs -ls mydirectory
Check the content of the first part
hadoop fs -cat mydirectory/part-00000 | more
Check the content of the second part
hadoop fs -cat mydirectory/part-00001 | more
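The saved output can also be checked from Spark itself. The sketch below assumes the same session with `sc` predefined and that "mydirectory" was written by the saveAsTextFile() call above; the name `saved` is hypothetical. saveAsTextFile() writes one part file per partition of the RDD, and sc.textFile() on the directory reads all of those part files back as lines of text.

```scala
// Assumes `sc` exists and "mydirectory" was produced by saveAsTextFile above.
// `saved` is a hypothetical name used for this sketch.
val saved = sc.textFile("mydirectory")

// Each saved element became one line of text, so 1000 lines are expected.
println(saved.count())

// The lines are strings; convert them back to Int to inspect the values.
saved.map(_.toInt).take(5).foreach(println)
```

Note that saveAsTextFile() fails if the target directory already exists, so remove "mydirectory" or pick a different name before running the save a second time.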
If you run into any disk quota issues, see https://discuss.cloudxlab.com/t/the-diskspace-quota-of-user-is-execeeded/5156
2 Comments
How can I see my files in HDFS?
Hi Sushanta,
You can view all the files and directories in HDFS using the -ls command. Use:
hdfs dfs -ls /user/showsushanta331050
Make sure there is a space between dfs and -ls, and between -ls and /user.
Happy Learning!