Apache Spark Basics


Getting Started with Spark using CloudxLab





Note:

We've recently upgraded the lab. Please go through the blogs below to read about how to access the new versions of Spark on CloudxLab.



17 Comments

When I run the spark-submit command as shown in the video, it does not allow me to type the entire command. It takes me back to the prompt and starts to overwrite.


Hi Troydon,

It's working fine on my end. Can you please share a screenshot of the screen when this happens?


I am getting an issue.


Can you restart the spark-shell? It looks like the Spark session was not created successfully.


I am not able to read the file or run the take command in spark-shell. Can you help me understand the issue?

I am getting the "file does not exist" error. Here is the output from running `hadoop fs -ls "data/mr/wordcount/input/"`


Hi Vrinda,

You are providing the wrong path to the textFile function. The path is missing the leading '/', which marks it as an absolute path from the root directory. Without it, the path is treated as a relative path and is looked up in your HDFS home directory.
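To make the difference concrete, here is a minimal Python sketch of how a path string is interpreted. It assumes the default HDFS home directory layout of /user/<username>; the function name `resolve_hdfs_path` and the username `your_username` are hypothetical and only for illustration, not part of Hadoop or Spark.

```python
import posixpath


def resolve_hdfs_path(path, username):
    """Return the absolute HDFS path that a path string would refer to.

    Assumption: the user's HDFS home directory is /user/<username>,
    which is the usual default.
    """
    if path.startswith("/"):
        # Absolute path: used as given, starting from the root directory.
        return posixpath.normpath(path)
    # Relative path: resolved against the user's HDFS home directory.
    home = posixpath.join("/user", username)
    return posixpath.normpath(posixpath.join(home, path))


# Without the leading '/', the lookup happens under the home directory.
print(resolve_hdfs_path("data/mr/wordcount/input", "your_username"))
# With the leading '/', the lookup starts at the root directory.
print(resolve_hdfs_path("/data/mr/wordcount/input", "your_username"))
```

So the relative form points somewhere under your own home directory, where the dataset most likely does not exist, while the form with the leading '/' points at the shared location.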


Hi, is it possible to upload my own data and run the code on that data? Please explain the process. Thanks!


Hi, while using the command - hadoop fs -copyFromLocal C:\Users\myID\OneDrive - myComp\Download\titanic_dataset --

I am getting the error: -copyFromLocal: Can not create a Path from a null string


Hi Mantasha,

The error you're encountering, "Can not create a Path from a null string," suggests that there is an issue with the path you provided. The correct syntax for using copyFromLocal is:

hadoop fs -copyFromLocal your_file_path destination_path
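One thing worth checking is whether the path contains spaces, since the shell splits an unquoted path into several arguments. Note also that copyFromLocal reads from the filesystem of the machine where the command runs, so the file has to be present on that machine first. The sketch below uses Python's `shlex` module to build a safely quoted command line; the helper name `build_copy_command` and both example paths are hypothetical.

```python
import shlex


def build_copy_command(local_path, hdfs_destination):
    """Build a shell-safe 'hadoop fs -copyFromLocal' command line.

    shlex.join quotes every argument that needs it, so a path that
    contains spaces stays a single argument.
    """
    return shlex.join(
        ["hadoop", "fs", "-copyFromLocal", local_path, hdfs_destination]
    )


# Hypothetical local path containing a space, and a hypothetical
# destination relative to the HDFS home directory.
local_path = "/home/your_username/My Downloads/titanic_dataset"
cmd = build_copy_command(local_path, "titanic_dataset")
print(cmd)

# For comparison: the same path without quoting is split on the space.
unquoted = "hadoop fs -copyFromLocal " + local_path + " titanic_dataset"
print(shlex.split(unquoted))
```

The printed command can be pasted into the console as is. Quoting the path by hand in the console has the same effect.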

 


I am not getting the Hue option on the My Lab page. I have the Ambari, Jupyter and web console options. How can I get it?


Hi,

We don't provide Hue anymore. You can refer to https://discuss.cloudxlab.com/t/should-we-be-using-hue/5821/2?u=shubh_tripathi for more details.


Hi Team,
I am getting the error below while logging in to the Spark shell. Could you please help me with this?


Hi Sarita,

It's because you have files named "abc.py" and "abc.pyc" in your home directory, which shadow the stdlib abc module. You should delete them to make it work.
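If you want to check for other files of this kind, a minimal Python sketch along these lines may help. It assumes you run it from the directory you want to inspect (for example, your home directory). The function name `find_shadowing_files` is hypothetical, and the set of module names is only a small illustrative sample, not the full standard library.

```python
import os

# Illustrative sample of standard-library module names (not exhaustive).
ILLUSTRATIVE_STDLIB_NAMES = {
    "abc", "io", "os", "re", "sys", "json", "types", "random",
}


def find_shadowing_files(directory, module_names=ILLUSTRATIVE_STDLIB_NAMES):
    """List .py/.pyc files in `directory` named like a stdlib module.

    Such files can be imported instead of the real module when Python
    is started from that directory.
    """
    shadowing = []
    for entry in sorted(os.listdir(directory)):
        stem, ext = os.path.splitext(entry)
        if ext in (".py", ".pyc") and stem in module_names:
            shadowing.append(entry)
    return shadowing


if __name__ == "__main__":
    # Inspect the current working directory.
    print(find_shadowing_files("."))
```

Any file it reports is a candidate for renaming or deleting, in the same way as abc.py and abc.pyc above.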


The video shows how to browse the files using Hue, but Hue is not available in CloudxLab. Kindly tell me how I can check the files.


Hi

You can check files in a particular directory by using the following command in the console:

 hadoop fs -ls path

Just replace path with the path of the directory you want to look in.

So, if you want to check the path "/data/mr/wordcount/input/", you can run:

hadoop fs -ls "/data/mr/wordcount/input/"

To view the contents of a file, you can run the following command in the console, replacing filepath with the path of the file:

hadoop fs -cat filepath

Thanks
