HDFS - Hadoop Distributed File System


HDFS - Hands-On with Console





We can also upload files from the CloudxLab Linux console to HDFS.

Let's say we want to copy the test.txt file from the CloudxLab Linux console to HDFS (a consolidated sketch of the commands follows the list).

  • Log in to the CloudxLab Linux console.
  • Create a file test.txt using the nano or vi editor.
  • Run the command hadoop fs -copyFromLocal test.txt.
  • This command copies the file from the CloudxLab Linux console to CloudxLab HDFS.
  • To verify that the file has been copied to HDFS, type hadoop fs -ls test.txt.
  • To see the content of the file in HDFS, type hadoop fs -cat test.txt.
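Below is a minimal sketch of the same steps as shell commands. It assumes you are in your home directory on the console and that the destination defaults to your HDFS home directory (/user/<your-login>):

    # Create a small local file (instead of using nano/vi)
    echo "Hello HDFS" > test.txt

    # Copy it from the local file system into HDFS
    hadoop fs -copyFromLocal test.txt

    # Verify that the file now exists in HDFS
    hadoop fs -ls test.txt

    # Print its contents from HDFS
    hadoop fs -cat test.txt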


14 Comments

Hi,

I am getting the following error while copying a file from local to HDFS: copyFromLocal: Cannot create file /user/bhardwajprince09046635/abc.txt._COPYING_. Name node is in safe mode.


Hi Prince,

The error was due to the name node being in safe mode. It is working fine now. Kindly check.


How do I log in to the Linux console? I am stuck there.


What is the difference between hadoop fs -put <file_name> and hadoop fs -copyFromLocal <file_name>?


copyFromLocal is similar to the -put command, except that the source is restricted to a local file reference.

-put copies a single source, or multiple sources, from the local file system to the destination file system. It can also read input from stdin and write to the destination file system when the source is set to "-".
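For illustration, here is a minimal sketch contrasting the two. The file name sample.txt and the destination names are placeholders, and the HDFS home directory /user/<your-login> is assumed as the default destination:

    # -copyFromLocal: the source must be a local file
    hadoop fs -copyFromLocal sample.txt sample_copy1.txt

    # -put: behaves the same way for local files
    hadoop fs -put sample.txt sample_copy2.txt

    # -put can also read from stdin when the source is "-"
    echo "streamed line" | hadoop fs -put - from_stdin.txt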


Hello, please tell me how I can log in to the web console.


I am new to Hadoop and I have some very basic doubts:

1. The web console is the console of the Name node. Is this correct?

2. When we run the copyFromLocal command, how does the data get stored in the Data Node?

3. After receiving data, to which machine does the Data Node send an ACK? How can we check whether the ACK was received or not?

4. Suppose I have to upload a file from my laptop to the CloudxLab Hadoop cluster. Do I need to first upload the file to the Name node and then run the copyFromLocal command on the Name node?


 1. The web console is just a command-line utility to interact with Hadoop. It is not specific to any of the namenodes or datanodes. To check the namenodes, run the following command:

hdfs getconf -namenodes

To get further details on the nodes, run the following command:

yarn node -list -all

2. You should refer to the command's documentation for details on its internal workings.

You seem to be confused about the namenode and datanodes; the other two doubts can only be cleared up once you have a clear understanding of these two.

In layman's terms, the namenode and datanodes are services that run in the background and listen on network ports. Getting some understanding of nodes, clusters, and ports should help clarify your doubts.
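For convenience, here are the two commands together as run from the console (no specific output is assumed):

    # List the namenode host(s) configured for the cluster
    hdfs getconf -namenodes

    # List all nodes registered with YARN
    yarn node -list -all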


Thank you for clarifying most of my doubts. Just the last doubt needs a bit more clarification.

Assume that there is 1 Namenode and 1 DataNode in the Hadoop cluster. The Namenode service is running on a physical machine 'A' and the DataNode service is running on a different physical machine 'B'.

I have to upload a file from my laptop to this Hadoop cluster. What are the steps I need to follow for this?

I know that using Hadoop's copyFromLocal command we can copy files from the local filesystem to the Hadoop filesystem, but I don't know how to establish a connection between my laptop and the Hadoop cluster.


That's a very interesting question. To upload files, you will need to interact with the namenode, because the namenode is the master node and coordinates all operations. The namenode then provides the addresses of the datanodes (slaves) to which the client starts writing the data, and the file is distributed among those datanodes.
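As a rough sketch (not an official CloudxLab procedure): if a Hadoop client were installed on your laptop and configured to reach the cluster, you could point it at the namenode and upload directly; otherwise, copy the file to a machine that already has the cluster configuration (such as the CloudxLab console) and run copyFromLocal from there. The host names, user names, and file names below are placeholders:

    # Option 1 (hypothetical): Hadoop client installed on the laptop;
    # namenode-A:8020 stands in for the real namenode address
    hadoop fs -D fs.defaultFS=hdfs://namenode-A:8020 -copyFromLocal report.csv /user/<your-login>/

    # Option 2: copy the file to a gateway node first, then upload from there
    scp report.csv <your-login>@<console-host>:~/
    ssh <your-login>@<console-host> "hadoop fs -copyFromLocal report.csv"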


Why are we using "-" in hadoop fs -cat test.txt?


-cat is an option of the hadoop fs (or hdfs dfs) command; the leading dash marks it as the operation to run rather than a file name.
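A tiny illustration, using the same test.txt from the lesson:

    hadoop fs -cat test.txt    # -cat selects the "print file contents" operation
    hadoop fs -ls test.txt     # -ls selects the "list file status" operation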
