Check the data using the cat command. Since the file is big, you can pipe the output to "more" to view it page by page:
cat /cxldata/big.txt | more
Replace each space with a newline so that every line of the output contains only a single word:
cat /cxldata/big.txt | sed 's/ /\n/g' | more
For example, after replacing each space with a newline in "I am ok" we should get:
I
am
ok
The "g" at the end of the sed expression is a flag that makes sed replace every occurrence of a space on a line instead of only the first one.
Also, note that this command consists of three programs connected by two pipes. The output of cat goes to sed, and the output of sed goes to more, which displays it page by page.
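The effect of the "g" flag can be checked on a small input before running the command on the big file. The following is a minimal sketch, assuming GNU sed (which interprets \n in the replacement as a newline; other sed implementations may behave differently). The variable names are made up for illustration.

```shell
# Sample text standing in for a line of the real file (illustrative only).
sample='I am ok'

# With the g flag: every space on the line becomes a newline.
all_replaced=$(printf '%s\n' "$sample" | sed 's/ /\n/g')

# Without the g flag: only the first space on the line is replaced.
first_only=$(printf '%s\n' "$sample" | sed 's/ /\n/')

printf 'with g:\n%s\n\n' "$all_replaced"
printf 'without g:\n%s\n' "$first_only"
```

With the flag, the three words end up on three separate lines. Without it, "am ok" stays together on the second line.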
We can sort the words using the sort command in the following way:
cat /cxldata/big.txt | sed 's/ /\n/g' | sort | more
Note that we are using the "more" command only to avoid screen-blindness (too much text scrolling past at once).
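We can now count the words using the uniq command. With the -c option, uniq prints each distinct line prefixed by the number of times it occurs. It only merges duplicate lines that are adjacent, which is why the words are sorted first:
cat /cxldata/big.txt | sed 's/ /\n/g' | sort | uniq -c | more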
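To see why sort has to come before uniq, it may help to compare both orders on a tiny word list. This is a small sketch with made-up sample words. The awk step is not part of the exercise; it is used here only to strip the padding that uniq -c puts in front of each count, so the two results are easy to compare.

```shell
# Illustrative sample: one word per line, with "b" appearing twice but not adjacent.
words=$(printf 'b\na\nb\n')

# uniq -c only merges adjacent duplicate lines, so unsorted input counts "b" twice, separately.
unsorted_counts=$(printf '%s\n' "$words" | uniq -c | awk '{print $1, $2}')

# Sorting first puts identical words next to each other, so each word gets one total.
sorted_counts=$(printf '%s\n' "$words" | sort | uniq -c | awk '{print $1, $2}')

printf 'unsorted:\n%s\n\nsorted:\n%s\n' "$unsorted_counts" "$sorted_counts"
```

In the unsorted case "b" is reported on two separate lines with a count of 1 each. After sorting it is reported once with a count of 2.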
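Please save the result of the command to a file named "word_count_results" in your home directory. The ">" operator redirects the output to the file instead of the screen. Run the command from your home directory so that the file is created there:
cat /cxldata/big.txt | sed 's/ /\n/g' | sort | uniq -c > word_count_results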
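Once the counts are in a file, they can be inspected with the same tools. The sketch below runs the whole pipeline on a small temporary file and then lists the most frequent word first. The temporary files and the sample sentence are hypothetical stand-ins for /cxldata/big.txt and word_count_results, and it again assumes GNU sed for the \n replacement.

```shell
# Hypothetical stand-ins for the real input and output files.
input_file=$(mktemp)
output_file=$(mktemp)
printf 'the cat saw the dog\nthe dog ran\n' > "$input_file"

# Same pipeline as in the exercise; ">" writes the counts to a file.
cat "$input_file" | sed 's/ /\n/g' | sort | uniq -c > "$output_file"

# Number of distinct words equals the number of lines in the result.
distinct_words=$(wc -l < "$output_file")

# sort -n compares numerically and -r reverses, so the highest count comes first.
top_word=$(sort -nr "$output_file" | head -n 1 | awk '{print $2}')

echo "distinct words: $distinct_words"
echo "most frequent word: $top_word"

# Clean up the temporary files.
rm -f "$input_file" "$output_file"
```

On the real data, the same check would be "sort -nr word_count_results | more", run from the directory where the result file was saved.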