The third approach is to use Unix commands in a pipeline, or chain. Let us first understand what a pipeline is.
As we discussed earlier, when we run a program it may take input from you. In other words, you may provide input to a program by typing. A program or command may also print some output on the screen.
In Unix, you can provide the output of one program as the input to another. This is known as piping. A pipe is denoted by the vertical bar symbol (|). command1 | command2 means the output of command1 becomes the input to command2.
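As a minimal sketch of piping, the snippet below sends the output of one command into another. The use of tr as the second command is an assumption made only for illustration; it is not part of the lesson.

```shell
# echo writes "hello world" to standard output;
# the pipe makes that text the standard input of tr,
# which converts lowercase letters to uppercase.
shouted=$(echo "hello world" | tr '[:lower:]' '[:upper:]')
echo "$shouted"
```

The first command never needs to know where its output goes, which is what makes it possible to chain small commands together.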
Let us take an example.
The echo command prints to standard output whatever arguments are passed to it.
For example, echo "Hi" prints "Hi" on the screen.
The wc command prints the number of lines, words, and characters in whatever you type on standard input. Let me show you. Start the wc command, type some text, say "hi", a newline, and "how are you", press Enter, and then press Ctrl+D to end the input.
It prints the number of lines, words, and characters, which are 2, 4, and 15 respectively.
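The same counts can be reproduced without typing interactively. The sketch below assumes printf is used to generate the two lines, which is a stand-in for the keyboard input and not something the lesson itself does.

```shell
# printf emits the same two lines as the typed session, each ended by a newline,
# and the pipe feeds them to wc on standard input.
counts=$(printf 'hi\nhow are you\n' | wc)
# wc prints lines, words, and characters in that order;
# the amount of padding between the numbers varies by system.
echo "$counts"
```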
If we want to count the number of words or characters in the output of the echo command, we could use a command like:
echo "Hello, World" | wc
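To get a single count rather than all three, wc accepts the -w (words) and -c (bytes) options. The variable names below are illustrative only.

```shell
# Count only the words in the output of echo.
words=$(echo "Hello, World" | wc -w)
# Count only the bytes; echo appends a newline, which is counted too.
chars=$(echo "Hello, World" | wc -c)
echo "words: $words, characters: $chars"
```

For plain ASCII text like this, the byte count and the character count are the same.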
Let us try to understand this pipeline of commands for word counting in parts.
The first command, cat myfile, prints the contents of the file "myfile".
The second command in the chain is sed. sed stands for stream editor. It is used to replace text in the input with something else, much like the search-and-replace feature of text editors. You can use extended regular expressions with sed by passing it the -E option.
sed -E 's/[\t ]+/\n/g' replaces each run of spaces and tabs with a newline. Essentially, it converts the text into one word per line. So, when you chain cat myfile with this sed command, it prints one word per line from the file.
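The effect of this substitution can be seen on a single line of text. This sketch assumes GNU sed, which interprets \t and \n in the expression; other sed implementations may treat these escapes differently. The sample sentence is made up.

```shell
# Assumption: GNU sed. The input mixes single spaces, a tab, and a run of spaces.
words=$(printf 'the quick\tbrown   fox\n' | sed -E 's/[\t ]+/\n/g')
# Each run of whitespace has become one newline, leaving one word per line.
echo "$words"
```

Because the pattern ends in +, several consecutive spaces produce a single newline rather than empty lines.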
This one-word-per-line text can be sent on to a command called sort, which orders the lines of its input. The sort command takes various options. The -S option limits the size of the memory buffer it uses. In our case, we are using -S 1g to sort the data using only 1 gigabyte of memory.
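A small sketch of sort with the memory option follows. The input words are invented for illustration, and the -S 1g form is taken from the text above; it is assumed here that the installed sort accepts it, as GNU sort does.

```shell
# Four unordered lines go in; sort emits them in order.
# -S 1g caps the in-memory buffer at 1 gigabyte, as in the lesson.
sorted=$(printf 'pear\napple\npear\nfig\n' | sort -S 1g)
echo "$sorted"
```

Note that sort keeps duplicate lines; it only places them next to each other.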
The last command is uniq. The uniq command finds the unique lines in the input. It only compares adjacent lines, so it expects the data to be sorted already. If the input to uniq is not sorted, the result is not correct. The uniq command has a -c option, which prefixes each unique line with the number of times it occurs. So uniq -c prints the count of each unique word in the sorted input.
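The need for sorted input can be checked directly. In the sketch below (sample words are made up), the same three lines are counted once without sorting and once with it.

```shell
# Unsorted: the two "pear" lines are not adjacent, so uniq -c reports
# three separate entries, each with a count of 1.
unsorted_counts=$(printf 'pear\napple\npear\n' | uniq -c)
# Sorted first: the duplicates are adjacent and are counted together.
sorted_counts=$(printf 'pear\napple\npear\n' | sort | uniq -c)
echo "without sort:"
echo "$unsorted_counts"
echo "with sort:"
echo "$sorted_counts"
```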
So, the entire pipeline, ending in sort followed by uniq -c, prints the count of each unique word in the text file.
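Putting the four commands described above together gives a pipeline along the following lines. This is a sketch assembled from the commands named in the text, assuming GNU sed and sort; the temporary file and its contents are hypothetical and stand in for "myfile".

```shell
# Create a small sample file to stand in for "myfile" (made-up content).
myfile=$(mktemp)
printf 'to be or not\nto be\n' > "$myfile"

# cat prints the file, sed puts one word per line,
# sort orders the words, and uniq -c counts each distinct word.
counts=$(cat "$myfile" | sed -E 's/[\t ]+/\n/g' | sort -S 1g | uniq -c)
echo "$counts"

# Clean up the temporary file.
rm -f "$myfile"
```

Each line of the result holds a count followed by the word it belongs to.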