Posted November 9, 2009 by Spyros in Linux Tips

Use The wc Command to Count Words, Lines and other Tokens


As you read the title of this post, you may be wondering why we would need to count words, lines or bytes. If you think about it a bit, though, a command like wc turns out to be quite useful (and no, wc does not stand for water closet here; it stands for word count).

For instance, suppose that we have started a coding project and, after writing some code, want to check how many lines we have actually written. As the project grows, we need to keep track of how big the source code is. We may also find that a file has become overly large and needs to be split into two or three files, perhaps organized into classes.

Let’s first look at the most basic wc usage. Note that we will also be using the find command together with wc, because we need to feed wc the input files whose words, lines or other units it will count. If you are not comfortable with find, you may want to take a look at this tutorial about how to use the unix find command.

Now, suppose we have a file named “new.txt” and want to find out how many lines it contains. The command is pretty simple:

wc -l new.txt

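This prints not only the number of lines in the file but also the file name, exactly as it was given on the command line. Note that wc -l counts newline characters, so a last line without a trailing newline is not included. If you do not want the file name in the output, do not specify an input file; pipe the content to wc instead: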

cat new.txt | wc -l
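As a small sketch of the same idea, the file can also be redirected into wc instead of being piped through cat. wc then reads standard input and prints no file name. The temporary directory and the three-line sample content below are made up for illustration, and tr is used only because some wc implementations pad the number with spaces.

```shell
# Create a throwaway sample file (hypothetical content, for illustration only)
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/new.txt"

# With a file argument, wc prints the count followed by the file name
wc -l "$tmpdir/new.txt"

# With input redirection, wc reads stdin and prints only the count;
# tr strips the padding spaces that some wc implementations add
line_count=$(wc -l < "$tmpdir/new.txt" | tr -d ' ')
echo "$line_count"

# Clean up the sample directory
rm -rf "$tmpdir"
```

Capturing the bare number this way is handy when the count has to be used later in a script, for example in a comparison.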

Counting the lines of a single file is often not enough. More commonly, we need the total number of lines in, say, all *.php files under a certain directory. While this seems a bit daunting at first, it is actually very easy:

wc -l `find . -name "*.php"`

Let’s analyze that command a bit. If you are not familiar with bash scripting, you may not know that text inside backticks (` `) is a command that the shell runs first, replacing it with its output. So find . -name "*.php" is executed first and returns a list of all php files in the current directory and its subdirectories. That list is then passed as arguments to wc -l, which prints the line count of each file followed by a total for all of them.
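One limitation of the backtick form is that the shell splits the list on whitespace, so file names containing spaces are broken apart. A possible alternative, sketched here under the assumption that your find supports the standard -exec ... {} + form, lets find hand the names to wc directly. The directory layout and file contents are invented for the example.

```shell
# Build a small sample tree (hypothetical files, for illustration only)
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/sub dir"
printf 'a\nb\n' > "$tmpdir/index.php"
printf 'c\nd\ne\n' > "$tmpdir/sub dir/my page.php"
printf 'ignored\n' > "$tmpdir/notes.txt"

# Per-file counts plus a total line; find passes the names to wc itself,
# so the space in "sub dir/my page.php" is handled correctly
find "$tmpdir" -name "*.php" -exec wc -l {} +

# Only the grand total: concatenate all matching files and count once
total_lines=$(find "$tmpdir" -name "*.php" -exec cat {} + | wc -l | tr -d ' ')
echo "$total_lines"

# Clean up the sample tree
rm -rf "$tmpdir"
```

The second form is also useful for very large trees, where find may run wc several times and therefore print more than one total line.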

To conclude this post, I would just like to add that wc can also be used to get the number of bytes, words and characters of some input, using these switches:

-c   for bytes

-l   for lines

-m   for characters

-w   for words
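To see the switches side by side, here is a short sketch that runs each of them on the same input. The sample file and its two lines of plain ASCII text are assumptions made for the example. With ASCII-only input, -c and -m give the same number; they can differ when the text contains multibyte characters and the locale is set to something like UTF-8.

```shell
# Create a throwaway ASCII sample file (hypothetical content)
sample=$(mktemp)
printf 'hello world\nsecond line\n' > "$sample"

# Each switch applied to the same input; tr strips any padding spaces
byte_count=$(wc -c < "$sample" | tr -d ' ')
line_count=$(wc -l < "$sample" | tr -d ' ')
char_count=$(wc -m < "$sample" | tr -d ' ')
word_count=$(wc -w < "$sample" | tr -d ' ')

echo "bytes: $byte_count"
echo "lines: $line_count"
echo "chars: $char_count"
echo "words: $word_count"

# Clean up the sample file
rm -f "$sample"
```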