Bash: Read File Line by Line


Let’s say we have a file named in.txt with the following content:

$ cat in.txt
    # comment

and we want to read it line by line and do something with each one of them.

The following could be a solution:


for line in $(cat in.txt); do
    echo "line:$line"
done

Let’s see if it works:

 $ ls
 in.txt   null.txt

 $ ./

Hm, the * character was replaced by the filenames in the current directory. Why did that happen? First, some background on three steps Bash performs when it expands a command line.

Command Substitution

Allows the output of a command to replace the command itself. It occurs when a command is enclosed as $(command) or `command`.
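
As a small illustration of command substitution (a sketch with made-up variable names, not part of the scripts in this post), the snippet below captures a command's output with both syntaxes. It also shows that the shell strips trailing newlines from the captured output while keeping the ones in between.

```bash
# Modern form: $(command). The output replaces the substitution.
modern=$(printf 'in.txt\nnull.txt\n')

# Legacy form: backticks. Equivalent, but harder to nest and read.
legacy=`printf 'in.txt'`

# Trailing newlines are stripped; the newline between the names is kept.
printf '[%s]\n' "$modern"
printf '[%s]\n' "$legacy"
```

The $(command) form is generally easier to nest, so it is the one used throughout this post.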

Word Splitting

The shell scans the results of parameter expansion, command substitution, and arithmetic expansion that did not occur within double quotes for word splitting.
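
To make word splitting concrete, here is a minimal sketch that assumes the default IFS (space, tab, newline). The helper count_args is a hypothetical name introduced only for this illustration; it reports how many arguments it received.

```bash
# Helper (hypothetical name): prints how many arguments it received.
count_args() { printf '%s\n' "$#"; }

text='    # comment'

# Unquoted: the expansion is split on spaces, giving "#" and "comment".
unquoted=$(count_args $text)

# Quoted: no word splitting, so the whole string is one argument.
quoted=$(count_args "$text")

printf 'unquoted=%s quoted=%s\n' "$unquoted" "$quoted"
```

With the default IFS this prints unquoted=2 quoted=1: the leading spaces vanish and the line becomes two separate words unless the expansion is quoted.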

Filename Expansion

After word splitting, unless the -f option has been set, Bash scans each word for the characters *, ?, and [. If one of these characters appears, then the word is regarded as a pattern, and replaced with an alphabetically sorted list of filenames matching the pattern.
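
The effect of filename expansion, and of set -f, can be reproduced in isolation. The sketch below assumes mktemp is available and uses a scratch directory holding the same two filenames as the listing above; the variable names are illustrative only.

```bash
# Scratch directory with the two filenames used in this post.
tmpdir=$(mktemp -d)
touch "$tmpdir/in.txt" "$tmpdir/null.txt"

pattern='*'

# Unquoted expansion: the word * is replaced by the matching filenames.
expanded=$(cd "$tmpdir" && echo $pattern)

# With set -f (noglob), the word is left as a literal *.
# set -f runs inside the $(...) subshell, so it does not affect the caller.
literal=$(cd "$tmpdir" && set -f && echo $pattern)

printf 'expanded: %s\n' "$expanded"
printf 'literal:  %s\n' "$literal"

rm -rf "$tmpdir"
```

Running set -f inside the subshell keeps the option from leaking into the rest of the script; in a real script you would otherwise restore globbing with set +f.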

So, Bash treated the * character, which appears in in.txt, as a pattern and replaced it with the list of filenames in the current directory. What if we put the $(cat in.txt) in double quotes?

$ cat

for line in "$(cat in.txt)"; do
    echo "line:$line"
done

$ ./
line:    # comment

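A little better, but now the loop runs only once: inside double quotes the command substitution is not word-split, so the whole file content ends up in a single value of line. Obviously, neither of these solutions can be used. So, let’s remove the double quotes and add set -f, the option the Bash man page mentions for disabling filename expansion.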

$ cat

set -f
for line in $(cat in.txt); do
    echo "line:$line"
done

$ ./

Better now, but the # comment line is still split into two lines, because word splitting also breaks on the space between # and comment.

Solution 1

$ cat

while read line; do
    echo "line:$line"
done < in.txt

$ ./
line:# comment

Nice! The wildcard * is no longer treated as a pattern, but the leading spaces have been removed. This happens because read, by default, strips leading and trailing IFS whitespace (spaces and tabs) from the line. Fortunately, we can prevent this by setting IFS to an empty value for the read command, as in the example below.

Final Solution

$ cat

# IFS= keeps `read` from trimming leading and trailing whitespace from each line
# -r keeps `read` from interpreting backslashes as escape characters
# printf is safer than echo when $line is a string such as -n, which echo would treat as an option

while IFS= read -r line; do
    printf 'line:%s\n' "$line"
done < in.txt

$ ./
line:    # comment
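
The sample file contains no backslashes, so the effect of -r is not visible in the output above. As a hedged illustration, the sketch below feeds read a made-up input line (C:\temp\new is purely a hypothetical example) through a quoted here-document, once without -r and once with it.

```bash
# Without -r, read treats a backslash as an escape character and drops it.
IFS= read line_without_r <<'EOF'
C:\temp\new
EOF

# With -r, backslashes are kept exactly as they appear in the input.
IFS= read -r line_with_r <<'EOF'
C:\temp\new
EOF

printf 'without -r: %s\n' "$line_without_r"
printf 'with -r:    %s\n' "$line_with_r"
```

The first printf shows C:tempnew, while the second shows the line unchanged, which is why -r is the safe default when the goal is to read lines verbatim.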


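Note that read does more than fetch a line: it reads one line and then splits it into words using IFS. The first word is assigned to the first variable, the second word to the second variable, and so on. If there are more words than variables, the remaining words, together with the separators between them, are assigned to the last variable. With a single variable and an empty IFS, no splitting or trimming takes place, which is why the final solution preserves each line as is.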
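
The way read distributes words across variables can be seen in a short sketch. The variable names first, second, rest, and whole are chosen only for this illustration, and the default IFS is assumed for the first read.

```bash
# One line of input, three variables: the last one receives the remainder.
read -r first second rest <<'EOF'
one two three four five
EOF

printf 'first=%s second=%s rest=%s\n' "$first" "$second" "$rest"

# With IFS cleared and a single variable, no splitting or trimming happens.
IFS= read -r whole <<'EOF'
    # comment
EOF

printf 'whole=[%s]\n' "$whole"
```

Here rest ends up holding "three four five", while whole keeps its four leading spaces.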

