Problem
Let’s say we have a file named in.txt with the following content (note the leading space on the first line):
$ cat in.txt
 # comment
*
We want to read it line by line and do something with each line.
The following could be a solution:
#!/bin/bash
for line in $(cat in.txt); do
    echo "line:$line"
done
Let’s see if it works:
$ ls
in.txt null.txt script.sh
$ ./script.sh
line:#
line:comment
line:in.txt
line:null.txt
line:script.sh
Hm, the * character was replaced by the names of the files in the current directory. Why did that happen? First, some background on a few Bash expansions.
Command Substitution
Command substitution allows the output of a command to replace the command itself. It occurs when a command is enclosed as $(command) or `command`.
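As a minimal sketch of command substitution, the snippet below captures the output of printf in two variables. The variable names greeting and legacy are only illustrative. Note that command substitution also strips trailing newlines from the captured output.

```shell
#!/bin/bash
# Capture the output of a command with the $(...) form.
# Trailing newlines in the output are removed by command substitution.
greeting=$(printf 'hello\n\n')
printf '[%s]\n' "$greeting"

# The older backtick form behaves the same way but is harder to nest.
legacy=`printf 'hello'`
printf '[%s]\n' "$legacy"
```

Both printf calls print [hello], which shows that the two forms are equivalent here.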
Word Splitting
The shell scans the results of parameter expansion, command substitution, and arithmetic expansion that did not occur within double quotes for word splitting.
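To make word splitting visible, this small sketch counts how many words an expansion produces with and without double quotes. The variable names are hypothetical, and the default IFS (space, tab, newline) is assumed.

```shell
#!/bin/bash
text='one   two three'

# Unquoted: the expansion is split on IFS whitespace into separate words.
set -- $text
unquoted_count=$#

# Quoted: the expansion stays a single word with its inner spaces intact.
set -- "$text"
quoted_count=$#

printf 'unquoted=%s quoted=%s\n' "$unquoted_count" "$quoted_count"
```

This prints unquoted=3 quoted=1, which is the same effect that splits the comment line of in.txt into two items in the for loop.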
Filename Expansion
After word splitting, unless the -f option has been set, Bash scans each word for the characters *, ?, and [. If one of these characters appears, then the word is regarded as a pattern, and replaced with an alphabetically sorted list of filenames matching the pattern.
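The following sketch shows filename expansion being applied to an unquoted variable, and then being suppressed with set -f. It runs in a temporary directory created with mktemp so that the result does not depend on where it is run; the file names a.txt and b.txt are made up for the example.

```shell
#!/bin/bash
# Work in a throwaway directory containing two known files.
oldpwd=$PWD
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1
touch a.txt b.txt

pattern='*'

# Unquoted: the * is treated as a pattern and replaced by matching filenames.
set -- $pattern
expanded="$*"

# With set -f (noglob), the same unquoted expansion keeps the literal *.
set -f
set -- $pattern
literal="$*"
set +f

printf 'expanded=%s\nliteral=%s\n' "$expanded" "$literal"

# Clean up and return to the original directory.
cd "$oldpwd" || exit 1
rm -rf "$tmpdir"
```

The first line of output lists a.txt b.txt, while the second shows a plain *.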
So Bash treated the * character in in.txt as a pattern and replaced it with the names of the (non-hidden) files in the current directory. What if we put $(cat in.txt) in double quotes?
$ cat script.sh
#!/bin/bash
for line in "$(cat in.txt)"; do
    echo "line:$line"
done
$ ./script.sh
line: # comment
*
A little better: the * is no longer expanded, but the loop now runs only once, with the entire file content as a single item. Neither of these approaches is usable as is. So, let’s remove the double quotes and add the set -f option that the Bash man page mentions.
$ cat script.sh
#!/bin/bash
set -f
for line in $(cat in.txt); do
    echo "line:$line"
done
$ ./script.sh
line:#
line:comment
line:*
Better, but the # comment line is still split into two items because of the space between # and comment: set -f disables filename expansion, not word splitting.
Solution 1
$ cat script.sh
#!/bin/bash
while read line; do
    echo "line:$line"
done < in.txt
$ ./script.sh
line:# comment
line:*
Nice! The wildcard * is no longer treated as a pattern, and each line stays in one piece, but the leading space has been removed. This happens because read, by default, strips leading and trailing whitespace (the whitespace characters in IFS) from each line. Fortunately, we can prevent this by setting IFS to an empty value for the read command, as in the example below.
Final Solution
$ cat script.sh
#!/bin/bash
# IFS= keeps `read` from trimming leading and trailing whitespace from each line
# -r keeps `read` from interpreting backslashes as escape characters
# printf is safer than echo when $line is a string like -n, which echo would treat as an option
while IFS= read -r line; do
    printf 'line:%s\n' "$line"
done < in.txt
$ ./script.sh
line: # comment
line:*
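One edge case is worth a hedged sketch: if the last line of the file does not end with a newline, read still stores it but returns a non-zero status, so a plain while loop would skip it. A common idiom is to add a check that the variable is non-empty. The temporary file below is a hypothetical stand-in for in.txt, written without a final newline.

```shell
#!/bin/bash
# Hypothetical sample file whose last line has no trailing newline.
sample=$(mktemp) || exit 1
printf ' # comment\n*' > "$sample"

count=0
last=
# read returns non-zero at end of file even if it stored a partial line,
# so the extra test keeps the loop running for that final line.
while IFS= read -r line || [ -n "$line" ]; do
    count=$((count + 1))
    last=$line
    printf 'line:%s\n' "$line"
done < "$sample"

rm -f "$sample"
```

With the extra test, both lines are processed; without it, the final * would be dropped for this particular file.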
Note
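The read command does more than read a line! It reads one line, splits it into words, and assigns the first word to the first variable, the second word to the second variable, and so on. If there are more words than variables, the remaining words and the separators between them are assigned to the last variable. With a single variable and an empty IFS, the whole line is stored unchanged.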
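A short sketch of this behavior, using here-documents as input and illustrative variable names (first, second, rest, whole):

```shell
#!/bin/bash
# Three variables, four words: the last variable receives the remainder.
read -r first second rest <<'EOF'
alpha beta gamma delta
EOF
printf 'first=%s second=%s rest=%s\n' "$first" "$second" "$rest"

# One variable with an empty IFS: the line is stored as is,
# including its leading whitespace.
IFS= read -r whole <<'EOF'
   padded
EOF
printf '[%s]\n' "$whole"
```

The first printf shows first=alpha second=beta rest=gamma delta, and the second shows the padded line with its three leading spaces preserved.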