Understanding IFS
Question (66 votes)
The following few threads on this site and StackOverflow were helpful for understanding how IFS works:
- What is IFS in context of for looping?
- How to loop over the lines of a file
- Bash, read line by line from file, with IFS
But I still have some short questions. I decided to ask them in the same post since I think it may better help future readers:
Q1: IFS is typically discussed in the context of "field splitting". Is field splitting the same as word splitting?
Q2: The POSIX specification says:
If the value of IFS is null, no field splitting shall be performed.
Is setting IFS= the same as setting IFS to null? Is this what is meant by setting it to an empty string too?
Q3: In the POSIX specification, I read the following:
If IFS is not set, the shell shall behave as if the value of IFS is
<space>, <tab> and <newline>
Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)
Q4: Finally, how would this code:
while IFS= read -r line
do
echo $line
done < /path_to_text_file
behave if we change the first line to
while read -r line # Use the default IFS value
or to:
while IFS=' ' read -r line
shell
asked Dec 13 '11 at 23:43 by Amelio Vazquez-Reina, edited May 23 '17 at 11:33 by Community
5 Answers
Answer (26 votes, accepted)
- Yes, they are the same.
- Yes.
- In bash, and similar shells, you could do something like IFS=$' \t\n'. Otherwise, you could insert the literal control codes by using [space] CTRL+V [tab] CTRL+V [enter]. If you are planning to do this, however, it's better to use another variable to temporarily store the old IFS value, and then restore it afterwards (or temporarily override it for one command by using the var=foo command syntax).
- For the three variants of the loop:
  - The first code snippet will put the entire line read, verbatim, into $line, as there are no field separators to perform word splitting for. Bear in mind, however, that since many shells use C strings to store strings, the first instance of a NUL may still cause the appearance of it being prematurely terminated.
  - The second code snippet may not put an exact copy of the input into $line. For example, if there are multiple consecutive field separators, they will be made into a single instance of the first element. This is often recognised as loss of surrounding whitespace.
  - The third code snippet will do the same as the second, except it will only split on a space (not the usual space, tab, or newline).
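The var=foo command form mentioned above can be sketched as follows. This is only an illustration: the record string and the variable names user, uid and home are made up for the example, and it assumes read is an ordinary (non-special) built-in, as in POSIX shells, so the prefix assignment does not persist.

```shell
# Hypothetical sample data: a colon-separated record.
record='alice:1000:/home/alice'

# Snapshot of IFS before the command ("<unset>" is just a marker for the unset case).
before=${IFS-<unset>}

# The assignment prefix changes IFS only for this one read command.
IFS=: read -r user uid home <<EOF
$record
EOF

printf 'user=%s uid=%s home=%s\n' "$user" "$uid" "$home"

# The shell's own IFS is the same as before the read.
after=${IFS-<unset>}
if [ "$before" = "$after" ]; then
    echo 'IFS unchanged'
fi
```

Because the override is scoped to one command, there is nothing to restore afterwards.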
The answer to Q2 is wrong: an empty IFS and an unset IFS are very different. The answer to Q4 is partly wrong: inner separators are not touched here, only leading and trailing ones.
– Gilles Dec 14 '11 at 12:06
@Gilles: In Q2 none of the three given denominations refers to an unset IFS, all of them mean IFS=.
– Stéphane Gimenez Dec 14 '11 at 13:00
@Gilles In Q2, I never said they were the same. And inner separators are touched, as shown here: IFS=' ' ; foo=( bar baz qux ) ; echo "${#foo[@]}". (Er, what? There should be multiple space delimiters in there, SO engine keeps on stripping them).
– Chris Down Dec 14 '11 at 16:05
@StéphaneGimenez, Chris: Oh, right, sorry about Q2, I misread the question. For Q4, we're talking about read; the last variable grabs everything that's left except for the last separator and leaves inner separators inside.
– Gilles Dec 15 '11 at 15:25
Gilles is partially correct about the spaces not being removed by read. Read my answer for details.
– user79743 Aug 6 '15 at 23:28
Answer (21 votes)
Q1: Yes. “Field splitting” and “word splitting” are two terms for the same concept.
Q2: Yes. If IFS is unset (i.e. after unset IFS), it is equivalent to IFS being set to $' \t\n' (a space, a tab and a newline). If IFS is set to an empty value (that's what “null” means here) (i.e. after IFS= or IFS='' or IFS=""), no field splitting is performed at all (and $*, which normally uses the first character of $IFS, uses a space character).
Q3: If you want to have the default IFS behavior, you can use unset IFS. If you want to set IFS explicitly to this default value, you can put the literal characters space, tab, newline in single quotes. In ksh93, bash or zsh, you can use IFS=$' \t\n'. Portably, if you want to avoid having a literal tab character in your source file, you can use
IFS=" $(echo t | tr t \\t)
"
Q4: With IFS set to an empty value, read -r line sets line to the whole line except its terminating newline. With IFS=" ", spaces at the beginning and at the end of the line are trimmed. With the default value of IFS, tabs and spaces are trimmed.
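The three cases in Q4 can be checked with a small sketch like the one below. The sample string and the variable names (sample, raw, spaces_only, trimmed) are made up for illustration; the default IFS value is built with printf so that no literal tab is needed in the source.

```shell
# Hypothetical input: space, tab, space, text with inner spaces, space, tab, space.
# The trailing x protects the final space from command-substitution stripping.
sample=$(printf ' \t padded  text \t x'); sample=${sample%x}

# Default IFS value (space, tab, newline), built the same way.
default_ifs=$(printf ' \t\nx'); default_ifs=${default_ifs%x}

# Empty IFS: the line is stored verbatim (minus its newline).
IFS= read -r raw <<EOF
$sample
EOF

# IFS is a single space: only leading and trailing spaces are trimmed, tabs stay.
IFS=' ' read -r spaces_only <<EOF
$sample
EOF

# Default IFS: leading and trailing spaces and tabs are trimmed.
IFS=$default_ifs read -r trimmed <<EOF
$sample
EOF

printf '[%s]\n' "$raw" "$spaces_only" "$trimmed"
```

Inner whitespace between "padded" and "text" survives in all three cases because only one variable is given to read.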
Q2 is partly wrong. If IFS is empty, "$*" is joined without separators. (For $@, there are some variations between shells in non-list contexts like IFS=; var=$@.) It should be noted that when IFS is empty, no word splitting is performed but $var still expands to no argument instead of an empty argument when $var is empty, and globbing still applies, so you still need to quote variables (even if you disable globbing).
– Stéphane Chazelas Feb 8 '13 at 22:19
Answer (12 votes)
Q1. Field splitting.
Is field splitting the same as word splitting ?
Yes, both point to the same idea.
Q2: When is IFS null?
Is setting IFS='' the same as null, the same as an empty string too?
Yes, all three mean the same: No field/word splitting shall be performed.
Also, this affects printing fields (as with echo "$*"): all fields will be concatenated together with no space.
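A short sketch of how "$*" joins the positional parameters under different IFS settings. The helper names join_with and join_unset are made up for this example; each call runs inside a command substitution so the IFS change does not leak into the calling shell.

```shell
# join_with SEP ARGS...: set IFS to SEP, then join ARGS with "$*".
join_with() {
    IFS=$1
    shift
    printf '%s\n' "$*"
}

# join_unset ARGS...: same, but with IFS unset.
join_unset() {
    unset IFS
    printf '%s\n' "$*"
}

joined_colon=$(join_with : a b c)   # first character of IFS is used as separator
joined_empty=$(join_with '' a b c)  # empty IFS: no separator at all
joined_unset=$(join_unset a b c)    # unset IFS: joined with a space

printf '%s\n' "$joined_colon" "$joined_empty" "$joined_unset"
```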
Q3: (part a) Unset IFS.
In the POSIX specification, I read the following:
If IFS is not set, the shell shall behave as if the value of IFS is <space><tab><newline>.
Which is exactly equivalent to:
With an unset IFS, the shell shall behave as if IFS is default.
That means that the 'Field splitting' will be exactly the same with a default IFS value, or unset.
That does NOT mean that IFS will work the same way in all conditions.
Being more specific, executing OldIFS=$IFS will set the var OldIFS to null, not the default. And trying to set IFS back with IFS=$OldIFS will set IFS to null, not keep it unset as it was before. Watch out!
Q3: (part b) Restore IFS.
How could I restore the value of IFS to default.
Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)
For zsh, ksh, and bash (AFAIK), IFS could be set to the default value as:
IFS=$' \t\n' # works with zsh, ksh, bash.
Done, you need to read nothing else.
But if you need to re-set IFS for sh, it may become complex.
Let's take a look from easiest to complete with no drawbacks (except complexity).
1.- Unset IFS.
We could just unset IFS (Read Q3 part a, above.).
2.- Swap chars.
As a workaround, swapping the value of tab and newline makes it simpler to set the value of IFS, and then it works in an equivalent way.
Set IFS to <space><newline><tab>:
sh -c 'IFS=$(echo " \n\t"); printf "%s" "$IFS"|xxd' # Works.
3.- A simple? solution:
If there are child scripts that need IFS correctly set, you could always manually write:
IFS='
'
Where the sequence manually typed was: IFS='spacetabnewline', a sequence which has actually been correctly typed above (if you need to confirm, edit this answer). But a copy/paste from your browser will break because the browser will squeeze/hide the whitespace. It makes it difficult to share the code as written above.
4.- Complete solution.
To write code that can be safely copied usually involves unambiguous printable escapes.
We need some code that "produces" the expected value. But, even if conceptually correct, this code will NOT set a trailing \n:
sh -c 'IFS=$(echo " \t\n"); printf "%s" "$IFS"|xxd' # wrong.
That happens because, under most shells, all trailing newlines of $(...) or `...` command substitutions are removed on expansion.
We need to use a trick for sh:
sh -c 'IFS="$(printf " \t\nx")"; IFS="${IFS%x}"; printf "%s" "$IFS"|xxd' # Correct.
An alternative way may be to set IFS as an environment value from bash (for example) and then call sh (the versions of it that accept IFS to be set via the environment), like this:
env IFS=$' \t\n' sh -c 'printf "%s" "$IFS"|xxd'
In short, sh makes resetting IFS to default quite an odd adventure.
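The stripping of trailing newlines by command substitution, and the trailing-x workaround, can be seen by comparing string lengths. This is a minimal sketch; the variable names stripped and kept are made up for the example.

```shell
# Command substitution drops the trailing newline, leaving only space and tab.
stripped=$(printf ' \t\n')

# A sentinel x after the newline protects it; the x is removed afterwards.
kept=$(printf ' \t\nx')
kept=${kept%x}

printf 'without sentinel: %s characters\n' "${#stripped}"
printf 'with sentinel:    %s characters\n' "${#kept}"
```

Only the second value has all three characters and is suitable as a default IFS.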
Q4: In actual code:
Finally, how would this code:
while IFS= read -r line
do
echo $line
done < /path_to_text_file
behave if we we change the first line to
while read -r line # Use the default IFS value
or to:
while IFS=' ' read -r line
First: I do not know if the echo $line (with the var NOT quoted) is there on purpose or not.
It introduces a second level of 'field splitting' that read does not have.
So I'll answer both. :)
You can confirm with this code. You'll need the useful xxd:
#!/bin/ksh
# Correctly set IFS as described above.
defIFS="$(printf " \t\nx")"; defIFS="${defIFS%x}";
IFS="$defIFS"
printf "IFS value: "
printf "%s" "$IFS"| xxd -p
a='   bar   baz   quz   '; l="${#a}"
printf "var value : %${l}s-" "$a" ; printf "%s\n" "$a" | xxd -p
printf "%s\n" "$a" | while IFS='x' read -r line; do
printf "IFS --x-- : %${l}s-" "$line" ;
printf "%s" "$line" |xxd -p; done;
printf '%s\n' "Values quoted :" # With values quoted:
printf "%s\n" "$a" | while IFS='' read -r line; do
printf "IFS null quoted : %${l}s-" "$line" ;
printf "%s" "$line" |xxd -p; done;
printf "%s\n" "$a" | while IFS="$defIFS" read -r line; do
printf "IFS default quoted : %${l}s-" "$line" ;
printf "%s" "$line" |xxd -p; done;
unset IFS; printf "%s\n" "$a" | while read -r line; do
printf "IFS unset quoted : %${l}s-" "$line" ;
printf "%s" "$line" |xxd -p; done;
IFS="$defIFS" # set IFS back to default.
printf "%s\n" "$a" | while IFS=' ' read -r line; do
printf "IFS space quoted : %${l}s-" "$line" ;
printf "%s" "$line" |xxd -p; done;
printf '%s\n' "Values unquoted :" # Now with values unquoted:
printf "%s\n" "$a" | while IFS='x' read -r line; do
printf "IFS --x-- unquoted : "
printf "%s, " $line; printf "%s," $line |xxd -p; done
printf "%s\n" "$a" | while IFS='' read -r line; do
printf "IFS null unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
printf "%s\n" "$a" | while IFS="$defIFS" read -r line; do
printf "IFS defau unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
unset IFS; printf "%s\n" "$a" | while read -r line; do
printf "IFS unset unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
IFS="$defIFS" # set IFS back to default.
printf "%s\n" "$a" | while IFS=' ' read -r line; do
printf "IFS space unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
I get:
$ ./stackexchange-Understanding-IFS.sh
IFS value: 20090a
var value :    bar   baz   quz   -20202062617220202062617a20202071757a2020200a
IFS --x-- :    bar   baz   quz   -20202062617220202062617a20202071757a202020
Values quoted :
IFS null quoted :    bar   baz   quz   -20202062617220202062617a20202071757a202020
IFS default quoted :    bar   baz   quz-62617220202062617a20202071757a
IFS unset quoted :    bar   baz   quz-62617220202062617a20202071757a
IFS space quoted :    bar   baz   quz-62617220202062617a20202071757a
Values unquoted :
IFS --x-- unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS null unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS defau unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS unset unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS space unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
The first value is just the correct value of IFS='spacetabnewline'.
The next line shows all the hex values that the var $a has, with a newline '0a' at the end, as it is going to be given to each read command.
The next two lines (IFS set to x, which does not occur in the input, and IFS null) do not perform any 'field splitting', but the newline is removed (as expected).
The next three lines, as IFS contains a space, remove the leading and trailing spaces and set the var line to what remains.
The last five lines show what an unquoted variable will do. The values will be split on the (several) spaces and will be printed as: bar,baz,quz,
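The difference between the quoted and unquoted expansion can also be seen by counting the resulting arguments. A minimal sketch, assuming the default field splitting (IFS is unset at the top to guarantee that); count_args is a made-up helper name.

```shell
# Make sure the default field splitting applies for this demonstration.
unset IFS

# Print how many arguments were received.
count_args() {
    echo "$#"
}

line='   bar   baz   quz   '

unquoted=$(count_args $line)    # split on runs of spaces into separate words
quoted=$(count_args "$line")    # kept as a single argument, spaces intact

printf 'unquoted: %s argument(s), quoted: %s argument(s)\n' "$unquoted" "$quoted"
```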
Answer (4 votes)
unset IFS does clear IFS, even if IFS is thereafter presumed to be " \t\n":
$ echo "'$IFS'"
'
'
$ IFS=""
$ echo "'$IFS'"
''
$ unset IFS
$ echo "'$IFS'"
''
$ IFS=$' \t\n'
$ echo "'$IFS'"
'
'
$
Tested on bash versions 4.2.45 and 3.2.25 with the same behavior.
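In the session above, an empty IFS and an unset IFS print the same thing. One way to tell them apart is the ${IFS+set} expansion, sketched here; ifs_state is a made-up helper name, and od is used to print the bytes when IFS has a non-empty value.

```shell
# Report whether IFS is unset, empty, or set (as hex bytes).
ifs_state() {
    if [ -z "${IFS+set}" ]; then
        echo unset
    elif [ -z "$IFS" ]; then
        echo empty
    else
        # Dump the value as hex and squeeze out od's spacing.
        printf '%s' "$IFS" | od -An -tx1 | tr -d ' \n'
        echo
    fi
}

unset IFS
state_unset=$(ifs_state)

IFS=
state_empty=$(ifs_state)

IFS=$(printf ' \t\nx'); IFS=${IFS%x}
state_default=$(ifs_state)

printf '%s\n' "$state_unset" "$state_empty" "$state_default"

# Leave the shell with default splitting behavior.
unset IFS
```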
Answer (0 votes)
Q1 Splitting
Q1. Is field splitting the same as word splitting?
Probably, but with a caveat.
A parameter expansion as called in POSIX, ksh, bash or zsh (and others) is subject to "Field splitting" as called in POSIX (a.k.a.: "Field splitting" in ksh, "Word Splitting" in bash, and sometimes field and sometimes word in zsh).
I would define it as:
The process of splitting a parameter that is done using the IFS characters.
Where "parameter" means "a variable value (contents)" and using IFS characters might be different in zsh. There is a s:string: flag that does "field splitting" on string in zsh.
caveat
However, there is a process of splitting called "Token Recognition", as defined by POSIX, that splits command lines into words (tokens) mostly using blanks (tabs and spaces) and some other rules. That tokens are subsequently (immediately) called "words" is shown in the alias description (for example):
After a token has been delimited, … , a resulting word …
As explained in ksh manual page:
Command Syntax
The shell begins parsing its input by breaking it into words. Words, which are sequences of characters, are delimited by unquoted white space characters (space, tab and newline) or meta-characters (<, >, |, ;, &, ( and )).
Also explicitly defined in bash man as this:
word A sequence of characters considered as a single unit by the shell. Also known as a token.
Or this:
word A sequence of characters treated as a unit by the shell. Words may not include unquoted metacharacters.
That is a "word splitting" in layman terms.
Q2 IFS null
Q2: Is setting IFS= the same as setting IFS to null? Is this what is meant by setting it to an empty string too?
An unset variable doesn't exist. A set variable exists but may be empty. If this value of empty is called "null" (as opposed to "NUL" or '0x00' or '\0'), then yes, all three are equivalent.
The variable is set but empty. var= ≡ var='' ≡ var="".
Q3 unset IFS
Q3: In the POSIX specification, I read the following:
If IFS is not set, the shell shall behave as if the value of IFS is <space>, <tab> and <newline>
Yes, the shell shall behave that way, in the sense that the effects IFS has should still be the same after an unset IFS was executed, mainly for "word splitting" and read commands.
That is not exactly equal to believe that an unset variable acts the same as a set variable. In specific, if you have:
$ unset a
$ b=$a
The variable a is unset, it doesn't exist yet; however, b is set to null as described in the previous question. And this will also be true:
$ echo "\"${a-a is UN-set}\" \"${b-b is UN-set}\""
"a is UN-set" ""
That is important in the case where this is done:
$ unset IFS
$ oldIFS=$IFS
The variable oldIFS is now set (but IFS is unset). Trying to restore IFS by doing:
$ IFS=$oldIFS
Will end with an IFS set to null, not unset. Its effects will be different.
The only solution is to ensure that oldIFS is also set (or unset) as IFS:
$ [ "$(set | grep '^IFS=')" ] && oldIFS=$IFS || unset oldIFS;
If IFS is not unset, set oldIFS to its value, otherwise unset it.
Restore by the same procedure (swap vars):
$ [ "$(set | grep '^oldIFS=')" ] && IFS=$oldIFS || unset IFS;
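The same save-and-restore idea can be sketched without grep by using the ${IFS+set} expansion, which expands to "set" only when IFS is set (even if empty). The function and variable names below (save_ifs, restore_ifs, saved_ifs, saved_ifs_state) are made up for this illustration and are not part of the answer above.

```shell
# Record both the value of IFS and whether it was set at all.
save_ifs() {
    if [ "${IFS+set}" = set ]; then
        saved_ifs_state=set
        saved_ifs=$IFS
    else
        saved_ifs_state=unset
    fi
}

# Put IFS back exactly as it was: same value, or unset again.
restore_ifs() {
    if [ "$saved_ifs_state" = set ]; then
        IFS=$saved_ifs
    else
        unset IFS
    fi
}

# Example use: split a comma-separated string, then restore.
save_ifs
IFS=,
csv='a,b,c'
set -- $csv
field_count=$#
restore_ifs

printf 'fields: %s\n' "$field_count"
```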
Q3 reset IFS
Q3 Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)
The only real problem is the newline at the end. The old, simple way to get it is:
nl='
'
Yes, a real newline. For a full IFS of <space><tab><newline>:
IFS=" $(printf '\t')$nl"
eval "$(printf "s=' \t\n'")"
IFS=$' \t\n'
Q4 IFS on read
Q4: Finally, how would this code:
while IFS= read -r line ...
Will read one line (up to a newline character) and assign it (without the trailing newline) to the var line. No word splitting nor white space removal (leading or trailing white space) will be executed.
while read -r line # Use the default IFS value
With the default IFS ( \t\n) the first effect is that leading and trailing white space of the whole line will be trimmed. Then, each (group of consecutive) delimiter(s) will be used to divide the line for each variable. That is: two variables need one (not leading or trailing) delimiter. Each additional variable requires an additional delimiter (or group of delimiters).
while IFS=' ' read -r line
Leading and trailing (runs of) spaces will be removed, each (run of) spaces will be used to split the line at as many places as the variables require.
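How the line is divided when read gets more than one variable can be sketched like this. The input string and the variable names first and rest are made up for the example.

```shell
# Hypothetical input with runs of spaces at the start, in the middle and at the end.
input='  one   two   three  '

# Two variables: one run of spaces is consumed as the delimiter between them;
# the last variable receives the remainder with its inner spaces intact.
IFS=' ' read -r first rest <<EOF
$input
EOF

printf 'first=[%s] rest=[%s]\n' "$first" "$rest"
```

Only as many delimiters as the variables require are consumed; the run of spaces between "two" and "three" is left inside the last variable.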
Accepted answer (26 votes): answered Dec 14 '11 at 0:25 by Chris Down
3
The answer to Q2 is wrong: an emptyIFS
and an unsetIFS
are very different. The answer to Q4 is partly wrong: inner separators are not touched here, only leading and trailing ones.
â Gilles
Dec 14 '11 at 12:06
3
@Gilles: In Q2 none of the three given denominations refers to an unsetIFS
, all of them meanIFS=
.
â Stéphane Gimenez
Dec 14 '11 at 13:00
@Gilles In Q2, I never said they were the same. And inner separators are touched, as shown here:IFS=' ' ; foo=( bar baz qux ) ; echo "$#foo[@]"
. (Er, what? There should be multiple space delimiters in there, SO engine keeps on stripping them).
â Chris Down
Dec 14 '11 at 16:05
2
@StéphaneGimenez, Chris: Oh, right, sorry about Q2, I misread the question. For Q4, we're talking aboutread
; the last variable grabs everything that's left except for the last separator and leaves inner separators inside.
â Gilles
Dec 15 '11 at 15:25
1
Gilles is partially correct about the spaces not being removed by read. Read my answer for details.
â user79743
Aug 6 '15 at 23:28
add a comment |Â
3
The answer to Q2 is wrong: an emptyIFS
and an unsetIFS
are very different. The answer to Q4 is partly wrong: inner separators are not touched here, only leading and trailing ones.
â Gilles
Dec 14 '11 at 12:06
3
@Gilles: In Q2 none of the three given denominations refers to an unsetIFS
, all of them meanIFS=
.
â Stéphane Gimenez
Dec 14 '11 at 13:00
@Gilles In Q2, I never said they were the same. And inner separators are touched, as shown here:IFS=' ' ; foo=( bar baz qux ) ; echo "$#foo[@]"
. (Er, what? There should be multiple space delimiters in there, SO engine keeps on stripping them).
â Chris Down
Dec 14 '11 at 16:05
2
@StéphaneGimenez, Chris: Oh, right, sorry about Q2, I misread the question. For Q4, we're talking aboutread
; the last variable grabs everything that's left except for the last separator and leaves inner separators inside.
â Gilles
Dec 15 '11 at 15:25
1
Gilles is partially correct about the spaces not being removed by read. Read my answer for details.
â user79743
Aug 6 '15 at 23:28
3
3
The answer to Q2 is wrong: an empty
IFS
and an unset IFS
are very different. The answer to Q4 is partly wrong: inner separators are not touched here, only leading and trailing ones.â Gilles
Dec 14 '11 at 12:06
The answer to Q2 is wrong: an empty
IFS
and an unset IFS
are very different. The answer to Q4 is partly wrong: inner separators are not touched here, only leading and trailing ones.â Gilles
Dec 14 '11 at 12:06
3
3
@Gilles: In Q2 none of the three given denominations refers to an unset
IFS
, all of them mean IFS=
.â Stéphane Gimenez
Dec 14 '11 at 13:00
@Gilles: In Q2 none of the three given denominations refers to an unset
IFS
, all of them mean IFS=
.â Stéphane Gimenez
Dec 14 '11 at 13:00
@Gilles In Q2, I never said they were the same. And inner separators are touched, as shown here:
IFS=' ' ; foo=( bar baz qux ) ; echo "$#foo[@]"
. (Er, what? There should be multiple space delimiters in there, SO engine keeps on stripping them).â Chris Down
Dec 14 '11 at 16:05
@Gilles In Q2, I never said they were the same. And inner separators are touched, as shown here:
IFS=' ' ; foo=( bar baz qux ) ; echo "$#foo[@]"
. (Er, what? There should be multiple space delimiters in there, SO engine keeps on stripping them).â Chris Down
Dec 14 '11 at 16:05
2
2
@StéphaneGimenez, Chris: Oh, right, sorry about Q2, I misread the question. For Q4, we're talking about
read
; the last variable grabs everything that's left except for the last separator and leaves inner separators inside.â Gilles
Dec 15 '11 at 15:25
@StéphaneGimenez, Chris: Oh, right, sorry about Q2, I misread the question. For Q4, we're talking about
read
; the last variable grabs everything that's left except for the last separator and leaves inner separators inside.â Gilles
Dec 15 '11 at 15:25
1
1
Gilles is partially correct about the spaces not being removed by read. Read my answer for details.
â user79743
Aug 6 '15 at 23:28
Gilles is partially correct about the spaces not being removed by read. Read my answer for details.
â user79743
Aug 6 '15 at 23:28
add a comment |Â
up vote
21
down vote
Q1: Yes. âÂÂField splittingâ and âÂÂword splittingâ are two terms for the same concept.
Q2: Yes. If IFS is unset (i.e. after unset IFS), it is equivalent to IFS being set to $' \t\n' (a space, a tab and a newline). If IFS is set to an empty value (that's what “null” means here) (i.e. after IFS= or IFS='' or IFS=""), no field splitting is performed at all (and $*, which normally uses the first character of $IFS, uses a space character).
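The unset versus empty distinction can be checked with a small sketch like the one below. It assumes a POSIX-style sh (zsh, for instance, does not split unquoted variables by default), and count_fields is a hypothetical helper introduced only for this illustration.

```shell
# Hypothetical helper: prints the number of arguments it receives.
count_fields() { echo "$#"; }

var=$(printf 'a b\tc')          # "a", space, "b", tab, "c"

unset IFS                       # behaves like space, tab, newline
n_unset=$(count_fields $var)    # unquoted on purpose: split on space and tab

IFS=                            # empty ("null") value: no field splitting
n_empty=$(count_fields $var)    # the whole value stays one field

unset IFS                       # back to the default behavior
echo "unset: $n_unset, empty: $n_empty"
```

With an unset IFS the value is split into three fields; with an empty IFS it is passed as a single field.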
Q3: If you want to have the default IFS behavior, you can use unset IFS. If you want to set IFS explicitly to this default value, you can put the literal characters space, tab, newline in single quotes. In ksh93, bash or zsh, you can use IFS=$' \t\n'. Portably, if you want to avoid having a literal tab character in your source file, you can use
IFS=" $(echo t | tr t \\t)
"
Q4: With IFS set to an empty value, read -r line sets line to the whole line except its terminating newline. With IFS=" ", spaces at the beginning and at the end of the line are trimmed. With the default value of IFS, leading and trailing tabs and spaces are trimmed.
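A short sketch of these three cases, assuming a POSIX-style sh; the variable names (input, raw, space_only, default_trim) are made up for the illustration. Here-documents are used instead of a pipe so that the variables set by read survive in the current shell.

```shell
input=$(printf '  \tfoo  bar\t  ')   # two spaces, a tab, text, a tab, two spaces

unset IFS                            # make sure the default applies

# Empty IFS: nothing is trimmed.
IFS= read -r raw <<EOF
$input
EOF

# IFS=' ': only leading and trailing spaces are trimmed, the tabs stay.
IFS=' ' read -r space_only <<EOF
$input
EOF

# Default IFS: leading and trailing spaces and tabs are trimmed.
read -r default_trim <<EOF
$input
EOF

printf '[%s]\n' "$raw" "$space_only" "$default_trim"
```

In all three cases the two spaces between foo and bar are kept, because a single variable receives the rest of the line with its inner separators intact.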
edited Dec 15 '11 at 15:26
answered Dec 14 '11 at 12:01
Gilles
Q2 is partly wrong. If IFS is empty, "$*" is joined without separators. (For $@, there are some variations between shells in non-list contexts like IFS=; var=$@). It should be noted that when IFS is empty, no word splitting is performed but $var still expands to no argument instead of an empty argument when $var is empty, and globbing still applies, so you still need to quote variables (even if you disable globbing). – Stéphane Chazelas
Feb 8 '13 at 22:19
Q1. Field splitting.
Is field splitting the same as word splitting?
Yes, both point to the same idea.
Q2: When is IFS null?
Is setting IFS='' the same as null, the same as an empty string too?
Yes, all three mean the same: no field/word splitting shall be performed.
Also, this affects printing fields (as with echo "$*"): all fields will be concatenated together with no space.
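The effect on "$*" can be seen with a sketch like this one, assuming a POSIX-style sh; the variable names are only for illustration.

```shell
set -- a b c            # three positional parameters

IFS=                    # empty IFS: "$*" joins with nothing in between
joined_empty="$*"

IFS=:                   # "$*" joins with the first character of IFS
joined_colon="$*"

unset IFS               # unset IFS: "$*" joins with a space
joined_unset="$*"

printf '%s\n' "$joined_empty" "$joined_colon" "$joined_unset"
```

This prints abc, then a:b:c, then a b c.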
Q3: (part a) Unset IFS.
In the POSIX specification, I read the following:
If IFS is not set, the shell shall behave as if the value of IFS is <space><tab><newline>.
Which is exactly equivalent to:
With an unset IFS, the shell shall behave as if IFS is default.
That means that field splitting will be exactly the same with a default IFS value, or unset.
That does NOT mean that IFS will work the same way in all conditions.
More specifically, executing OldIFS=$IFS while IFS is unset will set the var OldIFS to null, not to the default. And trying to set IFS back with IFS=$OldIFS will set IFS to null, not keep it unset as it was before. Watch out!
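One way around that pitfall is to record whether IFS was set at all, not only its value. The following is a minimal sketch under that assumption; saved_IFS and IFS_was_set are hypothetical names, and the ${IFS+set} expansion yields "set" only when the variable exists.

```shell
unset IFS                          # start from the unset state for the demo

# Save: remember whether IFS was set at all, not only its value.
if [ "${IFS+set}" = set ]; then
    saved_IFS=$IFS
    IFS_was_set=yes
else
    IFS_was_set=no
fi

IFS=:                              # temporary value for some splitting work
path_like='a:b:c'
set -- $path_like                  # unquoted on purpose: split on ":"
n_fields=$#

# Restore: unset again if it was unset before, otherwise put the value back.
if [ "$IFS_was_set" = yes ]; then
    IFS=$saved_IFS
else
    unset IFS
fi

state=${IFS+set}                   # empty string means "unset"
echo "fields: $n_fields, IFS is ${state:-unset}"
```

After the restore step IFS is unset again, instead of being silently turned into an empty value.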
Q3: (part b) Restore IFS.
How could I restore the value of IFS to default?
Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)
For zsh, ksh, and bash (AFAIK), IFS could be set to the default value as:
IFS=$' \t\n' # works with zsh, ksh, bash.
Done, you need to read nothing else.
But if you need to re-set IFS for sh, it may become complex.
Let's take a look, from the easiest to the complete solution with no drawbacks (except complexity).
1.- Unset IFS.
We could just unset IFS (read Q3 part a, above).
2.- Swap chars.
As a workaround, swapping the order of tab and newline makes it simpler to set the value of IFS, and it then works in an equivalent way.
Set IFS to <space><newline><tab> (this relies on an sh whose echo expands backslash escapes, as dash does):
sh -c 'IFS=$(echo " \n\t"); printf "%s" "$IFS"|xxd' # Works.
3.- A simple? solution:
If there are child scripts that need IFS correctly set, you could always manually write:
IFS='
'
Where the sequence manually typed was: IFS='spacetabnewline', a sequence which has actually been correctly typed above (if you need to confirm, edit this answer). But a copy/paste from your browser will break because the browser will squeeze/hide the whitespace. That makes it difficult to share the code as written above.
4.- Complete solution.
Writing code that can be safely copied usually involves unambiguous printable escapes.
We need some code that "produces" the expected value. But, even if conceptually correct, this code will NOT set a trailing \n:
sh -c 'IFS=$(echo " \t\n"); printf "%s" "$IFS"|xxd' # wrong.
That happens because, under most shells, all trailing newlines of $(...) or `...` command substitutions are removed on expansion.
We need to use a trick for sh:
sh -c 'IFS="$(printf " \t\nx")"; IFS="${IFS%x}"; printf "$IFS"|xxd' # Correct.
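The stripping of trailing newlines and the sentinel trick can be compared side by side. This is a minimal sketch, assuming a POSIX-style sh; the names plain, kept and kept_hex are made up, and od is used for the hex dump instead of xxd only because od is specified by POSIX.

```shell
# Plain command substitution: the trailing newline is stripped.
plain=$(printf ' \t\n')

# Sentinel trick: append "x" so the newline is no longer trailing,
# then remove the "x" with a parameter expansion.
kept=$(printf ' \t\nx')
kept=${kept%x}

echo "plain has ${#plain} characters, kept has ${#kept}"

# Hex dump for confirmation (od instead of xxd is an assumption made
# here for portability; xxd usually ships with vim).
kept_hex=$(printf '%s' "$kept" | od -An -tx1 | tr -d ' \n')
echo "$kept_hex"
```

The plain substitution ends up with two characters (space and tab), while the sentinel version keeps all three, shown as 20090a.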
An alternative way may be to set IFS as an environment value from bash (for example) and then call sh (the versions of it that accept IFS to be set via the environment), like this:
env IFS=$' \t\n' sh -c 'printf "%s" "$IFS"|xxd'
In short, sh makes resetting IFS to default quite an odd adventure.
Q4: In actual code:
Finally, how would this code:
while IFS= read -r line
do
echo $line
done < /path_to_text_file
behave if we change the first line to
while read -r line # Use the default IFS value
or to:
while IFS=' ' read -r line
First: I do not know if the echo $line (with the var NOT quoted) is there on purpose or not. It introduces a second level of field splitting that read does not have. So I'll answer both. :)
You can confirm with this code (it needs the useful xxd):
#!/bin/ksh
# Correctly set IFS as described above.
defIFS="$(printf " \t\nx")"; defIFS="${defIFS%x}";
IFS="$defIFS"
printf "IFS value: "
printf "%s" "$IFS"| xxd -p
a='   bar   baz   quz   '; l="${#a}"
printf "var value          : %${l}s-" "$a" ; printf "%s\n" "$a" | xxd -p
printf "%s\n" "$a" | while IFS='x' read -r line; do
    printf "IFS --x--          : %${l}s-" "$line" ;
    printf "%s" "$line" |xxd -p; done;
printf 'Values      quoted :\n' "" # With values quoted:
printf "%s\n" "$a" | while IFS='' read -r line; do
    printf "IFS null    quoted : %${l}s-" "$line" ;
    printf "%s" "$line" |xxd -p; done;
printf "%s\n" "$a" | while IFS="$defIFS" read -r line; do
    printf "IFS default quoted : %${l}s-" "$line" ;
    printf "%s" "$line" |xxd -p; done;
unset IFS; printf "%s\n" "$a" | while read -r line; do
    printf "IFS unset   quoted : %${l}s-" "$line" ;
    printf "%s" "$line" |xxd -p; done;
IFS="$defIFS" # set IFS back to default.
printf "%s\n" "$a" | while IFS=' ' read -r line; do
    printf "IFS space   quoted : %${l}s-" "$line" ;
    printf "%s" "$line" |xxd -p; done;
printf '%s\n' "Values    unquoted :" # Now with values unquoted:
printf "%s\n" "$a" | while IFS='x' read -r line; do
    printf "IFS --x-- unquoted : "
    printf "%s, " $line; printf "%s," $line |xxd -p; done
printf "%s\n" "$a" | while IFS='' read -r line; do
    printf "IFS null  unquoted : ";
    printf "%s, " $line; printf "%s," $line |xxd -p; done
printf "%s\n" "$a" | while IFS="$defIFS" read -r line; do
    printf "IFS defau unquoted : ";
    printf "%s, " $line; printf "%s," $line |xxd -p; done
unset IFS; printf "%s\n" "$a" | while read -r line; do
    printf "IFS unset unquoted : ";
    printf "%s, " $line; printf "%s," $line |xxd -p; done
IFS="$defIFS" # set IFS back to default.
printf "%s\n" "$a" | while IFS=' ' read -r line; do
    printf "IFS space unquoted : ";
    printf "%s, " $line; printf "%s," $line |xxd -p; done
I get:
$ ./stackexchange-Understanding-IFS.sh
IFS value: 20090a
var value          :    bar   baz   quz   -20202062617220202062617a20202071757a2020200a
IFS --x--          :    bar   baz   quz   -20202062617220202062617a20202071757a202020
Values      quoted :
IFS null    quoted :    bar   baz   quz   -20202062617220202062617a20202071757a202020
IFS default quoted :       bar   baz   quz-62617220202062617a20202071757a
IFS unset   quoted :       bar   baz   quz-62617220202062617a20202071757a
IFS space   quoted :       bar   baz   quz-62617220202062617a20202071757a
Values    unquoted :
IFS --x-- unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS null  unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS defau unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS unset unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS space unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
The first value is just the correct value of IFS='spacetabnewline'.
The next line is all the hex values that the var $a has, with a newline '0a' at the end, as it is going to be given to each read command.
The next two lines, for which IFS is 'x' or null, do not perform any field splitting, but the newline is removed (as expected).
The next three lines, as IFS contains a space, remove the leading and trailing spaces and set the var line to the remaining balance.
The last five lines show what an unquoted variable will do. The values will be split on the (several) spaces and will be printed as: bar,baz,quz,
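The second level of splitting caused by the unquoted expansion can also be reproduced without read. A minimal sketch, assuming a POSIX-style sh and the same sample value as in the script above; quoted and unquoted are names made up for the illustration.

```shell
line='   bar   baz   quz   '
unset IFS                       # default splitting on space, tab, newline

quoted=$(printf '%s' "$line")   # one argument, spaces kept
unquoted=$(echo $line)          # three arguments, echo joins them with one space

printf '[%s]\n' "$quoted" "$unquoted"
```

So even when read preserves the whole line, echo $line will still collapse the runs of spaces unless the variable is quoted.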
add a comment |Â
up vote
12
down vote
Q1. Field splitting.
Is field splitting the same as word splitting ?
Yes, both point to the same idea.
Q2: When is IFS null?.
Is setting
IFS=''
the same as null, the same as an empty string too?
Yes, all three mean the same: No field/word splitting shall be performed.
Also, this affects printing fields (as with echo "$*"
) all fields will be concatenated together with no space.
Q3: (part a) Unset IFS.
In the POSIX specification, I read the following:
If IFS is not set, the shell shall behave as if the value of IFS is <space><tab><newline>.
Which is exactly equivalent to:
With an
unset IFS
, the shell shall behave as if IFS is default.
That means that the 'Field splitting' will be exactly the same with a default IFS value, or unset.
That does NOT mean that IFS will work the same way in all conditions.
Being more specific, executing OldIFS=$IFS
will set the var OldIFS
to null, not the default. And trying to set IFS back, as this, IFS=OldIFS
will set IFS to null, not keep it unset as it were before. Watch out !!.
Q3: (part b) Restore IFS.
How could I restore the value of IFS to default.
Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)
For zsh, ksh, and bash (AFAIK), IFS could be set to the default value as:
IFS=$' tn' # works with zsh, ksh, bash.
Done, you need to read nothing else.
But if you need to re-set IFS for sh, it may become complex.
Let's take a look from easiest to complete with no drawbacks (except complexity).
1.- Unset IFS.
We could just unset IFS
(Read Q3 part a, above.).
2.- Swap chars.
As a workaround, swapping the value of tab and newline makes it simpler to set the value of IFS, and then it works in a equivalent way.
Set IFS to <space><newline><tab>:
sh -c 'IFS=$(echo " nt"); printf "%s" "$IFS"|xxd' # Works.
3.- A simple? solution:
If there are child scripts that need IFS correctly set, you could always manually write:
IFS='
'
Where the sequence manually typed was: IFS=
'spacetabnewline', sequence which has actually been correctly typed above (If you need to confirm, edit this answer). But a copy/paste from your browser will break because the browser will squeeze/hide the whitespace. It makes it difficult to share the code as written above.
4.- Complete solution.
To write code that can be safely copied usually involves unambiguous printable escapes.
We need some code that "produces" the expected value. But, even if conceptually correct, this code will NOT set a trailing n
:
sh -c 'IFS=$(echo " tn"); printf "%s" "$IFS"|xxd' # wrong.
That happens because, under most shells, all trailing newlines of $(...)
or `...`
command substitutions are removed on expansion.
We need to use a trick for sh:
sh -c 'IFS="$(printf " tnx")"; IFS="$IFS%x"; printf "$IFS"|xxd' # Correct.
An alternative way may be to set IFS as an environment value from bash (for example) and then call sh (the versions of it that accept IFS to be set via the environment), as this:
env IFS=$' tn' sh -c 'printf "%s" "$IFS"|xxd'
In short, sh makes resetting IFS to default quite an odd adventure.
Q4: In actual code:
Finally, how would this code:
while IFS= read -r line
do
echo $line
done < /path_to_text_file
behave if we we change the first line to
while read -r line # Use the default IFS value
or to:
while IFS=' ' read -r line
First: I do not know if the echo $line
(with the var NOT quoted) is there on porpouse, or not.
It introduces a second level of 'field splitting' that read does not have.
So I'll answer both. :)
With this code (so you could confirm). You'll need the useful xxd:
#!/bin/ksh
# Correctly set IFS as described above.
defIFS="$(printf " tnx")"; defIFS="$defIFS%x";
IFS="$defIFS"
printf "IFS value: "
printf "%s" "$IFS"| xxd -p
a=' bar baz quz '; l="$#a"
printf "var value : %$ls-" "$a" ; printf "%sn" "$a" | xxd -p
printf "%sn" "$a" | while IFS='x' read -r line; do
printf "IFS --x-- : %$ls-" "$line" ;
printf "%s" "$line" |xxd -p; done;
printf 'Values quoted :n' "" # With values quoted:
printf "%sn" "$a" | while IFS='' read -r line; do
printf "IFS null quoted : %$ls-" "$line" ;
printf "%s" "$line" |xxd -p; done;
printf "%sn" "$a" | while IFS="$defIFS" read -r line; do
printf "IFS default quoted : %$ls-" "$line" ;
printf "%s" "$line" |xxd -p; done;
unset IFS; printf "%sn" "$a" | while read -r line; do
printf "IFS unset quoted : %$ls-" "$line" ;
printf "%s" "$line" |xxd -p; done;
IFS="$defIFS" # set IFS back to default.
printf "%sn" "$a" | while IFS=' ' read -r line; do
printf "IFS space quoted : %$ls-" "$line" ;
printf "%s" "$line" |xxd -p; done;
printf '%sn' "Values unquoted :" # Now with values unquoted:
printf "%sn" "$a" | while IFS='x' read -r line; do
printf "IFS --x-- unquoted : "
printf "%s, " $line; printf "%s," $line |xxd -p; done
printf "%sn" "$a" | while IFS='' read -r line; do
printf "IFS null unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
printf "%sn" "$a" | while IFS="$defIFS" read -r line; do
printf "IFS defau unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
unset IFS; printf "%sn" "$a" | while read -r line; do
printf "IFS unset unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
IFS="$defIFS" # set IFS back to default.
printf "%sn" "$a" | while IFS=' ' read -r line; do
printf "IFS space unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
I get:
$ ./stackexchange-Understanding-IFS.sh
IFS value: 20090a
var value : bar baz quz -20202062617220202062617a20202071757a2020200a
IFS --x-- : bar baz quz -20202062617220202062617a20202071757a202020
Values quoted :
IFS null quoted : bar baz quz -20202062617220202062617a20202071757a202020
IFS default quoted : bar baz quz-62617220202062617a20202071757a
IFS unset quoted : bar baz quz-62617220202062617a20202071757a
IFS space quoted : bar baz quz-62617220202062617a20202071757a
Values unquoted :
IFS --x-- unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS null unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS defau unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS unset unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS space unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
The first value is just the correct value of IFS=
'spacetabnewline'
Next line is all the hex values that the var $a
has, and a newline '0a' at the end as it is going to be given to each read command.
The next line, for which IFS is null, does not perform any 'field spliting', but the newline is removed (as expected).
The next three lines, as IFS contains an space, remove the initial spaces and set the var line to the balance remaining.
The last four lines shows what an unquoted variable will do. The values will be split on the (several) spaces and will be printed as: bar,baz,qux,
add a comment |Â
up vote
12
down vote
up vote
12
down vote
Q1. Field splitting.
Is field splitting the same as word splitting ?
Yes, both point to the same idea.
Q2: When is IFS null?.
Is setting
IFS=''
the same as null, the same as an empty string too?
Yes, all three mean the same: No field/word splitting shall be performed.
Also, this affects printing fields (as with echo "$*"
) all fields will be concatenated together with no space.
Q3: (part a) Unset IFS.
In the POSIX specification, I read the following:
If IFS is not set, the shell shall behave as if the value of IFS is <space><tab><newline>.
Which is exactly equivalent to:
With an
unset IFS
, the shell shall behave as if IFS is default.
That means that the 'Field splitting' will be exactly the same with a default IFS value, or unset.
That does NOT mean that IFS will work the same way in all conditions.
Being more specific, executing OldIFS=$IFS
will set the var OldIFS
to null, not the default. And trying to set IFS back, as this, IFS=OldIFS
will set IFS to null, not keep it unset as it were before. Watch out !!.
Q3: (part b) Restore IFS.
How could I restore the value of IFS to default.
Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)
For zsh, ksh, and bash (AFAIK), IFS could be set to the default value as:
IFS=$' tn' # works with zsh, ksh, bash.
Done, you need to read nothing else.
But if you need to re-set IFS for sh, it may become complex.
Let's take a look from easiest to complete with no drawbacks (except complexity).
1.- Unset IFS.
We could just unset IFS
(Read Q3 part a, above.).
2.- Swap chars.
As a workaround, swapping the value of tab and newline makes it simpler to set the value of IFS, and then it works in a equivalent way.
Set IFS to <space><newline><tab>:
sh -c 'IFS=$(echo " nt"); printf "%s" "$IFS"|xxd' # Works.
3.- A simple? solution:
If there are child scripts that need IFS correctly set, you could always manually write:
IFS='
'
Where the sequence manually typed was: IFS=
'spacetabnewline', sequence which has actually been correctly typed above (If you need to confirm, edit this answer). But a copy/paste from your browser will break because the browser will squeeze/hide the whitespace. It makes it difficult to share the code as written above.
4.- Complete solution.
To write code that can be safely copied usually involves unambiguous printable escapes.
We need some code that "produces" the expected value. But, even if conceptually correct, this code will NOT set a trailing n
:
sh -c 'IFS=$(echo " tn"); printf "%s" "$IFS"|xxd' # wrong.
That happens because, under most shells, all trailing newlines of $(...)
or `...`
command substitutions are removed on expansion.
We need to use a trick for sh:
sh -c 'IFS="$(printf " tnx")"; IFS="$IFS%x"; printf "$IFS"|xxd' # Correct.
An alternative way may be to set IFS as an environment value from bash (for example) and then call sh (the versions of it that accept IFS to be set via the environment), as this:
env IFS=$' tn' sh -c 'printf "%s" "$IFS"|xxd'
In short, sh makes resetting IFS to default quite an odd adventure.
Q4: In actual code:
Finally, how would this code:
while IFS= read -r line
do
echo $line
done < /path_to_text_file
behave if we we change the first line to
while read -r line # Use the default IFS value
or to:
while IFS=' ' read -r line
First: I do not know if the echo $line
(with the var NOT quoted) is there on porpouse, or not.
It introduces a second level of 'field splitting' that read does not have.
So I'll answer both. :)
With this code (so you could confirm). You'll need the useful xxd:
#!/bin/ksh
# Correctly set IFS as described above.
defIFS="$(printf " tnx")"; defIFS="$defIFS%x";
IFS="$defIFS"
printf "IFS value: "
printf "%s" "$IFS"| xxd -p
a=' bar baz quz '; l="$#a"
printf "var value : %$ls-" "$a" ; printf "%sn" "$a" | xxd -p
printf "%sn" "$a" | while IFS='x' read -r line; do
printf "IFS --x-- : %$ls-" "$line" ;
printf "%s" "$line" |xxd -p; done;
printf 'Values quoted :n' "" # With values quoted:
printf "%sn" "$a" | while IFS='' read -r line; do
printf "IFS null quoted : %$ls-" "$line" ;
printf "%s" "$line" |xxd -p; done;
printf "%sn" "$a" | while IFS="$defIFS" read -r line; do
printf "IFS default quoted : %$ls-" "$line" ;
printf "%s" "$line" |xxd -p; done;
unset IFS; printf "%sn" "$a" | while read -r line; do
printf "IFS unset quoted : %$ls-" "$line" ;
printf "%s" "$line" |xxd -p; done;
IFS="$defIFS" # set IFS back to default.
printf "%sn" "$a" | while IFS=' ' read -r line; do
printf "IFS space quoted : %$ls-" "$line" ;
printf "%s" "$line" |xxd -p; done;
printf '%sn' "Values unquoted :" # Now with values unquoted:
printf "%sn" "$a" | while IFS='x' read -r line; do
printf "IFS --x-- unquoted : "
printf "%s, " $line; printf "%s," $line |xxd -p; done
printf "%sn" "$a" | while IFS='' read -r line; do
printf "IFS null unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
printf "%sn" "$a" | while IFS="$defIFS" read -r line; do
printf "IFS defau unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
unset IFS; printf "%sn" "$a" | while read -r line; do
printf "IFS unset unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
IFS="$defIFS" # set IFS back to default.
printf "%sn" "$a" | while IFS=' ' read -r line; do
printf "IFS space unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
I get:
$ ./stackexchange-Understanding-IFS.sh
IFS value: 20090a
var value : bar baz quz -20202062617220202062617a20202071757a2020200a
IFS --x-- : bar baz quz -20202062617220202062617a20202071757a202020
Values quoted :
IFS null quoted : bar baz quz -20202062617220202062617a20202071757a202020
IFS default quoted : bar baz quz-62617220202062617a20202071757a
IFS unset quoted : bar baz quz-62617220202062617a20202071757a
IFS space quoted : bar baz quz-62617220202062617a20202071757a
Values unquoted :
IFS --x-- unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS null unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS defau unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS unset unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS space unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
The first value is just the correct value of IFS=
'spacetabnewline'
Next line is all the hex values that the var $a
has, and a newline '0a' at the end as it is going to be given to each read command.
The next line, for which IFS is null, does not perform any 'field spliting', but the newline is removed (as expected).
The next three lines, as IFS contains an space, remove the initial spaces and set the var line to the balance remaining.
The last four lines shows what an unquoted variable will do. The values will be split on the (several) spaces and will be printed as: bar,baz,qux,
Q1. Field splitting.
Is field splitting the same as word splitting ?
Yes, both point to the same idea.
Q2: When is IFS null?.
Is setting
IFS=''
the same as null, the same as an empty string too?
Yes, all three mean the same: No field/word splitting shall be performed.
Also, this affects printing fields (as with echo "$*"
) all fields will be concatenated together with no space.
Q3: (part a) Unset IFS.
In the POSIX specification, I read the following:
If IFS is not set, the shell shall behave as if the value of IFS is <space><tab><newline>.
Which is exactly equivalent to:
With an
unset IFS
, the shell shall behave as if IFS is default.
That means that the 'Field splitting' will be exactly the same with a default IFS value, or unset.
That does NOT mean that IFS will work the same way in all conditions.
Being more specific, executing OldIFS=$IFS
will set the var OldIFS
to null, not the default. And trying to set IFS back, as this, IFS=OldIFS
will set IFS to null, not keep it unset as it were before. Watch out !!.
Q3: (part b) Restore IFS.
How could I restore the value of IFS to default.
Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)
For zsh, ksh, and bash (AFAIK), IFS could be set to the default value as:
IFS=$' tn' # works with zsh, ksh, bash.
Done, you need to read nothing else.
But if you need to re-set IFS for sh, it may become complex.
Let's take a look from easiest to complete with no drawbacks (except complexity).
1.- Unset IFS.
We could just unset IFS
(Read Q3 part a, above.).
2.- Swap chars.
As a workaround, swapping the value of tab and newline makes it simpler to set the value of IFS, and then it works in a equivalent way.
Set IFS to <space><newline><tab>:
sh -c 'IFS=$(echo " nt"); printf "%s" "$IFS"|xxd' # Works.
3.- A simple? solution:
If there are child scripts that need IFS correctly set, you could always manually write:
IFS='
'
Where the sequence manually typed was: IFS=
'spacetabnewline', sequence which has actually been correctly typed above (If you need to confirm, edit this answer). But a copy/paste from your browser will break because the browser will squeeze/hide the whitespace. It makes it difficult to share the code as written above.
4.- Complete solution.
To write code that can be safely copied usually involves unambiguous printable escapes.
We need some code that "produces" the expected value. But, even if conceptually correct, this code will NOT set a trailing n
:
sh -c 'IFS=$(echo " tn"); printf "%s" "$IFS"|xxd' # wrong.
That happens because, under most shells, all trailing newlines of $(...)
or `...`
command substitutions are removed on expansion.
We need to use a trick for sh:
sh -c 'IFS="$(printf " tnx")"; IFS="$IFS%x"; printf "$IFS"|xxd' # Correct.
An alternative way may be to set IFS as an environment value from bash (for example) and then call sh (the versions of it that accept IFS to be set via the environment), as this:
env IFS=$' tn' sh -c 'printf "%s" "$IFS"|xxd'
In short, sh makes resetting IFS to default quite an odd adventure.
Q4: In actual code:
Finally, how would this code:
while IFS= read -r line
do
echo $line
done < /path_to_text_file
behave if we we change the first line to
while read -r line # Use the default IFS value
or to:
while IFS=' ' read -r line
First: I do not know if the echo $line
(with the var NOT quoted) is there on porpouse, or not.
It introduces a second level of 'field splitting' that read does not have.
So I'll answer both. :)
With this code (so you could confirm). You'll need the useful xxd:
#!/bin/ksh
# Correctly set IFS as described above.
defIFS="$(printf " \t\nx")"; defIFS="${defIFS%x}";
IFS="$defIFS"
printf "IFS value: "
printf "%s" "$IFS"| xxd -p
a='   bar   baz   quz   '; l="${#a}"
printf "var value : %${l}s-" "$a" ; printf "%s\n" "$a" | xxd -p
printf "%s\n" "$a" | while IFS='x' read -r line; do
printf "IFS --x-- : %${l}s-" "$line" ;
printf "%s" "$line" |xxd -p; done;
printf 'Values quoted :\n' "" # With values quoted:
printf "%s\n" "$a" | while IFS='' read -r line; do
printf "IFS null quoted : %${l}s-" "$line" ;
printf "%s" "$line" |xxd -p; done;
printf "%s\n" "$a" | while IFS="$defIFS" read -r line; do
printf "IFS default quoted : %${l}s-" "$line" ;
printf "%s" "$line" |xxd -p; done;
unset IFS; printf "%s\n" "$a" | while read -r line; do
printf "IFS unset quoted : %${l}s-" "$line" ;
printf "%s" "$line" |xxd -p; done;
IFS="$defIFS" # set IFS back to default.
printf "%s\n" "$a" | while IFS=' ' read -r line; do
printf "IFS space quoted : %${l}s-" "$line" ;
printf "%s" "$line" |xxd -p; done;
printf '%s\n' "Values unquoted :" # Now with values unquoted:
printf "%s\n" "$a" | while IFS='x' read -r line; do
printf "IFS --x-- unquoted : "
printf "%s, " $line; printf "%s," $line |xxd -p; done
printf "%s\n" "$a" | while IFS='' read -r line; do
printf "IFS null unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
printf "%s\n" "$a" | while IFS="$defIFS" read -r line; do
printf "IFS defau unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
unset IFS; printf "%s\n" "$a" | while read -r line; do
printf "IFS unset unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
IFS="$defIFS" # set IFS back to default.
printf "%s\n" "$a" | while IFS=' ' read -r line; do
printf "IFS space unquoted : ";
printf "%s, " $line; printf "%s," $line |xxd -p; done
I get:
$ ./stackexchange-Understanding-IFS.sh
IFS value: 20090a
var value :    bar   baz   quz   -20202062617220202062617a20202071757a2020200a
IFS --x-- :    bar   baz   quz   -20202062617220202062617a20202071757a202020
Values quoted :
IFS null quoted :    bar   baz   quz   -20202062617220202062617a20202071757a202020
IFS default quoted :       bar   baz   quz-62617220202062617a20202071757a
IFS unset quoted :       bar   baz   quz-62617220202062617a20202071757a
IFS space quoted :       bar   baz   quz-62617220202062617a20202071757a
Values unquoted :
IFS --x-- unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS null unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS defau unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS unset unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
IFS space unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
The first value is just the correct value of IFS: space, tab, newline (20 09 0a).
The next line shows all the hex values of the var $a, with a newline (0a) at the end, as that is what is given to each read command.
The following lines, for which IFS is x or null, do not perform any 'field splitting', but the newline is removed (as expected).
The next three lines, as IFS contains a space, remove the leading and trailing spaces and set the var line to the remainder.
The last five lines show what an unquoted variable will do. The values are split on the (several) spaces and printed as: bar, baz, quz,
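The second level of splitting introduced by the unquoted $line can also be seen without read at all. This is a minimal sketch that assumes IFS is at its default value; the variable names are only illustrative:

```shell
line='   bar   baz   quz   '

# Quoted: the value reaches echo as a single argument, spaces intact.
quoted=$(echo "$line")

# Unquoted: the shell first splits $line into three fields,
# and echo then joins its arguments with single spaces.
unquoted=$(echo $line)

printf '[%s]\n' "$quoted" "$unquoted"
# [   bar   baz   quz   ]
# [bar baz quz]
```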
edited Aug 23 '15 at 8:48
answered Aug 6 '15 at 23:13
user79743
unset IFS does clear IFS, even if IFS is thereafter presumed to be " \t\n":
$ echo "'$IFS'"
'
'
$ IFS=""
$ echo "'$IFS'"
''
$ unset IFS
$ echo "'$IFS'"
''
$ IFS=$' \t\n'
$ echo "'$IFS'"
'
'
$
Tested on bash versions 4.2.45 and 3.2.25 with the same behavior.
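Although "$IFS" expands to nothing in both cases, an unset IFS and a null IFS still split differently. This is a small sketch in POSIX sh (the variable names are illustrative) that counts the fields produced by an unquoted expansion:

```shell
words='one two three'

unset IFS          # unset: splitting behaves as with the default IFS
set -- $words
unset_count=$#     # 3 fields

IFS=               # null: no field splitting at all
set -- $words
null_count=$#      # 1 field

unset IFS          # back to the default behaviour
printf 'unset: %s field(s), null: %s field(s)\n' "$unset_count" "$null_count"
# unset: 3 field(s), null: 1 field(s)
```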
answered Sep 17 '13 at 19:48
derekm
Q1 Splitting
Q1. Is field splitting the same as word splitting ?
Probably, but with a caveat.
A parameter expansion, as it is called in POSIX, ksh, bash or zsh (and others), is subject to "Field splitting" as it is called in POSIX (a.k.a. "Field splitting" in ksh, "Word Splitting" in bash, and sometimes field and sometimes word in zsh).
I would define it as:
The process of splitting a parameter that is done using the IFS characters.
Where "parameter" means "a variable value (contents)"; the use of the IFS characters might be different in zsh. There is an s:string: flag that does "field splitting" on string in zsh.
Caveat
However, there is another process of splitting, called "Token Recognition" by POSIX, that splits command lines into words (tokens), mostly using blanks (tabs and spaces) and some other rules. That tokens are subsequently (immediately) called "words" is shown in the alias description (for example):
After a token has been delimited, … , a resulting word …
As explained in ksh manual page:
Command Syntax
The shell begins parsing its input by breaking it into words. Words, which are sequences of characters, are delimited by unquoted white space characters (space, tab and newline) or meta-characters (<, >, |, ;, &, ( and )).
Also explicitly defined in bash man as this:
word A sequence of characters considered as a single unit by the shell. Also known as a token.
Or this:
word A sequence of characters treated as a unit by the shell. Words may not include unquoted metacharacters.
That is also "word splitting" in layman's terms.
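The two kinds of splitting can be told apart by changing IFS. In this sketch (POSIX sh, illustrative variable names), the command line is still tokenized on blanks by the parser, while the expansion of $data is split on the colon only:

```shell
data='a:b c:d'

IFS=:            # field splitting now uses the colon only
set -- $data     # the parser still sees three tokens here: set, --, $data
field_count=$#   # the expansion was split into: "a", "b c", "d"
second=$2
unset IFS        # return to the default splitting behaviour

printf '%s fields, second is [%s]\n' "$field_count" "$second"
# 3 fields, second is [b c]
```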
Q2 IFS null
Q2: Is setting IFS= the same as setting IFS to null? Is this what is meant by setting it to an empty string too?
An unset variable doesn't exist. A set variable exists but may be empty. If this value of empty is called "null" (as opposed to "NUL" or '0x00' or ''), then yes, all three are equivalent.
The variable is set but empty: var= ≡ var='' ≡ var="".
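A quick way to see that the three assignments are equivalent, and that all of them differ from an unset variable, is the ${var+x} expansion, which yields x only when var is set. A minimal sketch in POSIX sh, with made-up variable names:

```shell
v1=
v2=''
v3=""
unset v4

# ${var+x} expands to "x" when var is set (even if empty), to nothing otherwise.
set_v1=${v1+x}; set_v2=${v2+x}; set_v3=${v3+x}; set_v4=${v4+x}

printf 'v1:[%s] v2:[%s] v3:[%s] v4:[%s]\n' "$set_v1" "$set_v2" "$set_v3" "$set_v4"
# v1:[x] v2:[x] v3:[x] v4:[]

# All three set variables have length zero.
printf '%s %s %s\n' "${#v1}" "${#v2}" "${#v3}"
# 0 0 0
```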
Q3 unset IFS
Q3: In the POSIX specification, I read the following:
If IFS is not set, the shell shall behave as if the value of IFS is <space>, <tab> and <newline>
Yes, the shell shall behave that way, in the sense that the effects of IFS (mainly on "word splitting" and on the read command) stay the same as with the default value after an unset IFS is executed.
That is not the same as believing that an unset variable acts like a set variable. Specifically, if you have:
$ unset a
$ b=$a
The variable a is unset (it does not exist yet); however, b is set to null, as described in the previous question. And this will also be true:
$ echo "\"${a-a is UN-set}\" \"${b-b is UN-set}\""
"a is UN-set" ""
That is important in the case where this is done:
$ unset IFS
$ oldIFS=$IFS
The variable oldIFS is now set (to null) while IFS is unset. Trying to restore IFS by doing:
$ IFS=$oldIFS
will end with IFS set to null, not unset. Its effects will be different.
The only solution is to ensure that oldIFS is set (or unset) just as IFS is:
$ [ "$(set | grep '^IFS=')" ] && oldIFS=$IFS || unset oldIFS;
If IFS is not unset, set oldIFS to its value; otherwise unset oldIFS.
Restore by the same procedure (swap vars):
$ [ "$(set | grep '^oldIFS=')" ] && IFS=$oldIFS || unset IFS;
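The same save-and-restore idea can be sketched without set | grep, by testing ${IFS+set} instead. This is an alternative under stated assumptions, not the method used above: save_ifs, restore_ifs, saved_ifs and saved_ifs_set are hypothetical names chosen for this example.

```shell
# save_ifs / restore_ifs are illustrative helpers, not standard commands.
save_ifs() {
    if [ "${IFS+set}" = set ]; then
        saved_ifs=$IFS        # IFS exists (possibly empty): copy its value
        saved_ifs_set=yes
    else
        saved_ifs=
        saved_ifs_set=no      # IFS is unset: remember that fact instead
    fi
}

restore_ifs() {
    if [ "$saved_ifs_set" = yes ]; then
        IFS=$saved_ifs
    else
        unset IFS
    fi
}
```

A typical use would be save_ifs, then a temporary assignment such as IFS=: for some parsing, then restore_ifs, which leaves IFS unset if it was unset before.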
Q3 reset IFS
Q3 Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)
The only real problem is the newline at the end. The old, simple way to get it is:
nl='
'
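Yes, a real newline. For a full IFS of <space><tab><newline>, any one of:
IFS=" $(printf '\t')$nl" # POSIX sh, using nl from above
eval "$(printf "s=' \t\n'")" # POSIX sh; stores the value in s
IFS=$' \t\n' # shells with $'...' quoting, such as bash, ksh93, zsh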
Q4 IFS on read
Q4: Finally, how would this code:
while IFS= read -r line ...
Will read one line (up to a newline character) and assign it (without the trailing newline) to the var line. No word splitting nor white space removal (leading or trailing white space) will be executed.
while read -r line # Use the default IFS value
With the default IFS ( \t\n) the first effect is that leading and trailing white space of the whole line will be trimmed. Then, each (group of consecutive) delimiter(s) will be used to divide the line among the variables. That is: two variables need one (not leading or trailing) delimiter. Each additional variable requires an additional delimiter (or group of delimiters).
while IFS=' ' read -r line
Leading and trailing (runs of) spaces will be removed, and each (run of) spaces will be used to split the line at as many places as the variables require.
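The three cases can be compared side by side. This is a minimal sketch in POSIX sh that assumes IFS starts at its default value; the input line and the variable names are made up for the example:

```shell
line='  alpha   beta gamma  '

# IFS= : nothing is trimmed or split, the line is kept verbatim.
IFS= read -r whole <<EOF
$line
EOF

# Default IFS, one variable: leading and trailing blanks are trimmed.
read -r trimmed <<EOF
$line
EOF

# IFS=' ', two variables: the first field goes to "first",
# everything after the first run of spaces goes to "rest".
IFS=' ' read -r first rest <<EOF
$line
EOF

printf '[%s]\n' "$whole" "$trimmed" "$first" "$rest"
# [  alpha   beta gamma  ]
# [alpha   beta gamma]
# [alpha]
# [beta gamma]
```

The IFS assignments here are prefixes of the read command, so they apply to that single read only and leave the shell's own IFS untouched.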
answered yesterday
Isaac