Understanding IFS
























The following few threads on this site and StackOverflow were helpful for understanding how IFS works:



  • What is IFS in context of for looping?

  • How to loop over the lines of a file

  • Bash, read line by line from file, with IFS

But I still have some short questions. I decided to ask them in the same post since I think that may better help future readers:



Q1. IFS is typically discussed in the context of "field splitting". Is field splitting the same as word splitting?



Q2: The POSIX specification says:




If the value of IFS is null, no field splitting shall be performed.




Is setting IFS= the same as setting IFS to null? Is this what is meant by setting it to an empty string too?



Q3: In the POSIX specification, I read the following:




If IFS is not set, the shell shall behave as if the value of IFS is
<space>, <tab> and <newline>




Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)



Q4: Finally, how would this code:



while IFS= read -r line
do
echo $line
done < /path_to_text_file


behave if we change the first line to



while read -r line # Use the default IFS value


or to:



while IFS=' ' read -r line












































































      Tags: shell














      asked Dec 13 '11 at 23:43 by Amelio Vazquez-Reina; edited May 23 '17 at 11:33 by Community♦




















          5 Answers

















          Answer (accepted, 26 votes)










          1. Yes, they are the same.

          2. Yes.

          3. In bash and similar shells, you could do something like IFS=$' \t\n'. Otherwise, you could insert the literal control characters by typing [space] CTRL+V [tab] CTRL+V [enter]. If you are planning to do this, however, it's better to store the old IFS value in another variable and restore it afterwards (or to override it temporarily for a single command by using the var=foo command syntax).

          4.
            • The first code snippet will put the entire line read, verbatim, into $line, as there are no field separators to perform word splitting on. Bear in mind, however, that since many shells store strings as C strings, the first NUL byte may still make the line appear to be prematurely terminated.

            • The second code snippet may not put an exact copy of the input into $line. For example, if there are multiple consecutive field separators, they will be made into a single instance of the first element. This is often recognised as loss of surrounding whitespace.

            • The third code snippet will do the same as the second, except it will only split on a space (not the usual space, tab, or newline).
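
The three variants from the question can be compared side by side. The following is a minimal sketch, assuming a POSIX-style shell; the sample text and the names `sample`, `raw`, `dflt` and `spc` are invented for illustration. Note that with a single variable, `read` keeps the inner separators and only trims leading and trailing ones, as the comments under this answer point out.

```shell
unset IFS                          # start from the default behaviour
tab=$(printf '\t')

# Illustrative line: leading blanks, an inner tab, and a trailing space + tab.
sample="  foo${tab}bar  baz ${tab}"

# A here-document feeds the same line to each variant of read.
IFS= read -r raw <<EOF
$sample
EOF

read -r dflt <<EOF
$sample
EOF

IFS=' ' read -r spc <<EOF
$sample
EOF

# Brackets make leading and trailing whitespace visible.
printf 'IFS= (empty): [%s]\n' "$raw"
printf 'default IFS : [%s]\n' "$dflt"
printf "IFS=' '     : [%s]\n" "$spc"
```

The `IFS=` and `IFS=' '` prefixes apply only to the `read` command they precede, so the shell's own IFS is left as it was.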






















          • 3




            The answer to Q2 is wrong: an empty IFS and an unset IFS are very different. The answer to Q4 is partly wrong: inner separators are not touched here, only leading and trailing ones.
            – Gilles
            Dec 14 '11 at 12:06






          • 3




            @Gilles: In Q2 none of the three given denominations refers to an unset IFS, all of them mean IFS=.
            – Stéphane Gimenez
            Dec 14 '11 at 13:00










          • @Gilles In Q2, I never said they were the same. And inner separators are touched, as shown here: IFS=' ' ; foo=( bar baz qux ) ; echo "${#foo[@]}". (Er, what? There should be multiple space delimiters in there, SO engine keeps on stripping them).
            – Chris Down
            Dec 14 '11 at 16:05







          • 2




            @StéphaneGimenez, Chris: Oh, right, sorry about Q2, I misread the question. For Q4, we're talking about read; the last variable grabs everything that's left except for the last separator and leaves inner separators inside.
            – Gilles
            Dec 15 '11 at 15:25






          • 1




            Gilles is partially correct about the spaces not being removed by read. Read my answer for details.
            – user79743
            Aug 6 '15 at 23:28

















          Answer (21 votes)













          Q1: Yes. “Field splitting” and “word splitting” are two terms for the same concept.



          Q2: Yes. If IFS is unset (i.e. after unset IFS), it is equivalent to IFS being set to $' \t\n' (a space, a tab and a newline). If IFS is set to an empty value (that's what “null” means here) (i.e. after IFS= or IFS='' or IFS=""), no field splitting is performed at all (and $*, which normally uses the first character of $IFS, uses a space character).
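
The difference between an unset and an empty IFS can be checked with a short sketch. It assumes a POSIX-style shell; `count_fields` and the variable names are illustrative. For the `"$*"` part, POSIX specifies that the parameters are joined with no separator when IFS is empty and with a space when IFS is unset, which is also what the comment under this answer says.

```shell
# Helper (illustrative): report how many fields an unquoted expansion produces.
count_fields() { echo "$#"; }

value='a b  c'

unset IFS                          # unset: behaves like space, tab, newline
n_unset=$(count_fields $value)

IFS=                               # empty ("null"): no field splitting at all
n_empty=$(count_fields $value)

# "$*" joins the positional parameters using the first character of IFS.
set -- x y z
IFS=:
joined_colon="$*"
IFS=
joined_empty="$*"
unset IFS
joined_unset="$*"

printf 'fields: unset=%s empty=%s\n' "$n_unset" "$n_empty"
printf 'joined: [%s] [%s] [%s]\n' "$joined_colon" "$joined_empty" "$joined_unset"
```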



          Q3: If you want to have the default IFS behavior, you can use unset IFS. If you want to set IFS explicitly to this default value, you can put the literal characters space, tab, newline in single quotes. In ksh93, bash or zsh, you can use IFS=$' \t\n'. Portably, if you want to avoid having a literal tab character in your source file, you can use



          IFS=" $(echo t | tr t \\t)
          "


          Q4: With IFS set to an empty value, read -r line sets line to the whole line except its terminating newline. With IFS=" ", spaces at the beginning and at the end of the line are trimmed. With the default value of IFS, tabs and spaces are trimmed.
























          • 2




            Q2 is partly wrong. If IFS is empty, "$*" is joined without separators. (For $@, there are some variations between shells in non-list contexts like IFS=; var=$@.) It should be noted that when IFS is empty, no word splitting is performed, but $var still expands to no argument instead of an empty argument when $var is empty, and globbing still applies, so you still need to quote variables (even if you disable globbing).
            – Stéphane Chazelas
            Feb 8 '13 at 22:19

















          Answer (12 votes)













          Q1. Field splitting.




          Is field splitting the same as word splitting ?




          Yes, both point to the same idea.



          Q2: When is IFS null?




          Is setting IFS='' the same as null, the same as an empty string too?




          Yes, all three mean the same: No field/word splitting shall be performed.
          This also affects how fields are printed (as with echo "$*"): all fields will be concatenated together with no space between them.



          Q3: (part a) Unset IFS.




          In the POSIX specification, I read the following:




          If IFS is not set, the shell shall behave as if the value of IFS is <space><tab><newline>.





          Which is exactly equivalent to:




          With an unset IFS, the shell shall behave as if IFS is default.




          That means that the 'Field splitting' will be exactly the same with a default IFS value, or unset.

          That does NOT mean that IFS will work the same way in all conditions.
          Being more specific: with IFS unset, executing OldIFS=$IFS will set the var OldIFS to null, not to the default value. And trying to set IFS back with IFS=$OldIFS will then set IFS to null, instead of keeping it unset as it was before. Watch out!
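
The pitfall described above can be reproduced in a few lines. This is a minimal sketch, assuming a POSIX-style shell without `set -u`; `count_fields`, `text` and the other names are illustrative.

```shell
# Helper (illustrative): report how many fields an unquoted expansion produces.
count_fields() { echo "$#"; }
text='one two three'

unset IFS                          # IFS does not exist; splitting uses the default
before=$(count_fields $text)

OldIFS=$IFS                        # expanding an unset variable yields an empty string
# ... code that changes IFS would go here ...
IFS=$OldIFS                        # IFS is now set but empty, not unset

after=$(count_fields $text)
state=${IFS+set}                   # expands to "set" only if IFS exists

printf 'fields before: %s, after: %s, IFS is: %s\n' "$before" "$after" "${state:-unset}"

unset IFS                          # put the shell back into its default state
```

After the naive restore, the same unquoted expansion is no longer split, because an empty IFS disables field splitting.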



          Q3: (part b) Restore IFS.




          How could I restore the value of IFS to default.
          Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)




          For zsh, ksh, and bash (AFAIK), IFS could be set to the default value as:



          IFS=$' \t\n' # works with zsh, ksh, bash.


          Done, you need to read nothing else.



          But if you need to re-set IFS for sh, it may become complex.



          Let's take a look from easiest to complete with no drawbacks (except complexity).



          1.- Unset IFS.



          We could just unset IFS (Read Q3 part a, above.).



          2.- Swap chars.



          As a workaround, swapping the value of tab and newline makes it simpler to set the value of IFS, and then it works in a equivalent way.



          Set IFS to <space><newline><tab>:



          sh -c 'IFS=$(echo " \n\t"); printf "%s" "$IFS"|xxd' # Works.


          3.- A simple? solution:



          If there are child scripts that need IFS correctly set, you could always manually write:




          IFS='
          '


          Where the sequence manually typed was IFS='<space><tab><newline>', a sequence which was typed correctly above (if you need to confirm, edit this answer). But a copy/paste from your browser will break because the browser will squeeze/hide the whitespace. That makes it difficult to share the code as written above.



          4.- Complete solution.



          To write code that can be safely copied usually involves unambiguous printable escapes.



          We need some code that "produces" the expected value. But, even if conceptually correct, this code will NOT set a trailing \n:



          sh -c 'IFS=$(echo " \t\n"); printf "%s" "$IFS"|xxd' # wrong.


          That happens because, under most shells, all trailing newlines of $(...) or `...` command substitutions are removed on expansion.



          We need to use a trick for sh:



          sh -c 'IFS="$(printf " \t\nx")"; IFS="${IFS%x}"; printf "%s" "$IFS"|xxd' # Correct.
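
Why the extra `x` is needed can be seen by measuring the results. This is a minimal sketch, assuming a POSIX-style shell; the names `naive`, `guarded` and `fixed` are illustrative, and `od` stands in for the `xxd` used in this answer because it is more widely installed.

```shell
# Command substitution removes every trailing newline from the output.
naive=$(printf ' \t\n')            # the final newline is lost

# Appending a sentinel character protects the newline ...
guarded=$(printf ' \t\nx')
# ... and the sentinel is then removed with a suffix-stripping expansion.
fixed=${guarded%x}

printf 'naive: %s chars, fixed: %s chars\n' "${#naive}" "${#fixed}"
printf '%s' "$fixed" | od -An -c   # dump the bytes of the corrected value
```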


          An alternative way may be to set IFS as an environment value from bash (for example) and then call sh (the versions of it that accept IFS to be set via the environment), as this:



          env IFS=$' \t\n' sh -c 'printf "%s" "$IFS"|xxd'


          In short, sh makes resetting IFS to default quite an odd adventure.



          Q4: In actual code:




          Finally, how would this code:



          while IFS= read -r line
          do
          echo $line
          done < /path_to_text_file


          behave if we we change the first line to



          while read -r line # Use the default IFS value


          or to:



          while IFS=' ' read -r line



          First: I do not know whether the echo $line (with the var NOT quoted) is there on purpose or not.
          It introduces a second level of 'field splitting' that read does not have.
          So I'll answer both. :)



          With this code (so you could confirm). You'll need the useful xxd:



          #!/bin/ksh
          # Correctly set IFS as described above.
          defIFS="$(printf " \t\nx")"; defIFS="${defIFS%x}";
          IFS="$defIFS"
          printf "IFS value: "
          printf "%s" "$IFS"| xxd -p

          a='   bar   baz   quz   '; l="${#a}"
          printf "var value : %${l}s-" "$a" ; printf "%s\n" "$a" | xxd -p

          printf "%s\n" "$a" | while IFS='x' read -r line; do
              printf "IFS --x-- : %${l}s-" "$line" ;
              printf "%s" "$line" |xxd -p; done;

          printf 'Values quoted :\n' # With values quoted:
          printf "%s\n" "$a" | while IFS='' read -r line; do
              printf "IFS null quoted : %${l}s-" "$line" ;
              printf "%s" "$line" |xxd -p; done;

          printf "%s\n" "$a" | while IFS="$defIFS" read -r line; do
              printf "IFS default quoted : %${l}s-" "$line" ;
              printf "%s" "$line" |xxd -p; done;

          unset IFS; printf "%s\n" "$a" | while read -r line; do
              printf "IFS unset quoted : %${l}s-" "$line" ;
              printf "%s" "$line" |xxd -p; done;
          IFS="$defIFS" # set IFS back to default.

          printf "%s\n" "$a" | while IFS=' ' read -r line; do
              printf "IFS space quoted : %${l}s-" "$line" ;
              printf "%s" "$line" |xxd -p; done;

          printf '%s\n' "Values unquoted :" # Now with values unquoted:
          printf "%s\n" "$a" | while IFS='x' read -r line; do
              printf "IFS --x-- unquoted : "
              printf "%s, " $line; printf "%s," $line |xxd -p; done

          printf "%s\n" "$a" | while IFS='' read -r line; do
              printf "IFS null unquoted : ";
              printf "%s, " $line; printf "%s," $line |xxd -p; done

          printf "%s\n" "$a" | while IFS="$defIFS" read -r line; do
              printf "IFS defau unquoted : ";
              printf "%s, " $line; printf "%s," $line |xxd -p; done

          unset IFS; printf "%s\n" "$a" | while read -r line; do
              printf "IFS unset unquoted : ";
              printf "%s, " $line; printf "%s," $line |xxd -p; done
          IFS="$defIFS" # set IFS back to default.

          printf "%s\n" "$a" | while IFS=' ' read -r line; do
              printf "IFS space unquoted : ";
              printf "%s, " $line; printf "%s," $line |xxd -p; done


          I get:



          $ ./stackexchange-Understanding-IFS.sh
          IFS value: 20090a
          var value :    bar   baz   quz   -20202062617220202062617a20202071757a2020200a
          IFS --x-- :    bar   baz   quz   -20202062617220202062617a20202071757a202020
          Values quoted :
          IFS null quoted :    bar   baz   quz   -20202062617220202062617a20202071757a202020
          IFS default quoted :       bar   baz   quz-62617220202062617a20202071757a
          IFS unset quoted :       bar   baz   quz-62617220202062617a20202071757a
          IFS space quoted :       bar   baz   quz-62617220202062617a20202071757a
          Values unquoted :
          IFS --x-- unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
          IFS null unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
          IFS defau unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
          IFS unset unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
          IFS space unquoted : bar, baz, quz, 6261722c62617a2c71757a2c


          The first value is just the correct value of IFS='<space><tab><newline>'.



          The next line shows all the hex values of the var $a, with a newline '0a' at the end, as that is what is going to be given to each read command.



          The next lines, for which IFS is 'x' (a character that does not occur in the value) or null, do not perform any 'field splitting', but the newline is removed (as expected).



          The next three lines, as IFS contains a space, remove the leading and trailing spaces and set the var line to the balance remaining; the inner spaces are kept.



          The last five lines show what an unquoted variable will do. The values will be split on the (several) spaces and will be printed as: bar, baz, quz,



































            Answer (4 votes)













            unset IFS does clear IFS, even if IFS is thereafter presumed to be " \t\n":



            $ echo "'$IFS'"
            '
            '
            $ IFS=""
            $ echo "'$IFS'"
            ''
            $ unset IFS
            $ echo "'$IFS'"
            ''
            $ IFS=$' \t\n'
            $ echo "'$IFS'"
            '
            '
            $


            Tested on bash versions 4.2.45 and 3.2.25 with the same behavior.

































              Answer (0 votes)













              Q1 Splitting




              Q1. Is field splitting the same as word splitting ?




              Probably, but with a caveat.

              A parameter expansion, as it is called in POSIX, ksh, bash or zsh (and others), is subject to "Field splitting" as POSIX calls it (a.k.a. "Field Splitting" in ksh, "Word Splitting" in bash, and sometimes field and sometimes word in zsh).



              I would define it as:



              The process of splitting a parameter that is done using the IFS characters.



              Where "parameter" means "a variable value (contents)", and the use of the IFS characters might be different in zsh. There is an s:string: flag in zsh that does "field splitting" on string.



              caveat

              However, there is a process of splitting called "Token Recognition", as defined by POSIX, that splits command lines into words (tokens), mostly using blanks (tabs and spaces) and some other rules. That tokens are subsequently (immediately) called "words" is shown in the alias description (for example):




              After a token has been delimited, … , a resulting word …




              As explained in ksh manual page:




              Command Syntax
              The shell begins parsing its input by breaking it into words. Words, which are sequences of characters, are delimited by unquoted white space characters (space, tab and newline) or meta-characters (<, >, |, ;, &, ( and )).




              Also explicitly defined in bash man as this:




              word A sequence of characters considered as a single unit by the shell. Also known as a token.




              Or this:




              word A sequence of characters treated as a unit by the shell. Words may not include unquoted metacharacters.




              That is a "word splitting" in layman terms.




              Q2 IFS null




              Q2: Is setting IFS= the same as setting IFS to null? Is this what is meant by setting it to an empty string too?




              An unset variable doesn't exist. A set variable exists but may be empty. If this value of empty is called "null" (as opposed to "NUL" or '0x00' or '\0'), then yes, all three are equivalent.



              The variable is set but empty. var= ≡ var='' ≡ var="".




              Q3 unset IFS




              Q3: In the POSIX specification, I read the following:



              If IFS is not set, the shell shall behave as if the value of IFS is <space>, <tab> and <newline>




              Yes, the shell shall behave that way, in the sense that the effects of IFS, mainly on "word splitting" and the read command, are the same after unset IFS as with the default value.



              That does not mean that an unset variable acts the same as a set variable in every respect. Specifically, if you have:



              $ unset a
              $ b=$a


              The variable a is unset, it doesn't exist yet, however, b is set to null as described in the previous question. And this will also be true:



              $ echo "\"${a-a is UN-set}\" \"${b-b is UN-set}\""
              "a is UN-set" ""


              That is important in the case where this is done:



              $ unset IFS
              $ oldIFS=$IFS


              The variable oldIFS is now set (but IFS is unset), so trying to restore IFS by doing:



              $ IFS=$oldIFS


              Will end with an IFS set to null, not unset. Its effects will be different.



              The only solution is to ensure that oldIFS is also set (or unset) as IFS:



              $ [ "$(set | grep '^IFS=')" ] && oldIFS=$IFS || unset oldIFS;


              If IFS is not unset, set oldIFS to its value, otherwise unset it.

              Restore by the same procedure (swap vars):



              $ [ "$(set | grep '^oldIFS=')" ] && IFS=$oldIFS || unset IFS;
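
A shorter way to record whether IFS is set is the `${IFS+word}` expansion, which yields `word` only when the variable exists (even if it is empty). The following is a sketch of that alternative to the `set | grep` test, not this answer's own method; the function names `save_ifs` and `restore_ifs` and the variables `saved_ifs` and `saved_ifs_state` are hypothetical.

```shell
save_ifs() {
    # Remember both the value and whether IFS was set at all.
    if [ "${IFS+set}" = set ]; then
        saved_ifs_state=set
        saved_ifs=$IFS
    else
        saved_ifs_state=unset
        unset saved_ifs
    fi
}

restore_ifs() {
    # Put IFS back exactly as it was: same value, or unset again.
    if [ "$saved_ifs_state" = set ]; then
        IFS=$saved_ifs
    else
        unset IFS
    fi
}

# Usage: temporarily split on commas, then return to the previous state.
unset IFS
csv='a,b,c'
save_ifs
IFS=,
set -- $csv
count=$#
restore_ifs
state_after=${IFS+set}
printf 'fields: %s, IFS afterwards: %s\n' "$count" "${state_after:-unset}"
```

Because the state is tracked separately from the value, an unset IFS stays unset and an empty IFS stays empty after the restore.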



              Q3 reset IFS




              Q3 Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)




              The only real problem is the newline at the end. The old, simple way to get it is:



              nl='
              '


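              Yes, a real newline. For a full IFS of <space><tab><newline>:



              IFS=" $(printf '\t')$nl"        # POSIX sh, using nl from above
              eval "$(printf "s=' \t\n'")"    # builds the same three characters in the variable s
              IFS=$' \t\n'                    # ksh93, bash, zsh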



              Q4 IFS on read



              Q4: Finally, how would this code:




              while IFS= read -r line ...




              Will read one line (up to a newline character) and assign it (without the trailing newline) to the var line. No word splitting nor white space removal (leading or trailing white space) will be executed.




              while read -r line # Use the default IFS value




              With the default IFS (<space><tab><newline>), the first effect is that leading and trailing white space of the whole line will be trimmed. Then each (group of consecutive) delimiter(s) will be used to divide the line among the variables. That is: two variables need one (not leading or trailing) delimiter. Each additional variable requires an additional delimiter (or group of delimiters).




              while IFS=' ' read -r line




              Leading and trailing (runs of) spaces will be removed, each (run of) spaces will be used to split the line at as many places as the variables require.
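
How the delimiters are shared out among several variables can be illustrated as follows. This is a minimal sketch, assuming a POSIX-style shell; the input lines and all variable names are invented for the example.

```shell
unset IFS                          # start from the default behaviour
tab=$(printf '\t')

# Three variables but four words: the last variable takes the rest of the
# line, inner spaces included, minus the trailing whitespace.
line1='   alpha   beta   gamma   delta   '
read -r first second rest <<EOF
$line1
EOF
printf '[%s] [%s] [%s]\n' "$first" "$second" "$rest"

# The same input under the default IFS and under IFS=' ':
# the tab only counts as a delimiter in the first case.
line2="a${tab}b c"
read -r d1 d2 <<EOF
$line2
EOF
IFS=' ' read -r s1 s2 <<EOF
$line2
EOF
printf 'default: [%s] [%s]\n' "$d1" "$d2"
printf "IFS=' ': [%s] [%s]\n" "$s1" "$s2"
```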
























































                5 Answers
                5






                active

                oldest

                votes








                5 Answers
                5






                active

                oldest

                votes









                active

                oldest

                votes






                active

                oldest

                votes








                up vote
                26
                down vote



                accepted










                1. Yes, they are the same.

                2. Yes.

                3. In bash, and similar shells, you could do something like IFS=$' tn'. Otherwise, you could insert the literal control codes by using [space] CTRL+V [tab] CTRL+V [enter]. If you are planning to do this, however, it's better to use another variable to temporarily store the old IFS value, and then restore it afterwards (or temporarily override it for one command by using the var=foo command syntax).


                  • The first code snippet will put the entire line read, verbatim, into $line, as there are no field separators to perform word splitting for. Bear in mind however that since many shells use cstrings to store strings, the first instance of a NUL may still cause the appearance of it being prematurely terminated.

                  • The second code snippet may not put an exact copy of the input into $line. For example, if there are multiple consecutive field separators, they will be made into a single instance of the first element. This is often recognised as loss of surrounding whitespace.

                  • The third code snippet will do the same as the second, except it will only split on a space (not the usual space, tab, or newline).






                share|improve this answer
















                • 3




                  The answer to Q2 is wrong: an empty IFS and an unset IFS are very different. The answer to Q4 is partly wrong: inner separators are not touched here, only leading and trailing ones.
                  – Gilles
                  Dec 14 '11 at 12:06






                • 3




                  @Gilles: In Q2 none of the three given denominations refers to an unset IFS, all of them mean IFS=.
                  – Stéphane Gimenez
                  Dec 14 '11 at 13:00










                • @Gilles In Q2, I never said they were the same. And inner separators are touched, as shown here: IFS=' ' ; foo=( bar baz qux ) ; echo "$#foo[@]". (Er, what? There should be multiple space delimiters in there, SO engine keeps on stripping them).
                  – Chris Down
                  Dec 14 '11 at 16:05







                • 2




                  @StéphaneGimenez, Chris: Oh, right, sorry about Q2, I misread the question. For Q4, we're talking about read; the last variable grabs everything that's left except for the last separator and leaves inner separators inside.
                  – Gilles
                  Dec 15 '11 at 15:25






                • 1




                  Gilles is partially correct about the spaces not being removed by read. Read my answer for details.
                  – user79743
                  Aug 6 '15 at 23:28














                up vote
                26
                down vote



                accepted










                1. Yes, they are the same.

                2. Yes.

                3. In bash, and similar shells, you could do something like IFS=$' tn'. Otherwise, you could insert the literal control codes by using [space] CTRL+V [tab] CTRL+V [enter]. If you are planning to do this, however, it's better to use another variable to temporarily store the old IFS value, and then restore it afterwards (or temporarily override it for one command by using the var=foo command syntax).


                  • The first code snippet will put the entire line read, verbatim, into $line, as there are no field separators to perform word splitting for. Bear in mind however that since many shells use cstrings to store strings, the first instance of a NUL may still cause the appearance of it being prematurely terminated.

                  • The second code snippet may not put an exact copy of the input into $line. For example, if there are multiple consecutive field separators, they will be made into a single instance of the first element. This is often recognised as loss of surrounding whitespace.

                  • The third code snippet will do the same as the second, except it will only split on a space (not the usual space, tab, or newline).






                share|improve this answer
















                • 3




                  The answer to Q2 is wrong: an empty IFS and an unset IFS are very different. The answer to Q4 is partly wrong: inner separators are not touched here, only leading and trailing ones.
                  – Gilles
                  Dec 14 '11 at 12:06






                • 3




                  @Gilles: In Q2 none of the three given denominations refers to an unset IFS, all of them mean IFS=.
                  – Stéphane Gimenez
                  Dec 14 '11 at 13:00










                • @Gilles In Q2, I never said they were the same. And inner separators are touched, as shown here: IFS=' ' ; foo=( bar baz qux ) ; echo "${#foo[@]}". (Er, what? There should be multiple space delimiters in there, SO engine keeps on stripping them).
                  – Chris Down
                  Dec 14 '11 at 16:05







                • 2




                  @StéphaneGimenez, Chris: Oh, right, sorry about Q2, I misread the question. For Q4, we're talking about read; the last variable grabs everything that's left except for the last separator and leaves inner separators inside.
                  – Gilles
                  Dec 15 '11 at 15:25






                • 1




                  Gilles is partially correct about the spaces not being removed by read. Read my answer for details.
                  – user79743
                  Aug 6 '15 at 23:28












                answered Dec 14 '11 at 0:25









                Chris Down




















                up vote
                21
                down vote













                Q1: Yes. “Field splitting” and “word splitting” are two terms for the same concept.



                Q2: Yes. If IFS is unset (i.e. after unset IFS), it is equivalent to IFS being set to $' \t\n' (a space, a tab and a newline). If IFS is set to an empty value (that's what “null” means here), i.e. after IFS= or IFS='' or IFS="", no field splitting is performed at all (and $*, which normally uses the first character of $IFS, uses a space character).



                Q3: If you want to have the default IFS behavior, you can use unset IFS. If you want to set IFS explicitly to this default value, you can put the literal characters space, tab, newline in single quotes. In ksh93, bash or zsh, you can use IFS=$' \t\n'. Portably, if you want to avoid having a literal tab character in your source file, you can use



                IFS=" $(echo t | tr t \\t)
                "


                Q4: With IFS set to an empty value, read -r line sets line to the whole line except its terminating newline. With IFS=" ", spaces at the beginning and at the end of the line are trimmed. With the default value of IFS, tabs and spaces at the beginning and at the end of the line are trimmed.
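To see the three cases from Q4 side by side, something like the following sketch can be used. The input line is a made-up example that starts and ends with both spaces and a tab, and the variable names are only illustrative.

```shell
#!/bin/sh
tab=$(printf '\t')
nl='
'
# Hypothetical input: 2 spaces, tab, "foo   bar", tab, 2 spaces.
input="  ${tab}foo   bar${tab}  "

# Empty IFS: the line is stored verbatim.
IFS= read -r raw <<EOF
$input
EOF

# Default IFS (space, tab, newline): leading and trailing blanks are trimmed.
IFS=" ${tab}${nl}" read -r def <<EOF
$input
EOF

# IFS set to a single space: only the outer spaces go; the tabs survive.
IFS=' ' read -r spc <<EOF
$input
EOF

printf '[%s]\n' "$raw" "$def" "$spc"
```

In all three cases the run of spaces between foo and bar is kept, because read is given a single variable.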
























                • 2




                  Q2 is partly wrong. If IFS is empty, "$*" is joined without separators. (For $@, there are some variations between shells in non-list contexts like IFS=; var=$@.) It should be noted that when IFS is empty, no word splitting is performed, but $var still expands to no argument instead of an empty argument when $var is empty, and globbing still applies, so you still need to quote variables (even if you disable globbing).
                  – Stéphane Chazelas
                  Feb 8 '13 at 22:19














                edited Dec 15 '11 at 15:26

























                answered Dec 14 '11 at 12:01









                Gilles


















                up vote
                12
                down vote













                Q1. Field splitting.




                Is field splitting the same as word splitting ?




                Yes, both point to the same idea.



                Q2: When is IFS null?




                Is setting IFS='' the same as null, the same as an empty string too?




                Yes, all three mean the same: no field/word splitting shall be performed.
                This also affects printing fields (as with echo "$*"): all fields will be concatenated together with no space between them.
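A small sketch of how "$*" joins the positional parameters under different IFS values. The three words and the variable names are arbitrary examples.

```shell
#!/bin/sh
set -- one two three

IFS=
joined_empty="$*"     # empty IFS: fields are concatenated with nothing between

IFS=:
joined_colon="$*"     # otherwise the first character of IFS is the joiner

unset IFS
joined_unset="$*"     # unset IFS: fields are joined with a single space

printf '%s\n' "$joined_empty" "$joined_colon" "$joined_unset"
```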



                Q3: (part a) Unset IFS.




                In the POSIX specification, I read the following:




                If IFS is not set, the shell shall behave as if the value of IFS is <space><tab><newline>.





                Which is exactly equivalent to:




                With an unset IFS, the shell shall behave as if IFS is default.




                That means that the 'Field splitting' will be exactly the same with a default IFS value, or unset.

                That does NOT mean that IFS will work the same way in all conditions.
                More specifically, executing OldIFS=$IFS while IFS is unset will set the variable OldIFS to null, not to the default value. Trying to restore it afterwards with IFS=$OldIFS will then set IFS to null instead of leaving it unset as it was before. Watch out!
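The pitfall, and one possible way around it, can be sketched as follows. The was_set and saved variables are hypothetical helpers, and the ${IFS+set} test is only one of several ways to detect an unset variable.

```shell
#!/bin/sh
str='a b c'
unset IFS                      # start from the "unset" state for the demo

# Naive save/restore: turns "unset" into "set but empty".
old=$IFS                       # old is now an empty string
IFS=$old
set -- $str                    # no splitting happens: one field
naive_count=$#

# Safer sketch: remember whether IFS was set at all.
unset IFS
if [ "${IFS+set}" = set ]; then was_set=1 saved=$IFS; else was_set=0; fi
IFS=,                          # some temporary value
if [ "$was_set" -eq 1 ]; then IFS=$saved; else unset IFS; fi
set -- $str                    # default splitting again: three fields
safe_count=$#

printf 'naive=%s safe=%s\n' "$naive_count" "$safe_count"
```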



                Q3: (part b) Restore IFS.




                How could I restore the value of IFS to default.
                Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)




                For zsh, ksh, and bash (AFAIK), IFS could be set to the default value as:



                IFS=$' \t\n' # works with zsh, ksh, bash.


                Done, you need to read nothing else.



                But if you need to re-set IFS for sh, it may become complex.



                Let's take a look from easiest to complete with no drawbacks (except complexity).



                1.- Unset IFS.



                We could just unset IFS (Read Q3 part a, above.).



                2.- Swap chars.



                As a workaround, swapping the order of tab and newline makes it simpler to set the value of IFS, and it then works in an equivalent way.



                Set IFS to <space><newline><tab>:



                sh -c 'IFS=$(printf " \n\t"); printf "%s" "$IFS"|xxd' # Works.


                3.- A simple? solution:



                If there are child scripts that need IFS correctly set, you could always manually write:




                IFS='
                '


                Here the sequence typed by hand was IFS='spacetabnewline', and it was typed correctly above (edit this answer if you need to confirm). But a copy/paste from your browser will break it, because the browser squeezes or hides the whitespace. That makes the code as written above difficult to share.



                4.- Complete solution.



                To write code that can be safely copied usually involves unambiguous printable escapes.



                We need some code that "produces" the expected value. But, even if conceptually correct, this code will NOT set a trailing \n:



                sh -c 'IFS=$(printf " \t\n"); printf "%s" "$IFS"|xxd' # wrong.


                That happens because, under most shells, all trailing newlines of $(...) or `...` command substitutions are removed on expansion.



                We need to use a trick for sh:



                sh -c 'IFS="$(printf " \t\nx")"; IFS="${IFS%x}"; printf "%s" "$IFS"|xxd' # Correct.


                An alternative way may be to set IFS as an environment value from bash (for example) and then call sh (the versions of it that accept IFS to be set via the environment), as this:



                env IFS=$' \t\n' sh -c 'printf "%s" "$IFS"|xxd'


                In short, sh makes resetting IFS to default quite an odd adventure.
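The effect of the sentinel trick can be checked without xxd by comparing string lengths. This is a minimal sketch; short and full are illustrative names.

```shell
#!/bin/sh
# Command substitution strips trailing newlines, so the newline is lost here:
short=$(printf ' \t\n')

# Appending a sentinel character (x) protects the newline; ${var%x} then
# removes the sentinel again, leaving space, tab and newline.
full=$(printf ' \t\nx')
full=${full%x}

printf 'short=%s full=%s\n' "${#short}" "${#full}"
```

The first value holds only the space and the tab, while the second one keeps all three characters.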



                Q4: In actual code:




                Finally, how would this code:



                while IFS= read -r line
                do
                echo $line
                done < /path_to_text_file


                behave if we change the first line to



                while read -r line # Use the default IFS value


                or to:



                while IFS=' ' read -r line



                First: I do not know whether the echo $line (with the variable NOT quoted) is there on purpose or not.
                It introduces a second level of 'field splitting' that read does not have.
                So I'll answer both. :)



                With this code (so you could confirm). You'll need the useful xxd:



                #!/bin/ksh
                # Correctly set IFS as described above.
                defIFS="$(printf " \t\nx")"; defIFS="${defIFS%x}";
                IFS="$defIFS"
                printf "IFS value: "
                printf "%s" "$IFS"| xxd -p

                a='   bar   baz   quz   '; l="${#a}"
                printf "var value : %${l}s-" "$a" ; printf "%s\n" "$a" | xxd -p

                printf "%s\n" "$a" | while IFS='x' read -r line; do
                printf "IFS --x-- : %${l}s-" "$line" ;
                printf "%s" "$line" |xxd -p; done;

                printf 'Values quoted :\n' "" # With values quoted:
                printf "%s\n" "$a" | while IFS='' read -r line; do
                printf "IFS null quoted : %${l}s-" "$line" ;
                printf "%s" "$line" |xxd -p; done;

                printf "%s\n" "$a" | while IFS="$defIFS" read -r line; do
                printf "IFS default quoted : %${l}s-" "$line" ;
                printf "%s" "$line" |xxd -p; done;

                unset IFS; printf "%s\n" "$a" | while read -r line; do
                printf "IFS unset quoted : %${l}s-" "$line" ;
                printf "%s" "$line" |xxd -p; done;
                IFS="$defIFS" # set IFS back to default.

                printf "%s\n" "$a" | while IFS=' ' read -r line; do
                printf "IFS space quoted : %${l}s-" "$line" ;
                printf "%s" "$line" |xxd -p; done;

                printf '%s\n' "Values unquoted :" # Now with values unquoted:
                printf "%s\n" "$a" | while IFS='x' read -r line; do
                printf "IFS --x-- unquoted : "
                printf "%s, " $line; printf "%s," $line |xxd -p; done

                printf "%s\n" "$a" | while IFS='' read -r line; do
                printf "IFS null unquoted : ";
                printf "%s, " $line; printf "%s," $line |xxd -p; done

                printf "%s\n" "$a" | while IFS="$defIFS" read -r line; do
                printf "IFS defau unquoted : ";
                printf "%s, " $line; printf "%s," $line |xxd -p; done

                unset IFS; printf "%s\n" "$a" | while read -r line; do
                printf "IFS unset unquoted : ";
                printf "%s, " $line; printf "%s," $line |xxd -p; done
                IFS="$defIFS" # set IFS back to default.

                printf "%s\n" "$a" | while IFS=' ' read -r line; do
                printf "IFS space unquoted : ";
                printf "%s, " $line; printf "%s," $line |xxd -p; done


                I get:



                $ ./stackexchange-Understanding-IFS.sh
                IFS value: 20090a
                var value :    bar   baz   quz   -20202062617220202062617a20202071757a2020200a
                IFS --x-- :    bar   baz   quz   -20202062617220202062617a20202071757a202020
                Values quoted :
                IFS null quoted :    bar   baz   quz   -20202062617220202062617a20202071757a202020
                IFS default quoted :       bar   baz   quz-62617220202062617a20202071757a
                IFS unset quoted :       bar   baz   quz-62617220202062617a20202071757a
                IFS space quoted :       bar   baz   quz-62617220202062617a20202071757a
                Values unquoted :
                IFS --x-- unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
                IFS null unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
                IFS defau unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
                IFS unset unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
                IFS space unquoted : bar, baz, quz, 6261722c62617a2c71757a2c


                The first value is just the correct value of IFS='spacetabnewline'



                Next line is all the hex values that the var $a has, and a newline '0a' at the end as it is going to be given to each read command.



                The next two lines, for which IFS is x (a character that does not occur in the value) and null, do not perform any 'field splitting', but the newline is removed (as expected).



                The next three lines, as IFS contains a space, remove the leading and trailing spaces and set the variable line to what remains.



                The last five lines show what an unquoted variable will do. The values are split on the (several) spaces and printed as: bar, baz, quz,
































                  up vote
                  12
                  down vote













                  Q1. Field splitting.




                  Is field splitting the same as word splitting ?




                  Yes, both point to the same idea.



                  Q2: When is IFS null?.




                  Is setting IFS='' the same as null, the same as an empty string too?




                  Yes, all three mean the same: No field/word splitting shall be performed.
                  Also, this affects printing fields (as with echo "$*") all fields will be concatenated together with no space.



                  Q3: (part a) Unset IFS.




                  In the POSIX specification, I read the following:




                  If IFS is not set, the shell shall behave as if the value of IFS is <space><tab><newline>.





                  Which is exactly equivalent to:




                  With an unset IFS, the shell shall behave as if IFS is default.




                  That means that the 'Field splitting' will be exactly the same with a default IFS value, or unset.

                  That does NOT mean that IFS will work the same way in all conditions.
                  Being more specific, executing OldIFS=$IFS will set the var OldIFS to null, not the default. And trying to set IFS back, as this, IFS=OldIFS will set IFS to null, not keep it unset as it were before. Watch out !!.



                  Q3: (part b) Restore IFS.




                  How could I restore the value of IFS to default.
                  Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)




                  For zsh, ksh, and bash (AFAIK), IFS could be set to the default value as:



                  IFS=$' tn' # works with zsh, ksh, bash.


                  Done, you need to read nothing else.



                  But if you need to re-set IFS for sh, it may become complex.



                  Let's take a look from easiest to complete with no drawbacks (except complexity).



                  1.- Unset IFS.



                  We could just unset IFS (Read Q3 part a, above.).



                  2.- Swap chars.



                  As a workaround, swapping the value of tab and newline makes it simpler to set the value of IFS, and then it works in a equivalent way.



                  Set IFS to <space><newline><tab>:



                  sh -c 'IFS=$(echo " nt"); printf "%s" "$IFS"|xxd' # Works.


                  3.- A simple? solution:



                  If there are child scripts that need IFS correctly set, you could always manually write:




                  IFS='
                  '


                  Where the sequence manually typed was: IFS='spacetabnewline', sequence which has actually been correctly typed above (If you need to confirm, edit this answer). But a copy/paste from your browser will break because the browser will squeeze/hide the whitespace. It makes it difficult to share the code as written above.



                  4.- Complete solution.



                  To write code that can be safely copied usually involves unambiguous printable escapes.



                  We need some code that "produces" the expected value. But, even if conceptually correct, this code will NOT set a trailing n:



                  sh -c 'IFS=$(echo " tn"); printf "%s" "$IFS"|xxd' # wrong.


                  That happens because, under most shells, all trailing newlines of $(...) or `...` command substitutions are removed on expansion.



                  We need to use a trick for sh:



                  sh -c 'IFS="$(printf " tnx")"; IFS="$IFS%x"; printf "$IFS"|xxd' # Correct.


                  An alternative way may be to set IFS as an environment value from bash (for example) and then call sh (the versions of it that accept IFS to be set via the environment), as this:



                  env IFS=$' tn' sh -c 'printf "%s" "$IFS"|xxd'


                  In short, sh makes resetting IFS to default quite an odd adventure.



                  Q4: In actual code:




                  Finally, how would this code:



                  while IFS= read -r line
                  do
                  echo $line
                  done < /path_to_text_file


                  behave if we we change the first line to



                  while read -r line # Use the default IFS value


                  or to:



                  while IFS=' ' read -r line



                  First: I do not know whether the echo $line (with the variable NOT quoted) is there on purpose or not.
                  It introduces a second level of 'field splitting' that read does not have.
                  So I'll answer both. :)

                  You can confirm with this code (it needs the xxd utility):



                  #!/bin/ksh
                  # Correctly set IFS as described above.
                  defIFS="$(printf " \t\nx")"; defIFS="${defIFS%x}";
                  IFS="$defIFS"
                  printf "IFS value: "
                  printf "%s" "$IFS"| xxd -p

                  # Three spaces before, between and after the words; l is the length of $a.
                  a='   bar   baz   quz   '; l="${#a}"
                  printf "var value : %${l}s-" "$a" ; printf "%s\n" "$a" | xxd -p

                  printf "%s\n" "$a" | while IFS='x' read -r line; do
                  printf "IFS --x-- : %${l}s-" "$line" ;
                  printf "%s" "$line" |xxd -p; done;

                  printf 'Values quoted :\n' "" # With values quoted:
                  printf "%s\n" "$a" | while IFS='' read -r line; do
                  printf "IFS null quoted : %${l}s-" "$line" ;
                  printf "%s" "$line" |xxd -p; done;

                  printf "%s\n" "$a" | while IFS="$defIFS" read -r line; do
                  printf "IFS default quoted : %${l}s-" "$line" ;
                  printf "%s" "$line" |xxd -p; done;

                  unset IFS; printf "%s\n" "$a" | while read -r line; do
                  printf "IFS unset quoted : %${l}s-" "$line" ;
                  printf "%s" "$line" |xxd -p; done;
                  IFS="$defIFS" # set IFS back to default.

                  printf "%s\n" "$a" | while IFS=' ' read -r line; do
                  printf "IFS space quoted : %${l}s-" "$line" ;
                  printf "%s" "$line" |xxd -p; done;

                  printf '%s\n' "Values unquoted :" # Now with values unquoted:
                  printf "%s\n" "$a" | while IFS='x' read -r line; do
                  printf "IFS --x-- unquoted : "
                  printf "%s, " $line; printf "%s," $line |xxd -p; done

                  printf "%s\n" "$a" | while IFS='' read -r line; do
                  printf "IFS null unquoted : ";
                  printf "%s, " $line; printf "%s," $line |xxd -p; done

                  printf "%s\n" "$a" | while IFS="$defIFS" read -r line; do
                  printf "IFS defau unquoted : ";
                  printf "%s, " $line; printf "%s," $line |xxd -p; done

                  unset IFS; printf "%s\n" "$a" | while read -r line; do
                  printf "IFS unset unquoted : ";
                  printf "%s, " $line; printf "%s," $line |xxd -p; done
                  IFS="$defIFS" # set IFS back to default.

                  printf "%s\n" "$a" | while IFS=' ' read -r line; do
                  printf "IFS space unquoted : ";
                  printf "%s, " $line; printf "%s," $line |xxd -p; done


                  I get:



                  $ ./stackexchange-Understanding-IFS.sh
                  IFS value: 20090a
                  var value :    bar   baz   quz   -20202062617220202062617a20202071757a2020200a
                  IFS --x-- :    bar   baz   quz   -20202062617220202062617a20202071757a202020
                  Values quoted :
                  IFS null quoted :    bar   baz   quz   -20202062617220202062617a20202071757a202020
                  IFS default quoted :       bar   baz   quz-62617220202062617a20202071757a
                  IFS unset quoted :       bar   baz   quz-62617220202062617a20202071757a
                  IFS space quoted :       bar   baz   quz-62617220202062617a20202071757a
                  Values unquoted :
                  IFS --x-- unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
                  IFS null unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
                  IFS defau unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
                  IFS unset unquoted : bar, baz, quz, 6261722c62617a2c71757a2c
                  IFS space unquoted : bar, baz, quz, 6261722c62617a2c71757a2c


                  The first value is just the correct value of IFS: <space><tab><newline> (hex 20 09 0a).

                  The next line shows all the hex values that the variable $a holds, plus a newline (0a) at the end, since that is what is given to each read command.

                  The next two lines (IFS set to x, then to null) keep every space: nothing is trimmed, and only the newline is removed (as expected).

                  In the next three lines, as IFS contains a space, the leading and trailing spaces are removed and the variable line is set to what remains.

                  The last five lines show what an unquoted variable does. The value is split on the (several) spaces and printed as: bar, baz, quz,
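The three behaviours described above can be reduced to a shorter check. This is a minimal sketch, assuming a POSIX sh started with the default IFS; the variable names input, kept, trimmed and split are illustrative only.

```shell
#!/bin/sh
input='   bar   baz   quz   '

# IFS= (null) for read: leading and trailing blanks are kept.
kept=$(printf '%s\n' "$input" | { IFS= read -r line; printf '[%s]' "$line"; })

# Default IFS for read: leading and trailing blanks are trimmed.
trimmed=$(printf '%s\n' "$input" | { read -r line; printf '[%s]' "$line"; })

# Unquoted $line: a second round of field splitting happens where it is used,
# because the IFS= assignment applied only to the read command.
split=$(printf '%s\n' "$input" | { IFS= read -r line; printf '[%s]' $line; })

printf '%s\n' "$kept" "$trimmed" "$split"
```

The brackets make the surviving blanks visible, so the difference between what read stores and what the unquoted expansion produces is easy to see.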






                    edited Aug 23 '15 at 8:48

                    answered Aug 6 '15 at 23:13

                    user79743



























                        unset IFS does clear IFS, even if IFS is thereafter presumed to be " \t\n":



                        $ echo "'$IFS'"
                        '
                        '
                        $ IFS=""
                        $ echo "'$IFS'"
                        ''
                        $ unset IFS
                        $ echo "'$IFS'"
                        ''
                        $ IFS=$' \t\n'
                        $ echo "'$IFS'"
                        '
                        '
                        $


                        Tested on bash versions 4.2.45 and 3.2.25 with the same behavior.
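In the session above, echo prints the same thing for an empty IFS and an unset IFS, so it cannot tell the two apart. A minimal sketch of a check that can, assuming a POSIX sh (the function name ifs_state is hypothetical):

```shell
#!/bin/sh
# ${IFS+set} expands to "set" when IFS exists (even if empty) and to nothing when it is unset.
ifs_state() {
    if [ -z "${IFS+set}" ]; then
        echo unset
    elif [ -z "$IFS" ]; then
        echo empty
    else
        echo non-empty
    fi
}

# Each case runs in its own subshell, so the current shell's IFS is left alone.
state_default=$(ifs_state)
state_empty=$(IFS=''; ifs_state)
state_unset=$(unset IFS; ifs_state)

printf '%s %s %s\n' "$state_default" "$state_empty" "$state_unset"
```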






                            answered Sep 17 '13 at 19:48









                            derekm





















                                Q1 Splitting




                                Q1. Is field splitting the same as word splitting ?




                                Probably, but with a caveat.

                                A parameter expansion, as it is called in POSIX, ksh, bash or zsh (and others), is subject to "Field Splitting", as POSIX calls it (a.k.a. "Field Splitting" in ksh, "Word Splitting" in bash, and sometimes "field" and sometimes "word" splitting in zsh).



                                I would define it as:



                                The process of splitting a parameter that is done using the IFS characters.



                                Here "parameter" means "the value (contents) of a variable", and the use of IFS characters may differ in zsh, which also has an s:string: expansion flag that does "field splitting" on string.



                                caveat

                                However, there is a process of splitting called "Token Recognition", as defined by POSIX, that splits command lines into words (tokens), mostly using blanks (tabs and spaces) and some other rules. That tokens are subsequently (immediately) called "words" is shown in the alias description (for example):




                                After a token has been delimited, … , a resulting word …




                                As explained in ksh manual page:




                                Command Syntax
                                The shell begins parsing its input by breaking it into words. Words, which are sequences of characters, are delimited by unquoted white space characters (space, tab and newline) or meta-characters (<, >, |, ;, &, ( and )).




                                It is also explicitly defined in the bash manual like this:




                                word A sequence of characters considered as a single unit by the shell. Also known as a token.




                                Or this:




                                word A sequence of characters treated as a unit by the shell. Words may not include unquoted metacharacters.




                                That is "word splitting" in layman's terms.




                                Q2 IFS null




                                Q2: Is setting IFS= the same as setting IFS to null? Is this what is meant by setting it to an empty string too?




                                An unset variable doesn't exist. A set variable exists but may be empty. If this empty value is called "null" (as opposed to "NUL" or '0x00' or '\0'), then yes, all three are equivalent.



                                The variable is set but empty. var= ≡ var='' ≡ var="".




                                Q3 unset IFS




                                Q3: In the POSIX specification, I read the following:



                                If IFS is not set, the shell shall behave as if the value of IFS is <space>, <tab> and <newline>




                                Yes, the shell shall behave that way, in the sense that the effects IFS has (mainly on "word splitting" and on the read command) are the same after an unset IFS as with the default value.

                                That is not the same as believing that an unset variable acts like a set variable. Specifically, if you have:



                                $ unset a
                                $ b=$a


                                The variable a is unset (it doesn't exist), whereas b is set to null, as described in the previous question. And this will also be true:

                                $ echo "\"${a-a is UN-set}\" \"${b-b is UN-set}\""
                                "a is UN-set" ""


                                That is important in the case where this is done:



                                $ unset IFS
                                $ oldIFS=$IFS


                                The variable oldIFS is now set (but IFS is unset), so trying to restore IFS by doing:



                                $ IFS=$oldIFS


                                Will end with an IFS set to null, not unset. Its effects will be different.



                                The solution is to make sure that oldIFS ends up in the same state as IFS, set or unset:



                                $ [ "$(set | grep '^IFS=')" ] && oldIFS=$IFS || unset oldIFS;


                                If IFS is set, oldIFS gets its value; otherwise oldIFS is unset.

                                Restore with the same procedure, swapping the variables:



                                $ [ "$(set | grep '^oldIFS=')" ] && IFS=$oldIFS || unset IFS;
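As an alternative sketch (not the method used above), the same save and restore can be written with the `${var+word}` expansion instead of parsing the output of `set`, which avoids the external `grep`. The demo values (`fields`, the temporary `IFS=:`) are assumptions chosen for the example:

```shell
unset IFS                       # start in the "unset" state for the demo

# Save: copy the value only if IFS exists, otherwise make oldIFS unset too.
if [ "${IFS+set}" ]; then oldIFS=$IFS; else unset oldIFS; fi

IFS=:                           # temporary change
fields='a:b:c'
set -- $fields
count=$#                        # 3 fields while IFS is ":"

# Restore: the same test with the variables swapped.
if [ "${oldIFS+set}" ]; then IFS=$oldIFS; else unset IFS; fi

state=${IFS+set}                # empty: IFS is unset again, not null
printf 'fields=%s IFS-state=[%s]\n' "$count" "$state"
# prints: fields=3 IFS-state=[]
```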



                                Q3 reset IFS




                                Q3 Say I want to restore the default value of IFS. How do I do that? (more specifically, how do I refer to <tab> and <newline>?)




                                The only real problem is the newline at the end. The old, simple way to get it is:



                                nl='
                                '


                                Yes, a real newline. For a full IFS of <space><tab><newline>, any of these works:



                                # three alternatives; the last one needs bash, ksh or zsh
                                IFS=" $(printf '\t')$nl"
                                eval "$(printf "IFS=' \t\n'")"
                                IFS=$' \t\n'
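The reason the trailing newline is the awkward part is that command substitution strips trailing newlines. A short sketch that shows the effect and one common workaround (append a guard character, then remove it); the names `naive`, `full` and `guarded` are hypothetical:

```shell
naive=$(printf ' \t\n')          # the trailing newline is stripped

nl='
'
full=" $(printf '\t')$nl"        # space, tab, and a literal newline

guarded=$(printf ' \t\nx')       # "x" protects the newline from stripping
guarded=${guarded%x}             # drop the guard character again

printf 'naive=%s full=%s guarded=%s\n' "${#naive}" "${#full}" "${#guarded}"
# prints: naive=2 full=3 guarded=3
```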



                                Q4 IFS on read



                                Q4: Finally, how would this code:




                                while IFS= read -r line ...




                                This reads one line (up to a newline character) and assigns it, without the trailing newline, to the variable line. Neither word splitting nor removal of leading or trailing white space is performed.




                                while read -r line # Use the default IFS value




                                With the default IFS ( \t\n), leading and trailing white space on the whole line is trimmed first. Then each delimiter (or run of consecutive delimiters) is used to divide the line among the variables: two variables need one delimiter that is neither leading nor trailing, and each additional variable requires one more delimiter (or run of delimiters). With a single variable, as here, nothing is split and only the trimming takes effect.




                                while IFS=' ' read -r line




                                Leading and trailing runs of spaces are removed, and each run of spaces is used to split the line in as many places as the variables require. Tabs are no longer delimiters, so they are kept as they are.
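The three variants can be compared side by side on one sample line. This is a sketch for a POSIX-style shell; the sample text (a leading tab, two inner spaces, two trailing spaces) and the variable names are assumptions made for the illustration:

```shell
unset IFS                              # make sure the default behaviour applies
sample=$(printf '\tone  two  ')        # tab, "one", 2 spaces, "two", 2 spaces

IFS= read -r raw <<EOF
$sample
EOF

read -r trimmed <<EOF
$sample
EOF

IFS=' ' read -r spaces_only <<EOF
$sample
EOF

read -r first rest <<EOF
$sample
EOF

# raw keeps everything; trimmed loses the tab and the trailing spaces;
# spaces_only loses only the trailing spaces; first/rest get one word each.
printf '[%s]\n' "$raw" "$trimmed" "$spaces_only" "$first" "$rest"
```

Because `read` is a regular built-in, the `IFS=` prefix applies only to that one command and the shell's own IFS is left untouched.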






                                    answered yesterday









                                    Isaac
