Unexpected split behavior

I read about Awk split behavior here:




[...] the fs argument to the split function (see String Functions) shall
be interpreted as extended regular expressions. These can be either ERE
tokens or arbitrary expressions, and shall be interpreted in the same manner
as the right-hand side of the ~ or !~ operator.




and:




If the right-hand operand is any expression other than the lexical token
ERE, the string value of the expression shall be interpreted as an
extended regular expression, including the escape conventions described above.




http://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html#tag_20_06_13_04



However I have noticed an unexpected result, with this code:



BEGIN {
  print split("te.st", q, ".")
}



I would expect the . to represent any character, and for the result to be 6.
However all my tests returned 2. Running this code gives the expected 6:



BEGIN {
  print split("te.st", q, /./)
}



Tested with:



  • gawk

  • gawk --posix

  • mawk 1.3.4

  • mawk 1.3.3

  • nawk (original-awk)

Am I misunderstanding the documentation or is this an error?










  • It is usual for a regex to be entered in /.../ format. This is also described in the gawk documentation here (scroll down to the split function): gnu.org/software/gawk/manual/html_node/String-Functions.html
    – George Vasiliou
    Nov 24 at 15:25










  • I've submitted a bug report to GNU awk and its docs have been changed to mention this behavior explicitly.
    – mosvy
    Nov 27 at 12:03















awk posix gawk mawk






asked Nov 24 at 14:41 by Steven Penny
edited Nov 27 at 12:35 by ilkkachu











1 Answer
This is not an error; the standard simply isn't clear enough in its attempt to codify existing practice.



The mawk(1) manual is more explicit:




split(expr, A, sep) works as follows:



...



(2) If sep = " " (a single space), then <SPACE> is trimmed from the
front and back of expr, and sep becomes <SPACE>. mawk defines
<SPACE> as the regular expression /[ \t\n]+/.
Otherwise sep is treated as a regular expression, except that
meta-characters are ignored for a string of length 1, e.g.,
split(x, A, "*") and split(x, A, /\*/) are the same.




Also, the GNU awk manual from the current sources:




split(s, a [, r [, seps] ])



...



Splitting behaves identically to field splitting, described above.
In particular, if r is a single-character string, that string acts as
the separator, even if it happens to be a regular expression
metacharacter.




This is the description of field splitting from the SUSv4 standard:




An extended regular expression can be used to separate fields by assigning a
string containing the expression to the built-in variable FS, either
directly or as a consequence of using the -F sepstring option. The
default value of the FS variable shall be a single <space>. The
following describes FS behavior:



  1. If FS is a null string, the behavior is unspecified.


  2. If FS is a single character:



    a. If FS is <space>, skip leading and trailing <blank> and
    <newline> characters; fields shall be delimited by sets of one or
    more <blank> or <newline> characters.



    b. Otherwise, if FS is any other character c, fields shall be delimited
    by each single occurrence of c.



  3. Otherwise, the string value of FS shall be considered to be an
    extended regular expression. Each occurrence of a sequence matching the
    extended regular expression shall delimit fields.




Your example matches 2.b.
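As a rough illustration of rule 2.b versus rule 3, the following shell sketch runs the question's split call with three different separators. It assumes a POSIX-style awk is available on PATH; the shell variable names (literal, regex, grouped) are made up for this example.

```shell
#!/bin/sh
# Assumption: `awk` on PATH is a POSIX-style awk (gawk, mawk, nawk, ...).

# One-character string: rule 2.b applies, the dot is a literal separator.
literal=$(awk 'BEGIN { print split("te.st", q, ".") }')

# ERE token: the dot is a regular expression matching any character.
regex=$(awk 'BEGIN { print split("te.st", q, /./) }')

# Three-character string: rule 3 applies, the string is used as an ERE.
grouped=$(awk 'BEGIN { print split("te.st", q, "(.)") }')

echo "literal=$literal regex=$regex grouped=$grouped"
```

With the implementations listed in the question, this should print literal=2 regex=6 grouped=6: only the one-character string is taken literally.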



Although that text explicitly mentions only FS, all awk implementations apply the same rules to whatever is passed as the 3rd argument to split, including the case where that argument is a single space.
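The parallel between field splitting and split can be checked with a small sketch like the one below. It assumes a POSIX-style awk on PATH; the sample string and the shell variable names are hypothetical and chosen only for this illustration.

```shell
#!/bin/sh
# Assumption: `awk` on PATH is a POSIX-style awk.
s='  a  b  '

# Single space as 3rd argument: leading and trailing blanks are skipped
# and runs of blanks act as one separator (rule 2.a).
split_space=$(awk -v s="$s" 'BEGIN { print split(s, q, " ") }')

# The same input through default field splitting (FS is a single space).
field_space=$(printf '%s\n' "$s" | awk '{ print NF }')

# "[ ]" is three characters long, so it is an ERE matching one space (rule 3).
split_regex=$(awk -v s="$s" 'BEGIN { print split(s, q, "[ ]") }')
field_regex=$(printf '%s\n' "$s" | awk -F'[ ]' '{ print NF }')

echo "space: split=$split_space NF=$field_space"
echo "regex: split=$split_regex NF=$field_regex"
```

Both forms should agree: 2 pieces for the single space, and 7 for the bracket expression, since each of the six spaces then delimits a (mostly empty) field.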



It's unlikely that this behavior will ever change, because the FS variable is just a string (awk doesn't have regexp objects, unlike JavaScript or Perl; you cannot assign a regexp to a variable, as in a=/./ or $a=qr/./); it's the split function (called either implicitly or explicitly) that interprets its argument as described above.
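A short sketch of what this means in practice, assuming a POSIX-style awk on PATH (the shell variable names are invented for the example): an ERE token on the right of an assignment is evaluated as a match against $0 rather than stored, while a separator held in a variable stays a plain string until split interprets it.

```shell
#!/bin/sh
# Assumption: `awk` on PATH is a POSIX-style awk.

# In an assignment, /./ is not stored as a regexp; it is evaluated as ($0 ~ /./).
in_begin=$(awk 'BEGIN { a = /./; print a }')            # $0 is empty in BEGIN
on_record=$(printf 'x\n' | awk '{ a = /./; print a }')  # $0 is "x"

# A separator kept in a variable is a string; split decides how to read it.
one_char=$(awk 'BEGIN { sep = "."; print split("te.st", q, sep) }')
as_regex=$(awk 'BEGIN { sep = "(.)"; print split("te.st", q, sep) }')

echo "in_begin=$in_begin on_record=$on_record one_char=$one_char as_regex=$as_regex"
```

The variable a ends up holding 0 or 1, not a pattern, and the two split calls differ only in the length of the string they receive.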



The origin of this behavior may be compatibility with the "old" awk, where FS (or the 3rd argument to split) was always treated as a single character. Example (on Unix V7):



$ awk 'BEGIN{FS="."; print split("foo.bar.baz", a, "bar"); print a[2]}'
3
ar.
$ awk 'BEGIN{FS="."; print split("foo.bar.baz", a, /bar/); print a[2]}'
awk: syntax error near line 1
awk: illegal statement near line 1
Bus error - core dumped





  • Not that using . as a regex to split on makes much sense anyway: it would destroy the string and return the number of characters plus one, and there are easier ways to get that. I don't think there are other single-character REs that would make sense here either (but I might be wrong).
    – ilkkachu
    Nov 27 at 12:52






  • awk -F'|', awk -F., awk -F+, awk -F'$', awk -F'^', awk -F'' are common. Note that in the original awk, awk -Ft meant split on tab (back then, as you say, only the first character was used). That's one case where backward compatibility was not maintained with nawk. Here, with POSIX awk, you can always use FS = "(.)" for FS to be the regexp matching a single character (.{1} would also work but is less portable).
    – Stéphane Chazelas
    Nov 27 at 13:00











answered Nov 24 at 15:13 by mosvy
edited Nov 27 at 12:26






