Delimit by space but ignore backslash space

    5678
    testing,\ group
    [testing
    ip\ 5.6.7.8
    launch-wizard-1 0.0.0.0/0
    456dlkjfa
    1.2.3.4
    test 1.2.3.4/32 4.3.2.0/23 4.3.2.0/23
    default 4.3.2.0/23 4.3.2.0/23
    launch-wizard-2 0.0.0.0/0
    launch-wizard-3 0.0.0.0/0
    2.3.4.5/32

I would like to get the first column of the above, but the catch is that I need to treat `\ ` (backslash space) as part of the column, so

    awk '{print $1}'

should give me

    5678
    testing,\ group
    [testing
    ip\ 5.6.7.8
    launch-wizard-1
    456dlkjfa
    1.2.3.4
    test
    default
    launch-wizard-2
    launch-wizard-3
    2.3.4.5/32

Tags: text-processing, awk, sed

asked Sep 25 at 15:00 by GypsyCosmonaut; edited Sep 25 at 15:54 by αғsнιη

Is `\` being treated as an escape character always, or is only `\ ` special? For instance, is `a\\ b` one field or two?
– Gregory Nisbet, Sep 25 at 23:49

@GregoryNisbet The `\` I've put in is an escape character, not part of the real data.
– GypsyCosmonaut, Sep 26 at 0:13

If your data happened to contain a real backslash, how would it be represented?
– Gregory Nisbet, Sep 26 at 0:16

@GregoryNisbet Good question. Because I replaced only `[[:space:]]` with `\[[:space:]]`, any `\` in the original data is left untouched in its place. After getting the first column, delimited only by plain spaces and not by `\[[:space:]]`, I'd replace `\[[:space:]]` with `[[:space:]]` and be left with the original data again, which still has its `\`.
– GypsyCosmonaut, Sep 26 at 0:31

4 Answers

With GNU awk (`gawk`) you can use some zero-length assertions like `\<` or `\>`:

    $ echo 'a\ b c' | gawk 'BEGIN{FS="\\> +"} {print $1}'
    a\ b

Here `\>` matches the empty string at the end of a word, so a run of spaces only counts as a separator when it directly follows a word character, which a backslash is not.

Unfortunately gawk does not support the full-blown assertions from `perl` or `pcre` (e.g. `(?<!\\)`, `(?<=\w)`, etc.):

    $ echo 'a\ b, c' | perl -nle '@a=split /(?<!\\)\s+/, $_; print $a[0]'
    a\ b,

answered Sep 25 at 16:55 by mosvy; edited Sep 25 at 22:03 (score 10, accepted)

You could substitute `\ ` (backslash space) with something else and change it back again afterwards. Here the placeholder is the literal text `\x20`:

    sed 's/\\ /\\x20/g' data_file | awk '{ print $1; }' | sed 's/\\x20/\\ /g'

answered Sep 25 at 15:20 by RoVo (score 6)

Only with sed: `sed 's/\\ /\\x20/g;s/ .*//;s/\\x20/\\ /g' data_file`
– ctac_, Sep 25 at 16:26

Or, awk, using the default SUBSEP variable value of `\034`: `awk '{gsub(/\\ /,SUBSEP,$0); val=$1; gsub(SUBSEP,"\\ ",val); print val}' file`
– glenn jackman, Sep 25 at 18:29
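
A minimal sketch of the placeholder round trip, run on a few lines of the question's sample data. It assumes a POSIX shell with `sed`; the function name `first_column` and the placeholder string `@SP@` are hypothetical, and the sketch only works if `@SP@` never occurs in the real data.

```shell
# first_column: print the first space-delimited field of each input line,
# treating backslash-space as part of the field.
# Assumption: the placeholder string @SP@ never occurs in the data.
first_column() {
    # 1. hide every backslash-space behind the placeholder
    # 2. cut the line at the first remaining (unescaped) space
    # 3. turn the placeholder back into backslash-space
    sed -e 's/\\ /@SP@/g' -e 's/ .*//' -e 's/@SP@/\\ /g'
}

printf '%s\n' '5678' 'testing,\ group' 'ip\ 5.6.7.8' 'launch-wizard-1 0.0.0.0/0' |
    first_column
```

This should print `5678`, `testing,\ group`, `ip\ 5.6.7.8` and `launch-wizard-1`, one per line. A control character such as awk's `SUBSEP` is a safer placeholder when the data may contain arbitrary printable text.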

With just `sed`:

    sed -r 's/^((([^\]*\\ ){1,})?[^ ]*).*/\1/' infile

Or shorter:

    sed -r 's/^(([^\]*\\ )*[^ ]*).*/\1/' infile

The pattern `(([^\]*\\ ){1,})?[^ ]*` matches as follows:

- `[^\]*\\ `: any run of characters that are not a backslash, ending with a backslash followed by a space (note that `\` does not need to be escaped inside a character class, but it does outside).
- `([^\]*\\ ){1,}`: one or more occurrences of the above.
- `(([^\]*\\ ){1,})?`: the `(...)?` makes this optional; we could use `([^\]*\\ ){0,}` instead, or `([^\]*\\ )*`.
- `((([^\]*\\ ){1,})?[^ ]*)`: the optional part above followed by any run of non-space characters, captured as a group with `\1` as its back-reference.
- `((([^\]*\\ ){1,})?[^ ]*).*`: the captured `(...)` group followed by everything else (`.*`).

The replacement part then prints just `\1`, which gives the output:

    5678
    testing,\ group
    [testing
    ip\ 5.6.7.8
    launch-wizard-1
    456dlkjfa
    1.2.3.4
    test
    default
    launch-wizard-2
    launch-wizard-3
    2.3.4.5/32

answered Sep 25 at 15:53 by αғsнιη; edited Sep 25 at 16:00 (score 5)
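
To see the pattern at work, the sketch below runs the shorter expression over three of the question's lines. It assumes a `sed` that accepts `-E` for extended regular expressions (GNU sed treats it as a synonym for `-r`); the wrapper name `first_field_sed` is hypothetical.

```shell
# first_field_sed: keep only the first column, where backslash-space
# counts as part of the column (the shorter expression from the answer).
first_field_sed() {
    sed -E 's/^(([^\]*\\ )*[^ ]*).*/\1/'
}

# 'ip\ 5.6.7.8'  : one escaped space, no other columns
# 'test 1.2...'  : no backslash, so the group repeats zero times
# '2.3.4.5/32'   : single column, returned unchanged
printf '%s\n' 'ip\ 5.6.7.8' 'test 1.2.3.4/32 4.3.2.0/23' '2.3.4.5/32' |
    first_field_sed
```

This should print `ip\ 5.6.7.8`, `test` and `2.3.4.5/32`. Note that `[^\]*` can also run across unescaped spaces, so the expression relies on `\ ` appearing only inside the first column, as it does in the sample data.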

With GNU `grep` or compatible:

    grep -Po '^(\\.|\S)*'

Or with ERE:

    grep -Eo '^(\\.|[^\[:space:]])*'

That treats `\` as a quoting operator, for whitespace as a delimiter, but also for itself. That is, on `foo\\ bar` input, it returns `foo\\`.

answered Sep 25 at 22:03 by Stéphane Chazelas (score 5)
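
A small sketch contrasting an escaped space with an escaped backslash, using the ERE form. It assumes a `grep` that supports `-o` (not specified by POSIX, though GNU grep provides it); the function name `first_token` is hypothetical.

```shell
# first_token: print the leading token of each line, where a backslash
# quotes whatever character follows it (a space or another backslash).
first_token() {
    grep -Eo '^(\\.|[^\[:space:]])*'
}

# The backslash quotes the space, so the token continues past it.
printf '%s\n' 'foo\ bar baz' | first_token

# The first backslash quotes the second, so the space ends the token.
printf '%s\n' 'foo\\ bar' | first_token
```

The first call should print `foo\ bar` and the second `foo\\`. Whether the second behaviour is wanted depends on how a real backslash is represented in the data, which is the point raised in the comments on the question.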