AWK / String manipulation: How to pull strings out from a column and compare it with a number before printing the row

I have a list of data in a table. Using awk to pull out column 5, I was able to differentiate the data among the rows. If the entry in column 5 is more than 4, the row should be printed.



However, one entry comes in the form of a string that cannot be directly compared to a number when deciding whether the row should be printed. That entry has an opening parenthesis in front of the number that should be compared.



Here is an example of column 5:



on
%)
%
replica
(

0
(100.0 <= this one
0.0
10.8
13.8
12.0
16.3
13.2
12.1
11.4
10.4
0.0
devices:


From the example above, I am supposed to print rows 8 and 10 to 17 of the table.



Here is an example table (file.txt):



1 0 0 0 on
2 0 0 0 %)
3 0 0 0 %
4 0 0 0 replica
5 0 0 0 (
6 0 0 0
7 0 0 0 0
8 0 0 0 (100.0
9 0 0 0 0.0
1 0 0 0 10.8
1 1 0 0 13.8
1 2 0 0 12.0
1 3 0 0 16.3
1 4 0 0 13.2
1 5 0 0 12.1
1 6 0 0 11.4
1 7 0 0 10.4
1 8 0 0 0.0
1 9 0 0 devices:


My attempt:



awk '{if (($5>=4)) print;
else
NUMBER=($5 | grep -o -E '[0-9]+');
if (($NUMBER>=4)) print' file.txt


Error:



awk: syntax error near line 2
awk: illegal statement near line 2
awk: syntax error near line 3
awk: illegal statement near line 3






edited Jun 8 at 2:30
asked Jun 7 at 10:59
tthhss

  • I may not have the gsub command in my system
    – tthhss
    Jun 8 at 2:25

1 Answer

You could strip off the non-numeric characters before comparing:



$ awk '{x=$5; gsub(/[^0-9.]/,"",x)} x+0>=4' file.txt
8 0 0 0 (100.0
1 0 0 0 10.8
1 1 0 0 13.8
1 2 0 0 12.0
1 3 0 0 16.3
1 4 0 0 13.2
1 5 0 0 12.1
1 6 0 0 11.4
1 7 0 0 10.4
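
For readers who want the mechanics spelled out, the one-liner above can be sketched in a longer, commented form. This is only an illustration: sample.txt is a made-up file holding a few of the rows from the question, and result is just a shell variable used to capture the output.

```shell
# Build a small sample file (hypothetical name) with a subset of the question's rows.
cat > sample.txt <<'EOF'
5 0 0 0 (
7 0 0 0 0
8 0 0 0 (100.0
9 0 0 0 0.0
1 0 0 0 10.8
1 9 0 0 devices:
EOF

result=$(awk '
{
    x = $5                    # work on a copy so the printed row keeps its original text
    gsub(/[^0-9.]/, "", x)    # delete every character that is not a digit or a dot
}
x + 0 >= 4                    # adding 0 forces a numeric comparison; a true pattern prints the row
' sample.txt)

printf '%s\n' "$result"
```

The action block runs for every row and only prepares x. The bare pattern on the last line has no action of its own, so awk falls back to printing the whole row when it is true.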





  • Hmm, what if I have other conditions, such as printing when $1=="PM"? Can I do this: awk '{if (($1=="PM")) print; x=$5; gsub(/[^0-9.]/,"",x)} x+0>=4' file.txt? I received a syntax error.
    – tthhss
    Jun 7 at 11:19

  • Yes that should work - although it would be more vernacular to add it as a rule like x+0>=4 || $1=="PM" (if a rule evaluates TRUE, the default action is print)
    – steeldriver
    Jun 7 at 11:23
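
A minimal sketch of the combined rule suggested in this comment might look like the following. The file sample_pm.txt and its PM/AM values in column 1 are invented for illustration, since the question never shows such rows.

```shell
# Hypothetical input: column 1 holds a label, column 5 the value to test.
cat > sample_pm.txt <<'EOF'
PM 0 0 0 0.0
AM 0 0 0 (100.0
AM 0 0 0 2.5
AM 0 0 0 devices:
EOF

result_pm=$(awk '
{ x = $5; gsub(/[^0-9.]/, "", x) }    # clean a copy of column 5
x + 0 >= 4 || $1 == "PM"              # print when either condition holds
' sample_pm.txt)

printf '%s\n' "$result_pm"
```

Because both conditions sit in one pattern, a row that satisfies both is still printed only once, which would not be the case with a separate print statement inside the action block.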











  • For some reason, the line of code you provided is giving me syntax errors. I am thinking perhaps I do not have the gsub command. May I ask what other commands I can explore? I tried egrep -v '(' , but it doesn't work.
    – tthhss
    Jun 8 at 3:17

  • It works now! I just had to use gawk instead of awk for gsub to work. Thank you for your help.
    – tthhss
    Jun 8 at 5:48
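
If switching to gawk is not an option, the same filter can probably be written without gsub at all. The sketch below is an alternative under stated assumptions, not something from the answer: it strips leading characters one at a time with substr until the value starts with a digit or a dot, then relies on awk's string-to-number conversion, which ignores trailing non-numeric text. It uses only length, substr, and a regex match; whether a particular legacy awk accepts it is untested here. The file name sample_old.txt is made up.

```shell
# Hypothetical sample with a few rows from the question, including the row with no fifth field.
cat > sample_old.txt <<'EOF'
5 0 0 0 (
6 0 0 0
8 0 0 0 (100.0
9 0 0 0 0.0
1 3 0 0 16.3
1 9 0 0 devices:
EOF

result_old=$(awk '
{
    x = $5
    # Drop leading characters until x starts with a digit or a dot.
    while (length(x) > 0 && substr(x, 1, 1) !~ /[0-9.]/)
        x = substr(x, 2)
    # x + 0 converts the leading numeric part; an empty string becomes 0.
    if (x + 0 >= 4)
        print
}' sample_old.txt)

printf '%s\n' "$result_old"
```

Unlike the gsub version, this only removes characters in front of the number, which is enough for the "(100.0" case in the question.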










  • @tthhss out of interest, what awk do you have? I tested the above with both gawk and mawk before posting
    – steeldriver
    Jun 8 at 6:59
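
One way to approach that kind of question without knowing the awk's name or version is to probe for the feature directly. The following is a hedged sketch: probe and gsub_status are hypothetical variable names, and the check only tells you whether the first awk on PATH can run a gsub call.

```shell
# Run a tiny BEGIN-only program; an awk without gsub should fail or print something else.
probe=$(awk 'BEGIN { s = "(100.0"; gsub(/[^0-9.]/, "", s); print s }' 2>/dev/null)

if [ "$probe" = "100.0" ]; then
    gsub_status="gsub available"
else
    gsub_status="gsub missing or awk failed"
fi

echo "$gsub_status"
```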










answered Jun 7 at 11:11
steeldriver