searching multiple files for a line with bigger number in column 3 of matched lines
I have multiple files with contents similar to:
main file1:
test01:6733:4370:5342
test02:7776:2018:1001
test03:9865:5632:1429
test04:8477:4757:1890
test05:8019:8860:5298
test06:5602:3100:6995
test07:1445:2850:2755
test08:10924:2562:4867
test09:2575:1884:1611
sample file2:
test01:8777:1060:9236
test02:1322:1211:10837
test04:3737:10175:5219
test05:8467:8988:9739
test06:7452:3100:2709
test08:4707:9047:10578
test09:8669:2867:8233
test10:8615:10002:7056
sample file3:
test01:10957:8172:2472
test02:1401:6160:5894
test03:7245:8934:5725
test04:8477:10106:10069
test05:10769:10381:1102
test06:3605:3713:7695
test08:10924:2562:10568
test09:2913:5628:1305
test10:5501:10293:2319
I want to update each line in the main file with the line from another file that has the same first column and the biggest number in the 3rd column across all the files.
Only first-column values present in the main file should be considered (test## keys that exist in the other files but not in the main file should be ignored).
When several lines in the other files qualify (a bigger number in the 3rd column, but the same one in each), any one of them can be used to update the main file.
Here is my non-optimal solution:
$ awk -F: '{print $1,$3}' main|while read a b;do grep ^$a: main file*|sort -t":" -rnk4|awk -F: -vb=$b '{if($4>b){print $0;next}else{print ($1=="main")? $0 : NULL}}'|head -1;done
file3:test01:10957:8172:2472
file3:test02:1401:6160:5894
file3:test03:7245:8934:5725
file2:test04:3737:10175:5219
file3:test05:10769:10381:1102
file3:test06:3605:3713:7695
main:test07:1445:2850:2755
file2:test08:4707:9047:10578
file3:test09:2913:5628:1305
How can I process all such files in awk at once and do the job without the while loop and the many pipes that I have in my command?
Update:
@RomanPerekhrest, thank you for your awesome code. How can I also add an :updated suffix to all lines that come from the other files? I'd like to have something like:
test01:10957:8172:2472:updated
test02:1401:6160:5894:updated
test03:7245:8934:5725:updated
test04:3737:10175:5219:updated
test05:10769:10381:1102:updated
test06:3605:3713:7695:updated
test07:1445:2850:2755
test08:4707:9047:10578:updated
test09:2913:5628:1305:updated
Update:
I have a new case that I did not foresee: a line in one of the other files has a bigger value in $3 but a non-digit value in $2. Such a line should be ignored (even though $3 is bigger) because of the wrong value in $2.
To show this case, using the sample files above, I replaced the second column of the "test09" line in file2 with "xxxxx", and now I have:
$ grep test09 *
file2:test09:xxxxx:2867:8233
file3:test09:2913:5628:1305
main:test09:2575:1884:1611
$ awk -F':' 'FILENAME != "main" { if ($2~/^[0-9]+/ && (!($1 in a) || $3 > a[$1])) { a[$1]=$3; b[$1]=$0 } next } { if (($1 in a) && (a[$1] > $3)) { print b[$1]":updated"; delete b[$1] } else print }' file* main
test01:10957:8172:2472:updated
test02:1401:6160:5894:updated
test03:7245:8934:5725:updated
test04:3737:10175:5219:updated
test05:10769:10381:1102:updated
test06:3605:3713:7695:updated
test07:1445:2850:2755
test08:4707:9047:10578:updated
test09:2913:5628:1305:updated <-- this is now updated from file3
Next, I changed the $2 value on the "test09" line in file3 to non-digits too:
$ grep test09 *
file2:test09:xxxxx:2867:8233
file3:test09:zzzzz:5628:1305
main:test09:2575:1884:1611
$ awk -F':' 'FILENAME != "main" { if ($2~/^[0-9]+/ && (!($1 in a) || $3 > a[$1])) { a[$1]=$3; b[$1]=$0 } next } { if (($1 in a) && (a[$1] > $3)) { print b[$1]":updated"; delete b[$1] } else print }' file* main
test01:10957:8172:2472:updated
test02:1401:6160:5894:updated
test03:7245:8934:5725:updated
test04:3737:10175:5219:updated
test05:10769:10381:1102:updated
test06:3605:3713:7695:updated
test07:1445:2850:2755
test08:4707:9047:10578:updated
test09:2575:1884:1611 <-- this is now from the main file
Although it seems to be working fine, could someone please explain the second "if" in the code? Does it also need the condition $2~/^[0-9]+/?
{ if (($1 in a) && (a[$1] > $3))
awk gawk
edited Apr 8 at 17:47
asked Mar 24 at 19:04
DonJ
1 Answer
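Optimized awk solution, which is about 27 times faster:
awk -F':' '
    # other files: keep, per key, the line with the largest 3rd column
    FILENAME != "main" { if ($3 > a[$1]) { a[$1] = $3; b[$1] = $0 } next }
    # main: print the stored line if its 3rd column beats the one in main
    {
        if (($1 in a) && (a[$1] > $3)) { print b[$1]; delete b[$1] }
        else print
    }' file* main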
The output:
test01:10957:8172:2472
test02:1401:6160:5894
test03:7245:8934:5725
test04:3737:10175:5219
test05:10769:10381:1102
test06:3605:3713:7695
test07:1445:2850:2755
test08:4707:9047:10578
test09:2913:5628:1305
Execution time comparison:
$ time(awk -F: '{print $1,$3}' main | while read a b; do grep ^$a: main file* | sort -t":" -rnk4 | awk -F':' -vb=$b '{if($4>b){print $0;next}else{print ($1=="main")? $0 : NULL}}' | head -1; done > /dev/null)
real 0m0.111s
user 0m0.004s
sys 0m0.012s
$ time(awk -F':' 'FILENAME != "main" { if ($3 > a[$1]) { a[$1]=$3; b[$1]=$0 } next } { if (($1 in a) && (a[$1] > $3)) { print b[$1]; delete b[$1] } else print }' file* main > /dev/null)
real 0m0.004s
user 0m0.000s
sys 0m0.000s
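For the later case in the question, where $2 in the other files may be non-numeric, here is a minimal sketch that is not taken from the answer above. It assumes the validity check is needed only in the first block: a[] and b[] are filled solely from lines that passed the check, and the second block only consults those arrays, so under that reading the second "if" would not have to repeat the test. The regex is anchored at both ends (^[0-9]+$), an assumption stricter than the question's ^[0-9]+, which would also accept a value such as 12abc. The scratch directory, the trimmed sample rows and the variable name checked are illustrative only.

```shell
#!/bin/sh
# Hypothetical scratch directory holding a trimmed-down copy of the sample data.
dir=$(mktemp -d)
printf '%s\n' 'test08:10924:2562:4867' 'test09:2575:1884:1611'  > "$dir/main"
printf '%s\n' 'test08:4707:9047:10578' 'test09:xxxxx:2867:8233' > "$dir/file2"
printf '%s\n' 'test08:10924:2562:10568' 'test09:12abc:5628:1305' > "$dir/file3"

# The command substitution runs in a subshell, so the cd does not leak out.
checked=$(cd "$dir" && awk -F':' '
    # Other files: remember the line with the largest $3 per key,
    # but only when $2 consists solely of digits (assumed rule).
    FILENAME != "main" {
        if ($2 ~ /^[0-9]+$/ && (!($1 in a) || $3 + 0 > a[$1])) {
            a[$1] = $3 + 0      # "+ 0" forces a numeric value
            b[$1] = $0
        }
        next
    }
    # main: a[] holds validated candidates only, so $2 is not tested again.
    {
        if (($1 in a) && (a[$1] > $3 + 0)) print b[$1] ":updated"
        else print
    }
' file2 file3 main)

printf '%s\n' "$checked"
rm -rf "$dir"
```

With these rows, test08 is replaced by the file2 line, while both test09 candidates are rejected and the line from main is kept.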
answered Mar 24 at 20:25
RomanPerekhrest
Thank you, it is awesome. I forgot one thing: would it be possible to add an additional column at the end, e.g. :updated, to all newer lines, i.e. those that come from the other files?
– DonJ
Mar 24 at 22:53
@DonJ, welcome. As for the additional suffix :updated, change the 2nd if statement to the following: if (($1 in a) && (a[$1] > $3)) { print b[$1]":updated"; delete b[$1] }
– RomanPerekhrest
Mar 25 at 6:36
One way to avoid hardcoding "main" inside awk is to take the last filename: BEGIN { last_file = ARGV[ARGC-1] } FILENAME != last_file { ... }
– glenn jackman
Apr 14 at 13:33
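The last comment can be sketched as a complete command. This is only an illustration under assumptions: the scratch directory, the shortened sample rows and the variable name merged are made up here, and the main file is assumed to be the last argument on the command line. ARGV and ARGC are standard awk variables; because FILENAME is compared with ARGV[ARGC-1], the file can be given by any path.

```shell
#!/bin/sh
# Hypothetical scratch directory with a few of the sample rows.
dir=$(mktemp -d)
printf '%s\n' 'test01:6733:4370:5342' 'test07:1445:2850:2755'  > "$dir/main"
printf '%s\n' 'test01:8777:1060:9236' 'test10:8615:10002:7056' > "$dir/file2"
printf '%s\n' 'test01:10957:8172:2472'                         > "$dir/file3"

merged=$(awk -F':' '
    # The file to update is whatever comes last on the command line.
    BEGIN { last_file = ARGV[ARGC-1] }
    # Every earlier file: keep the line with the largest $3 per key.
    FILENAME != last_file {
        if ($3 > a[$1]) { a[$1] = $3; b[$1] = $0 }
        next
    }
    # Last file: print the stored line when its $3 is bigger.
    {
        if (($1 in a) && (a[$1] > $3)) print b[$1]
        else print
    }
' "$dir/file2" "$dir/file3" "$dir/main")

printf '%s\n' "$merged"
rm -rf "$dir"
```

One caveat of this approach: it relies on the last argument being a file name, so a trailing var=value assignment would break the comparison.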