Using grep vs awk

14 votes
To capture a particular pattern, awk and grep can be used. Why should we use one over the other? Which is faster and why?
If I had a log file and I wanted to grab a certain pattern, I could do one of the following:
awk '/pattern/' /var/log/messages
or
grep 'pattern' /var/log/messages
I haven't done any benchmarking, so I wouldn't know. Can someone elaborate on this? It would be great to know the inner workings of these two tools.
linux awk grep performance
Precede any command, even shell scripts, with the time command to time how long it takes to run the command. Ex: time ls -l.
– Bulrush
Aug 26 '16 at 12:20
edited Aug 19 at 6:54
codeforester
asked Aug 28 '13 at 8:20
holasz
5 Answers
23 votes, accepted
grep will most likely be faster:
# time awk '/USAGE/' imapd.log.1 | wc -l
73832
real 0m2.756s
user 0m2.740s
sys 0m0.020s
# time grep 'USAGE' imapd.log.1 | wc -l
73832
real 0m0.110s
user 0m0.100s
sys 0m0.030s
awk is an interpreted programming language, whereas grep is a compiled C program (which is additionally optimized for finding patterns in files).
(Note - I ran both commands twice so that caching would not potentially skew the results.)
More details about interpreted languages on Wikipedia.
As Stéphane has rightly pointed out in the comments, your mileage may vary depending on the implementation of grep and awk you use, the operating system it runs on, and the character set you are processing.
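The measurement approach above (warm the cache, run each tool more than once, compare the later runs) could be sketched roughly as follows. This is only an illustration: the sample file is generated on the spot and is not the imapd.log.1 from the answer, the 20000-line size is arbitrary, and it assumes mktemp and a shell that provides time. Absolute numbers will differ by implementation, system and locale.

```shell
# Build a throwaway sample log (hypothetical data, not the original imapd.log.1).
log=$(mktemp)
awk 'BEGIN {
  for (i = 1; i <= 20000; i++) {
    tag = (i % 4 == 0) ? "USAGE" : "LOGIN"   # every 4th line matches
    print tag, "user" i
  }
}' > "$log"

# Read the file once first so both tools start with a warm cache.
cat "$log" > /dev/null

# Time each tool twice; compare the second run of each.
for run in 1 2; do
  echo "run $run: awk" >&2
  time awk '/USAGE/' "$log" > /dev/null
  echo "run $run: grep" >&2
  time grep 'USAGE' "$log" > /dev/null
done

# Sanity check: both tools must select the same number of lines.
awk_count=$(awk '/USAGE/ { n++ } END { print n + 0 }' "$log")
grep_count=$(grep -c 'USAGE' "$log")
echo "awk: $awk_count  grep: $grep_count"

rm -f "$log"
```

Checking that the match counts agree before looking at the timings guards against comparing two commands that do not actually do the same work.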
Without saying what grep or awk implementation you're using and on what computer architecture, and with which system character set, those timings have little value.
– Stéphane Chazelas
Aug 28 '13 at 11:59
The second command will also use the newly cached version. I don't doubt that grep is quicker, but not by as much as your numbers show.
– exussum
Aug 28 '13 at 12:18
(hence running awk, grep, awk, grep and posting the results from the second set of awk and grep :) and FYI, I live in a UTF8 locale.
– Drav Sloan
Aug 28 '13 at 12:58
Funny enough, with the BSD tools (on a Mac), awk (31.74s) is slightly faster than sed (33.34s), which is slightly faster than grep (34.21s). GNU awk owns them all at 5.24s; I don't have GNU grep or sed to test.
– Kevin
Aug 28 '13 at 14:25
grep should be slightly faster because awk does more with each input line than just search for a regexp in it, e.g. if a field is referenced in the script (which it's not in this case), awk will split each input line into fields based on the field-separator value, and it populates builtin variables. But with what you posted there should be almost no difference. By far the most important difference between grep and awk with respect to matching regexps is that grep searches the whole line for a matching string, while awk can search specific fields and so provide more precision and fewer false matches.
– Ed Morton
Aug 19 at 16:14
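The whole-line versus per-field distinction from the last comment can be illustrated with a small sketch. The three sample lines and their "host level message" layout are made up for the example; only standard grep and awk behaviour is assumed.

```shell
# Hypothetical log lines: host, level, message (whitespace-separated).
sample=$(printf '%s\n' \
  'web1 info user viewed error-page' \
  'web2 error disk full' \
  'error1 info all good')

# grep tests the whole line, so the hostname and the message also match.
whole_line=$(printf '%s\n' "$sample" | grep -c 'error')

# awk can restrict the regexp to the second field (the level).
field_only=$(printf '%s\n' "$sample" | awk '$2 ~ /error/')

echo "grep matched $whole_line lines"
echo "awk matched: $field_only"
```

Here grep reports three matching lines, while the awk condition selects only the line whose level field contains "error".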
11 votes
Use the most specific and expressive tool. The tool that best fits your use case is likely to be the fastest.
As a rough guide:
- searching for lines matching a substring or regexp? Use grep.
- selecting certain columns from a simply-delimited file? Use cut.
- performing pattern-based substitutions or ... other stuff sed can reasonably do? Use sed.
- need some combination of the above 3, or printf formatting, or general purpose loops and branches? Use awk.
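As a rough illustration of the guide above, here is one command per tool run against the same input. The colon-delimited records (name:role:value) are invented for the example.

```shell
# Hypothetical records in name:role:value form.
data=$(printf '%s\n' 'alice:admin:42' 'bob:guest:7' 'carol:admin:13')

# grep: select lines matching a substring or regexp.
matched=$(printf '%s\n' "$data" | grep 'admin')

# cut: select a column from a simply-delimited file.
names=$(printf '%s\n' "$data" | cut -d: -f1)

# sed: pattern-based substitution.
renamed=$(printf '%s\n' "$data" | sed 's/guest/visitor/')

# awk: matching, field selection, arithmetic and printf in one program.
total=$(printf '%s\n' "$data" |
  awk -F: '$2 == "admin" { sum += $3 } END { printf "admin total: %d\n", sum }')

printf '%s\n' "$matched" "$names" "$renamed" "$total"
```

The awk line is the only one that needs state carried across lines (the running sum), which is the kind of task the last bullet has in mind.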
+1, except use perl instead of awk. If you need something more complicated than grep/cut/sed, then chances are awk won't be enough and you need something "full-blown".
– sds
Aug 28 '13 at 14:34
@sds why not python instead?
– RetroCode
Sep 23 '16 at 18:45
@RetroCode: python is more "general purpose" than perl; the equivalent one-liner will probably be much longer.
– sds
Sep 23 '16 at 19:48
@sds no, you don't need perl unless you're going to do something other than text processing. awk is just fine for the text processing stuff that's more complicated than grep/cut/sed and, as a bonus, comes as standard on all UNIX installations, unlike perl.
– Ed Morton
Aug 19 at 16:19
8 votes
When only searching for strings, and speed matters, you should almost always use grep. It's orders of magnitude faster than awk when it comes to just gross searching.
Source: The functional and performance differences of sed, awk and other Unix parsing utilities
UTILITY   OPERATION TYPE     EXECUTION TIME     CHARACTERS PROCESSED PER SECOND
                             (10 ITERATIONS)
-------   ----------------   ----------------   -------------------------------
grep      search only        41 sec.            489.3 million
sed       search & replace   4 min. 4 sec.      82.1 million
awk       search & replace   4 min. 46 sec.     69.8 million
Python    search & replace   4 min. 50 sec.     69.0 million
PHP       search & replace   15 min. 44 sec.    21.2 million
Thanks for this nice overview of all these programs. It really sheds light in the darkness.
– holasz
Aug 28 '13 at 10:53
~headtilt~ PHP is on there but Perl isn't?
– Izkata
Aug 28 '13 at 11:34
@Izkata - I thought the same thing when I saw this table a while ago.
– slm♦
Aug 28 '13 at 11:52
It's not really fair to the other utils that grep is just searching and they are also replacing.
– Kevin
Aug 28 '13 at 13:57
Those are completely bogus numbers. Talk about comparing apples and oranges - it's like saying you can only find a new car on web site A in 5 secs whereas you can find a car, negotiate a price, get a loan, and purchase the car on site B in 1 hour, so therefore site A is faster than site B. The article you quoted is completely wrong in its statements of relative execution speed between grep, sed, and awk, and it also says "awk ... has PCRE matching for regular expressions", which is just completely untrue.
– Ed Morton
Aug 19 at 16:23
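To address the objection that the table mixes searching with replacing, each tool can be restricted to the same search-only operation. A minimal sketch, assuming standard grep, sed and awk; the 3000-line sample file is generated purely for illustration:

```shell
# Generate a sample file in which every third line contains "error".
f=$(mktemp)
awk 'BEGIN {
  for (i = 1; i <= 3000; i++) {
    word = (i % 3 == 0) ? "error" : "ok"
    print "line", i, word
  }
}' > "$f"

# Three equivalent search-only commands: print the lines that match.
grep_n=$(grep 'error' "$f" | wc -l | tr -d ' ')
sed_n=$(sed -n '/error/p' "$f" | wc -l | tr -d ' ')
awk_n=$(awk '/error/' "$f" | wc -l | tr -d ' ')

echo "grep=$grep_n sed=$sed_n awk=$awk_n"
rm -f "$f"
```

Once the three counts agree, prefixing each command with time on a realistically sized file gives a comparison in which every tool does the same job.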
5 votes
While I agree that in theory grep should be faster than awk, in practice, YMMV as that depends a lot on the implementation you use.
Here I am comparing BusyBox 1.20.0's grep and awk, GNU grep 2.14, mawk 1.3.3, and GNU awk 4.0.1 on Debian/Linux 7.0 amd64 (with glibc 2.17), in a UTF-8 locale, on a 240MB file of 2.5M lines of ASCII-only characters.
$ time busybox grep error error | wc -l
331003
busybox grep error error 8.31s user 0.12s system 99% cpu 8.450 total
wc -l 0.07s user 0.11s system 2% cpu 8.448 total
$ time busybox awk /error/ error | wc -l
331003
busybox awk /error/ error 2.39s user 0.84s system 98% cpu 3.265 total
wc -l 0.12s user 1.23s system 41% cpu 3.264 total
$ time grep error error | wc -l
331003
grep error error 0.80s user 0.10s system 99% cpu 0.914 total
wc -l 0.00s user 0.11s system 12% cpu 0.913 total
$ time mawk /error/ error | wc -l
330803
mawk /error/ error 0.54s user 0.13s system 91% cpu 0.732 total
wc -l 0.03s user 0.08s system 14% cpu 0.731 total
$ time gawk /error/ error | wc -l
331003
gawk /error/ error 1.37s user 0.12s system 99% cpu 1.494 total
wc -l 0.04s user 0.07s system 7% cpu 1.492 total
$ time
In the C locale, only GNU grep gets a significant boost and becomes faster than mawk.
The dataset and the type of regexp may also make a big difference. For regexps, awk should be compared to grep -E, as awk's regexps are extended REs.
For this dataset, awk could be faster than grep on BusyBox-based systems, or on systems where mawk is the default awk and the default locale is UTF-8 based (IIRC, that used to be the case in Ubuntu).
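The two points about the C locale and extended REs can be sketched as below. The four sample lines and the regexp are invented for the example; LC_ALL=C and grep -E are standard, and whether the C locale speeds anything up depends on the implementation, as noted above.

```shell
# Small sample file (made-up content).
f=$(mktemp)
printf '%s\n' 'error: disk full' 'warning: low memory' \
  'fatal: kernel panic' 'info: ok' > "$f"

# An extended RE, which is the dialect awk uses.
ere='^(error|fatal):'

# Like-for-like: grep -E and awk, both in the C locale.
grep_n=$(LC_ALL=C grep -E -c "$ere" "$f")
awk_n=$(LC_ALL=C awk -v re="$ere" '$0 ~ re { n++ } END { print n + 0 }' "$f")

# Without -E, grep reads the pattern as a basic RE, where ( | ) are
# literal characters, so this counts a different (here empty) set of lines.
bre_n=$(LC_ALL=C grep -c "$ere" "$f" || true)

echo "grep -E: $grep_n  awk: $awk_n  grep (BRE): $bre_n"
rm -f "$f"
```

Timing the grep -E and awk commands with and without LC_ALL=C on a large file would show how much the locale matters for a given implementation.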
2 votes
In a nutshell, grep, like many other UNIX tools, does one thing only: it matches lines against a given pattern, and it does that well. awk, on the other hand, is a more sophisticated tool: it is a complete programming language defined by the POSIX standard, with typical features such as variables, arrays, expressions, functions, and control statements for pattern scanning and processing.
In my opinion, how the two tools perform at pattern matching depends on the implementation and on the size of the input you want to process. I would expect grep to usually be more efficient than awk, as it does matching only. But with grep alone you can't write simple code for more complex tasks, such as further processing of matched records, computation, or printing results, without using other tools.
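As an example of the "further processing of matched records" mentioned above, the sketch below sums a numeric field per user for matching lines. The log layout (service, event, user, amount) is hypothetical and chosen only to keep the example short.

```shell
# Hypothetical log: service, event, user, amount.
log=$(printf '%s\n' \
  'imapd USAGE alice 120' \
  'imapd LOGIN bob 0' \
  'imapd USAGE bob 30' \
  'imapd USAGE alice 50')

# Match, aggregate per user, and format the result in a single awk program.
# The for-in loop has no defined order, so the output is sorted afterwards.
report=$(printf '%s\n' "$log" |
  awk '/USAGE/ { total[$3] += $4 }
       END { for (u in total) printf "%s %d\n", u, total[u] }' |
  sort)

printf '%s\n' "$report"
```

With grep, the matching step would be the same, but the per-user totals would need another tool in the pipeline.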
Accepted answer (23 votes): answered Aug 28 '13 at 8:44 by Drav Sloan; edited Aug 28 '13 at 12:04
Answer with 11 votes: answered Aug 28 '13 at 12:31 by Useless
answered Aug 28 '13 at 9:12
slm♦
up vote
5
down vote
While I agree that in theory grep should be faster than awk, in practice, YMMV as that depends a lot on the implementation you use.
Here I am comparing busybox 1.20.0's grep and awk, GNU grep 2.14, mawk 1.3.3 and GNU awk 4.0.1 on Debian/Linux 7.0 amd64 (with glibc 2.17), in a UTF-8 locale, on a 240MB file of 2.5M lines of ASCII-only characters.
$ time busybox grep error error | wc -l
331003
busybox grep error error 8.31s user 0.12s system 99% cpu 8.450 total
wc -l 0.07s user 0.11s system 2% cpu 8.448 total
$ time busybox awk /error/ error | wc -l
331003
busybox awk /error/ error 2.39s user 0.84s system 98% cpu 3.265 total
wc -l 0.12s user 1.23s system 41% cpu 3.264 total
$ time grep error error | wc -l
331003
grep error error 0.80s user 0.10s system 99% cpu 0.914 total
wc -l 0.00s user 0.11s system 12% cpu 0.913 total
$ time mawk /error/ error | wc -l
330803
mawk /error/ error 0.54s user 0.13s system 91% cpu 0.732 total
wc -l 0.03s user 0.08s system 14% cpu 0.731 total
$ time gawk /error/ error | wc -l
331003
gawk /error/ error 1.37s user 0.12s system 99% cpu 1.494 total
wc -l 0.04s user 0.07s system 7% cpu 1.492 total
$ time
In the C locale, only GNU grep gets a significant boost and becomes faster than mawk.
The dataset and the type of regexp may also make a big difference. For regexps, awk should be compared to grep -E, as awk's regexps are extended REs.
For this dataset, awk could be faster than grep on busybox-based systems, or on systems where mawk is the default awk and the default locale is UTF-8 based (IIRC, that used to be the case in Ubuntu).
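If you want to repeat this kind of comparison on your own system, something along the following lines may be a reasonable starting point. It is only a sketch: the generated sample file and the variable names are made up for illustration and are not the data measured above, and any timings you get will depend on your implementations, locale and dataset. It uses grep -E so that both tools work with extended REs, and it first checks that both report the same number of matching lines.

```shell
#!/bin/sh
# Build a small throwaway sample file (hypothetical data, not the 240MB file above).
sample=$(mktemp)
i=0
while [ "$i" -lt 1000 ]; do
    echo "line $i: ok"
    echo "line $i: error in module"
    i=$((i + 1))
done > "$sample"

# Count matching lines with grep -E (extended REs, the same dialect awk uses).
grep_count=$(LC_ALL=C grep -Ec 'error' "$sample")

# Count matching lines with awk using the same extended RE.
awk_count=$(LC_ALL=C awk '/error/ { n++ } END { print n + 0 }' "$sample")

echo "grep -E: $grep_count  awk: $awk_count"

# To compare speed, time each command, once in your default locale and
# once in the C locale, for example:
#   time grep -E 'error' "$sample" | wc -l
#   time env LC_ALL=C grep -E 'error' "$sample" | wc -l
#   time awk '/error/' "$sample" | wc -l
#   time env LC_ALL=C awk '/error/' "$sample" | wc -l

rm -f "$sample"
```

On a file this small the timings would be dominated by process start-up, so for a meaningful measurement point the commands at a large real log instead.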
edited Aug 28 '13 at 12:30
answered Aug 28 '13 at 12:19
Stéphane Chazelas
285k53525864
285k53525864
add a comment |Â
add a comment |Â
up vote
2
down vote
In a nutshell, grep does one thing only, like many other UNIX tools: it matches lines against a given pattern, and it does that well. awk, on the other hand, is a more sophisticated tool: it is a complete programming language defined by the POSIX standard, with typical features like variables, arrays, expressions, functions and control statements for pattern scanning and processing.
In my opinion, how the two tools perform at pattern matching depends on the implementation and on the size of the input you want to process. I would expect grep to usually be more efficient than awk, as it does matching only. But with grep alone you can't write simple code to perform more complex tasks, such as further processing of matched records, computation, or printing results, without using other tools.
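As a rough illustration of that last point, here is a small sketch of a task that goes beyond plain matching: counting the matching lines per host. The log lines and the field layout (host name in the fourth whitespace-separated field) are invented for this example; real log formats may differ.

```shell
#!/bin/sh
# Hypothetical syslog-like sample; the host name is the 4th field.
log=$(mktemp)
cat > "$log" <<'EOF'
Aug 28 09:00:01 web1 app: error: disk full
Aug 28 09:00:02 web2 app: started
Aug 28 09:00:03 web1 app: error: timeout
Aug 28 09:00:04 db1 app: error: lock wait
EOF

# grep selects the matching lines, but counting per host needs other tools.
by_grep=$(grep 'error' "$log" | cut -d' ' -f4 | sort | uniq -c)

# awk matches and aggregates in one process; sort only fixes the output order.
by_awk=$(awk '/error/ { count[$4]++ } END { for (h in count) print h, count[h] }' "$log" | sort)

echo "$by_awk"
rm -f "$log"
```

With the sample above, the awk version prints `db1 1` and `web1 2`. The grep pipeline arrives at the same counts, but only by chaining cut, sort and uniq.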
answered Aug 28 '13 at 9:05
dsmsk80