Using grep vs awk












Score: 14












To capture a particular pattern, awk and grep can be used. Why should we use one over the other? Which is faster and why?



If I had a log file and wanted to grab lines matching a certain pattern, I could do one of the following:



awk '/pattern/' /var/log/messages


or



grep 'pattern' /var/log/messages


I haven't done any benchmarking, so I wouldn't know. Can someone elaborate on this? It would be great to know the inner workings of these two tools.

































  • Precede any command, even shell scripts, with the time command to time how long it takes to run the command. Ex: time ls -l.
    – Bulrush
    Aug 26 '16 at 12:20





















linux awk grep performance














edited Aug 19 at 6:54 by codeforester










asked Aug 28 '13 at 8:20 by holasz





















5 Answers

















Score: 23 (accepted)










grep will most likely be faster:



# time awk '/USAGE/' imapd.log.1 | wc -l
73832

real 0m2.756s
user 0m2.740s
sys 0m0.020s

# time grep 'USAGE' imapd.log.1 | wc -l
73832

real 0m0.110s
user 0m0.100s
sys 0m0.030s


awk is an interpreted programming language, whereas grep is a compiled C program (which is additionally optimized for finding patterns in files).



(Note - I ran both commands twice so that caching would not potentially skew the results)



More details about interpreted languages on Wikipedia.



As Stéphane has rightly pointed out in the comments, your mileage may vary depending on the grep and awk implementations you use, the operating system they run on and the character set you are processing.
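For anyone who wants to reproduce this kind of measurement, here is a minimal sketch, assuming bash (where time is a shell keyword) and a generated sample file rather than the imapd.log.1 used above. The file contents and the variable names are made up for illustration, and --version is an assumption that holds for the GNU tools but not for every implementation.

```shell
#!/usr/bin/env bash
# Hypothetical sample log; substitute a real file such as /var/log/messages.
log=$(mktemp)
awk 'BEGIN {
    for (i = 0; i < 2000; i++) {
        print "line " i " USAGE stats"
        print "line " i " other entry"
    }
}' > "$log"

# Warm-up read so that both timed runs find the file in the page cache.
cat "$log" > /dev/null

# Note which implementations are being measured (errors are ignored
# because not every grep or awk understands --version).
grep --version 2>/dev/null | head -n 1
awk --version 2>/dev/null | head -n 1

# Time both tools on the same file; each prints only a match count.
time grep -c 'USAGE' "$log"
time awk '/USAGE/ { n++ } END { print n + 0 }' "$log"

# Sanity check: both tools should select the same number of lines.
grep_count=$(grep -c 'USAGE' "$log")
awk_count=$(awk '/USAGE/ { n++ } END { print n + 0 }' "$log")
rm -f "$log"
```

A file this small finishes too quickly to say much, so for a meaningful comparison point it at a large log and repeat each timed run a few times.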
























  • 2




    Without saying what grep or awk implementation you're using and on what computer architecture, and with which system character set, those timings have little value.
    – Stéphane Chazelas
    Aug 28 '13 at 11:59






  • 1




    The second command will also use the newly cached version. I don't doubt that grep is quicker, but not by as much as your numbers show.
    – exussum
    Aug 28 '13 at 12:18










  • (hence running awk, grep, awk, grep and posting the results from the second set of awk and grep :) and FYI, I live in a UTF8 locale.
    – Drav Sloan
    Aug 28 '13 at 12:58






  • 1




    Funny enough, with the BSD tools (on a Mac), awk (31.74s) is slightly faster than sed (33.34s), which is slightly faster than grep (34.21s). GNU awk owns them all at 5.24s; I don't have GNU grep or sed to test.
    – Kevin
    Aug 28 '13 at 14:25






  • 1




    grep should be slightly faster because awk does more with each input line than just search for a regexp in it, e.g. if a field is referenced in the script (which it's not in this case), awk will split each input line into fields based on the field-separator value and populate built-in variables. But with what you posted there should be almost no difference. By far the most important difference between grep and awk with respect to matching regexps is that grep searches the whole line for a matching string, while awk can search specific fields and so provide more precision and fewer false matches.
    – Ed Morton
    Aug 19 at 16:14
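To illustrate the point about matching in specific fields, here is a small sketch on made-up data. The three-column layout and the variable names are assumptions for the example, not anything from the question's log file.

```shell
# Hypothetical sample data: three whitespace-separated fields per line.
# The word "error" appears in field 1 on one line and in field 3 on another.
data=$(mktemp)
cat > "$data" <<'EOF'
error alice ok
bob login error
carol login ok
EOF

# grep matches "error" anywhere on the line, so it counts both lines.
grep_hits=$(grep -c 'error' "$data")

# awk can restrict the match to the third field, so it counts only one.
awk_hits=$(awk '$3 ~ /error/ { n++ } END { print n + 0 }' "$data")

echo "grep: $grep_hits, awk on field 3: $awk_hits"
rm -f "$data"
```

Here grep reports the line where "error" is only a name in the first column, which is the kind of false match the comment refers to.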


















Score: 11













Use the most specific and expressive tool. The tool that best fits your use case is likely to be the fastest.



As a rough guide:



  • searching for lines matching a substring or regexp? Use grep.

  • selecting certain columns from a simply-delimited file? Use cut.

  • performing pattern-based substitutions or ... other stuff sed can reasonably do? Use sed.

  • need some combination of the above 3, or printf formatting, or general purpose loops and branches? Use awk.
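As a rough illustration of the guide above, here is one small task per tool, run against a made-up colon-delimited file (name:uid:shell). The data and the variable names are hypothetical.

```shell
# Hypothetical sample file with three colon-separated fields.
f=$(mktemp)
printf '%s\n' 'root:0:/bin/sh' 'alice:1000:/bin/bash' 'bob:1001:/bin/bash' > "$f"

# grep: count the lines matching a regexp.
grep_out=$(grep -c 'bash$' "$f")

# cut: select the first column, joined onto one line for display.
cut_out=$(cut -d: -f1 "$f" | paste -sd, -)

# sed: pattern-based substitution, printing only the changed lines.
sed_out=$(sed -n 's#/bin/bash$#/bin/zsh#p' "$f" | head -n 1)

# awk: a numeric test on one field combined with printf formatting.
awk_out=$(awk -F: '$2 >= 1000 { n++ } END { printf "%d users\n", n }' "$f")

printf '%s\n' "$grep_out" "$cut_out" "$sed_out" "$awk_out"
rm -f "$f"
```

The awk line is the only one that needs arithmetic on a field, which is where the simpler tools stop being a good fit.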

























  • +1, except use perl instead of awk. If you need something more complicated than grep/cut/sed, then chances are awk won't be enough and you need something "full-blown".
    – sds
    Aug 28 '13 at 14:34










  • @sds why not python instead
    – RetroCode
    Sep 23 '16 at 18:45










  • @RetroCode: python is more "general purpose" than perl; the equivalent one-liner will probably be much longer.
    – sds
    Sep 23 '16 at 19:48






  • 2




    @sds no, you don't need perl unless you're going to do something other than text processing. awk is just fine for the text processing stuff that's more complicated than grep/cut/sed and as a bonus comes as standard on all UNIX installations, unlike perl.
    – Ed Morton
    Aug 19 at 16:19

















Score: 8













When only searching for strings, and speed matters, you should almost always use grep. It's orders of magnitude faster than awk when it comes to just gross searching.



Source: The functional and performance differences of sed, awk and other Unix parsing utilities



UTILITY  OPERATION TYPE    EXECUTION TIME    CHARACTERS PROCESSED PER SECOND
                           (10 ITERATIONS)
-------  ----------------  ----------------  -------------------------------
grep     search only       41 sec.           489.3 million
sed      search & replace  4 min. 4 sec.     82.1 million
awk      search & replace  4 min. 46 sec.    69.8 million
Python   search & replace  4 min. 50 sec.    69.0 million
PHP      search & replace  15 min. 44 sec.   21.2 million

























  • Thanks for this nice overview of all these programs. It really sheds light in the darkness.
    – holasz
    Aug 28 '13 at 10:53






  • 1




    ~headtilt~ PHP is on there but Perl isn't?
    – Izkata
    Aug 28 '13 at 11:34










  • @Izkata - I thought the same thing when I saw this table a while ago.
    – slm♦
    Aug 28 '13 at 11:52






  • 1




    It's not really fair to the other utils that grep is just searching and they are also replacing.
    – Kevin
    Aug 28 '13 at 13:57






  • 1




    Those are completely bogus numbers. Talk about comparing apples and oranges: it's like saying you can only find a new car on web site A in 5 secs, whereas you can find a car, negotiate a price, get a loan, and purchase the car on site B in 1 hour, so therefore site A is faster than site B. The article you quoted is completely wrong in its statements of relative execution speed between grep, sed, and awk, and it also says awk ... has PCRE matching for regular expressions, which is just completely untrue.
    – Ed Morton
    Aug 19 at 16:23


















Score: 5













While I agree that in theory grep should be faster than awk, in practice, YMMV as that depends a lot on the implementation you use.



Here I compare busybox 1.20.0's grep and awk, GNU grep 2.14, mawk 1.3.3 and GNU awk 4.0.1 on Debian/Linux 7.0 amd64 (with glibc 2.17) in a UTF-8 locale, on a 240MB file of 2.5M lines of ASCII-only characters:



$ time busybox grep error error | wc -l
331003
busybox grep error error 8.31s user 0.12s system 99% cpu 8.450 total
wc -l 0.07s user 0.11s system 2% cpu 8.448 total
$ time busybox awk /error/ error | wc -l
331003
busybox awk /error/ error 2.39s user 0.84s system 98% cpu 3.265 total
wc -l 0.12s user 1.23s system 41% cpu 3.264 total
$ time grep error error | wc -l
331003
grep error error 0.80s user 0.10s system 99% cpu 0.914 total
wc -l 0.00s user 0.11s system 12% cpu 0.913 total
$ time mawk /error/ error | wc -l
330803
mawk /error/ error 0.54s user 0.13s system 91% cpu 0.732 total
wc -l 0.03s user 0.08s system 14% cpu 0.731 total
$ time gawk /error/ error | wc -l
331003
gawk /error/ error 1.37s user 0.12s system 99% cpu 1.494 total
wc -l 0.04s user 0.07s system 7% cpu 1.492 total
$ time


In the C locale, only GNU grep gets a significant boost and becomes faster than mawk.



The dataset and the type of regexp may also make a big difference. For regexps, awk should be compared to grep -E, as awk's regexps are extended REs.



For this dataset, awk could be faster than grep on busybox based systems or systems where mawk is the default awk and the default locale is UTF-8 based (IIRC, it used to be the case in Ubuntu).
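Two of the points above (comparing awk with grep -E, and the effect of the C locale) can be tried with a sketch like the following. The sample lines and the variable names are made up, and the sample is far too small to show a speed difference; it only shows how to set up a like-for-like comparison.

```shell
# Hypothetical ASCII-only sample data.
f=$(mktemp)
printf '%s\n' 'error: disk full' 'warning: low memory' 'info: ok' > "$f"

# awk patterns are extended regular expressions, so the fair comparison
# for an alternation is grep -E, not plain grep.
ere_grep=$(grep -cE 'error|warning' "$f")
ere_awk=$(awk '/error|warning/ { n++ } END { print n + 0 }' "$f")

# Plain grep uses basic regular expressions, where | is a literal
# character, so nothing matches (grep -c exits non-zero on zero matches).
bre_grep=$(grep -c 'error|warning' "$f" || true)

# Forcing the C locale for a single command skips multibyte character
# handling; this is only appropriate when the data is plain ASCII.
c_grep=$(LC_ALL=C grep -c 'error' "$f")

echo "grep -E: $ere_grep, awk: $ere_awk, grep (BRE): $bre_grep, C locale: $c_grep"
rm -f "$f"
```

To see the locale effect on timing, prefix the same commands with time and run them on a large file, once in your normal locale and once with LC_ALL=C.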



































    Score: 2













    In a nutshell, grep does one thing only, like many other UNIX tools: it matches lines against a given pattern, and it does that well. awk, on the other hand, is a more sophisticated tool: it is a complete programming language defined by the POSIX standard, with typical features such as variables, arrays, expressions, functions and control statements for pattern scanning and processing.



    In my opinion, how the two tools perform at pattern matching depends on the implementation and on the size of the input you want to process. I would expect grep to usually be more efficient than awk, as it does matching only. But with grep alone you can't write simple code to perform more complex tasks, such as further processing of matched records, computation or printing results, without using other tools.
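As a sketch of what "further processing of matched records" can look like, the following counts error lines per host in a made-up log. The three-column format (host, level, message) and the variable names are assumptions for the example.

```shell
# Hypothetical log: host, level and message separated by spaces.
log=$(mktemp)
printf '%s\n' \
    'host1 error timeout' \
    'host2 info started' \
    'host1 error refused' \
    'host3 error timeout' > "$log"

# grep can select (or count) the matching lines, but nothing more.
grep_total=$(grep -c ' error ' "$log")

# awk matches on the second field and aggregates per host in one pass;
# sort is only there to make the output order predictable.
awk_summary=$(awk '$2 == "error" { count[$1]++ }
                   END { for (h in count) print h, count[h] }' "$log" | sort)

echo "total error lines: $grep_total"
printf '%s\n' "$awk_summary"
rm -f "$log"
```

Getting the same per-host summary with grep would need a pipeline through other tools such as cut, sort and uniq.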














Accepted answer: answered Aug 28 '13 at 8:44 by Drav Sloan, edited Aug 28 '13 at 12:04

















Answer with score 11: answered Aug 28 '13 at 12:31 by Useless











      • +1 except use perl instead of awk. if you need something more complicated than grep/cut/sed, then chances are awk won't be enough and you need something "full-blown"
        – sds
        Aug 28 '13 at 14:34










      • @sds why not python instead
        – RetroCode
        Sep 23 '16 at 18:45










      • @RetroCode: python is more "general purpose" than perl; the equivalent one-liner will probably be much longer.
        – sds
        Sep 23 '16 at 19:48






      • 2




        @sds no, you don't need perl unless you're going to do something other than text processing. awk is just fine for the text processing stuff that's more complicated than grep/cut/sed and as a bonus comes as standard on all UNIX installations, unlike perl.
        – Ed Morton
        Aug 19 at 16:19

















































































      When you are only searching for strings and speed matters, you should almost always use grep. In the benchmark below it processes about seven times as many characters per second as awk, although grep was timed searching only while the other tools were timed searching and replacing.



      Source: The functional and performance differences of sed, awk and other Unix parsing utilities



      UTILITY  OPERATION TYPE    EXECUTION TIME    CHARACTERS PROCESSED
                                 (10 ITERATIONS)   PER SECOND
      -------  ----------------  ----------------  --------------------
      grep     search only       41 sec.           489.3 million
      sed      search & replace  4 min. 4 sec.     82.1 million
      awk      search & replace  4 min. 46 sec.    69.8 million
      Python   search & replace  4 min. 50 sec.    69.0 million
      PHP      search & replace  15 min. 44 sec.   21.2 million
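Since the table mixes a search-only run with search-and-replace runs, a like-for-like comparison would use the search-only form of each tool. The sketch below only shows those equivalent invocations on a small made-up log file; it does not reproduce the quoted benchmark, and the file contents are hypothetical.

```shell
#!/bin/sh
# Hypothetical four-line log file.
log=$(mktemp)
printf '%s\n' 'ok: started' 'error: disk full' 'ok: retry' 'error: timeout' > "$log"

# Search-only form of each tool: print the lines containing "error".
grep_out=$(grep 'error' "$log")
sed_out=$(sed -n '/error/p' "$log")
awk_out=$(awk '/error/' "$log")

# All three should select the same lines.
if [ "$grep_out" = "$sed_out" ] && [ "$sed_out" = "$awk_out" ]; then
    echo "grep, sed and awk selected the same lines"
fi
rm -f "$log"
```

Timing these three commands against the same large file would be a fairer starting point than comparing a search with a substitution.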
















      answered Aug 28 '13 at 9:12









      slm♦












      • Thanks for this nice overview of all these programs. It really sheds light in the darkness.
        – holasz
        Aug 28 '13 at 10:53






      • ~headtilt~ PHP is on there but Perl isn't?
        – Izkata
        Aug 28 '13 at 11:34










      • @Izkata - I thought the same thing when I saw this table a while ago.
        – slm♦
        Aug 28 '13 at 11:52






      • It's not really fair to the other utils that grep is just searching and they are also replacing.
        – Kevin
        Aug 28 '13 at 13:57






      • Those are completely bogus numbers. Talk about comparing apples and oranges - it's like saying you can only find a new car on web site A in 5 secs whereas you can find a car, negotiate a price, get a loan, and purchase the car on site B in 1 hour so therefore site A is faster than site B. The article you quoted is completely wrong in its statements of relative execution speed between grep, sed, and awk and it also says "awk ... has PCRE matching for regular expressions" which is just completely untrue.
        – Ed Morton
        Aug 19 at 16:23












































































































          While I agree that in theory grep should be faster than awk, in practice, YMMV as that depends a lot on the implementation you use.



          Here I compare busybox 1.20.0's grep and awk, GNU grep 2.14, mawk 1.3.3 and GNU awk 4.0.1 on Debian/Linux 7.0 amd64 (with glibc 2.17) in a UTF-8 locale, on a 240MB file of 2.5M lines of ASCII-only characters.



          $ time busybox grep error error | wc -l
          331003
          busybox grep error error 8.31s user 0.12s system 99% cpu 8.450 total
          wc -l 0.07s user 0.11s system 2% cpu 8.448 total
          $ time busybox awk /error/ error | wc -l
          331003
          busybox awk /error/ error 2.39s user 0.84s system 98% cpu 3.265 total
          wc -l 0.12s user 1.23s system 41% cpu 3.264 total
          $ time grep error error | wc -l
          331003
          grep error error 0.80s user 0.10s system 99% cpu 0.914 total
          wc -l 0.00s user 0.11s system 12% cpu 0.913 total
          $ time mawk /error/ error | wc -l
          330803
          mawk /error/ error 0.54s user 0.13s system 91% cpu 0.732 total
          wc -l 0.03s user 0.08s system 14% cpu 0.731 total
          $ time gawk /error/ error | wc -l
          331003
          gawk /error/ error 1.37s user 0.12s system 99% cpu 1.494 total
          wc -l 0.04s user 0.07s system 7% cpu 1.492 total
          $ time


          In the C locale, only GNU grep gets a significant boost and becomes faster than mawk.



          The dataset and the type of regexp may also make a big difference. For regexps, awk should be compared to grep -E, as awk's regexps are extended REs.



          For this dataset, awk could be faster than grep on busybox-based systems, or on systems where mawk is the default awk and the default locale is UTF-8 based (IIRC, that used to be the case in Ubuntu).
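If you want to repeat this kind of comparison on your own system, something along these lines may be a starting point. It is only a sketch: the generated file, its size, the number of runs and the bench helper are all made up here, and it assumes your date supports +%s (GNU, BSD and busybox do). It uses grep -E so both tools read the pattern as an extended RE, and runs each tool in the C locale and in the current locale.

```shell
#!/bin/sh
# Build a hypothetical ASCII-only test file; raise the line count for
# meaningful timings (whole seconds are too coarse for small inputs).
data=$(mktemp)
awk 'BEGIN {
    for (i = 1; i <= 200000; i++) {
        tag = (i % 8 == 0) ? "error" : "info"
        print tag, "line", i
    }
}' > "$data"

# Sanity check first: both tools should count the same matching lines.
grep_count=$(LC_ALL=C grep -E -c 'error' "$data")
awk_count=$(LC_ALL=C awk '/error/ { n++ } END { print n + 0 }' "$data")
echo "grep: $grep_count  awk: $awk_count"

# bench LABEL COMMAND...: run COMMAND five times, report elapsed seconds.
bench() {
    label=$1
    shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt 5 ]; do
        "$@" > /dev/null
        i=$((i + 1))
    done
    end=$(date +%s)
    echo "$label: $((end - start))s for 5 runs"
}

bench 'grep -E (C locale)'       env LC_ALL=C grep -E 'error' "$data"
bench 'awk     (C locale)'       env LC_ALL=C awk '/error/' "$data"
bench 'grep -E (current locale)' grep -E 'error' "$data"
bench 'awk     (current locale)' awk '/error/' "$data"

rm -f "$data"
```

Swap in mawk, gawk or busybox awk for awk to compare implementations, and expect the ranking to change with the data and the pattern.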















          edited Aug 28 '13 at 12:30

























          answered Aug 28 '13 at 12:19









          Stéphane Chazelas















































































                  In a nutshell, grep does one thing only, like many other UNIX tools: it matches lines against a given pattern, and it does that well. awk, on the other hand, is a more sophisticated tool: it is a complete programming language, defined by the POSIX standard, with typical features such as variables, arrays, expressions, functions and control statements for pattern scanning and processing.



                  In my opinion, how the two tools perform at pattern matching depends on the implementation and on the size of the input you want to process. I would expect grep to usually be more efficient than awk, as it only does matching. But with grep alone you can't write simple code for more complex tasks such as further processing of matched records, computation or formatted output; for those you need other tools.
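As an illustration of that last point, here is a small sketch where the matching is the easy part and the further processing is what calls for awk. The log format (host, status, milliseconds) and the variable names are invented for the example.

```shell
#!/bin/sh
# Hypothetical log: host status milliseconds
log=$(mktemp)
cat > "$log" <<'EOF'
web1 error 120
web2 ok 15
web1 error 80
db1 error 300
EOF

# grep can select (or count) the matching records...
matches=$(grep -c 'error' "$log")

# ...while awk can match, aggregate per host and format in the same pass.
# The output order of "for (h in n)" is unspecified, hence the sort.
report=$(awk '/error/ { n[$1]++; ms[$1] += $3 }
              END     { for (h in n) printf "%s %d %d\n", h, n[h], ms[h] }' "$log" | sort)

echo "matching lines: $matches"
printf '%s\n' "$report"
rm -f "$log"
```

Doing the same with grep would need at least a second tool to sum the third column per host.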

















                  answered Aug 28 '13 at 9:05









                  dsmsk80




























                       
