Extract numerical data from complicated pattern in a plain-text file and produce tabular output













This is an SOS question. My professor asked me to obtain output from a long-running simulation code bequeathed to us by a former post-doc (who had explained to me its workings).



I had done a few small-scale trial runs and everything went fine. I then started the full simulation about a month ago, and it has been running continuously since. But just a few minutes ago, due to some memory issues, the program crashed before it could write the formatted tabular output to disk.



Luckily, I had enabled terminal echoing of the intermediate results, and had set my scroll-back history to a large value. I managed to salvage partial output by entering scroll-back mode and copying the entire terminal dump into a text file (and also made backup copies of it).



Now, this terminal output is quite verbose (intentionally so, for debugging purposes). The following is a snapshot from the salvaged terminal output text file (let us call it terminal_output.txt):



1 Linear search iteration no. 1 begins: Attempting to blah blah with 1 ...
2 blah blah
3 blah
4 blah blah blah
5 lorem ipsum
.........
........
75 Success with 128 blah ....
76 blah blah
77 blah blah
78 result_flag: 1, exit_reason: 6
79 blah
80 Completed optimal computation with T_init = 25.00 degC & T_sink = 35.00 degC


And then this exact pattern repeats. e.g.,



81 Linear search iteration no. 2 begins: Attempting to blah blah with 1 ...
82 blah
......
95 Success with 307 blah ....
......
......
100 Completed optimal computation with T_init = 30.00 degC & T_sink = 40.00 degC


My requirement is to extract the following information to produce a tabular output like:



25 35 128
30 40 307
...........
...........


i.e. the 1st and 2nd columns are the numerical values of T_init and T_sink respectively, taken from the lines that begin with Completed. The 3rd column is the numerical value from the line that begins with Success (which is always 5 rows before Completed, if that helps). Any separator between the columns is acceptable, be it spaces, tabs or commas.



I wish to do this natively using standard *nix utilities such as grep, sed and awk, or even vi/vim. Either piped one-liners strung together or bash scripts are fine. If necessary, I am open to using python, perl or other scripting languages as well.















      edited May 2 at 15:35
























      asked May 2 at 14:46









      Krishna





















          3 Answers



          accepted










          It's essentially a matter of capturing the parts you want, and discarding the parts you don't. For example using sed, you could capture the integer Success value and copy it to the hold space (h), retrieving and appending it (G) to the captured digits of the Completed line:



          sed -nE \
            -e '/Success/ { s/.* ([0-9]+).*/\1/; h; }' \
            -e '/Completed/ { G; s/.*T_init = ([0-9]+)\.00 degC & T_sink = ([0-9]+).*\n/\1 \2 /; p; }' \
            terminal_output.txt


          Perl provides a somewhat more expressive syntax, which IMHO is more readable:



          perl -lne '
            our $a = $1 if /Success.*?(\d+)/; print join " ", /(\d+)\.\d+/g, $a if /Completed/
          ' terminal_output.txt


          produces the desired output



          25 35 128
          30 40 307
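For readers who want the logic spelled out, the same idea can be sketched in Python: remember the most recent Success count, and when a Completed line arrives, emit the two temperatures followed by that count. This is only an illustrative sketch and not part of the original answer; the function name `extract_rows`, the two regular expressions and the inline sample text are assumptions made for the example.

```python
import re

# Hypothetical helper (not from the original answer): mirrors the
# "hold the Success value, emit on Completed" logic of the sed/perl commands.
SUCCESS_RE = re.compile(r"Success with (\d+)")
COMPLETED_RE = re.compile(r"T_init = (\d+)\.\d+ degC & T_sink = (\d+)\.\d+ degC")


def extract_rows(lines):
    """Yield (T_init, T_sink, count) tuples, one per Completed line."""
    last_success = None  # plays the role of sed's hold space / perl's $a
    for line in lines:
        m = SUCCESS_RE.search(line)
        if m:
            last_success = int(m.group(1))
            continue
        if "Completed" in line:
            m = COMPLETED_RE.search(line)
            if m and last_success is not None:
                yield int(m.group(1)), int(m.group(2)), last_success


# Sample text assumed for illustration, modelled on the question's snapshot.
SAMPLE = """\
Linear search iteration no. 1 begins: Attempting to blah blah with 1 ...
Success with 128 blah ....
result_flag: 1, exit_reason: 6
Completed optimal computation with T_init = 25.00 degC & T_sink = 35.00 degC
Linear search iteration no. 2 begins: Attempting to blah blah with 1 ...
Success with 307 blah ....
Completed optimal computation with T_init = 30.00 degC & T_sink = 40.00 degC
"""

if __name__ == "__main__":
    for t_init, t_sink, count in extract_rows(SAMPLE.splitlines()):
        print(t_init, t_sink, count)
```

To run it on the real dump, one could pass an open file object for terminal_output.txt to `extract_rows` instead of the sample lines. Because the patterns are searched anywhere in the line, the sketch should behave the same whether or not each line carries a leading number.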




























          • Woohoo... That worked. Thank you very much. Really, using perl over the venerable sed and awk? Also, for educational purposes, can you please provide an explanation of how your code works?
            – Krishna
            May 2 at 15:25











          • Is it because the PCRE engine is more advanced than the others?
            – Krishna
            May 2 at 16:32










          • @Krishna it's certainly possible in sed or awk, however perl has some features that make it easier IMHO
            – steeldriver
            May 2 at 20:55










          • thank you for the explanation and the nice answer. It was a life-saver!
            – Krishna
            May 3 at 8:59






























          POSIX-compatible sed:



          grep -e 'Success' -e 'Completed' your_file | sed 'N;s/Success with \([[:digit:]]\+\).*T_init = \([^[:space:]]\+\).*T_sink = \([^[:space:]]\+\).*/\2 \3 \1/;s/\.00//g'


          GNU sed (in which . does not match \n, at least in 4.2.2 on CentOS):



          grep -e 'Success' -e 'Completed' your_file | sed 'N;s/Success with \([[:digit:]]\+\).*\n.*T_init = \([^[:space:]]\+\).*T_sink = \([^[:space:]]\+\).*/\2 \3 \1/;s/\.00//g'


          This grabs the lines containing Success and Completed, then, operating on two lines at a time (far more explicitly than necessary), pulls out the three fields you care about and arranges them on one line.



          It only strips .00 from the numbers, leaving any significant fractional component alone (a value such as 12.20 would keep its single trailing zero).



          Caveat: it won't work if any of those ... lines contain Completed or Success.
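The caveat and the .00 handling described above can both be checked before trusting the result. What follows is a minimal Python sketch, not the answer's own method: it assumes the filtered lines should strictly alternate Success, Completed, and the names `pair_lines` and `drop_zero_fraction` are hypothetical. The regular expression is deliberately a little stricter than `s/\.00//g`, since it only removes .00 when it ends a number.

```python
import re


def pair_lines(lines):
    """Pair each Success line with the Completed line that follows it.

    Raises ValueError if the filtered lines do not strictly alternate,
    which is the situation in which a two-lines-at-a-time approach misaligns.
    """
    # Same filter as: grep -e 'Success' -e 'Completed'
    kept = [ln for ln in lines if "Success" in ln or "Completed" in ln]
    if len(kept) % 2:
        raise ValueError("odd number of Success/Completed lines")
    pairs = []
    for first, second in zip(kept[0::2], kept[1::2]):
        if "Success" not in first or "Completed" not in second:
            raise ValueError(f"unexpected order: {first!r} / {second!r}")
        pairs.append((first, second))
    return pairs


def drop_zero_fraction(text):
    """Remove a '.00' fraction only where it ends a number (12.00 -> 12)."""
    return re.sub(r"(?<=\d)\.00(?!\d)", "", text)
```

With this sketch, `drop_zero_fraction` leaves a value such as 12.20 untouched, and `pair_lines` fails loudly rather than silently producing shifted rows when a stray Success or Completed line appears.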





























          • This does not go all the way in extracting the numerical data and producing the tabular output. It works only part of the way, up to identifying the lines that contain the relevant data.
            – Krishna
            May 2 at 15:31










          • @Krishna, Aside from what I indicated with the .00 remaining, is there anything off? My tests do not indicate so. If you indicate that you are looking to truncate that data (didn't want to assume that everything was .00), I can update my answer as appropriate. If you want to keep the decimals for non-.00 data, that is also something we can do
            – cunninghamp3
            May 2 at 15:36











          • Yes. All the values are integers with a trailing .00 appended, but your present code does not produce the tabular data I am looking for. Please feel free to edit the answer
            – Krishna
            May 2 at 15:38











          • @Krishna, what do you mean "does not produce the tabular data" ? This should output a space-separated line of numbers, per result, same as the perl solution and your example (minus that mine includes decimals)?
            – cunninghamp3
            May 2 at 15:41










          • The output is sent to stdout; you can add > outfile.txt to the end of the command to store it
            – cunninghamp3
            May 2 at 15:42






























          A quick awk command should get you started:



          awk '$2 ~ /Success/ {a=$4; next} $2 ~ /Completed/ {b=$8; c=$13; print a, b, c}' terminal_output.txt



          This won't work if you have multiple Success lines before a Completed line, etc.
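The field positions used above ($2, $4, $8, $13) only line up if every line of the file starts with a number, as in the sample listing. If those numbers are not actually present in the file (an assumption; the question does not say either way), every field shifts by one and nothing matches. A hedged Python sketch that locates the values by their labels instead of by fixed position; `parse_by_label` is a hypothetical name introduced for this example:

```python
def parse_by_label(lines):
    """Return rows of (T_init, T_sink, count) found by label, not position."""
    rows = []
    count = None
    for line in lines:
        fields = line.split()
        if "Success" in fields:
            # The count is the second field after the word "Success"
            # ("Success with 128 ...").
            count = fields[fields.index("Success") + 2]
        elif "Completed" in fields and count is not None:
            # Skip the "=" that follows each label.
            t_init = fields[fields.index("T_init") + 2]
            t_sink = fields[fields.index("T_sink") + 2]
            rows.append((float(t_init), float(t_sink), int(count)))
    return rows
```

Because the lookup is relative to the label, the sketch gives the same rows with or without a leading number on each line. It also returns the temperatures first and the count last, matching the column order requested in the question.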



























          • This did not work. I have exactly one Success line before a Completed line. Any idea why this is not working? Also, does awk modify the file itself? I see no changes whatsoever in the terminal_output.txt file, and the command did not echo anything to the screen
            – Krishna
            May 2 at 15:19


















          edited May 2 at 20:53


























          answered May 2 at 15:21









          steeldriver




















          edited May 2 at 18:52


























          answered May 2 at 15:22









          cunninghamp3

          473215















          • This does not go all the way to extracting the numerical data and producing the tabular output. It only works part-way, up to picking out the lines that contain the relevant data.
            – Krishna
            May 2 at 15:31










          • @Krishna, Aside from what I indicated with the .00 remaining, is there anything off? My tests do not indicate so. If you indicate that you are looking to truncate that data (didn't want to assume that everything was .00), I can update my answer as appropriate. If you want to keep the decimals for non-.00 data, that is also something we can do
            – cunninghamp3
            May 2 at 15:36











          • Yes. All the values are really integers with a trailing .00 appended, but your present code does not produce the tabular data I am looking for. Please feel free to edit the answer.
            – Krishna
            May 2 at 15:38











          • @Krishna, what do you mean by "does not produce the tabular data"? This should output one space-separated line of numbers per result, the same as the perl solution and your example (except that mine includes the decimals).
            – cunninghamp3
            May 2 at 15:41










          • The output is sent to stdout; you can append > outfile.txt to the end of the command to store it in a file.
            – cunninghamp3
            May 2 at 15:42


























          up vote
          -1
          down vote













          A quick awk command should get you started:



          awk '$2 ~ /Success/{a=$4;next}; $2 ~ /Completed/{b=$8;c=$13;print a,b,c}' terminal_output.txt



          This won't work if you have multiple Success lines before a Completed line, etc.
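


          The field numbers above ($2, $4, $8, $13) only line up if every line starts with a line-number column, as in the question's snapshot, and the values come out in the order count, T_init, T_sink. A sketch that looks for the words next to each value instead of relying on fixed positions, and prints in the order the question asks for, might look like this. The function name extract_table_awk is hypothetical, and the wording of the Success and Completed lines is assumed to be exactly as shown in the question.

```shell
# Hypothetical helper: reads the terminal dump on stdin and prints
# "T_init T_sink count" for every Completed line.
extract_table_awk() {
  awk '
    # Success line: remember the number that follows "Success with".
    /Success with [0-9]/ {
      for (i = 1; i <= NF - 2; i++)
        if ($i == "Success" && $(i + 1) == "with") { count = $(i + 2) + 0; break }
      next
    }
    # Completed line: pick the value after "T_init =" and after "T_sink =".
    # Adding 0 converts the text to a number, so 25.00 is printed as 25.
    /Completed/ {
      t_init = t_sink = ""
      for (i = 1; i <= NF - 2; i++) {
        if ($i == "T_init" && $(i + 1) == "=") t_init = $(i + 2) + 0
        if ($i == "T_sink" && $(i + 1) == "=") t_sink = $(i + 2) + 0
      }
      if (t_init != "" && t_sink != "") print t_init, t_sink, count
    }
  '
}
```

          Usage would be along the lines of extract_table_awk < terminal_output.txt > table.txt. awk only reads the input and writes to stdout; it never modifies the file. Note that the numeric conversion also drops other trailing zeros, so 12.20 would be printed as 12.2.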

















          answered May 2 at 15:10









          rusty shackleford

          1,145115















          • This did not work. I have exactly one Success line before each Completed line. Any idea why this is not working? Also, does awk modify the file itself? I see no changes whatsoever in terminal_output.txt, and the command did not echo anything to the screen.
            – Krishna
            May 2 at 15:19




























           
