Extract numerical data from complicated pattern in a plain-text file and produce tabular output
This is an SOS question. My professor asked me to obtain output from a long-running simulation code bequeathed to us by a former post-doc (who had explained to me its workings).
I had done a few small-scale trial runs and everything went fine. I then started the full simulation about a month ago, and it has been running continuously since. But just a few minutes ago, due to some memory issues, the program crashed before it could write the formatted tabular output to disk.
Luckily, I had enabled terminal echoing of the intermediate results, and had set my scroll-back history to a large value. I managed to salvage partial output by entering scroll-back mode and copying the entire terminal dump into a text file (and also made backup copies of it).
Now, this terminal output is quite verbose (intentionally so, for debugging purposes). The following is a snapshot from the salvaged terminal output text file (let us call it terminal_output.txt):
1 Linear search iteration no. 1 begins: Attempting to blah blah with 1 ...
2 blah blah
3 blah
4 blah blah blah
5 lorem ipsum
.........
........
75 Success with 128 blah ....
76 blah blah
77 blah blah
78 result_flag: 1, exit_reason: 6
79 blah
80 Completed optimal computation with T_init = 25.00 degC & T_sink = 35.00 degC
This exact pattern then repeats, e.g.:
81 Linear search iteration no. 2 begins: Attempting to blah blah with 1 ...
82 blah
......
95 Success with 307 blah ....
......
......
100 Completed optimal computation with T_init = 30.00 degC & T_sink = 40.00 degC
My requirement is to extract the following information to produce a tabular output like:
25 35 128
30 40 307
...........
...........
i.e. the 1st and 2nd columns are the numerical values of T_init and T_sink respectively, taken from the lines that begin with Completed. The 3rd column is the numerical value from the line that begins with Success (which is always 5 rows before Completed, if that helps). Any separator between the columns is acceptable, be it spaces, tabs or commas.
I wish to do this natively using standard *nix utilities such as grep, sed and awk, or even vi/vim. Either piped one-liners strung together or bash scripts are fine. If necessary, I am open to using python, perl or other scripting languages as well.
shell awk sed grep regular-expression
edited May 2 at 15:35
asked May 2 at 14:46
Krishna
3 Answers
accepted
It's essentially a matter of capturing the parts you want and discarding the parts you don't. For example, using sed, you could capture the integer Success value and copy it to the hold space (h), then retrieve and append it (G) to the captured digits of the Completed line:
sed -nE \
  -e '/Success/ {s/.* ([0-9]+).*/\1/; h;}' \
  -e '/Completed/ {G; s/.*T_init = ([0-9]+)\.00 degC & T_sink = ([0-9]+).*\n/\1 \2 /; p;}' \
  terminal_output.txt
Perl provides a somewhat more expressive syntax, which IMHO is more readable:
perl -lne '
  our $a = $1 if /Success.*?(\d+)/; print join " ", /(\d+)\.\d+/g, $a if /Completed/
' terminal_output.txt
produces the desired output
25 35 128
30 40 307
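For readers who want the hold-space steps spelled out, the sed approach can be sketched with one command per line and a comment on each step. This is only an illustration under stated assumptions: the sample file contents and the function name extract_table are made up here, and the fractional part is matched with [0-9]+ rather than a literal 00.

```shell
#!/bin/sh
# Hypothetical stand-in for terminal_output.txt, trimmed to a few lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Linear search iteration no. 1 begins: Attempting to blah blah with 1 ...
Success with 128 blah ....
result_flag: 1, exit_reason: 6
Completed optimal computation with T_init = 25.00 degC & T_sink = 35.00 degC
Linear search iteration no. 2 begins: Attempting to blah blah with 1 ...
Success with 307 blah ....
Completed optimal computation with T_init = 30.00 degC & T_sink = 40.00 degC
EOF

# extract_table is an illustrative name, not something from the answer.
extract_table() {
  sed -nE '
    # Success line: reduce it to its integer, then copy that to the hold space.
    /Success/ {
      s/.* ([0-9]+).*/\1/
      h
    }
    # Completed line: append newline + held value, keep the three numbers, print.
    /Completed/ {
      G
      s/.*T_init = ([0-9]+)\.[0-9]+ degC & T_sink = ([0-9]+).*\n/\1 \2 /
      p
    }
  ' "$1"
}

extract_table "$sample"
```

With -n nothing is printed automatically, so only the rebuilt Completed lines reach the output; every other line is silently dropped.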
edited May 2 at 20:53
answered May 2 at 15:21
steeldriver
Woohoo... That worked. Thank you very much. Really, using perl over the venerable sed and awk? Also, for educational purposes, can you please provide an explanation of how your code works?
– Krishna, May 2 at 15:25
Is it because the PCRE engine is more advanced than the others?
– Krishna, May 2 at 16:32
@Krishna it's certainly possible in sed or awk, however perl has some features that make it easier IMHO
– steeldriver, May 2 at 20:55
Thank you for the explanation and the nice answer. It was a life-saver!
– Krishna, May 3 at 8:59
POSIX-compatible sed:
grep -e 'Success' -e 'Completed' your_file | sed 'N;s/Success with \([[:digit:]]\+\).*T_init = \([^[:space:]]\+\).*T_sink = \([^[:space:]]\+\).*/\2 \3 \1/;s/\.00//g'
GNU sed (in which "." does not match "\n", at least in 4.2.2 on CentOS):
grep -e 'Success' -e 'Completed' your_file | sed 'N;s/Success with \([[:digit:]]\+\).*\n.*T_init = \([^[:space:]]\+\).*T_sink = \([^[:space:]]\+\).*/\2 \3 \1/;s/\.00//g'
This grabs the lines containing Success and Completed, then, operating on two lines at a time (way more explicitly than necessary), pulls out the three fields you care about and orders them on one line.
It only strips .00 from the numbers, leaving any significant fractional component alone (something like 12.20 would keep its single trailing zero).
Caveat: it won't work if some of those ... lines contain Completed or Success.
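Because the N command pairs filtered lines blindly, it may be worth checking first that they really alternate Success, Completed, Success, and so on. A minimal sketch, assuming the same grep filter as above; check_pairs and the two sample files are hypothetical, and an odd number of filtered lines (plausible when the dump ends mid-iteration after a crash) is reported as well.

```shell
#!/bin/sh
# check_pairs is an illustrative name; it exits non-zero when the pairing looks off.
check_pairs() {
  grep -e 'Success' -e 'Completed' "$1" | awk '
    BEGIN { bad = 0 }
    {
      # Odd filtered lines should be Success lines, even ones Completed lines.
      want = (NR % 2) ? "Success" : "Completed"
      if (index($0, want) == 0) {
        printf "filtered line %d: expected a %s line\n", NR, want
        bad = 1
      }
    }
    END {
      if (NR % 2) { print "odd number of filtered lines"; bad = 1 }
      exit bad
    }
  '
}

# Made-up sample data: one clean pair, and one pair with a stray keyword line.
good=$(mktemp)
bad=$(mktemp)
printf '%s\n' 'Success with 128 blah' \
  'Completed with T_init = 25.00 degC & T_sink = 35.00 degC' > "$good"
printf '%s\n' 'Success with 128 blah' 'blah Success blah' \
  'Completed with T_init = 25.00 degC & T_sink = 35.00 degC' > "$bad"

check_pairs "$good" && echo "pairs look consistent"
check_pairs "$bad" || echo "pairing is off; inspect the file before trusting the N-based output"
```

If the check fails, tightening the grep patterns (for example to 'Success with' and 'Completed optimal') is one way to exclude stray mentions of the keywords.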
edited May 2 at 18:52
answered May 2 at 15:22
cunninghamp3
This does not go all the way in extracting the numerical data and producing the tabular output. It correctly gets part-way, up to identifying only those lines containing the relevant data.
– Krishna, May 2 at 15:31
@Krishna, aside from what I indicated with the .00 remaining, is there anything off? My tests do not indicate so. If you are looking to truncate that data (I didn't want to assume that everything was .00), I can update my answer as appropriate. If you want to keep the decimals for non-.00 data, that is also something we can do.
– cunninghamp3, May 2 at 15:36
Yes. All the non-integers are just appended with trailing .00, but your present code does not produce the tabular data I am looking for. Please feel free to edit the answer.
– Krishna, May 2 at 15:38
@Krishna, what do you mean "does not produce the tabular data"? This should output a space-separated line of numbers per result, same as the perl solution and your example (except that mine includes decimals).
– cunninghamp3, May 2 at 15:41
The output is sent to stdout; you can add > outfile.txt to the end of the command to store it.
– cunninghamp3, May 2 at 15:42
A quick awk command should get you started:
awk '$2 ~ /Success/{a=$4;next}; $2 ~ /Completed/{b=$8;c=$13;print a,b,c}' terminal_output.txt
This won't work if you have multiple Success lines before a Completed line, etc.
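The field numbers used above ($2, $4, $8, $13) only line up if every line of the file starts with a line number, as in the snapshot. A sketch that keys on the surrounding text instead, so that it should behave the same with or without that prefix, might look as follows. The function name extract_awk and the sample data are illustrative assumptions, and when several Success lines precede a Completed line the last one is used.

```shell
#!/bin/sh
# extract_awk is an illustrative name; it prints "T_init T_sink count" per iteration.
extract_awk() {
  awk '
    /Success with [0-9]/ {
      # The text after "Success with " starts with the count; adding 0
      # converts its leading numeric part to a number.
      split($0, s, /Success with /)
      count = s[2] + 0
    }
    /Completed/ && /T_init = / && /T_sink = / {
      split($0, ti, /T_init = /)
      split($0, ts, /T_sink = /)
      # Numeric conversion prints 25.00 as 25 and would print 12.20 as 12.2.
      print ti[2] + 0, ts[2] + 0, count
    }
  ' "$1"
}

# Made-up sample, including the leading line numbers shown in the snapshot.
sample=$(mktemp)
cat > "$sample" <<'EOF'
75 Success with 128 blah ....
78 result_flag: 1, exit_reason: 6
80 Completed optimal computation with T_init = 25.00 degC & T_sink = 35.00 degC
95 Success with 307 blah ....
100 Completed optimal computation with T_init = 30.00 degC & T_sink = 40.00 degC
EOF

extract_awk "$sample"
```

awk only reads its input and writes to standard output; it never modifies the file. To keep the table, redirect the output, e.g. extract_awk terminal_output.txt > table.txt.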
answered May 2 at 15:10
rusty shackleford
This did not work. I have exactly one Success line before a Completed line. Any idea why this is not working? Also, does awk modify the file itself? I see no changes whatsoever in the terminal_output.txt file, and the command did not echo anything to the screen.
– Krishna, May 2 at 15:19