Pattern recognition between two sentences in a file which has spaces and special characters?
I have a file from which I want to print all the lines between two patterns. Pattern 1 = # Begin TRACE A Data
and pattern 2 = # Done Data $capture
I want to print every line between pattern 1 and pattern 2.
File 1:
# Lower Limit
LIMIT_FLAG=0
LIMIT_POINT0=2884982910000.000000 -102800 -1
LIMIT_POINT1=2892982910000.000000 -102800 -1
# Limit Done
# Begin SPA Emission Mask
MASK SEGMENTS=0
MASK REFERENCE MODE=0
MASK REFERENCE LEVEL=0
MASK CENTER FREQUENCY=0
**
# SPA Emission Mask Done
# Begin SPA Data
<AP P_DATA>
** # Begin TRACE A Data **
P_0=-103.976000 , 2884.982910 MHz
P_1=-103.580000 , 2884.997456 MHz
P_2=-103.748000 , 2885.012001 MHz
P_3=-104.020000 , 2885.026547 MHz
P_4=-103.472000 , 2885.041092 MHz
P_5=-103.720000 , 2885.055638 MHz
P_6=-103.752000 , 2885.070183 MHz
P_7=-103.512000 , 2885.084729 MHz
P_8=-103.664000 , 2885.099274 MHz
P_9=-103.948000 , 2885.113820 MHz
P_10=-103.720000 , 2885.128365 MHz
P_11=-103.480000 , 2885.142911
# Done Data $capture
# Begin SPA Emission Mask
MASK SEGMENTS=0
MASK REFERENCE MODE=0
MASK REFERENCE LEVEL=0
MASK CENTER FREQUENCY=0
# End SPA Data
<APP_DATA_END>
# End SPA Data
<APP_DATA_END>
Expected output:
-103.976000 2884.982910
-103.580000 2884.997456
-103.748000 2885.012001
-104.020000 2885.026547
-103.472000 2885.041092
....
....
No extra lines or blank lines should be printed, only the data lines.
awk sed grep pattern-matching
edited May 3 at 11:01 by Philippos
asked May 3 at 10:07 by CCC
2 Answers
accepted
sed -n '/# Begin TRACE A Data/,/# Done Data $capture/{s/ MHz//;s/,/ /;s/.*=//p;}' filename
/pattern1/,/pattern2/ selects only the lines from the first to the second pattern, so everything inside {} is executed only for the range
s/ MHz// removes the trailing unit
s/,/ / replaces the comma with a whitespace
s/.*=//p removes everything up to the = and prints the pattern space, so only lines in the range with that = will get printed (option -n suppresses default output)
Actually, for your example data, you could also do
sed -n 's/ MHz//;s/.*=//;s/,/ /p'
because only the lines you want contain a comma.
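To try the range command without the full data file, here is a minimal sketch. The file name sample.spa is made up for illustration, and its contents are a shortened copy of the data from the question.

```shell
# Build a shortened copy of the sample data (sample.spa is a hypothetical name).
cat > sample.spa <<'EOF'
# Begin SPA Data
# Begin TRACE A Data
P_0=-103.976000 , 2884.982910 MHz
P_1=-103.580000 , 2884.997456 MHz
# Done Data $capture
MASK SEGMENTS=0
# End SPA Data
EOF

# Range command from the answer: the substitutions run only between the two
# marker lines, and only lines where s/.*=// succeeds are printed.
result=$(sed -n '/# Begin TRACE A Data/,/# Done Data $capture/{s/ MHz//;s/,/ /;s/.*=//p;}' sample.spa)
printf '%s\n' "$result"
```

Because s/,/ / swaps only the comma itself, the spaces that surrounded it remain, so the two columns end up separated by several spaces. The marker lines and MASK SEGMENTS=0 are not printed: the markers contain no =, and the MASK line lies outside the range.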
Perfect! Can the whitespace be replaced by a tab?
– CCC
May 3 at 11:16
Sure, in this case you probably want to replace the comma along with the spaces before and after. With GNU sed, you can use s/ , /\t/, but this is not standard, so better use a literal TAB instead of \t (in most shells you can enter a literal TAB by preceding it with ctrl-V).
– Philippos
May 3 at 11:21
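A sketch of the portable variant from the comment above: rather than typing a literal TAB, let printf produce one and splice it into the sed script. The variable name tab and the file name sample.spa are illustrative, not part of the original answer.

```shell
# Shortened sample data (sample.spa is a hypothetical name).
cat > sample.spa <<'EOF'
# Begin TRACE A Data
P_0=-103.976000 , 2884.982910 MHz
P_11=-103.480000 , 2885.142911
# Done Data $capture
EOF

# POSIX printf expands \t, so this stores a literal TAB character.
tab=$(printf '\t')

# The script is double-quoted so that ${tab} expands; \$ keeps the dollar
# sign of the end pattern away from the shell.
result=$(sed -n "/# Begin TRACE A Data/,/# Done Data \$capture/{s/ MHz//;s/ , /${tab}/;s/.*=//p;}" sample.spa)
printf '%s\n' "$result"
```

Replacing " , " as a whole, instead of just the comma, leaves exactly one TAB between the two columns.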
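Awk solution:
awk '/# Begin TRACE A Data/ { f = 1; next }   # start marker: set the flag, skip the line
/# Done Data [$]capture/ { f = 0 }            # end marker: [$] matches a literal $
# the rest of the next rule is cut off in the source; reconstructed to give the output below
NF && f { gsub(/^P.+=|,|MHz/, ""); print $1, $2 }' file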
The output:
-103.976000 2884.982910
-103.580000 2884.997456
-103.748000 2885.012001
-104.020000 2885.026547
-103.472000 2885.041092
-103.720000 2885.055638
-103.752000 2885.070183
-103.512000 2885.084729
-103.664000 2885.099274
-103.948000 2885.113820
-103.720000 2885.128365
-103.480000 2885.142911
That's nice, but I made a mistake in asking. The data file contains further lines after pattern2, such as '# End SPA Data', and I don't want those to be printed. Also, can the output contain just the numbers, tab-separated, e.g. -103.9760000 2884.982910, ignoring the other characters?
– CCC
May 3 at 10:28
I ran the command awk '/TRACE A Data/,/Data Done/' 11-wm-2db.spa |grep -v 'Data', but I want the output to be two tab-separated columns, with P_0=, the comma and MHz ignored. So the output should be, line by line, -103.9760000 2884.983910, then -103.5800000 2884.997456, and so on.
– CCC
May 3 at 10:32
@CCC, do update both your question and expected result
– RomanPerekhrest
May 3 at 10:32
@CCC, see my update
– RomanPerekhrest
May 3 at 10:46
It is still printing the lines after pattern2 (# Data Done $capture), namely the lines <APP_DATA_END> and # End SPA Data. Also, could you please explain the code?
– CCC
May 3 at 10:54
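Two points from these comments can be sketched together. In POSIX extended regular expressions, which awk uses, an unescaped $ is an end-of-line anchor even in the middle of a pattern, so /# Done Data $capture/ may never match and the flag stays set. That would explain the lines printed after pattern 2, though some awk implementations may be more lenient. Writing [$] matches a literal dollar sign. Setting OFS to a tab gives the tab-separated columns. This is a sketch under those assumptions, with sample.spa as a made-up file name, and not necessarily the answerer's updated code.

```shell
# Shortened sample data, including lines after the end marker.
cat > sample.spa <<'EOF'
# Begin TRACE A Data
P_0=-103.976000 , 2884.982910 MHz
P_11=-103.480000 , 2885.142911
# Done Data $capture
MASK SEGMENTS=0
# End SPA Data
<APP_DATA_END>
EOF

result=$(awk -v OFS='\t' '
  /# Begin TRACE A Data/   { f = 1; next }   # start marker: switch printing on, skip the marker
  /# Done Data [$]capture/ { f = 0 }         # end marker: [$] is a literal dollar sign
  f && NF {
      gsub(/^P.+=|,|MHz/, "")                # strip the P_n= label, the comma and the unit
      print $1, $2                           # fields are re-split after gsub; OFS joins them
  }
' sample.spa)
printf '%s\n' "$result"
```

The flag f is 1 only between the two markers, so nothing after # Done Data $capture reaches the print rule, and NF skips blank lines.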
sed answer: answered May 3 at 11:11 by Philippos
awk answer: answered May 3 at 10:15, edited May 3 at 10:45, by RomanPerekhrest