Removing text from pattern1 up to and including 2nd match of pattern2?
I have a text file like so:
<!--START OF FILE -->
random text
<meta> more random text </meta>
x x x x x x x
more random text
that I dont need
x x x x x x x
I need everything
from this point
onwards
...
I need to remove everything from <!--START OF FILE --> up to and including
the second x x x x x x x line,
like so:
I need everything
from this point
onwards
...
I tried using sed '/<!--START OF FILE -->/,/x x x x x x x/d' test.txt
but this only removes the block up to the first occurrence of x x x x x x x,
which is not what I want.
text-processing awk sed
edited Dec 24 '17 at 11:16 by don_crissti
asked Dec 24 '17 at 9:23 by fsociety
Probably a duplicate of unix.stackexchange.com/questions/404175/… ? Just change f; to !f;
– Sundeep, Dec 24 '17 at 12:23
I used the one-liner below to achieve the same: sed -n '/</,/x/!p' l.txt | sed '1,/x/d' file name
– Praveen Kumar BS, Dec 24 '17 at 13:20
4 Answers
Accepted answer (score 3)
This is quite the opposite of
How to print lines between pattern1 and 2nd match of pattern2?
With sed you'd do something like:
sed -n '/PATTERN1/,$!{         # if not in this range
p;d                            # print and delete
}
/PATTERN2/!d                   # delete if it doesn't match PATTERN2
x;//!d                         # exchange and then, again, delete if no match
: do                           # label "do" (executed only after the 2nd match)
n;p                            # get the next line and print
b do' infile                   # go to label "do"
or, in one line (on GNU setups):
sed -n '/PATTERN1/,$!{p;d;};/PATTERN2/!d;x;//!d;: do;n;p;b do' infile
Sure, it's easier with awk and counters. I'll leave that as an exercise for you...
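For reference, here is a minimal sketch of the same sed approach with the question's two patterns filled in. The function name drop_through_second_marker is hypothetical, the script reads standard input, and the label is written as :do on its own line so that it does not rely on GNU label parsing.

```shell
# drop_through_second_marker (hypothetical name): copy stdin to stdout,
# dropping everything from the START line through the second marker line.
drop_through_second_marker() {
    sed -n '
        /<!--START OF FILE -->/,$!{
            p;d
        }
        /x x x x x x x/!d
        x;//!d
        :do
        n;p
        b do
    '
}
```

It could be used as drop_through_second_marker < test.txt; lines that appear before the START marker are passed through unchanged.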
answered Dec 24 '17 at 11:14 by don_crissti
Answer (score 1)
Straightforward awk:
$ awk '/<!--START OF FILE -->/ {a=2}; !a; /x x x x x x x/ && a {a--}' < data
I need everything
from this point
...
It just prints whenever a is zero, and decrements a each time it sees the x x x ... line while a is still nonzero.
Or, to start from the actual start of the file instead of a pattern, change the first block to BEGIN {a=2}.
Note that your sample input has an empty line after the second x x x... line, and it remains in the output if we stop removing lines at the x x x... line.
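The counter idea generalizes to any number of end-marker matches. The following is only a sketch: the function name strip_through_nth and its three arguments are hypothetical, and it compares fixed strings with index() instead of regular expressions so that the markers need no escaping (they must not contain backslashes, which awk -v would interpret).

```shell
# strip_through_nth START END N (hypothetical helper): filter stdin, dropping
# lines from the first line containing START through the Nth later line
# containing END. Both markers are matched as fixed substrings.
strip_through_nth() {
    awk -v start="$1" -v end="$2" -v n="$3" '
        !started && index($0, start) { started = 1; left = n }  # begin skipping
        !left                                 # print while not skipping
        left && index($0, end) { left-- }     # count down on each END line
    '
}
```

For the question's file this would be called as strip_through_nth '<!--START OF FILE -->' 'x x x x x x x' 2 < data. Unlike the one-liner, a later START line does not restart the skipping, because of the started flag.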
answered Dec 24 '17 at 18:41 by ilkkachu
Answer (score 0)
grep -Pzo '(?s)<!--START OF FILE(.*?x x x x x x x){2}\K.*' input.txt
Explanation
grep -Pzo
-P - Interpret the pattern as a Perl-compatible regular expression (PCRE).
-z - Process input.txt as one big line.
-o - Print only the matched part; without it grep prints the whole record and \K has no visible effect.
(?s)<!--START OF FILE(.*?x x x x x x x){2}\K.*
(?s) - Turn on "dot matches newline" for the remainder of the regular expression.
.*? - Non-greedy matching.
{2} - Number of repetitions of the pattern.
\K - Causes any previously matched characters to be omitted from the final matched string.
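A hedged variant of the same idea, assuming GNU grep built with PCRE support: \K only trims what grep reports when -o is given, and -z terminates the output with a NUL byte, so this sketch uses -o, consumes the newline after the second marker line, and strips the NUL with tr. The function name after_second_marker is hypothetical. Unlike the sed and awk approaches, lines before the START marker are not printed, because -o reports only the match.

```shell
# after_second_marker (hypothetical name): read stdin and print only the text
# that follows the second "x x x x x x x" line after the START marker.
# Assumes GNU grep with -P (PCRE) support.
after_second_marker() {
    grep -Pzo '(?s)<!--START OF FILE -->(?:.*?x x x x x x x){2}\n\K.*' |
        tr -d '\0'    # -z ends each output record with a NUL byte; drop it
}
```

It could be run as after_second_marker < input.txt. Because the whole file is held in memory as one record, this is better suited to small files.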
edited Dec 24 '17 at 11:14
answered Dec 24 '17 at 11:08 by MiniMax
Answer (score 0)
This snippet:
# Utility functions: print-as-echo, print-line-with-visual-space.
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
pl " Input data file $FILE:"
head -v -n 20 $FILE
pl " Expected output on file $E:"
head -v $E
pl " Results:"
cgrep -V -D -w '<!--START OF FILE -->' +2 +w 'x x x x x x x' 'meta' $FILE
produces:
-----
Input data file data1:
==> data1 <==
<!--START OF FILE -->
random text
<meta> more random text </meta>
x x x x x x x
more random text
that I dont need
x x x x x x x
I need everything
from this point
-----
Expected output on file expected-output1:
I need everything
from this point
onwards
...
-----
Results:
I need everything
from this point
onwards
...
This omits (-V) a window beginning (-w) with '...START...', and ending (+w) with the second occurrence (+2) of a string '...x x...' that has the string 'meta' inside the window.
On a system like:
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution : Debian 8.9 (jessie)
bash GNU bash 4.3.30
Some details for cgrep:
cgrep shows context of matching patterns found in files (man)
Path : ~/executable/cgrep
Version : 8.15
Type : ELF 64-bit LSB executable, x86-64, version 1 (SYS ...)
Home : http://sourceforge.net/projects/cgrep/ (doc)
Although one would need to get and compile cgrep, I have had no trouble doing that on 32-bit or 64-bit systems, and it is available on macOS (High Sierra) with brew. The execution time is on a par with GNU grep.
Best wishes ... cheers, drl
answered Dec 24 '17 at 18:13 by drl