Text Processing - How to get pattern A matching line until first occurrence of pattern B matching line?

For each line matching pattern A, I want to search backwards (towards the top of the file) for the first line matching pattern B, and get that pattern B line, the pattern A line, and all the lines in between.



UPDATED: example_file.txt



ISA*00* *00* *ZZ*SIX-SIX6 *12*666666666666 *66666666*6666*U*666666666*6666666666*0*P*
GS*FA*SIX-SIX-SIX*666666666*6666666*6666*6666*X*66666
ST*666*666
AK1*SX*666
AK2*777*6666666
AK5*A
AK2*777*7777777
AK3*S6*5**3
AK3*A2*5**3
AK4*3*6969*4
AK4*7*6969*4
AK5*R*5
AK2*777*6666666
AK5*A
AK2*777*69696969
AK3*J7*5**3
AK4*3*6969*4
AK5*R*5
AK9*P*20*20*19
SE*69*6969
GE*1*6767
IEA*1*0000000000


What I want is to find, working from the bottom up, every AK5 line that has R after it, like this:



Pattern A: AK5*R



and then get all the lines above it, going up until the first line matching pattern B, e.g.:



Pattern B: AK2



Desired output:



First Pattern A matched will be called E1



AK2*777*7777777
AK3*S6*5**3
AK3*A2*5**3
AK4*3*6969*4
AK4*7*6969*4
AK5*R*5


UPDATED: Second Pattern A matched will be called E2



AK2*777*69696969
AK3*J7*5**3
AK4*3*6969*4
AK5*R*5


and so on if pattern A matches more than once.



EDIT: I know sed can do this, but I still haven't had any luck getting the lines from each pattern A match up to its first pattern B match and storing them in a temporary text file to be processed further.



This is my example sed command, which gets every available pattern B block in example_file.txt:



sed -ne '/AK2\*/,/AK5\*R/p' example_file.txt



Example command logical scenario:



A="AK5*R"
B="AK2"

find the first $A < example_file.txt; # AK5*R
move to previous line until first occurrence of $B line; # AK2*any_number*any_number
get all lines from first $A to its first occurrence of $B and store in a text file; # result > e1.txt
# The same way goes to the second occurrence of pattern A.


(NOTE: "first occurrence of $B" means that, starting from each $A line, I want the $A line and the lines before it, up to and including the first $B line encountered going up. For example, if the first $A line is in the middle of the file, say line 50 of a 100-line file, then the command moves up from there, line by line, until it reaches the first $B line it sees.) See the example below.



example_file2.txt



ISA*00* *00* *ZZ*SIX-SIX6 *12*666666666666 *66666666*6666*U*666666666*6666666666*0*P*
GS*FA*SIX-SIX-SIX*666666666*6666666*6666*6666*X*66666
ST*666*666
AK1*SX*666
AK2*777*6666666
AK5*A
AK2*777*7777777
AK5*A
AK2*777*888888
AK5*A
AK2*777*7777777
AK5*A
AK2*777*5555555
AK5*A
AK2*777*7777777
AK5*A
AK2*777*4545435
AK5*A
AK2*777*7777777
AK5*A
AK2*777*7777777
AK3*S6*5**3
AK3*A2*5**3
AK4*3*6969*4
AK4*7*6969*4
AK5*A
AK2*777*0987654
AK3*S6*5**3
AK3*A2*5**3
AK4*3*6969*4
AK4*7*6969*4
AK5*R*5
AK2*777*7777777
AK3*S6*5**3
AK3*A2*5**3
AK4*3*6969*4
AK4*7*6969*4
AK5*A
AK2*777*7777777
AK3*S6*5**3
AK3*A2*5**3
AK4*3*6969*4
AK4*7*6969*4
AK5*A


Output:



AK2*777*0987654
AK3*S6*5**3
AK3*A2*5**3
AK4*3*6969*4
AK4*7*6969*4
AK5*R*5






  • You say "don't have any luck in getting the line from pattern A to B" but you apparently want all the lines from pattern B to pattern A; at least that's what you show in the desired output. Your first language is probably not English, but please try to edit your question to make it clearer what you want.
    – wurtel
    Feb 8 at 10:17

  • There's no Pattern B: B=AK2 in your input content. Update your question.
    – RomanPerekhrest
    Feb 8 at 10:18

  • @wurtel, there are two pattern B (AK2) matches in example_file.txt. I don't want to print all the lines from pattern B to A. As you can see, I separated them in my desired output. I want a command that finds the first pattern A and then moves to the previous lines until the first match of pattern B is found. In example_file.txt the first match of pattern A is on line 12. From that point it moves up until the first occurrence of pattern B is matched, which is on line 7. The same goes for the second pattern A match, where pattern B is on line 15.
    – WashichawbachaW
    Feb 9 at 0:12

  • @RomanPerekhrest, there is: line 5: AK2*777*6666666, line 7: AK2*777*7777777, line 13: AK2*777*6666666, and line 15: AK2*777*7777777. Sorry, I think you read B=AK2 literally as the whole pattern. Only AK2 is the pattern; I just put it in a variable B to represent the pattern I want to find. Anyway, I'm going to correct this section to prevent confusion. Thanks.
    – WashichawbachaW
    Feb 9 at 0:20

  • Yes, sed could extract the ranges: tac ../infile | sed -ne '/^AK5\*R/,/AK2\*/p' | tac. What it could not do is redirect each range to a separate file.
    – Isaac
    Feb 9 at 2:48

edited Feb 9 at 2:19

asked Feb 8 at 10:07
WashichawbachaW

2 Answers



Accepted answer:

Rereading your description, I understand that you want, for each match of pattern A found from the bottom up, the lines from that match going up to the first match of pattern B. But the resulting sections should keep the line order that the file has.



That requires a lot of logic. The following shell script does it all. It places the results, in the correct internal order, in files named E followed by a number: the first file (E1) holds the first matching section from the top, and the last file holds the last matching section.



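#!/bin/bash

# Remove results of earlier runs. Careful: E* matches every file in the
# current directory whose name starts with E.
rm -rf resE* E*

# Read the file bottom-up; each AK5*R line opens a range and the next AK2
# line closes it. Range number i is appended (still reversed) to resE<i>.
tac ../example_file.txt |
    awk 'BEGIN{i=1}
         /^AK5\*R.*/{p=1}
         {if(p==1){f="resE" i;print($0)>>f;close(f)}}
         /^AK2.*/{if(p==1){i++};p=0}
        '

# Restore the original line order inside each range and number the files
# from the top of the input: the last range found by tac becomes E1.
# Note: the glob sorts resE10 before resE2, so this numbering is only
# reliable for up to nine ranges.
set -- resE*
c=$#
for (( i=1;i<=$c;i++)); do
    pos=$(($c-$i+1))
    [ -f "$1" ] && tac "$1" > "E$pos"
    shift
done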


The resulting ranges will be:



$ cat E1
AK2*777*7777777
AK3*S6*5**3
AK3*A2*5**3
AK4*3*6969*4
AK4*7*6969*4
AK5*R*5

$ cat E2
AK2*777*69696969
AK3*J7*5**3
AK4*3*6969*4
AK5*R*5
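
For comparison, the same extraction can also be sketched in a single forward pass, without tac or the temporary resE files: remember the lines seen since the most recent AK2 line and write them out whenever an AK5*R line arrives. This is only a sketch under assumptions that are not part of the answer above: the function name extract_ranges and the output names e1.txt, e2.txt, ... are made up for illustration, and the files are numbered in input order.

```shell
#!/bin/sh
# extract_ranges FILE [PREFIX]
# Hypothetical helper (not from the answer above): for every line starting
# with "AK5*R", write the block that begins at the nearest preceding "AK2"
# line to PREFIX1.txt, PREFIX2.txt, ... in input order. PREFIX defaults to "e".
extract_ranges() {
    awk -v prefix="${2:-e}" '
        /^AK2/  { buf = ""; inblock = 1 }     # nearest pattern B so far: restart the buffer
        inblock { buf = buf $0 "\n" }         # collect lines from pattern B onwards
        inblock && /^AK5[*]R/ {               # pattern A: write out the buffered block
            out = prefix (++n) ".txt"
            printf "%s", buf > out
            close(out)
            inblock = 0
        }
    ' "$1"
}
```

Calling extract_ranges example_file.txt would then leave one file per AK5*R match in the current directory. Blocks that end in AK5*A are dropped because the next AK2 line resets the buffer.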





  • I'm trying to execute a command that first finds a pattern A match (AK5*R) and, from that line, moves to the previous lines until the very first pattern B (AK2) is found. (Note: not the very first pattern B line in the file, which is line 5: AK2*777*6666666, but the first AK2 line encountered going up from each pattern A AK5*R line.) See my updated output.
    – WashichawbachaW
    Feb 9 at 1:49

  • I will run a series of tests. So far, it prints the way I want it to. I will mark it as accepted after I'm done with some files I have here.
    – WashichawbachaW
    Feb 9 at 2:44



POSIX ex to the rescue again!



ex is the POSIX-specified scriptable file editor. For anything involving backwards addressing, it's usually a far better solution than Awk or Sed.



The following one-liner works perfectly on your example_file2.txt:



printf '%s\n' 'g/AK5[*]R/?AK2?,.p' | ex example_file2.txt


On your example_file.txt, it also works, but because the global command in ex can't write to a separate destination for each range acted upon, the desired two output files are merged like so:



AK2*777*7777777
AK3*S6*5**3
AK3*A2*5**3
AK4*3*6969*4
AK4*7*6969*4
AK5*R*5
AK2*777*69696969
AK3*J7*5**3
AK4*3*6969*4
AK5*R*5


However, this is easy enough to handle with another POSIX tool, csplit, which is designed to split files according to a "context."



Portable POSIX solution:



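patA='AK5[*]R'
patB='AK2'

# For each line matching patA, print from the nearest preceding patB line
# through that line, then split the merged output before every patB line.
printf '%s\n' "g/$patA/?$patB?,.p" |
  ex example_file.txt |
  csplit -f my_unique_prefix_ -n 1 -s -k - "/$patB/" '{999}'

# Rename my_unique_prefix_N to eN.txt.
for f in my_unique_prefix_*; do
  mv "$f" "e${f##my_unique_prefix_}.txt";
done

# The first piece (everything before the first patB line) is empty.
rm e0.txt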


There is one final element to make this a perfect solution, which is to renumber the files in reverse order. I haven't done this portion.
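
The missing renumbering step could look roughly like the following. It is a sketch, not part of the solution above: the function name reverse_renumber is hypothetical, and it assumes the files are named e1.txt through eN.txt with no gaps, which is what the rename loop above is meant to leave behind once e0.txt is removed.

```shell
#!/bin/sh
# reverse_renumber: hypothetical helper that swaps e1.txt <-> eN.txt,
# e2.txt <-> e(N-1).txt, and so on, so that the last range becomes e1.txt.
# Assumes e1.txt ... eN.txt exist in the current directory with no gaps.
reverse_renumber() {
    n=0
    while [ -e "e$((n + 1)).txt" ]; do      # count the files
        n=$((n + 1))
    done
    lo=1
    hi=$n
    while [ "$lo" -lt "$hi" ]; do           # swap pairs from the outside in
        mv "e$lo.txt" "e$lo.txt.tmp"
        mv "e$hi.txt" "e$lo.txt"
        mv "e$lo.txt.tmp" "e$hi.txt"
        lo=$((lo + 1))
        hi=$((hi - 1))
    done
}
```

Swapping in place through a temporary name avoids needing a second directory; with an odd number of files the middle one is left untouched.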




If you don't care about the file numbering being in the same order as the file, and if you don't mind if the extension .txt is omitted, and if you don't mind if the files are numbered from e01 rather than from e1, and if you don't mind a diagnostic message being printed about how many lines were put in each file, then we can simplify:



patA='AK5[*]R'
patB='AK2'

printf '%s\n' "g/$patA/?$patB?,.p" |
  ex example_file.txt |
  csplit -f e -k - "/$patB/" '{999}'

rm e00





  • csplit: '/AK2/': match not found. This is what happens.
    – WashichawbachaW
    Feb 13 at 0:41

  • @WashichawbachaW, I get, for example: csplit: '/AK2/': match not found on repetition 16. But that doesn't matter. The -k option means that csplit will leave the files it already created in place, even though it subsequently encountered an error (because you don't have 999 instances of AK2 in your input file). If you're using the version of the command at the very end of my answer, check with ls -l e?? and you should see all the files desired.
    – Wildcard
    Feb 13 at 0:52











edited Feb 9 at 2:42

























answered Feb 8 at 18:11









Isaac


























POSIX ex to the rescue again!



ex is the POSIX-specified scriptable file editor. For anything involving backwards addressing, it's usually a far better solution than Awk or Sed.



The following one-liner works perfectly on your example_file2.txt:



printf '%s\n' 'g/AK5[*]R/?AK2?,.p' | ex example_file.txt


On your example_file.txt, it also works, but because the global command in ex can't write to a separate destination for each range acted upon, the desired two output files are merged like so:



AK2*777*7777777
AK3*S6*5**3
AK3*A2*5**3
AK4*3*6969*4
AK4*7*6969*4
AK5*R*5
AK2*777*69696969
AK3*J7*5**3
AK4*3*6969*4
AK5*R*5


However, this is easy enough to handle—with another POSIX tool, csplit, which is designed to split files according to a "context."
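
To see what csplit does with merged output like the above, here is a small sketch. The names merged.txt and part_ are hypothetical, the input is a shortened version of the merged sections, and the behaviour described in the comments is what I would expect from GNU csplit; other implementations may differ.

```shell
#!/bin/sh
# Sketch: split merged sections back into one file per AK2 block.
# merged.txt and the part_ prefix are names chosen for this demo only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'AK2*777*7777777' 'AK5*R*5' \
              'AK2*777*69696969' 'AK5*R*5' > merged.txt

# Split before every line matching AK2, repeating up to 999 more times.
# -k keeps the pieces already written when the repetitions run out,
# -s silences the per-file byte counts, -n 1 uses one-digit suffixes.
csplit -s -k -f part_ -n 1 merged.txt '/AK2/' '{999}' 2>/dev/null
echo "csplit exit status: $?"

# part_0 holds whatever precedes the first AK2 line (nothing here).
for f in part_*; do
    printf '%s: %s line(s)\n' "$f" "$(wc -l < "$f" | tr -d ' ')"
done
```

The non-zero exit status is expected: the input has far fewer than 999 AK2 lines, so csplit eventually reports a missing match, and -k is what preserves the pieces written up to that point.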



Portable POSIX solution:



patA='AK5[*]R'
patB='AK2'

printf '%s\n' "g/$patA/?$patB?,.p" |
  ex example_file.txt |
  csplit -f my_unique_prefix_ -n 1 -s -k - "/$patB/" '{999}'

for f in my_unique_prefix_*; do
  mv "$f" "e${f##my_unique_prefix_}.txt";
done

rm e0.txt


There is one final element to make this a perfect solution, which is to renumber the files in reverse order. I haven't done this portion.
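
If reversed numbering is wanted, one way to sketch that missing step is a two-pass rename. The helper name reverse_numbering and the rev_ prefix are hypothetical, and the sketch assumes the files are numbered contiguously as e1.txt, e2.txt, and so on in the current directory, with no existing names starting with rev_. Note that the ex output shown earlier is already in file order, so this only matters if numbering from the bottom is what is needed.

```shell
#!/bin/sh
# Sketch: reverse the numbering of e1.txt .. eN.txt in the current directory.
# reverse_numbering is a hypothetical helper, not part of the answer above.
reverse_numbering() {
    # Count the contiguously numbered files.
    n=0
    while [ -e "e$((n + 1)).txt" ]; do
        n=$((n + 1))
    done

    # Pass 1: move each file aside under its new number, so that no
    # rename can overwrite a file that has not been moved yet.
    i=1
    while [ "$i" -le "$n" ]; do
        mv "e$i.txt" "rev_e$((n - i + 1)).txt"
        i=$((i + 1))
    done

    # Pass 2: drop the temporary prefix.
    i=1
    while [ "$i" -le "$n" ]; do
        mv "rev_e$i.txt" "e$i.txt"
        i=$((i + 1))
    done
}

# Demo in a throwaway directory with three placeholder files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for k in 1 2 3; do
    echo "section $k" > "e$k.txt"
done
reverse_numbering
cat e1.txt    # now holds what used to be e3.txt
```

Counting with a loop instead of globbing e*.txt avoids the lexical ordering problem where e10.txt would sort before e2.txt.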




If you don't care about the file numbering being in the same order as the file, and if you don't mind if the extension .txt is omitted, and if you don't mind if the files are numbered from e01 rather than from e1, and if you don't mind a diagnostic message being printed about how many lines were put in each file, then we can simplify:



patA='AK5[*]R'
patB='AK2'

printf '%s\n' "g/$patA/?$patB?,.p" |
  ex example_file.txt |
  csplit -f e -k - "/$patB/" '{999}'

rm e00



























  • csplit: '/AK2/': match not found. This is what happens.
    – WashichawbachaW
    Feb 13 at 0:41










  • @WashichawbachaW, I get, for example: csplit: '/AK2/': match not found on repetition 16. But that doesn't matter. The -k option means that csplit will leave the files it already created in place, even though it subsequently encountered an error (because you don't have 999 instances of AK2 in your input file). If you're using the version of the command at the very end of my answer, check with ls -l e?? and you should see all the files desired.
    – Wildcard
    Feb 13 at 0:52























edited Feb 9 at 3:58

























answered Feb 9 at 3:39









Wildcard

























 
