Select lines based on lines above them

I have a list of items, from which I want to select the names of active items:

item {
 status: "Active"
 properties {
  key_a: value
 }
 id: 42
 name: "Foo"
}

item {
 status: "Disabled"
 properties {
  key_b: value
 }
 id: 12
 name: "Bar"
}

item {
 status: "Active"
 id: 2
 name: "Baz"
}

I know that I can extract the names using capture groups with pcregrep:

$ cat list.txt | pcregrep -o1 -i '^ name: "(.*)"'
Foo
Bar
Baz

Using an OR expression, I can also get a list of repeated status values and names:

$ cat list.txt | pcregrep -o2 -i '^ (status|name): "(.*)"'
Active
Foo
Disabled
Bar
Active
Baz

Finally, I need to filter the names in the list based on their preceding lines. How can I do this?

The final output should be:

Foo
Baz

edited Aug 11 at 12:34

asked Aug 10 at 16:36

danijar


Why don't the properties have a closing brace?
– glenn jackman, Aug 10 at 23:28

I had the same question as glenn jackman: what about the closing brace for the "properties" block? Secondly, you can get your result by feeding pcregrep's output to: sed -n 'N;s/^Active\n//p'
– Rakesh Sharma, Aug 11 at 3:09

@glennjackman Thanks for pointing that out -- I updated the example to add the missing braces.
– danijar, Aug 11 at 12:35

@RakeshSharma Thanks a lot, this works. Since this seems to be the easiest solution, would you mind creating an answer with a little explanation so I can accept it?
– danijar, Aug 11 at 12:36

Is this a JSON document? Why not use a JSON parser like jq?
– Kusalananda, Aug 11 at 12:38

text-processing grep pcregrep


4 Answers

Accepted answer, 1 vote

Since most of the heavy lifting has already been done by pcregrep, you can pipe its output to this short sed snippet:

 sed -ne 'N;s/^Active\n//p'

This makes sed look at two lines at a time rather than the default of one. The N command appends the next line to the pattern space, separated by a newline (\n). The remaining pattern space is printed only if sed was able to remove a leading Active line, so this is a conditional print. Otherwise nothing is printed, because -n suppresses autoprinting of the pattern space. HTH.
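To try this without pcregrep installed, the sketch below feeds sed the same status/name pairs shown in the question. The function name filter_active_pairs is hypothetical, and the approach assumes the input strictly alternates between a status line and a name line.

```shell
# Hypothetical wrapper around the sed one-liner above.
# Expects alternating lines: a status value, then the matching name.
filter_active_pairs() {
  # N joins each status line with the following name line;
  # s///p prints the pair only if a leading "Active" line could be stripped.
  sed -n 'N;s/^Active\n//p'
}

# Simulated pcregrep -o2 output from the question.
printf '%s\n' Active Foo Disabled Bar Active Baz | filter_active_pairs
```

If an item lacks either field, the pairing drifts and later names are matched against the wrong status, so this only suits input where every item has both lines.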

answered Aug 11 at 17:11

Rakesh Sharma


Answer, 2 votes

I don't think you can do this with a grep variation alone (admittedly I don't know pcregrep). Try awk:

awk '/^ *status.*Active.$/ {ACT = 1} /^ *name:/ && ACT {gsub (/"/, "", $2); print $2; ACT = 0}' file
Foo
Baz
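A slightly more defensive variant, offered only as a sketch: it resets the flag whenever a new item block starts, so a name can never be attributed to an earlier item's status. The function name active_names and the reset-on-item rule are assumptions, not part of the one-liner above.

```shell
# Hypothetical helper: print the names of items whose status is "Active".
# Reads the files given as arguments, or standard input if there are none.
active_names() {
  awk '
    /^ *item/              { active = 0 }   # new block: forget the previous status
    /^ *status: *"Active"/ { active = 1 }   # remember that this block is active
    /^ *name:/ && active   { gsub(/"/, "", $2); print $2 }
  ' "$@"
}
```

Like the original, this prints only the second whitespace-separated field, so a quoted name containing spaces would be cut off at the first space.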

answered Aug 10 at 17:04

RudiC


pcregrep is just grep built with Perl-compatible regular expressions.
– glenn jackman,
Aug 11 at 13:55


Answer, 1 vote

You can use sed too:

sed '/status.*Active/,/name/!d;/name/!d;s/[^"]*"\([^"]*\)"/\1/' infile
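To see why this works, the same idea can be spelled out one command per line. This is only an annotated sketch that uses -n with p instead of !d; it assumes the status line comes before the name line within each item, and the function name active_names_sed is hypothetical.

```shell
# Hypothetical helper: annotated equivalent of the sed range approach.
active_names_sed() {
  sed -n '
    # Only act inside the range from an Active status line to the next name line.
    /status.*Active/,/name/ {
      # On the name line, keep just the text between the quotes and print it.
      /name/ s/[^"]*"\([^"]*\)"/\1/p
    }
  ' "$@"
}
```

Lines of a disabled item never open the range, so their name line is never reached by the substitution.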

answered Aug 10 at 18:32

ctac_


Answer, 1 vote

You could also use Perl's range operator and constrain it with a boolean condition to deal with nested braces in a block.

Normally, one would write a range in Perl as /re1/ ... /re2/; this causes perl to select the blocks that begin on a line matching /re1/ and end on a line matching /re2/. We could constrain this further to, say, /re1/ ... /re2/ && $depth==0.

This causes perl to end a block only when the additional constraint of the depth being zero is met. In this case, the block ends only when the } is found that causes the depth count to fall to zero; otherwise, the block accumulation continues past that line.

perl -lne '
 if ( /\{/ ... /\}/ && !$depth ) {
  if    ( /\{/ ) { $depth = /^\h*item\h+\{\h*$/ ? 0 : ++$depth; }
  elsif ( /\}/ ) { print($name),undef($flag) if !$depth-- && $flag; }
  elsif ( /^\h*status:\h*"Active"\h*$/ ) { $flag = 1; }
  elsif ( /^\h*name:\h/ ) { $name = (split /"/)[1]; }
 }
' input.file

answered Aug 12 at 11:20

Rakesh Sharma
