Getting a match count of objects in a file

I have a large file that has entries that look like this:



entry-id: 1
sn: John
cn: Smith
empType: A
ADID: 123456

entry-id: 2
sn: James
cn: Smith
empType: B
ADID: 123456

entry-id: 3
sn: Jobu
cn: Smith
empType: A
ADID: 123456

entry-id: 4
sn: Jobu
cn: Smith
empType: A
ADID:


Each entry is separated by a blank line. I need a count of the entries that have an empType of A and MUST ALSO have a value after ADID (a total of 2 in this sample). I've tried awk, grep, and egrep, and I'm still having no luck. Any ideas?







  • What exactly did you try in awk? I would think something like awk -vRS= '/empType: A/ && /ADID: [0-9]+/ {n++} END {print n}' file should work
    – steeldriver
    Dec 14 '17 at 3:39










  • Running your command, I got "awk: record `smapsHistory: [NDSEn...' too long record number 213244". There are only about 100 records with an employeeType of C, and it's going crazy...
    – King of NES
    Dec 14 '17 at 3:54











  • You did include the correct filename to read as input?
    – bu5hman
    Dec 14 '17 at 4:15










  • it was the correct file...
    – King of NES
    Dec 14 '17 at 4:18














asked Dec 14 '17 at 3:29 by King of NES

6 Answers

Awk solution:



awk '/empType: /{ f=($2=="A"? 1:0) } f && /ADID: [0-9]+/{ c++ } END{ print c }' file



  • f - flag indicating empType: A section processing


  • c - count of empType: A entries with filled ADID key


The output:



2





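    Here is an alternative awk solution that uses a blank line ("") as the record separator RS and a newline (\n) as the field separator FS, so each entry is one record and each line of the entry is one field:

    BEGIN { RS=""; FS="\n" }
    {
        # $4 is the empType line and $5 is the ADID line
        split($4,a,": ")
        split($5,b,": ")
    }
    a[2]=="A" && b[2]!="" { c++ }
    END { print c }

    The script can be executed with:

    awk -f main.awk file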
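
The script above relies on empType and ADID being the fourth and fifth lines of every entry. If that cannot be guaranteed, the fields can be looked up by name instead. Here is a minimal Python sketch of the same blank-line-separated idea, assuming the "key: value" layout shown in the question; count_matches and the shortened sample data are hypothetical and used only for illustration:

```python
def count_matches(text):
    """Count entries with empType A and a non-empty ADID value."""
    count = 0
    # Entries are separated by a blank line (similar to awk's RS="").
    for entry in text.split("\n\n"):
        fields = {}
        for line in entry.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # skip lines that contain no colon
                fields[key.strip()] = value.strip()
        # Look the fields up by name, not by position within the entry.
        if fields.get("empType") == "A" and fields.get("ADID"):
            count += 1
    return count


# Shortened sample (assumption): entry 3 has its fields in a different order.
sample = """entry-id: 1
empType: A
ADID: 123456

entry-id: 2
empType: B
ADID: 123456

entry-id: 3
ADID: 123456
empType: A

entry-id: 4
empType: A
ADID:
"""

print(count_matches(sample))  # prints 2
```

Because the lookup is by key, entry 3 is still counted even though its ADID line comes before its empType line.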





      Simple two grep method, where data is the input file:



      grep -A1 'empType: A' data | grep -c 'ADID: .\+'


      Output:



      2





        I like the idea of getting the records that satisfy your requirements (better for testing, for example) and counting them with wc -l. So here is an awk script that does just that:



        #!/usr/bin/env awk
        # getids.awk

        BEGIN {
            RS="";
            FS="\n"
        }

        /ADID: [0-9]/ && /empType: A/ {print $1}


        And here it is in action:



        user@host:~$ awk -f getids.awk data.txt
        entry-id: 1
        entry-id: 3

        user@host:~$ awk -f getids.awk data.txt | wc -l
        2


        Of course if you just want the count we can do that too:



        #!/usr/bin/env awk
        # count.awk

        BEGIN {
            RS="";
            FS="\n";
            count=0;
        }

        /ADID: [0-9]/ && /empType: A/ {count++}

        END {
            print count
        }



        And because I love Python, here is a Python script that does the same thing:



        #!/usr/bin/env python2
        # -*- coding: ascii -*-
        """getids.py"""

        import sys

        # Create a list to store the parsed records
        records = []

        # Iterate over the lines of the input file
        with open(sys.argv[1]) as data:
            for line in data:

                # When an "entry-id" is reached, create a new record
                if line.startswith('entry-id'):
                    entry_id = line.split(':')[1].strip()
                    records.append({'entry-id': entry_id})

                # For other lines, update the current record
                elif line.strip():
                    key = line.partition(':')[0].strip()
                    value = line.partition(':')[2].strip()
                    records[-1][key] = value

        # Extract the list of records meeting the desired criteria
        # (.get avoids a KeyError when a record lacks one of the keys)
        matches = [record for record in records
                   if record.get('empType') == 'A' and record.get('ADID')]

        # Print out the entry-ids for all of the matches
        for match in matches:
            print('entry-id: ' + match['entry-id'])


        And here's the Python script in action:



        user@host:~$ python getids.py data.txt
        entry-id: 1
        entry-id: 3

        user@host:~$ python getids.py data.txt | wc -l
        2


        And if we really do just want the counts:



        #!/usr/bin/env python2
        # -*- coding: ascii -*-
        """count.py"""

        import sys

        # Keep a count of the number of matches
        count = 0

        # Use flags to keep track of the current record
        emptype_flag = False
        adid_flag = False

        # Iterate over the lines of the input file
        with open(sys.argv[1]) as data:
            for line in data:

                # When an "entry-id" is reached, reset the flags
                if line.startswith('entry-id'):
                    emptype_flag = False
                    adid_flag = False
                elif line.strip() == "empType: A":
                    emptype_flag = True
                elif line.startswith("ADID") and line.strip().split(':')[1]:
                    adid_flag = True

                # If both conditions hold, then increment the counter
                # and reset the flags
                if emptype_flag and adid_flag:
                    count = count + 1
                    emptype_flag = False
                    adid_flag = False

        # Print the number of matches
        print(count)


        And, while we're at it, how about a pure Bash script? Here's one:



        #!/usr/bin/env bash

        # getids.bash

        # Read the file line by line; each "entry-id:" line starts a new record
        while IFS= read -r line; do
            if [[ "$line" =~ "entry-id:" ]]; then
                entry_id="$line"
                emptype=false
                adid=false
            elif [[ "$line" =~ "empType: A" ]]; then
                emptype=true
            elif [[ "$line" =~ ADID:\ [0-9] ]]; then
                adid=true
            fi
            # Print the entry-id once both conditions hold, then reset the flags
            if [[ "$emptype" == true && "$adid" == true ]]; then
                echo "$entry_id"
                emptype=false
                adid=false
            fi
        done < "$1"


        And running the bash script:



        user@host:~$ bash getids.bash data.txt
        entry-id: 1
        entry-id: 3


        And finally, here's something using just grep and wc:



        user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: \S" | wc -l

        2





          With perl, that could be:



          perl -l -00ne '
            my %f = /(.*?):\s*(.*)/g;
            ++$n if $f{empType} eq "A" && $f{ADID} ne "";
            END {print 0+$n}' < file



          • -n causes the code given to -e to be applied to each input record.

          • -00 makes the records paragraphs.

          • We build a %f associative array whose keys and values come from each (key):spaces(value) match in the record.

          • We increment $n when the conditions are met.

          • We print $n in the END block (adding 0 to make sure we get 0 and not an empty string if there's no match).
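
For readers more comfortable with Python, the same idea (split the input into paragraphs, turn each paragraph into a key/value mapping, then test the two fields) could be sketched roughly as follows. This is an illustration under assumptions, not a translation of the one-liner: count_with_regex is a hypothetical name, and the pattern uses [ \t]* where the perl code has \s* so that an empty value cannot swallow the following line.

```python
import re


def count_with_regex(text):
    """Count paragraphs with empType A and a non-empty ADID value."""
    count = 0
    # Split on runs of blank lines, similar in spirit to perl -00.
    for paragraph in re.split(r"\n{2,}", text.strip()):
        # Build a dict from every "key: value" line, like the %f hash.
        fields = dict(re.findall(r"^(.*?):[ \t]*(.*)$", paragraph, re.MULTILINE))
        if fields.get("empType") == "A" and fields.get("ADID", "") != "":
            count += 1
    return count


# The four sample entries from the question.
sample = """entry-id: 1
sn: John
cn: Smith
empType: A
ADID: 123456

entry-id: 2
sn: James
cn: Smith
empType: B
ADID: 123456

entry-id: 3
sn: Jobu
cn: Smith
empType: A
ADID: 123456

entry-id: 4
sn: Jobu
cn: Smith
empType: A
ADID:
"""

print(count_with_regex(sample))  # prints 2
```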





            I wasn't able to do anything with the -A option of grep, and the other answers returned "label too long" or some other error.
            What I did find that worked was:



            perl -000 -ne 'print if/empType: A/' file.ldif|grep -i -c "^ADID: [0-9A-Za-z]"


            I didn't know what perl -000 does, but I think it makes perl read the input a paragraph at a time, so the match can look across several lines of an entry. As I understand the rest:

            • -n wraps the program in a loop over the input
            • -e supplies a one-line program
            • print if /empType: A/ prints the paragraph if it contains empType: A
            • | pipes the matched paragraphs to grep
            • grep -i -c "^ADID: [0-9A-Za-z]" ignores case and counts the ADID lines that have a value

            I'm not sure whether the other commands failed because of my Linux version, but the above command worked pretty well. I'm not sure how to make the empType match ignore case, though.
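
On the open question of ignoring case for empType: in the perl one-liner, adding the i modifier to the match (/empType: A/i) should make it case-insensitive. The same two-step idea (select the paragraphs, then check for an ADID line with a value) can also be sketched in Python. The sketch below is only an illustration: count_ignoring_case and the pattern names are hypothetical, and anchoring the patterns to whole lines is an added assumption, so "empType: AB" is not counted here, whereas the unanchored perl match would accept it.

```python
import re

# Case-insensitive patterns, matched line by line within a paragraph.
EMPTYPE_A = re.compile(r"^empType:[ \t]*A[ \t]*$", re.IGNORECASE | re.MULTILINE)
ADID_SET = re.compile(r"^ADID:[ \t]*\S", re.IGNORECASE | re.MULTILINE)


def count_ignoring_case(text):
    """Count blank-line-separated entries with empType A and an ADID value."""
    paragraphs = re.split(r"\n[ \t]*\n", text)
    return sum(1 for p in paragraphs
               if EMPTYPE_A.search(p) and ADID_SET.search(p))


# Hypothetical sample with mixed-case keys and values.
sample = "emptype: a\nADID: 12\n\nEMPTYPE: A\nadid:\n\nempType: B\nADID: 9\n"
print(count_ignoring_case(sample))  # prints 1
```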






Awk solution: answered Dec 14 '17 at 6:00 by RomanPerekhrest (edited Dec 14 '17 at 6:12)

Alternative awk solution (RS and FS): answered Dec 14 '17 at 6:39 by etopylight

Two grep method: answered Dec 14 '17 at 7:09 by agc (edited Dec 14 '17 at 7:15)

                                  4,1101935




                                  4,1101935




















                                      up vote
                                      0
                                      down vote













                                      I like the idea of get the records that satisfy your requirements (better for e.g. testing) and counting them with wc -l. So here is an awk script that does just that:



                                      #!/usr/bin/env awk
                                      # getids.awk

                                      BEGIN
                                      RS="";
                                      FS="n"


                                      /ADID: [0-9]/ && /empType: A/print $1


                                      And here it is in action:



                                      user@host:~$ awk -f getids.awk data.txt
                                      entry-id: 1
                                      entry-id: 3

                                      user@host:~$ awk -f getids.awk data.txt | wc -l
                                      2


                                      Of course if you just want the count we can do that too:



                                      #!/usr/bin/env awk
                                      # count.awk

                                      BEGIN
                                      RS="";
                                      FS="n";
                                      count=0;


                                      /ADID: [0-9]/ && /empType: A/count++

                                      END
                                      print count



                                      And because I love Python, here is a Python script that does the same thing:



                                      #!/usr/bin/env python2
                                      # -*- coding: ascii -*-
                                      """getids.py"""

                                      import sys

                                      # Create a list to store the matched records
                                      records =

                                      # Iterate over the lines of the input file
                                      with open(sys.argv[1]) as data:
                                      for line in data:

                                      # When an "entry-id" is reached, create a new record
                                      if line.startswith('entry-id'):
                                      entry_id = line.split(':')[1].strip()
                                      records.append('entry-id': entry_id)

                                      # For other lines, update the current record
                                      elif line.strip():
                                      key = line.partition(':')[0].strip()
                                      value = line.partition(':')[2].strip()
                                      records[-1][key] = value

                                      # Extract the list of records meeting the desired critera
                                      matches = [record for record in records if record['empType'] == 'A' and record['ADID']]

                                      # Print out the entry-ids for all of the matches
                                      for match in matches:
                                      print('entry-id: ' + match['entry-id'])


                                      And here's the Python script in action:



                                      user@host:~$ python getids.py data.txt
                                      entry-id: 1
                                      entry-id: 3

                                      user@host:~$ python getids.py data.txt | wc -l
                                      2


                                      And if we really do just want the counts:



                                      #!/usr/bin/env python2
                                      # -*- coding: ascii -*-
                                      """count.py"""

                                      import sys

                                      # Keep a count of the number of matches
                                      count = 0

                                      # Use flags to keep track of the current record
                                      emptype_flag = False
                                      adid_flag = False

                                      # Iterate over the lines of the input file
                                      with open(sys.argv[1]) as data:
                                      for line in data:

                                      # When an "entry-id" is reached, reset the flags
                                      if line.startswith('entry-id'):
                                      emptype_flag = False
                                      adid_flag = False
                                      elif line.strip() == "empType: A":
                                      emptype_flag = True
                                      elif line.startswith("ADID") and line.strip().split(':')[1]:
                                      adid_flag = True

                                      # If both conditions hold the increment the counter
                                      # and reset the flags
                                      if emptype_flag and adid_flag:
                                      count = count + 1
                                      emptype_flag = False
                                      adid_flag = False

                                      # Print the number of matches
                                      print(count)


                                      And, while we're at it, how about a pure Bash script? Here's one:



                                      #!/usr/bin/env bash

                                      # getids.bash

                                      while read line; do
                                      if [[ "$line" =~ "entry-id:" ]]; then
                                      entry_id="$line"
                                      emptype=false
                                      adid=false
                                      elif [[ "$line" =~ "empType: A" ]]; then
                                      emptype=true
                                      elif [[ "$line" =~ ADID: [0-9] ]]; then
                                      adid=true
                                      fi
                                      if [[ "$emptype" == true && "$adid" == true ]]; then
                                      echo "$entry_id"
                                      emptype=false
                                      adid=false
                                      fi
                                      done < "$1"


                                      And running the bash script:



                                      user@host:~$ bash getids.bash data.txt
                                      entry-id: 1
                                      entry-id: 3


                                      And finally, here's something using just grep and wc:



                                      user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: S" | wc -l

                                      2





                                      share|improve this answer


























                                        up vote
                                        0
                                        down vote













                                        I like the idea of get the records that satisfy your requirements (better for e.g. testing) and counting them with wc -l. So here is an awk script that does just that:



                                        #!/usr/bin/env awk
                                        # getids.awk

                                        BEGIN
                                        RS="";
                                        FS="n"


                                        /ADID: [0-9]/ && /empType: A/print $1


                                        And here it is in action:



                                        user@host:~$ awk -f getids.awk data.txt
                                        entry-id: 1
                                        entry-id: 3

                                        user@host:~$ awk -f getids.awk data.txt | wc -l
                                        2


                                        Of course if you just want the count we can do that too:



                                        #!/usr/bin/env awk
                                        # count.awk

                                        BEGIN
                                        RS="";
                                        FS="n";
                                        count=0;


                                        /ADID: [0-9]/ && /empType: A/count++

                                        END
                                        print count



                                        And because I love Python, here is a Python script that does the same thing:



                                        #!/usr/bin/env python2
                                        # -*- coding: ascii -*-
                                        """getids.py"""

                                        import sys

                                        # Create a list to store the parsed records
                                        records = []

                                        # Iterate over the lines of the input file
                                        with open(sys.argv[1]) as data:
                                            for line in data:

                                                # When an "entry-id" is reached, create a new record
                                                if line.startswith('entry-id'):
                                                    entry_id = line.split(':')[1].strip()
                                                    records.append({'entry-id': entry_id})

                                                # For other non-blank lines, update the current record
                                                elif line.strip():
                                                    key = line.partition(':')[0].strip()
                                                    value = line.partition(':')[2].strip()
                                                    records[-1][key] = value

                                        # Extract the list of records meeting the desired criteria
                                        # (.get avoids a KeyError when a record lacks one of the fields)
                                        matches = [record for record in records if record.get('empType') == 'A' and record.get('ADID')]

                                        # Print out the entry-ids for all of the matches
                                        for match in matches:
                                            print('entry-id: ' + match['entry-id'])


                                        And here's the Python script in action:



                                        user@host:~$ python getids.py data.txt
                                        entry-id: 1
                                        entry-id: 3

                                        user@host:~$ python getids.py data.txt | wc -l
                                        2


                                        And if we really do just want the counts:



                                        #!/usr/bin/env python2
                                        # -*- coding: ascii -*-
                                        """count.py"""

                                        import sys

                                        # Keep a count of the number of matches
                                        count = 0

                                        # Use flags to keep track of the current record
                                        emptype_flag = False
                                        adid_flag = False

                                        # Iterate over the lines of the input file
                                        with open(sys.argv[1]) as data:
                                            for line in data:

                                                # When an "entry-id" is reached, reset the flags
                                                if line.startswith('entry-id'):
                                                    emptype_flag = False
                                                    adid_flag = False
                                                elif line.strip() == "empType: A":
                                                    emptype_flag = True
                                                elif line.startswith("ADID") and line.strip().split(':')[1]:
                                                    adid_flag = True

                                                # If both conditions hold, then increment the counter
                                                # and reset the flags
                                                if emptype_flag and adid_flag:
                                                    count = count + 1
                                                    emptype_flag = False
                                                    adid_flag = False

                                        # Print the number of matches
                                        print(count)


                                        And, while we're at it, how about a pure Bash script? Here's one:



                                        #!/usr/bin/env bash

                                        # getids.bash

                                        while IFS= read -r line; do
                                            if [[ "$line" =~ "entry-id:" ]]; then
                                                # Start of a new record: remember its id and reset the flags
                                                entry_id="$line"
                                                emptype=false
                                                adid=false
                                            elif [[ "$line" =~ "empType: A" ]]; then
                                                emptype=true
                                            elif [[ "$line" =~ ADID:\ [0-9] ]]; then
                                                adid=true
                                            fi
                                            if [[ "$emptype" == true && "$adid" == true ]]; then
                                                echo "$entry_id"
                                                emptype=false
                                                adid=false
                                            fi
                                        done < "$1"


                                        And running the bash script:



                                        user@host:~$ bash getids.bash data.txt
                                        entry-id: 1
                                        entry-id: 3


                                        And finally, here's something using just grep and wc:



                                        user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: \S" | wc -l

                                        2




































                                          edited Dec 14 '17 at 13:36

























                                          answered Dec 14 '17 at 5:39









                                          igal


































                                              With perl, that could be:



                                              perl -l -00ne '
                                                my %f = /(.*?):\s*(.*)/g;
                                                ++$n if $f{empType} eq "A" && $f{ADID} ne "";
                                                END {print 0+$n}' < file



                                              • -n causes the code given to -e to be applied to each input record.

                                              • -00 makes the records paragraphs.

                                              • We build a %f associative array in which each key is mapped to its value, one pair for every (key):spaces(value) in the record.

                                              • We increment $n when the conditions are met.

                                              • We print $n in the END block (adding 0 makes sure we get 0 and not an empty string if there is no match).
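
For readers more at home in Python, the steps in the list above can be sketched roughly as follows. This is only an illustration of the same paragraph-mode idea, not a translation of the one-liner: the function name count_matching and the sample text are made up for the example, and it assumes entries are separated by blank lines as in the question.

```python
import re


def count_matching(text, emp_type="A"):
    """Count blank-line-separated records with the given empType and a non-empty ADID."""
    count = 0
    # Split on blank lines, which is roughly what perl -00 (paragraph mode) does
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        # Build a key -> value mapping for this record (the role %f plays in the perl code)
        fields = {}
        for line in paragraph.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        # Count the record only if both conditions hold
        if fields.get("empType") == emp_type and fields.get("ADID", "") != "":
            count += 1
    return count


# Sample data copied from the question
sample = """entry-id: 1
sn: John
cn: Smith
empType: A
ADID: 123456

entry-id: 2
sn: James
cn: Smith
empType: B
ADID: 123456

entry-id: 3
sn: Jobu
cn: Smith
empType: A
ADID: 123456

entry-id: 4
sn: Jobu
cn: Smith
empType: A
ADID:
"""

print(count_matching(sample))  # prints 2
```

Unlike the regex in the perl code, this parses one line at a time, so an empty ADID: line can never pick up the value of the line that follows it.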
































































                                                  edited Dec 14 '17 at 14:57

























                                                  answered Dec 14 '17 at 14:14









                                                  Stéphane Chazelas


































                                                      I couldn't get anywhere with grep -A, and the other answers gave me "label too long" or some other error.
                                                      What I did find that worked was:



                                                      perl -000 -ne 'print if/empType: A/' file.ldif|grep -i -c "^ADID: [0-9A-Za-z]"


                                                      I didn't know what perl -000 does, but as far as I can tell it makes perl read the file one paragraph at a time (the documented switch for paragraph mode is -00), so the match is tested against a whole entry rather than a single line. My reading of the rest:
                                                      -n wraps the code in a loop over the input records
                                                      -e takes the one-line program from the command line
                                                      print if /empType: A/ prints the paragraph if it contains empType: A
                                                      | pipes the matched paragraphs to grep
                                                      grep -i -c "^ADID: [0-9A-Za-z]" ignores case and counts the lines where ADID has a value.
                                                      I'm not sure whether the other commands failed because of my Linux version, but the command above worked pretty well. I'm still not sure how to make the empType match ignore case, though.
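
On the open question of ignoring case for empType: in the perl one-liner, adding the i modifier (/empType: A/i) makes that match case-insensitive. The same two-step filter can also be sketched in Python, shown below as a rough illustration only. The names EMP_TYPE_A, ADID_SET and count_entries, and the sample text, are invented for the example, and the sketch assumes blank-line-separated entries.

```python
import re

# Both patterns ignore case; MULTILINE lets ^ and $ match at each line of an entry
EMP_TYPE_A = re.compile(r"^empType:[ \t]*A[ \t]*$", re.IGNORECASE | re.MULTILINE)
ADID_SET = re.compile(r"^ADID:[ \t]*\S", re.IGNORECASE | re.MULTILINE)


def count_entries(text):
    """Count entries whose empType is A (any case) and whose ADID has a value."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return sum(1 for p in paragraphs if EMP_TYPE_A.search(p) and ADID_SET.search(p))


# Hypothetical sample with mixed-case attribute names
sample = """entry-id: 1
empType: a
ADID: 123456

entry-id: 2
EMPTYPE: A
ADID:

entry-id: 3
emptype: B
ADID: 99

entry-id: 4
Emptype: A
adid: x1
"""

print(count_entries(sample))  # prints 2
```

Entries 1 and 4 are counted; entry 2 has an empty ADID and entry 3 has empType B.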































































                                                          answered Dec 14 '17 at 16:13









                                                          King of NES























                                                               













































































