Getting a match count of objects in a file

I have a large file that has entries that look like this:

    entry-id: 1
    sn: John
    cn: Smith
    empType: A
    ADID: 123456

    entry-id: 2
    sn: James
    cn: Smith
    empType: B
    ADID: 123456

    entry-id: 3
    sn: Jobu
    cn: Smith
    empType: A
    ADID: 123456

    entry-id: 4
    sn: Jobu
    cn: Smith
    empType: A
    ADID: 

Each entry is separated by a blank line. I need a count of the entries that have an empType of A and MUST ALSO have a value after ADID (a total of 2 in the sample above). I've tried awk, grep and egrep, and I'm still having no luck. Any ideas?

linux command-line
asked Dec 14 '17 at 3:29 by King of NES

What exactly did you try in awk? I would think something like awk -vRS= '/empType: A/ && /ADID: [0-9]+/ {n++} END {print n}' file should work.
– steeldriver, Dec 14 '17 at 3:39

Running your command, I got "awk: record `smapsHistory: [NDSEn...' too long record number 213244". There are only about 100 records with an employeeType of C, and it's going crazy...
– King of NES, Dec 14 '17 at 3:54

Did you include the correct filename to read as input?
– bu5hman, Dec 14 '17 at 4:15

It was the correct file...
– King of NES, Dec 14 '17 at 4:18

 6 Answers

Awk solution:

    awk '/empType: /{ f=($2=="A"? 1:0) } f && /ADID: [0-9]+/{ c++ } END{ print c }' file

- f - flag indicating that an empType: A section is being processed
- c - count of empType: A entries with a filled ADID key

The output:

    2

answered Dec 14 '17 at 6:00 by RomanPerekhrest (edited Dec 14 '17 at 6:12)

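Here is an alternative awk solution that uses the empty string "" as the record separator RS (so each blank-line-separated entry is one record) and the newline \n as the field separator FS. It assumes empType is the fourth line of each entry and ADID the fifth, as in the sample.

    BEGIN { RS=""; FS="\n" }
    {
        split($4,a,": ")
        split($5,b,": ")
    }
    a[2]=="A" && b[2]!="" { c++ }
    END { print c }

The script can be executed with:

    awk -f main.awk file
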
answered Dec 14 '17 at 6:39 by etopylight

Simple two grep method, where data is the input file:

    grep -A1 'empType: A' data | grep -c 'ADID: .\+'

Output:

    2

answered Dec 14 '17 at 7:09 by agc (edited Dec 14 '17 at 7:15)

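I like the idea of getting the records that satisfy your requirements (better for testing, for example) and counting them with wc -l. So here is an awk script that does just that:

    #!/usr/bin/env awk
    # getids.awk
    BEGIN {
        # Paragraph mode: records are separated by blank lines,
        # fields by newlines
        RS="";
        FS="\n"
    }
    # Print the first line (the entry-id) of every matching record
    /ADID: [0-9]/ && /empType: A/ { print $1 }
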
And here it is in action:

    user@host:~$ awk -f getids.awk data.txt
    entry-id: 1
    entry-id: 3
    user@host:~$ awk -f getids.awk data.txt | wc -l
    2

Of course, if you just want the count, we can do that too:

    #!/usr/bin/env awk
    # count.awk
    BEGIN {
        RS="";
        FS="\n";
        count=0;
    }
    /ADID: [0-9]/ && /empType: A/ { count++ }
    END {
        print count
    }

And because I love Python, here is a Python script that does the same thing:

    #!/usr/bin/env python2
    # -*- coding: ascii -*-
    """getids.py"""
    import sys

    # Create a list to store the parsed records
    records = []

    # Iterate over the lines of the input file
    with open(sys.argv[1]) as data:
        for line in data:
            # When an "entry-id" is reached, create a new record
            if line.startswith('entry-id'):
                entry_id = line.split(':')[1].strip()
                records.append({'entry-id': entry_id})
            # For other non-blank lines, update the current record
            elif line.strip():
                key = line.partition(':')[0].strip()
                value = line.partition(':')[2].strip()
                records[-1][key] = value

    # Extract the list of records meeting the desired criteria
    # (.get avoids a KeyError for records that lack one of the keys)
    matches = [record for record in records
               if record.get('empType') == 'A' and record.get('ADID')]

    # Print out the entry-ids for all of the matches
    for match in matches:
        print('entry-id: ' + match['entry-id'])

And here's the Python script in action:

    user@host:~$ python getids.py data.txt
    entry-id: 1
    entry-id: 3
    user@host:~$ python getids.py data.txt | wc -l
    2

And if we really do just want the count:

    #!/usr/bin/env python2
    # -*- coding: ascii -*-
    """count.py"""
    import sys

    # Keep a count of the number of matches
    count = 0

    # Use flags to keep track of the current record
    emptype_flag = False
    adid_flag = False

    # Iterate over the lines of the input file
    with open(sys.argv[1]) as data:
        for line in data:
            # When an "entry-id" is reached, reset the flags
            if line.startswith('entry-id'):
                emptype_flag = False
                adid_flag = False
            elif line.strip() == "empType: A":
                emptype_flag = True
            elif line.startswith("ADID") and line.strip().split(':')[1]:
                adid_flag = True

            # If both conditions hold, then increment the counter
            # and reset the flags
            if emptype_flag and adid_flag:
                count = count + 1
                emptype_flag = False
                adid_flag = False

    # Print the number of matches
    print(count)

And, while we're at it, how about a pure Bash script? Here's one:

    #!/usr/bin/env bash
    # getids.bash
    # IFS= and -r keep each line exactly as it appears in the file
    while IFS= read -r line; do
        if [[ "$line" =~ "entry-id:" ]]; then
            # Start of a new record: remember its id and reset the flags
            entry_id="$line"
            emptype=false
            adid=false
        elif [[ "$line" =~ "empType: A" ]]; then
            emptype=true
        elif [[ "$line" =~ ADID:\ [0-9] ]]; then
            adid=true
        fi
        if [[ "$emptype" == true && "$adid" == true ]]; then
            echo "$entry_id"
            emptype=false
            adid=false
        fi
    done < "$1"

And running the bash script:

    user@host:~$ bash getids.bash data.txt
    entry-id: 1
    entry-id: 3

And finally, here's something using just grep and wc:

    user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: \S" | wc -l
    2

answered Dec 14 '17 at 5:39 by igal (edited Dec 14 '17 at 13:36)

With perl, that could be:

    perl -l -00ne '
      my %f = /(.*?):\s*(.*)/g;
      ++$n if $f{empType} eq "A" && $f{ADID} ne "";
      END {print 0+$n}' < file

- -n causes the code given to -e to be applied to each input record
- -00 for records to be paragraphs.
- We build a %f associative array where keys and values are mapped to each (key):spaces(value) in the record.
- and increment $n where the conditions are met.
- we print $n in the END (adding 0 to make sure we get 0 and not an empty string if there's no match).

answered Dec 14 '17 at 14:14 by Stéphane Chazelas (edited Dec 14 '17 at 14:57)

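For readers more comfortable with Python, the key/value mapping idea from the perl answer can be sketched roughly as follows. This is only an illustration, not part of the answer above: the names parse_entries and count_entries and the inline sample are assumptions made for the example, and the whole input is read into memory at once.

```python
import re

def parse_entries(text):
    """Return one dict per blank-line-separated entry, mapping key to value."""
    entries = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        # Each "key: value" line becomes one (key, value) pair. [ \t]* is used
        # instead of \s* so an empty value cannot swallow the following line.
        pairs = re.findall(r"^(.*?):[ \t]*(.*)$", paragraph, re.MULTILINE)
        entries.append(dict(pairs))
    return entries

def count_entries(text, emp_type="A"):
    """Count entries whose empType equals emp_type and whose ADID is non-empty."""
    return sum(
        1
        for fields in parse_entries(text)
        if fields.get("empType", "").strip() == emp_type
        and fields.get("ADID", "").strip() != ""
    )

# Sample data copied from the question (entry 4 has an empty ADID).
sample = """entry-id: 1
sn: John
cn: Smith
empType: A
ADID: 123456

entry-id: 2
sn: James
cn: Smith
empType: B
ADID: 123456

entry-id: 3
sn: Jobu
cn: Smith
empType: A
ADID: 123456

entry-id: 4
sn: Jobu
cn: Smith
empType: A
ADID: 
"""

print(count_entries(sample))  # prints 2 for the sample from the question
```

Because the fields are looked up by name, this does not depend on empType and ADID being on particular lines of an entry.
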
I wasn't able to get anything working with grep -A, and the other answers returned "label too long" or some other error.

What I did find that worked was:

    perl -000 -ne 'print if/empType: A/' file.ldif|grep -i -c "^ADID: [0-9A-Za-z]"

I don't know exactly what perl -000 does, but I think it tells perl to search across multiple lines within a paragraph. My reading of the rest:

- -n: wrap the code in a while loop over the input
- -e: a one-line program(?)
- print if/empType: A/: print the paragraph if it contains empType: A
- |: pipe those matched paragraphs to grep
- grep -i -c "^ADID: [0-9A-Za-z]": match case-insensitively and count the ADID lines that have a value

I'm not sure whether the other commands failed because of my Linux version, but the command above worked pretty well. I'm not sure how to make the empType match case-insensitive, though.

answered Dec 14 '17 at 16:13 by King of NES

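On the open question of matching empType case-insensitively: in the perl one-liner, adding the i modifier to the pattern (/empType: A/i) should be enough. The same idea can also be sketched in Python, assuming entries are separated by blank lines as in the question. The function name count_matching_entries and the inline sample are made up for this illustration, and the whole input is read into memory at once.

```python
import re

# Both patterns are anchored to a single line and ignore case.
# [ \t]* (rather than \s*) keeps each match from running onto the next line.
EMP_TYPE_A = re.compile(r"^empType:[ \t]*A[ \t]*$", re.IGNORECASE | re.MULTILINE)
ADID_FILLED = re.compile(r"^ADID:[ \t]*\S", re.IGNORECASE | re.MULTILINE)

def count_matching_entries(text):
    """Count blank-line-separated entries with empType A and a non-empty ADID."""
    entries = re.split(r"\n\s*\n", text.strip())
    return sum(
        1
        for entry in entries
        if EMP_TYPE_A.search(entry) and ADID_FILLED.search(entry)
    )

# Sample data copied from the question (entry 4 has an empty ADID).
sample = """entry-id: 1
sn: John
cn: Smith
empType: A
ADID: 123456

entry-id: 2
sn: James
cn: Smith
empType: B
ADID: 123456

entry-id: 3
sn: Jobu
cn: Smith
empType: A
ADID: 123456

entry-id: 4
sn: Jobu
cn: Smith
empType: A
ADID: 
"""

print(count_matching_entries(sample))  # prints 2 for the sample from the question
```

For a real file, the text could be obtained with open(path).read() and passed to the same function.
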