Need to sum of last column based on pipe delimiter

Clash Royale CLAN TAG#URR8PPP
up vote
3
down vote
favorite
I have below inputs with huge number of rows
11|ABCD|19900101123123445455|555|AAA|50505050|0000009030
11|ABCD|19900101123123445455|555|AAA|50505050|0000000199
13|ABCD|201803010YYY66666666|600|ETC|20180300|0000084099
11|ABCD|19900101123123445455|555|AAA|50505050|0008995001
And I need to get below output
11|ABCD|19900101123123445455|555|AAA|50505050|9004230
13|ABCD|201803010YYY66666666|600|ETC|20180300|84099
I have been trying with below awk but having too limited knowledge with arrays.
cat test|awk -F"|" ' a[$1]++;b[$2]++;c[$3]++;d[$4]++;e[$5]++;f[$6]+=$6 ; END for (i in a); print i, f[i]'
I need to sum last column of column number 6 and print all first 5 columns, which are separated by pipe and last 6th column as sum of 6th column.
shell-script awk
add a comment |Â
up vote
3
down vote
favorite
I have below inputs with huge number of rows
11|ABCD|19900101123123445455|555|AAA|50505050|0000009030
11|ABCD|19900101123123445455|555|AAA|50505050|0000000199
13|ABCD|201803010YYY66666666|600|ETC|20180300|0000084099
11|ABCD|19900101123123445455|555|AAA|50505050|0008995001
And I need to get below output
11|ABCD|19900101123123445455|555|AAA|50505050|9004230
13|ABCD|201803010YYY66666666|600|ETC|20180300|84099
I have been trying with below awk but having too limited knowledge with arrays.
cat test|awk -F"|" ' a[$1]++;b[$2]++;c[$3]++;d[$4]++;e[$5]++;f[$6]+=$6 ; END for (i in a); print i, f[i]'
I need to sum last column of column number 6 and print all first 5 columns, which are separated by pipe and last 6th column as sum of 6th column.
shell-script awk
add a comment |Â
up vote
3
down vote
favorite
up vote
3
down vote
favorite
I have below inputs with huge number of rows
11|ABCD|19900101123123445455|555|AAA|50505050|0000009030
11|ABCD|19900101123123445455|555|AAA|50505050|0000000199
13|ABCD|201803010YYY66666666|600|ETC|20180300|0000084099
11|ABCD|19900101123123445455|555|AAA|50505050|0008995001
And I need to get below output
11|ABCD|19900101123123445455|555|AAA|50505050|9004230
13|ABCD|201803010YYY66666666|600|ETC|20180300|84099
I have been trying with below awk but having too limited knowledge with arrays.
cat test|awk -F"|" ' a[$1]++;b[$2]++;c[$3]++;d[$4]++;e[$5]++;f[$6]+=$6 ; END for (i in a); print i, f[i]'
I need to sum last column of column number 6 and print all first 5 columns, which are separated by pipe and last 6th column as sum of 6th column.
shell-script awk
I have below inputs with huge number of rows
11|ABCD|19900101123123445455|555|AAA|50505050|0000009030
11|ABCD|19900101123123445455|555|AAA|50505050|0000000199
13|ABCD|201803010YYY66666666|600|ETC|20180300|0000084099
11|ABCD|19900101123123445455|555|AAA|50505050|0008995001
And I need to get below output
11|ABCD|19900101123123445455|555|AAA|50505050|9004230
13|ABCD|201803010YYY66666666|600|ETC|20180300|84099
I have been trying with below awk but having too limited knowledge with arrays.
cat test|awk -F"|" ' a[$1]++;b[$2]++;c[$3]++;d[$4]++;e[$5]++;f[$6]+=$6 ; END for (i in a); print i, f[i]'
I need to sum last column of column number 6 and print all first 5 columns, which are separated by pipe and last 6th column as sum of 6th column.
shell-script awk
edited Mar 25 at 11:51
asked Mar 25 at 11:44
Ganesh
213
213
add a comment |Â
add a comment |Â
4 Answers
4
active
oldest
votes
up vote
5
down vote
accepted
Awk solution:
awk 'BEGIN"
a[$1 FS $2 FS $3 FS $4 FS $5 FS $6] += $7
END for (i in a) print i, a[i] ' file
The output:
11|ABCD|19900101123123445455|555|AAA|50505050|9004230
13|ABCD|201803010YYY66666666|600|ETC|20180300|84099
add a comment |Â
up vote
9
down vote
With GNU datamash command:
$ datamash -t'|' -s -g 1,2,3,4,5,6 sum 7 < infile
11|ABCD|19900101123123445455|555|AAA|50505050|9004230
13|ABCD|201803010YYY66666666|600|ETC|20180300|8409
In datamash v1.2+, you can specify the columns range also.
$ datamash -t'|' -s -g 1-6 sum 7 < infile
Or shortest AWK alternative and where you had N columns and you should not specify all one by one:
awk -F'|' 'x=$NF;NF--; a[$0]+=x ENDfor(i in a) print i, a[i]' OFS='|' infile
add a comment |Â
up vote
4
down vote
The idea is right, but for such a requirement you create the hash key as the values except the last column and use that key to sum up values in the last column. Once all the lines are processed in the END clause we print the summed up values
awk '
BEGIN FS=OFS="
hashKey = ""
for(i=1;i<=(NF-1); i++)
hashKey = ( hashKey ? (hashKey FS $i):$i )
total[hashKey]+=$NF
END for ( j in total ) print j, total[j]
' file
add a comment |Â
up vote
0
down vote
and perl
perl -lne '
$sum$1 += $2 if /(.*)|(.*)/
} END ETC END Â
up vote
0
down vote
and perl
perl -lne '
$sum$1 += $2 if /(.*) END Â
up vote
0
down vote
up vote
0
down vote
and perl
perl -lne '
$sum$1 += $2 if /(.*) END (.*)/
END {
print "$_|$sum$_" for keys %sum
' file
answered Mar 25 at 15:35
glenn jackman
46.1k265101
46.1k265101
add a comment |Â
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f433404%2fneed-to-sum-of-last-column-based-on-pipe-delimiter%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password