Need to sum of last column based on pipe delimiter

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
3
down vote

favorite
1












I have below inputs with huge number of rows



11|ABCD|19900101123123445455|555|AAA|50505050|0000009030
11|ABCD|19900101123123445455|555|AAA|50505050|0000000199
13|ABCD|201803010YYY66666666|600|ETC|20180300|0000084099
11|ABCD|19900101123123445455|555|AAA|50505050|0008995001


And I need to get below output



11|ABCD|19900101123123445455|555|AAA|50505050|9004230
13|ABCD|201803010YYY66666666|600|ETC|20180300|84099


I have been trying with below awk but having too limited knowledge with arrays.



cat test|awk -F"|" ' a[$1]++;b[$2]++;c[$3]++;d[$4]++;e[$5]++;f[$6]+=$6 ; END for (i in a); print i, f[i]'


I need to sum last column of column number 6 and print all first 5 columns, which are separated by pipe and last 6th column as sum of 6th column.







share|improve this question


























    up vote
    3
    down vote

    favorite
    1












    I have below inputs with huge number of rows



    11|ABCD|19900101123123445455|555|AAA|50505050|0000009030
    11|ABCD|19900101123123445455|555|AAA|50505050|0000000199
    13|ABCD|201803010YYY66666666|600|ETC|20180300|0000084099
    11|ABCD|19900101123123445455|555|AAA|50505050|0008995001


    And I need to get below output



    11|ABCD|19900101123123445455|555|AAA|50505050|9004230
    13|ABCD|201803010YYY66666666|600|ETC|20180300|84099


    I have been trying with below awk but having too limited knowledge with arrays.



    cat test|awk -F"|" ' a[$1]++;b[$2]++;c[$3]++;d[$4]++;e[$5]++;f[$6]+=$6 ; END for (i in a); print i, f[i]'


    I need to sum last column of column number 6 and print all first 5 columns, which are separated by pipe and last 6th column as sum of 6th column.







    share|improve this question
























      up vote
      3
      down vote

      favorite
      1









      up vote
      3
      down vote

      favorite
      1






      1





      I have below inputs with huge number of rows



      11|ABCD|19900101123123445455|555|AAA|50505050|0000009030
      11|ABCD|19900101123123445455|555|AAA|50505050|0000000199
      13|ABCD|201803010YYY66666666|600|ETC|20180300|0000084099
      11|ABCD|19900101123123445455|555|AAA|50505050|0008995001


      And I need to get below output



      11|ABCD|19900101123123445455|555|AAA|50505050|9004230
      13|ABCD|201803010YYY66666666|600|ETC|20180300|84099


      I have been trying with below awk but having too limited knowledge with arrays.



      cat test|awk -F"|" ' a[$1]++;b[$2]++;c[$3]++;d[$4]++;e[$5]++;f[$6]+=$6 ; END for (i in a); print i, f[i]'


      I need to sum last column of column number 6 and print all first 5 columns, which are separated by pipe and last 6th column as sum of 6th column.







      share|improve this question














      I have below inputs with huge number of rows



      11|ABCD|19900101123123445455|555|AAA|50505050|0000009030
      11|ABCD|19900101123123445455|555|AAA|50505050|0000000199
      13|ABCD|201803010YYY66666666|600|ETC|20180300|0000084099
      11|ABCD|19900101123123445455|555|AAA|50505050|0008995001


      And I need to get below output



      11|ABCD|19900101123123445455|555|AAA|50505050|9004230
      13|ABCD|201803010YYY66666666|600|ETC|20180300|84099


      I have been trying with below awk but having too limited knowledge with arrays.



      cat test|awk -F"|" ' a[$1]++;b[$2]++;c[$3]++;d[$4]++;e[$5]++;f[$6]+=$6 ; END for (i in a); print i, f[i]'


      I need to sum last column of column number 6 and print all first 5 columns, which are separated by pipe and last 6th column as sum of 6th column.









      share|improve this question













      share|improve this question




      share|improve this question








      edited Mar 25 at 11:51

























      asked Mar 25 at 11:44









      Ganesh

      213




      213




















          4 Answers
          4






          active

          oldest

          votes

















          up vote
          5
          down vote



          accepted










          Awk solution:



          awk 'BEGIN" 
          a[$1 FS $2 FS $3 FS $4 FS $5 FS $6] += $7
          END for (i in a) print i, a[i] ' file


          The output:



          11|ABCD|19900101123123445455|555|AAA|50505050|9004230
          13|ABCD|201803010YYY66666666|600|ETC|20180300|84099





          share|improve this answer





























            up vote
            9
            down vote













            With GNU datamash command:



            $ datamash -t'|' -s -g 1,2,3,4,5,6 sum 7 < infile
            11|ABCD|19900101123123445455|555|AAA|50505050|9004230
            13|ABCD|201803010YYY66666666|600|ETC|20180300|8409


            In datamash v1.2+, you can specify the columns range also.



            $ datamash -t'|' -s -g 1-6 sum 7 < infile


            Or shortest AWK alternative and where you had N columns and you should not specify all one by one:



            awk -F'|' 'x=$NF;NF--; a[$0]+=x ENDfor(i in a) print i, a[i]' OFS='|' infile





            share|improve this answer





























              up vote
              4
              down vote













              The idea is right, but for such a requirement you create the hash key as the values except the last column and use that key to sum up values in the last column. Once all the lines are processed in the END clause we print the summed up values



              awk '
              BEGIN FS=OFS="
              hashKey = ""
              for(i=1;i<=(NF-1); i++)
              hashKey = ( hashKey ? (hashKey FS $i):$i )

              total[hashKey]+=$NF

              END for ( j in total ) print j, total[j]
              ' file





              share|improve this answer



























                up vote
                0
                down vote













                and perl



                perl -lne '
                $sum$1 += $2 if /(.*)|(.*)/
                } END ETC END  










                up vote
                0
                down vote













                and perl



                perl -lne '
                $sum$1 += $2 if /(.*) END  








                up vote
                0
                down vote










                up vote
                0
                down vote









                and perl



                perl -lne '
                $sum$1 += $2 if /(.*) END (.*)/
                END {
                print "$_|$sum$_" for keys %sum
                ' file






                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered Mar 25 at 15:35









                glenn jackman

                46.1k265101




                46.1k265101






















                     

                    draft saved


                    draft discarded


























                     


                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function ()
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f433404%2fneed-to-sum-of-last-column-based-on-pipe-delimiter%23new-answer', 'question_page');

                    );

                    Post as a guest













































































                    Popular posts from this blog

                    Peggy Mitchell

                    Palaiologos

                    The Forum (Inglewood, California)