Removing duplicate entries and replacing it with comma | Bash

I have a file that contains IP addresses and port numbers, one pair per line, in this order:

ipaddress : port

1.1.1.1:21
1.1.1.1:22
2.2.2.2:443
3.3.3.3:80
3.3.3.3:443

I need the result in the format below, with all ports for the same IP address collected on one line and separated by commas:

ipaddress : port, port

1.1.1.1:21,22
2.2.2.2:443
3.3.3.3:80,443

      text-processing awk sed






      asked Feb 1 at 13:33 by user334662; edited Feb 1 at 15:44 by Jeff Schaller




















          5 Answers
          Assuming there are no trailing spaces on the lines in the input file:

          $ awk -F ':' 'BEGIN { OFS=FS } $1 in ports { ports[$1] = ports[$1] "," $2; next } { ports[$1] = $2 } END { for (ip in ports) print ip, ports[ip] }' file
          3.3.3.3:80,443
          1.1.1.1:21,22
          2.2.2.2:443

          The awk script,

          BEGIN { OFS=FS }
          $1 in ports { ports[$1] = ports[$1] "," $2; next }
          { ports[$1] = $2 }
          END { for (ip in ports) print ip, ports[ip] }

          first sets the output field separator to be the same as the input field separator, which is a : character (given on the command line with -F ':'). For each line, it then tests whether the first field (the IP address) is already a key in the ports array. If it is, the port number (the second field) is appended to that array entry with a comma as a delimiter. If it is not, the entry is simply set to the port number for that IP address.

          At the end, all stored IP addresses are printed with their collected port numbers. The order in which for (ip in ports) visits the keys is unspecified, which is why the output above is not in input order.
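
          If the output should follow the order in which the IP addresses first appear, one possible variation is to record that order in a second array. This is only a sketch under the same no-trailing-spaces assumption; the function name group_ports and the order array are illustrative and not part of the answer above.

```shell
# Sketch: group ports by IP while keeping first-seen order.
# group_ports is a hypothetical helper name used for illustration.
group_ports() {
    awk -F ':' '
        BEGIN { OFS = FS }
        # IP not seen yet: remember its position and store its port.
        !($1 in ports) { order[++n] = $1; ports[$1] = $2; next }
        # IP seen before: append the port after a comma.
        { ports[$1] = ports[$1] "," $2 }
        # Print in the order the IPs first appeared.
        END { for (i = 1; i <= n; i++) print order[i], ports[order[i]] }
    ' "$@"
}

# Usage with the sample data from the question.
printf '%s\n' 1.1.1.1:21 1.1.1.1:22 2.2.2.2:443 3.3.3.3:80 3.3.3.3:443 | group_ports
```

          With the sample data this prints 1.1.1.1:21,22, then 2.2.2.2:443, then 3.3.3.3:80,443. The function also accepts file names as arguments, since they are passed straight through to awk.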






          • Thank you so much, it worked :) – user334662, Feb 1 at 13:53


















          With GNU Datamash:

          datamash -t: -s groupby 1 collapse 2 < file

          If your data are already sorted, you can omit the -s.

          Or using an anonymous array inside a hash in Perl:

          $ perl -F: -lne '
              push @{ $h{$F[0]} }, $F[1]
              }{
              for $k (sort keys %h) { print "$k:", join ",", @{ $h{$k} } }
          ' file
          1.1.1.1:21,22
          2.2.2.2:443
          3.3.3.3:80,443





          … uniq`; do awk -F ":" -v i="$i" '$1 == i {print i,$2}' l.txt …











          You can do this with the sed editor. We keep two lines in the pattern space at any time and look for a change in the IP address. As long as the next line carries the same IP, we strip the IP from the second line and join its port onto the first line with a comma. Otherwise an IP change has been detected: we print the first line only, remove it from the pattern space, read the next line into the pattern space, and repeat the same checks. This relies on lines with the same IP being adjacent, as they are in the sample input.

          $ sed -e '
              :loop
              $!N
              s/^\(\([^:]*:\).*[^[:space:]]\).*\n\2/\1,/
              tloop
              P;D
          ' input-file.txt

          1.1.1.1:21,22
          2.2.2.2:443
          3.3.3.3:80,443

          $ perl -lne '
              my($ip, $port) = /(\H+):(\H+)/;
              push @seen, $ip if ! exists $h{$ip};
              push @{$h{$ip}}, $port;
              }{
              print $_, ":", join ",", @{$h{$_}} for @seen;
          ' input-file.txt

          With Perl we can do the same by means of a hash that keeps the IPs as its keys and, as values, array refs holding the ports. We also make sure not to pick up any trailing blanks. The array @seen records the IPs in the order they were first seen.
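
          The adjacent-line comparison that the sed script performs can also be sketched in awk, which may be easier to read. This is an illustration rather than part of the answer: the function name merge_adjacent and the variables prev and line are made up for the example. It holds only the pending output line in memory and, like the sed version, merges only lines whose IPs are next to each other.

```shell
# Sketch: merge adjacent lines that share an IP, in the spirit of the sed loop.
# merge_adjacent is a hypothetical helper name used for illustration.
merge_adjacent() {
    awk -F ':' '
        # Drop trailing blanks so they do not end up inside the port list.
        { sub(/[ \t]+$/, "") }
        # Same IP as the previous line: extend the pending output line.
        NR > 1 && $1 == prev { line = line "," $2; next }
        # IP changed: flush the pending line before starting a new one.
        NR > 1 { print line }
        { prev = $1; line = $0 }
        # Flush the last pending line, if there was any input at all.
        END { if (NR > 0) print line }
    ' "$@"
}

# If equal IPs are not already adjacent, sort on the first field beforehand.
# -s asks for a stable sort (GNU and BSD sort support it; POSIX does not require it),
# so ports keep their original relative order within each IP.
printf '%s\n' 3.3.3.3:80 1.1.1.1:21 3.3.3.3:443 1.1.1.1:22 2.2.2.2:443 |
    sort -t ':' -k 1,1 -s | merge_adjacent
```

          The trade-off against the hash-based approaches is memory versus ordering: nothing is stored beyond the current group, but ungrouped input has to be sorted first, which changes the order of the IPs in the output.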







          answered Feb 2 at 6:01 by Rakesh Sharma; edited Feb 2 at 7:33


























