Removing duplicate entries and replacing it with comma | Bash
I have a file containing IP addresses and port numbers in this format:
ipaddress : port
1.1.1.1:21
1.1.1.1:22
2.2.2.2:443
3.3.3.3:80
3.3.3.3:443
I need the result in the following format:
ipaddress : port, port
1.1.1.1:21,22
2.2.2.2:443
3.3.3.3:80,443
text-processing awk sed
edited Feb 1 at 15:44 by Jeff Schaller
asked Feb 1 at 13:33 by user334662
5 Answers
Assuming there are no trailing spaces on the lines in the input file:
$ awk -F ':' 'BEGIN { OFS=FS } $1 in ports { ports[$1] = ports[$1] "," $2; next } { ports[$1] = $2 } END { for (ip in ports) print ip, ports[ip] }' file
3.3.3.3:80,443
1.1.1.1:21,22
2.2.2.2:443
The awk script,
BEGIN { OFS=FS }
$1 in ports { ports[$1] = ports[$1] "," $2; next }
{ ports[$1] = $2 }
END { for (ip in ports) print ip, ports[ip] }
would first set the output field separator to be the same as the input field separator, which is a : character (this is given on the command line with -F ':'), then it would test whether the current first field (the IP address) is a key in the ports array. If it is, the port number (the second field) is appended to that array entry with a comma as a delimiter. If it's not, the entry in the array is simply set to the port number for that IP address.
At the end, all stored IP addresses are printed with their collected port numbers. The order of the output lines is unspecified, because for (ip in ports) does not guarantee any particular traversal order; this is why 3.3.3.3 comes first in the output above.
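If the output should follow the order in which the IP addresses first appear, one option is to record that order in a second array. The following is only a minimal sketch of that idea, not part of the original answer: the function name merge_ports and the array name order are made up for illustration, and the sample lines are the ones from the question.

```shell
# Hypothetical helper: merge ports per IP, keeping first-seen order.
# Reads the files given as arguments, or standard input if there are none.
merge_ports() {
    awk -F ':' '
        BEGIN { OFS = FS }
        # First time this IP is seen: remember its position, store the port.
        !($1 in ports) { order[++n] = $1; ports[$1] = $2; next }
        # IP already seen: append the port with a comma.
        { ports[$1] = ports[$1] "," $2 }
        # Walk the IPs by position instead of using for (ip in ports).
        END { for (i = 1; i <= n; i++) print order[i], ports[order[i]] }
    ' "$@"
}

# Usage with the sample data from the question.
printf '%s\n' 1.1.1.1:21 1.1.1.1:22 2.2.2.2:443 3.3.3.3:80 3.3.3.3:443 | merge_ports
```

Like the original command, this assumes there are no trailing spaces on the input lines. It also works when lines for the same IP are not adjacent.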
Thank you so much, it worked :) – user334662, Feb 1 at 13:53
With GNU Datamash
datamash -t: -s groupby 1 collapse 2 < file
If your data are already sorted, you can omit the -s.
Or using an anonymous array inside a hash in Perl:
$ perl -F: -lne '
push @{ $h{$F[0]} }, $F[1]
}{
for $k (sort keys %h) { print "$k:", join ",", @{ $h{$k} } }
' file
1.1.1.1:21,22
2.2.2.2:443
3.3.3.3:80,443
You can do this using the sed editor. We keep two lines in the pattern space at any time and look for a change in the IP address. As long as the next line has the same IP, we strip the IP from the second line and join its port to the first line with a comma. Otherwise an IP change has been detected: we print the first line only, remove it from the pattern space, read the next line into the pattern space, and repeat the same checks.
$ sed -e '
:loop
$!N
s/^\(\([^:]*:\).*[^[:space:]]\).*\n\2/\1,/
tloop
P;D
' input-file.txt
1.1.1.1:21,22
2.2.2.2:443
3.3.3.3:80,443
$ perl -lne '
my($ip, $port) = /(\H+):(\H+)/;
push @seen, $ip if ! exists $h{$ip};
push @{$h{$ip}}, $port;
}{
print $_, ":", join ",", @{$h{$_}} for @seen;
' input-file.txt
With Perl we can do the same by means of a hash that keeps the IPs as its keys and array refs holding the ports as its values. We also make sure not to pick up any trailing blanks. The array @seen maintains the IPs in the order they were seen.
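The sed command above compares only neighbouring lines, so it relies on all entries for one IP being adjacent, as they are in the sample. If that is not guaranteed, one possible workaround is to sort by the IP field first. This is a sketch under assumptions rather than part of the original answer: the function name merge_adjacent is made up, and it assumes the local sort supports -s (stable sort, available in GNU and BSD sort) so that ports keep their input order within each IP.

```shell
# Hypothetical helper: group lines by IP, then merge neighbours with sed.
# Reads the files given as arguments, or standard input if there are none.
merge_adjacent() {
    # -t: -k1,1 sorts on the IP field only (as text, not numerically);
    # -s keeps the original order of lines that share the same IP.
    sort -t: -k1,1 -s "$@" |
    sed -e '
        :loop
        $!N
        s/^\(\([^:]*:\).*[^[:space:]]\).*\n\2/\1,/
        t loop
        P;D
    '
}

# Usage with the question data in shuffled order.
printf '%s\n' 3.3.3.3:80 1.1.1.1:21 3.3.3.3:443 2.2.2.2:443 1.1.1.1:22 | merge_adjacent
```

The trade-off is that the output then follows sort order rather than the order of first appearance, unlike the Perl version with @seen.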
edited Feb 2 at 7:33
answered Feb 2 at 6:01 by Rakesh Sharma