Sed or awk - Insert a new line after Matching pattern
I have a file that contains multiple URLs. Unfortunately, all the URLs are on one line.
cat url_file
http://transfer.sh/PIGfk/my-file.002554http://transfer.sh/Ep9Md/my-file.002555http://transfer.sh/Ep9Md/my-file.002556http://transfer.sh/Ep9Md/my-file.002557
Expected output:
http://transfer.sh/PIGfk/my-file.002554
http://transfer.sh/Ep9Md/my-file.002555
http://transfer.sh/Ep9Md/my-file.002556
http://transfer.sh/Ep9Md/my-file.002557
text-processing awk sed
asked Feb 27 at 21:31 by Bhuvanesh
edited Feb 28 at 0:36 by Isaac
Are you sure there's no invisible NUL character in there? Check with
sed -n l < url_file
– Stéphane Chazelas
Feb 27 at 22:50
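As a small illustration of the check suggested in that comment, the sketch below feeds a made-up line containing a stray carriage return through sed's l command. The helper name show_hidden and the example.invalid URL are assumptions for illustration only. POSIX specifies that l writes characters such as a carriage return as escapes, other non-printing bytes as three-digit octal escapes, and marks the end of each line with $.

```shell
# Hypothetical helper: print each input line in sed's unambiguous "l" form.
# Reads standard input, or the files given as arguments.
show_hidden() {
  sed -n l "$@"
}

# Made-up sample: a URL followed by a stray carriage return.
printf 'http://example.invalid/a\r\n' | show_hidden
# prints: http://example.invalid/a\r$
```

Long lines are folded by l, so on the real url_file the output is expected to be split across several lines, each continued with a trailing backslash.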
5 Answers
Using perl:
perl -pe 's#(?<=.)(?=http://)#\n#g' url_file
Explanation
This uses a positive lookahead to find each position where http:// begins and places a newline (\n) there.
It also uses a positive lookbehind so that it only matches when there is a character before the http://. This way, no newline is inserted before the first URL on a line, which is especially handy if the file ends up containing multiple lines.
Update
Prior to @steeldriver's awesome comment, a lookbehind wasn't used and I'd relied on sed '1d' to delete the first line.
answered Feb 27 at 22:18 by Crypteya (edited Feb 28 at 6:09)
GNU grep
grep -oP 'http://.+?(?=http://|$)' url_file
answered Feb 28 at 0:59 by glenn jackman
You can use this GNU sed command:
sed 's,http://,\n&,g' url_file | tail -n +2
It looks for the pattern http:// and inserts a newline before it.
The tail -n +2 skips the first (empty) line produced by this sed command.
answered Feb 27 at 22:00 by oliv (edited Feb 28 at 7:35 by Kusalananda♦)
or
sed 's,\(.\)http://,\1\nhttp://,g'
– Jeff Schaller♦
Feb 27 at 22:19
Thanks @JeffSchaller. I've never been able to do positive lookaheads with sed. That's super handy. You could also do
sed 's,\(.\)\(http://\),\1\n\2,g' url_file
This way you don't have to rewrite the http:// and it's more universal.
– Crypteya
Feb 27 at 22:32
@JeffSchaller, why does your sed match the first URL? There are no characters before the first URL to be matched by \(.\)
– Crypteya
Feb 27 at 22:53
It doesn't match the first URL, as there's no character before the first one. That's why I mentioned it.
– Jeff Schaller♦
Feb 27 at 22:54
That \n is a GNU extension. With GNU sed, you can also do sed 's|http://|\n&|2g'
– Stéphane Chazelas
Feb 27 at 22:55
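Since \n in the replacement is a GNU extension in sed, and the question also mentions awk, the same idea can be sketched in awk, where "\n" in a replacement string is standard. This is only an illustrative sketch; the function name split_urls_awk is an assumption and does not come from any of the answers.

```shell
# Hypothetical helper: put every URL on its own line using POSIX awk.
# Reads standard input, or the files given as arguments.
split_urls_awk() {
  awk '{
    gsub(/http:\/\//, "\n&")   # insert a newline before every match
    sub(/^\n/, "")             # drop the newline added before the first URL
    print
  }' "$@"
}

# Sample line taken from the question.
printf '%s\n' 'http://transfer.sh/PIGfk/my-file.002554http://transfer.sh/Ep9Md/my-file.002555http://transfer.sh/Ep9Md/my-file.002556http://transfer.sh/Ep9Md/my-file.002557' |
  split_urls_awk
```

Removing the leading newline inside awk plays the same role as the tail -n +2 step, and it also works when the input has several lines.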
Here's a way to do it all within POSIX sed:
$ sed -e '
s|http://|\
&|2;P;D
' input.file
This places a newline before the 2nd http:// substring found in the current line. Then we perform the actions "print up to the 1st newline, chop up to the 1st newline, rinse & repeat" until the pattern space runs out. When only one http:// is left, the substitution does nothing, and that's the last print-and-delete action for the current record.
You can use Perl arrays to do the job:
perl -F'http://' -lane 'print "http://$_" for @F[1..$#F]' input.file
The first field $F[0] is empty, so it is skipped over while printing.
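The print-and-chop loop of the POSIX sed command above can be traced on a short, made-up input. The function name split_urls_sed and the sample URLs are assumptions for illustration; the sed script is the one from the answer, with the backslash-newline that POSIX requires for a newline in the replacement.

```shell
# Hypothetical wrapper around the POSIX sed script from the answer.
# The second script line must start in column 0, otherwise the
# indentation would become part of the replacement text.
split_urls_sed() {
  sed -e 's|http://|\
&|2;P;D' "$@"
}

# Trace for the made-up input below:
#   cycle 1: newline goes before the 2nd http://, P prints .../a, D chops it off
#   cycle 2: same again on what is left, P prints .../b
#   cycle 3: only one http:// remains, s does nothing, P prints .../c,
#            D empties the pattern space and sed moves on to the next line
printf '%s\n' 'http://example.invalid/ahttp://example.invalid/bhttp://example.invalid/c' |
  split_urls_sed
```

Because D restarts the script without reading new input while text remains in the pattern space, no explicit loop label is needed.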
answered Feb 28 at 14:04 by Rakesh Sharma
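I have done this using the 3 methods below.
python
#!/usr/bin/env python3
import re
# Insert a newline before every "http" on each line of the file.
with open('filename') as k:
    for i in k:
        print(re.sub("http", "\nhttp", i), end="")
perl
perl -pne "s/http/\nhttp/g" filename
sed command
sed "s/http/\n&/g" filename
output (each command also prints one empty line first, because a newline is inserted before the first URL as well)
http://transfer.sh/PIGfk/my-file.002554
http://transfer.sh/Ep9Md/my-file.002555
http://transfer.sh/Ep9Md/my-file.002556
http://transfer.sh/Ep9Md/my-file.002557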
answered Feb 28 at 18:19 by Praveen Kumar BS