Sed or awk - Insert a new line after Matching pattern

I have a file that contains multiple URLs. Unfortunately, all the URLs are on one line.



cat url_file



http://transfer.sh/PIGfk/my-file.002554http://transfer.sh/Ep9Md/my-file.002555http://transfer.sh/Ep9Md/my-file.002556http://transfer.sh/Ep9Md/my-file.002557


Expected output:



http://transfer.sh/PIGfk/my-file.002554
http://transfer.sh/Ep9Md/my-file.002555
http://transfer.sh/Ep9Md/my-file.002556
http://transfer.sh/Ep9Md/my-file.002557






text-processing awk sed






edited Feb 28 at 0:36 by Isaac
asked Feb 27 at 21:31 by Bhuvanesh

  • Are you sure there's no invisible NUL character in there? Check with sed -n l < url_file

    – Stéphane Chazelas
    Feb 27 at 22:50

5 Answers
Using perl:

perl -pe 's#(?<=.)(?=http://)#\n#g' url_file

Explanation

This uses a positive lookahead to find each position where http:// begins and places a newline (\n) there.

It also uses a positive lookbehind so that it only matches when there is a character before the http://. This way, no newline is inserted before the first URL on a line, which is extra handy if the file ends up containing multiple lines.

Update

Prior to @steeldriver's awesome comment, a lookbehind wasn't used and I relied on sed '1d' to delete the first line.
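
For anyone who wants to try the lookaround idea outside perl, here is a minimal sketch in Python, whose re module accepts the same (?<=.) and (?=http://) syntax. The function name split_urls and the variable sample are made up for illustration; they are not part of the answer above.

```python
import re


def split_urls(line):
    """Insert a newline before every 'http://' that is not at the start of the line."""
    # (?<=.) demands a preceding character; (?=http://) peeks ahead without consuming.
    # The match itself is empty, so nothing is removed from the line.
    return re.sub(r"(?<=.)(?=http://)", "\n", line)


# The one-line input from the question.
sample = (
    "http://transfer.sh/PIGfk/my-file.002554"
    "http://transfer.sh/Ep9Md/my-file.002555"
    "http://transfer.sh/Ep9Md/my-file.002556"
    "http://transfer.sh/Ep9Md/my-file.002557"
)

print(split_urls(sample))
```

Because the match is zero-width, the URLs themselves are left untouched; only the newline is added.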







answered Feb 27 at 22:18 by Crypteya
edited Feb 28 at 6:09

Using GNU grep:

grep -oP 'http://.+?(?=http://|$)' url_file
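
The same lazy match plus lookahead can be tried in Python as a rough stand-in for grep -o. This is only a sketch, and it assumes that no URL contains a second http:// inside it. The name extract_urls and the variable sample are hypothetical.

```python
import re


def extract_urls(line):
    """Return each 'http://...' chunk, stopping before the next 'http://' or at end of line."""
    # .+? is lazy, so it grows one character at a time until the lookahead succeeds.
    return re.findall(r"http://.+?(?=http://|$)", line)


# The one-line input from the question.
sample = (
    "http://transfer.sh/PIGfk/my-file.002554"
    "http://transfer.sh/Ep9Md/my-file.002555"
    "http://transfer.sh/Ep9Md/my-file.002556"
    "http://transfer.sh/Ep9Md/my-file.002557"
)

for url in extract_urls(sample):
    print(url)
```

Unlike the substitution approaches, this extracts the matches rather than editing the line, so anything before the first http:// is dropped.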






answered Feb 28 at 0:59 by glenn jackman

You can use this GNU sed command:

sed 's,http://,\n&,g' url_file | tail -n +2

It looks for the pattern http:// and inserts a newline before each match.

The tail -n +2 skips the first (empty) line that this sed command produces.

edited Feb 28 at 7:35 by Kusalananda
answered Feb 27 at 22:00 by oliv

  • or sed 's,\(.\)http://,\1\nhttp://,g'

    – Jeff Schaller
    Feb 27 at 22:19

  • Thanks @JeffSchaller. I've never been able to do positive lookaheads with sed. That's super handy. You could also do sed 's,\(.\)\(http://\),\1\n\2,g' url_file. This way you don't have to rewrite the http://, and it's more universal.

    – Crypteya
    Feb 27 at 22:32

  • @JeffSchaller, why does your sed match the first URL? There are no characters before the first URL to be matched by \(.\)

    – Crypteya
    Feb 27 at 22:53

  • It doesn't match the first URL, as there's no character before the first one. That's why I mentioned it.

    – Jeff Schaller
    Feb 27 at 22:54

  • That \n is a GNU extension. With GNU sed, you can also do sed 's|http://|\n&|2g'

    – Stéphane Chazelas
    Feb 27 at 22:55

Here's a way to do it all within POSIX sed:

$ sed -e '
s|http://|\
&|2;P;D
' input.file

This places a newline before the 2nd http:// substring found in the current line. Then we perform the action "print up to the 1st newline, chop up to the 1st newline, rinse & repeat" until the pattern space runs out. When only one http:// is left, the substitution does nothing, and that is the last print-and-delete action for the current record.
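
To make the P;D loop easier to follow, here is a rough Python simulation of it. It is only a sketch of the control flow, not a replacement for the sed command. The variable space stands in for sed's pattern space, and the names sed_p_d_loop and sample are made up for illustration.

```python
def sed_p_d_loop(line, marker="http://"):
    """Mimic 's|marker|<newline>&|2;P;D' applied to a single input line."""
    printed = []
    space = line  # plays the role of sed's pattern space
    while True:
        first = space.find(marker)
        # Look for the 2nd non-overlapping occurrence, as the '2' flag does.
        second = space.find(marker, first + len(marker)) if first != -1 else -1
        if second == -1:
            # No 2nd occurrence: the substitution is a no-op, P prints everything,
            # and D (no newline present) ends the cycle for this line.
            printed.append(space)
            return printed
        # A newline goes before the 2nd occurrence; P prints the part before it,
        # D deletes that part and restarts the cycle on the remainder.
        printed.append(space[:second])
        space = space[second:]


# The one-line input from the question.
sample = (
    "http://transfer.sh/PIGfk/my-file.002554"
    "http://transfer.sh/Ep9Md/my-file.002555"
    "http://transfer.sh/Ep9Md/my-file.002556"
    "http://transfer.sh/Ep9Md/my-file.002557"
)

for out in sed_p_d_loop(sample):
    print(out)
```

Each pass through the loop corresponds to one restart of the sed cycle without reading new input.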



You can use Perl arrays to do the job:

perl -F'http://' -lane 'print "http://$_" for @F[1..$#F]' input.file

The first field, $F[0], is empty, so it is skipped when printing.
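
The field-splitting idea carries over to other languages as well. Below is a minimal Python sketch of it, assuming every URL starts with http://. The names split_on_prefix and sample are hypothetical.

```python
def split_on_prefix(line, prefix="http://"):
    """Split the line on the prefix and put the prefix back on every field."""
    fields = line.rstrip("\n").split(prefix)
    # fields[0] is whatever precedes the first prefix (empty for the sample), so skip it.
    return [prefix + field for field in fields[1:]]


# The one-line input from the question.
sample = (
    "http://transfer.sh/PIGfk/my-file.002554"
    "http://transfer.sh/Ep9Md/my-file.002555"
    "http://transfer.sh/Ep9Md/my-file.002556"
    "http://transfer.sh/Ep9Md/my-file.002557"
)

for url in split_on_prefix(sample):
    print(url)
```

As with the perl one-liner, any text before the first http:// is discarded rather than printed.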







answered Feb 28 at 14:04 by Rakesh Sharma

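I have done it using the three methods below.

python

#!/usr/bin/python3
import re

# Put a newline before every "http" on each line of the file.
with open('filename') as k:
    for i in k:
        print(re.sub("http", "\nhttp", i), end="")

perl

perl -pne "s/http/\nhttp/g" filename

sed command (GNU sed)

sed "s/http/\n&/g" filename

output

Each of these also puts a newline before the first URL, so the actual output starts with one empty line, which is not shown here:

http://transfer.sh/PIGfk/my-file.002554
http://transfer.sh/Ep9Md/my-file.002555
http://transfer.sh/Ep9Md/my-file.002556
http://transfer.sh/Ep9Md/my-file.002557
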
answered Feb 28 at 18:19 by Praveen Kumar BS