$1 not working with sed

I have a bunch of files that contain XML tags like:

<h> PIDAT <h> O

I need to delete everything that comes after the first <h> on that line, so that I get this:

<h>

For that I'm using

sed -i -e 's/(^<.*?>).+/$1/' *.conll

But it seems that sed is not recognizing the $1. (As I understand it, $1 should delete everything that is not contained in the group.) Is there a way I can achieve this? I'd really appreciate it if you could point me in the right direction.

PS: I tested these expressions in a regex app and they worked, but they do not work from the command line.

asked Jul 5 at 5:16
Carolina Cárdenas

1 Answer

sed backreferences have the form \1, \2, etc.; $1 is Perl-style syntax. Also, if you are using basic regular expressions (BRE), you need to escape the parentheses (...) forming a group, that is, write \(...\), as well as ? and + (\? and \+ are GNU extensions in BRE). Or you can use extended regular expressions with the -E option.

Note that sed regexes are greedy, so <.*> will match <h> PIDAT <h> in that line, instead of stopping at the first >. And .*? does not make sense in sed (.* can already match nothing, so making it optional via ? is unnecessary).

This might work:

sed -i -Ee 's/^(<[^>]*>).*/\1/' *.conll

[^>] matches any character except >, so <[^>]*> will match <h> but not <h> PIDAT <h>.
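
To see the difference the answer describes, the two patterns can be run side by side on the sample line from the question. This is only a sketch: it assumes a sed that accepts -E, it reads from a pipe instead of using -i so that no file is modified, and the variable names (line, greedy, first_tag) are made up for the illustration.

```shell
# Sample line taken from the question.
line='<h> PIDAT <h> O'

# Greedy: <.*> runs to the LAST '>' on the line, so the group is too large.
greedy=$(printf '%s\n' "$line" | sed -E 's/^(<.*>).*/\1/')

# Negated class: <[^>]*> cannot cross a '>', so it stops at the FIRST one.
first_tag=$(printf '%s\n' "$line" | sed -E 's/^(<[^>]*>).*/\1/')

printf 'greedy:    %s\n' "$greedy"
printf 'first tag: %s\n' "$first_tag"
```

Once the output looks right on a sample, the same expression can be applied to the real files with -i, keeping in mind that -i is not standardized and that GNU and BSD sed treat its argument differently.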






• Another possibility is that the OP might be thinking that .*? is a non-greedy match. Your solution, of course, has that covered.
  – John1024
  Jul 5 at 6:01

• @John1024 ah, yes, I forgot PCRE uses *? for non-greedy match. That makes sense now.
  – muru
  Jul 5 at 6:02

• Impressive. Thank you for your explanation! I got an error with your line of code: "\1 not defined in the RE", but after a quick Google search, I realized it was because it didn't escape the (), but that was also given in your answer ;) Anyway, thanks again!!
  – Carolina Cárdenas
  Jul 5 at 6:03

• @CarolinaCárdenas did you get that error when using the -E option?
  – muru
  Jul 5 at 6:04

• @muru yes. Exactly.
  – Carolina Cárdenas
  Jul 5 at 6:35
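
The error mentioned in the comments is consistent with the pattern being parsed as a basic regular expression, where unescaped parentheses are literal characters and so no group exists for \1 to refer to. As a fallback for a sed where -E is unavailable or does not take effect, the same substitution can be sketched in BRE syntax. The variable names here are illustrative, and the command reads from a pipe so that no file is touched.

```shell
# Sample line taken from the question.
line='<h> PIDAT <h> O'

# BRE: the capture group is written \( ... \); bare ( ) would be literal.
# No ? or + is used, so nothing here depends on GNU-only BRE extensions.
bre_result=$(printf '%s\n' "$line" | sed 's/^\(<[^>]*>\).*/\1/')

printf '%s\n' "$bre_result"
```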










answered Jul 5 at 5:27
muru
