Parsing data with the sed command

My output looks like this:

<li><a href="/intl/id/download/">Bahasa Indonesia</a></li>
<li><a href="/intl/ms/download/">Bahasa Melayu</a></li>
<li><a href="/intl/da/download/">Dansk</a></li>
<li><a href="/intl/de/download/">Deutsch</a></li>
<li><a href="/intl/en/download/">English (US)</a></li>
<li><a href="/intl/es/download/">Español</a></li>
<li><a href="/intl/es-latam/download/">Español (América Latina)</a></li>
<li><a href="/intl/fr/download/">Français</a></li>
<li><a href="/intl/it/download/">Italiano</a></li>
<li><a href="/intl/nl/download/">Nederlands</a></li>
<li><a href="/intl/pl/download/">Polski</a></li>
<li><a href="/intl/pt-br/download/">Português (Brasil)</a></li>
<li><a href="/intl/pt/download/">Português (Portugal)</a></li>
<li><a href="/intl/fi/download/">Suomi</a></li>
<li><a href="/intl/sv/download/">Svenska</a></li>
<li><a href="/intl/vi/download/">Tiếng Việt</a></li>
<li><a href="/intl/tr/download/">Türkçe</a></li>
<li><a href="/intl/ru/download/">Русский</a></li>
<li><a href="/intl/ar/download/">العربية</a></li>
<li><a href="/intl/th/download/">ภาษาไทย</a></li>
<li><a href="/intl/ko/download/">한국어</a></li>
<li><a href="/intl/zh-cn/download/">中文(简体)</a></li>
<li><a href="/intl/zh-tw/download/">中文(繁體)</a></li>
<li><a href="/intl/jp/download/">日本語</a></li>
If your download didn’t start, <a href="https://cdn1.evernote.com/mac-smd/public/Evernote_RELEASE_7.1_456448.dmg">click here</a>.<br>
100 24789 100 24789 0 0 14560 0 0:00:01 0:00:01 --:--:-- 14564
<li><a href="/get-started">Getting started</a></li>
<li><a href="/basic">Basic</a></li>
<li><a href="/premium">Premium</a></li>
<li><a href="/business">Features</a></li>
<li><a href="/business/spaces">Spaces<span class="new">New!</span></a></li>
<li><a href="/business/use-cases">Use cases</a></li>
<li><a href="/business/customer-stories">Customer stories</a></li>
<li><a href="/business/contact">Contact sales</a></li>
<li><a href="http://blog.evernote.com/">Blog</a></li>
<li><a href="/community">Community</a></li>


I want to extract just the URL https://cdn1.evernote.com/mac-smd/public/Evernote_RELEASE_7.1_456448.dmg from this output using sed.







asked Apr 29 at 17:38 by Mehran

2 Answers

Accepted answer (0 votes):

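Using only sed, as you asked:

sed -n '/cdn1/p' "$YOUR_FILE" | sed 's/^.*\(https.*dmg\).*/\1/g'

Or shorter, with a single sed call (each of these prints the same URL for the input above):

sed -n 's/^.*\(https.*dmg\).*/\1/p' "$YOUR_FILE"
sed -n 's/^.*\(https.*\.[a-z]\{2,3\}\).*/\1/p' "$YOUR_FILE"
sed -n 's/^.*\(https\?.*cdn1.*\.[a-z]\{2,3\}\).*/\1/p' "$YOUR_FILE"

Note that \? in the last command is a GNU sed extension.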
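
As a rough sketch of the same technique on a self-contained input, the snippet below anchors the match on the href attribute so the capture stops at the closing quote. The `sample` and `url` variable names and the shortened three-line input are assumptions made for illustration; they are not part of the original answer.

```shell
#!/bin/sh
# Hypothetical sample input: three lines modelled on the output in the question.
sample='<li><a href="/intl/jp/download/">Japanese</a></li>
If your download did not start, <a href="https://cdn1.evernote.com/mac-smd/public/Evernote_RELEASE_7.1_456448.dmg">click here</a>.<br>
<li><a href="http://blog.evernote.com/">Blog</a></li>'

# -n suppresses automatic printing; the p flag prints only the lines on which
# the substitution succeeded. [^"]* cannot cross the closing quote of the
# attribute, so the capture group holds exactly the attribute value.
url=$(printf '%s\n' "$sample" |
  sed -n 's/.*href="\(https:[^"]*\.dmg\)".*/\1/p')

printf '%s\n' "$url"
```

With this sample only the middle line matches, because the other two href values do not start with https: or end in .dmg. The pattern still assumes one matching link per line, which holds for the output shown in the question but may not hold for other pages.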






Second answer (4 votes):

sed and similar tools are NOT the right tools for processing XML/HTML data.
Use a proper XML/HTML parser, such as xmllint or xmlstarlet.

With xmllint you would do:

xmllint --html --xpath 'string(//a[text()="click here"]/@href)' input.html

The output:

https://cdn1.evernote.com/mac-smd/public/Evernote_RELEASE_7.1_456448.dmg

• string(//a[text()="click here"]/@href) - the crucial XPath expression: it selects the a tag whose text is "click here" and returns the string value of its href attribute
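
If the page arrives on standard input rather than in a file, a variation along these lines may be useful. It is only a sketch: the `html` and `url` variables and the shortened sample markup are assumptions, and the XPath predicate here selects by the href value (using contains) instead of the link text, which is a different choice from the expression above.

```shell
#!/bin/sh
# Hypothetical sample input modelled on the page fragment in the question.
html='<ul><li><a href="/intl/en/download/">English (US)</a></li></ul>
<p>If your download did not start, <a href="https://cdn1.evernote.com/mac-smd/public/Evernote_RELEASE_7.1_456448.dmg">click here</a>.</p>'

# "-" makes xmllint read the document from standard input.
# The predicate matches on the href value instead of the link text.
# 2>/dev/null hides the HTML parser's warnings about the incomplete document.
url=$(printf '%s\n' "$html" |
  xmllint --html --xpath 'string(//a[contains(@href, ".dmg")]/@href)' - 2>/dev/null)

printf '%s\n' "$url"
```

Matching on the href keeps working if the link text is translated or reworded, at the cost of assuming that only one link on the page points at a .dmg file; string() returns the value of the first matching node only.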






sed answer (accepted): answered Apr 29 at 17:55 by chevallier; edited May 5 at 18:46 by Kusalananda

xmllint answer: answered Apr 29 at 17:54 by RomanPerekhrest; edited Apr 29 at 17:59