parsing data with sed command

My output looks like this:
<li><a href="/intl/id/download/">Bahasa Indonesia</a></li>
<li><a href="/intl/ms/download/">Bahasa Melayu</a></li>
<li><a href="/intl/da/download/">Dansk</a></li>
<li><a href="/intl/de/download/">Deutsch</a></li>
<li><a href="/intl/en/download/">English (US)</a></li>
<li><a href="/intl/es/download/">Español</a></li>
<li><a href="/intl/es-latam/download/">Español (América Latina)</a></li>
<li><a href="/intl/fr/download/">Français</a></li>
<li><a href="/intl/it/download/">Italiano</a></li>
<li><a href="/intl/nl/download/">Nederlands</a></li>
<li><a href="/intl/pl/download/">Polski</a></li>
<li><a href="/intl/pt-br/download/">Português (Brasil)</a></li>
<li><a href="/intl/pt/download/">Português (Portugal)</a></li>
<li><a href="/intl/fi/download/">Suomi</a></li>
<li><a href="/intl/sv/download/">Svenska</a></li>
<li><a href="/intl/vi/download/">Tiếng Việt</a></li>
<li><a href="/intl/tr/download/">Türkçe</a></li>
<li><a href="/intl/ru/download/">Русский</a></li>
<li><a href="/intl/ar/download/">العربية</a></li>
<li><a href="/intl/th/download/">ภาษาไทย</a></li>
<li><a href="/intl/ko/download/">한국어</a></li>
<li><a href="/intl/zh-cn/download/">中文（简体）</a></li>
<li><a href="/intl/zh-tw/download/">中文（繁體）</a></li>
<li><a href="/intl/jp/download/">日本語</a></li>
If your download didn't start, <a href="https://cdn1.evernote.com/mac-smd/public/Evernote_RELEASE_7.1_456448.dmg">click here</a>.<br>
100 24789 100 24789 0 0 14560 0 0:00:01 0:00:01 --:--:-- 14564
<li><a href="/get-started">Getting started</a></li>
<li><a href="/basic">Basic</a></li>
<li><a href="/premium">Premium</a></li>
<li><a href="/business">Features</a></li>
<li><a href="/business/spaces">Spaces<span class="new">New!</span></a></li>
<li><a href="/business/use-cases">Use cases</a></li>
<li><a href="/business/customer-stories">Customer stories</a></li>
<li><a href="/business/contact">Contact sales</a></li>
<li><a href="http://blog.evernote.com/">Blog</a></li>
<li><a href="/community">Community</a></li>
I want to extract just https://cdn1.evernote.com/mac-smd/public/Evernote_RELEASE_7.1_456448.dmg with a sed command.
linux awk sed scripting
asked Apr 29 at 17:38
Mehran
2 Answers
accepted
Only with sed, as you wished:
sed -n '/cdn1/p' "$YOUR_FILE" | sed 's/^.*\(https.*dmg\).*/\1/g'
Or shorter:
sed -n 's/^.*\(https.*dmg\).*/\1/p' "$YOUR_FILE"
sed -n 's/^.*\(https.*\.[a-z]\{2,3\}\).*/\1/p' "$YOUR_FILE"
sed -n 's/^.*\(https\?.*cdn1.*\.[a-z]\{2,3\}\).*/\1/p' "$YOUR_FILE"
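To see why the capture group and the p flag matter, here is a minimal sketch that runs the substitution on a small sample. The sample variable and its three lines are assumptions made for illustration (shortened from the output in the question), and the second command uses sed -E with a tighter [^"]* pattern as an alternative that is not part of the answer above.

```shell
# Hypothetical sample input: three lines shortened from the question's output.
sample='<li><a href="/intl/jp/download/">Japanese</a></li>
If your download did not start, <a href="https://cdn1.evernote.com/mac-smd/public/Evernote_RELEASE_7.1_456448.dmg">click here</a>.<br>
<li><a href="http://blog.evernote.com/">Blog</a></li>'

# -n turns off automatic printing; the p flag prints only lines where the
# substitution succeeded. \( ... \) captures the URL and \1 replaces the
# whole line with the captured text.
url=$(printf '%s\n' "$sample" | sed -n 's/^.*\(https.*dmg\).*/\1/p')
printf '%s\n' "$url"

# Alternative (an assumption, not from the answer): extended regex syntax,
# stopping at the closing double quote instead of relying on greedy matching.
url_ere=$(printf '%s\n' "$sample" | sed -E -n 's/^.*(https:[^"]*\.dmg).*/\1/p')
printf '%s\n' "$url_ere"
```

Both commands should print only the .dmg URL, because the other two lines contain no https link and are therefore never printed.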
edited May 5 at 18:46
Kusalananda
answered Apr 29 at 17:55
chevallier
sed and the like are NOT the right tools to process XML/HTML data.
Use appropriate XML/HTML parsers, like xmllint or xmlstarlet.
With xmllint you would do:
xmllint --html --xpath 'string(//a[text()="click here"]/@href)' input.html
The output:
https://cdn1.evernote.com/mac-smd/public/Evernote_RELEASE_7.1_456448.dmg
string(//a[text()="click here"]/@href) - the crucial XPath expression: it selects the a tag whose text value is "click here" and returns the string representation of its href attribute
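The xmllint command can be tried end to end with a small script like the one below. This is only a sketch: the temporary file, its two lines of HTML, and the helper name extract_click_here_url are assumptions made for illustration, and it assumes an xmllint build that supports --xpath.

```shell
# Hypothetical sample file with two lines modelled on the question's output.
html_file=$(mktemp)
cat > "$html_file" <<'EOF'
<ul><li><a href="/community">Community</a></li></ul>
<p>If your download did not start, <a href="https://cdn1.evernote.com/mac-smd/public/Evernote_RELEASE_7.1_456448.dmg">click here</a>.<br></p>
EOF

# Print the href of the <a> element whose text is exactly "click here".
# Parser warnings about the HTML fragment go to stderr and are discarded.
extract_click_here_url() {
    xmllint --html --xpath 'string(//a[text()="click here"]/@href)' "$1" 2>/dev/null
}

if command -v xmllint >/dev/null 2>&1; then
    url=$(extract_click_here_url "$html_file")
    printf '%s\n' "$url"
else
    echo "xmllint not found; install the libxml2 utilities to run this sketch" >&2
fi
rm -f "$html_file"
```

Unlike the regex approach, the XPath match does not depend on how the link is laid out across lines, only on the link text staying "click here".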
edited Apr 29 at 17:59
answered Apr 29 at 17:54
RomanPerekhrest