What does this sed command do? How to change it?

I do not understand what this command does:
grep '<span id="geodata" class="geo">[-0-9.]*; [-0-9.]*</span>' -R articles/ --only-matching | sed 's@articles//@@' | sed 's@:<span id=.geodata. class=.geo.>@ @' | sed 's@; @ @' | sed 's@</span>@@' | sort -u -b -k1 > geocodes_from_html.txt
Some background: I'm processing wiki articles, and I have a folder ("articles") full of them. The processing script was written years ago, when the geo information about a place used to look like this:
<span id="geodata" class="geo">[-0-9.]*; [-0-9.]*</span>
Now it looks like this:
<abbr class="latitude">[-0-9.]*</abbr><abbr class="longitude">[-0-9.]*</abbr>
What changes do I need to make to make the command work?
sed html
edited Nov 23 at 14:17
Rui F Ribeiro
asked Sep 12 '17 at 12:57
David
That code is a lousy way to extract the two coordinates from that html element... Instead of changing it you should use tools designed for this job...
– don_crissti
Sep 12 '17 at 13:27
1 Answer
accepted
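The grep command searches recursively (-R) through every file under the directory articles for text matching <span [...]</span>. Because of --only-matching, it prints only the matching part of each line, prefixed with the file name and a colon. The sed commands that follow then perform a series of substitutions on that output.
For example, sed 's@articles/@@' (with a single slash / only) can be read as sed 's@search_for_this@replace_with_this@': the string articles/ is replaced by nothing, i.e. deleted. The @ is just the delimiter, used instead of the usual / so that slashes in the pattern need no escaping. Instead of piping from one sed to the next, you can combine all the scripts into one, separated by semicolons, with the same result.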
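As a small illustration of that equivalence, here is a sketch that feeds one line in the old format through both variants. The file name Paris.html and the coordinates are made up for the example, and the variable names are only for illustration; the sed scripts themselves are the ones from the question, with a single slash after articles.

```sh
# One sample line, shaped like the output of grep -R --only-matching (invented data).
line='articles/Paris.html:<span id="geodata" class="geo">48.85; 2.35</span>'

# Variant 1: four separate sed processes chained with pipes.
piped=$(printf '%s\n' "$line" |
  sed 's@articles/@@' |
  sed 's@:<span id=.geodata. class=.geo.>@ @' |
  sed 's@; @ @' |
  sed 's@</span>@@')

# Variant 2: the same four substitutions in a single sed script, separated by semicolons.
combined=$(printf '%s\n' "$line" |
  sed 's@articles/@@;s@:<span id=.geodata. class=.geo.>@ @;s@; @ @;s@</span>@@')

printf '%s\n' "$piped" "$combined"
# Paris.html 48.85 2.35
# Paris.html 48.85 2.35
```

Both variants print the same line: file name, latitude and longitude separated by spaces. The dots in id=.geodata. are regex wildcards standing in for the double quotes.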
If you do not want to use any other commands to extract the coordinates, you could use:
grep '<abbr class="latitude">[-0-9.]*</abbr><abbr class="longitude">[-0-9.]*</abbr>' -R articles --only-matching | sed 's@articles/@@;s@:<abbr class="latitude">@ @;s@<abbr class="longitude">@ @;s@</abbr>@@g' | sort -u -b -k1 >geocodes_from_html.txt
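To try this out without touching the real data, something like the following sketch should work. It assumes a grep that supports -R and --only-matching (GNU grep, for instance); the scratch directory, the two file names and the coordinates are invented for the test. The final sort -u -b -k1 sorts the lines by file name and drops duplicates.

```sh
# Hypothetical sample data in a scratch directory (file names and coordinates are invented).
workdir=$(mktemp -d)
mkdir "$workdir/articles"
printf '%s\n' '<p>Paris <abbr class="latitude">48.85</abbr><abbr class="longitude">2.35</abbr></p>' \
  > "$workdir/articles/Paris.html"
printf '%s\n' '<p>Quito <abbr class="latitude">-0.22</abbr><abbr class="longitude">-78.51</abbr></p>' \
  > "$workdir/articles/Quito.html"

# Run the pipeline inside the scratch directory so that grep's paths start with articles/.
(
  cd "$workdir" || exit 1
  grep '<abbr class="latitude">[-0-9.]*</abbr><abbr class="longitude">[-0-9.]*</abbr>' -R articles --only-matching |
    sed 's@articles/@@;s@:<abbr class="latitude">@ @;s@<abbr class="longitude">@ @;s@</abbr>@@g' |
    sort -u -b -k1 > geocodes_from_html.txt
)

cat "$workdir/geocodes_from_html.txt"
# Paris.html 48.85 2.35
# Quito.html -0.22 -78.51
```

Each output line has the same shape as what the old command produced: file name, latitude, longitude.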
edited Sep 12 '17 at 14:45
Philippos
answered Sep 12 '17 at 14:36
jnL