Recursively following links and running grep on each page
I'm trying to grep through some logs at the URL below to look for a specific username. However, I'm getting no results; it just runs without stopping.
grepfor="username"
urls=("https://tgstation13.org/parsed-logs/terry/data/logs/2019/01")
while [ ${#urls[@]} -ne 0 ]
do
    content="$(curl -s "$url[0]")"
    echo "$content" | grep "$grepfor"
    delete=($urls[0])
    add=(`echo "$content" | grep -Po '(?<=href=")[^"]*'`)
    urls=( "${urls[@]/$delete}" "${add[@]}" )
done
Tags: bash, grep, curl
asked Feb 7 at 19:40 by Shard; edited Feb 7 at 19:57 by Kusalananda
Don't correct code in the question. It invalidates the answers. – Kusalananda, Feb 7 at 19:57

Sure thing, thank you. – Shard, Feb 7 at 20:00

I've figured out my next mistake: the links I'm adding are relative, so I need to prepend urls[0] to the start of each. Also, the URL is missing a / at the end, and I'm not following redirects. – Shard, Feb 7 at 20:00
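For reference, a minimal sketch of the three fixes described in that last comment. This is an illustration, not code from the thread; it assumes bash, GNU grep (for -P), and hrefs that are all relative and contain no whitespace:

base="${urls[0]%/}/"            # normalize to exactly one trailing slash
content="$(curl -sL "$base")"   # -L tells curl to follow redirects
add=()
while IFS= read -r link; do
    add+=( "$base$link" )       # prepend the base URL to each relative href
done < <(printf '%s\n' "$content" | grep -Po '(?<=href=")[^"]*')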
1 Answer
Use "$urls[0]"
for the first element of the urls
array, not $urls[0]
.
To delete the first element of urls
and to add the add
array to the end, use
urls=( "$urls[@]:1" "$add[@]" )
Always quote every expansion, even $#urls[@]
.
I haven't looked too closely at your curls and greps, but use
printf '%sn' "$content"
if you want to be sure that you preserve backslashes in the data.
Related:
- When is double-quoting necessary?
- Why is printf better than echo?
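As a quick illustration of the "${urls[@]:1}" idiom (a made-up three-element example, not part of the answer itself): slicing from index 1 returns every element except the first, so reassigning the slice plus the new elements treats the array as a queue.

urls=( a b c )
urls=( "${urls[@]:1}" d e )    # drop "a" from the front, append "d" and "e"
printf '%s\n' "${urls[@]}"     # prints b, c, d and e on separate lines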
Thanks, I'll implement those changes. However, even if I change $grepfor to a string which is on the first URL, I can't get it to grep out that line. – Shard, Feb 7 at 19:51

@Shard Well, the curl call previous to that line will look for a URL ending in [0], so fix that first. Also, $url is unset there (it should be "${urls[0]}"). – Kusalananda, Feb 7 at 19:52

Ahh, thanks, that was a big mistake. I've implemented the changed code in the question. The code now terminates almost immediately; I assume new URLs found from the first link aren't correctly being added to $urls. – Shard, Feb 7 at 19:56
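Putting the answer's corrections together with the "${urls[0]}" fix from these comments gives something like the sketch below. This is a reconstruction, not code posted in the thread; it assumes GNU grep, and it still enqueues relative hrefs verbatim, which is the follow-up problem noted in the comments under the question:

grepfor="username"
urls=( "https://tgstation13.org/parsed-logs/terry/data/logs/2019/01" )
while [ "${#urls[@]}" -ne 0 ]
do
    content="$(curl -s "${urls[0]}")"           # fetch the URL at the head of the queue
    printf '%s\n' "$content" | grep "$grepfor"  # search the page for the username
    # extract hrefs; the unquoted expansion relies on URLs containing no whitespace
    add=( $(printf '%s\n' "$content" | grep -Po '(?<=href=")[^"]*') )
    urls=( "${urls[@]:1}" "${add[@]}" )         # dequeue the head, enqueue found links
done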