Make a list of URLs if webpages contain specific text [closed]












up vote
0
down vote













I am trying to make a list of URLs of webpages, based on whether the webpages contain the text "XYZ".



The URLs are of the form:



https://www.website.tld/page.php?var1=???&var2=static



??? is a number starting with 1, incremented by 1 each time, until an error page is encountered (say, a page containing the text "ERROR").



I want to dump the URLs of the positive matches into an output file. I read that curl can scan such URLs sequentially and that its output can be passed to grep. However, I am unsure how to retrieve and save the URL once grep has found a match.










closed as unclear what you're asking by G-Man, RalfFriedl, Anthony Geoghegan, Jeff Schaller, JigglyNaga Dec 10 at 10:12


Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.










  • 3




    I feel like you're asking "If I'm at a known location and drive following given directions, how do I know where I am?" If you ran curl, how do you not know the URL?
    – G-Man
    Dec 8 at 3:54










  • I do know the URL, but how do I output it after grepping on the curl output (webpage content)?
    – halfbytecode
    Dec 8 at 4:00






  • 2




    I still don't understand the question.  Given a string, do you know how to output it?  (Hint: echo or printf.)
    – G-Man
    Dec 8 at 4:27










  • Fetch webpages using curl, determine if the webpages (not URLs) contain the required text, output the URLs of positive matches.
    – halfbytecode
    Dec 8 at 4:35










  • Rather than adding information in the comments, you should edit the question to improve its quality (see How to Ask). All relevant details should be in the question itself – not in the comments.
    – Anthony Geoghegan
    Dec 8 at 19:49














grep curl






asked Dec 8 at 3:23 by halfbytecode, edited Dec 12 at 16:27














2 Answers

















up vote
2
down vote













It might be easier to generate the URLs in a shell loop instead of having curl generate them:



for ((i=1; i<1000; i++)); do
    url="https://www.website.tld/page.php?var1=$i&var2=static"
    # grep -q prints nothing; its exit status says whether XYZ was found
    if curl -s "$url" | grep -q XYZ; then
        echo "$url" >> positive-matches.txt
    fi
done
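The loop above works because `grep -q` produces no output and reports the result only through its exit status, so the URL is still available in the shell variable when the match succeeds. The following is a minimal offline sketch of that idea; `report_if_match` is a hypothetical helper name, and literal text stands in for the curl output.

```shell
#!/bin/bash

# report_if_match URL PATTERN
# Reads page text on standard input and prints URL only when PATTERN occurs in it.
report_if_match() {
    # -q: no output, exit status only; -F: treat PATTERN as a fixed string
    if grep -qF -- "$2"; then
        printf '%s\n' "$1"
    fi
}

# Offline demonstration: printf stands in for `curl -s "$url"`
printf 'some text with XYZ inside\n' |
    report_if_match "https://www.website.tld/page.php?var1=1&var2=static" "XYZ"
printf 'nothing relevant here\n' |
    report_if_match "https://www.website.tld/page.php?var1=2&var2=static" "XYZ"
```

Only the first URL is printed, since only the first piece of text contains XYZ. With a real page, the call would look like `curl -s "$url" | report_if_match "$url" XYZ >> positive-matches.txt`.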





  • Thank you. I started working on a script, and your answer helped me.
    – halfbytecode
    Dec 8 at 11:27






  • 2




    @halfbytecode I've upvoted this answer but I'd suggest that you also accept it (click on the check mark) to show your appreciation as it answers your question and you don't yet have the ability to upvote answers.
    – Anthony Geoghegan
    Dec 8 at 19:43


















up vote
-1
down vote



accepted










I have made a working script. Sharing it in case someone finds it helpful. @nohillside's answer helped me.



#!/bin/bash

count=1

while true
do
    url="https://www.website.tld/page.php?var1=$count&var2=static"

    # Fetch the page once and reuse the text for both checks
    text=$(curl -s "$url")

    if echo "$text" | grep -q "ERROR"
    then
        # Error page reached: stop scanning
        break
    elif echo "$text" | grep -q "XYZ"
    then
        echo "$url" >> matches.txt
    fi

    ((count++))

done
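The script above stops only when a page contains ERROR. If curl itself fails (for example, the network is down), the fetched text is empty, neither grep matches, and the loop keeps running. Below is a sketch of a more defensive variant under some assumptions: the function names `fetch_page` and `scan_pages`, the 30-second timeout, and the default cap of 1000 pages are all illustrative choices, not part of the original script.

```shell
#!/bin/bash

# fetch_page URL
# Hypothetical helper: isolates the network call so it can be swapped out.
fetch_page() {
    # --max-time keeps a stalled request from hanging the scan
    curl -s --max-time 30 "$1"
}

# scan_pages OUTFILE [MAX]
# Appends matching URLs to OUTFILE. Stops at the first ERROR page,
# on a failed request, or after MAX pages (assumed default: 1000).
scan_pages() {
    local outfile="$1" max="${2:-1000}" count=1 url text
    while [ "$count" -le "$max" ]; do
        url="https://www.website.tld/page.php?var1=$count&var2=static"
        # A non-zero exit from fetch_page (e.g. network failure) ends the scan
        text=$(fetch_page "$url") || return 1
        if printf '%s\n' "$text" | grep -qF "ERROR"; then
            return 0
        elif printf '%s\n' "$text" | grep -qF "XYZ"; then
            printf '%s\n' "$url" >> "$outfile"
        fi
        count=$((count + 1))
    done
}
```

It could be called as `scan_pages matches.txt 500`. Keeping the fetch in its own function also makes it possible to try the loop against canned text before pointing it at a real site.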





    The for-loop answer was posted Dec 8 at 9:19 by nohillside.

        The accepted script was posted Dec 8 at 11:40 by halfbytecode.
