wget - Mirroring a full website with requisites on different hosts

I am trying to make a full static copy of a WordPress website with wget so that it can be browsed without any network connection (all links and images must be converted).

The requisites for the pages (images, CSS, JS, ...) are spread over 3 different WordPress hosts and are always under the same wp-content/uploads directories.

I tried to limit the recursion on the other domains to the wp-content/uploads directories with --domains and --include-directories, but I can't get wget to fetch only these directories on $URL1 and $URL2.

Here is the command line (which doesn't limit the download to $URL0 and [$URL1|$URL2]/wp-content/uploads):

    wget --convert-links --recursive -l inf -N -e robots=off -R -nc \
    --default-page=index.html -E -D$URL1,$URL2,$URL0 --page-requisites \
    -B$URL0 -X$URL1,$URL2 --cut-dirs=1 -I*/wp-content/uploads/*, -H -F $URL0

Is there any way to limit wget's recursion on the other domains to only some directories?







regular-expression wget hosts domain






edited Sep 8 at 0:52 by Jeff Schaller
asked Oct 19 '11 at 22:34 by user11689







  • Do I understand correctly that you want only directories below wp-content/uploads? If so, is the -np (no parent) flag what you're looking for?
    – Kevin
    Dec 13 '11 at 19:18












2 Answers
    wget --mirror --convert-links yourdomain.com
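
For context on what this one-liner does: according to the wget manual, --mirror is currently equivalent to -r -N -l inf --no-remove-listing, and --convert-links rewrites the links in the downloaded pages so that they work locally. Since -H is not part of it, recursion stays on the starting host, so requisites served from other hosts are not covered. A minimal sketch that spells the shorthand out (example.com is a placeholder, and the command is only printed, not run):

```shell
#!/bin/sh
# Long-form equivalent of --mirror, per the wget manual:
#   --mirror  ==  -r -N -l inf --no-remove-listing
MIRROR_FLAGS="--recursive --timestamping --level=inf --no-remove-listing"

# Print (rather than run) the expanded command for a given start URL.
# The URL passed in below is a placeholder chosen for this sketch.
expanded_mirror_command() {
    printf 'wget %s --convert-links %s\n' "$MIRROR_FLAGS" "$1"
}

expanded_mirror_command "https://example.com/"
```

Copying the printed line into a terminal runs the same mirror as the short form above.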






edited Nov 7 '11 at 12:29 by Michael Mrozek♦
answered Nov 7 '11 at 9:38 by Peter











  • This seems like it does the opposite of what he asked; the man page says --mirror "sets infinite recursion depth"
    – Michael Mrozek♦
    Nov 7 '11 at 12:29

  • Also, could you tell us a bit about what the command actually does. Simply stating a command is not enough.
    – n0pe
    Nov 8 '11 at 3:05
















I think you might be looking for the include_directories setting (-I on the command line).

From the manual:

    ‘include_directories = list’
    ‘-I’ option accepts a comma-separated list of directories included in the retrieval. Any other directories will simply be ignored. The directories are absolute paths.
    So, if you wish to download from ‘http://host/people/bozo/’ following only links to bozo's colleagues in the /people directory and the bogus scripts in /cgi-bin, you can specify:

        wget -I /people,/cgi-bin http://host/people/bozo/
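
One caveat: as far as I can tell, -I and -X match directory paths only and apply to every host, so -I /wp-content/uploads on its own would restrict the main site as well. A possible way to express a per-host rule, assuming wget 1.14 or later (which, to my knowledge, added --accept-regex, matched against the complete URL), is sketched below. The three host names are hypothetical stand-ins for $URL0, $URL1 and $URL2, the https start URL is an assumption, and the url_allowed helper exists only to check the pattern offline with grep -E before handing it to wget.

```shell
#!/bin/sh
# Hypothetical host names standing in for $URL0, $URL1 and $URL2.
MAIN_HOST="www.example.com"
ASSET_HOST1="media1.example.net"
ASSET_HOST2="media2.example.net"

# Escape dots so they match literally inside the regular expression.
esc() { printf '%s' "$1" | sed 's/\./\\./g'; }

# Accept anything on the main host, but only wp-content/uploads/ on the others.
ACCEPT_REGEX="^https?://($(esc "$MAIN_HOST")/|($(esc "$ASSET_HOST1")|$(esc "$ASSET_HOST2"))/wp-content/uploads/)"

# Offline check: succeeds if the URL would pass the accept pattern.
url_allowed() { printf '%s\n' "$1" | grep -Eq "$ACCEPT_REGEX"; }

# Defined but not called here; invoking it starts the actual download.
mirror_site() {
    wget --recursive --level=inf --timestamping \
         --page-requisites --convert-links --adjust-extension \
         --span-hosts --domains="$MAIN_HOST,$ASSET_HOST1,$ASSET_HOST2" \
         --accept-regex="$ACCEPT_REGEX" \
         -e robots=off \
         "https://$MAIN_HOST/"
}
```

For example, url_allowed "http://media1.example.net/wp-content/uploads/2011/10/a.png" succeeds, while a post URL on the same host is rejected. The pattern uses only POSIX extended syntax, so it should not need --regex-type=pcre.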






answered Jul 13 '12 at 12:28 by James



























             
