wget - Mirroring a full website with requisites on different hosts
I am trying to make a full static copy of a WordPress website with wget, to be browsed without any network connection (all links and images must be converted). The page requisites (images, CSS, JS, ...) are on 3 different WordPress hosts and always live in the same wp-content/uploads directories.

I tried to limit the recursion on the other domains to the wp-content/uploads directories with --domains and --include-directories, but I can't restrict wget to fetching only those directories on $URL1 and $URL2.

Here is the command line (which doesn't limit retrieval to $URL0 and [$URL1|$URL2]/wp-content/uploads):

wget --convert-links --recursive -l inf -N -e robots=off -R -nc \
    --default-page=index.html -E -D$URL1,$URL2,$URL0 --page-requisites \
    -B$URL0 -X$URL1,$URL2 --cut-dirs=1 -I*/wp-content/uploads/*, -H -F $URL0

Is there any way to limit wget's recursion on the other domains to only some directories?

regular-expression wget hosts domain
Do I understand correctly that you want only directories below wp-content/uploads? If so, is the -np (no parent) flag what you're looking for? – Kevin, Dec 13 '11 at 19:18
asked Oct 19 '11 at 22:34 by user11689; edited Sep 8 at 0:52 by Jeff Schaller
2 Answers
wget --mirror --convert-links yourdomain.com
answered Nov 7 '11 at 9:38 by Peter

This seems like it does the opposite of what he asked; the man page says --mirror "sets infinite recursion depth" – Michael Mrozek, Nov 7 '11 at 12:29

Also, could you tell us a bit about what the command actually does. Simply stating a command is not enough. – n0pe, Nov 8 '11 at 3:05
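For context on Michael Mrozek's comment: per the wget manual, --mirror is currently equivalent to -r -N -l inf --no-remove-listing, i.e. it enables infinite-depth recursion with timestamping but adds no host or directory restrictions, so by itself it cannot confine the other hosts to wp-content/uploads. A sketch (the hostname is hypothetical, and commands are only echoed through a dry-run wrapper):

```shell
run() { printf 'would run: %s\n' "$*"; }   # dry-run guard; swap for "$@" to execute

# These two invocations are equivalent according to the wget manual:
run wget --mirror --convert-links example.com
run wget -r -N -l inf --no-remove-listing --convert-links example.com
```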
Think you might be looking for the include_directories switch? From the manual:

"include_directories = list"

The "-I" option accepts a comma-separated list of directories included in the retrieval. Any other directories will simply be ignored. The directories are absolute paths.

So, if you wish to download from "http://host/people/bozo/" following only links to bozo's colleagues in the /people directory and the bogus scripts in /cgi-bin, you can specify:

wget -I /people,/cgi-bin http://host/people/bozo/

answered Jul 13 '12 at 12:28 by James
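Worth noting about -I: the listed directories are absolute paths, and when -H makes wget span hosts, the same include list applies to every host; there is no per-host form, which is exactly the limitation the question runs into. Below is the manual's example, guarded by a dry-run wrapper since the host is fictitious:

```shell
run() { printf 'would run: %s\n' "$*"; }   # dry-run guard; swap for "$@" to execute

# Manual's example: recurse from bozo's page but retrieve only files
# under /people and /cgi-bin; everything else is ignored.
run wget -r -I /people,/cgi-bin http://host/people/bozo/
```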