How to use wget (with the -mk options) to mirror a site and its external links?

I know that wget -mkp http://example.com mirrors a site and all its embedded files.

But I need to back up a site where all the images are stored on a separate domain. How can I download those images with wget and update the src attributes accordingly?

Thanks!

+3
2 answers

A slightly modified version of @PatrickHorn's answer:

First, cd to the top directory containing the downloaded files.

"First wget to find pages recursively, albeit from only one domain"

wget --recursive --timestamping -l inf --no-remove-listing --page-requisites http://site.com

"Second wget that spans hosts but does not retrieve pages recursively"

find site.com -name '*.htm*' -exec wget --no-clobber --span-hosts --timestamping --page-requisites http://{} \;
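The two passes above can be wrapped in a small script. This is only a minimal sketch, assuming the mirror root is site.com as in the commands above and that wget's default host/path directory layout is in use; `SITE`, `page_urls` and `mirror_with_requisites` are hypothetical names, not part of wget. The second pass here uses only --timestamping, because wget may refuse to combine --no-clobber with --timestamping.

```shell
#!/bin/sh
# Sketch only: SITE and both function names are illustrative assumptions.
SITE="${SITE:-site.com}"

# Turn every downloaded HTML file under $1 into the URL it was fetched from.
# Assumes wget's default layout, where the local path mirrors host/path.
page_urls() {
    find "$1" -type f -name '*.htm*' | sed 's|^|http://|'
}

# Pass 1: recurse within the one domain.
# Pass 2: re-fetch each page with --span-hosts so that page requisites
# hosted elsewhere (images, CSS) are downloaded as well, without recursion.
mirror_with_requisites() {
    wget --recursive --timestamping -l inf --no-remove-listing \
         --page-requisites "http://$SITE"
    page_urls "$SITE" | while IFS= read -r url; do
        wget --span-hosts --timestamping --page-requisites "$url"
    done
}

# mirror_with_requisites   # uncomment to run; needs network access
```

Running `page_urls site.com` on its own first is a cheap way to check which pages the second pass would revisit before any further downloads start.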

, , , - .htm(l) , , . .

+1

wget -r -H , (, ) . , , , wget , :

wget -H -N -kp http://<site>/<document>

, , wget , ; wget, , :

wget -mkp http://example.com
find example.com/ -name '*.html*' -exec wget -nc -HNkp http://{} \;

-nc - wget, , , , . ; , , ( ) . , , . , , - k, URL-, , URL-. , URL- .

, , " http://example.com/", "sed" script, .

, , , , example.com, -D , , . , google.com gstatic.com.
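wget's -D (--domains) option takes a comma-separated list of domains and limits which hosts are followed when -H is in effect. The sketch below only prints the per-page command so it can be inspected before anything is downloaded; `span_command` and the host static.example.net are hypothetical, and the flags are copied from the `wget -H -N -kp` form used above.

```shell
#!/bin/sh
# Sketch only: the function name and host names are illustrative assumptions.

# Print the wget command for one page, following only the listed domains
# when spanning hosts. Nothing is downloaded by this function.
#   $1      page URL
#   $2...   domains that wget is allowed to fetch from
span_command() {
    url=$1
    shift
    # Join the remaining arguments with commas, as -D expects.
    domains=$(IFS=,; printf '%s' "$*")
    printf 'wget -H -N -k -p -D %s %s\n' "$domains" "$url"
}

span_command http://example.com/index.html example.com static.example.net
```

Piping the printed line to `sh` would run it; listing the original domain alongside the image host keeps same-site requisites eligible too.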

"-r -l 1 -H" , -A , css:

0
