Christopher Juckins

SysAdmin Tips, Tricks and other Software Tools


mirror_websites_with_wget

Differences

This shows you the differences between two versions of the page.

mirror_websites_with_wget [2013/09/12 21:32]
juckins created
mirror_websites_with_wget [2014/05/20 21:14] (current)
juckins
Line 1: Line 1:
 http://www.techrepublic.com/blog/opensource/mirroring-web-sites-with-wget/883
  
 +--
  
 +The quickest and easiest way to mirror a remote Web site is to use wget. Wget is similar to cURL (and I'll be the first to admit that I prefer cURL over wget), but wget has some really slick and useful features that aren't found in cURL, such as a means to download an entire Web site for local viewing:
 +
 +$ wget -rkp -l6 -np -nH -N http://example.com/
 +
 +This command does a number of things. The combined -rkp options tell wget to download recursively (-r), to convert links in the downloaded HTML pages to point to local files (-k), and to obtain all images and other files needed to properly render each page (-p).
 +
 +The -l6 option tells wget to recurse to a maximum of six nested levels, while -np tells it never to ascend to the parent directory of the starting URL. The -nH option tells wget not to create host directories; this means that the files will be downloaded to the current directory rather than a directory named after the hostname of the site being mirrored.
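
Because -nH drops files straight into the current directory, it can help to send the mirror to a dedicated directory. The sketch below is only an illustration: the function name mirror_site and the WGET override variable are hypothetical and not part of the original tip, while -P is wget's directory-prefix option.

```shell
# Hypothetical helper (not from the original article): mirror a site into
# a chosen directory instead of the current one.
# Set WGET=echo to preview the arguments without touching the network.
mirror_site() {
    [ -n "$1" ] || { echo "usage: mirror_site URL [DEST_DIR]" >&2; return 2; }
    url=$1
    dest=${2:-.}                # default to the current directory
    mkdir -p "$dest" || return 1
    # -P sets the directory prefix under which wget saves everything
    ${WGET:-wget} -rkp -l6 -np -nH -N -P "$dest" "$url"
}
```

For example, `WGET=echo mirror_site http://example.com/ ./mirror` prints the arguments that would be passed to wget; without the override, the same call runs the actual download.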
 +
 +Finally, -N tells wget to use time-stamping, which is its way of avoiding re-downloading a file that has not changed since the last run. Unfortunately, with dynamic sites being the norm, this may not work very well (dynamically generated pages often send no usable Last-Modified header), but it's worth adding, regardless.
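
For use in scripts, the same command may be easier to read with wget's long option names. This is a minimal sketch: the variable names MIRROR_URL and WGET_ARGS are made up for illustration, and the command is only printed, not run.

```shell
# Long-option spelling of: wget -rkp -l6 -np -nH -N http://example.com/
# MIRROR_URL and WGET_ARGS are illustrative names, not anything wget defines.
MIRROR_URL="http://example.com/"
WGET_ARGS="--recursive --convert-links --page-requisites --level=6 --no-parent --no-host-directories --timestamping"

# $WGET_ARGS is left unquoted on purpose so the shell splits it into
# separate options. Remove the leading "echo" to actually start the mirror.
echo wget $WGET_ARGS "$MIRROR_URL"
```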
mirror_websites_with_wget.txt · Last modified: 2014/05/20 21:14 by juckins