How to Download a Whole Website Using Wget
I remember that back when I started learning about the different Linux/Unix tools, I was fascinated by the great wget HTTP downloader. Believe me, you will end up using it quite a lot, whether for running a cron job that updates a website or for more interesting things, like downloading a whole website.
You have certainly been in a position where you wanted to download a website because of the wealth of information on it. In this short post, I will show you how to do that easily without tools like Teleport Pro on Windows, using just the awesome wget. Note that while you can download a whole website made of static pages, websites written in server-side languages like PHP cannot be copied the same way. PHP code runs on the server, so wget only ever receives the HTML that the server generates for each request, never the PHP source itself, and pages that depend on user interaction such as forms or logins will not mirror properly. Downloading those is possible, but not this way.
However, most websites that people want to download are those with lots of static content such as videos and images. These websites can be downloaded with wget. The command to download a whole website is pretty simple:
wget -mk www.website.com
This creates a mirror (-m) of the website we specify and converts its links to local links (-k) to make offline browsing easy. Sometimes this runs into a problem: some websites check the user agent field of the request to detect whether wget is being used to download the whole website, and refuse to serve it. By default, wget identifies itself with a user agent that starts with “Wget”, whether or not -m is used. Therefore, if we want to present another user agent, like Mozilla, we should download the whole website as follows:
wget -r -k -U Mozilla www.website.com
We specify -r here to download the website recursively (-m does this too, and additionally turns on timestamping and removes the recursion depth limit). Moreover, using the -U switch, we set the user agent to “Mozilla” instead of the default “Wget”.
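The command above can be wrapped in a small shell function so the options are easier to reuse. This is only a sketch: the function name mirror_site and the DRY_RUN variable are made up for this example and are not part of wget. The extra options (--page-requisites, --no-parent, --wait) are real wget options that go beyond what the post uses, added here on the assumption that you also want images and stylesheets and want to be gentle on the server.

```shell
# mirror_site: hypothetical helper (not part of wget) that wraps the
# recursive download shown above. Written for plain POSIX sh.
# Set DRY_RUN=1 to print the command instead of running it.
mirror_site() {
    url="${1:-}"
    agent="${2:-Mozilla}"   # user agent string, "Mozilla" as in the post
    if [ -z "$url" ]; then
        echo "usage: mirror_site URL [USER_AGENT]" >&2
        return 1
    fi
    # --recursive        same as -r: follow links
    # --convert-links    same as -k: rewrite links for offline browsing
    # --page-requisites  same as -p: also fetch images, CSS and the like
    # --no-parent        same as -np: never climb above the start directory
    # --wait=1           pause one second between requests
    # --user-agent       same as -U
    set -- wget --recursive --convert-links --page-requisites \
        --no-parent --wait=1 --user-agent="$agent" "$url"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Running DRY_RUN=1 mirror_site www.website.com prints the full wget command without touching the network, which is a cheap way to check the options before starting a long download.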
Bonus Tip – Continue Download In Case of Interruption
For some reason, wget might be interrupted while you are downloading a website. This can be very frustrating, especially if you are downloading big chunks of data like videos. The good thing is that you don’t have to redownload the same files. With the -c switch, wget continues a partially downloaded file from the point where it stopped instead of starting over. Doing that is easy: if your download gets interrupted, just run the same command you used in the first place, adding the -c switch after wget, like:
wget -c -r -k -U Mozilla www.website.com
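If the connection drops often, the resume command can be rerun automatically. The following is a minimal sketch, not something wget provides: retry_mirror and the variables WGET, MAX_TRIES and RETRY_DELAY are names invented for this example.

```shell
# retry_mirror: hypothetical helper that reruns the resume command from
# the post until wget exits successfully or the attempts run out.
# WGET, MAX_TRIES and RETRY_DELAY are assumptions made for this sketch.
retry_mirror() {
    url="${1:-}"
    wget_cmd="${WGET:-wget}"       # command to run, overridable for testing
    max_tries="${MAX_TRIES:-5}"    # how many times to try in total
    delay="${RETRY_DELAY:-10}"     # seconds to wait between attempts
    if [ -z "$url" ]; then
        echo "usage: retry_mirror URL" >&2
        return 1
    fi
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # -c resumes partially downloaded files instead of restarting them
        if "$wget_cmd" -c -r -k -U Mozilla "$url"; then
            echo "finished after $try attempt(s)"
            return 0
        fi
        echo "attempt $try of $max_tries failed" >&2
        try=$((try + 1))
        if [ "$try" -le "$max_tries" ]; then
            sleep "$delay"
        fi
    done
    echo "giving up after $max_tries attempts" >&2
    return 1
}
```

One caveat: wget also exits with a non-zero status when a single link on the site returns an error, so this loop may retry a run that had effectively finished. Keeping MAX_TRIES small limits the wasted requests.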

Now I know how people make a duplicate site! Thanks for sharing!
loved this article!
Hello! This article was very interesting and fun for me, but it was difficult to find with Ask. Maybe you should improve it with SEO plugins for WordPress like HeadSpace.
Thank you for this useful topic. I added to favorites.
I am really thankful for this topic because it gives really useful information.
This was really SO very helpful for me, thanks for making this information available to everyone for free.
Thanks good article and useful. Have a good article like this forever.
Hello Spyro,
Thank you for all these wonderful articles about programming. I am a former programmer who hasn’t programmed for a while and am now trying to learn web development. I have been reading your posts for hours, and they are really helpful.
Thank you very much. I really appreciate it.
@ken : You are welcome, thank you for the nice words
Hi, can I download the server-side pages of a website? Is there any tool available for downloading server-side pages?
@bharat : You cannot download anything that runs on the server, and the same goes for PHP. If you could, it would be a serious security threat. What runs on the server stays on the server.