Wget command in Linux/Unix
wget is a non-interactive network downloader. It can download files from a server even when the user is not logged on to the system, and it can run in the background without interfering with the current process.
- GNU wget is a free utility for non-interactive download of files from the Web. It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies.
- wget is non-interactive, meaning that it can work in the background while the user is not logged on. This allows you to start a retrieval and disconnect from the system, letting wget finish the work. By contrast, most web browsers require the user's constant presence, which can be a great hindrance when transferring a lot of data.
- wget can follow links in HTML and XHTML pages and create local versions of remote web sites, fully recreating the directory structure of the original site. This is sometimes referred to as recursive downloading. While doing that, wget respects the Robot Exclusion Standard (/robots.txt). wget can be instructed to convert the links in downloaded HTML files to the local files for offline viewing.
- wget has been designed for robustness over slow or unstable network connections; if a download fails due to a network problem, it will keep retrying until the whole file has been retrieved. If the server supports resuming, it will instruct the server to continue the download from where it left off.
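The recursive download and link conversion described above can be sketched as a small shell helper. This is a minimal sketch, not a definitive recipe: the function name mirror_site, the DRY_RUN switch, and the example URL are assumptions made for illustration, while the options used (--recursive, --no-parent, --page-requisites, --convert-links, --wait) are standard GNU wget options.

```shell
#!/bin/sh
# mirror_site (hypothetical helper): fetch a site recursively for offline viewing.
# By default the command is only printed; set DRY_RUN=0 to really run wget.
mirror_site() {
    # --recursive        follow links found in downloaded pages
    # --no-parent        never ascend above the starting directory
    # --page-requisites  also fetch images and stylesheets needed to render pages
    # --convert-links    rewrite links in the saved HTML to point at local files
    # --wait=1           pause one second between retrievals
    set -- wget --recursive --no-parent --page-requisites --convert-links --wait=1 "$1"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

mirror_site "http://www.example.com/docs/"   # placeholder URL
```

Printing the command first makes it easy to check the options before starting a long recursive retrieval.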
Syntax:
wget [options] [URL]
Examples:
1. To simply download a webpage
wget http://www.example.com/samplepage.php
2. To download a file in the background
wget -b http://www.example.com/samplepage.php
3. To write wget's log to a file, overwriting any existing log file
wget -o /path/logfile.txt http://www.example.com/filename.txt
4. To resume a partially downloaded file
wget -c http://example.com/samplefile.tar.gz
5. To retry a download up to a given number of times
wget --tries=10 http://example.com/samplefile.tar.gz
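The examples above can also be combined in a single command. The sketch below wraps one such combination in a helper; the function name start_download, the DRY_RUN switch, and the log file name download.log are assumptions for illustration only.

```shell
#!/bin/sh
# start_download (hypothetical helper): begin a resumable background download.
#   $1 = URL to fetch, $2 = log file to write
# By default the command is only printed; set DRY_RUN=0 to really run wget.
start_download() {
    # -b          go to the background right after start-up
    # -c          continue a partially downloaded file instead of starting over
    # --tries=10  give up after 10 attempts
    # -o          write messages to the given log instead of the default wget-log
    set -- wget -b -c --tries=10 -o "$2" "$1"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_download "http://example.com/samplefile.tar.gz" "download.log"
```

When the command is run for real, progress can be followed with tail -f download.log.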
Options:
1. -V / --version : This is used to display the version of wget installed on your system. (Lowercase -v is a different option: it turns on verbose output.)
2. -h / --help : This is used to print a help message describing all of wget's command-line options.
$ wget -h
3. -o logfile / --output-file=logfile : This option directs all messages generated by wget to the specified log file, overwriting the file if it already exists. If no log file is specified, messages are written to standard error (or to wget-log when wget runs in the background).
$ wget -o logfile [URL]
4. -b / --background : This option sends wget to the background immediately after startup, so that you can carry on with other work. If no output file is specified via the -o option, output is redirected to wget-log by default.
$ wget -b [URL]
5. -a logfile / --append-output=logfile : This option appends the output messages to the log file instead of overwriting it. With -o the existing log file is overwritten; with -a the log of the previous command is kept and the current log is written after it.
$ wget -a logfile [URL]
6. -i file / --input-file=file : This option is used to read URLs from a file. If - is specified as file, URLs are read from the standard input. If this option is used, no URLs need be present on the command line. If there are URLs both on the command line and in an input file, those on the command line will be the first ones to be retrieved. The file need not be an HTML document if the URLs are just listed sequentially.
$ wget -i inputfile
$ wget -i inputfile [URL]
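As a minimal sketch of how -i might be used, the script below writes a small URL list and passes it to wget. The helper name fetch_list, the DRY_RUN switch, and the two URLs are placeholders invented for this example.

```shell
#!/bin/sh
# Create a sample URL list, one URL per line (placeholder URLs).
list=$(mktemp)
cat > "$list" <<'EOF'
http://www.example.com/file1.txt
http://www.example.com/file2.txt
EOF

# fetch_list (hypothetical helper): download every URL named in a file.
# By default the command is only printed; set DRY_RUN=0 to really run wget.
fetch_list() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "wget -i $1"
    else
        wget -i "$1"
    fi
}

fetch_list "$list"

# Reading the same list from standard input instead ("-" as the file name):
#   wget -i - < "$list"
```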
7. -t number / --tries=number : This option sets the number of tries to the specified number. Specify 0 or inf for infinite retrying. The default is to retry 20 times, with the exception of fatal errors like "connection refused" or "not found" (404), which are not retried.
$ wget -t number [URL]
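The idea behind a bounded number of tries can be illustrated with a plain shell loop. This is only a sketch of the concept, not wget's actual implementation (wget also skips retries for fatal errors); the names retry and flaky are hypothetical, and flaky is a stand-in for a download that fails twice before succeeding.

```shell
#!/bin/sh
# retry (hypothetical helper): run a command up to $1 times, stopping at the
# first success. Returns 0 on success, 1 if every attempt failed.
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Stand-in for an unreliable download: fails twice, then succeeds.
count_file=$(mktemp)
echo 0 > "$count_file"
flaky() {
    n=$(( $(cat "$count_file") + 1 ))
    echo "$n" > "$count_file"
    [ "$n" -ge 3 ]
}

retry 5 flaky && echo "succeeded after $(cat "$count_file") attempts"
```

With wget itself the loop is unnecessary: wget -t 5 [URL] asks wget to do the retrying.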
8. -c / --continue : This option is used to resume a partially downloaded file. Resuming works only if the server supports it; if the server does not, the download cannot be continued from where it left off.
$ wget -c [URL]
9. -w seconds / --wait=seconds : This option makes wget wait the specified number of seconds between retrievals. Use of this option is recommended, as it lightens the server load by making the requests less frequent. Instead of seconds, the time can be specified in minutes using the m suffix, in hours using the h suffix, or in days using the d suffix. Specifying a large value for this option is useful if the network or the destination host is down, so that wget can wait long enough to reasonably expect the network error to be fixed before the retry.
$ wget -w seconds [URL]
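To make the suffixes concrete, the sketch below converts a -w style value into seconds. The helper wait_to_seconds is hypothetical and handles whole numbers only; it illustrates the arithmetic and is not how wget parses the value internally.

```shell
#!/bin/sh
# wait_to_seconds (hypothetical helper): convert a -w style value such as
# 30, 2m, 1h or 1d into a number of seconds.
wait_to_seconds() {
    value=$1
    case $value in
        *m) echo $(( ${value%m} * 60 )) ;;      # minutes
        *h) echo $(( ${value%h} * 3600 )) ;;    # hours
        *d) echo $(( ${value%d} * 86400 )) ;;   # days
        *)  echo "$value" ;;                    # plain seconds
    esac
}

wait_to_seconds 30    # prints 30
wait_to_seconds 2m    # prints 120
wait_to_seconds 1h    # prints 3600
wait_to_seconds 1d    # prints 86400
```

So wget -w 2m [URL] waits as long between retrievals as wget -w 120 [URL].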
10. -r / --recursive : This option turns on recursive retrieving. wget downloads the given URL, then follows the links it finds there and downloads those pages as well, down to a default maximum depth of 5.
$ wget -r [URL]
This article is contributed by Mohak Agrawal.