How to Recursively Fetch Files from a Website using ‘wget’
Command: `wget -r --no-parent target.com/dir`
Sample output:

```
--2023-04-07 15:30:00-- http://target.com/dir/
Resolving target.com (target.com)... 192.168.1.1
Connecting to target.com (target.com)|192.168.1.1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4096 (4.0K) [text/html]
Saving to: ‘target.com/dir/index.html’
...
FINISHED --2023-04-07 15:30:01--
Total wall clock time: 1s
Downloaded: 10 files, 248K in 0.1s (1.73 MB/s)
```
Description:
`wget` is a command-line tool for downloading files from the internet. The `-r` option tells `wget` to recursively download all files from the specified directory and its subdirectories. The `--no-parent` option tells `wget` not to download files from directories above the specified directory.
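To make the effect of `--no-parent` concrete, here is a hedged sketch contrasting the two forms; `target.com/dir` is the placeholder host and path from the example above, and the explicit `-l 5` depth limit is an added assumption (it simply restates `wget`'s default recursion depth):

```bash
# Without --no-parent, links that point above /dir (for example to
# http://target.com/) may be followed during recursion, pulling in
# content outside the directory you asked for.
wget -r http://target.com/dir/

# With --no-parent, recursion stays inside /dir and its subdirectories.
# -l 5 sets the maximum recursion depth (5 is also wget's default).
wget -r --no-parent -l 5 http://target.com/dir/
```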
In the example command above, `wget -r --no-parent target.com/dir` will fetch all the files within the directory `dir` on the website `target.com`. The output shows that `wget` connects to the website and starts downloading the files, including `index.html` and other files in the subdirectories. The total number of downloaded files and the time it took to download them are also displayed.
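By default, `wget` recreates the remote layout under the current directory, so the files land in `target.com/dir/...` as shown in the `Saving to:` line above. If a different local layout is wanted, standard options such as `-P`, `-nH`, and `--cut-dirs` can reshape the saved paths; the `downloads/` prefix below is an arbitrary choice for illustration, not part of the original example:

```bash
# -P downloads    save everything under ./downloads instead of ./
# -nH             drop the "target.com/" host directory from saved paths
# --cut-dirs=1    also drop the leading "dir/" path component
wget -r --no-parent -nH --cut-dirs=1 -P downloads http://target.com/dir/
```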
`wget` with the `-r` and `--no-parent` options is a powerful tool for fetching large amounts of data from a website, especially when combined with other options such as `--limit-rate` to cap the download speed or `-nc` to avoid re-downloading files that already exist locally.
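As a sketch of how those options combine, the command below adds a bandwidth cap, a pause between requests, and no-clobber behaviour to the original example; the specific `200k` rate and one-second wait are illustrative values, not recommendations taken from the example above:

```bash
# --limit-rate=200k   cap the download speed at roughly 200 KB/s
# --wait=1            pause one second between successive requests
# -nc                 (--no-clobber) skip files that already exist locally
wget -r --no-parent --limit-rate=200k --wait=1 -nc http://target.com/dir/
```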