Requirement: on Linux, download all files under a specified folder of an FTP web page.
The wget command can do this.
wget -r -np -nH --cut-dirs=8 -R index.html* \
Solution: wget -r -np -nH --cut-dirs=3 -R index.html http://hostname/aaa/bbb/ccc/ddd/
Explanation:
- It will download all files and subfolders in the ddd directory.

-r
: recursive; treats the target as a folder and descends into it

-np
: do not ascend to upper directories, like _ccc/…_; parent folders are not fetched

-nH
: do not save files into a folder named after the hostname (the FTP server name)

--cut-dirs=3
: save into ddd by omitting the first 3 folders _aaa_, _bbb_, _ccc_; this sets how many leading directory levels are dropped from the local path

-R index.html
: exclude _index.html_ files

-A pdf
: only download PDF files (the accept list is the uppercase -A; lowercase -a is wget's log-append option)

-e robots=off
: ignore the robots protocol. If a robots.txt file disallows downloading the files in the directory, the command above won't work, and you need to add this option.
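
To make the effect of -nH and --cut-dirs more concrete, here is a minimal sketch in plain shell. The helper names `local_path` and `fetch_pdfs` are hypothetical (they are not part of wget), and `file.pdf` is an assumed example file name. `local_path` only imitates how the remote path is shortened into the local path; `fetch_pdfs` is one possible way to combine the options described above.

```shell
# Hypothetical helper (not part of wget): imitate how -nH and --cut-dirs=N
# turn a remote URL into the local path the file would be saved under.
# The body runs in a subshell so its variables do not leak to the caller.
local_path() (
  path="${1#*://}"      # drop the scheme: hostname/aaa/bbb/ccc/ddd/file.pdf
  path="${path#*/}"     # -nH: drop the hostname component
  n="$2"
  while [ "$n" -gt 0 ]; do
    path="${path#*/}"   # --cut-dirs: drop one leading directory per pass
    n=$((n - 1))
  done
  printf '%s\n' "$path"
)

# Hypothetical wrapper: fetch only PDFs below URL ($1), cutting $2 leading
# directories and ignoring robots.txt. Defined here but not called.
fetch_pdfs() {
  wget -r -np -nH --cut-dirs="$2" -A pdf -e robots=off "$1"
}

local_path "http://hostname/aaa/bbb/ccc/ddd/file.pdf" 3   # prints ddd/file.pdf
```

With the same assumptions, `fetch_pdfs http://hostname/aaa/bbb/ccc/ddd/ 3` would correspond to the solution command above, restricted to PDF files.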