A note to self, since I might need this someday.
$ wget \
    --recursive \
    --no-clobber \
    --page-requisites \
    --html-extension \
    --convert-links \
    --restrict-file-names=windows \
    --domains website.org \
    --no-parent \
    www.website.org/tutorials/html/
Option | Description |
--recursive | Follow links recursively to download the whole site (wget's default depth limit is 5 levels). |
--no-clobber | Don't overwrite any existing files (used in case the download is interrupted and resumed). |
--page-requisites | Get all the elements that compose the page (images, CSS and so on). |
--html-extension | Save HTML files with the .html extension (the option is named --adjust-extension in wget 1.12 and later). |
--convert-links | Convert links so that they work locally, off-line. |
--restrict-file-names=windows | Modify filenames so that they will work in Windows as well. |
--domains website.org | Don't follow links outside website.org. |
--no-parent | Don't follow links outside the directory tutorials/html/. |
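
If the command gets reused for other sites, it may be handy to wrap it in a small shell function. The following is only a sketch: the function name `mirror_site` and its `runner` parameter are hypothetical and not part of the original command. Passing `echo` as the runner prints the arguments instead of downloading, so the option list can be checked before anything touches the network.

```shell
#!/bin/sh
# Sketch only: mirror_site and its parameters are hypothetical helper names.
# Usage: mirror_site RUNNER DOMAIN START_URL
#   RUNNER    "wget" to download, "echo" to only print the arguments (dry run)
#   DOMAIN    domain that links must stay inside, e.g. website.org
#   START_URL page to start from, e.g. www.website.org/tutorials/html/
mirror_site() {
    runner="$1"
    domain="$2"
    start_url="$3"

    # Same options as in the table above, in the same order.
    "$runner" \
        --recursive \
        --no-clobber \
        --page-requisites \
        --html-extension \
        --convert-links \
        --restrict-file-names=windows \
        --domains "$domain" \
        --no-parent \
        "$start_url"
}

# Dry run: prints the argument list without contacting the server.
mirror_site echo website.org www.website.org/tutorials/html/
```

To actually download, the runner would be swapped for the real command: `mirror_site wget website.org www.website.org/tutorials/html/`.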
Reference:
http://www.linuxjournal.com/content/downloading-entire-web-site-wget