Friday, April 3, 2009

A small wget setting for downloading files from restricted sites

Translated from: http://www.askapache.com/tools/wget-header-trick.html


I am often logged in to my servers via SSH and need to download a file, such as a WordPress plugin. Many sites now block robots like wget from accessing their files, usually through rules in .htaccess. A permanent workaround is to have wget mimic a normal browser.

Using alias

Add this to your .bash_profile or other shell startup script, or just type it at the prompt. Then run wget from the command line as usual, e.g. wget -dnv http://www.askapache.com/sitemap.xml.

alias wget='wget --referer="http://www.google.com" --user-agent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6" --header="Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5" --header="Accept-Language: en-us,en;q=0.5" --header="Accept-Encoding: gzip,deflate" --header="Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7" --header="Keep-Alive: 300"'
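Bash does not expand aliases in non-interactive scripts unless expand_aliases is set, so the alias above only helps at the prompt. A shell function is one way to get the same effect inside scripts. The following is a minimal sketch, not part of the original article; the function name wget_browser is made up for illustration, and the options are copied from the alias.

```shell
# wget_browser is a hypothetical name; it passes the same options as the alias.
wget_browser() {
    command wget \
        --referer="http://www.google.com" \
        --user-agent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6" \
        --header="Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5" \
        --header="Accept-Language: en-us,en;q=0.5" \
        --header="Accept-Encoding: gzip,deflate" \
        --header="Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7" \
        --header="Keep-Alive: 300" \
        "$@"   # forward the caller's own options and URLs unchanged
}
```

It is called like wget itself, for example wget_browser -dnv http://www.askapache.com/sitemap.xml. Using "command wget" inside the function skips any function named wget, so the wrapper cannot call itself by accident.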

Using custom .wgetrc

Alternatively, put the following in the .wgetrc file in your home directory ($HOME/.wgetrc). You can then run wget as usual, e.g. wget -dnv http://www.askapache.com/sitemap.xml.

###
### Sample Wget initialization file .wgetrc by http://www.askapache.com
###
##
## Local settings (for a user to set in his $HOME/.wgetrc). It is
## *highly* undesirable to put these settings in the global file, since
## they are potentially dangerous to "normal" users.
##
## Even when setting up your own ~/.wgetrc, you should know what you
## are doing before doing so.
##

header = Accept-Language: en-us,en;q=0.5
header = Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
header = Accept-Encoding: gzip,deflate
header = Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
header = Keep-Alive: 300
user_agent = Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
referer = http://www.google.com
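
If you would rather not send these headers on every download, one option is to keep them in a separate file and select it through the WGETRC environment variable, which wget reads in place of $HOME/.wgetrc when it is set. The following is a minimal sketch, not part of the original article; the helper name write_browser_wgetrc and the file name used in the note after it are assumptions for illustration.

```shell
# write_browser_wgetrc is a hypothetical helper: it writes the settings
# listed above to the path given as its first argument.
write_browser_wgetrc() {
    cat > "$1" <<'EOF'
header = Accept-Language: en-us,en;q=0.5
header = Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
header = Accept-Encoding: gzip,deflate
header = Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
header = Keep-Alive: 300
user_agent = Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
referer = http://www.google.com
EOF
}
```

After running write_browser_wgetrc "$HOME/.wgetrc-browser" once, a single download can opt in with WGETRC="$HOME/.wgetrc-browser" wget -dnv http://www.askapache.com/sitemap.xml, while plain wget keeps its default behavior.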
