Problem description
I have a task to download GBs of data from a website. The data is in the form of .gz files, each file being 45mb in size.
The easy way to get the files is to use "wget -r -np -A files url". This downloads the data recursively and mirrors the website. The download rate is very high, 4mb/sec.
But, just to play around, I was also using Python to build my own URL parser.
Downloading via Python's urlretrieve is quite slow, possibly 4 times as slow as wget. The download rate is 500kb/sec. I use HTMLParser to parse the href tags.
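For comparison, here is a minimal sketch of streaming a download in larger chunks instead of calling urlretrieve, which historically copies in small blocks. The chunk size and the URL below are illustrative assumptions, not values from the question:

```python
import shutil
import urllib.request


def download(url, dest, chunk_size=256 * 1024):
    """Stream a URL to a local file in large chunks.

    Reading in bigger chunks reduces per-read overhead compared with
    small default block sizes; 256 KB here is an arbitrary choice,
    not a tuned value.
    """
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out, length=chunk_size)


# Hypothetical usage -- the URL is a placeholder:
# download("http://example.com/data/file001.gz", "file001.gz")
```

In practice, though, the bottleneck is usually the network rather than the copy loop, which is why the unit check in the answer below matters.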
I am not sure why this is happening. Are there any settings for this?
Thanks
Recommended answer
Probably a unit math error on your part.
Just notice that 500KB/s (kilobytes) equals 4Mb/s (megabits).
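The arithmetic, assuming decimal (SI) prefixes where 1 KB = 1000 bytes:

```python
# Convert 500 kilobytes/second to megabits/second.
kb_per_sec = 500
bits_per_sec = kb_per_sec * 1000 * 8      # 1 KB = 1000 bytes, 1 byte = 8 bits
mbit_per_sec = bits_per_sec / 1_000_000   # 1 Mb = 1,000,000 bits
print(mbit_per_sec)  # 4.0
```

So wget's "4mb/sec" (megabits) and urlretrieve's "500kb/sec" (kilobytes) describe the same transfer rate; with binary prefixes (1 KB = 1024 bytes) the figure comes out at 4.096 instead.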