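For work, I am trying to scrape bulk data from the Environment Canada web page, which actually has its own instructions for doing so: ftp://ftp.tor.ec.gc.ca/Pub/Get_More_Data_Plus_de_donnees/Readme.txt

Whenever I run my code, I always get Error 10054: "An existing connection was forcibly closed by the remote host." As a fairly novice programmer, I am wondering whether the site simply does not like my program (I did test it at an early stage against a provincial government web site, and it seemed to retrieve information there), or whether there is a specific error in my code that prevents me from connecting properly. Any advice on how to proceed is welcome. Thanks.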

Here is my code; the last try/except block is my attempt to retry the connection after getting the IOError message:

import math
import datetime
import sys
import os
import urllib

# out_folder is relative to local directory
# station id is arbitrary; figure this out from the Web site
#   by inspecting the URL of the stations Web page
[station, start_year, end_year, out_folder] = sys.argv[1:5]

print "retrieving data for station "+station+" for years "+start_year+" to "+end_year+" and saving in folder ./"+out_folder+"\n"

# generate filenames and download them
for year in range(int(start_year), int(end_year)+1):
    for month in range(1, 2):

        url = "http://climate.weather.gc.ca/climateData/bulkdata_e.html?format=csv&stationID="+str(station)+"&Year="+str(year)+"&Month="+str(month+1)+"&Day=1&timeframe=2&submit=Download+Data"
        filename = 'stn_'+str(station)+'_'+str(year)+'.csv'
        print 'stn_'+str(station)+'_'+str(year)+'.csv'
        try:
            print "Trying to retrieve data; please hold"
            urllib.urlretrieve(url, out_folder+'\\'+filename)
        except IOError:
            os.mkdir(out_folder)
            print "folder "+out_folder+" does not exist yet, creating it ...\n"
            try:
                print "Trying to retrieve data; please hold"
                urllib.urlretrieve(url, out_folder+'\\'+filename)
            except IOError:
                print "Trying to retrieve data; please hold"
                urllib.urlretrieve(url, out_folder+'\\'+filename)

exit()

Also, in case it helps, the last lines of the traceback are:

File "C:\Python27\lib\socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
IOError: [Errno socket error] [Errno 10054] An existing connection was forcibly...
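
A side note on the retry logic described above: the `except IOError` branch treats every failure as a missing folder, so a network reset ends up calling `os.mkdir` and then retrying blindly. Below is a minimal sketch of one way to separate the two concerns, written for Python 3 (an assumption; the question uses Python 2.7, where the call is `urllib.urlretrieve`). The function name `download_with_retry` and its `attempts`, `delay` and `fetch` parameters are hypothetical and chosen only for illustration.

```python
import os
import time
import urllib.request


def download_with_retry(url, dest_path, attempts=3, delay=2.0,
                        fetch=urllib.request.urlretrieve):
    """Download url to dest_path, retrying a few times on OSError.

    `fetch` is a hypothetical injection point so the retry logic can be
    exercised without touching the network.
    Returns the number of the attempt that succeeded.
    """
    # Create the output folder up front, so that a missing folder is
    # never confused with a network failure later on.
    folder = os.path.dirname(dest_path)
    if folder:
        os.makedirs(folder, exist_ok=True)

    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            fetch(url, dest_path)
            return attempt
        except OSError as exc:  # IOError is an alias of OSError in Python 3
            last_error = exc
            if attempt < attempts:
                # Wait a little longer after each failed attempt.
                time.sleep(delay * attempt)
    # All attempts failed: surface the last error instead of hiding it.
    raise last_error
```

In the loop from the question this could be called as `download_with_retry(url, os.path.join(out_folder, filename))`. Retrying will not help if the network itself blocks the connection, but re-raising the last error at least makes that visible.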

Best answer

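Sorry for the unnecessary question; the problem was that my own network was blocking the connection. I switched to Wi-Fi and the program ran fine.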
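
Since the cause here turned out to be the network rather than the code, it may help to tell a connection reset apart from a local file error before deciding what to fix. The following is a rough sketch for Python 3 (an assumption; the question uses Python 2.7). The helper name `is_connection_reset` and the constant `WSAECONNRESET` are hypothetical names introduced for illustration; 10054 is the Windows socket error code quoted in the question.

```python
import errno
import urllib.error

# Windows socket error code from the question's traceback.
WSAECONNRESET = 10054


def is_connection_reset(exc):
    """Return True if exc looks like 'connection reset by the remote host'."""
    # urllib wraps some socket errors in URLError and keeps the
    # underlying exception in .reason (which can also be a plain string).
    if isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, BaseException):
        exc = exc.reason
    if isinstance(exc, ConnectionResetError):
        return True
    # Fall back to the numeric code, covering POSIX and Windows values.
    return getattr(exc, "errno", None) in (errno.ECONNRESET, WSAECONNRESET)
```

If this returns True for the caught exception, creating the output folder or changing the script is unlikely to help, and trying another network (as I ended up doing) is a reasonable next step.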

Regarding python - IOError when web scraping Environment Canada, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37418395/
