My function is executed by threads:
def getdata(self, page, ...):
    tries = 10
    for n in range(tries):
        try:
            ...
            datarALL = []
            url = 'http://website/...'.format(...)
            responsedata = requests.get(url, data=data, headers=self.hed, verify=False)
            responsedata.raise_for_status()
            if responsedata.status_code == 200:  # 200 for successful call
                ...
                if ...:
                    break
        except (ChunkedEncodingError, requests.exceptions.HTTPError) as e:
            print("page #{0} run #{1} failed. Returned status code {2}. Reason: {3}. Msg: {4}. Retry.".format(page, n, responsedata.status_code, responsedata.reason, sys.exc_info()[0]))
            if n == tries - 1:
                print("page {0} could not be imported. Max retried reached.".format(page))
                os._exit(1)  # One thread max retried - close all threads and
    return datarALL
It is called as follows:
with ThreadPoolExecutor(max_workers=num_of_workers) as executor:
    futh = [(executor.submit(self.getdata, page, ...)) for page in pages]
    for data in as_completed(futh):
        datarALL.extend(data.result())
print("Finished generating data.")
return datarALL
Sometimes I run into unexpected exceptions, for example:
ConnectionResetError: [Errno 104] Connection reset by peer
which shut the program down. I want to change the code so that, no matter which exception is raised, the thread keeps retrying until it reaches the if n == tries - 1: check. I don't want my threads to die because of a random exception. I read the requests exceptions info page, but I can't see how to catch every exception without listing them all manually. Is there a generic way to do this?
Basically I want something like this:
except (ALL EXCEPTIONS from Requests) as e:
    print("page #{0} run #{1} failed. Returned status code {2}. Reason: {3}. Msg: {4}. Retry.".format(page, n, responsedata.status_code, responsedata.reason, sys.exc_info()[0]))
    if n == tries - 1:
        print("page {0} could not be imported. Max retried reached.".format(page))
        os._exit(1)  # One thread max retried - close all threads and
return datarALL
How can I achieve this?
EDIT:
Using
except Exception as e:
    print("page #{0} run #{1} failed. Returned status code {2}. Reason: {3}. Msg: {4}. Retry.".format(page, n, responsedata.status_code, responsedata.reason, sys.exc_info()[0]))
    if n == tries - 1:
        print("page {0} could not be imported. Max retried reached.".format(page))
        os._exit(1)  # One thread max retried - close all threads and
return datarALL
does not catch it. It gives me this:
Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python3.5/site-packages/urllib3/response.py", line 331, in _error_catcher
yield
File "/home/ubuntu/.local/lib/python3.5/site-packages/urllib3/response.py", line 640, in read_chunked
chunk = self._handle_chunk(amt)
File "/home/ubuntu/.local/lib/python3.5/site-packages/urllib3/response.py", line 595, in _handle_chunk
returned_chunk = self._fp._safe_read(self.chunk_left)
File "/usr/lib/python3.5/http/client.py", line 607, in _safe_read
chunk = self.fp.read(min(amt, MAXAMOUNT))
File "/usr/lib/python3.5/socket.py", line 575, in readinto
return self._sock.recv_into(b)
ConnectionResetError: [Errno 104] Connection reset by peer
The loop does not retry. The run terminates...
EDIT 2:
except requests.exceptions.RequestException as e:
    print("page #{0} run #{1} failed. Returned status code {2}. Reason: {3}. Msg: {4}. Retry.".format(page, n, responsedata.status_code, responsedata.reason, sys.exc_info()[0]))
    if n == tries - 1:
        print("page {0} could not be imported. Max retried reached.".format(page))
        os._exit(1)  # One thread max retried - close all threads and
return datarALL
also does not catch the
ConnectionResetError: [Errno 104] Connection reset by peer
listed above.

Best answer:
Catching all exceptions is generally considered bad practice, because it can hide real problems.
That said, Python exceptions benefit from inheritance: catching a base exception also catches every exception that inherits from it.
See the Python standard exception hierarchy for details.
You can see that the root exception is BaseException, but you should never catch it, because that would also catch Ctrl+C interrupts and generator exits.
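For illustration (not part of the original answer), these relationships can be checked directly with issubclass:

issubclass(KeyboardInterrupt, BaseException)    # True  - Ctrl+C raises KeyboardInterrupt
issubclass(KeyboardInterrupt, Exception)        # False - so "except Exception" does not swallow Ctrl+C
issubclass(GeneratorExit, Exception)            # False - generator exit is also outside Exception
issubclass(ConnectionResetError, Exception)     # True  - ConnectionResetError -> ConnectionError -> OSError -> Exception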
If you want to catch every exception type, you can catch Exception.
You may also want to catch only the exceptions raised by requests. In that case, according to the doc, this can be done by catching the base exception of the requests module: RequestException.
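For example, the two exception types the question already handles both derive from that base class, so a single except requests.exceptions.RequestException clause covers them (a quick illustration, assuming requests is importable):

import requests

issubclass(requests.exceptions.HTTPError, requests.exceptions.RequestException)             # True
issubclass(requests.exceptions.ChunkedEncodingError, requests.exceptions.RequestException)  # True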
If you want to catch both requests exceptions and ConnectionResetError (which is a standard Python exception), you have to list both in the except clause:
except (requests.exceptions.RequestException,
        ConnectionResetError) as err:
    # some code
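Applied to the retry loop from the question, that combined clause could look roughly like the sketch below. This is only an illustration: the URL, page argument, and parsing are placeholders standing in for the question's details, and the log message deliberately avoids responsedata, which may not be bound yet if the GET itself fails.

import requests

def getdata_sketch(page, tries=10):
    # Sketch of the question's retry loop using the combined except clause.
    datarALL = []
    for n in range(tries):
        try:
            url = 'http://website/...'  # placeholder from the question
            responsedata = requests.get(url, verify=False)
            responsedata.raise_for_status()
            datarALL = responsedata.json()  # placeholder for the real parsing
            break
        except (requests.exceptions.RequestException, ConnectionResetError) as err:
            # responsedata may not exist here, so only report the exception itself.
            print("page #{0} run #{1} failed: {2}. Retry.".format(page, n, err))
            if n == tries - 1:
                print("page {0} could not be imported. Max retries reached.".format(page))
    return datarALL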
Alternatively, if you want to be less specific and catch every possible connection error, you can use ConnectionError instead of ConnectionResetError (see the exceptions hierarchy). Finally, you may want to react differently to each exception type. In that case you can do the following:
try:
    # something
except ConnectionError as err:
    # manage connection errors
except requests.exceptions.RequestException as err:
    # manage requests errors
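One ordering caveat when separating the handlers: requests.exceptions.HTTPError is itself a subclass of RequestException, so the more specific clause must be listed first or it will never run. A small self-contained illustration (the raised error is simulated, not from the question):

import requests

try:
    raise requests.exceptions.HTTPError("503 Server Error")  # simulated failure
except requests.exceptions.HTTPError as err:
    print("specific handler:", err)       # runs, because it is listed first
except requests.exceptions.RequestException as err:
    print("generic requests handler:", err)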