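I'm trying to write a function that takes a URL and returns the content at that URL. It has an additional parameter (useTor) which, when set to True, uses SocksiPy to route the request through a SOCKS5 proxy server (Tor, in this case).
I can set the proxy globally for all connections just fine, but I cannot work out two things:
How do I move this setting into the function so that it can be decided by the useTor variable? I am unable to access socks from within the function and don't know how to do so.
I am assuming that if I don't set the proxy, the next request will go direct. The SocksiPy documentation doesn't seem to give any indication of how to reset the proxy.
Can anyone advise? My (beginner's) code is below: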
import gzip
import zlib      # needed for the 'deflate' branch below
import StringIO  # needed to wrap the compressed body in a file-like object
import socks
import socket

def create_connection(address, timeout=None, source_address=None):
    sock = socks.socksocket()
    sock.connect(address)
    return sock

# next line works just fine if I want to set the proxy globally
# socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
socket.socket = socks.socksocket
socket.create_connection = create_connection

import urllib2
import sys

def getURL(url, useTor=False):
    if useTor:
        print "Using tor..."
        # Throws- AttributeError: 'module' object has no attribute 'setproxy'
        socks.setproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
    else:
        print "Not using tor..."
        # Not sure how to cancel the proxy, assuming it persists

    opener = urllib2.build_opener()
    usock = opener.open(url)
    url = usock.geturl()

    encoding = usock.info().get("Content-Encoding")
    if encoding in ('gzip', 'x-gzip', 'deflate'):
        content = usock.read()
        if encoding == 'deflate':
            data = StringIO.StringIO(zlib.decompress(content))
        else:
            data = gzip.GzipFile('', 'rb', 9, StringIO.StringIO(content))
        result = data.read()
    else:
        result = usock.read()

    usock.close()
    return result

# Connect to the same site both with and without using Tor
print getURL('https://check.torproject.org', False)
print getURL('https://check.torproject.org', True)
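Best answer
Example
Just call socksocket.setproxy with no arguments (the method is spelled set_proxy in the newer PySocks fork); this effectively removes any previously set proxy settings on that socket.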
import socks

sck = socks.socksocket()

# use TOR
sck.setproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)

# reset to normal use
sck.setproxy()
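To tie this back to the useTor flag in the question, the same two calls could be wrapped in a small helper. This is only a sketch: apply_proxy_choice is a hypothetical name, not part of SocksiPy, and it assumes the object passed as sock exposes setproxy the way socks.socksocket does. The socks module is passed in as an argument so the helper needs no global state.

```python
def apply_proxy_choice(sock, socks_module, use_tor,
                       host="127.0.0.1", port=9050):
    """Set or clear the SOCKS5 proxy on a single socket.

    Hypothetical helper: `sock` is assumed to be a socks.socksocket
    instance and `socks_module` the imported `socks` module.
    """
    if use_tor:
        # Route this socket through the local Tor SOCKS5 listener.
        sock.setproxy(socks_module.PROXY_TYPE_SOCKS5, host, port)
    else:
        # No arguments: every stored proxy field goes back to its default.
        sock.setproxy()
    return sock
```

With the real library this would be called as apply_proxy_choice(socks.socksocket(), socks, useTor) before connecting. Because the setting lives on the socket instance, each call decides independently whether Tor is used.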
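Details
Looking at the source of socks.py and digging into socksocket.setproxy, we quickly see that to discard any previously set proxy attributes we only need to call the function with no additional arguments (apart from self).
class socksocket(socket.socket):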
    ... # additional functionality ignored
    def setproxy(self, proxytype=None, addr=None, port=None, rdns=True, username=None, password=None):
        """setproxy(proxytype, addr[, port[, rdns[, username[, password]]]])
        Sets the proxy to be used.
        proxytype - The type of the proxy to be used. Three types
                    are supported: PROXY_TYPE_SOCKS4 (including socks4a),
                    PROXY_TYPE_SOCKS5 and PROXY_TYPE_HTTP
        addr -      The address of the server (IP or DNS).
        port -      The port of the server. Defaults to 1080 for SOCKS
                    servers and 8080 for HTTP proxy servers.
        rdns -      Should DNS queries be preformed on the remote side
                    (rather than the local side). The default is True.
                    Note: This has no effect with SOCKS4 servers.
        username -  Username to authenticate with to the server.
                    The default is no authentication.
        password -  Password to authenticate with to the server.
                    Only relevant when username is also provided.
        """
        self.__proxy = (proxytype, addr, port, rdns, username, password)
    ... # additional functionality ignored
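Note: when a new connection is about to be negotiated, the implementation uses the contents of self.__proxy, unless a potentially required element is None (in which case the setting is simply ignored).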
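The bookkeeping described in the note can be illustrated with a toy stand-in. This is not SocksiPy code: ProxySettingsModel, route_for and the SOCKS5 placeholder constant are made up for illustration, and the sketch assumes that a proxy type of None is what makes a connection go direct. It only shows why calling setproxy with no arguments acts as a reset.

```python
SOCKS5 = "socks5"  # placeholder constant, not the library's real value


class ProxySettingsModel(object):
    """Toy model of the stored proxy tuple (illustration only)."""

    def __init__(self):
        # Same field order as the tuple stored by setproxy in socks.py.
        self._proxy = (None, None, None, True, None, None)

    def setproxy(self, proxytype=None, addr=None, port=None,
                 rdns=True, username=None, password=None):
        # Every call overwrites the whole tuple, so a bare call resets it.
        self._proxy = (proxytype, addr, port, rdns, username, password)

    def route_for(self, destination):
        """Report how a connection to `destination` would be made."""
        proxytype, addr, port = self._proxy[:3]
        if proxytype is None:
            # Assumption: no proxy type means a plain, direct connection.
            return ("direct", destination)
        return ("proxy", (addr, port), destination)


model = ProxySettingsModel()
model.setproxy(SOCKS5, "127.0.0.1", 9050)
via_tor = model.route_for(("check.torproject.org", 443))
model.setproxy()  # reset
direct = model.route_for(("check.torproject.org", 443))
```

Here via_tor describes a proxied route and direct a plain one, mirroring the use-Tor and reset calls in the example above.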