Problem description
I need to do a large number of reverse DNS lookups in a short amount of time. I have implemented an asynchronous lookup using socket.gethostbyaddr and a concurrent.futures thread pool, but I am still not seeing the performance I want. For example, the script took about 22 minutes to process 2500 IP addresses.
I was wondering if there is any quicker way to do this without resorting to something like adns-python. I found this post, http://blog.schmichael.com/2007/09/18/a-lesson-on-python-dns-and-threads/, which provided some additional background.
Code Snippet:
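import concurrent.futures
import socket

def get_hostname_from_ip(ip):
    # Return the primary hostname for ip, or "" if the lookup fails.
    try:
        return socket.gethostbyaddr(ip)[0]
    except Exception:
        return ""

ips = [...]
# Define the worker before the pool uses it, and keep the mapped results.
with concurrent.futures.ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(get_hostname_from_ip, ips))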
I think part of the issue is that many of the IP addresses are not resolving and timing out. I tried:
socket.setdefaulttimeout(2.0)
but it seems to have no effect.
I discovered that my main issue was IPs failing to resolve: those lookups did not obey the socket timeout I had set and only failed after 30 seconds. See "Python 2.6 urllib2 timeout issue".
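One likely reason the setting had no effect is that socket.setdefaulttimeout applies to socket objects created afterwards, whereas socket.gethostbyaddr hands the lookup to the system resolver. As a rough, standard-library-only sketch of how the caller's wait could still be bounded, the function below stops waiting after an overall deadline. The name lookup_with_deadline and its lookup and deadline parameters are hypothetical, introduced only for this illustration.

```python
import concurrent.futures
import socket


def lookup_with_deadline(ips, lookup=socket.gethostbyaddr, deadline=5.0, max_workers=16):
    """Map each IP to a hostname, or "" if it failed or missed the deadline.

    `lookup` and `deadline` are hypothetical parameters for this sketch.
    `lookup` must return a tuple whose first item is the hostname.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=max_workers)
    futures = {pool.submit(lookup, ip): ip for ip in ips}

    # Wait at most `deadline` seconds in total, not per address.
    done, not_done = concurrent.futures.wait(futures, timeout=deadline)

    # Drop lookups that are still queued. Lookups already running cannot be
    # interrupted; they finish in the background and their result is ignored.
    for future in not_done:
        future.cancel()
    pool.shutdown(wait=False)

    results = {ip: "" for ip in ips}
    for future in done:
        try:
            results[futures[future]] = future.result()[0]
        except Exception:
            pass  # failed lookup: keep the "" placeholder
    return results
```

A call such as lookup_with_deadline(ips, deadline=10.0) returns after roughly ten seconds at most. This only limits how long the caller waits: the stuck resolver calls still occupy their threads until they give up on their own, so it is a workaround rather than a real per-query timeout.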
adns-python was a no-go because of its lack of support for IPv6 (without patches).
After searching around I found "Reverse DNS Lookups with dnspython" and implemented a similar version in my code (that code also uses an optional thread pool and implements a timeout).
In the end I used dnspython with a concurrent.futures thread pool for asynchronous reverse DNS lookups (see "Python: Reverse DNS Lookup in a shared hosting" and "Dnspython: Setting query timeout/lifetime"). With a timeout of 1 second, this cut the runtime on 2500 IP addresses from about 22 minutes to about 16 seconds. The large difference can probably be attributed to the Global Interpreter Lock on sockets and the 30 second timeouts.
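To get a feel for how much the 30 second timeouts alone could contribute, here is a back-of-the-envelope model. Everything in it is an assumption made for illustration: the function name, the idea that each failing address holds one worker for the full timeout, the 0.05 second cost of a successful lookup, and above all the count of 700 failing addresses. The post does not say how many addresses failed to resolve.

```python
def estimated_runtime_seconds(total_ips, failing_ips, timeout_s, workers,
                              fast_lookup_s=0.05):
    """Crude estimate: total worker-seconds divided evenly across workers.

    Assumes every failing lookup blocks one worker for the full timeout and
    every successful lookup takes `fast_lookup_s` (a made-up figure).
    """
    succeeding_ips = total_ips - failing_ips
    busy_seconds = failing_ips * timeout_s + succeeding_ips * fast_lookup_s
    return busy_seconds / workers


# Hypothetical scenario: 700 of 2500 addresses hit the 30 s timeout, 16 workers.
slow = estimated_runtime_seconds(2500, 700, timeout_s=30, workers=16)
print(round(slow / 60, 1), "minutes")
```

Under these invented numbers the estimate comes out at about 22 minutes, which suggests that long timeouts on a minority of addresses are enough to explain a runtime of that size. It is not a measurement of the original workload.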
Code Snippet:
import concurrent.futures
from dns import resolver, reversename

dns_resolver = resolver.Resolver()
dns_resolver.timeout = 1   # seconds to wait for each nameserver
dns_resolver.lifetime = 1  # total seconds allowed per query

def get_hostname_from_ip(ip):
    # Return the PTR hostname for ip, or "" if the lookup fails or times out.
    try:
        reverse_name = reversename.from_address(ip)
        # query() is named resolve() in dnspython 2.x; [:-1] drops the trailing dot.
        return dns_resolver.query(reverse_name, "PTR")[0].to_text()[:-1]
    except Exception:
        return ""

ips = [...]
with concurrent.futures.ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(get_hostname_from_ip, ips))
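For readers unfamiliar with what reversename.from_address and the [:-1] slice are doing: a reverse lookup is a PTR query for a special name under in-addr.arpa (IPv4) or ip6.arpa (IPv6), and the answer is an absolute name ending in a dot. The sketch below shows both ideas using only the standard library's ipaddress module rather than dnspython. The helper names reverse_pointer_name and strip_root_dot are hypothetical.

```python
import ipaddress


def reverse_pointer_name(ip):
    """Return the PTR query name for an IPv4 or IPv6 address string."""
    # ipaddress omits the trailing root dot that dnspython names carry.
    return ipaddress.ip_address(ip).reverse_pointer


def strip_root_dot(hostname):
    """Drop the trailing root dot from an absolute DNS name, if present."""
    return hostname[:-1] if hostname.endswith(".") else hostname


print(reverse_pointer_name("192.0.2.10"))   # IPv4: octets reversed
print(reverse_pointer_name("2001:db8::1"))  # IPv6: nibbles reversed
print(strip_root_dot("host.example.com."))
```

Checking for the dot before slicing, as strip_root_dot does, is slightly safer than an unconditional [:-1], which would cut a real character off a name that happens to arrive without the trailing dot.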