Problem description
I want to use threading in Python to download a lot of web pages, and I went through the following code, found on a website, which uses queues.
It puts each worker in an infinite while loop. Does each thread run continuously, without ending, until all of them are complete? Am I missing something?
#!/usr/bin/env python
import Queue
import threading
import urllib2
import time

hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
         "http://ibm.com", "http://apple.com"]

queue = Queue.Queue()

class ThreadUrl(threading.Thread):
    """Threaded Url Grab"""
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            #grabs host from queue
            host = self.queue.get()

            #grabs urls of hosts and prints first 1024 bytes of page
            url = urllib2.urlopen(host)
            print url.read(1024)

            #signals to queue job is done
            self.queue.task_done()

start = time.time()

def main():
    #spawn a pool of threads, and pass them queue instance
    for i in range(5):
        t = ThreadUrl(queue)
        t.setDaemon(True)
        t.start()

    #populate queue with data
    for host in hosts:
        queue.put(host)

    #wait on the queue until everything has been processed
    queue.join()

main()
print "Elapsed Time: %s" % (time.time() - start)
Recommended answer
Setting the threads to be daemon threads causes them to exit when the main thread is done. But yes, you are correct: each thread keeps running for as long as there is something in the queue, and otherwise it blocks in queue.get() waiting for the next item.
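The blocking behaviour can be seen in a small sketch. It assumes Python 3 (where the module is named queue rather than Queue) and replaces the download with a trivial doubling step; the names worker, results and still_alive are made up for illustration and are not part of the original code.

```python
import queue
import threading


def worker(q, results):
    # Loop forever: get() blocks whenever the queue is empty.
    while True:
        item = q.get()
        results.append(item * 2)   # stand-in for the real download work
        q.task_done()              # tell the queue this item is finished


q = queue.Queue()
results = []

# daemon=True means this thread will not keep the program alive.
t = threading.Thread(target=worker, args=(q, results), daemon=True)
t.start()

for n in (1, 2, 3):
    q.put(n)

# Returns once task_done() has been called for every item that was put.
q.join()

# The worker has not finished; it is idle, blocked inside q.get().
still_alive = t.is_alive()
```

After q.join() returns, the worker thread still exists and is simply waiting for more work, which is why the daemon flag is what allows the program to exit.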
The Queue documentation explains this in detail.
The Python threading documentation explains the daemon part as well:
The entire Python program exits when no alive non-daemon threads are left.
So, when the queue is emptied and queue.join() returns, the main thread finishes, the interpreter exits, and the daemon threads die with it.
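If you would rather have the workers end on their own instead of being killed at interpreter exit, one common alternative is to send each worker a sentinel value. This is not what the code in the question does; it is only a sketch, assuming Python 3, and the names STOP, worker and out are hypothetical.

```python
import queue
import threading

STOP = None  # hypothetical sentinel: tells a worker to leave its loop


def worker(q, out):
    while True:
        item = q.get()
        if item is STOP:
            q.task_done()
            break                  # leave the loop so the thread can end
        out.append(item.upper())   # stand-in for the real work
        q.task_done()


q = queue.Queue()
out = []

# Non-daemon threads: the program waits for them, so they must end themselves.
threads = [threading.Thread(target=worker, args=(q, out)) for _ in range(3)]
for t in threads:
    t.start()

for word in ("a", "b", "c", "d"):
    q.put(word)

# One sentinel per worker, queued after the real items.
for _ in threads:
    q.put(STOP)

q.join()            # every item, sentinels included, has been processed
for t in threads:
    t.join()        # every worker has returned from its loop
```

With this approach no thread is left blocked in get() at exit, at the cost of having to queue one sentinel per worker.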