线程池,为什么要使用线程池:
1. 线程中可以获取某一个线程的状态或者某一个任务的状态,以及返回值
2. 当一个线程完成的时候我们主线程能立即知道
3. futures可以让多线程和多进程编码接口一致

获取状态或关闭任务





import time

def get_html(times):
    time.sleep(times)
    print("get page {} success".format(times))
    return times

executor = ThreadPoolExecutor(max_workers=2)# 最大执行线程数
#通过submit函数提交执行的函数到线程池中, submit 是立即返回,不会造成主线程阻塞
task1 = executor.submit(get_html, (3))
task2 = executor.submit(get_html, (2))
print(task1.done()) # 判断任务是否完成
print(task2.cancel()) #取消任务,只能在任务没有开始的时候进行cancel

获取成功的task返回

方法一:

import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def get_html(times):
    time.sleep(times)
    print("get page {} success".format(times))
    return times

executor = ThreadPoolExecutor(max_workers=2)
urls = [3, 2, 4]
all_task = [executor.submit(get_html, (url)) for url in urls]
for future in as_completed(all_task): # 返回已经成功的任务的返回值,不会按照线程执行的顺序返回
    data = future.result()
    print("get {} page".format(data))

方法二:

import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def get_html(times):
    time.sleep(times)
    print("get page {} success".format(times))
    return times

executor = ThreadPoolExecutor(max_workers=2)
urls = [3, 2, 4]
#通过executor的map获取已经完成的task的值,这个会按照执行线程的顺序执行,会按照执行线程顺序返回
for data in executor.map(get_html, urls):
    print("get {} page".format(data))

wait,wait第几个任务执行完了以后才执行主线程,不然的话主线程会一直阻塞

from concurrent.futures import ThreadPoolExecutor, wait
import time

def get_html(times):
    time.sleep(times)
    print("get page {} success".format(times))
    return times

executor = ThreadPoolExecutor(max_workers=2)

urls = [3, 2, 4]
all_task = [executor.submit(get_html, (url)) for url in urls]
wait(all_task, return_when=FIRST_COMPLETED)
print("main")
04-19 11:22