python - How to make a computation loop easy to split and resume?

I want to find the best parameters i, j, k in 0..99 for a given computation problem, and I need to run:

for i in range(100):
    for j in range(100):
        for k in range(100):
            dothejob(i, j, k)    # 1 second per computation

This takes 10^6 seconds in total, i.e. about 11.6 days.

I started by splitting the work across 4 processes (to use 100% of the computing power of my 4-core CPU computer):

for i in range(100):
    if i % 4 != 0:      # use != 1, != 2, or != 3 in the parallel scripts #2, #3, #4
        continue
    for j in range(100):
        for k in range(100):
            dothejob(i, j, k)

        with open('done.log', 'a+') as f:    # log what has been done
            f.write("%i %i\n" % (i, j))

But I have problems with this approach:

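I have to run python script.py, then open script.py and replace line 2 with if i % 4 != 1, then run python script.py, then open script.py and replace line 2 with if i % 4 != 2, then run python script.py, then open script.py and replace line 2 with if i % 4 != 3, then run python script.py.
Suppose the loop is interrupted (the computer needs a reboot, something crashes, or anything else happens). At least we know from done.log all the (i, j) pairs that are already done (so we don't need to start from 0 again), but there is no easy way to resume the work. (OK, we could open done.log, parse it, and discard the (i, j) pairs already done when restarting the loop. I started doing this, but I had the feeling of reinventing, in a dirty way, something that already exists.)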
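For illustration, here is a minimal sketch of the manual approach described above, assuming the worker index is passed on the command line instead of being edited into the script, and that done.log keeps the "i j" line format shown earlier. The helper names load_done, pending_pairs and main are hypothetical, and dothejob is only a placeholder for the real computation.

```python
import os
import sys


def dothejob(i, j, k):
    """Placeholder for the real 1-second computation from the question."""
    pass


def load_done(path="done.log"):
    """Return the set of (i, j) pairs already recorded in the log file."""
    done = set()
    if os.path.exists(path):
        with open(path) as f:
            for line in f:
                parts = line.split()
                if len(parts) == 2:  # skip malformed lines
                    done.add((int(parts[0]), int(parts[1])))
    return done


def pending_pairs(worker, n_workers=4, n=100, done=frozenset()):
    """Yield the (i, j) pairs assigned to this worker that are not done yet."""
    for i in range(worker, n, n_workers):  # same split as i % n_workers == worker
        for j in range(n):
            if (i, j) not in done:
                yield i, j


def main(argv):
    worker = int(argv[1])  # e.g. "python script.py 2" runs the i % 4 == 2 share
    done = load_done()
    for i, j in pending_pairs(worker, done=done):
        for k in range(100):
            dothejob(i, j, k)
        with open("done.log", "a") as f:  # log what has been done
            f.write("%i %i\n" % (i, j))


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv)
    else:
        print("usage: python script.py WORKER_INDEX  (0..3)")
```

With this layout the same file can be started four times (python script.py 0 through python script.py 3), and restarting any of them after an interruption skips the (i, j) pairs already logged.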

I'm looking for a better solution for this (but map/reduce is probably overkill for this small task, and not easy to use in a few lines of Python).

Question: how can I easily split the computation for i in range(100): for j in range(100): for k in range(100): dothejob(i, j, k) across multiple processes in Python, and make it easy to resume (for example after a reboot)?

Best answer

Just use a process pool to map over the product, for example:

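import itertools as it
from multiprocessing import Pool

def jobWrapper(args):  # we need this to unpack the (i, j, k) tuple
    return dothejob(*args)

if __name__ == '__main__':  # needed where worker processes are spawned (e.g. Windows)
    the_args = it.product(range(100), range(100), range(100))
    # create the pool after jobWrapper is defined so the workers can find it
    with Pool(4) as pool:
        res = pool.map(jobWrapper, the_args)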

If you want to resume it and you know the last (i, j, k) from the log, just skip everything that was previously computed from the_args:

the_args = it.product(range(100), range(100), range(100))
#skip previously computed
while True:
    if next(the_args) == (i, j, k):
        break
...

where (i, j, k) is the tuple holding the last computed values.
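One way to make "the last (i, j, k)" well defined is sketched below. It is not part of the answer's code and rests on several assumptions: it uses pool.imap (which returns results in input order) rather than pool.map, the parent process writes one "i j k" line per finished job to a hypothetical log file done_ijk.log, and dothejob is a placeholder. The helper names last_logged, remaining_args and run are made up for this example.

```python
import itertools as it
import os
from multiprocessing import Pool

LOG = "done_ijk.log"  # hypothetical log file, one "i j k" line per finished job


def dothejob(i, j, k):
    """Placeholder for the real computation from the question."""
    return i + j + k


def jobWrapper(args):
    # return the arguments too, so the parent knows which job just finished
    return args, dothejob(*args)


def last_logged(path=LOG):
    """Return the last complete (i, j, k) line in the log, or None."""
    last = None
    if os.path.exists(path):
        with open(path) as f:
            for line in f:
                parts = line.split()
                if len(parts) == 3:  # skip malformed lines
                    last = tuple(int(p) for p in parts)
    return last


def remaining_args(last, n=100):
    """Return an iterator over the (i, j, k) tuples that come after `last`."""
    args = it.product(range(n), repeat=3)
    if last is not None:
        # product() yields tuples in lexicographic order, so tuple comparison
        # drops everything up to and including `last`
        args = it.dropwhile(lambda t: t <= last, args)
    return args


def run(path=LOG, n=100, workers=4):
    todo = remaining_args(last_logged(path), n)
    with Pool(workers) as pool, open(path, "a") as log:
        # imap yields results in input order, so the log is always a
        # contiguous prefix of the product
        for args, result in pool.imap(jobWrapper, todo, chunksize=10):
            log.write("%i %i %i\n" % args)
            log.flush()
```

To use it, call run() under an if __name__ == '__main__': guard. Because results are consumed in order, jobs that had finished in the workers but were not yet logged when the interruption happened are simply recomputed on restart. What to do with result (for example, tracking the best parameters) is left out of this sketch.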