Problem description
I wrote a small script to download pictures from a website. It iterates over 7-character alphanumeric IDs, where the first character is always a digit. The problem is that if I stop the script and start it again, I have to start all over.
Can I somehow seed itertools.product with the last value I got, so I don't have to go through all of them again?
Thanks for any input.
Here is part of the code:
numbers = '0123456789'
alnum = numbers + 'abcdefghijklmnopqrstuvwxyz'
len7 = itertools.product(numbers, alnum, alnum, alnum, alnum, alnum, alnum) # length 7
for p in itertools.chain(len7):
    currentid = ''.join(p)
    #semi static vars
    url = 'http://mysite.com/images/'
    url += currentid
    #Need to get the real url because of the redirect
    print "Trying " + url
    req = urllib2.Request(url)
    res = openaurl(req)
    if res == "continue": continue
    finalurl = res.geturl()
    #ok we have the full url, now time to check if it is real
    try: file = urllib2.urlopen(finalurl)
    except urllib2.HTTPError, e:
        print e.code
        continue  # nothing was fetched for this id, so skip it
    im = cStringIO.StringIO(file.read())
    img = Image.open(im)
    writeimage(img)
Recommended answer
Here's a solution based on PyPy's library code (thanks to agf's suggestion in the comments).
The state is available via the .state attribute and can be reset via .goto(state), where state is an index into the sequence (starting at 0). There's a demo at the end (you need to scroll down, I'm afraid).
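This is faster than generating and discarding values, because goto computes the position directly from the index.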
> cat prod.py
class product(object):

    def __init__(self, *args, **kw):
        if len(kw) > 1:
            raise TypeError("product() takes at most 1 argument (%d given)" %
                            len(kw))
        self.repeat = kw.get('repeat', 1)
        self.gears = [x for x in args] * self.repeat
        self.num_gears = len(self.gears)
        self.reset()

    def reset(self):
        # initialization of indicies to loop over
        self.indicies = [(0, len(self.gears[x]))
                         for x in range(0, self.num_gears)]
        self.cont = True
        self.state = 0

    def goto(self, n):
        self.reset()
        self.state = n
        x = self.num_gears
        while n > 0 and x > 0:
            x -= 1
            n, m = divmod(n, len(self.gears[x]))
            self.indicies[x] = (m, self.indicies[x][1])
        if n > 0:
            self.reset()
            raise ValueError("state exceeded")

    def roll_gears(self):
        # Starting from the end of the gear indicies work to the front
        # incrementing the gear until the limit is reached. When the limit
        # is reached carry operation to the next gear
        self.state += 1
        should_carry = True
        for n in range(0, self.num_gears):
            nth_gear = self.num_gears - n - 1
            if should_carry:
                count, lim = self.indicies[nth_gear]
                count += 1
                if count == lim and nth_gear == 0:
                    self.cont = False
                if count == lim:
                    should_carry = True
                    count = 0
                else:
                    should_carry = False
                self.indicies[nth_gear] = (count, lim)
            else:
                break

    def __iter__(self):
        return self

    def next(self):
        if not self.cont:
            raise StopIteration
        l = []
        for x in range(0, self.num_gears):
            index, limit = self.indicies[x]
            l.append(self.gears[x][index])
        self.roll_gears()
        return tuple(l)

p = product('abc', '12')
print list(p)
p.reset()
print list(p)
p.goto(2)
print list(p)
p.goto(4)
print list(p)
> python prod.py
[('a', '1'), ('a', '2'), ('b', '1'), ('b', '2'), ('c', '1'), ('c', '2')]
[('a', '1'), ('a', '2'), ('b', '1'), ('b', '2'), ('c', '1'), ('c', '2')]
[('b', '1'), ('b', '2'), ('c', '1'), ('c', '2')]
[('c', '1'), ('c', '2')]
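For comparison, here is a minimal sketch of the "discarding values" approach that goto avoids. It is written in Python 3 syntax (unlike prod.py), and the helper name product_skipping is hypothetical, not part of itertools. It gives the same result as goto(2) in the demo, but islice still has to generate and throw away the first n tuples, so the cost of resuming grows with n.

```python
from itertools import islice, product

def product_skipping(n, *pools):
    """Return itertools.product(*pools) with the first n tuples discarded.

    Hypothetical helper for illustration only.
    """
    # islice consumes the first n tuples one by one before yielding anything,
    # so skipping is O(n) rather than a direct jump.
    return islice(product(*pools), n, None)

resumed = list(product_skipping(2, 'abc', '12'))
print(resumed)  # [('b', '1'), ('b', '2'), ('c', '1'), ('c', '2')]
```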
You should test it more (I may have made a dumb mistake), but the idea is quite simple, so you should be able to fix it :o) You're free to use my changes; I have no idea what the original PyPy licence is.
Also, state isn't really the full state (it doesn't include the original arguments); it's just an index into the sequence. Maybe it would have been better to call it index, but there are already indici[sic]es in the code...
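Since the state is just an index, a script that only remembers the last ID string it tried can convert that string back into an index. The sketch below is one possible way to do that, in Python 3 syntax; the function name id_to_index is hypothetical, and it assumes the ID was produced with the same pools, in the same order, as in the question (last position varying fastest, as in prod.py and itertools.product).

```python
def id_to_index(last_id, pools):
    """Convert an ID string into its 0-based position in the product.

    Hypothetical helper: assumes last_id has one character per pool and
    that the last position varies fastest.
    """
    if len(last_id) != len(pools):
        raise ValueError("ID length does not match the number of pools")
    index = 0
    for char, pool in zip(last_id, pools):
        # Each position is one "digit" of a mixed-radix number whose
        # base is the size of that position's pool.
        index = index * len(pool) + pool.index(char)
    return index

numbers = '0123456789'
alnum = numbers + 'abcdefghijklmnopqrstuvwxyz'
pools = [numbers] + [alnum] * 6

print(id_to_index('0000000', pools))  # 0
print(id_to_index('000000z', pools))  # 35
print(id_to_index('0000010', pools))  # 36
```

To resume after the last ID that was fully processed, the value to pass to goto would be this index plus one.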
Update
Here's a simpler version that uses the same idea but works by transforming a sequence of numbers. So you just imap it over count(n) to get the sequence offset by n.
> cat prod2.py
from itertools import count, imap

def make_product(*values):
    def fold((n, l), v):
        (n, m) = divmod(n, len(v))
        return (n, l + [v[m]])
    def product(n):
        (n, l) = reduce(fold, values, (n, []))
        if n > 0: raise StopIteration
        return tuple(l)
    return product

print list(imap(make_product(['a','b','c'], [1,2,3]), count()))
print list(imap(make_product(['a','b','c'], [1,2,3]), count(3)))

def product_from(n, *values):
    return imap(make_product(*values), count(n))

print list(product_from(4, ['a','b','c'], [1,2,3]))
> python prod2.py
[('a', 1), ('b', 1), ('c', 1), ('a', 2), ('b', 2), ('c', 2), ('a', 3), ('b', 3), ('c', 3)]
[('a', 2), ('b', 2), ('c', 2), ('a', 3), ('b', 3), ('c', 3)]
[('b', 2), ('c', 2), ('a', 3), ('b', 3), ('c', 3)]
(The downside here is that if you want to stop and restart, you need to have kept track yourself of how many you have used.)
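One way to make that bookkeeping easier is to yield the index together with each tuple, so the caller always knows which number to save. The following is a rough Python 3 sketch of that idea, not the author's code: itertools.imap and tuple parameters such as fold((n, l), v) do not exist in Python 3, so it uses a plain generator instead. The names index_to_tuple and product_from are illustrative, and unlike prod2.py (where the first value varies fastest) this sketch follows the itertools.product order, with the last value varying fastest.

```python
def index_to_tuple(n, pools):
    """Return the n-th tuple of the product (0-based) without iterating."""
    items = []
    # Peel off one mixed-radix "digit" per pool, starting from the last pool.
    for pool in reversed(pools):
        n, m = divmod(n, len(pool))
        items.append(pool[m])
    if n > 0:
        raise IndexError("index past the end of the product")
    return tuple(reversed(items))

def product_from(start, *pools):
    """Yield (index, tuple) pairs beginning at position start."""
    total = 1
    for pool in pools:
        total *= len(pool)
    for n in range(start, total):
        yield n, index_to_tuple(n, pools)

pairs = list(product_from(4, 'abc', '12'))
print(pairs)  # [(4, ('c', '1')), (5, ('c', '2'))]
```

In a download loop, the index from each pair could be written to a file after every successful item, and on restart the saved value plus one passed back in as start.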