Problem description
So, I was watching Raymond Hettinger's talk Transforming Code into Beautiful, Idiomatic Python, and he brought up this form of iter, which I had never been aware of. His example is the following:
Instead of:
blocks = []
while True:
    block = f.read(32)
    if block == '':
        break
    blocks.append(block)
Use:
from functools import partial

blocks = []
read_block = partial(f.read, 32)
for block in iter(read_block, ''):
    blocks.append(block)
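For anyone who wants to run the idiom above without a real file, here is a minimal sketch. It assumes an in-memory io.BytesIO object as a stand-in for f, with made-up contents. Because that object behaves like a file opened in binary mode, read() returns b'' at end of file in Python 3, so the sentinel is b'' here rather than the '' that a text-mode file would need.

```python
import io
from functools import partial

# Stand-in for a file opened in binary mode; the contents are made up for the demo.
f = io.BytesIO(b"x" * 70)

blocks = []
# iter(callable, sentinel) calls f.read(32) repeatedly and stops as soon as
# the call returns a value equal to the sentinel (b'' at end of file).
for block in iter(partial(f.read, 32), b""):
    blocks.append(block)

print([len(b) for b in blocks])  # [32, 32, 6]
```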
After checking the documentation of iter, I found a similar example:
with open('mydata.txt') as fp:
    for line in iter(fp.readline, ''):
        process_line(line)
This looks pretty useful to me, but I was wondering if any of you Pythonistas know of examples of this construct that don't involve I/O-read loops? Perhaps in the Standard Library?
I can think of very contrived examples, like the following:
>>> def f():
...     f.count += 1
...     return f.count
...
>>> f.count = 0
>>> list(iter(f, 20))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
But obviously this is not any more useful than the built-in iterables. Also, assigning state to a function seems like a code smell to me. At that point, I should probably be working with a class, but if I'm going to write a class, I might as well implement the iterator protocol for whatever I want to accomplish.
As a rule, the main uses I've seen for two-arg iter involve converting functions that resemble C APIs (implicit state, no concept of iteration) into iterators. File-like objects are a common example, but the pattern also shows up in other libraries that wrap C APIs poorly. The pattern you'd expect is the one seen in APIs like FindFirstFile/FindNextFile, where a resource is opened, and each call advances internal state and returns either a new value or a marker value (like NULL in C). Wrapping such an API in a class that implements the iterator protocol is usually best, but if you have to do it yourself and the API is a C-level built-in, a Python-level wrapper can end up slowing usage, whereas two-arg iter, which is also implemented in C, avoids the expense of executing additional bytecode.
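As a rough illustration of the "each call advances internal state and returns a value or a marker" pattern, here is a sketch using sqlite3's Cursor.fetchone, which returns None once the result set is exhausted. The choice of sqlite3 is an assumption made for the demo, not something named above, and the table and values are made up. sqlite3 cursors are already iterable, so this only shows the shape of the idiom rather than a recommended way to read rows.

```python
import sqlite3

# In-memory database with a made-up table, purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

cur = conn.execute("SELECT n FROM t ORDER BY n")

# fetchone() advances the cursor's hidden state on every call and returns
# None when no rows remain, so None serves as the sentinel.
rows = list(iter(cur.fetchone, None))
print(rows)  # [(1,), (2,), (3,)]

conn.close()
```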
Other examples involve mutable objects that are changed during the loop itself, for example, looping in reverse order over lines in a bytearray, removing the line only once processing is complete:
>>> from functools import partial
>>> ba = bytearray(b'aaaa\n'*5)
>>> for i in iter(partial(ba.rfind, b'\n'), -1):
...     print(i)
...     ba[i:] = b''
...
24
19
14
9
4
Another case is using slicing in a progressive manner, for example, an efficient (if admittedly ugly) way to group an iterable into groups of n items while allowing the final group to have fewer than n items if the input's length isn't an exact multiple of n (this one I've actually used, though I usually use itertools.takewhile with bool as the predicate instead of two-arg iter):
# from future_builtins import map  # Python 2 only
from itertools import starmap, islice, repeat

def grouper(n, iterable):
    '''Return an iterator yielding n-sized tuples from iterable.

    For iterables not evenly divisible by n, the final group will be undersized.
    '''
    # Keep islicing n items and converting to groups until we hit an empty slice
    return iter(map(tuple, starmap(islice, repeat((iter(iterable), n)))).__next__, ())  # Use .next instead of .__next__ on Py2
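To make the behavior concrete, here is a small Python 3 usage sketch of grouper, together with a guess at what the takewhile variant mentioned above might look like. The name grouper_takewhile is hypothetical, and its body is an assumption about that variant rather than code from the original.

```python
from itertools import islice, repeat, starmap, takewhile

def grouper(n, iterable):
    # Two-arg iter: keep calling __next__ until it returns the empty tuple.
    return iter(map(tuple, starmap(islice, repeat((iter(iterable), n)))).__next__, ())

def grouper_takewhile(n, iterable):
    # Hypothetical equivalent: stop at the first falsy (empty) tuple.
    return takewhile(bool, map(tuple, starmap(islice, repeat((iter(iterable), n)))))

print(list(grouper(3, range(8))))            # [(0, 1, 2), (3, 4, 5), (6, 7)]
print(list(grouper_takewhile(3, range(6))))  # [(0, 1, 2), (3, 4, 5)]
```

Both versions depend on every islice call sharing the same underlying iterator, which is why iter(iterable) is evaluated once and the resulting tuple is repeated.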
Another use: writing multiple pickled objects to a single file, followed by a sentinel value (None, for example), so that when unpickling you can use this idiom instead of needing to somehow remember the number of items pickled, or needing to call load over and over until it raises EOFError:
import pickle

with open('picklefile', 'rb') as f:
    for obj in iter(pickle.Unpickler(f).load, None):
        ...  # process an object
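A round-trip sketch of the same idea, assuming an in-memory io.BytesIO buffer in place of the file on disk and a few made-up sample objects. It writes with a single Pickler to pair with the single Unpickler on the reading side. Note that this scheme only works if None can never be one of the real items.

```python
import io
import pickle

items = [{"id": 1}, [2, 3], "four"]  # made-up sample objects

buf = io.BytesIO()  # stands in for a file opened with 'wb'
pickler = pickle.Pickler(buf)
for item in items:
    pickler.dump(item)
pickler.dump(None)  # sentinel marking the end of the stream

buf.seek(0)  # rewind, as if reopening the file with 'rb'
# load() is called repeatedly until it returns the sentinel None.
loaded = list(iter(pickle.Unpickler(buf).load, None))
print(loaded == items)  # True
```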