I have a text file containing timestamps.

Example:

16-07-2015 18:08:20
16-07-2015 18:08:22
16-07-2015 18:08:30
16-07-2015 18:08:40
17-07-2015 10:04:01
17-07-2015 10:14:31
17-07-2015 10:14:59
17-07-2015 12:24:11
....


Now I need the minimum and maximum timestamp for each hour, as shown in the example below.

Example:

16-07-2015 18:08:20 - 16-07-2015 18:08:40
17-07-2015 10:04:01 - 17-07-2015 10:14:59
17-07-2015 12:24:11 - ....


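How can I achieve this?

Best answer

If you have an iterable of datetime objects, you can group them by day and hour with itertools.groupby(), then take the first and last object of each group: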

from itertools import groupby

def min_max_per_hour(iterable):
    for dayhour, grouped in groupby(iterable, lambda dt: (dt.date(), dt.hour)):
        minimum = next(grouped)  # first object is the minimum for this hour
        maximum = minimum  # starting value
        for dt in grouped:
            maximum = dt   # last assignment is the maximum within this hour
        yield (minimum, maximum)


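This relies on the iterable yielding the datetime objects in chronological order, because groupby() only groups consecutive items that share the same key.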
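
If the input cannot be guaranteed to be sorted, one option is to collect the per-hour extremes in a dictionary instead. The following is a minimal sketch under that assumption, not part of the original answer; the name min_max_per_hour_unsorted is hypothetical, and the approach keeps one entry per hour in memory:

```python
from datetime import datetime


def min_max_per_hour_unsorted(iterable):
    """Return chronological (minimum, maximum) pairs, one per (date, hour).

    Unlike the groupby() version, the input does not need to be sorted,
    but one dictionary entry per hour is kept in memory.
    """
    buckets = {}
    for dt in iterable:
        key = (dt.date(), dt.hour)
        if key in buckets:
            lowest, highest = buckets[key]
            buckets[key] = (min(lowest, dt), max(highest, dt))
        else:
            buckets[key] = (dt, dt)  # first timestamp seen for this hour
    # Sort by (date, hour) so the result is in chronological order.
    return [buckets[key] for key in sorted(buckets)]
```

The trade-off is memory: the groupby() version streams through sorted input, while this sketch has to see every timestamp before it can return anything.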

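To produce the input iterable, parse the text file in a generator expression or another generator; there is no need to hold everything in memory at once: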

from datetime import datetime

with open(input_filename) as inf:
    # generator expression
    datetimes = (datetime.strptime(line.strip(), '%d-%m-%Y %H:%M:%S') for line in inf)
    for mindt, maxdt in min_max_per_hour(datetimes):
        print(mindt, maxdt)
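
A real file may contain blank lines or lines that are not timestamps, in which case datetime.strptime() raises ValueError. The generator function below is one possible way to handle that, sketched under the assumption that such lines can simply be ignored; parse_timestamps and TIMESTAMP_FORMAT are hypothetical names, not part of the original answer:

```python
from datetime import datetime

TIMESTAMP_FORMAT = '%d-%m-%Y %H:%M:%S'


def parse_timestamps(lines):
    """Lazily yield a datetime for each line that matches TIMESTAMP_FORMAT."""
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        try:
            yield datetime.strptime(text, TIMESTAMP_FORMAT)
        except ValueError:
            # Assumption: lines that do not parse are silently dropped.
            continue
```

Because it is a generator, it can be passed an open file object in place of the generator expression and still reads one line at a time.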


Demo:

>>> from datetime import datetime
>>> from itertools import groupby
>>> def min_max_per_hour(iterable):
...     for dayhour, grouped in groupby(iterable, lambda dt: (dt.date(), dt.hour)):
...         minimum = next(grouped)  # first object is the minimum for this hour
...         maximum = minimum  # starting value
...         for dt in grouped:
...             maximum = dt   # last assignment is the maximum within this hour
...         yield (minimum, maximum)
...
>>> textfile = '''\
... 16-07-2015 18:08:20
... 16-07-2015 18:08:22
... 16-07-2015 18:08:30
... 16-07-2015 18:08:40
... 17-07-2015 10:04:01
... 17-07-2015 10:14:31
... 17-07-2015 10:14:59
... 17-07-2015 12:24:11
... '''.splitlines()
>>> datetimes = (datetime.strptime(line.strip(), '%d-%m-%Y %H:%M:%S') for line in textfile)
>>> for mindt, maxdt in min_max_per_hour(datetimes):
...     print(mindt, maxdt)
...
2015-07-16 18:08:20 2015-07-16 18:08:40
2015-07-17 10:04:01 2015-07-17 10:14:59
2015-07-17 12:24:11 2015-07-17 12:24:11
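
The demo prints the datetime objects in their default ISO-style form, while the question asks for "min - max" in the original day-month-year layout. A small sketch of how each pair could be rendered that way with strftime(); format_range and OUTPUT_FORMAT are hypothetical names introduced for illustration:

```python
from datetime import datetime

OUTPUT_FORMAT = '%d-%m-%Y %H:%M:%S'


def format_range(mindt, maxdt):
    """Render one (minimum, maximum) pair as 'min - max' in the input layout."""
    return '{} - {}'.format(
        mindt.strftime(OUTPUT_FORMAT),
        maxdt.strftime(OUTPUT_FORMAT),
    )
```

Each (minimum, maximum) pair from the loop can then be passed to format_range() and the returned string printed.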

Regarding python - minimum and maximum timestamp per hourly period, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/34261931/
