Problem description
When I try to sum timedeltas in pandas, it works for a slice but not for the whole column.
>>> d.loc[0:100, 'VOID-DAYS'].sum()  # .ix was removed in pandas 1.0; .loc is the label-based equivalent
Timedelta('2113 days 00:00:00')
>>> d['VOID-DAYS'].sum()
ValueError: overflow in timedelta operation
Recommended answer
If VOID-DAYS represents an integer number of days, convert the Timedeltas into integers:
df['VOID-DAYS'] = df['VOID-DAYS'].dt.days
import numpy as np
import pandas as pd

# 106752 one-day Timedeltas: one more day than the column can hold when summed
df = pd.DataFrame({'VOID-DAYS': pd.to_timedelta(np.ones((106752,)), unit='D')})
try:
    print(df['VOID-DAYS'].sum())
except ValueError as err:
    print(err)
    # overflow in timedelta operation

# Convert to plain integer days, which sum without overflow
df['VOID-DAYS'] = df['VOID-DAYS'].dt.days
print(df['VOID-DAYS'].sum())
# 106752
If the Timedeltas include seconds or smaller units, then use
df['VOID-DAYS'] = df['VOID-DAYS'].dt.total_seconds()
to convert the values to floats.
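As a rough sketch of the seconds-based approach, the snippet below sums a small column that has sub-day components. The sample values and the variable names `seconds` and `total_seconds` are made up for illustration; only `dt.total_seconds()` comes from the answer itself.

```python
import pandas as pd

# Hypothetical sample data with hours, seconds and fractional seconds.
df = pd.DataFrame({
    'VOID-DAYS': pd.to_timedelta(
        ['1 days 12:00:00', '0 days 00:00:30', '2 days 00:00:00.5']
    )
})

# Convert each Timedelta to a float number of seconds, then sum the floats.
seconds = df['VOID-DAYS'].dt.total_seconds()
total_seconds = seconds.sum()
print(total_seconds)  # roughly 302430.5, up to floating-point rounding
```

Because the sum is carried out on floats, it cannot hit the 64-bit nanosecond limit, at the cost of some precision for very large totals.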
Pandas Timedeltas (Series and TimedeltaIndexes) store all timedeltas as ints compatible with NumPy's timedelta64[ns] dtype. This dtype uses 8-byte ints to store the timedelta in nanoseconds.
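One way to see this storage for yourself is to reinterpret the underlying array as int64. This is only an illustrative sketch: the Series `s` and the variable `ns` are hypothetical, and the explicit cast to `timedelta64[ns]` is there in case the installed pandas version picks a different resolution.

```python
import numpy as np
import pandas as pd

# Hypothetical two-element Series of one-day and two-day Timedeltas.
s = pd.Series(pd.to_timedelta([1, 2], unit='D'))

# Normalise to nanosecond resolution, then view the same 8 bytes per value as int64.
ns = s.to_numpy().astype('timedelta64[ns]').view('int64')

print(ns.dtype)     # int64
print(ns.itemsize)  # 8
print(ns[0])        # 86400000000000, the number of nanoseconds in one day
```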
The largest number of days representable in this format is
In [73]: int(float(np.iinfo(np.int64).max) / (10**9 * 3600 * 24))
Out[73]: 106751
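The same limit can be cross-checked with pure integer arithmetic and against `pd.Timedelta.max`. A small sketch, where `NS_PER_DAY` and `max_days` are names introduced here for illustration:

```python
import numpy as np
import pandas as pd

# Nanoseconds in one day.
NS_PER_DAY = 10**9 * 3600 * 24

# Integer division avoids the float round trip used in the calculation above.
max_days = np.iinfo(np.int64).max // NS_PER_DAY
print(max_days)               # 106751

# pandas exposes the largest representable Timedelta directly.
print(pd.Timedelta.max.days)  # 106751
```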
This is why
In [74]: pd.Series(pd.to_timedelta(np.ones((106752,)), unit='D')).sum()
ValueError: overflow in timedelta operation
raises a ValueError, but
In [75]: pd.Series(pd.to_timedelta(np.ones((106751,)), unit='D')).sum()
Out[75]: Timedelta('106751 days 00:00:00')
does not.