This article describes how to find the mean and standard deviation of timedelta objects in a pandas DataFrame.

Problem description

I would like to calculate the mean and standard deviation of a timedelta column, grouped by bank, from the two-column dataframe shown below. When I run my code (also shown below), I get the following error:

pandas.core.base.DataError: No numeric types to aggregate

My dataframe:

   bank                          diff
   Bank of Japan                 0 days 00:00:57.416000
   Reserve Bank of Australia     0 days 00:00:21.452000
   Reserve Bank of New Zealand  55 days 12:39:32.269000
   U.S. Federal Reserve          8 days 13:27:11.387000

My code:

means = dropped.groupby('bank').mean()
std = dropped.groupby('bank').std()

Recommended answer

You need to convert the timedelta to a numeric value before aggregating. Converting to int64 via values is the most accurate option, because a timedelta is represented internally as an integer number of nanoseconds (ns):

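import numpy as np
import pandas as pd

# Store each timedelta as an integer number of nanoseconds
dropped['new'] = dropped['diff'].values.astype(np.int64)

# Aggregate only the numeric column, then convert the result back to timedelta
means = dropped.groupby('bank')[['new']].mean()
means['new'] = pd.to_timedelta(means['new'])

std = dropped.groupby('bank')[['new']].std()
std['new'] = pd.to_timedelta(std['new'])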
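
To try the nanosecond approach end to end, here is a minimal self-contained sketch. The sample rows are hypothetical (two observations per bank, so that the standard deviation is defined) and are not the data from the question; the `stats` frame and the explicit cast to `timedelta64[ns]` are additions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two observations per bank so that std is defined.
dropped = pd.DataFrame({
    "bank": ["Bank of Japan", "Bank of Japan",
             "U.S. Federal Reserve", "U.S. Federal Reserve"],
    "diff": pd.to_timedelta(["00:00:10", "00:00:30", "1 days", "3 days"]),
})

# Make the nanosecond resolution explicit, then reinterpret as int64.
dropped["new"] = (
    dropped["diff"].to_numpy().astype("timedelta64[ns]").astype(np.int64)
)

# Aggregate the integer column per bank and convert back to timedelta.
grouped = dropped.groupby("bank")["new"]
stats = pd.DataFrame({
    "mean": pd.to_timedelta(grouped.mean(), unit="ns"),
    "std": pd.to_timedelta(grouped.std(), unit="ns"),
})
print(stats)
```

Note that `std()` here is the sample standard deviation (pandas defaults to `ddof=1`), so a bank with a single row, as in the question's sample table, yields `NaT` rather than zero.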

Another solution is to convert the values to seconds with total_seconds, but that is less accurate because the result is a floating-point number:

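# total_seconds() returns each duration as a float number of seconds
dropped['new'] = dropped['diff'].dt.total_seconds()

# Aggregate only the numeric column
means = dropped.groupby('bank')[['new']].mean()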
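
The seconds-based variant can be sketched the same way, this time also computing the standard deviation and converting both results back to timedelta. The sample rows and the names `seconds`, `stats_s` and `stats_td` are hypothetical and only serve the illustration.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
dropped = pd.DataFrame({
    "bank": ["Bank of Japan", "Bank of Japan",
             "U.S. Federal Reserve", "U.S. Federal Reserve"],
    "diff": pd.to_timedelta(["00:00:10", "00:00:30", "1 days", "3 days"]),
})

# Float seconds: convenient to read, but subject to floating-point rounding.
dropped["seconds"] = dropped["diff"].dt.total_seconds()

# Compute both statistics per bank on the float column.
stats_s = dropped.groupby("bank")["seconds"].agg(["mean", "std"])

# Convert the float seconds back to timedelta for display.
stats_td = pd.DataFrame({
    "mean": pd.to_timedelta(stats_s["mean"], unit="s"),
    "std": pd.to_timedelta(stats_s["std"], unit="s"),
})
print(stats_td)
```

Keeping `stats_s` around is useful when the numbers feed into further arithmetic, while `stats_td` is easier to read in a report.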

