Causal resampling: sum over the last X <time_unit>

Question

Say I have the following values:

                                   money_spent
time
2014-10-06 17:59:40.016000-04:00      1.832128
2014-10-06 17:59:41.771000-04:00      2.671048
2014-10-06 17:59:43.001000-04:00      2.019434
2014-10-06 17:59:44.792000-04:00      1.294051
2014-10-06 17:59:48.741000-04:00      0.867856

I am hoping to measure how much money is spent every 2 seconds. More specifically, for every timestamp in the output, I need to see the money spent within the last 2 seconds.

When I do this:

# Originally written as df.resample('2S', how='last'); current pandas no longer accepts how=
df.resample('2s').last()

I get:

                                money_spent
time
2014-10-06 17:59:40-04:00          2.671048
2014-10-06 17:59:42-04:00          2.019434
2014-10-06 17:59:44-04:00          1.294051
2014-10-06 17:59:46-04:00               NaN
2014-10-06 17:59:48-04:00          0.867856

which is not what I would expect. To start with, note that the first entry in the resampled df is 2.671048, but it is labeled 17:59:40, even though, according to the original dataframe, no money had been spent yet at that time. How is that possible?

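Recommended answer

Try summing each bin, with left-closed bins labeled by their right edge:

# Originally written as df.resample('2S', how=np.sum, closed='left', label='right')
df.resample('2s', closed='left', label='right').sum()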
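
As a minimal sketch of what that call produces, the following rebuilds the question's five rows and applies the same binning with the method-chaining form that current pandas expects. The DataFrame construction and the variable name `causal` are assumptions added for illustration; only the data values and the `closed`/`label` arguments come from the thread.

```python
import pandas as pd

# Rebuild the sample data from the question (construction is illustrative).
times = pd.to_datetime([
    "2014-10-06 17:59:40.016000-04:00",
    "2014-10-06 17:59:41.771000-04:00",
    "2014-10-06 17:59:43.001000-04:00",
    "2014-10-06 17:59:44.792000-04:00",
    "2014-10-06 17:59:48.741000-04:00",
])
df = pd.DataFrame(
    {"money_spent": [1.832128, 2.671048, 2.019434, 1.294051, 0.867856]},
    index=pd.DatetimeIndex(times, name="time"),
)

# Sum each 2-second bin; bins are [start, end) and carry the end timestamp
# as their label, so every row only reflects spending that already happened.
causal = df.resample("2s", closed="left", label="right").sum()
print(causal)
```

Each row is now labeled with the end of its 2-second window, so the value at 17:59:42 covers spending from 17:59:40 up to, but not including, 17:59:42. The empty window ending at 17:59:48 sums to 0 rather than NaN.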

Edit:

Regarding closed and label:

It means: bins are left-closed intervals of 2 seconds, and each bin is labeled with the timestamp at its right end. For example, values at 1, 1.2, 1.5 and 1.9 fall in a bin that starts at 1 and ends just before 2, written [1, 1.2, 1.5, 1.9, 2). And from the docs:

label : {'right', 'left'} Which bin edge label to label bucket with
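
To see the effect of `label` in isolation, this sketch runs the question's `last` aggregation twice on the same data, once with the default left-edge labels and once with `label='right'`. The DataFrame construction and the variable names `left` and `right` are illustrative assumptions.

```python
import pandas as pd

# Sample data from the question (construction is illustrative).
times = pd.to_datetime([
    "2014-10-06 17:59:40.016000-04:00",
    "2014-10-06 17:59:41.771000-04:00",
    "2014-10-06 17:59:43.001000-04:00",
    "2014-10-06 17:59:44.792000-04:00",
    "2014-10-06 17:59:48.741000-04:00",
])
df = pd.DataFrame(
    {"money_spent": [1.832128, 2.671048, 2.019434, 1.294051, 0.867856]},
    index=pd.DatetimeIndex(times, name="time"),
)

# Default for a 2-second frequency: closed='left', label='left'.
left = df.resample("2s").last()

# Same bins and same values, but each bin is stamped with its right edge.
right = df.resample("2s", label="right").last()

print(left)
print(right)
```

Both results hold the same values; only the timestamps shift by one bin width. With the default left labels, 2.671048 (spent at 17:59:41.771) appears under 17:59:40, which is the surprising row from the question.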
