Pandas: filling empty trailing values with the last known value

Problem description

I have several columns whose data ends at different time periods.

I need to fill the empty data with the last known value.

Is there a pandas way to do this without looping based on the ending dates?

I need gain_sum_y to equal -57129.0 for the last 4 months.

            gain_sum_x  gain_sum_y
month                             
2014-09-30      -97747    -41355.0
2014-10-31     -112928    -47394.0
2014-11-30     -131638    -57129.0
2014-12-31     -161370         0.0
2015-01-31     -168832         0.0
2015-02-28     -151930         0.0
2015-03-31     -162077         0.0

Thanks.

Recommended answer

I think you need replace followed by ffill (the same as fillna with method='ffill', which newer pandas versions deprecate in favor of ffill) if you want to replace every 0 value with the last non-zero value:

import numpy as np

# Turn every 0 into NaN, then forward-fill from the last valid value
df = df.replace(0, np.nan).ffill()
print(df)
        month  gain_sum_x  gain_sum_y
0  2014-09-30      -97747    -41355.0
1  2014-10-31     -112928    -47394.0
2  2014-11-30     -131638    -57129.0
3  2014-12-31     -161370    -57129.0
4  2015-01-31     -168832    -57129.0
5  2015-02-28     -151930    -57129.0
6  2015-03-31     -162077    -57129.0
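
For anyone who wants to run this end to end, here is a minimal, self-contained sketch. It rebuilds the sample data from the question by hand (with month as the index, as in the question) and assumes that a 0 always means "missing"; the variable name filled is only for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the question's sample data, using month as the index.
df = pd.DataFrame(
    {
        "gain_sum_x": [-97747, -112928, -131638, -161370, -168832, -151930, -162077],
        "gain_sum_y": [-41355.0, -47394.0, -57129.0, 0.0, 0.0, 0.0, 0.0],
    },
    index=pd.to_datetime(
        ["2014-09-30", "2014-10-31", "2014-11-30", "2014-12-31",
         "2015-01-31", "2015-02-28", "2015-03-31"]
    ),
)
df.index.name = "month"

# Step 1: mark every 0 as missing. Step 2: carry the last valid value forward.
filled = df.replace(0, np.nan).ffill()
print(filled)
```

Note that this treats every 0 in every column as missing, so it is only appropriate when a genuine 0 cannot occur in the data.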

If you want to specify which column to replace in (thank you, John Galt):

df.replace({'gain_sum_y': {0: np.nan}}).ffill()
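
To see what the nested dict does, here is a small sketch on made-up data (the four rows below are an assumption for illustration, not the question's full frame). The outer key names the column and the inner dict maps old value to new value, so a 0 in any other column is left alone.

```python
import numpy as np
import pandas as pd

# Illustrative data: gain_sum_x contains a 0 that should be kept as-is.
df = pd.DataFrame(
    {
        "gain_sum_x": [-97747, 0, -131638, -161370],
        "gain_sum_y": [-41355.0, -57129.0, 0.0, 0.0],
    }
)

# {column: {old_value: new_value}} restricts the replacement to gain_sum_y.
masked = df.replace({"gain_sum_y": {0: np.nan}})

# Forward-fill; only gain_sum_y has NaN values to fill.
filled = masked.ffill()
print(filled)
```

The 0 in gain_sum_x survives because the replacement never touches that column, so ffill has nothing to fill there.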

Sample with multiple 0 values:

print (df)
            gain_sum_x  gain_sum_y
month                             
2014-09-30      -97747    -41355.0
2014-10-31           0         0.0
2014-11-30           0    -57129.0
2014-12-31     -161370         0.0
2015-01-31     -168832         0.0
2015-02-28           0         0.0
2015-03-31     -162077         0.0

df1 = df.replace(0,np.nan).ffill()
print (df1)
            gain_sum_x  gain_sum_y
month                             
2014-09-30    -97747.0    -41355.0
2014-10-31    -97747.0    -41355.0
2014-11-30    -97747.0    -57129.0
2014-12-31   -161370.0    -57129.0
2015-01-31   -168832.0    -57129.0
2015-02-28   -168832.0    -57129.0
2015-03-31   -162077.0    -57129.0

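But if you only need to replace the trailing 0 values, use last_valid_index. After converting every 0 to NaN, slice each column up to its last valid index and turn the NaN values inside that slice back into 0, so that only the trailing NaN values are left for ffill: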

df2 = df.replace(0,np.nan).apply(lambda x: x.loc[:x.last_valid_index()].fillna(0)).ffill()
print (df2)
            gain_sum_x  gain_sum_y
2014-09-30    -97747.0    -41355.0
2014-10-31         0.0         0.0
2014-11-30         0.0    -57129.0
2014-12-31   -161370.0    -57129.0
2015-01-31   -168832.0    -57129.0
2015-02-28         0.0    -57129.0
2015-03-31   -162077.0    -57129.0
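
As an alternative sketch of the same idea (not the approach used above), the trailing zeros can be located with a boolean mask instead of apply. The data is the multiple-0 sample rebuilt by hand, and the names nonzero and trailing are hypothetical helpers. A position is "trailing" when no non-zero value occurs at or after it, which is what the reversed cumulative sum checks.

```python
import pandas as pd

# The sample with multiple 0 values, rebuilt by hand.
df = pd.DataFrame(
    {
        "gain_sum_x": [-97747, 0, 0, -161370, -168832, 0, -162077],
        "gain_sum_y": [-41355.0, 0.0, -57129.0, 0.0, 0.0, 0.0, 0.0],
    },
    index=pd.to_datetime(
        ["2014-09-30", "2014-10-31", "2014-11-30", "2014-12-31",
         "2015-01-31", "2015-02-28", "2015-03-31"]
    ),
)

# 1 where the value is non-zero, 0 where it is zero.
nonzero = df.ne(0).astype(int)

# Count non-zero values from each row to the end of the column;
# a count of 0 means the row lies after the last non-zero value.
trailing = nonzero.iloc[::-1].cumsum().iloc[::-1].eq(0)

# Blank out only the trailing zeros, then carry the last value forward.
df2 = df.mask(trailing).ffill()
print(df2)
```

Interior zeros are kept because at least one non-zero value follows them. A column that is entirely 0 would end up entirely NaN under this sketch, so that case needs separate handling if it can occur.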

