Problem description
I am performing an ADF test with statsmodels. The value series can have missing observations. In fact, I drop the analysis if the fraction of NaNs is larger than c. However, if the series passes that check, I run into the problem that adfuller cannot deal with missing data. Since this is training data with a minimum frame size, I would like to do the following:
1) if x(t=0) = NaN, then find the next non-NaN value (t>0)
2) otherwise, if x(t) = NaN, then x(t) = x(t-1)
So I am compromising my first value here, but making sure the input data always has the same dimension. Alternatively, if the first value is missing, I could fill it with 0, making use of the limit option of fillna.
From the documentation, the different options are not 100% clear to me:
method : {'backfill', 'bfill', 'pad', 'ffill', None}, default None
pad / ffill: does that mean I carry over the previous value?
backfill / bfill: does that mean the value is taken from a valid one in the future?
df.fillna(method='bfill', limit=1, inplace=True)
df.fillna(method='ffill', inplace=True)
Would that work with limit? The documentation uses 'limit = 1', but in its example the value to be filled is predetermined.
Recommended answer
To forward-fill all observations except for (possibly) the first ones, which should be backfilled, you can chain two calls to fillna, the first with method='ffill' and the second with method='bfill':
>>> import pandas as pd
>>> df = pd.DataFrame({'a': [None, None, 1, None, 2, None]})
>>> df.fillna(method='ffill').fillna(method='bfill')
     a
0  1.0
1  1.0
2  1.0
3  1.0
4  2.0
5  2.0
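The whole workflow from the question (reject a series with too many NaNs, otherwise fill the gaps without changing its length) could be sketched roughly as follows. The helper name `fill_gaps` and its parameter `max_nan_fraction` are hypothetical; the latter stands in for the threshold c, and 0.5 is only a placeholder value. The sketch uses the `ffill()` and `bfill()` methods, which do the same as `fillna(method='ffill')` and `fillna(method='bfill')`; newer pandas releases deprecate the `method` argument of `fillna` in favor of them.

```python
import numpy as np
import pandas as pd


def fill_gaps(series: pd.Series, max_nan_fraction: float = 0.5):
    """Return a gap-free copy of `series`, or None if too much is missing.

    Hypothetical helper: `max_nan_fraction` plays the role of the
    threshold c from the question; 0.5 is an arbitrary placeholder.
    """
    # isna().mean() is the fraction of missing observations.
    if series.isna().mean() > max_nan_fraction:
        return None
    # ffill carries the last valid value forward. After that, only NaNs
    # at the very start remain, and bfill fills those with the first
    # valid value. The length of the series is unchanged.
    return series.ffill().bfill()


s = pd.Series([np.nan, 1.0, np.nan, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
filled = fill_gaps(s)
print(filled.tolist())
```

If the helper returns a series rather than None, it contains no NaNs and should be usable as input to `statsmodels.tsa.stattools.adfuller`. Because `ffill` runs first, the trailing `bfill` only ever touches leading NaNs, so no `limit` argument is needed to restrict it.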