I have the following DataFrame:

                S
2011-01-26      1
2011-01-27      0
2011-01-28      0
2011-01-29      0
2011-01-30      0
2011-01-31      0
2011-02-01      0
2011-02-02      0
2011-02-03      0
2011-02-04      0
2011-02-05      0
2011-02-06      0
2011-02-07      0
2011-02-08      0
2011-02-09      0

I am trying to generate the following DataFrame from df:
                S  S1 S2 S3
2011-01-26      1  0  0  0
2011-01-27      0  1  0  0
2011-01-28      0  1  0  0
2011-01-29      0  0  1  0
2011-01-30      0  0  1  0
2011-01-31      0  0  1  0
2011-02-01      0  0  1  0
2011-02-02      0  0  0  1
2011-02-03      0  0  0  1
2011-02-04      0  0  0  1
2011-02-05      0  0  0  1
2011-02-06      0  0  0  1
2011-02-07      0  0  0  1
2011-02-08      0  0  0  1
2011-02-09      0  0  0  1

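As you can see, the number of 1s doubles from one column to the next going down: 2 rows in S1, 4 in S2, 8 in S3. Is there a function in pandas, something like fillna, where I can specify how many rows to fill down?
Update
Actually, I have a more complex task.
If this is my df: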
                S
2011-01-26      1
2011-01-27      0
2011-01-28      0
2011-01-29      0
2011-01-30      0
2011-01-31      0
2011-02-01      0
2011-02-02      0
2011-02-03      0
2011-02-04      0
2011-02-05      0
2011-02-06      0
2011-02-07      0
2011-02-08      0
2011-02-09      0
...         (all zeros)
                S
2011-04-26      1
2011-04-27      0
2011-04-28      0
2011-04-29      0
2011-04-30      0
2011-04-31      0
2011-05-01      0
2011-05-02      0
2011-05-03      0
2011-05-04      0
2011-05-05      0
2011-05-06      0
2011-05-07      0
2011-05-08      0
2011-05-09      0

I need this:
                S  S1 S2 S3
2011-01-26      1  0  0  0
2011-01-27      0  1  0  0
2011-01-28      0  1  0  0
2011-01-29      0  0  1  0
2011-01-30      0  0  1  0
2011-01-31      0  0  1  0
2011-02-01      0  0  1  0
2011-02-02      0  0  0  1
2011-02-03      0  0  0  1
2011-02-04      0  0  0  1
2011-02-05      0  0  0  1
2011-02-06      0  0  0  1
2011-02-07      0  0  0  1
2011-02-08      0  0  0  1
2011-02-09      0  0  0  1
...         (all zeros everywhere)
                S  S1 S2 S3
2011-04-26      1  0  0  0
2011-04-27      0  1  0  0
2011-04-28      0  1  0  0
2011-04-29      0  0  1  0
2011-04-30      0  0  1  0
2011-04-31      0  0  1  0
2011-05-01      0  0  1  0
2011-05-02      0  0  0  1
2011-05-03      0  0  0  1
2011-05-04      0  0  0  1
2011-05-05      0  0  0  1
2011-05-06      0  0  0  1
2011-05-07      0  0  0  1
2011-05-08      0  0  0  1
2011-05-09      0  0  0  1

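Best answer

As far as I know, there is no built-in function that does this. But we can get a similar result with the following trick: split the frame into groups that each start at a 1, then mark positions 0, 1, 3, 7, ... inside each group.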

import pandas as pd
import numpy as np

# your data
# ========================================
df = pd.DataFrame(0, index=pd.date_range('2015-01-01', periods=100, freq='D'), columns=['col'])
df.iloc[[0, 71], 0] = 1

grouped = df.groupby(df.col.cumsum())

grouped.get_group(1)

Out[275]:
            col
2015-01-01    1
2015-01-02    0
2015-01-03    0
2015-01-04    0
2015-01-05    0
2015-01-06    0
2015-01-07    0
2015-01-08    0
...         ...
2015-03-05    0
2015-03-06    0
2015-03-07    0
2015-03-08    0
2015-03-09    0
2015-03-10    0
2015-03-11    0
2015-03-12    0

[71 rows x 1 columns]

grouped.get_group(2)

Out[276]:
            col
2015-03-13    1
2015-03-14    0
2015-03-15    0
2015-03-16    0
2015-03-17    0
2015-03-18    0
2015-03-19    0
2015-03-20    0
...         ...
2015-04-03    0
2015-04-04    0
2015-04-05    0
2015-04-06    0
2015-04-07    0
2015-04-08    0
2015-04-09    0
2015-04-10    0

[29 rows x 1 columns]

# processing
# ==================================

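def func(group):
    # positions 0, 1, 3, 7, ... (2**k - 1) mark where each run starts
    starts = 2 ** np.arange(int(np.log2(len(group))) + 1) - 1
    # build the markers in a plain array to avoid chained assignment on the group
    temp = np.zeros(len(group), dtype=int)
    temp[starts] = 1
    # the cumulative sum labels every row with its run number 1, 2, 3, ...
    new_col = pd.Series(temp.cumsum(), index=group.index)
    return pd.get_dummies(new_col)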


grouped.apply(func)

Out[281]:
            1  2  3  4  5   6   7
2015-01-01  1  0  0  0  0   0   0
2015-01-02  0  1  0  0  0   0   0
2015-01-03  0  1  0  0  0   0   0
2015-01-04  0  0  1  0  0   0   0
2015-01-05  0  0  1  0  0   0   0
2015-01-06  0  0  1  0  0   0   0
2015-01-07  0  0  1  0  0   0   0
2015-01-08  0  0  0  1  0   0   0
...        .. .. .. .. ..  ..  ..
2015-04-03  0  0  0  0  1 NaN NaN
2015-04-04  0  0  0  0  1 NaN NaN
2015-04-05  0  0  0  0  1 NaN NaN
2015-04-06  0  0  0  0  1 NaN NaN
2015-04-07  0  0  0  0  1 NaN NaN
2015-04-08  0  0  0  0  1 NaN NaN
2015-04-09  0  0  0  0  1 NaN NaN
2015-04-10  0  0  0  0  1 NaN NaN
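
The output above uses numeric column labels and keeps adding columns for as long as a group runs. If you want exactly the S, S1, S2, S3 layout from the question, a minimal sketch along the same lines could look like the block below. The function name `doubling_blocks` and its `n_cols` parameter are made up for this illustration and are not part of pandas. The sketch assumes the input is a Series of 0/1 values where each 1 starts a new event.

```python
import numpy as np
import pandas as pd


def doubling_blocks(s: pd.Series, n_cols: int = 3) -> pd.DataFrame:
    """Hypothetical helper: return s plus columns S1..Sn with runs of 2, 4, 8, ... rows."""
    # Each 1 in s starts a new group; rows before the first 1 fall into group 0.
    group_id = s.cumsum()
    # Zero-based position of every row inside its group.
    pos = s.groupby(group_id).cumcount().to_numpy()
    # Block number: 0 for position 0, 1 for positions 1-2, 2 for 3-6, 3 for 7-14, ...
    block = np.floor(np.log2(pos + 1)).astype(int)
    # Rows before the first event must stay zero in every column.
    started = (group_id > 0).to_numpy()
    out = pd.DataFrame({"S": s})
    for k in range(1, n_cols + 1):
        out[f"S{k}"] = ((block == k) & started).astype(int)
    return out


# Same shape as the first example in the question: 15 days, event on day one.
idx = pd.date_range("2011-01-26", periods=15, freq="D")
s = pd.Series(0, index=idx, name="S")
s.iloc[0] = 1
result = doubling_blocks(s)
print(result)
```

Because the block number is computed from the position within each group, a second event later in the series restarts the pattern on its own, and rows past the last requested block are left as zeros.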

Regarding "python - Python: Pandas generating fill-down variables in a DataFrame", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/31305769/
