For example, I have:
import numpy as np
import pandas as pd
df = pd.DataFrame({0: [420, np.nan, 455, np.nan, np.nan, np.nan]})
df
0
0 420.0
1 NaN
2 455.0
3 NaN
4 NaN
5 NaN
Then, using:
df[0].isnull().astype(int)
0 0
1 1
2 0
3 1
4 1
5 1
Name: 0, dtype: int64
I get:
df[0].ffill() - df[0].isnull().astype(int)
0 420.0
1 419.0
2 455.0
3 454.0
4 454.0
5 454.0
Name: 0, dtype: float64
Instead of subtracting 0, 1, 0, 1, 1, 1, I want to subtract 0, 1, 0, 1, 2, 3, so that the final result is:
df[0] = 420, 419, 455, 454, 453, 452
Best answer
Use groupby and cumcount:
df[0].ffill() - df.groupby(df[0].notna().cumsum()).cumcount()
0 420.0
1 419.0
2 455.0
3 454.0
4 453.0
5 452.0
dtype: float64
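Details
Define the groups. Each non-NaN value starts a new group, and the NaNs that follow it belong to the same group: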
df[0].notna().cumsum()
0 1
1 1
2 2
3 2
4 2
5 2
Name: 0, dtype: int64
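To see why the cumulative sum produces group labels, it may help to look at the intermediate boolean mask. The following is a minimal sketch; the names s, mask, and groups are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same data as in the question, held in a standalone Series.
s = pd.Series([420, np.nan, 455, np.nan, np.nan, np.nan])

mask = s.notna()        # True where a real value is present
groups = mask.cumsum()  # increases by 1 at every True, so each value opens a new group

print(mask.tolist())    # [True, False, True, False, False, False]
print(groups.tolist())  # [1, 1, 2, 2, 2, 2]
```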
Use these group labels with groupby and cumcount to number the rows within each group:
df.groupby(df[0].notna().cumsum()).cumcount()
0 0
1 1
2 0
3 1
4 2
5 3
dtype: int64
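Putting the steps together, the approach can be sketched as a small reusable function. The function name countdown_fill and the variable names are assumptions made for illustration; only ffill, notna, cumsum, groupby, and cumcount come from the answer.

```python
import numpy as np
import pandas as pd


def countdown_fill(s: pd.Series) -> pd.Series:
    """Forward-fill NaNs, subtracting 1 more for each consecutive NaN.

    Hypothetical helper for illustration; not part of pandas.
    """
    groups = s.notna().cumsum()            # new group id at each non-NaN value
    offset = s.groupby(groups).cumcount()  # 0 at the value itself, then 1, 2, ... for the NaNs after it
    return s.ffill() - offset


s = pd.Series([420, np.nan, 455, np.nan, np.nan, np.nan])
result = countdown_fill(s)
print(result.tolist())  # [420.0, 419.0, 455.0, 454.0, 453.0, 452.0]
```

Note that NaNs at the very start of the series have no earlier value to fill from, so they remain NaN in this sketch.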