I have the following code:
import numpy as np
import pandas as pd
df = pd.DataFrame(
    {'Index': ['1', '2', '5', '7', '8', '9', '10'],
     'Vals': [1, 2, 3, 4, np.nan, np.nan, 5]})
This gives me:
Index Vals
0 1 1.0
1 2 2.0
2 5 3.0
3 7 4.0
4 8 NaN
5 9 NaN
6 10 5.0
But what I want is something like this:
Index Vals
0 1 1.000000
1 2 2.000000
2 3 NaN
3 4 NaN
4 5 3.000000
5 6 NaN
6 7 4.000000
7 8 NaN
8 9 NaN
9 10 5.000000
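I tried to achieve this by creating a new DataFrame with a continuous index, and then I want to assign the values I already have to it, but how? So far, the only thing I have is: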
clean_data = pd.DataFrame({'Index' : range(1,11)})
This gives me:
Index
0 1
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 9
9 10
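Best answer
You can merge the full range of index values with your existing data. So, for your example, it would look like this: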
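import pandas as pd
import numpy as np
df = pd.DataFrame(
    {'Index': ['1', '2', '5', '7', '8', '9', '10'],
     'Vals': [1, 2, 3, 4, np.nan, np.nan, 5]})
# 'Index' holds strings; convert to int so the merge keys have the same dtype
df['Index'] = df['Index'].astype(int)
# One row per value in the full range 1..10
clean_data = pd.DataFrame({'Index': range(1, 11)})
# Index values missing from df get NaN in 'Vals'
result = clean_data.merge(df, on='Index', how='outer')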
The result is:
Index Vals
0 1 1.0
1 2 2.0
2 3 NaN
3 4 NaN
4 5 3.0
5 6 NaN
6 7 4.0
7 8 NaN
8 9 NaN
9 10 5.0
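
As an alternative to the merge, the same gap-filling can probably be done with reindex. This is only a sketch and not part of the answer above: it assumes the values in 'Index' are unique once converted to int, and the names full_index and result are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'Index': ['1', '2', '5', '7', '8', '9', '10'],
     'Vals': [1, 2, 3, 4, np.nan, np.nan, 5]})
df['Index'] = df['Index'].astype(int)

# Hypothetical helper name: every integer from the smallest to the largest 'Index'
full_index = pd.RangeIndex(int(df['Index'].min()),
                           int(df['Index'].max()) + 1,
                           name='Index')

# Use 'Index' as the row labels, expand to the full range (missing labels
# become NaN rows), then turn the labels back into a regular column
result = df.set_index('Index').reindex(full_index).reset_index()
print(result)
```

Here the range is derived from the data rather than hard-coded as range(1, 11), so it should still work if the smallest or largest 'Index' value changes.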