I have the following input:

Year    Brand   Model   Value
2018    A           a   1,00
2018    A           b   2,00
2018    B           a   3,00
2017    A           b   4,00
2016    C           b   5,00


I want to add the missing combinations:


Every year must have brands A, B and C.
Every brand must have models a and b.


The expected output looks like this:

Year    Brand   Model   Value
2018    A          a    1
2018    A          b    2
2018    B          a    3,00
2018    B          b
2018    C          a
2018    C          b
2017    A          a
2017    A          b    4
2017    B          a
2017    B          b
2017    C          a
2017    C          b
2016    A          a
2016    A          b
2016    B          a
2016    B          b
2016    C          a
2016    C          b    5


How can I do this?

Best answer

Use reindex with a MultiIndex created by MultiIndex.from_product:

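import pandas as pd

# Build every (Year, Brand, Model) combination from the values present in df
mux = pd.MultiIndex.from_product([df['Year'].unique(),
                                  df['Brand'].unique(),
                                  df['Model'].unique()], names=['Year','Brand','Model'])
# Reindex against the full product; combinations missing from df get NaN
df = df.set_index(['Year','Brand','Model']).reindex(mux).reset_index()
print(df)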
    Year Brand Model Value
0   2018     A     a  1,00
1   2018     A     b  2,00
2   2018     B     a  3,00
3   2018     B     b   NaN
4   2018     C     a   NaN
5   2018     C     b   NaN
6   2017     A     a   NaN
7   2017     A     b  4,00
8   2017     B     a   NaN
9   2017     B     b   NaN
10  2017     C     a   NaN
11  2017     C     b   NaN
12  2016     A     a   NaN
13  2016     A     b   NaN
14  2016     B     a   NaN
15  2016     B     b   NaN
16  2016     C     a   NaN
17  2016     C     b  5,00
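
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the question's input by hand and, as an assumption that goes beyond the answer above, lists the required years, brands and models explicitly (the names `years`, `brands`, `models` and `full` are just illustrative). Listing the levels can help when a required value, say brand C, might not appear anywhere in the data, because `unique()` only returns values that are actually present.

```python
import pandas as pd

# Rebuild the question's input; Value stays a string because of the decimal comma.
df = pd.DataFrame({
    "Year": [2018, 2018, 2018, 2017, 2016],
    "Brand": ["A", "A", "B", "A", "C"],
    "Model": ["a", "b", "a", "b", "b"],
    "Value": ["1,00", "2,00", "3,00", "4,00", "5,00"],
})

# Assumption: the required levels are written out instead of inferred with unique().
years = [2018, 2017, 2016]
brands = ["A", "B", "C"]
models = ["a", "b"]

# Cartesian product of the three levels: 3 * 3 * 2 = 18 combinations.
mux = pd.MultiIndex.from_product([years, brands, models],
                                 names=["Year", "Brand", "Model"])

# Rows present in df keep their Value; the added combinations get NaN.
full = df.set_index(["Year", "Brand", "Model"]).reindex(mux).reset_index()
print(full)
```

Note that reindex needs the (Year, Brand, Model) index to be unique, which holds for this input; if the real data had duplicate combinations, they would have to be aggregated or dropped first.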

For "python - Pandas: adding missing rows", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52605334/
