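I have two dataframes.

The first dataframe shows the cars sold in different day ranges and the number sold in each range.

The second dataframe lists the repair calls for the cars that have been repaired.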

df1:

     Cars     Range(Days)    Sold
  0   A          1-3            5
  1              4-7            23
  2              8-15           2
  3   B          4-7            4
  4              8-15           1
  5   C          1-3            5
  6   D          1-3            2
  7   E          1-3            9




df2:

     Car     Repair_Calls
0    A            2
1    C            45
2    D            32
4    E            1


I tried:

df1['Repair_Calls'] = df2['Repair_Calls']

What I got:

     Car     Range(Days)    Sold     Repair_Calls
  0   A          1-3            5           2
  1              4-7            23          45
  2              8-15           2           32
  3   B          4-7            4           1
  4              8-15           1
  5   C          1-3            5
  6   D          1-3            2
  7   E          1-3            9


Expected output:

     Car     Range(Days)    Sold     Repair_Calls
  0   A          1-3            5           2
  1              4-7            23
  2              8-15           2
  3   B          4-7            4           0
  4              8-15           1
  5   C          1-3            5           45
  6   D          1-3            2           32
  7   E          1-3            9            1

Best answer

Use map with a Series created from df2 by set_index:

df1['Repair_Calls'] = df1['Cars'].map(df2.set_index('Car')['Repair_Calls'])
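A minimal sketch of why the lookup by car label works where the plain assignment does not. The frames are rebuilt from the tables in the question, assuming the blank Cars cells are NaN; the names by_position, lookup and by_label are only illustrative.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question's tables (assumption: blank Cars cells are NaN).
df1 = pd.DataFrame({
    'Cars': ['A', np.nan, np.nan, 'B', np.nan, 'C', 'D', 'E'],
    'Range(Days)': ['1-3', '4-7', '8-15', '4-7', '8-15', '1-3', '1-3', '1-3'],
    'Sold': [5, 23, 2, 4, 1, 5, 2, 9],
})
df2 = pd.DataFrame({'Car': ['A', 'C', 'D', 'E'],
                    'Repair_Calls': [2, 45, 32, 1]},
                   index=[0, 1, 2, 4])

# Assigning df2['Repair_Calls'] directly aligns on the row index, not on the
# car name. Reindexing to df1's index shows which rows would receive a value.
by_position = df2['Repair_Calls'].reindex(df1.index)

# map instead looks up each value of Cars in a Series indexed by Car.
lookup = df2.set_index('Car')['Repair_Calls']
by_label = df1['Cars'].map(lookup)

print(by_position.dropna().index.tolist())
print(by_label)
```

Rows whose Cars value is NaN, or whose car is missing from df2 (here B), come out as NaN.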


Or use merge with a left join:

df1 = df1.merge(df2, left_on='Cars',right_on='Car', how='left').drop('Car', axis=1)

print (df1)
  Cars Range(Days)  Sold  Repair_Calls
0    A         1-3     5           2.0
1  NaN         4-7    23           NaN
2  NaN        8-15     2           NaN
3    B         4-7     4           NaN
4  NaN        8-15     1           NaN
5    C         1-3     5          45.0
6    D         1-3     2          32.0
7    E         1-3     9           1.0
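The merge variant can be sketched as a self-contained script. The frames are again rebuilt from the question's tables, assuming the blank Cars cells are NaN. The validate='many_to_one' argument is an addition that is not part of the answer: it asks pandas to raise an error if a car appears more than once in df2, which would otherwise duplicate rows of df1.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question's tables (assumption: blank Cars cells are NaN).
df1 = pd.DataFrame({
    'Cars': ['A', np.nan, np.nan, 'B', np.nan, 'C', 'D', 'E'],
    'Range(Days)': ['1-3', '4-7', '8-15', '4-7', '8-15', '1-3', '1-3', '1-3'],
    'Sold': [5, 23, 2, 4, 1, 5, 2, 9],
})
df2 = pd.DataFrame({'Car': ['A', 'C', 'D', 'E'],
                    'Repair_Calls': [2, 45, 32, 1]})

# A left join keeps every row of df1 in its original order; the helper key
# column Car from df2 is dropped afterwards.
merged = df1.merge(df2, left_on='Cars', right_on='Car', how='left',
                   validate='many_to_one').drop(columns='Car')
print(merged)
```

Unlike map, merge returns a new frame rather than adding a column in place, so the result has to be assigned back if df1 should be updated.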


But if you also need to fill in values for cars missing from df2, add reindex by the unique non-NaN values of Cars:

s = df2.set_index('Car')['Repair_Calls'].reindex(df1['Cars'].dropna().unique(), fill_value=0)
df1['Repair_Calls'] = df1['Cars'].map(s)
print (df1)
  Cars Range(Days)  Sold  Repair_Calls
0    A         1-3     5           2.0
1  NaN         4-7    23           NaN
2  NaN        8-15     2           NaN
3    B         4-7     4           0.0
4  NaN        8-15     1           NaN
5    C         1-3     5          45.0
6    D         1-3     2          32.0
7    E         1-3     9           1.0
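The column is shown as floats (2.0, 45.0) because NaN forces a float dtype. If whole numbers are preferred, one option, not part of the answer and assuming a pandas version that supports the nullable 'Int64' dtype, is to cast after the lookup. The sketch below rebuilds the frames from the question's tables, assuming blank Cars cells are NaN; the name calls is illustrative.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question's tables (assumption: blank Cars cells are NaN).
df1 = pd.DataFrame({
    'Cars': ['A', np.nan, np.nan, 'B', np.nan, 'C', 'D', 'E'],
    'Range(Days)': ['1-3', '4-7', '8-15', '4-7', '8-15', '1-3', '1-3', '1-3'],
    'Sold': [5, 23, 2, 4, 1, 5, 2, 9],
})
df2 = pd.DataFrame({'Car': ['A', 'C', 'D', 'E'],
                    'Repair_Calls': [2, 45, 32, 1]})

# Extend the lookup Series to every car named in df1; cars absent from df2 get 0.
s = (df2.set_index('Car')['Repair_Calls']
        .reindex(df1['Cars'].dropna().unique(), fill_value=0))

# Rows without a car name stay missing; the nullable integer dtype stores
# them as <NA> instead of converting the whole column to float.
calls = df1['Cars'].map(s).astype('Int64')
print(calls)
```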

Regarding python - How to merge two irregular dataframes in python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50678599/
