I have a first DataFrame:

df1:

A         B       C    D
Car               0
Bike              0
Train             0
Plane             0
Other_1  Plane    2
Other_2  Plane    3
Other_3  Plane    4

And a second one:
df2:

A         B
Car       4 %
Bike      5 %
Train     6 %
Plane     7 %

I would like to combine them so that df1 becomes:
df1:

A         B       C    D
Car               0    4 %
Bike              0    5 %
Train             0    6 %
Plane             0    7 %
Other_1  Plane    2    2
Other_2  Plane    3    3
Other_3  Plane    4    4

What is the best way to do this?

Best answer

If df (df1 in the question) and df2 share the same index, you can use:

df['D'] = df2['B'].combine_first(df['C'])

Output:
         A      B  C    D
0      Car    NaN  0  4 %
1     Bike    NaN  0  5 %
2    Train    NaN  0  6 %
3    Plane    NaN  0  7 %
4  Other_1  Plane  2    2
5  Other_2  Plane  3    3
6  Other_3  Plane  4    4
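
As a self-contained sketch of the combine_first approach: the frames below are rebuilt by hand from the tables in the question, so the exact dtypes and the NaN placeholders for the empty cells are assumptions. combine_first keeps the values of df2['B'] where the index labels match and falls back to df['C'] for every other label.

```python
import numpy as np
import pandas as pd

# Frames reconstructed from the question (empty cells assumed to be NaN).
df = pd.DataFrame({
    "A": ["Car", "Bike", "Train", "Plane", "Other_1", "Other_2", "Other_3"],
    "B": [np.nan, np.nan, np.nan, np.nan, "Plane", "Plane", "Plane"],
    "C": [0, 0, 0, 0, 2, 3, 4],
})
df2 = pd.DataFrame({
    "A": ["Car", "Bike", "Train", "Plane"],
    "B": ["4 %", "5 %", "6 %", "7 %"],
})

# Rows 0-3 exist in df2, so they take the percentage strings;
# rows 4-6 are missing from df2, so they fall back to column C.
df["D"] = df2["B"].combine_first(df["C"])
print(df)
```

This relies purely on index alignment: row 0 of df2 is paired with row 0 of df, whatever is in column A.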

If the indexes do not line up exactly, you can merge on column A instead:
df_out = df.merge(df2, on='A', how='left', suffixes=('', 'y'))
df_out.assign(D=df_out.By.fillna(df_out.C)).drop('By', axis=1)

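Or use the improved one-liner from @piRSquared (rows of df with no match in df2 get NaN in D, so fill them from C afterwards if needed):
df.drop(columns='D').merge(df2.rename(columns={'B': 'D'}), how='left', on='A')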

Output:
         A      B  C    D
0      Car    NaN  0  4 %
1     Bike    NaN  0  5 %
2    Train    NaN  0  6 %
3    Plane    NaN  0  7 %
4  Other_1  Plane  2    2
5  Other_2  Plane  3    3
6  Other_3  Plane  4    4
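
The merge-based variants can be sketched end to end as follows. The data is again reconstructed from the question, df2 is deliberately given in a different row order to show that only the keys in column A matter, and the astype(object) call is a defensive assumption (not part of the original answer) so that strings and integers can share one column regardless of the pandas version.

```python
import numpy as np
import pandas as pd

# Frames reconstructed from the question; D starts out empty (assumed NaN).
df = pd.DataFrame({
    "A": ["Car", "Bike", "Train", "Plane", "Other_1", "Other_2", "Other_3"],
    "B": [np.nan, np.nan, np.nan, np.nan, "Plane", "Plane", "Plane"],
    "C": [0, 0, 0, 0, 2, 3, 4],
    "D": np.nan,
})
# Rows of df2 are shuffled on purpose: the merge matches on A, not on the index.
df2 = pd.DataFrame({
    "A": ["Plane", "Train", "Bike", "Car"],
    "B": ["7 %", "6 %", "5 %", "4 %"],
})

# Variant 1: left merge, then fill the unmatched rows from column C.
merged = df.merge(df2, on="A", how="left", suffixes=("", "y"))
filled = merged["By"].astype(object).fillna(merged["C"])
result = merged.assign(D=filled).drop(columns="By")

# Variant 2: the one-liner, followed by the same fallback to column C.
one_liner = df.drop(columns="D").merge(
    df2.rename(columns={"B": "D"}), how="left", on="A"
)
one_liner["D"] = one_liner["D"].astype(object).fillna(one_liner["C"])

print(result)
```

A left merge keeps the rows of df in their original order, so both variants should line up with the table above.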

This answer on adding specific column values based on another DataFrame in Python comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/56462488/
