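So I have two datasets: df1 has the color for every fruit, while df2 does not. How can I fill in the Color values of df2 from the color data in df1, matching on the fruit name?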
df1                      df2
Name        Color        Name          Color
Apple       Red          Orange        Na
Orange      Orange       Coconut       Na
Pear        Pear         Pear          Na
Pear        Pear         Strawberries  Na
Papaya      Papaya       Banana        Na
Watermelon  Watermelon   Papaya        Na
"           "            "             "
Best answer
I think you can use map, but you first need Series.drop_duplicates:
df2['Color'] = df2['Name'].map(df1.set_index('Name')['Color'].drop_duplicates())
print(df2)
           Name   Color
0        Orange  Orange
1       Coconut     NaN
2          Pear    Pear
3  Strawberries     NaN
4        Banana     NaN
5        Papaya  Papaya
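For anyone who wants to run this end to end, here is a minimal self-contained sketch of the map approach. The sample frames are rebuilt from the tables in the question (treating "Na" as NaN is an assumption), and the helper name fill_colors_by_map is hypothetical. One caveat worth knowing: Series.drop_duplicates compares values rather than the index, so if two different fruits shared a color, one of them would be dropped from the lookup. The sketch below deduplicates on Name instead, which is a swapped-in variant and not the exact line from the answer.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; "Na" is assumed to mean a missing value.
df1 = pd.DataFrame({
    "Name": ["Apple", "Orange", "Pear", "Pear", "Papaya", "Watermelon"],
    "Color": ["Red", "Orange", "Pear", "Pear", "Papaya", "Watermelon"],
})
df2 = pd.DataFrame({
    "Name": ["Orange", "Coconut", "Pear", "Strawberries", "Banana", "Papaya"],
    "Color": np.nan,
})


def fill_colors_by_map(target, lookup):
    """Hypothetical helper: fill target['Color'] by looking up target['Name'] in lookup."""
    # Keep one row per Name so the lookup Series has a unique index,
    # which Series.map needs when it is given a Series.
    name_to_color = lookup.drop_duplicates("Name").set_index("Name")["Color"]
    out = target.copy()  # leave the caller's frame untouched
    out["Color"] = out["Name"].map(name_to_color)  # unmatched names become NaN
    return out


result = fill_colors_by_map(df2, df1)
print(result)
```

Names that do not appear in df1 (Coconut, Strawberries, Banana) stay NaN, as in the output above.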
Another solution uses merge together with DataFrame.drop_duplicates and DataFrame.drop:
df2 = pd.merge(df2.drop('Color', axis=1), df1.drop_duplicates(), how='left')
print(df2)
           Name   Color
0        Orange  Orange
1       Coconut     NaN
2          Pear    Pear
3  Strawberries     NaN
4        Banana     NaN
5        Papaya  Papaya
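The merge variant can be sketched as a runnable script in the same way. Again the frames are rebuilt from the question (with "Na" assumed to be NaN), and the variable names lookup and merged are only illustrative. Compared with the one-liner in the answer, this sketch names the join key explicitly with on="Name" and adds validate="many_to_one", an optional pd.merge check that raises if a Name is repeated in the right-hand frame. Both are additions for safety, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; "Na" is assumed to mean a missing value.
df1 = pd.DataFrame({
    "Name": ["Apple", "Orange", "Pear", "Pear", "Papaya", "Watermelon"],
    "Color": ["Red", "Orange", "Pear", "Pear", "Papaya", "Watermelon"],
})
df2 = pd.DataFrame({
    "Name": ["Orange", "Coconut", "Pear", "Strawberries", "Banana", "Papaya"],
    "Color": np.nan,
})

# One row per fruit name, so each row of df2 matches at most one row here.
lookup = df1.drop_duplicates("Name")

merged = pd.merge(
    df2.drop(columns="Color"),  # drop the empty Color column before joining
    lookup,
    on="Name",
    how="left",                 # keep every row of df2, in its original order
    validate="many_to_one",     # fail loudly if lookup has duplicate names
)
print(merged)
```

A left merge keeps all six rows of df2, and fruits missing from df1 end up with NaN in Color.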
Regarding python - mapping data from another dataset (Python pandas), a similar question was found on Stack Overflow: https://stackoverflow.com/questions/39566992/