问题描述
我在pandas df中有一张桌子.
i have a table in pandas df.
id prod1 prod2 count
1 10 30 100
2 10 20 200
3 20 10 200
4 30 10 100
5 30 40 300
我在df2中还有另一个表
also i am having another table in df2
product price master_product
1000 1 10
5000 2 10
2000 2 20
9000 5 20
8000 1 20
30 3 0
4000 4 50
检查prod1和prod2是否属于master_product中的值,
Check if prod1 and prod2 belongs to the values in master_product,
如果是我想用master_product中最便宜的产品替换我的第一个df中的prod1和prod2.
if yesi want to replace prod1 and prod2 in my first df with the cheapest product in my master_product.
如果prod1和prod2值与master_product中的值不匹配,保持原样.
if the prod1 and prod2 values donot match with the values in master_product,leave the values as it is.
我正在寻找决赛桌.
id prod1 prod2 count
1 1000 4000 100
2 1000 8000 200
3 8000 1000 200
4 30 1000 100 #since 30 is not in master_product,leave as it is
5 30 40 300
我正在尝试使用.map函数来实现这一目标但我只能做到这一点.
i was trying to use .map function to achieve this but i could only reach to this.
df['prod1'] = df['prod1'].map(df2.set_index('master_product')['product'])
df['prod2'] = df['prod2'].map(df2.set_index('master_product')['product'])
但是它将尝试用df2的master_product中的匹配值替换prod1和prod2中的每个值.
but it will try to replace every values in prod1 and prod2 with matching values in master_product from df2.
有什么想法可以实现这一目标吗?
Any ideas how to achieve this?
推荐答案
通过 groupby
与 idxmin
-以最小的price
获取所有索引:
You can first modify df1
for minimal price
by master_product
by groupby
with idxmin
- get all indices with minimal price
:
df1 = df1.loc[df1.groupby('master_product')['price'].idxmin()]
print (df1)
product price master_product
5 30 3 0
0 1000 1 10
4 8000 1 20
6 4000 4 50
创建dict
进行映射:
d = df1.set_index('master_product')['product'].to_dict()
print (d)
{0: 30, 10: 1000, 20: 8000, 50: 4000}
最后一个 map
和如果缺少值,请通过 combine_first
:
Last map
and if value is missing add it by combine_first
:
df.prod1 = df.prod1.map(d).combine_first(df.prod1)
df.prod2 = df.prod2.map(d).combine_first(df.prod2)
print (df)
id prod1 prod2 count
0 1 1000.0 30.0 100
1 2 1000.0 8000.0 200
2 3 8000.0 1000.0 200
3 4 30.0 1000.0 100
4 5 30.0 40.0 300
这篇关于在有条件的情况下对两个数据框进行复杂的Map操作的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!