I have 2 pandas DataFrames, df1 and df2:

Name   No
A      1
A      2
B      5

Player Gender
A      F
B      M
C      F


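I want to create a new column named sex in df1, using the corresponding values from the Gender column of df2. The columns used for the lookup are Name in df1 and Player in df2.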

Any help is much appreciated.

Best answer

Use map on the Name column with a Series built from df2 via set_index, so that Player becomes the index used for the lookup:

df1['sex'] = df1.Name.map(df2.set_index('Player')['Gender'])
print(df1)
  Name  No sex
0    A   1   F
1    A   2   F
2    B   5   M
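
One thing to watch with map: a Name that has no matching Player comes back as NaN rather than raising an error. Below is a minimal, self-contained sketch of that behaviour. The extra row 'D' and the 'unknown' placeholder are hypothetical additions for illustration, not part of the original data.

```python
import pandas as pd

# Frames from the question, plus a hypothetical row 'D' that has no match in df2
df1 = pd.DataFrame({'Name': ['A', 'A', 'B', 'D'], 'No': [1, 2, 5, 7]})
df2 = pd.DataFrame({'Player': ['A', 'B', 'C'], 'Gender': ['F', 'M', 'F']})

# Series indexed by Player; map looks each Name up in that index
lookup = df2.set_index('Player')['Gender']
df1['sex'] = df1['Name'].map(lookup)

# Names missing from the lookup become NaN; fill them if a default is wanted
df1['sex_filled'] = df1['sex'].fillna('unknown')
print(df1)
```

Whether to fill the gaps or leave them as NaN depends on what the downstream code expects.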


The same works with a dict:

d = df2.set_index('Player')['Gender'].to_dict()
print(d)
{'A': 'F', 'B': 'M', 'C': 'F'}
df1['sex'] = df1.Name.map(d)
print(df1)
  Name  No sex
0    A   1   F
1    A   2   F
2    B   5   M
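
Both lookups assume each Player appears only once in df2. If it can repeat, to_dict keeps just the last value for each key, and as far as I know the Series version complains about a non-unique index. Here is a sketch that drops repeats before building the dict. The frame df2_dup is a hypothetical variant of df2 with a repeated row, and keeping the first occurrence is an assumption about what is wanted.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['A', 'A', 'B'], 'No': [1, 2, 5]})

# Hypothetical df2 in which player 'B' is listed twice
df2_dup = pd.DataFrame({'Player': ['A', 'B', 'B', 'C'],
                        'Gender': ['F', 'M', 'M', 'F']})

# Keep the first row per Player so every key maps to exactly one value
lookup = (df2_dup.drop_duplicates('Player')
                 .set_index('Player')['Gender']
                 .to_dict())

df1['sex'] = df1['Name'].map(lookup)
print(df1)
```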


Or use merge:

print(pd.merge(df1, df2, left_on='Name', right_on='Player')
        .rename(columns={'Gender': 'sex'})
        .drop('Player', axis=1))

  Name  No sex
0    A   1   F
1    A   2   F
2    B   5   M
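
Note that merge defaults to an inner join, so rows of df1 with no matching Player are dropped from the result. A small sketch of the difference, using a hypothetical extra row 'D' that is not in the original data:

```python
import pandas as pd

# Frames from the question, plus a hypothetical row 'D' with no match in df2
df1 = pd.DataFrame({'Name': ['A', 'A', 'B', 'D'], 'No': [1, 2, 5, 7]})
df2 = pd.DataFrame({'Player': ['A', 'B', 'C'], 'Gender': ['F', 'M', 'F']})

# Default (inner) join: the unmatched 'D' row disappears
inner = pd.merge(df1, df2, left_on='Name', right_on='Player')

# Left join: every row of df1 is kept, unmatched rows get NaN in sex
left = (pd.merge(df1, df2, how='left', left_on='Name', right_on='Player')
          .rename(columns={'Gender': 'sex'})
          .drop(columns='Player'))

print(len(inner), len(left))
print(left)
```

With how='left' the result lines up with what map returns, including the NaN for unmatched names.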


The first approach (map) is faster:

In [46]: %timeit (pd.merge(df1,df2, left_on='Name', right_on='Player').rename(columns={'Gender':'sex'}).drop('Player', axis=1))
The slowest run took 4.53 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 2.53 ms per loop

In [47]: %timeit df1.Name.map(df2.set_index('Player')['Gender'])
The slowest run took 4.78 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 882 µs per loop
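
Timings on tiny frames can be noisy, so it may be worth repeating the comparison on data closer to your real size. Below is a rough sketch using the standard timeit module outside IPython. The frame sizes, the random data, and the helper names with_map and with_merge are all made up for illustration, and the numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger data: 1,000 players, 100,000 lookup rows
rng = np.random.default_rng(0)
players = [f'P{i}' for i in range(1000)]
df2 = pd.DataFrame({'Player': players,
                    'Gender': rng.choice(['F', 'M'], size=len(players))})
df1 = pd.DataFrame({'Name': rng.choice(players, size=100_000),
                    'No': np.arange(100_000)})

def with_map():
    # Lookup through a Series indexed by Player
    return df1['Name'].map(df2.set_index('Player')['Gender'])

def with_merge():
    # Same lookup expressed as a join
    return (pd.merge(df1, df2, left_on='Name', right_on='Player')
              .rename(columns={'Gender': 'sex'})
              .drop(columns='Player'))

runs = 10
t_map = timeit.timeit(with_map, number=runs)
t_merge = timeit.timeit(with_merge, number=runs)
print(f'map:   {t_map / runs * 1e3:.2f} ms per call')
print(f'merge: {t_merge / runs * 1e3:.2f} ms per call')
```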

Regarding "python - Look up values with different column names in pandas", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37894807/
