I have DataFrame A and DataFrame B. I want to join B onto A, but bring in only one particular column from B. For example:

import pandas as pd

dataA = ['a', 'c', 'd', 'e']
A = pd.DataFrame(dataA, columns=['testA'])

dataB = [['a', 1, 'asdf'],
        ['b', 2, 'asdf'],
        ['c', 3, 'asdf'],
        ['d', 4, 'asdf'],
        ['e', 5, 'asdf']]
B = pd.DataFrame(dataB, columns=['testB', 'num', 'asdf'])

Out[1]: A
    testA
0   a
1   c
2   d
3   e

Out[2]: B
    testB   num     asdf
0   a       1       asdf
1   b       2       asdf
2   c       3       asdf
3   d       4       asdf
4   e       5       asdf


My current code is:

Out[3]: A.join(B.set_index('testB'), on='testA')
    testA   num     asdf
0   a       1       asdf
1   c       3       asdf
2   d       4       asdf
3   e       5       asdf


The output I want joins only the 'num' column, as shown below, and ignores 'asdf' along with any other columns B might have:

Out[4]: A

    testA   num
0   a       1
1   c       3
2   d       4
3   e       5

Best answer

One approach is to use merge and then keep only the columns you need:

new_df = A.merge(B, how='left', left_on='testA', right_on='testB')[['testA', 'num']]


Result:

  testA  num
0     a    1
1     c    3
2     d    4
3     e    5
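
As an alternative to selecting columns after the merge, B can be trimmed to the key and the wanted column before joining, so the unwanted columns never enter the result. The following is a minimal sketch using the same sample data as the question; the variable names `joined` and `merged` are only illustrative.

```python
import pandas as pd

# Same sample data as in the question.
A = pd.DataFrame({'testA': ['a', 'c', 'd', 'e']})
B = pd.DataFrame({
    'testB': ['a', 'b', 'c', 'd', 'e'],
    'num': [1, 2, 3, 4, 5],
    'asdf': ['asdf'] * 5,
})

# Variant 1: keep the original join, but select only 'num' from B
# after moving the key into the index.
joined = A.join(B.set_index('testB')[['num']], on='testA')

# Variant 2: merge against a two-column slice of B, then drop the
# right-hand key, which merge keeps because the key names differ.
merged = (
    A.merge(B[['testB', 'num']], how='left', left_on='testA', right_on='testB')
     .drop(columns='testB')
)

print(joined)
print(merged)
```

Both variants should leave only the testA and num columns. Subsetting B first also avoids carrying wide, unused columns through the join when B has many of them.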

This question, "python - Pandas: join only certain columns", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/56997754/
