I have DataFrame A and DataFrame B, and I want to join B onto A, but bring over only one specific column from B. Like this:
import pandas as pd

dataA = ['a', 'c', 'd', 'e']
A = pd.DataFrame(dataA, columns=['testA'])
dataB = [['a', 1, 'asdf'],
         ['b', 2, 'asdf'],
         ['c', 3, 'asdf'],
         ['d', 4, 'asdf'],
         ['e', 5, 'asdf']]
B = pd.DataFrame(dataB, columns=['testB', 'num', 'asdf'])
Out[1]: A
testA
0 a
1 c
2 d
3 e
Out[2]: B
testB num asdf
0 a 1 asdf
1 b 2 asdf
2 c 3 asdf
3 d 4 asdf
4 e 5 asdf
My current code is:
Out[3]: A.join(B.set_index('testB'), on='testA')
testA num asdf
0 a 1 asdf
1 c 3 asdf
2 d 4 asdf
3 e 5 asdf
The output I want joins only the 'num' column, as shown below, and ignores 'asdf' along with any other columns B might have:
Out[4]: A
testA num
0 a 1
1 c 3
2 d 4
3 e 5
Best answer
One way is to use merge and then keep only the columns you need:
new_df= A.merge(B, how='left', left_on='testA', right_on='testB')[['testA', 'num']]
Result:
testA num
0 a 1
1 c 3
2 d 4
3 e 5
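As an alternative to merging everything and then discarding columns, B can be narrowed to the wanted column before it is combined with A. The sketch below is not part of the original answer; it rebuilds the question's A and B and shows two variants. The names `joined` and `mapped` are illustrative only, and the `map` variant assumes the values in 'testB' are unique.

```python
import pandas as pd

A = pd.DataFrame(['a', 'c', 'd', 'e'], columns=['testA'])
B = pd.DataFrame(
    [['a', 1, 'asdf'], ['b', 2, 'asdf'], ['c', 3, 'asdf'],
     ['d', 4, 'asdf'], ['e', 5, 'asdf']],
    columns=['testB', 'num', 'asdf'],
)

# Variant 1: index B by its key and select only 'num' (a named Series)
# before joining, so no other column of B reaches the result.
joined = A.join(B.set_index('testB')['num'], on='testA')

# Variant 2: treat B as a key -> num lookup and map A's keys through it.
# Assumes 'testB' has no duplicate values.
mapped = A.assign(num=A['testA'].map(B.set_index('testB')['num']))

print(joined)
print(mapped)
```

Both variants leave A itself unchanged and return a new frame with only 'testA' and 'num'. Keys in A that have no match in B would get NaN in 'num', the same as with how='left' in merge.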
Regarding "python - Pandas join on only one column", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56997754/