Problem description
I would like to merge two dataframes. df2 might have more columns than df and will always be 1 row. I would like the data from the df2 row to overwrite the matching row in df. NOTE: the ser and no columns together make a row unique.
import pandas as pd
df = pd.DataFrame({'ser': {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2}, 'no': {0: 0, 1: 1, 2: 2, 3: 0, 4: 1, 5: 2, 6: 0, 7: 1, 8: 2}, 'c': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1}})
df2 = pd.DataFrame({'ser': {0: 1}, 'no': {0: 2}, 'c': {0: 88}, 'd': {0: 90}})
>>> df
   ser  no  c
0    0   0  1
1    0   1  1
2    0   2  1
3    1   0  1
4    1   1  1
5    1   2  1
6    2   0  1
7    2   1  1
8    2   2  1
>>> df2
   ser  no   c   d
0    1   2  88  90
I have tried a left merge, but this creates two c columns (c_x and c_y):
>>> pd.merge(df, df2, how='left', on=['ser', 'no'])
   ser  no  c_x   c_y     d
0    0   0    1   NaN   NaN
1    0   1    1   NaN   NaN
2    0   2    1   NaN   NaN
3    1   0    1   NaN   NaN
4    1   1    1   NaN   NaN
5    1   2    1  88.0  90.0
6    2   0    1   NaN   NaN
7    2   1    1   NaN   NaN
8    2   2    1   NaN   NaN
Desired output:
   ser  no   c    d
0    0   0   1  NaN
1    0   1   1  NaN
2    0   2   1  NaN
3    1   0   1  NaN
4    1   1   1  NaN
5    1   2  88   90
6    2   0   1  NaN
7    2   1   1  NaN
8    2   2   1  NaN
Recommended answer
Frankenstein answer
df[['ser', 'no']].merge(df2, 'left').set_axis(df.index).fillna(df)
   ser  no     c     d
0    0   0   1.0   NaN
1    0   1   1.0   NaN
2    0   2   1.0   NaN
3    1   0   1.0   NaN
4    1   1   1.0   NaN
5    1   2  88.0  90.0
6    2   0   1.0   NaN
7    2   1   1.0   NaN
8    2   2   1.0   NaN
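The output above shows c as floats (1.0, 88.0) because the left merge introduces NaN into that column before it is filled. If the integer values from the desired output matter, one possible follow-up is to cast c back once the gaps are filled. This is only a sketch and not part of the original answer; it assumes pandas 1.0 or later and that c has no missing values left after fillna.

```python
import pandas as pd

df = pd.DataFrame({'ser': [0, 0, 0, 1, 1, 1, 2, 2, 2],
                   'no':  [0, 1, 2, 0, 1, 2, 0, 1, 2],
                   'c':   [1, 1, 1, 1, 1, 1, 1, 1, 1]})
df2 = pd.DataFrame({'ser': [1], 'no': [2], 'c': [88], 'd': [90]})

# Same chain as the answer, with how= and axis= spelled out.
result = (df[['ser', 'no']]
          .merge(df2, how='left')
          .set_axis(df.index, axis=0)
          .fillna(df))

# 'c' has no gaps left, so it can go back to the dtype it had in df.
# 'd' still contains NaN and therefore has to stay a float column.
restored = result.astype({'c': df['c'].dtype})
```

The d column cannot be cast the same way, since NaN has no plain integer representation.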
Explanation
I'm going to merge on the columns ['ser', 'no'] and don't want to spell that out in the merge call. When no key is given, merge joins on the columns the two dataframes have in common. I also don't want goofy duplicate column names like 'c_x' and 'c_y', so I slice out only the columns I want in common and then merge:
df[['ser', 'no']].merge(df2, 'left')
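To see why slicing first avoids the suffixed columns, here is a small check that the implicit merge on the sliced frame matches one with the keys written out. It is a sketch assuming a reasonably recent pandas; the variable names keys, implicit and explicit are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'ser': [0, 0, 0, 1, 1, 1, 2, 2, 2],
                   'no':  [0, 1, 2, 0, 1, 2, 0, 1, 2],
                   'c':   [1, 1, 1, 1, 1, 1, 1, 1, 1]})
df2 = pd.DataFrame({'ser': [1], 'no': [2], 'c': [88], 'd': [90]})

# After slicing, the only columns shared with df2 are the two key columns.
keys = df[['ser', 'no']]

# With no `on=`, merge joins on the shared columns: 'ser' and 'no'.
implicit = keys.merge(df2, how='left')
explicit = keys.merge(df2, how='left', on=['ser', 'no'])

same_result = implicit.equals(explicit)
columns = list(implicit.columns)  # 'c' comes only from df2, so no suffixes
```

Because c no longer exists on the left side, there is nothing for merge to disambiguate, and the result carries a single c column.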
When I merge, I want only the rows from the left dataframe. However, merge can return a different number of rows than either original dataframe, so it builds a new index. NOTE: assuming the right dataframe (df2) has NO DUPLICATES with respect to ['ser', 'no'], a 'left' merge should produce exactly the same number of rows as the left dataframe (df). But the result won't necessarily have the same index. It turns out that in this example it does, but I don't want to take chances, so I use set_axis:
set_axis(df.index)
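The index concern is easier to see with a left frame whose index is not 0..n-1. The frames below are hypothetical and only for illustration. The sketch also passes validate='many_to_one' to merge, which is one way to make the "no duplicates in the right frame" assumption explicit; that argument is an addition here, not something the original answer uses.

```python
import pandas as pd

# Hypothetical left frame with a non-default index.
left = pd.DataFrame({'ser': [0, 1], 'no': [0, 2], 'c': [1, 1]}, index=[10, 20])
right = pd.DataFrame({'ser': [1], 'no': [2], 'c': [88], 'd': [90]})

# validate='many_to_one' raises if the keys are not unique in `right`.
merged = left[['ser', 'no']].merge(right, how='left', validate='many_to_one')
index_after_merge = merged.index.tolist()          # merge numbers rows from 0

# Put the left frame's labels back so a later fillna(left) lines up by label.
realigned = merged.set_axis(left.index, axis=0)
index_after_set_axis = realigned.index.tolist()

# With duplicated keys on the right, the same call fails instead of
# silently returning more rows than `left` has.
right_dup = pd.concat([right, right], ignore_index=True)
try:
    left[['ser', 'no']].merge(right_dup, how='left', validate='many_to_one')
    duplicate_detected = False
except pd.errors.MergeError:
    duplicate_detected = True
```

Without the set_axis step, fillna(left) in this hypothetical case would find no matching labels and leave every gap unfilled.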
Finally, since the resulting dataframe has the same index as df and includes all of df's columns, I can fill in the missing bits with:
fillna(df)
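Putting the three steps together, the following sketch checks which cells fillna(df) actually fills. It assumes pandas 1.0 or later. fillna with a dataframe argument matches on both index and column labels, so the d column, which df does not have, keeps its NaN values.

```python
import pandas as pd

df = pd.DataFrame({'ser': [0, 0, 0, 1, 1, 1, 2, 2, 2],
                   'no':  [0, 1, 2, 0, 1, 2, 0, 1, 2],
                   'c':   [1, 1, 1, 1, 1, 1, 1, 1, 1]})
df2 = pd.DataFrame({'ser': [1], 'no': [2], 'c': [88], 'd': [90]})

# Steps 1 and 2: merge on the shared key columns, then restore df's index.
merged = df[['ser', 'no']].merge(df2, how='left').set_axis(df.index, axis=0)
c_missing_before = int(merged['c'].isna().sum())   # rows with no match in df2

# Step 3: fill the gaps from df, matched by index label and column name.
result = merged.fillna(df)

c_values = result['c'].tolist()
d_missing = int(result['d'].isna().sum())          # df has no 'd' to fill from
```

Only the row with ser 1 and no 2 takes its c and d from df2; every other row keeps the c it had in df.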