Problem description
I'm trying to fill a column of a dataframe from another dataframe based on a condition. Let's say my first dataframe is df1 and the second is df2.
# df1 is described as below:
+------+------+
| Col1 | Col2 |
+------+------+
| A | 1 |
| B | 2 |
| C | 3 |
| A | 1 |
+------+------+
and
# df2 is described as below:
+------+------+
| Col1 | Col2 |
+------+------+
| A | NaN |
| B | NaN |
| D | NaN |
+------+------+
Each distinct value of Col1 has its own id number (in Col2), so I want to fill the NaN values in df2.Col2 wherever df2.Col1 == df1.Col1, so that my second dataframe looks like this:
# df2 :
+------+------+
| Col1 | Col2 |
+------+------+
| A | 1 |
| B | 2 |
| D | NaN |
+------+------+
I'm using Python 2.7.
Recommended answer
Use drop_duplicates with set_index and combine_first:
df = df2.set_index('Col1').combine_first(df1.drop_duplicates().set_index('Col1')).reset_index()
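To see what this one-liner does on the sample data, here is a minimal runnable sketch. The frames are rebuilt from the tables in the question; the names lookup, combined and result are illustrative and not part of the original answer. Note that combine_first returns the union of both indexes, so a key that exists only in df1 (C here) also appears in the output. The final reindex step is an addition, assuming only the rows of df2 should be kept.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the tables in the question.
df1 = pd.DataFrame({"Col1": ["A", "B", "C", "A"], "Col2": [1, 2, 3, 1]})
df2 = pd.DataFrame({"Col1": ["A", "B", "D"], "Col2": [np.nan, np.nan, np.nan]})

# Drop fully duplicated rows so that Col1 can serve as a unique index.
lookup = df1.drop_duplicates().set_index("Col1")

# NaN cells of df2 are filled from the matching index labels in lookup.
# The result index is the union of both indexes, so C is included as well.
combined = df2.set_index("Col1").combine_first(lookup)
print(combined)

# Assumption: only the keys present in df2 are wanted, in df2's order.
result = combined.reindex(df2["Col1"]).reset_index()
print(result)
```

If the extra rows from df1 are acceptable, the reindex step can be dropped and reset_index called on combined directly, as in the answer.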
If you need to check for duplicates in one column only (here Col1, the key used for matching), pass it as subset:
df = df2.set_index('Col1').combine_first(df1.drop_duplicates(subset='Col1').set_index('Col1')).reset_index()
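As an alternative sketch, not taken from the answer, the same fill can be done with Series.map and fillna. This keeps exactly the rows of df2 and only touches cells that are currently NaN. The sample frames are rebuilt from the question, and the name lookup is illustrative. It assumes each Col1 value in df1 maps to a single id, as the question states.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the tables in the question.
df1 = pd.DataFrame({"Col1": ["A", "B", "C", "A"], "Col2": [1, 2, 3, 1]})
df2 = pd.DataFrame({"Col1": ["A", "B", "D"], "Col2": [np.nan, np.nan, np.nan]})

# Build a key -> id Series; dropping duplicate keys keeps its index unique,
# which Series.map requires when mapping through another Series.
lookup = df1.drop_duplicates(subset="Col1").set_index("Col1")["Col2"]

# Map each key in df2 to its id (keys missing from df1 become NaN),
# then use the mapped values only where df2.Col2 is NaN.
df2["Col2"] = df2["Col2"].fillna(df2["Col1"].map(lookup))
print(df2)
```

Because nothing is re-indexed, the row order and the default integer index of df2 are preserved, and D stays NaN since it has no match in df1.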