Replacing column values based on another dataframe in python pandas: is there a better way?


Problem description

Note: for simplicity's sake, I'm using a toy example, because copying and pasting dataframes is difficult on Stack Overflow (please let me know if there's an easy way to do this).

Is there a way to merge the values from one dataframe onto another without getting the _x, _y suffixed columns? I'd like the values in one column to replace all zero values in another column.

df1:

Name   Nonprofit    Business    Education
X      1            1           0
Y      0            1           0   <- Y and Z have zero values for Nonprofit and Education
Z      0            0           0
Y      0            1           0

df2:

Name   Nonprofit    Education
Y       1            1     <- this df has the correct values.
Z       1            1



pd.merge(df1, df2, on='Name', how='outer')

Name   Nonprofit_x    Business    Education_x     Nonprofit_y     Education_y
Y      1              1           1               1               1
Y      1              1           1               1               1
X      1              1           0               nan             nan
Z      1              1           1               1               1
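
To make the problem reproducible, here is a minimal sketch that builds the two toy frames from the tables above and runs the same merge. The values are copied from the tables; nothing else is assumed. It only demonstrates where the suffixed columns come from: pandas appends `_x` and `_y` to any non-key column that exists in both frames.

```python
import pandas as pd

# Toy frames, values taken from the tables in the question
df1 = pd.DataFrame({
    "Name": ["X", "Y", "Z", "Y"],
    "Nonprofit": [1, 0, 0, 0],
    "Business": [1, 1, 0, 1],
    "Education": [0, 0, 0, 0],
})
df2 = pd.DataFrame({
    "Name": ["Y", "Z"],
    "Nonprofit": [1, 1],
    "Education": [1, 1],
})

merged = pd.merge(df1, df2, on="Name", how="outer")

# Columns shared by both frames (other than the key) get suffixed
print(merged.columns.tolist())
# ['Name', 'Nonprofit_x', 'Business', 'Education_x', 'Nonprofit_y', 'Education_y']
```

The row for X has no match in df2, so its `_y` columns come back as NaN, which is why a plain merge still needs a second step to collapse the pairs of columns.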

In a previous post, I tried combine_first and dropna(), but these don't do the job.

I want to replace zeros in df1 with the values in df2. Furthermore, I want all rows with the same Name to be changed according to df2.

Name    Nonprofit     Business    Education
Y        1             1           1
Y        1             1           1
X        1             1           0
Z        1             0           1

(To clarify: the value in the 'Business' column where Name = Z should be 0.)

My existing solution does the following: I subset based on the names that exist in df2, and then replace those values with the correct value. However, I'd like a less hacky way to do this.

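pubunis_df = df2
sdf = df1

# str_to_regex and searchnamesre are my own helper functions (not shown here)
regex = str_to_regex(', '.join(pubunis_df.ORGS))

# presumably the rows of sdf whose ORGS value matches one of the names in df2
pubunis = searchnamesre(sdf, 'ORGS', regex)

# .ix was removed in pandas 1.0; .loc does the same label-based assignment
sdf.loc[pubunis.index, ['Education', 'Public']] = 1
searchnamesre(sdf, 'ORGS', regex)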

Recommended answer

Use the boolean mask from isin to filter the df and assign the desired row values from the rhs df. (Judging by the output, df in the snippet below is the question's df1, and df1 is the question's df2.)

In [27]:

df.loc[df.Name.isin(df1.Name), ['Nonprofit', 'Education']] = df1[['Nonprofit', 'Education']]
df
Out[27]:
  Name  Nonprofit  Business  Education
0    X          1         1          0
1    Y          1         1          1
2    Z          1         0          1
3    Y          1         1          1

[4 rows x 4 columns]
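
One caveat with the one-liner: when the right-hand side of a .loc assignment is a DataFrame, pandas aligns it on the index rather than on Name, so the result depends on how the two frames happen to be indexed. A variant that looks values up by Name explicitly might look like the sketch below. It uses the question's naming (df1 is the frame to fix, df2 holds the correct values), and the names `lookup` and `result` are only illustrative. It assumes each Name appears at most once in df2, and like the answer above it overwrites every row whose Name is in df2, not only the zeros.

```python
import pandas as pd

# Toy frames, values taken from the tables in the question
df1 = pd.DataFrame({
    "Name": ["X", "Y", "Z", "Y"],
    "Nonprofit": [1, 0, 0, 0],
    "Business": [1, 1, 0, 1],
    "Education": [0, 0, 0, 0],
})
df2 = pd.DataFrame({
    "Name": ["Y", "Z"],
    "Nonprofit": [1, 1],
    "Education": [1, 1],
})

# Index the corrections by Name (assumption: Names in df2 are unique)
lookup = df2.set_index("Name")

result = df1.copy()
for col in ["Nonprofit", "Education"]:
    # Look up the corrected value for each row's Name; Names missing from df2 give NaN
    corrected = result["Name"].map(lookup[col])
    # Fall back to the original value where df2 has no entry, and restore the integer dtype
    result[col] = corrected.fillna(result[col]).astype(result[col].dtype)

print(result)
```

Because the lookup goes through Name, both Y rows pick up the same corrected values, X is left as it was, and Business is never touched.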


08-13 18:06