How to replace certain rows with shared column values in a Pandas DataFrame?

Problem description

Let's say I have the following pandas DataFrame:

import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13], ['Bob', '#'], ['Bob', '#'], ['Bob', '#']]

# No dtype=float here: the '#' placeholders and the names cannot be cast to float
df = pd.DataFrame(data, columns=['Name', 'Age'])
print(df)
     Name Age
0    Alex  10
1     Bob  12
2  Clarke  13
3     Bob   #
4     Bob   #
5     Bob   #

So, there are odd rows in the DataFrame for Bob, namely rows 3, 4, and 5. These values are consistently #, not 12. Row 1 shows that Bob should be 12, not #.

In this example, it's straightforward to fix this with replace():

df = df.replace("#", 12)
print(df)
     Name Age
0    Alex  10
1     Bob  12
2  Clarke  13
3     Bob  12
4     Bob  12
5     Bob  12

However, this wouldn't work for larger dataframes, e.g.

     Name Age
0    Alex  10
1     Bob  12
2  Clarke  13
3     Bob   #
4     Bob   #
5     Bob   #
6  Clarke   #

Row 6 should be 6 Clarke 13.
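To make the problem concrete, here is a minimal sketch of what a blanket replace() does to the larger frame. The variable name wrong is only illustrative and is not part of the original question.

```python
import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13],
        ['Bob', '#'], ['Bob', '#'], ['Bob', '#'], ['Clarke', '#']]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# A blanket replace assigns 12 to every '#', regardless of the name on that row
wrong = df.replace('#', 12)

# Row 6 belongs to Clarke, yet it now carries Bob's age
print(wrong.loc[6, 'Name'], wrong.loc[6, 'Age'])
```

The replacement value has to depend on Name, which a single constant passed to replace() cannot express.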

How does one replace any row in Age with # with the correct integer as given in other rows, based on Name? If # exists, check other rows with the same Name value and replace #.

Recommended answer

Try this:

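# Build a Name -> Age lookup from the rows that hold a real age
d = df[df['Age'] != '#'].set_index('Name')['Age']
# Replace each name with its known age and store the result in Age
df['Age'] = df['Name'].replace(d)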

Output:

     Name Age
0    Alex  10
1     Bob  12
2  Clarke  13
3     Bob  12
4     Bob  12
5     Bob  12
6  Clarke  13
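The answer relies on each name having exactly one valid row. As a hedged variation (not the answerer's code), the sketch below swaps in Series.map, keeps one valid row per name so the lookup index stays unique, and leaves the original value in place when a name has no known age. The names valid, lookup, and filled are illustrative assumptions.

```python
import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13],
        ['Bob', '#'], ['Bob', '#'], ['Bob', '#'], ['Clarke', '#']]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# Rows that carry a real age; keep one row per name so the lookup index is unique
valid = df[df['Age'] != '#'].drop_duplicates(subset='Name')
lookup = valid.set_index('Name')['Age']

# Look up every name; a name with no known age comes back as NaN,
# so fall back to the original Age value for those rows
filled = df['Name'].map(lookup)
df['Age'] = filled.where(filled.notna(), df['Age'])
print(df)
```

Keeping the first valid row per name is a design choice for this sketch. If the valid rows for one name could disagree, that conflict would need to be resolved before building the lookup.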


08-11 15:26