Problem description
Suppose I have this table:
Type  Killed  Survived
Dog   5       2
Dog   3       4
Cat   1       7
Dog   nan     3
cow   nan     2
One of the values in [Killed] is missing for [Type] = Dog.
I want to impute the mean in [Killed] for [Type] = Dog.
My code is as follows:
- Compute the mean
df[df['Type'] == 'Dog'].mean().round()
This gives me the mean (around 2.25).
- Impute the mean (this is where the problem starts)
df.loc[(df['Type'] == 'Dog') & (df['Killed'])].fillna(2.25, inplace = True)
The code runs, but the value is not imputed; the NaN value is still there.
My question is: how do I impute the mean in [Killed] based on [Type] = Dog?
Recommended answer
This works for me:
df.loc[df['Type'] == 'Dog', 'Killed'] = df.loc[df['Type'] == 'Dog', 'Killed'].fillna(2.25)
print (df)
Type Killed Survived
0 Dog 5.00 2
1 Dog 3.00 4
2 Cat 1.00 7
3 Dog 2.25 3
4 cow NaN 2
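For reference, here is a self-contained sketch of the same fix on current pandas, where `.ix` has been removed and `.loc` takes its place. The DataFrame is rebuilt from the table in the question, and the names `mask` and `dog_mean` are illustrative, not from the original post. The attempt in the question most likely has no effect because `fillna(..., inplace=True)` is applied to the temporary copy returned by `df.loc[...]` rather than to `df` itself. Assigning the result back through `.loc` avoids that. This sketch uses the computed mean rather than the hard-coded 2.25.

```python
import numpy as np
import pandas as pd

# Rebuild the table from the question.
df = pd.DataFrame({
    "Type": ["Dog", "Dog", "Cat", "Dog", "cow"],
    "Killed": [5, 3, 1, np.nan, np.nan],
    "Survived": [2, 4, 7, 3, 2],
})

# Boolean mask selecting the Dog rows (illustrative name).
mask = df["Type"] == "Dog"

# Mean of Killed over the Dog rows; NaN is skipped by default,
# so for this table it is (5 + 3) / 2.
dog_mean = df.loc[mask, "Killed"].mean()

# Assign the filled values back through .loc so df itself is modified.
df.loc[mask, "Killed"] = df.loc[mask, "Killed"].fillna(dog_mean)

print(df)
```

The cow row is left untouched because the mask selects only the Dog rows.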
If you need to fillna by a Series, because there are two columns, Killed and Survived:
# numeric_only=True skips the non-numeric Type column (needed on pandas 2.0+)
m = df[df['Type'] == 'Dog'].mean(numeric_only=True).round()
print (m)
Killed 4.0
Survived 3.0
dtype: float64
df.loc[df['Type'] == 'Dog'] = df.loc[df['Type'] == 'Dog'].fillna(m)
print (df)
Type Killed Survived
0 Dog 5.0 2
1 Dog 3.0 4
2 Cat 1.0 7
3 Dog 4.0 3
4 cow NaN 2
If you need fillna only in the Killed column:
# if you don't need rounding, omit round()
m = round(df.loc[df['Type'] == 'Dog', 'Killed'].mean())
print (m)
4
df.loc[df['Type'] == 'Dog', 'Killed'] = df.loc[df['Type'] == 'Dog', 'Killed'].fillna(m)
print (df)
Type Killed Survived
0 Dog 5.0 2
1 Dog 3.0 4
2 Cat 1.0 7
3 Dog 4.0 3
4 cow NaN 2
You can reuse code like this:
filtered = df.loc[df['Type'] == 'Dog', 'Killed']
print (filtered)
0 5.0
1 3.0
3 NaN
Name: Killed, dtype: float64
df.loc[df['Type'] == 'Dog', 'Killed'] = filtered.fillna(filtered.mean())
print (df)
Type Killed Survived
0 Dog 5.0 2
1 Dog 3.0 4
2 Cat 1.0 7
3 Dog 4.0 3
4 cow NaN 2
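If every Type should be filled with its own mean rather than only Dog, one possible generalization is `groupby` with `transform`. This is a different technique from the masked `fillna` shown in this answer, sketched here under the assumption that the data matches the table in the question. The name `group_mean` is illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the table from the question.
df = pd.DataFrame({
    "Type": ["Dog", "Dog", "Cat", "Dog", "cow"],
    "Killed": [5, 3, 1, np.nan, np.nan],
    "Survived": [2, 4, 7, 3, 2],
})

# Mean of Killed within each Type, broadcast back to df's index
# so that every row carries the mean of its own group.
group_mean = df.groupby("Type")["Killed"].transform("mean")

# Fill each missing Killed value with the mean of its own Type.
df["Killed"] = df["Killed"].fillna(group_mean)

print(df)
```

A group with no observed values, such as cow here, has a NaN mean and therefore stays NaN.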