Problem description
I have the following dataframe and would like to fill in the missing values.
 mukey  hzdept_r  hzdepb_r  sandtotal_r  silttotal_r
425897         0        61          NaN          NaN
425897        61       152          5.3         44.7
425911         0        30         30.1         54.9
425911        30        74         17.7         49.8
425911        74        84          NaN          NaN
I want each missing value to be the average of the values for that mukey. For example, the missing values in the first row would be the averages of sandtotal_r and silttotal_r for mukey == 425897. pandas fillna doesn't seem to do the trick. Any help?
Recommended answer
Using what I just learned from a question a couple of posts below:
FYI, this solution will still leave NaNs for any mukey that has no sandtotal_r or silttotal_r values at all.
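import pandas as pd
# Read the table from the question (it must be copied to the clipboard first)
df = pd.read_clipboard()
# Index by mukey so that fillna can align the per-mukey means on the index
df1 = df.set_index('mukey')
df1.fillna(df.groupby('mukey').mean(), inplace=True)
# Turn mukey back into a regular column and display the result
df1.reset_index()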
    mukey  hzdept_r  hzdepb_r  sandtotal_r  silttotal_r
0  425897         0        61          5.3        44.70
1  425897        61       152          5.3        44.70
2  425911         0        30         30.1        54.90
3  425911        30        74         17.7        49.80
4  425911        74        84         23.9        52.35
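For comparison, here is a minimal sketch of a variant that avoids the index round trip by using groupby with transform. It is not part of the original answer. It assumes the frame is rebuilt by hand rather than read from the clipboard, and the names cols, group_means and filled are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame by hand so the sketch does not depend on the clipboard.
df = pd.DataFrame({
    "mukey": [425897, 425897, 425911, 425911, 425911],
    "hzdept_r": [0, 61, 0, 30, 74],
    "hzdepb_r": [61, 152, 30, 74, 84],
    "sandtotal_r": [np.nan, 5.3, 30.1, 17.7, np.nan],
    "silttotal_r": [np.nan, 44.7, 54.9, 49.8, np.nan],
})

# Only these columns have gaps to fill (an assumption based on the example data).
cols = ["sandtotal_r", "silttotal_r"]

# transform("mean") returns the per-mukey mean for every row, aligned with df,
# so fillna can use it directly without setting mukey as the index.
group_means = df.groupby("mukey")[cols].transform("mean")

# Work on a copy so the original frame keeps its NaNs.
filled = df.copy()
filled[cols] = filled[cols].fillna(group_means)

print(filled)
```

The same caveat applies as above: if a mukey has no non-missing values in a column, its group mean is itself NaN, so those rows stay unfilled.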