Filling in missing row values in a pandas DataFrame

Problem description

I have the following dataframe and would like to fill in missing values.

mukey   hzdept_r    hzdepb_r    sandtotal_r silttotal_r
425897      0         61        
425897      61        152          5.3         44.7
425911      0         30           30.1        54.9
425911      30        74           17.7        49.8
425911      74        84

I want each missing value to be the average of values corresponding to that mukey. In this case, e.g. the first row missing values will be the average of sandtotal_r and silttotal_r corresponding to mukey==425897. pandas fillna doesn't seem to do the trick. Any help?

Recommended answer

Using what I just learned a couple questions below....

FYI, this solution will still leave NaN for any mukey that has no sandtotal_r or silttotal_r values at all, because there is nothing to average for that group.

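import pandas as pd

# Read the table from the question (it must be on the clipboard)
df = pd.read_clipboard()

# Index by mukey so fillna can align each row with its per-mukey mean
df1 = df.set_index('mukey')

# Fill each NaN with the mean of that column for the same mukey
df1 = df1.fillna(df.groupby('mukey').mean())

# Turn mukey back into a regular column
df1.reset_index()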

    mukey  hzdept_r  hzdepb_r  sandtotal_r  silttotal_r
0  425897         0        61          5.3        44.70
1  425897        61       152          5.3        44.70
2  425911         0        30         30.1        54.90
3  425911        30        74         17.7        49.80
4  425911        74        84         23.9        52.35
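
For anyone who wants to reproduce this without the clipboard, here is a minimal sketch of an alternative that should give the same result. It is not the method used above: it swaps in groupby(...).transform("mean"), which returns the per-mukey means already aligned to the original rows, so no set_index/reset_index round trip is needed. The DataFrame is rebuilt by hand from the table in the question, with np.nan standing in for the blank cells; the names cols, group_means and filled are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question; np.nan marks the blank cells.
df = pd.DataFrame({
    "mukey": [425897, 425897, 425911, 425911, 425911],
    "hzdept_r": [0, 61, 0, 30, 74],
    "hzdepb_r": [61, 152, 30, 74, 84],
    "sandtotal_r": [np.nan, 5.3, 30.1, 17.7, np.nan],
    "silttotal_r": [np.nan, 44.7, 54.9, 49.8, np.nan],
})

# Columns whose gaps should be filled (illustrative name).
cols = ["sandtotal_r", "silttotal_r"]

# transform("mean") yields one row per original row, holding the mean
# of that row's mukey group, so it lines up with df index-for-index.
group_means = df.groupby("mukey")[cols].transform("mean")

# Fill only the missing cells; existing values are left untouched.
filled = df.copy()
filled[cols] = df[cols].fillna(group_means)

print(filled)
```

The same caveat applies here: if every value in a column is missing for some mukey, that group's mean is itself NaN, so those rows stay NaN.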
