Python dataframe: using groupby to calculate R^2 and RMSE for each value in a column

Problem description

I have the following Python dataframe:

Type    Actual  Predicted
A       4       3
A       10      18
A       13      11
B       3       10
B       4       2
B       8       33
C       20      17
C       40      33
C       87      80
C       32      30

I have the code to calculate R^2 and RMSE but I don't know how to calculate it by distinct "Type".

For now, my approach is to break the larger table into three smaller tables containing only the A, B, or C rows, calculate R^2 and RMSE on each smaller table, and then append the results back together.
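
For reference, the split-and-recombine approach described above might look roughly like the sketch below. The helper name r2_and_rmse and the variable names are hypothetical, and the metrics are computed from their standard definitions with NumPy, since the original code for them is not shown.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table above.
df = pd.DataFrame({
    "Type": ["A", "A", "A", "B", "B", "B", "C", "C", "C", "C"],
    "Actual": [4, 10, 13, 3, 4, 8, 20, 40, 87, 32],
    "Predicted": [3, 18, 11, 10, 2, 33, 17, 33, 80, 30],
})

def r2_and_rmse(actual, predicted):
    """Compute R^2 and RMSE from their definitions (hypothetical helper)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)      # residual sum of squares
    ss_tot = np.sum((actual - actual.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(ss_res / len(actual))
    return r2, rmse

# Split the table by Type, compute the metrics per piece, then recombine.
rows = []
for type_value in ["A", "B", "C"]:
    subset = df[df["Type"] == type_value]
    r2, rmse = r2_and_rmse(subset["Actual"], subset["Predicted"])
    rows.append({"Type": type_value, "R^2": r2, "RMSE": rmse})

manual_result = pd.DataFrame(rows)
print(manual_result)
```

This works, but the list of types is hard-coded and every group is filtered out of the full table separately.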

But this approach is inefficient, and I believe there should be an easier way.

Below is the format I want the grouped results to take:

Type    R^2     RMSE    
A       value   value   
B       value   value   
C       value   value   

Recommended answer

Here is the groupby approach:

import numpy as np
import pandas as pd
from sklearn.metrics import r2_score, mean_squared_error

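# Compute both metrics for the rows of a single group.
def r2_rmse(g):
    r2 = r2_score(g['Actual'], g['Predicted'])
    rmse = np.sqrt(mean_squared_error(g['Actual'], g['Predicted']))
    return pd.Series(dict(r2=r2, rmse=rmse))

# Apply the function once per distinct Type; reset_index turns Type back into a column.
your_df.groupby('Type').apply(r2_rmse).reset_index()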
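
As a self-contained check, the groupby approach can be run against the sample data from the question. This is a sketch under a few assumptions: the dataframe name df and the output column labels R^2 and RMSE are chosen here for illustration, and the metric columns are selected explicitly before apply so that the grouping column is not passed to the function, which newer pandas releases warn about.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import r2_score, mean_squared_error

# Sample data copied from the question.
df = pd.DataFrame({
    "Type": ["A", "A", "A", "B", "B", "B", "C", "C", "C", "C"],
    "Actual": [4, 10, 13, 3, 4, 8, 20, 40, 87, 32],
    "Predicted": [3, 18, 11, 10, 2, 33, 17, 33, 80, 30],
})

def r2_rmse(g):
    """Return R^2 and RMSE for one group of rows."""
    r2 = r2_score(g["Actual"], g["Predicted"])
    rmse = np.sqrt(mean_squared_error(g["Actual"], g["Predicted"]))
    # Column labels are illustrative, chosen to match the requested output format.
    return pd.Series({"R^2": r2, "RMSE": rmse})

# Select only the metric columns, apply the function per Type,
# then turn the Type index back into a regular column.
result = (
    df.groupby("Type")[["Actual", "Predicted"]]
    .apply(r2_rmse)
    .reset_index()
)
print(result)
```

With this sample data, groups A and B produce negative R^2 values, because r2_score is negative whenever the predictions fit worse than simply predicting the group's mean Actual value.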

09-27 16:52