How to inverse-transform regression predictions after Lasso and RobustScaler

Question

I'm trying to figure out how to unscale my predictions (presumably using inverse_transform) after using RobustScaler and Lasso. The data below is just an example. My actual data is much larger and more complicated, but I want to use RobustScaler (because my data has outliers) and Lasso (because my data has dozens of useless features).

Basically, if I use this model to predict anything, I want that prediction in unscaled terms. When I try this with the example data point, I get an error that seems to want me to unscale data that is the same size as the training subset (i.e., two observations). The error is: ValueError: non-broadcastable output operand with shape (1,1) doesn't match the broadcast shape (1,2)

How can I unscale just one prediction? Is this possible?

import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import RobustScaler

data = [[100, 1, 50],[500 , 3, 25],[1000 , 10, 100]]
df = pd.DataFrame(data,columns=['Cost','People', 'Supplies'])

X = df[['People', 'Supplies']]
y = df[['Cost']]

#Split
X_train,X_test,y_train,y_test = train_test_split(X,y)

#Scale data
transformer = RobustScaler().fit(X_train)
transformer.transform(X_train)

X_rtrain = RobustScaler().fit_transform(X_train)
y_rtrain = RobustScaler().fit_transform(y_train)
X_rtest = RobustScaler().fit_transform(X_test)
y_rtest = RobustScaler().fit_transform(y_test)

#Fit Train Model
lasso = Lasso()
lasso_alg = lasso.fit(X_rtrain,y_rtrain)

train_score =lasso_alg.score(X_rtrain,y_rtrain)
test_score = lasso_alg.score(X_rtest,y_rtest)

print ("training score:", train_score)
print ("test score:", test_score)

#Predict example
example = [[10,100]]
transformer.inverse_transform(lasso_alg.predict(example).reshape(-1, 1))

Recommended answer

You cannot use the same transformer object for both X and y. In your snippet, the transformer is fitted on X, which has two columns, so you get an error when you use it to inverse-transform the prediction, which has only one column. (Actually you are lucky to get an error; if your X had a single column, you would silently get nonsense.)
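The mismatch can be reproduced in isolation. The following is a minimal sketch using the question's toy values; the names `scaler_x`, `scaler_y`, and `pred`, and the value 0.5, are placeholders chosen for illustration. A RobustScaler stores one center and one scale per column it was fitted on, so a scaler fitted on two columns cannot invert a one-column array.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Toy data from the question: two feature columns, one target column
X = np.array([[1, 50], [3, 25], [10, 100]], dtype=float)  # People, Supplies
y = np.array([[100], [500], [1000]], dtype=float)         # Cost

scaler_x = RobustScaler().fit(X)
scaler_y = RobustScaler().fit(y)

# One center (median) per fitted column
print(scaler_x.center_.shape)  # (2,)
print(scaler_y.center_.shape)  # (1,)

# Placeholder for a scaled prediction: one row, one column
pred = np.array([[0.5]])

# Inverting a one-column array with the two-column scaler fails
try:
    scaler_x.inverse_transform(pred)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(type(mismatch_error).__name__)
```

The scaler fitted on y has a single center and scale, which is why a separate transformer is needed for the target.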

Something like this should work:

transformer_x = RobustScaler().fit(X_train)
transformer_y = RobustScaler().fit(y_train)
X_rtrain = transformer_x.transform(X_train)
y_rtrain = transformer_y.transform(y_train)
X_rtest = transformer_x.transform(X_test)
y_rtest = transformer_y.transform(y_test)

#Fit Train Model
lasso = Lasso()
lasso_alg = lasso.fit(X_rtrain,y_rtrain)

train_score =lasso_alg.score(X_rtrain,y_rtrain)
test_score = lasso_alg.score(X_rtest,y_rtest)

print ("training score:", train_score)
print ("test score:", test_score)

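# Scale the new observation with the X scaler before predicting,
# then map the scaled prediction back to the original Cost units
example = pd.DataFrame([[10, 100]], columns=['People', 'Supplies'])
example_scaled = transformer_x.transform(example)
transformer_y.inverse_transform(lasso.predict(example_scaled).reshape(-1, 1))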
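As an alternative sketch (not part of the original answer), scikit-learn's `make_pipeline` and `TransformedTargetRegressor` can hold both scalers, so features are scaled before the Lasso step and predictions are returned in the original units of the target. The example assumes the question's toy data and, for brevity, fits on all three rows without a train/test split; the names `model`, `example`, and `prediction` are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler

df = pd.DataFrame(
    [[100, 1, 50], [500, 3, 25], [1000, 10, 100]],
    columns=["Cost", "People", "Supplies"],
)
X = df[["People", "Supplies"]]
y = df["Cost"]

# The inner pipeline scales X and fits Lasso on the scaled features.
# The outer wrapper scales y for fitting and inverts that scaling in predict().
model = TransformedTargetRegressor(
    regressor=make_pipeline(RobustScaler(), Lasso()),
    transformer=RobustScaler(),
)
model.fit(X, y)

# A single new observation, passed in unscaled units
example = pd.DataFrame([[10, 100]], columns=["People", "Supplies"])
prediction = model.predict(example)  # already in unscaled Cost units
print(prediction)
```

With this arrangement there is no separate inverse_transform call to get wrong, because each scaler only ever sees the data it was fitted on.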

08-20 04:28