Problem Description
I am using linear regression to predict data, but I am getting totally contrasting results depending on whether I normalize or standardize the variables.
Normalization: x' = (x - xmin) / (xmax - xmin)
Z-score standardization: x' = (x - xmean) / xstd
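The two formulas above can be sketched directly in NumPy (the sample vector here is made up for illustration):

```python
import numpy as np

# Made-up sample feature vector.
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Min-max normalization: rescales the values into the [0, 1] range.
x_norm = (x - x.min()) / (x.max() - x.min())

# Z-score standardization: zero mean, unit standard deviation.
x_std = (x - x.mean()) / x.std()

print(x_norm)  # [0.   0.25 0.5  0.75 1.  ]
print(x_std)   # mean 0, std 1
```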
a) Also, when should one normalize vs. standardize?
b) How does normalization affect linear regression?
c) Is it okay if I don't normalize all the attributes/labels in the linear regression?
Thanks, Santosh
Recommended Answer
Note that the results might not necessarily be so different. You might simply need different hyperparameters for the two options to give similar results.
The ideal thing is to test what works best for your problem. If you can't afford this for some reason, most algorithms will probably benefit from standardization more so than from normalization.
See here for some examples of when one should be preferred over the other:
However, this doesn't mean that min-max scaling is not useful at all! A popular application is image processing, where pixel intensities have to be normalized to fit within a certain range (i.e., 0 to 255 for the RGB color range). Also, typical neural network algorithms require data on a 0-1 scale.
One disadvantage of normalization over standardization is that it loses some information in the data, especially about outliers.
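A small sketch of that information-loss point, using a made-up feature with one large outlier: min-max normalization squashes the bulk of the data into a narrow band near zero, while z-scores keep the outlier visibly far from the rest.

```python
import numpy as np

# Hypothetical feature with one extreme value.
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

x_norm = (x - x.min()) / (x.max() - x.min())  # bulk of the data lands below 0.04
x_std = (x - x.mean()) / x.std()              # outlier sits ~2 std devs from the mean

print(x_norm)
print(x_std)
```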
Also on the linked page, there is this picture:
As you can see, scaling clusters all the data very close together, which may not be what you want. It might cause algorithms such as gradient descent to take longer to converge to the same solution they would on a standardized data set, or it might even make it impossible.
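The gradient-descent point can be demonstrated with a small sketch on made-up data: with a single badly-scaled feature, a learning rate that converges quickly on the standardized data makes essentially no progress on the raw data within a practical iteration budget (the raw run would diverge at the larger rate, so it is forced to use a far smaller one).

```python
import numpy as np

def gd_steps(x, y, lr, tol=1e-8, max_iter=20_000):
    """Fit y ~ w*x + b by batch gradient descent; return the number of
    steps until the squared gradient norm drops below tol (or max_iter)."""
    w = b = 0.0
    n = len(x)
    for i in range(1, max_iter + 1):
        err = w * x + b - y
        gw, gb = (err @ x) / n, err.mean()
        if gw * gw + gb * gb < tol:
            return i
        w -= lr * gw
        b -= lr * gb
    return max_iter

# Made-up data: one feature spanning 0..1000.
x = np.linspace(0.0, 1000.0, 50)
y = 2.0 * x + 5.0
x_std = (x - x.mean()) / x.std()

# Standardized run converges in a few dozen steps; the raw run is stuck
# at the iteration cap because its usable learning rate is tiny.
print(gd_steps(x_std, y, lr=0.5))
print(gd_steps(x, y, lr=1e-6))
```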
"Normalizing variables" doesn't really make sense. The correct terminology is "normalizing / scaling the features". If you're going to normalize or scale one feature, you should do the same for the rest.
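Scaling every feature together is straightforward: standardize each column of the feature matrix at once rather than transforming a single column in isolation. A minimal NumPy sketch on synthetic data:

```python
import numpy as np

# Synthetic feature matrix whose three columns live on wildly different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) * np.array([1.0, 10.0, 100.0])

# Standardize all columns together (axis=0): every feature gets
# zero mean and unit standard deviation.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_scaled.mean(axis=0))  # ~0 for every column
print(X_scaled.std(axis=0))   # 1 for every column
```

With scikit-learn, the equivalent is `StandardScaler().fit_transform(X)`, which likewise scales all columns in one call.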