Problem Description

I was trying the following code and found that StandardScaler (or MinMaxScaler) and Normalizer from sklearn handle data very differently. This discrepancy makes pipeline construction more difficult. I was wondering whether this design difference is intentional.
from sklearn.preprocessing import StandardScaler, Normalizer, MinMaxScaler
For Normalizer, the data is read "horizontally" (row by row).
Normalizer(norm = 'max').fit_transform([[ 1., 1., 2., 10],
[ 2., 0., 0., 100],
[ 0., -1., -1., 1000]])
#array([[ 0.1 , 0.1 , 0.2 , 1. ],
# [ 0.02 , 0. , 0. , 1. ],
# [ 0. , -0.001, -0.001, 1. ]])
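The row-wise behavior is easy to reproduce by hand. The sketch below, using plain NumPy, divides each row by that row's maximum absolute value, which is what norm='max' does:

```python
import numpy as np

X = np.array([[ 1.,  1.,  2.,   10.],
              [ 2.,  0.,  0.,  100.],
              [ 0., -1., -1., 1000.]])

# Normalizer(norm='max') rescales each ROW by the row's largest
# absolute value, so every row ends up with a max-norm of 1.
row_max = np.abs(X).max(axis=1, keepdims=True)
X_normalized = X / row_max
print(X_normalized)
```

This reproduces the array shown above, confirming that each row is treated independently of the others.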
For StandardScaler and MinMaxScaler, the data is read "vertically" (column by column).
StandardScaler().fit_transform([[ 1., 1., 2., 10],
[ 2., 0., 0., 100],
[ 0., -1., -1., 1000]])
#array([[ 0. , 1.22474487, 1.33630621, -0.80538727],
# [ 1.22474487, 0. , -0.26726124, -0.60404045],
# [-1.22474487, -1.22474487, -1.06904497, 1.40942772]])
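The column-wise standardization can likewise be verified manually. A minimal sketch, assuming the population standard deviation (ddof=0, which is what StandardScaler uses):

```python
import numpy as np

X = np.array([[ 1.,  1.,  2.,   10.],
              [ 2.,  0.,  0.,  100.],
              [ 0., -1., -1., 1000.]])

# StandardScaler standardizes each COLUMN: subtract the column mean,
# then divide by the column's population standard deviation.
X_standardized = (X - X.mean(axis=0)) / X.std(axis=0)
print(X_standardized)
```

After the transform each column has mean 0 and standard deviation 1, which is the whole point of feature-wise standardization.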
MinMaxScaler().fit_transform([[ 1., 1., 2., 10],
[ 2., 0., 0., 100],
[ 0., -1., -1., 1000]])
#array([[0.5 , 1. , 1. , 0. ],
# [1. , 0.5 , 0.33333333, 0.09090909],
# [0. , 0. , 0. , 1. ]])
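The same column-wise reading applies to MinMaxScaler: each feature is mapped to [0, 1] using that column's own minimum and maximum. A quick NumPy sketch:

```python
import numpy as np

X = np.array([[ 1.,  1.,  2.,   10.],
              [ 2.,  0.,  0.,  100.],
              [ 0., -1., -1., 1000.]])

# MinMaxScaler rescales each COLUMN to [0, 1] via
# (x - col_min) / (col_max - col_min).
col_min = X.min(axis=0)
col_max = X.max(axis=0)
X_scaled = (X - col_min) / (col_max - col_min)
print(X_scaled)
```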
Recommended Answer

This is expected behavior, because StandardScaler and Normalizer serve different purposes. The StandardScaler works "vertically", because it...
...standardize[s] features by removing the mean and scaling to unit variance.

[...] Centering and scaling happen independently on each feature by computing the relevant statistics on the samples in the training set. Mean and standard deviation are then stored to be used on later data using the transform method.
while the Normalizer works "horizontally", because it...
...normalize[s] samples individually to unit norm.

Each sample (i.e. each row of the data matrix) with at least one non zero component is rescaled independently of other samples so that its norm (l1 or l2) equals one.
Please have a look at the scikit-learn docs (linked above) for more insight into which one better serves your purpose.
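Despite the different orientations, the two transformers compose without trouble in a Pipeline, since each step only sees the output of the previous one. A minimal sketch; the step order shown (column-wise scaling first, then row-wise normalization) is an assumption, and you should pick whichever order matches what your downstream estimator expects:

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import Normalizer, StandardScaler

X = np.array([[ 1.,  1.,  2.,   10.],
              [ 2.,  0.,  0.,  100.],
              [ 0., -1., -1., 1000.]])

# Standardize each feature column-wise, then rescale each sample
# row-wise to unit l2 norm. NOTE: this ordering is illustrative,
# not a recommendation -- it depends on your model's needs.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("normalize", Normalizer(norm="l2")),
])

X_out = pipe.fit_transform(X)
print(X_out)
```

After the final step, every row of X_out has unit l2 norm, regardless of what the earlier column-wise step did.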