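Here is my code. I am trying to normalize a dataset, but the output does not look like it has been scaled to the range 0 to 1. Am I missing something here?
The same code works on the iris dataset. Doesn't normalization always return values scaled between 0 and 1?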

# Normalize the data attributes for the Boston dataset.
# Note: load_boston was removed in scikit-learn 1.2, so this runs only on older versions.
from sklearn.datasets import load_boston
from sklearn import preprocessing
# load the Boston dataset
dataset = load_boston()
print(dataset.data.shape)
# separate the data from the target attributes
X = dataset.data
y = dataset.target
# normalize the data attributes (by default each row is scaled to unit L2 norm)
normalized_X = preprocessing.normalize(X)

normalized_X[:5]


Output:

array([[1.26388341e-05, 3.59966795e-02, 4.61957387e-03, 0.00000000e+00,
        1.07590075e-03, 1.31487871e-02, 1.30387972e-01, 8.17924550e-03,
        1.99981553e-03, 5.91945396e-01, 3.05971776e-02, 7.93726783e-01,
        9.95908132e-03],
       [5.78529889e-05, 0.00000000e+00, 1.49769546e-02, 0.00000000e+00,
        9.93520754e-04, 1.36021253e-02, 1.67140272e-01, 1.05222110e-02,
        4.23676228e-03, 5.12648235e-01, 3.77071843e-02, 8.40785474e-01,
        1.93620036e-02],
       [5.85729947e-05, 0.00000000e+00, 1.51744622e-02, 0.00000000e+00,
        1.00662274e-03, 1.54212886e-02, 1.31139977e-01, 1.06609718e-02,
        4.29263427e-03, 5.19408747e-01, 3.82044450e-02, 8.43137761e-01,
        8.64965806e-03],
       [7.10489715e-05, 0.00000000e+00, 4.78488594e-03, 0.00000000e+00,
        1.00526503e-03, 1.53599229e-02, 1.00526503e-01, 1.33059337e-02,
        6.58470542e-03, 4.87268201e-01, 4.10446638e-02, 8.66174100e-01,
        6.45301131e-03],
       [1.50596596e-04, 0.00000000e+00, 4.75453408e-03, 0.00000000e+00,
        9.98888353e-04, 1.55874565e-02, 1.18209058e-01, 1.32215305e-02,
        6.54293681e-03, 4.84177324e-01, 4.07843061e-02, 8.65630540e-01,
        1.16246177e-02]])

Best answer

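Why do you say the values are not between 0 and 1?

Normalization here does not mean min=0 and max=1. It means that each non-zero vector (each row, i.e. each sample) is scaled so that its norm (the L2 norm by default) is 1.

In other words, for each vector, the squares of its coordinates sum to 1.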
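To make the row-wise scaling concrete, here is a minimal pure-Python sketch of L2 normalization. It is not scikit-learn's implementation; the helper name `l2_normalize_rows` and the sample data are made up for illustration, and leaving all-zero rows unchanged is an assumption chosen to avoid dividing by zero.

```python
import math


def l2_normalize_rows(rows):
    """Scale each row to unit L2 norm (hypothetical helper, not sklearn's code)."""
    normalized = []
    for row in rows:
        # L2 norm: square root of the sum of squared coordinates
        norm = math.sqrt(sum(value * value for value in row))
        if norm == 0.0:
            # assumption: an all-zero row is left as it is
            normalized.append(list(row))
        else:
            normalized.append([value / norm for value in row])
    return normalized


# illustrative data only, not taken from the Boston dataset
sample = [[3.0, 4.0, 0.0], [0.0, 0.0, 0.0], [1.0, 2.0, 2.0]]
unit_rows = l2_normalize_rows(sample)

# sum of squares per row after normalization
squared_sums = [sum(value * value for value in row) for row in unit_rows]
print(unit_rows)
print(squared_sums)
```

Each non-zero row ends up with a sum of squares of 1 (up to floating-point error). The minimum and maximum of a column are not pinned to 0 and 1 by this operation.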

For example, take your last vector:

In [1]: x = [1.50596596e-04, 0.00000000e+00, 4.75453408e-03, 0.00000000e+00,
   ...:         9.98888353e-04, 1.55874565e-02, 1.18209058e-01, 1.32215305e-02,
   ...:         6.54293681e-03, 4.84177324e-01, 4.07843061e-02, 8.65630540e-01,
   ...:         1.16246177e-02]

In [2]: sum(c**2 for c in x)
Out[2]: 0.9999999993530653
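If what you actually want is each feature (column) rescaled so that its minimum is 0 and its maximum is 1, that is min-max scaling rather than vector normalization. In scikit-learn this is what `preprocessing.MinMaxScaler` does by default. The sketch below shows the idea in pure Python; the helper name `min_max_scale_columns` and the sample data are hypothetical, and mapping a constant column to 0.0 is an assumption made for this illustration.

```python
def min_max_scale_columns(rows):
    """Rescale each column to the range [0, 1] (hypothetical helper)."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    scaled = []
    for row in rows:
        scaled_row = []
        for value, low, high in zip(row, lows, highs):
            span = high - low
            # assumption: a constant column (span == 0) is mapped to 0.0
            scaled_row.append((value - low) / span if span else 0.0)
        scaled.append(scaled_row)
    return scaled


# illustrative data only, not taken from the Boston dataset
sample = [[1.0, 10.0, 5.0], [2.0, 30.0, 5.0], [4.0, 20.0, 5.0]]
scaled = min_max_scale_columns(sample)
print(scaled)
```

Here every non-constant column has a minimum of exactly 0 and a maximum of exactly 1, which is the behavior the question expected from `normalize`.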

10-07 21:52