Python ValueError: non-broadcastable output operand with shape (124,1) doesn't match the broadcast shape (124,13)

Question

I would like to normalize a training and a test data set using MinMaxScaler from sklearn.preprocessing. However, the scaler does not appear to accept my test data set.

import pandas as pd
import numpy as np

# Read in data.
df_wine = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data',
                      header=None)
df_wine.columns = ['Class label', 'Alcohol', 'Malic acid', 'Ash',
                   'Alcalinity of ash', 'Magnesium', 'Total phenols',
                   'Flavanoids', 'Nonflavanoid phenols', 'Proanthocyanins',
                   'Color intensity', 'Hue', 'OD280/OD315 of diluted wines',
                   'Proline']

# Split into train/test data.
from sklearn.model_selection import train_test_split
X = df_wine.iloc[:, 1:].values
y = df_wine.iloc[:, 0].values
X_train, y_train, X_test, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state = 0)

# Normalize features using min-max scaling.
from sklearn.preprocessing import MinMaxScaler
mms = MinMaxScaler()
X_train_norm = mms.fit_transform(X_train)
X_test_norm = mms.transform(X_test)

Running this produces a DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample. It is followed by a ValueError: operands could not be broadcast together with shapes (124,) (13,) (124,).

Reshaping the data still yields an error.

X_test_norm = mms.transform(X_test.reshape(-1, 1))

This reshape produces the error ValueError: non-broadcastable output operand with shape (124,1) doesn't match the broadcast shape (124,13).

Any input on how to fix this error would be helpful.

Recommended answer

train_test_split() returns a train/test pair for each input array, in the order the arrays were passed in. The names on the left-hand side must follow that same order so that each one is unpacked to the matching split.

With the order X_train, y_train, X_test, y_test, the names y_train and X_test end up swapped: y_train receives the 54-row test feature matrix (len(y_train)=54) and X_test receives the 124 training labels (len(X_test)=124). Passing that 1-D label array to mms.transform() is what raises the ValueError.
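
The two error messages from the question can be reproduced with NumPy alone. The sketch below assumes the scaler multiplies its input in place by a per-feature vector of length 13, which is consistent with the shapes in both messages; `scale` and `scale_in_place` are hypothetical stand-ins, not part of scikit-learn.

```python
import numpy as np

# Hypothetical stand-in for the scaler's fitted per-feature scale (13 features).
scale = np.full(13, 0.5)


def scale_in_place(data):
    """Multiply data by the per-feature scale in place and report the outcome."""
    try:
        data *= scale
        return "ok"
    except ValueError as exc:
        return str(exc)


labels_1d = np.zeros(124)                  # what X_test actually held: 124 labels
labels_column = labels_1d.reshape(-1, 1)   # the attempted fix: shape (124, 1)
features = np.zeros((54, 13))              # a correctly shaped feature matrix

print(scale_in_place(labels_1d))       # broadcast error for (124,) against (13,)
print(scale_in_place(labels_column))   # non-broadcastable output operand error
print(scale_in_place(features))        # succeeds
```

Neither reshape option can help here, because the array being scaled holds labels rather than the 13 feature columns the scaler was fitted on.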

Instead, unpack the results in the order the function returns them:

# Split into train/test data.
#                   _________________________________
#                   |       |                        \
#                   |       |                         \
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
# |          |                                      /
# |__________|_____________________________________/
# (or)
# y_train, y_test, X_train, X_test = train_test_split(y, X, test_size=0.3, random_state=0)

# Normalize features using min-max scaling.
from sklearn.preprocessing import MinMaxScaler
mms = MinMaxScaler()
X_train_norm = mms.fit_transform(X_train)
X_test_norm = mms.transform(X_test)

This yields:

X_train_norm[0]
array([ 0.72043011,  0.20378151,  0.53763441,  0.30927835,  0.33695652,
        0.54316547,  0.73700306,  0.25      ,  0.40189873,  0.24068768,
        0.48717949,  1.        ,  0.5854251 ])

X_test_norm[0]
array([ 0.72849462,  0.16386555,  0.47849462,  0.29896907,  0.52173913,
        0.53956835,  0.74311927,  0.13461538,  0.37974684,  0.4364852 ,
        0.32478632,  0.70695971,  0.60566802])
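
A few shape assertions placed right after the split can catch this kind of mix-up before it reaches the scaler. The following is a minimal sketch that uses random data in place of the wine data set (the 178 x 13 shape and the three class labels are assumptions chosen to mirror it), so it runs without a download.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the wine data: 178 samples, 13 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.random((178, 13))
y = rng.integers(1, 4, size=178)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Feature matrices must be 2-D with the same number of columns,
# and each one must have as many rows as its label array.
assert X_train.ndim == 2 and X_test.ndim == 2
assert X_train.shape[1] == X_test.shape[1]
assert len(X_train) == len(y_train) and len(X_test) == len(y_test)

# Fit on the training features only, then apply the same scaling to the test set.
mms = MinMaxScaler()
X_train_norm = mms.fit_transform(X_train)
X_test_norm = mms.transform(X_test)
print(X_train_norm.shape, X_test_norm.shape)
```

With the unpacking order from the question, the first assertion would fail on the 1-D label array, pointing at the split rather than at the scaler.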

