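I want to scale a column of a data frame to values between 0 and 1. For this I use MinMaxScaler, which works fine but sends me mixed messages. This is what I am doing: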

x = df['Activity'].values #returns a numpy array
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(x)
df['Activity'] = pd.Series(x_scaled)

This code produces the following warning:
DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
OK, so apparently passing 1d arrays will soon be a thing of the past, so let's try reshaping it as suggested:
x = df['Activity'].values.reshape(-1, 1)

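Now the code does not even run: it raises Exception: Data must be 1-dimensional. So I am confused. 1d input is about to be deprecated, but the data must also be 1d?? How can I do this safely? What is the problem here?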

Edit, as requested by @sascha
x looks like this:
array([ 0.00568953,  0.00634314,  0.00718003, ...,  0.01976002,
        0.00575024,  0.00183782])

And after reshaping:
array([[ 0.00568953],
       [ 0.00634314],
       [ 0.00718003],
       ...,
       [ 0.01976002],
       [ 0.00575024],
       [ 0.00183782]])

The full warning:
/usr/local/lib/python3.5/dist-packages/sklearn/preprocessing/data.py:321: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  warnings.warn(DEPRECATION_MSG_1D, DeprecationWarning)
/usr/local/lib/python3.5/dist-packages/sklearn/preprocessing/data.py:356: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  warnings.warn(DEPRECATION_MSG_1D, DeprecationWarning)

The error I get when I reshape:
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-132-df180aae2d1a> in <module>()
      2 min_max_scaler = preprocessing.MinMaxScaler()
      3 x_scaled = min_max_scaler.fit_transform(x)
----> 4 telecom['Activity'] = pd.Series(x_scaled)

/usr/local/lib/python3.5/dist-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    225             else:
    226                 data = _sanitize_array(data, index, dtype, copy,
--> 227                                        raise_cast_failure=True)
    228
    229                 data = SingleBlockManager(data, index, fastpath=True)

/usr/local/lib/python3.5/dist-packages/pandas/core/series.py in _sanitize_array(data, index, dtype, copy, raise_cast_failure)
   2918     elif subarr.ndim > 1:
   2919         if isinstance(data, np.ndarray):
-> 2920             raise Exception('Data must be 1-dimensional')
   2921         else:
   2922             subarr = _asarray_tuplesafe(data, dtype=dtype)

Exception: Data must be 1-dimensional

Best answer

You can simply drop the pd.Series call:

import pandas as pd
from sklearn import preprocessing
df = pd.DataFrame({'Activity': [ 0.00568953,  0.00634314,  0.00718003,
                                0.01976002, 0.00575024,  0.00183782]})
x = df['Activity'].values.reshape(-1, 1) #returns a numpy array
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(x)
df['Activity'] = x_scaled
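
The two messages come from two different libraries: the scaler wants 2-D input, while pd.Series only accepts 1-D data. The following is a minimal sketch of that mismatch, not part of the original answer. The variable names (values, series_error, flat) are illustrative, and the exact exception type is assumed to vary across pandas versions.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

values = np.array([0.00568953, 0.00634314, 0.00718003,
                   0.01976002, 0.00575024, 0.00183782])

# The scaler expects 2-D input of shape (n_samples, n_features).
x = values.reshape(-1, 1)
x_scaled = MinMaxScaler().fit_transform(x)
print(x.shape, x_scaled.shape)  # both are (6, 1)

# pd.Series only accepts 1-D data, so the 2-D result is rejected.
try:
    pd.Series(x_scaled)
    series_error = None
except Exception as exc:  # exact exception type depends on the pandas version
    series_error = exc
print(series_error)

# Flattening back to 1-D gives something pd.Series accepts.
flat = x_scaled.ravel()
print(flat.shape)  # (6,)
```

So the reshape is needed only on the way into the scaler. The result has to be flattened, or assigned to the column directly, on the way back into the data frame.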

Or you can explicitly take the first column of x_scaled:
df['Activity'] = pd.Series(x_scaled[:, 0])
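
One caveat with the pd.Series variant: a new Series gets a default 0..n-1 index, and column assignment aligns on index labels. If the data frame's index is not the default one, for instance after filtering rows, the assigned values may come back as NaN. The sketch below illustrates this with a hypothetical index of 10..15. The names misaligned and aligned are illustrative only.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical frame with a non-default index, e.g. after filtering rows.
df = pd.DataFrame({'Activity': [0.00568953, 0.00634314, 0.00718003,
                                0.01976002, 0.00575024, 0.00183782]},
                  index=[10, 11, 12, 13, 14, 15])

x = df['Activity'].values.reshape(-1, 1)
x_scaled = MinMaxScaler().fit_transform(x)

# A bare pd.Series is labelled 0..5; none of these match 10..15,
# so label alignment fills the column with NaN.
misaligned = df.copy()
misaligned['Activity'] = pd.Series(x_scaled[:, 0])

# Reusing the frame's own index keeps every row.
aligned = df.copy()
aligned['Activity'] = pd.Series(x_scaled[:, 0], index=df.index)

print(misaligned['Activity'].isna().all())  # True
print(aligned['Activity'].isna().any())     # False
```

Assigning the plain array (x_scaled[:, 0] or x_scaled.ravel()) avoids the issue entirely, because arrays are assigned by position and not by label.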

For more on the Python/sklearn preprocessing.MinMaxScaler 1d deprecation, see the similar question on Stack Overflow: https://stackoverflow.com/questions/40684734/
