Problem description
I am attempting to cluster a set of data points that are represented as a sparse scipy matrix, X. That is,
>>> print type(X)
<class 'scipy.sparse.csr.csr_matrix'>
>>> print X.shape
(57, 1038)
>>> print X[0]
(0, 223) 0.471313296962
(0, 420) 0.621222153695
(0, 1030) 0.442688836467
(0, 124) 0.442688836467
When I feed this matrix into an sklearn.mixture.GMM model, however, it raises the following ValueError:
File "/Library/Python/2.7/site-packages/sklearn/mixture/gmm.py", line 423, in fit
X = np.asarray(X, dtype=np.float)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/core/numeric.py", line 235, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
However, I have been able to make the sklearn.cluster.KMeans model work perfectly on the same sparse matrix X.
Some other hopefully useful info: scipy version = 0.11.0, sklearn version = 0.14.1
Any ideas on what is going wrong? Thanks in advance!
Recommended answer
GMMs don't support sparse matrix input, while KMeans does. If an estimator supports sparse matrices, this is always explicitly stated in the docstring for the relevant method.
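One possible workaround, sketched here rather than taken from the original answer, is to convert the sparse matrix to a dense array before fitting. The example makes several assumptions: it uses a recent scikit-learn, where sklearn.mixture.GMM has been replaced by sklearn.mixture.GaussianMixture; it builds a synthetic matrix with the same shape as X in the question; and the values of n_components, covariance_type and reg_covar are illustrative choices, not recommendations. A 57 x 1038 matrix is small enough to densify; for much larger data this may not fit in memory.

```python
import numpy as np
from scipy import sparse
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the question's X (assumption: same shape, CSR format).
X = sparse.random(57, 1038, density=0.01, format="csr", random_state=0)

# The mixture model expects a dense array, so convert explicitly.
X_dense = X.toarray()

# Illustrative settings: diagonal covariances keep the parameter count
# manageable when there are far more features than samples.
gmm = GaussianMixture(
    n_components=3,
    covariance_type="diag",
    reg_covar=1e-3,
    random_state=0,
)
labels = gmm.fit_predict(X_dense)

print(type(X_dense), X_dense.shape)
print(labels.shape)
```

With this few samples and this many features, reducing the dimensionality first (for example with sklearn.decomposition.TruncatedSVD, which does accept sparse input) may give more meaningful clusters than fitting on the raw dense matrix.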