Question
In Python's sklearn KMeans (see documentation), I was wondering what happens internally when passing an ndarray of shape (n, n_features) to the init parameter when n < n_clusters:
- Does it drop the given centroids and just start a kmeans++ initialization, which is the default choice for the init parameter? (PDF paper kmeans++) (How does Kmeans++ work)
- Does it keep the given centroids and fill in the remaining centroids using kmeans++?
- Does it keep the given centroids and fill in the remaining centroids with random values?
I expected this method to issue a warning in this case, but it does not. That's why I need to know how it handles this.
Recommended answer
If you give it a mismatching init, it will adjust the number of clusters, as you can see from the source. This is not documented, and I would consider it a bug. I'll propose a fix.
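One way to see what a particular scikit-learn installation does with a mismatching init is to try it directly. The following is only a minimal sketch under stated assumptions: the data is synthetic, and probe_init_mismatch is a hypothetical helper written for this illustration, not part of scikit-learn. The behavior depends on the installed version: the answer above describes a silent adjustment of the number of clusters, while other releases may instead reject the array with a ValueError, so the sketch reports either outcome rather than assuming one.

```python
import warnings

import numpy as np
from sklearn.cluster import KMeans


def probe_init_mismatch(n_clusters=3, n_init_rows=2, seed=0):
    """Fit KMeans with fewer init rows than n_clusters and report what happens.

    Hypothetical helper for illustration only; not a scikit-learn API.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(60, 2))      # synthetic data, 2 features
    init = X[:n_init_rows].copy()     # shape (n, n_features) with n < n_clusters

    km = KMeans(n_clusters=n_clusters, init=init, n_init=1)
    try:
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")  # silence unrelated warnings
            km.fit(X)
    except ValueError as exc:
        # This version rejects the mismatching init outright.
        return {"raised": True, "message": str(exc), "n_centers": None}

    # This version accepted the array; report how many centers it ended up with.
    return {
        "raised": False,
        "message": None,
        "n_centers": km.cluster_centers_.shape[0],
    }


result = probe_init_mismatch()
print(result)
```

If the result shows raised as False and n_centers equal to the number of init rows, the installed version adjusted the number of clusters to match the array, as described in the answer. Checking init.shape[0] == n_clusters yourself before calling fit avoids depending on either behavior.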