Question
I am new to machine learning and am wondering what the difference is between kmeans and kmeans2 in scipy. According to the docs, both of them use the k-means algorithm, so how do I choose between them?
Recommended answer
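Based on the documentation, kmeans2 is the closer match to the textbook k-means algorithm: it returns both the centroids and a cluster label for each observation, and its minit parameter lets you change how the centroids are initialized. Note that in current SciPy releases it runs for a fixed number of iterations (the iter parameter, 10 by default) rather than testing for convergence, so it only reaches a local optimum if that many iterations are enough.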
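The kmeans function terminates as soon as the change in distortion between successive iterations falls to or below a threshold (the thresh parameter), so it may stop before reaching an exact local optimum. Further, its goal is to generate a codebook to map feature vectors to. When k is given as an integer, it runs k-means several times from different random starting centroids (here the iter parameter sets the number of runs, not the number of iterations) and returns the codebook with the lowest "distortion" together with that distortion value, so the result is not necessarily the one from the last run. It does not return cluster labels; use vq to assign observations to codebook entries. The documentation goes into more specifics.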
If you just want to run k-means as an algorithm, pick kmeans2. If you just want a codebook, pick kmeans.
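To make the difference concrete, here is a minimal sketch that calls both functions on the same data. The two-blob dataset and all variable names are made up for illustration, and the parameter values are just the documented defaults written out; exact centroid values will vary from run to run because both functions initialize randomly.

```python
import numpy as np
from scipy.cluster.vq import kmeans, kmeans2, vq

# Hypothetical toy data: two well-separated 2-D blobs of 50 points each.
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])

# kmeans: here `iter` is the number of independent runs; the codebook with
# the lowest distortion (mean distance to the nearest centroid) is returned.
codebook, distortion = kmeans(data, 2, iter=20, thresh=1e-5)

# kmeans does not return labels, so map observations to the codebook with vq.
vq_labels, vq_dists = vq(data, codebook)

# kmeans2: here `iter` is the number of k-means iterations in a single run;
# `minit` selects the initialization method. Labels are returned directly.
centroids, labels = kmeans2(data, 2, iter=10, minit='++')

print("kmeans codebook:\n", codebook)
print("kmeans distortion:", distortion)
print("kmeans2 centroids:\n", centroids)
print("kmeans2 label counts:", np.bincount(labels))
```

If the features have very different scales, the SciPy documentation suggests normalizing them first with scipy.cluster.vq.whiten before calling either function.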