Problem description
Trying to use affinity propagation for a simple clustering task:
from sklearn.cluster import AffinityPropagation
c = [[0], [0], [0], [0], [0], [0], [0], [0]]
af = AffinityPropagation(affinity='euclidean').fit(c)
print(af.labels_)
I get this strange result:
[0 1 0 1 2 1 1 0]
I would expect to have all samples in the same cluster, like in this case:
c = [[0], [0], [0]]
af = AffinityPropagation(affinity='euclidean').fit(c)
print(af.labels_)
which indeed puts all samples in the same cluster:
[0 0 0]
What am I missing?
Thanks
Recommended answer
I believe this is because your problem is essentially ill-posed (you pass lots of the same point to an algorithm which is trying to find similarity between different points). AffinityPropagation is doing matrix math under the hood, and your similarity matrix (which is all zeros) is nastily degenerate. In order to not error out, the implementation adds a small random matrix to the similarity matrix, preventing the algorithm from quitting when it encounters two of the same point.
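To make the degeneracy concrete, here is a minimal pure-Python sketch of the idea, not scikit-learn's actual code. It assumes the 'euclidean' affinity means negative squared Euclidean distance, and the helpers similarity_matrix and add_jitter, along with the noise scale, are hypothetical names and values picked only for illustration.

```python
import random


def similarity_matrix(points):
    """Negative squared Euclidean distance between every pair of points."""
    return [
        [-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
        for p in points
    ]


def add_jitter(matrix, scale=1e-12, seed=0):
    """Return a copy with tiny random noise added to every entry.

    Hypothetical helper: the scale is arbitrary and only mimics the idea
    of perturbing a degenerate similarity matrix.
    """
    rng = random.Random(seed)
    return [[value + scale * rng.gauss(0.0, 1.0) for value in row]
            for row in matrix]


c = [[0], [0], [0], [0], [0], [0], [0], [0]]

S = similarity_matrix(c)
# Identical points give a similarity matrix where every entry is zero,
# so nothing distinguishes one candidate exemplar from another.
all_equal = all(value == 0 for row in S for value in row)

S_noisy = add_jitter(S)
# After the perturbation the entries are no longer all equal, so ties
# get broken by noise rather than by any structure in the data.
distinct_values = len({value for row in S_noisy for value in row})

print(all_equal)
print(distinct_values > 1)
```

With every entry tied, whatever labels come out are decided by that noise, which would explain why the eight-point case splits into seemingly arbitrary clusters.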