Question
How can I measure accuracy in hierarchical clustering (single linkage) in R with 2 clusters? Here is my code:
> dcdata = read.csv("kkk.txt")
> target = dcdata[,3]
> dcdata = dcdata[,1:2]
> d = dist(dcdata)
> hc_single = hclust(d, method="single")
> plot(hc_single)
> clusters = cutree(hc_single, k=2)
> print(clusters)
Thanks!
Recommended answer
Accuracy is not quite the right term here, but I guess you want to see whether hierarchical clustering gives you clusters that coincide with your labels. As an example, I use the iris dataset, with setosa vs. others as the target:
data = iris
target = ifelse(data$Species=="setosa","setosa","others")
table(target)
others setosa
   100     50
data = data[,1:4]
d = dist(data)
hc_single = hclust(d,method="single")
plot(hc_single)
It looks like there are two major clusters. Now we check how the target labels are distributed across the dendrogram:
library(dendextend)
dend <- as.dendrogram(hc_single)
COLS = c("turquoise","orange")
names(COLS) = unique(target)
dend <- color_labels(dend, col = COLS[target[labels(dend)]])
plot(dend)
Now, as you did, we cut the tree to get the clusters:
clusters = cutree(hc_single, k=2)
table(clusters,target)
        target
clusters others setosa
       1      0     50
       2    100      0
You get a perfect separation here. All the data points in cluster 1 are setosa and all those in cluster 2 are not setosa. So you can think of it as 100% accuracy, but I would be careful about using that term for a clustering result.
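One reason for that caution, as far as I can tell, is that cluster numbers are arbitrary: cluster 1 is not tied to any particular class, so an "accuracy" only makes sense after deciding which cluster stands for which label. Below is a minimal sketch for the two-cluster, two-class case. The `truth` and `clusters` vectors are made-up toy values for illustration, not taken from the question's data.

```r
# Toy labels and cluster assignments, invented for this illustration
truth    <- c("a", "a", "a", "b", "b", "b")
clusters <- c(2, 2, 2, 1, 1, 2)

# Rows are clusters (1, 2), columns are classes (a, b)
tab <- table(clusters, truth)

# Accuracy if cluster 1 means "a" and cluster 2 means "b"
acc_direct <- (tab[1, 1] + tab[2, 2]) / sum(tab)

# Accuracy if cluster 1 means "b" and cluster 2 means "a"
acc_swapped <- (tab[1, 2] + tab[2, 1]) / sum(tab)

# Report the assignment that agrees best with the labels
best_acc <- max(acc_direct, acc_swapped)
print(best_acc)
```

The same clustering scores very differently depending on the assignment, so the number is only meaningful for the best-matching one.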
You can roughly calculate the agreement like this:
Majority_class = tapply(factor(target),clusters,function(i)names(which.max(table(i))))
This tells you, for each cluster, which class is the majority. From there we can see how often the majority class of a point's cluster agrees with its actual label:
mean(Majority_class[clusters] == target)
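The same quantity can also be read directly off the contingency table: take the largest count in each cluster's row, add them up, and divide by the number of points. This is often called cluster purity. The sketch below is one way to write it in base R; `cluster_purity` is a hypothetical helper name used only for this illustration, not a function from any package.

```r
# cluster_purity() is a hypothetical helper, defined here for illustration only
cluster_purity <- function(clusters, target) {
  tab <- table(clusters, target)
  # For each cluster (row), count the points belonging to its majority class
  sum(apply(tab, 1, max)) / sum(tab)
}

# Same setup as the iris example: single linkage, cut into 2 clusters
d <- dist(iris[, 1:4])
hc_single <- hclust(d, method = "single")
clusters <- cutree(hc_single, k = 2)
target <- ifelse(iris$Species == "setosa", "setosa", "others")

purity <- cluster_purity(clusters, target)
print(purity)
```

Note that purity lets several clusters share the same majority class, so it tends to rise as k increases and is best compared across clusterings with the same number of clusters.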