HDBSCAN in Python: choosing the number of clusters

Question

Is it possible to select the number of clusters in the HDBSCAN algorithm in Python? Or is the only way to play around with input parameters such as alpha and min_cluster_size?

Thanks.

UPDATE: here is the code that uses fcluster with hdbscan:

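import hdbscan
from scipy.cluster.hierarchy import fcluster

# X is the data to cluster, an array of shape (n_samples, n_features)
clusterer = hdbscan.HDBSCAN()
clusterer.fit(X)
# Export the single linkage tree as a SciPy-style linkage matrix
Z = clusterer.single_linkage_tree_.to_numpy()
# Cut the tree into at most 2 flat clusters
labels = fcluster(Z, 2, criterion='maxclust')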

Recommended answer

If you explicitly need a fixed number of clusters, the closest thing to managing that is to use the cluster hierarchy and perform a flat cut through it at the level that gives you the desired number of clusters. That does involve working with one of the tree objects that HDBSCAN exposes and getting your hands a little dirty, but it can be done.
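As a rough illustration of such a flat cut, the sketch below takes a SciPy-style linkage matrix, cuts it with fcluster, and relabels very small clusters as noise (-1) in the spirit of HDBSCAN's labelling. The helper name flat_cut and its min_cluster_size default are hypothetical, not part of hdbscan. So that the example runs without a fitted clusterer, it builds a stand-in tree with SciPy's plain single linkage. With HDBSCAN you would pass the matrix from clusterer.single_linkage_tree_.to_numpy() instead, as in the question's code.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def flat_cut(Z, n_clusters, min_cluster_size=5):
    """Cut linkage matrix Z into at most n_clusters flat clusters.

    Hypothetical helper: clusters with fewer than min_cluster_size points
    are relabelled -1 (noise); the remaining clusters are numbered 0, 1, ...
    """
    raw = fcluster(Z, n_clusters, criterion="maxclust")  # labels start at 1
    ids, counts = np.unique(raw, return_counts=True)
    labels = np.full(raw.shape, -1, dtype=int)
    next_label = 0
    for cluster_id, count in zip(ids, counts):
        if count >= min_cluster_size:
            labels[raw == cluster_id] = next_label
            next_label += 1
    return labels


# Stand-in data: two well-separated blobs of 50 points each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(50, 2)),
    rng.normal(5.0, 0.3, size=(50, 2)),
])

# Stand-in tree (assumption): plain single linkage from SciPy.
# With a fitted HDBSCAN clusterer, use instead:
#   Z = clusterer.single_linkage_tree_.to_numpy()
Z = linkage(X, method="single")

labels = flat_cut(Z, n_clusters=2)
```

Because criterion='maxclust' yields at most the requested number of clusters, and the size filter can discard some of them, the result may contain fewer clusters than requested. Single linkage cuts in particular tend to split off isolated points first, so it is worth checking the label counts before relying on the result.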

