Question
I have two questions, to be precise. First, I would like to know whether there is an easy way to adapt the Markov Cluster Algorithm so that I can specify in advance how many clusters I want at the end. If not, which similar algorithm would you recommend?
Second, how should overlapping clusters be dealt with in the Markov world?
Recommended answer
1). There is no easy way to adapt the MCL algorithm to output a specified number of clusters. (Note: its name is 'Markov cluster algorithm', without the 'ing'. Many people verbalise it as in 'doing Markov clustering', which is fine.) In my opinion, not having to specify the number of clusters is a highly desirable feature 99.99% of the time. If I were to do what you want, I would generate 4 or 5 clusterings at different levels of granularity (say, by setting the MCL inflation parameter to 1.4, 2.0, 3.0, 4.0 and 6.0, although it could be worthwhile to run a few more and pick among them based on the distribution of cluster sizes), then unify them in a hierarchical clustering (the program 'clm close' can do that). After that, one could traverse the tree and try to find an optimal clustering of the desired size. This obviously requires significant effort. I have done something similar, but not quite the same, in the past.
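To make the inflation sweep concrete, here is a minimal sketch in Python. It is a toy dense NumPy version of MCL, not the micans.org implementation, and it simplifies the procedure above: instead of merging the clusterings into a hierarchy and traversing the tree, it just keeps the run whose cluster count is closest to the target. The names `mcl` and `pick_clustering`, the tolerance, and the example graph are all assumptions made for illustration.

```python
import numpy as np


def mcl(adj, inflation=2.0, max_iter=100, tol=1e-6):
    """Toy dense MCL: returns clusters as a sorted list of node-index tuples."""
    n = len(adj)
    # Add self-loops, then scale every column to sum to 1 (column-stochastic).
    m = np.asarray(adj, dtype=float) + np.eye(n)
    m /= m.sum(axis=0, keepdims=True)
    for _ in range(max_iter):
        prev = m
        m = m @ m              # expansion: flow spreads along longer paths
        m = m ** inflation     # inflation: strong flow is boosted, weak flow fades
        m /= m.sum(axis=0, keepdims=True)
        if np.allclose(m, prev, atol=tol):
            break
    # Rows that still carry flow belong to attractors; the columns a row
    # reaches form one cluster. Identical rows collapse via the set.
    clusters = {tuple(np.flatnonzero(row > tol).tolist()) for row in m}
    clusters.discard(())
    return sorted(clusters)


def pick_clustering(adj, target, inflations=(1.4, 2.0, 3.0, 4.0, 6.0)):
    """Run MCL once per inflation value; keep the run closest to `target` clusters."""
    results = {r: mcl(adj, inflation=r) for r in inflations}
    best = min(inflations, key=lambda r: abs(len(results[r]) - target))
    return best, results[best]


if __name__ == "__main__":
    # Hypothetical example graph: two triangles joined by one bridge edge (2-3).
    graph = [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 1, 0, 0],
        [0, 0, 1, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ]
    inflation, clusters = pick_clustering(graph, target=2)
    print(inflation, clusters)
```

Nothing guarantees that any inflation value yields exactly the requested number of clusters, which is why the sketch settles for the closest count. On graphs of realistic size, the sparse implementation from micans.org would be the better choice for the individual runs.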
2). Overlapping clusters produced by MCL are extremely rare, and they are always the result of symmetry in the input graph. The standard MCL implementation that most people use (from http://micans.org/mcl/) will remove overlap. In my opinion this is not a concern. Disclaimer: I authored MCL.
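If you use an implementation that does not remove overlap for you, a check along the following lines may help. This is only a sketch: `overlapping_nodes` and `remove_overlap` are hypothetical helpers, and the first-come rule used to resolve overlap is an arbitrary choice made for illustration, not necessarily what the standard MCL implementation does.

```python
from collections import Counter


def overlapping_nodes(clusters):
    """Return the sorted nodes that appear in more than one cluster."""
    counts = Counter(node for cluster in clusters for node in cluster)
    return sorted(node for node, k in counts.items() if k > 1)


def remove_overlap(clusters):
    """Keep each node only in the first cluster that lists it (arbitrary rule)."""
    seen = set()
    result = []
    for cluster in clusters:
        kept = tuple(node for node in cluster if node not in seen)
        seen.update(kept)
        if kept:  # drop clusters left empty after removing shared nodes
            result.append(kept)
    return result
```

For example, with the clusters `(0, 1, 2)` and `(2, 3, 4)`, node 2 is reported as overlapping and stays only in the first cluster.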