I am using the kmodes Python library. Can someone explain what these parameters mean?

Link:
https://github.com/nicodv/kmodes#huang97

from kmodes import kmodes

km = kmodes.KModes(n_clusters=4, init='Huang', n_init=5, verbose=1)

I know n_clusters is the number of clusters the data is grouped into, but what do the other parameters do?

Best answer

From the source code:

    Parameters
    -----------
    n_clusters : int, optional, default: 8
        The number of clusters to form as well as the number of
        centroids to generate.
    max_iter : int, default: 300
        Maximum number of iterations of the k-modes algorithm for a
        single run.
    cat_dissim : func, default: matching_dissim
        Dissimilarity function used by the algorithm for categorical variables.
        Defaults to the matching dissimilarity function.
    init : {'Huang', 'Cao', 'random' or an ndarray}, default: 'Cao'
        Method for initialization:
        'Huang': Method in Huang [1997, 1998]
        'Cao': Method in Cao et al. [2009]
        'random': choose 'n_clusters' observations (rows) at random from
        data for the initial centroids.
        If an ndarray is passed, it should be of shape (n_clusters, n_features)
        and gives the initial centroids.
    n_init : int, default: 10
        Number of times the k-modes algorithm will be run with different
        centroid seeds. The final results will be the best output of
        n_init consecutive runs in terms of cost.
    verbose : int, optional
        Verbosity mode.

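So init is simply the method used to choose the initial centroids, and n_init is the number of times the algorithm is run from different starting centroids; the run with the lowest cost is kept as the final result. In the call from the question, init='Huang' and n_init=5 override the defaults 'Cao' and 10.
verbose only controls how much output is printed to standard output (for example, which stage the algorithm has reached).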
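
To make the role of n_init and verbose more concrete, here is a minimal pure-Python sketch of the "run several times, keep the lowest cost" idea. It is not the kmodes implementation: the helper names (matching_dissim, column_modes, run_once, fit_best), the seed argument, and the toy data are made up for illustration, only the 'random' style of initialization is imitated, and the Huang and Cao methods are not reproduced. It assumes max_iter is at least 1.

```python
import random
from collections import Counter


def matching_dissim(a, b):
    """Count the positions where two categorical rows differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent value in each column of a non-empty list of rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def run_once(data, n_clusters, max_iter, rng):
    """One k-modes-style run, starting from randomly sampled rows."""
    centroids = rng.sample(data, n_clusters)
    labels = None
    for _ in range(max_iter):
        # Assign every row to its nearest centroid.
        new_labels = [
            min(range(n_clusters), key=lambda k: matching_dissim(row, centroids[k]))
            for row in data
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Move each centroid to the per-column mode of its members.
        for k in range(n_clusters):
            members = [row for row, lab in zip(data, labels) if lab == k]
            if members:
                centroids[k] = column_modes(members)
    cost = sum(matching_dissim(row, centroids[lab]) for row, lab in zip(data, labels))
    return cost, labels, centroids


def fit_best(data, n_clusters, n_init=10, max_iter=300, verbose=0, seed=0):
    """Run run_once n_init times and keep the run with the lowest cost."""
    rng = random.Random(seed)  # hypothetical seed argument, for repeatability
    best = None
    for run in range(n_init):
        result = run_once(data, n_clusters, max_iter, rng)
        if verbose:
            # With verbose on, report progress on standard output.
            print(f"run {run + 1}/{n_init}: cost = {result[0]}")
        if best is None or result[0] < best[0]:
            best = result
    return best


# Toy categorical data, invented for this example.
data = [
    ("a", "x", "p"), ("a", "x", "q"), ("a", "y", "p"),
    ("b", "z", "r"), ("b", "z", "s"), ("c", "z", "r"),
]
cost, labels, centroids = fit_best(data, n_clusters=2, n_init=5, verbose=1)
```

With verbose=1 the sketch prints one line per run; with verbose=0 it prints nothing. Raising n_init costs more time but lowers the chance of ending up with a poor result from an unlucky starting point.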

10-08 06:40