What is the difference between class_weight=None and class_weight='auto' in scikit-learn's SVM?

Question

In the scikit-learn SVM classifier, what is the difference between class_weight=None and class_weight='auto'?

The documentation gives the signature as:

class sklearn.svm.SVC(C=1.0, kernel='rbf', degree=3, gamma=0.0, coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, random_state=None)

But what is the advantage of using auto mode? I couldn't understand its implementation.

Recommended answer

This happens in the class_weight.py file:

elif class_weight == 'auto':
    # Find the weight of each class as present in y.
    le = LabelEncoder()
    y_ind = le.fit_transform(y)
    if not all(np.in1d(classes, le.classes_)):
        raise ValueError("classes should have valid labels that are in y")

    # inversely proportional to the number of samples in the class
    recip_freq = 1. / bincount(y_ind)
    weight = recip_freq[le.transform(classes)] / np.mean(recip_freq)

This means that each class you have (in classes) gets a weight equal to 1 divided by the number of times that class appears in your data (y), so classes that appear more often get lower weights. Each weight is then divided by the mean of all the inverse class frequencies, so the weights average to 1 when classes covers every label in y.
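To make the arithmetic concrete, here is a minimal pure-Python sketch of the same heuristic. The helper name auto_class_weights and the toy labels are assumptions for illustration only; this is not a scikit-learn function, just a re-statement of the snippet above without NumPy.

```python
from collections import Counter


def auto_class_weights(y, classes):
    """Hypothetical helper mirroring the 'auto' branch shown above."""
    counts = Counter(y)  # number of samples per label in y
    if any(c not in counts for c in classes):
        raise ValueError("classes should have valid labels that are in y")
    # Inverse frequency of every label present in y
    recip_freq = {label: 1.0 / n for label, n in counts.items()}
    # Normalise by the mean inverse frequency over all labels in y
    mean_recip = sum(recip_freq.values()) / len(recip_freq)
    return {c: recip_freq[c] / mean_recip for c in classes}


# Toy data (assumed): six samples of class 0, two samples of class 1
y = [0, 0, 0, 0, 0, 0, 1, 1]
weights = auto_class_weights(y, classes=[0, 1])
print(weights)  # approximately {0: 0.5, 1: 1.5}
```

With this toy data the inverse frequencies are 1/6 and 1/2, their mean is 1/3, and dividing by that mean gives roughly 0.5 for the frequent class and 1.5 for the rare one.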

The advantage is that you no longer have to worry about setting the class weights yourself: this should already be good for most applications.

If you look a little higher up in the source code, for None, weight is filled with ones, so each class gets equal weight.
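As a rough illustration of what equal weights mean in practice, the sketch below builds the all-ones weights and scales a penalty C by them. The helper names are hypothetical. The scaling rule (class i is trained with class_weight[i] * C) is taken from the SVC documentation rather than from the answer itself, so treat that part as an assumption about how the weights are consumed.

```python
def uniform_class_weights(classes):
    """Hypothetical helper: class_weight=None gives every class weight 1.0."""
    return {c: 1.0 for c in classes}


def effective_penalty(C, class_weights):
    """Per-class penalty, assuming class i is trained with class_weight[i] * C."""
    return {c: C * w for c, w in class_weights.items()}


none_weights = uniform_class_weights([0, 1])
penalties = effective_penalty(1.0, none_weights)
print(penalties)  # {0: 1.0, 1: 1.0}
```

Under this reading, None leaves C unchanged for every class, while unequal weights (for example 0.5 and 1.5) would penalise mistakes on the rarer class more heavily.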
