This article explains how to plot training-score and validation-score curves against the additive smoothing parameter alpha in Python with scikit-learn.

Problem description


I am using k-fold cross validation to compute the optimal value of Additive Smoothing parameter alpha. Also, I want to plot the curves of training accuracy and validation accuracy against the values of alpha. I wrote a code for that:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import cross_val_score

alphas = list(np.arange(0.0001, 1.5000, 0.0001))

# empty lists that store cv scores and training scores
cv_scores = []
training_scores = []

# perform k-fold cross-validation for each alpha
for alpha in alphas:
    naive_bayes = MultinomialNB(alpha=alpha)
    scores = cross_val_score(naive_bayes, x_train_counts, y_train, cv=20, scoring='accuracy')
    scores_training = naive_bayes.fit(x_train_counts, y_train).score(x_train_counts, y_train)

    cv_scores.append(scores.mean())
    training_scores.append(scores_training)

# plot cross-validated score and training score against alpha
plt.plot(alphas, cv_scores, 'r')
plt.plot(alphas, training_scores, 'b')
plt.xlabel('alpha')
plt.ylabel('score')
plt.show()


Is this the correct way to implement this?
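As a side note on the loop above: scikit-learn ships a `validation_curve` helper that computes both the training and cross-validation scores for a range of a single parameter in one call, so the extra `fit`/`score` step is not needed. A minimal sketch on synthetic data (the question's `x_train_counts` / `y_train` would take the place of `X` / `y`; a coarser alpha grid and `cv=5` are used here only to keep the sketch fast):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import validation_curve

# Synthetic stand-in data; MultinomialNB needs non-negative features,
# so take the absolute value to mimic count-like input.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X = np.abs(X)

alphas = np.arange(0.1, 1.5, 0.1)
train_scores, valid_scores = validation_curve(
    MultinomialNB(), X, y,
    param_name='alpha', param_range=alphas,
    cv=5, scoring='accuracy')

# Each result has one row per alpha and one column per CV fold;
# averaging over axis 1 gives the curves to plot against alphas.
mean_train = train_scores.mean(axis=1)
mean_valid = valid_scores.mean(axis=1)
```

`mean_train` and `mean_valid` can then be passed straight to `plt.plot` in place of the manually built lists.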

Recommended answer


Depending on whether you want to tweak other model hyperparameters, it may be easier to use what is called a grid search. With it, you can tune extra hyperparameters in a simpler way, and training scores are computed for you. See the implementation below.

parameters = {'alpha': np.arange(0.0001, 1.5000, 0.0001)}
clf = GridSearchCV(MultinomialNB(), parameters, cv=20, return_train_score=True)
clf.fit(x_train_counts, y_train)

print('Mean train set score: {}'.format(clf.cv_results_['mean_train_score']))
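To reproduce the plot the question asked for from the grid-search results, the averaged training and validation scores in `cv_results_` can be plotted directly against the alpha grid. A runnable sketch on synthetic data (the question's `x_train_counts` / `y_train` would replace `X` / `y`; a coarser grid and `cv=5` keep it fast):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # headless-safe backend for the sketch
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data; absolute values keep features non-negative
# as MultinomialNB requires.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X = np.abs(X)

alphas = np.arange(0.1, 1.5, 0.1)
clf = GridSearchCV(MultinomialNB(), {'alpha': alphas}, cv=5,
                   scoring='accuracy', return_train_score=True)
clf.fit(X, y)

# mean_train_score / mean_test_score each hold one entry per alpha
plt.plot(alphas, clf.cv_results_['mean_train_score'], 'b')
plt.plot(alphas, clf.cv_results_['mean_test_score'], 'r')
plt.xlabel('alpha')
plt.ylabel('score')
```

`clf.best_params_['alpha']` additionally gives the alpha with the best cross-validated score, so no separate argmax over the lists is needed.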

