Dynamically training a naive Bayes classifier

Question

Is it possible to train a scikit-learn MultinomialNB classifier dynamically, and if so, how? I would like to train (update) my spam classifier every time I feed it an email.

I want this (does not work):

x_train, x_test, y_train, y_test = tts(features, labels, test_size=0.2)
clf = MultinomialNB()
for i in range(len(x_train)):
    clf.fit([x_train[i]], [y_train[i]])
preds = clf.predict(x_test)

to give a result similar to this (works OK):

x_train, x_test, y_train, y_test = tts(features, labels, test_size=0.2)
clf = MultinomialNB()
clf.fit(x_train, y_train)
preds = clf.predict(x_test)

Recommended answer

Scikit-learn supports incremental learning for several algorithms, including MultinomialNB. See the scikit-learn documentation on incremental learning.

You need to call partial_fit() instead of fit(). Each call to fit() retrains the model from scratch, whereas partial_fit() updates the existing counts. Your example code would then look like this:

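import numpy
from sklearn.model_selection import train_test_split as tts  # assumed meaning of tts
from sklearn.naive_bayes import MultinomialNB

x_train, x_test, y_train, y_test = tts(features, labels, test_size=0.2)
clf = MultinomialNB()
classes = numpy.unique(y_train)  # all labels must be declared on the first partial_fit call
for i in range(len(x_train)):
    if i == 0:
        clf.partial_fit([x_train[i]], [y_train[i]], classes=classes)
    else:
        clf.partial_fit([x_train[i]], [y_train[i]])
preds = clf.predict(x_test)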
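As a rough sanity check, the sketch below compares sample-by-sample partial_fit() with a single fit() call. The count matrix and labels are synthetic and made up purely for illustration; they stand in for the features and labels of the question.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Synthetic stand-ins for the question's features and labels (assumption)
rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(200, 10))  # fake word counts
y = rng.integers(0, 2, size=200)        # fake ham (0) / spam (1) labels

# Reference model: trained on everything at once
batch_clf = MultinomialNB().fit(X, y)

# Incremental model: one sample per partial_fit call
incr_clf = MultinomialNB()
classes = np.unique(y)
for i in range(len(X)):
    # Slicing keeps the 2-D shape that scikit-learn expects
    incr_clf.partial_fit(X[i:i + 1], y[i:i + 1], classes=classes)

# Both models should have accumulated the same per-class feature counts
same_counts = np.allclose(batch_clf.feature_count_, incr_clf.feature_count_)
same_preds = np.array_equal(batch_clf.predict(X), incr_clf.predict(X))
print(same_counts, same_preds)
```

Because MultinomialNB only accumulates per-class counts, the order and size of the updates should not change the fitted parameters.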

Edit: added the classes argument to partial_fit, as suggested by @BobWazowski.
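For the spam use case, where emails arrive one at a time as raw text, one possible setup is sketched below. It is an assumption rather than part of the original answer: it uses HashingVectorizer so that the feature space stays fixed without a fitted vocabulary, and the `update` helper, the `CLASSES` list and the sample messages are hypothetical.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

# alternate_sign=False keeps hashed features non-negative, which MultinomialNB requires;
# norm=None keeps raw term counts instead of normalized vectors
vectorizer = HashingVectorizer(n_features=2**16, alternate_sign=False, norm=None)
clf = MultinomialNB()
CLASSES = ["ham", "spam"]  # hypothetical label names, declared up front


def update(email_text, label):
    """Update the classifier with a single labeled email (hypothetical helper)."""
    X = vectorizer.transform([email_text])  # stateless, so no fit step is needed
    clf.partial_fit(X, [label], classes=CLASSES)


# Made-up messages standing in for an incoming email stream
stream = [
    ("win a free prize now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("free money click here", "spam"),
    ("lunch with the team tomorrow", "ham"),
]
for text, label in stream:
    update(text, label)

prediction = clf.predict(vectorizer.transform(["free prize inside"]))[0]
print(prediction)
```

A vectorizer with a learned vocabulary, such as CountVectorizer, would need to be fitted before the first email arrives, so words seen later would be ignored; hashing sidesteps that at the cost of occasional collisions.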


05-23 02:50