Problem description
Is it possible to train an sklearn MultinomialNB classifier dynamically, and if so, how? I would like to train (update) my spam classifier every time I feed an email into it.
I want this (does not work):
x_train, x_test, y_train, y_test = tts(features, labels, test_size=0.2)
clf = MultinomialNB()
for i in range(len(x_train)):
    clf.fit([x_train[i]], [y_train[i]])
preds = clf.predict(x_test)
to give a result similar to this (works OK):
x_train, x_test, y_train, y_test = tts(features, labels, test_size=0.2)
clf = MultinomialNB()
clf.fit(x_train, y_train)
preds = clf.predict(x_test)
Recommended answer
Scikit-learn supports incremental learning for several algorithms, including MultinomialNB; see the incremental learning section of the scikit-learn documentation.
You'll need to use the method partial_fit() instead of fit(), so your example code would look like:
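import numpy
from sklearn.model_selection import train_test_split as tts  # assuming tts is train_test_split
from sklearn.naive_bayes import MultinomialNB

x_train, x_test, y_train, y_test = tts(features, labels, test_size=0.2)
clf = MultinomialNB()
for i in range(len(x_train)):
    if i == 0:
        # The first call must list every class that can ever appear.
        clf.partial_fit([x_train[i]], [y_train[i]], classes=numpy.unique(y_train))
    else:
        clf.partial_fit([x_train[i]], [y_train[i]])
preds = clf.predict(x_test)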
Edit: added the classes argument to partial_fit, as suggested by @BobWazowski. The first call to partial_fit needs the full list of class labels, because a single sample cannot show every class.
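To check that one-email-at-a-time updates end up close to a single batch fit, here is a minimal, self-contained sketch. The toy emails and labels are made up for illustration, and HashingVectorizer is my own choice rather than something from the question: it needs no fitted vocabulary, so each email can be turned into features as it arrives. Treat it as one possible setup, not the only way to do this.

```python
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy corpus for illustration; 1 = spam, 0 = not spam.
emails = [
    "win a free prize now",
    "meeting agenda for monday",
    "free money click now",
    "lunch tomorrow with the team",
    "claim your free prize",
    "project status report attached",
]
labels = np.array([1, 0, 1, 0, 1, 0])

# Stateless vectorizer: there is no vocabulary to fit, so a single email
# can be transformed on arrival. alternate_sign=False keeps the features
# non-negative, which MultinomialNB requires; norm=None keeps raw counts.
vectorizer = HashingVectorizer(n_features=2**10, alternate_sign=False, norm=None)

# Incremental model: updated once per incoming email.
incremental = MultinomialNB()
all_classes = np.array([0, 1])  # every label that can ever appear
for text, label in zip(emails, labels):
    x = vectorizer.transform([text])
    # classes is required on the first call; repeating the same list later is allowed.
    incremental.partial_fit(x, [label], classes=all_classes)

# Batch model: trained once on everything, for comparison.
batch = MultinomialNB()
batch.fit(vectorizer.transform(emails), labels)

# Both models accumulate the same per-class feature counts,
# so their learned parameters should match.
same_params = np.allclose(incremental.feature_log_prob_, batch.feature_log_prob_)
new_email = vectorizer.transform(["free prize inside"])
print(same_params, incremental.predict(new_email), batch.predict(new_email))
```

Because MultinomialNB only stores per-class counts, updating it sample by sample reaches the same parameters as fitting on the full set, which is why the two models agree here.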