How to save multiple scikit-learn classifier models with Python's pickle library (or any other efficient library)

Question

In general, we can use pickle to save ONE classifier model. Is there a way to save MULTIPLE classifier models in one pickle file? If so, how can we save the models and retrieve them later?

For instance (a minimal working example):

from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from numpy.random import rand, randint

models = []
models.append(('LogisticReg', LogisticRegression(random_state=123)))
models.append(('DecisionTree', DecisionTreeClassifier(random_state=123)))
# evaluate each model in turn
results_all = []
names = []
dict_method_score = {}
scoring = 'f1'

X = rand(8, 4)
Y = randint(2, size=8)

print("Method: Average (Standard Deviation)\n")
for name, model in models:
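    # shuffle=True is needed for random_state to apply; recent scikit-learn raises a ValueError otherwise
    kfold = model_selection.KFold(n_splits=2, shuffle=True, random_state=999)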
    cv_results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
    results_all.append(cv_results)
    names.append(name)
    dict_method_score[name] = (cv_results.mean(), cv_results.std())
    print("{:s}: {:.3f} ({:.3f})".format(name, cv_results.mean(), cv_results.std()))

Purpose: change some hyperparameters (say, n_splits in cross-validation) while keeping the same setup, and retrieve the models later.

Recommended answer

You can save multiple objects into the same pickle file:

import pickle

# Each dump call appends one pickled (name, estimator) tuple to the file.
with open("models.pckl", "wb") as f:
    for model in models:
        pickle.dump(model, f)

You can then load your models back into memory one at a time:

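import pickle

models = []
with open("models.pckl", "rb") as f:
    while True:
        try:
            # Each load call reads the next pickled object from the file.
            models.append(pickle.load(f))
        except EOFError:
            # No more objects left in the file.
            break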
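
As an alternative to repeated dump and load calls, the models can also be kept in a single container and pickled in one call. The following is a minimal sketch, not part of the original answer. It assumes a dict keyed by model name, a temporary file path, synthetic data from make_classification, and a hypothetical helper named evaluate that takes n_splits as a parameter, so the restored models can be cross-validated again under a different setting.

```python
import os
import pickle
import tempfile

from sklearn import model_selection
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Same two classifiers as in the question, stored by name in one container.
models = {
    "LogisticReg": LogisticRegression(random_state=123),
    "DecisionTree": DecisionTreeClassifier(random_state=123),
}

# Hypothetical file location; a temporary directory keeps the sketch tidy.
path = os.path.join(tempfile.mkdtemp(), "models.pckl")

# One dump call writes the whole dict, so one load call restores it.
with open(path, "wb") as f:
    pickle.dump(models, f)

with open(path, "rb") as f:
    restored = pickle.load(f)


def evaluate(named_models, X, Y, n_splits, scoring="f1"):
    """Cross-validate every model and return {name: (mean, std)}."""
    kfold = model_selection.KFold(n_splits=n_splits, shuffle=True, random_state=999)
    scores = {}
    for name, model in named_models.items():
        cv = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
        scores[name] = (cv.mean(), cv.std())
    return scores


# Synthetic data (an assumption; the question uses small random arrays).
X, Y = make_classification(n_samples=40, n_features=4, random_state=0)

# Re-run the same evaluation with different n_splits on the restored models.
for n_splits in (2, 4):
    for name, (mean, std) in evaluate(restored, X, Y, n_splits).items():
        print("n_splits={:d} {:s}: {:.3f} ({:.3f})".format(n_splits, name, mean, std))
```

The models here are unfitted, because cross_val_score fits clones internally; fitted estimators can be pickled the same way. joblib.dump and joblib.load follow the same dump/load pattern and may be a better fit for estimators that hold large NumPy arrays.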

