Problem description
I am not able to load an instance of a custom transformer saved using either sklearn.externals.joblib.dump or pickle.dump, because the original definition of the custom transformer is missing from the current Python session.
Suppose that in one Python session I define, create and save a custom transformer. It can be loaded back in that same session:
from sklearn.base import TransformerMixin
from sklearn.base import BaseEstimator
# Note: sklearn.externals.joblib was removed in scikit-learn 0.23;
# on newer versions use `import joblib` instead.
from sklearn.externals import joblib

class CustomTransformer(BaseEstimator, TransformerMixin):
    def __init__(self):
        pass

    def fit(self, X, y=None):
        return self

    def transform(self, X, y=None):
        return X

custom_transformer = CustomTransformer()
joblib.dump(custom_transformer, 'custom_transformer.pkl')
# Loading in the same session works, because CustomTransformer is defined here.
loaded_custom_transformer = joblib.load('custom_transformer.pkl')
Opening a new Python session and loading from 'custom_transformer.pkl'
from sklearn.externals import joblib
joblib.load('custom_transformer.pkl')
raises the following exception:
AttributeError: module '__main__' has no attribute 'CustomTransformer'
The same thing is observed if joblib is replaced with pickle. Saving the custom transformer in one session with
import pickle

with open('custom_transformer_pickle.pkl', 'wb') as f:
    pickle.dump(custom_transformer, f, -1)  # -1 selects the highest pickle protocol
and loading it in another:
import pickle

with open('custom_transformer_pickle.pkl', 'rb') as f:
    loaded_custom_transformer_pickle = pickle.load(f)
raises the same exception.
In the above, if CustomTransformer is replaced with, say, sklearn.preprocessing.StandardScaler, the saved instance can be loaded in a new Python session.
Is it possible to save a custom transformer and load it later somewhere else?
Recommended answer
sklearn.preprocessing.StandardScaler works because its class definition is available in the installed sklearn package, which joblib looks up when you load the pickle. A pickle stores only a reference to the class (its module and name) together with the instance state, not the class's code.
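To make the point about lookups concrete, here is a minimal sketch. It uses fractions.Fraction from the standard library as a stand-in for StandardScaler (an assumption made only to keep the example free of dependencies); the same reasoning should apply to any class that lives in an installed package.

```python
import pickle
from fractions import Fraction

# Pickle an instance of a class that lives in an importable module.
payload = pickle.dumps(Fraction(3, 4))

# The payload names the module and the class...
has_module_name = b"fractions" in payload
has_class_name = b"Fraction" in payload
# ...but carries none of the class's source code.
has_source_code = b"def " in payload

# Loading works in any session where `fractions` can be imported,
# because pickle re-imports the module and looks the class up by name.
restored = pickle.loads(payload)
print(has_module_name, has_class_name, has_source_code, restored)
```

A class defined interactively lives in `__main__`, so its pickle refers to `__main__.CustomTransformer`, a name that a fresh session does not have.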
You'll have to make your CustomTransformer class available in the new session, either by re-defining it or by importing it.
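One way to do the "importing" option is to keep the class in its own module file and import it in both the saving and the loading session. The sketch below is only an illustration under stated assumptions: the module name custom_transformers.py is hypothetical, the class is a plain stand-in without the sklearn base classes (so the example runs without scikit-learn), and the "new session" is simulated with a subprocess.

```python
import os
import pickle
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module holding the class; in a real project this would be
# a file that both sessions can import.
MODULE_SOURCE = textwrap.dedent('''
    class CustomTransformer:
        """Stand-in for the BaseEstimator/TransformerMixin subclass."""

        def fit(self, X, y=None):
            return self

        def transform(self, X, y=None):
            return X
''')

workdir = Path(tempfile.mkdtemp())
(workdir / "custom_transformers.py").write_text(MODULE_SOURCE)
sys.path.insert(0, str(workdir))

# Session 1: import the class from the module (not __main__) and save it.
from custom_transformers import CustomTransformer

pickle_path = workdir / "custom_transformer.pkl"
with open(pickle_path, "wb") as f:
    pickle.dump(CustomTransformer(), f, protocol=pickle.HIGHEST_PROTOCOL)

# Session 2 (simulated): a fresh interpreter that never defines the class,
# but can import the module because its directory is on PYTHONPATH.
loader = textwrap.dedent('''
    import pickle, sys
    with open(sys.argv[1], "rb") as f:
        obj = pickle.load(f)
    print(type(obj).__module__, type(obj).__name__)
''')
result = subprocess.run(
    [sys.executable, "-c", loader, str(pickle_path)],
    env={**os.environ, "PYTHONPATH": str(workdir)},
    capture_output=True,
    text=True,
)
print(result.stdout.strip())
```

Because the class was imported from a module when it was saved, the pickle refers to that module rather than to `__main__`, so the loading session only needs the module on its import path. The module must stay compatible with the saved state for old pickles to keep loading.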