Question
Context:
I am using the Passive Aggressive estimator from scikit-learn and am confused about whether to use warm_start or partial_fit.
Efforts so far:
- Referred to this thread discussion:
https://github.com/scikit-learn/scikit-learn/issues/1585
- Gone through the scikit-learn code for _fit and _partial_fit.
My observations:
- _fit in turn calls _partial_fit.
- When warm_start is set, _fit calls _partial_fit with self.coef_.
- When _partial_fit is called without the coef_init parameter and self.coef_ is set, it continues to use self.coef_.
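The observations above can be checked empirically without reading the private methods. The following is a minimal sketch, assuming SGDRegressor as a stand-in for the Passive Aggressive estimator; the toy data and the helper name fit_twice are hypothetical and chosen only for illustration. Each fit call runs a single epoch, so a second fit either restarts from scratch (warm_start=False) or continues from the existing coef_ (warm_start=True).

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Toy data (illustrative assumption): y = 1.5 * x + 2
X = np.linspace(-1, 1, num=50).reshape(-1, 1)
y = (X * 1.5 + 2).ravel()


def fit_twice(warm_start):
    """Call fit() twice for one epoch each and return coef_ after each call."""
    model = SGDRegressor(learning_rate="constant", eta0=0.01, random_state=0,
                         max_iter=1, tol=None, warm_start=warm_start)
    model.fit(X, y)
    first = model.coef_.copy()
    model.fit(X, y)
    second = model.coef_.copy()
    return first, second


cold_first, cold_second = fit_twice(warm_start=False)
warm_first, warm_second = fit_twice(warm_start=True)

# Cold start: the second fit re-initializes the weights, so it repeats the first.
print("cold:", cold_first, cold_second)
# Warm start: the second fit begins from the coef_ left by the first.
print("warm:", warm_first, warm_second)
```

With a fixed integer random_state, the two cold-start results should coincide, while the warm-start results should differ because the second call keeps training from the earlier coefficients.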
Question:
I feel both ultimately provide the same functionality. So what is the basic difference between them, and in which contexts is each of them used?
Am I missing something evident? Any help is appreciated!
Answer
I don't know about the Passive Aggressive estimator, but at least with SGDRegressor, partial_fit fits for only 1 epoch, whereas fit runs for multiple epochs (until the loss converges or max_iter is reached). So when you fit new data to your model, partial_fit moves the model only one step toward the new data, whereas fit with warm_start starts from the previously learned coefficients and keeps training on the new data until convergence. The old data is not revisited in either case.
Example:
from sklearn.linear_model import SGDRegressor
import numpy as np
np.random.seed(0)
X = np.linspace(-1, 1, num=50).reshape(-1, 1)
Y = (X * 1.5 + 2).reshape(50,)
modelFit = SGDRegressor(learning_rate="adaptive", eta0=0.01, random_state=0, verbose=1,
                        shuffle=True, max_iter=2000, tol=1e-3, warm_start=True)
modelPartialFit = SGDRegressor(learning_rate="adaptive", eta0=0.01, random_state=0, verbose=1,
                               shuffle=True, max_iter=2000, tol=1e-3, warm_start=False)
# first fit some data
modelFit.fit(X, Y)
modelPartialFit.fit(X, Y)
# for both: Convergence after 50 epochs, Norm: 1.46, NNZs: 1, Bias: 2.000027, T: 2500, Avg. loss: 0.000237
print(modelFit.coef_, modelPartialFit.coef_) # for both: [1.46303288]
# now fit new data (zeros)
newX = X
newY = 0 * Y
# fits only for 1 epoch, Norm: 1.23, NNZs: 1, Bias: 1.208630, T: 50, Avg. loss: 1.595492:
modelPartialFit.partial_fit(newX, newY)
# Convergence after 49 epochs, Norm: 0.04, NNZs: 1, Bias: 0.000077, T: 2450, Avg. loss: 0.000313:
modelFit.fit(newX, newY)
print(modelFit.coef_, modelPartialFit.coef_) # [0.04245779] vs. [1.22919864]
newX = np.reshape([2], (-1, 1))
print(modelFit.predict(newX), modelPartialFit.predict(newX)) # [0.08499296] vs. [3.66702685]
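Since partial_fit performs one epoch per call, calling it repeatedly should move the model further toward the new data, much like fit with warm_start does in a single call. The sketch below reuses the data and estimator settings from the example above; the count of 200 epochs is an arbitrary assumption, not a recommended value, and no stopping criterion is applied.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

X = np.linspace(-1, 1, num=50).reshape(-1, 1)
y_old = (X * 1.5 + 2).ravel()   # original targets
y_new = np.zeros_like(y_old)    # new targets (all zeros)

model = SGDRegressor(learning_rate="adaptive", eta0=0.01, random_state=0,
                     max_iter=2000, tol=1e-3)
model.fit(X, y_old)
intercept_before = model.intercept_[0]

# One call to partial_fit is one pass over the new data.
model.partial_fit(X, y_new)
intercept_one_epoch = model.intercept_[0]

# Looping partial_fit gives many passes; 200 total is an arbitrary choice.
for _ in range(199):
    model.partial_fit(X, y_new)
intercept_many_epochs = model.intercept_[0]

print(intercept_before, intercept_one_epoch, intercept_many_epochs)
```

The practical difference is who controls the loop: fit decides when to stop based on tol and max_iter, while with partial_fit the caller decides how many passes to make, which is what makes it suitable for data that arrives in batches.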