Problem description
Is it possible to train a model in Xgboost that has multiple continuous outputs (multi-output regression)? What would be the objective for training such a model?
Thanks in advance for any suggestions.
Recommended answer
My suggestion is to use sklearn.multioutput.MultiOutputRegressor as a wrapper around xgb.XGBRegressor. MultiOutputRegressor trains one regressor per target and only requires that the regressor implement fit and predict, which xgboost happens to support.
import numpy as np
import xgboost as xgb
from sklearn.multioutput import MultiOutputRegressor

# get some noised linear data
X = np.random.random((1000, 10))
a = np.random.random((10, 3))
y = np.dot(X, a) + np.random.normal(0, 1e-3, (1000, 3))

# fitting: one XGBRegressor is trained per column of y
# ('reg:squarederror' is the current name of the deprecated 'reg:linear' objective)
multioutputregressor = MultiOutputRegressor(
    xgb.XGBRegressor(objective='reg:squarederror')).fit(X, y)

# predicting: per-target mean squared error on the training data
print(np.mean((multioutputregressor.predict(X) - y)**2, axis=0))  # ~0.004, 0.003, 0.005
This is probably the easiest way to regress multi-dimensional targets with xgboost, as you would not need to change any other part of your code (if you were using the sklearn API originally).
However, this method does not exploit any possible relationships between the targets, because each one is fitted by an independent regressor. You could try to design a customized objective function to capture those relationships; a minimal sketch of the custom-objective interface follows.
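The sketch below is not from the original answer: it only shows how a callable objective is passed to the xgboost sklearn wrapper (recent versions accept a callable with signature (y_true, y_pred) returning gradient and hessian). It reimplements plain squared error for a single target; a genuinely multi-target objective would have to build on this same interface.

import numpy as np
import xgboost as xgb

def squared_error_objective(y_true, y_pred):
    # gradient and hessian of 0.5 * (y_pred - y_true)**2 with respect to y_pred
    grad = y_pred - y_true
    hess = np.ones_like(y_pred)
    return grad, hess

# recent xgboost releases accept a callable here; older ones require the
# low-level xgb.train(..., obj=...) interface instead
reg = xgb.XGBRegressor(objective=squared_error_objective, n_estimators=50)
reg.fit(X, y[:, 0])  # fit against a single target column as a sanity check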