Question
I know XGBoost needs the first and second gradients, but has anybody else used "mae" as the objective function?
Answer
A little bit of theory first, sorry! You asked for the grad and hessian for the MAE; however, the MAE is not continuously twice differentiable, so trying to calculate the first and second derivatives becomes tricky. Below we can see the "kink" at x = 0 which prevents the MAE from being continuously differentiable.
Moreover, the second derivative is zero at all the points where it is well behaved. In XGBoost, the second derivative is used as a denominator in the leaf weights, and when it is zero it creates serious math errors.
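To see the problem concretely, a naive MAE objective would look like the sketch below (for illustration only, not something to train with): the gradient is just the sign of the residual and the Hessian is zero everywhere it exists, so the leaf-weight denominator -sum(grad) / (sum(hess) + lambda) is left with nothing but the regularization term.

import numpy as np

def mae_obj_naive(preds, dtrain):
    # gradient of |pred - label| is sign(residual); undefined at 0, np.sign returns 0 there
    d = preds - dtrain.get_label()
    grad = np.sign(d)
    # second derivative is zero wherever it is defined, which breaks the leaf weights
    hess = np.zeros_like(d)
    return grad, hess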
Given these complexities, our best bet is to try to approximate the MAE using some other, nicely behaved function. Let's take a look.
We can see above that there are several functions that approximate the absolute value. Clearly, for very small values, the Squared Error (MSE) is a fairly good approximation of the MAE. However, I assume that this is not sufficient for your use case.
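The comparison plot from the original answer is not reproduced here; the sketch below (matplotlib assumed, with delta and c set to 1 to match the code that follows) recreates the same kind of comparison between the MAE and the candidate approximations.

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 500)
plt.plot(x, np.abs(x), label='MAE')
plt.plot(x, x ** 2, label='MSE')
plt.plot(x, np.sqrt(1 + x ** 2) - 1, label='Pseudo-Huber (delta=1)')
plt.plot(x, np.abs(x) - np.log(np.abs(x) + 1), label='Fair (c=1)')
plt.plot(x, np.log(np.cosh(x)), label='Log-Cosh')
plt.legend()
plt.show()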
Huber Loss is a well-documented loss function. However, it is not smooth, so we cannot guarantee smooth derivatives. We can approximate it with the Pseudo-Huber function, which can be implemented in Python XGBoost as follows,
import xgboost as xgb
import numpy as np

dtrain = xgb.DMatrix(x_train, label=y_train)
dtest = xgb.DMatrix(x_test, label=y_test)
param = {'max_depth': 5}
num_round = 10

def huber_approx_obj(preds, dtrain):
    d = preds - dtrain.get_label()  # remove .get_label() for the sklearn API, where labels are passed directly
    h = 1  # h is delta in the graphic
    scale = 1 + (d / h) ** 2
    scale_sqrt = np.sqrt(scale)
    grad = d / scale_sqrt
    hess = 1 / scale / scale_sqrt
    return grad, hess

bst = xgb.train(param, dtrain, num_round, obj=huber_approx_obj)
Other functions can be used by replacing obj=huber_approx_obj.
Fair Loss is not well documented at all, but it seems to work rather well. The fair loss function is y = c * abs(x) - c**2 * log(abs(x)/c + 1), where x is the residual and c is a tuning constant.
It can be implemented as follows,
def fair_obj(preds, dtrain):
    """y = c * abs(x) - c**2 * np.log(abs(x)/c + 1)"""
    x = preds - dtrain.get_label()
    c = 1
    den = np.abs(x) + c
    grad = c * x / den
    hess = c * c / den ** 2
    return grad, hess
This code is taken and adapted from the second-place solution in the Kaggle Allstate Challenge.
The Log-Cosh loss function, log(cosh(x)), can be implemented as,
def log_cosh_obj(preds, dtrain):
    x = preds - dtrain.get_label()
    grad = np.tanh(x)
    hess = 1 / np.cosh(x) ** 2  # note: np.cosh overflows for very large |x|
    return grad, hess
最后,您可以使用上述函数作为模板来创建自己的自定义损失函数.
Finally, you can create your own custom loss functions using the above functions as templates.
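When writing your own, it is worth checking the hand-derived grad and hess against finite differences of the loss before training on them. A quick sketch for the Pseudo-Huber case (pseudo_huber_loss is a helper defined only for this check, with h = 1 as above):

def pseudo_huber_loss(d, h=1.0):
    # h**2 * (sqrt(1 + (d/h)**2) - 1), the loss whose derivatives huber_approx_obj returns
    return h ** 2 * (np.sqrt(1 + (d / h) ** 2) - 1)

d = np.linspace(-5, 5, 11)
eps = 1e-4
grad_fd = (pseudo_huber_loss(d + eps) - pseudo_huber_loss(d - eps)) / (2 * eps)
hess_fd = (pseudo_huber_loss(d + eps) - 2 * pseudo_huber_loss(d) + pseudo_huber_loss(d - eps)) / eps ** 2

scale = 1 + d ** 2
scale_sqrt = np.sqrt(scale)
print(np.allclose(grad_fd, d / scale_sqrt, atol=1e-6))          # True
print(np.allclose(hess_fd, 1 / scale / scale_sqrt, atol=1e-3))  # True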