I am building a binary classifier model with LGBMClassifier in LightGBM, as follows:

    from lightgbm import LGBMClassifier

    # LightGBM model
    clf = LGBMClassifier(
        nthread=4,
        n_estimators=10000,
        learning_rate=0.005,
        num_leaves=45,
        colsample_bytree=0.8,
        subsample=0.4,
        subsample_freq=1,
        max_depth=20,
        reg_alpha=0.5,
        reg_lambda=0.5,
        min_split_gain=0.04,
        min_child_weight=0.05,
        random_state=0,
        silent=-1,
        verbose=-1)


Next, I fit the model on the training data:

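    import pandas as pd

    # Fit with early stopping on validation AUC (LightGBM >= 4.0 moves verbose and early_stopping_rounds to callbacks)
    clf.fit(train_x, train_y, eval_set=[(train_x, train_y), (valid_x, valid_y)],
            eval_metric='auc', verbose=100, early_stopping_rounds=200)

    # Collect the per-feature importances (feats is assumed to be the list of feature names)
    fold_importance_df = pd.DataFrame()
    fold_importance_df["feature"] = feats
    fold_importance_df["importance"] = clf.feature_importances_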


Output:

feature                      importance
feature13                     1108
feature21                     1104
feature11                     774



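Everything works fine up to here. Now I am looking at the feature importance measure for this model, so I use the feature_importances_ attribute to get it (by default it gives me feature importance based on split).
While split tells me how many times each feature was used in splits, I think gain would give me a better picture of how important the features are.
The Python API of the LightGBM Booster class (https://lightgbm.readthedocs.io/en/latest/Python-API.html?highlight=importance) mentions: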

 feature_importance(importance_type='split', iteration=-1)


 Parameters: importance_type (string, optional (default="split")) –
 If “split”, result contains numbers
 of times the feature is used in a model. If “gain”, result contains
 total gains of splits which use the feature.
 Returns: result – Array with feature importances.
 Return type: numpy array
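To make the difference between the two importance types concrete, here is a minimal pure-Python sketch of how they aggregate over splits. The split records and the `importance` helper are hypothetical and only illustrate the definitions quoted above; they are not LightGBM's implementation.

```python
from collections import defaultdict

# Hypothetical split records: (feature used by the split, gain of that split).
splits = [
    ("feature13", 0.5),
    ("feature13", 0.25),
    ("feature13", 0.25),
    ("feature21", 5.0),
    ("feature11", 2.5),
    ("feature11", 0.5),
]


def importance(splits, importance_type="split"):
    """Aggregate per-feature importance from a list of (feature, gain) records."""
    if importance_type not in ("split", "gain"):
        raise ValueError("importance_type must be 'split' or 'gain'")
    totals = defaultdict(float)
    for feature, gain in splits:
        # "split" counts how often the feature is used; "gain" sums the split gains.
        totals[feature] += 1 if importance_type == "split" else gain
    return dict(totals)


split_imp = importance(splits, "split")
gain_imp = importance(splits, "gain")

print("split:", split_imp)
print("gain: ", gain_imp)
```

In this made-up data, feature13 is used most often but its splits contribute little gain, while feature21 is used once with a large gain, so the two rankings disagree.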



The Sklearn API for LightGBM's LGBMClassifier(), on the other hand, does not mention anything like this. For feature importance it only documents the following attribute:

feature_importances_
array of shape = [n_features] – The feature importances (the higher, the more important the feature).



My question is: how can I get gain-based feature importance from the sklearn version, i.e. LGBMClassifier()?

Best answer

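feature_importance() is a method of the Booster object in the original LGBM.

The sklearn API exposes the underlying Booster of the fitted model through the attribute booster_, as described in the API docs.

So you can first access this Booster object and then call feature_importance() the same way you would with the original LGBM: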

clf.booster_.feature_importance(importance_type='gain')

Source: the Stack Overflow question "How to set 'gain' as the feature importance measure in feature_importances_ of the LightGBM classifier in sklearn: LGBMClassifier()", https://stackoverflow.com/questions/51118772/
