
Problem description

For a project I am comparing a number of decision tree ensembles, using the regression algorithms (Random Forest, Extra Trees, AdaBoost and Bagging) of scikit-learn. To compare and interpret them I use the feature importances, though for the bagging decision tree these do not appear to be available.

My question: does anybody know how to get the feature importances list for Bagging?

Regards, Kornee

Recommended answer

Are you talking about BaggingClassifier? It can be used with many different base estimators, so feature importances are not implemented for it. There are model-agnostic methods for computing feature importances (see e.g. https://github.com/scikit-learn/scikit-learn/issues/8898), but scikit-learn doesn't use them.
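As a side note, newer scikit-learn versions (0.22+) do ship one such model-agnostic method, sklearn.inspection.permutation_importance, which shuffles each feature column and measures the drop in score. A minimal sketch applying it to a bagging ensemble (the dataset and parameters here are just for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.inspection import permutation_importance
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = BaggingClassifier(DecisionTreeClassifier(), random_state=0).fit(X, y)

# Permute each feature n_repeats times and record the score decrease;
# importances_mean holds one averaged value per feature.
result = permutation_importance(clf, X, y, n_repeats=10, random_state=0)
print(result.importances_mean)
```

Unlike the impurity-based averaging below, this works for any fitted estimator, not just tree-based ones.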

If the base estimators are decision trees, you can compute the feature importances yourself: it is just the average of tree.feature_importances_ over all trees in bagging.estimators_:

import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
clf = BaggingClassifier(DecisionTreeClassifier())
clf.fit(X, y)

# Average the impurity-based importances over all fitted trees.
feature_importances = np.mean([
    tree.feature_importances_ for tree in clf.estimators_
], axis=0)

RandomForestClassifier does the same computation internally.
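Because the averaging is built in, a random forest exposes the result directly as a fitted attribute; a short sketch (the n_estimators value is arbitrary):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Already averaged over rf.estimators_ and normalized to sum to 1.
print(rf.feature_importances_)
```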

