Question
Is it possible to have missing values in scikit-learn? How should they be represented? I couldn't find any documentation about that.
Recommended answer
The above answer is outdated; the latest release of scikit-learn has a class Imputer that does simple, per-feature missing value imputation. You can feed it arrays containing NaNs to have those replaced by the mean, median or mode of the corresponding feature.
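As a minimal sketch of the per-feature imputation described above: the Imputer class named in this answer lived in sklearn.preprocessing and was removed in scikit-learn 0.22, so the example below assumes a newer release and uses its replacement, sklearn.impute.SimpleImputer. The array X and its values are made up for illustration only.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy feature matrix (illustrative values); missing entries are marked with NaN.
X = np.array([
    [1.0, 2.0],
    [np.nan, 3.0],
    [7.0, np.nan],
])

# Replace each NaN with the mean of its own column (feature).
# Other built-in strategies: "median", "most_frequent" (the mode), "constant".
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
X_filled = imputer.fit_transform(X)

print(X_filled)
```

With strategy="mean", the NaN in the first column becomes 4.0 (the mean of 1.0 and 7.0) and the NaN in the second column becomes 2.5 (the mean of 2.0 and 3.0). The fitted imputer can then be applied to new data with imputer.transform, which reuses the column statistics learned from X.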