TypeError: a float is required in sklearn.feature_extraction.FeatureHasher

Problem description

I'm using sklearn version 0.16.1. It seems that FeatureHasher doesn't support strings (as DictVectorizer does). For example:

    from sklearn.feature_extraction import FeatureHasher

    values = [
        {'city': 'Dubai', 'temperature': 33.},
        {'city': 'London', 'temperature': 12.},
        {'city': 'San Fransisco', 'temperature': 18.},
    ]

    print("Starting FeatureHasher ...")
    hasher = FeatureHasher(n_features=2)
    X = hasher.transform(values).toarray()
    print(X)

But the following error is received:

        _hashing.transform(raw_X, self.n_features, self.dtype)
      File "_hashing.pyx", line 46, in sklearn.feature_extraction._hashing.transform (sklearn\feature_extraction\_hashing.c:1762)
    TypeError: a float is required

I can't use DictVectorizer because my dataset is very large and the features have high cardinality, so I get a MemoryError. Any suggestions?
Update (October 2016):
As NirIzr commented, this is now supported: the sklearn dev team addressed the issue in https://github.com/scikit-learn/scikit-learn/pull/6173
FeatureHasher should properly handle string dictionary values as of version 0.18.
Solution
This is a known sklearn issue: FeatureHasher does not currently support string values for its dict input format.
https://github.com/scikit-learn/scikit-learn/issues/4878
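
For versions affected by this issue, one possible workaround is to fold each string value into its feature name before hashing, so that FeatureHasher only ever sees numeric values. The sketch below is not taken from the linked issue. The helper names stringify_features and hash_records are hypothetical, the "name=value" key format and n_features=1024 are assumptions chosen for illustration, and the code targets Python 3 (on Python 2 the isinstance check would also need to cover unicode).

```python
def stringify_features(record):
    """Return a copy of `record` with every string value folded into its key.

    {'city': 'Dubai', 'temperature': 33.0}
    becomes
    {'city=Dubai': 1, 'temperature': 33.0}
    """
    out = {}
    for name, value in record.items():
        if isinstance(value, str):
            # Categorical feature: encode as an indicator named "name=value".
            out["%s=%s" % (name, value)] = 1
        else:
            # Numeric feature: keep the name and value unchanged.
            out[name] = value
    return out


def hash_records(records, n_features=1024):
    """Hash an iterable of dict records into a scipy sparse matrix.

    n_features=1024 is an arbitrary illustrative size (an assumption);
    larger values reduce hash collisions.
    """
    # Imported here so stringify_features can be used without scikit-learn.
    from sklearn.feature_extraction import FeatureHasher

    hasher = FeatureHasher(n_features=n_features, input_type="dict")
    # A generator keeps memory usage low: records are converted one at a time.
    return hasher.transform(stringify_features(r) for r in records)
```

Calling hash_records(values) on the list from the question should then return a sparse matrix without raising the TypeError. Because the records are converted lazily and the output stays sparse, this avoids building the full vocabulary that makes DictVectorizer run out of memory. Note that n_features=2, as used in the question, would cause heavy collisions between features.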
