Problem description
I am having trouble using the RandomForest fit function.
This is my training set:
     P1   Tp1  IrrPOA    Gz  Drz2
0    0.0  7.7     0.0  -1.4  -0.3
1    0.0  7.7     0.0  -1.4  -0.3
2    ...  ...     ...   ...   ...
3   49.4  7.5     0.0  -1.4  -0.3
4   47.4  7.5     0.0  -1.4  -0.3
... (10k rows)
I want to predict P1 from all the other variables using RandomForest from sklearn.ensemble:
colsRes = ['P1']
X_train = train.drop(colsRes, axis = 1)
Y_train = pd.DataFrame(train[colsRes])
rf = RandomForestClassifier(n_estimators=100)
rf.fit(X_train, Y_train)
This is the error I get:
ValueError: Unknown label type: array([[ 0. ],
[ 0. ],
[ 0. ],
...,
[ 49.4],
[ 47.4],
I did not find anything about this label error. I am using Python 3.5. Any advice would be a great help!
Recommended answer
When you pass label (y) data to rf.fit(X, y), it expects y to be a 1D array-like. Selecting columns from a pandas DataFrame with a list of names, as in train[colsRes], returns a 2D DataFrame even when the list holds a single column, and that conflicts with what fit expects. You need to convert the 2D column provided by the DataFrame into the 1D form that fit expects.
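To make the 1D versus 2D difference concrete, here is a minimal sketch on a toy frame built from the sample rows in the question. The toy frame and the names y_2d, y_1d and y_flat are only for illustration and are not part of the original code.

```python
import pandas as pd

# Toy frame built from the sample rows in the question (not the full 10k-row set).
train = pd.DataFrame({
    "P1": [0.0, 0.0, 49.4, 47.4],
    "Tp1": [7.7, 7.7, 7.5, 7.5],
    "IrrPOA": [0.0, 0.0, 0.0, 0.0],
    "Gz": [-1.4, -1.4, -1.4, -1.4],
    "Drz2": [-0.3, -0.3, -0.3, -0.3],
})

y_2d = train[["P1"]]                    # list of column names -> DataFrame (2D)
y_1d = train["P1"]                      # single column label -> Series (1D)
y_flat = train[["P1"]].values.ravel()   # flatten an existing 2D column to 1D

print(y_2d.shape)    # (4, 1)
print(y_1d.shape)    # (4,)
print(y_flat.shape)  # (4,)
```

Either y_1d or y_flat has the 1D shape that fit expects for a single target.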
Try a 1D list first:
Y_train = list(train.P1.values)
If this does not solve the problem, you can try the solution mentioned in MultinomialNB error: "Unknown Label Type":
Y_train = np.asarray(train['P1'], dtype="|S6")
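As a rough illustration of what the dtype="|S6" cast does, the sketch below applies it to the sample P1 values from the question. The array p1 is a hypothetical stand-in for train['P1'].

```python
import numpy as np

# Sample P1 values from the question, standing in for train['P1'].
p1 = np.array([0.0, 0.0, 49.4, 47.4])

# "|S6" is a fixed-width byte-string dtype: each float is stored as text
# of at most 6 bytes, so the values become discrete labels, not numbers.
labels = np.asarray(p1, dtype="|S6")

print(labels.dtype)            # |S6
print(labels.ndim)             # 1
print(np.unique(labels).size)  # 3 distinct labels
```

Because the labels are now text, each distinct reading is treated as its own class, and the numeric ordering between readings is no longer visible to the model.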
So your code becomes:
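import numpy as np
from sklearn.ensemble import RandomForestClassifier

colsRes = ['P1']
X_train = train.drop(colsRes, axis=1)
# Cast the labels to fixed-width byte strings so each distinct value is a class
Y_train = np.asarray(train['P1'], dtype="|S6")
rf = RandomForestClassifier(n_estimators=100)
rf.fit(X_train, Y_train)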
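One caveat worth checking: a classifier rejects continuous float targets even when they are 1D. If P1 is meant as a continuous quantity, which is an assumption about the asker's data, a regressor may be a closer fit than casting the values to strings. The sketch below uses RandomForestRegressor, a different technique from the answer above, on a toy frame built from the sample rows in the question. The toy frame, the small n_estimators and the random_state are illustrative choices only.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Toy frame built from the sample rows in the question (hypothetical stand-in for `train`).
train = pd.DataFrame({
    "P1": [0.0, 0.0, 49.4, 47.4],
    "Tp1": [7.7, 7.7, 7.5, 7.5],
    "IrrPOA": [0.0, 0.0, 0.0, 0.0],
    "Gz": [-1.4, -1.4, -1.4, -1.4],
    "Drz2": [-0.3, -0.3, -0.3, -0.3],
})

X_train = train.drop(["P1"], axis=1)
y_train = train["P1"]  # 1D Series of floats

# The classifier refuses continuous float targets, regardless of their shape.
classifier_rejected = False
try:
    RandomForestClassifier(n_estimators=10, random_state=0).fit(X_train, y_train)
except ValueError as err:
    classifier_rejected = True
    print("classifier:", err)

# The regressor accepts the same 1D float target directly.
reg = RandomForestRegressor(n_estimators=10, random_state=0)
reg.fit(X_train, y_train)
preds = reg.predict(X_train)
print(preds.shape)
```

Whether classification or regression is appropriate depends on what P1 represents, so treat this as one option to compare against the string-label approach rather than a replacement for it.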