"continuous is not supported" error in RandomForestRegressor

Problem description

I'm just trying to do a simple RandomForestRegressor example, but while testing the accuracy I get this error:

/Users/noppanit/anaconda/lib/python2.7/site-packages/sklearn/metrics/classification.pyc in accuracy_score(y_true, y_pred, normalize, sample_weight)
    177
    178     # Compute accuracy for each possible representation
--> 179     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    180     if y_type.startswith('multilabel'):
    181

/Users/noppanit/anaconda/lib/python2.7/site-packages/sklearn/metrics/classification.pyc in _check_targets(y_true, y_pred)
     90     if (y_type not in ["binary", "multiclass", "multilabel-indicator",
     91                        "multilabel-sequences"]):
---> 92         raise ValueError("{0} is not supported".format(y_type))
     93
     94     if y_type in ["binary", "multiclass"]:

ValueError: continuous is not supported

This is a sample of the data. I can't show the real data.

target, func_1, func_2, func_2, ... func_200
float, float, float, float, ... float

Here's my code.

import pandas as pd
import numpy as np
from sklearn.preprocessing import Imputer
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor, ExtraTreesRegressor, GradientBoostingRegressor
from sklearn.cross_validation import train_test_split
from sklearn.metrics import accuracy_score
from sklearn import tree

train = pd.read_csv('data.txt', sep='\t')

labels = train.target
train.drop('target', axis=1, inplace=True)
cat = ['cat']
train_cat = pd.get_dummies(train[cat])

train.drop(train[cat], axis=1, inplace=True)
train = np.hstack((train, train_cat))

imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
imp.fit(train)
train = imp.transform(train)

x_train, x_test, y_train, y_test = train_test_split(train, labels.values, test_size = 0.2)

clf = RandomForestRegressor(n_estimators=10)

clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
accuracy_score(y_test, y_pred) # This is where I get the error.
Solution

This happens because accuracy_score is for classification tasks only. For regression you should use a different metric, for example:

clf.score(x_test, y_test)

where x_test holds the samples and y_test the corresponding ground-truth values. The method computes the predictions internally and, for a regressor, returns the coefficient of determination R^2.
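To make the difference concrete, here is a minimal sketch, not the asker's actual pipeline. It assumes a recent scikit-learn, where train_test_split lives in sklearn.model_selection rather than sklearn.cross_validation, and it uses synthetic data in place of the real file. The synthetic features, the random_state values, and the choice of r2_score and mean_squared_error as regression metrics are illustrative assumptions that do not come from the original answer.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import accuracy_score, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data (assumption): 200 rows, 5 float
# features, and a continuous float target.
rng = np.random.RandomState(0)
X = rng.rand(200, 5)
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

clf = RandomForestRegressor(n_estimators=10, random_state=0)
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)

# accuracy_score rejects continuous targets, which reproduces the error.
try:
    accuracy_score(y_test, y_pred)
    accuracy_failed = False
except ValueError as exc:
    accuracy_failed = True
    print("accuracy_score failed:", exc)

# Regression metrics accept the same two arrays.
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)

# clf.score predicts internally and returns the same R^2 value.
score = clf.score(x_test, y_test)
print("R^2:", r2, "MSE:", mse, "clf.score:", score)
```

If a classification-style accuracy is really what is wanted, the target would first have to be turned into discrete classes and a classifier such as RandomForestClassifier used instead. That is a different modelling choice, not a fix for the metric.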


08-13 19:15