"Multiclass-multioutput is not supported" error with the scikit-learn KNN classifier

Question

I have two variables, X and Y.

The structure of X (an np.array):

[[26777 24918 26821 ...    -1    -1    -1]
[26777 26831 26832 ...    -1    -1    -1]
[26777 24918 26821 ...    -1    -1    -1]
...
[26811 26832 26813 ...    -1    -1    -1]
[26830 26831 26832 ...    -1    -1    -1]
[26830 26831 26832 ...    -1    -1    -1]]

The structure of Y:

[[1252, 26777, 26831], [1252, 26777, 26831], [1252, 26777, 26831], [1252, 26777, 26831], [1252, 26777, 26831], [1252, 26777, 26831], [25197, 26777, 26781], [25197, 26777, 26781], [25197, 26777, 26781], [26764, 25803, 26781], [26764, 25803, 26781], [25197, 26777, 26781], [25197, 26777, 26781], [1252, 26777, 16172], [1252, 26777, 16172]]

Each array in Y, for example [1252, 26777, 26831], holds three separate features (one value per output to predict).

I am using the KNN classifier from the scikit-learn module:

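from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier

# X: 2-D feature array, Y: list of 3-element label lists (both shown above)
classifier = KNeighborsClassifier(n_neighbors=3)
classifier.fit(X, Y)
predictions = classifier.predict(X)
print(accuracy_score(Y, predictions))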

But I get an error saying that multiclass-multioutput is not supported.

I guess the structure of Y is not supported. What changes do I need to make for the program to run?

Input:

  Deluxe Single room with sea view

Expected output:

c_class = Deluxe
c_occ = single
c_view = sea

Recommended answer

As the error message says, multiclass-multioutput targets are not supported by the code as written.
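
A quick way to see where the message comes from is to look at how scikit-learn categorizes Y and to score Y against itself, with no classifier involved. This is only a sketch: the three rows are copied from the question, and the exact behavior may differ between scikit-learn versions. In the releases this was checked against, accuracy_score is the call that rejects this target type.

```python
from sklearn.metrics import accuracy_score
from sklearn.utils.multiclass import type_of_target

# A few rows of Y in the same shape as in the question.
Y = [[1252, 26777, 26831], [25197, 26777, 26781], [26764, 25803, 26781]]

# How scikit-learn categorizes this kind of target.
target_type = type_of_target(Y)
print(target_type)

# Scoring Y against itself isolates the metric from any classifier.
try:
    accuracy_score(Y, Y)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```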

For your problem, you need MultiOutputClassifier().

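from sklearn.multioutput import MultiOutputClassifier
from sklearn.neighbors import KNeighborsClassifier

# Fit one KNN classifier per column of Y; n_jobs=-1 uses all available cores.
knn = KNeighborsClassifier(n_neighbors=3)
classifier = MultiOutputClassifier(knn, n_jobs=-1)
classifier.fit(X, Y)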
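
Conceptually, the wrapper fits one independent classifier per column of Y and stacks the per-column predictions. The sketch below compares a hand-written version of that idea with MultiOutputClassifier. The toy X and Y arrays are made up for illustration and are not from the question.

```python
import numpy as np
from sklearn.multioutput import MultiOutputClassifier
from sklearn.neighbors import KNeighborsClassifier

# Toy data (hypothetical): 6 samples, 2 features, 2 target columns.
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
Y = np.array([[1, 10], [1, 10], [1, 20], [2, 20], [2, 20], [2, 10]])

# Manual version: one KNN per target column, predictions stacked column-wise.
per_column_models = [
    KNeighborsClassifier(n_neighbors=3).fit(X, Y[:, j])
    for j in range(Y.shape[1])
]
manual_pred = np.column_stack([m.predict(X) for m in per_column_models])

# Wrapper version, as in the answer.
wrapped = MultiOutputClassifier(KNeighborsClassifier(n_neighbors=3)).fit(X, Y)
wrapped_pred = wrapped.predict(X)

print(np.array_equal(manual_pred, wrapped_pred))
```

Because each column gets its own model, the outputs are predicted independently of one another; any correlation between, say, room class and view is not used.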

Working example:

>>> from sklearn.feature_extraction.text import TfidfVectorizer
>>> corpus = [
...     'This is the first document.',
...     'This document is the second document.',
...     'And this is the third one.',
...     'Is this the first document?',
... ]
>>> vectorizer = TfidfVectorizer()
>>> X = vectorizer.fit_transform(corpus)

>>> Y = [[124323,1234132,1234],[124323,4132,14],[1,4132,1234],[1,4132,14]]

>>> from sklearn.multioutput import MultiOutputClassifier
>>> from sklearn.neighbors import KNeighborsClassifier
>>> knn = KNeighborsClassifier(n_neighbors=3)
>>> classifier = MultiOutputClassifier(knn, n_jobs=-1)
>>> classifier.fit(X,Y)
>>> predictions = classifier.predict(X)
>>> predictions
array([[124323,   4132,     14],
       [124323,   4132,     14],
       [     1,   4132,   1234],
       [124323,   4132,     14]])

>>> import numpy as np
>>> classifier.score(X, np.array(Y))
0.5

>>> test_data = ['I want to test this']
>>> classifier.predict(vectorizer.transform(test_data))
array([[124323,   4132,     14]])
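
The 0.5 returned by classifier.score in the example is an exact-match accuracy: a row counts as correct only when all three outputs match. If a per-output view is more useful, it can be computed directly with NumPy. The sketch below reuses the Y and the predictions printed in the example; the variable names are only illustrative.

```python
import numpy as np

# Targets and predictions copied from the working example.
y_true = np.array([[124323, 1234132, 1234],
                   [124323, 4132, 14],
                   [1, 4132, 1234],
                   [1, 4132, 14]])
y_pred = np.array([[124323, 4132, 14],
                   [124323, 4132, 14],
                   [1, 4132, 1234],
                   [124323, 4132, 14]])

matches = (y_true == y_pred)

# Accuracy of each output column on its own.
per_output_accuracy = matches.mean(axis=0)
# Fraction of rows where every output is correct.
exact_match_accuracy = matches.all(axis=1).mean()

print(per_output_accuracy)   # [0.75 0.75 0.75]
print(exact_match_accuracy)  # 0.5
```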
