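The full code is available here: https://repl.it/@JacksonEnnis/KNNPercentage
I am trying to use the KNN tools in scikit-learn to make some predictions.
I have two functions, recurse() and predict(). recurse() is meant to walk through every possible combination of features, while predict() is supposed to do the actual prediction.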
def predict(self, data, answers):
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import train_test_split as tts
    import numpy as np
    if len(data) > 1:
        print("length before transposition {}".format(len(data)))
        #n_data = np.transpose(data)
        #print("length after transposition {}".format(len(n_data)))
        knn = KNeighborsClassifier(n_neighbors=1)
        xTrain, xTest, yTrain, yTest = tts(data, answers)
        print("xTrain data: {}".format(len(xTrain)))
        knn.fit(xTrain, yTrain)
        print(knn.score(xTest, yTest))
def recurse(self, data):
    self.predict(data, self.y)
    if len(data) > 0:
        self.recurse(self.rLeft(data))
    if len(data) > 1:
        self.recurse(self.rMid(data))
    if len(data) > 2:
        self.recurse(self.rRight(data))
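However, when I run the program, it reports a problem on the train/test split line. I checked the samples in each feature as well as the answers and found that they have the same length, so I am not sure why this happens.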
Traceback (most recent call last):
  File "main.py", line 12, in <module>
    best = Config(apple)
  File "/home/runner/Config.py", line 13, in __init__
    self.predict(self.features, self.y)
  File "/home/runner/Config.py", line 45, in predict
    xTrain, xTest, yTrain, yTest = tts(data, answers)
  File "/home/runner/.local/lib/python3.6/site-packages/sklearn/model_selection/_split.py", line 2096, in train_test_split
    arrays = indexable(*arrays)
  File "/home/runner/.local/lib/python3.6/site-packages/sklearn/utils/validation.py", line 230, in indexable
    check_consistent_length(*result)
  File "/home/runner/.local/lib/python3.6/site-packages/sklearn/utils/validation.py", line 205, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [20, 499]
Best answer
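Your axes are reversed. train_test_split requires array.shape[0] to be the same for every array you pass in. Judging by the error message, data has 20 entries along its first axis (one per feature) while answers has 499 (one per sample), so data needs to be transposed so that samples run along the first axis.
I suggest looking at the scikit-learn docs for more examples.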
tts(np.array(data).T, answers)
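
To make the shape requirement concrete, here is a minimal sketch that reproduces the error and then applies the transpose. The random data is an assumption made purely for illustration (20 feature lists of 499 values each, plus 499 binary labels, chosen to mirror the [20, 499] in the error message); it is not the asker's dataset, and the score it prints is meaningless.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): 20 features stored as 20 lists of
# 499 values, and one label per sample.
data = rng.normal(size=(20, 499)).tolist()
answers = rng.integers(0, 2, size=499)

# Features-first layout: len(data) is 20 but len(answers) is 499,
# so the consistency check in train_test_split rejects the inputs.
try:
    train_test_split(data, answers)
    failed = False
except ValueError as err:
    failed = True
    print(err)

# Transposing puts samples on axis 0, giving shape (499, 20).
X = np.asarray(data).T
x_train, x_test, y_train, y_test = train_test_split(X, answers, random_state=0)

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(x_train, y_train)
score = knn.score(x_test, y_test)
print(X.shape, x_train.shape, x_test.shape, score)
```

Because recurse() passes feature subsets in the same features-first layout, doing the transpose inside predict() should cover every call, as long as each subset still holds one list per feature.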
Regarding python - KNN: Found input variables with inconsistent numbers of samples: [20, 499], a similar question was found on Stack Overflow: https://stackoverflow.com/questions/56811515/