Problem description
I am using WEKA for text classification. I trained a model on my training data set after applying the StringToWordVector and NumericToNominal filters, and I applied the same filters to my test data set. When I try to apply the model to the test data, I get the following error: "Train and test set are not compatible". I searched for a solution and found that the error occurs because the number of attributes differs between the two sets, and it will always differ because the texts in the two sets are different.
How can I solve this error?
Recommended answer
The best thing you can do is combine your training and test sets into one file and apply the filter to it all in one go. Then split them up again and copy the @attribute declarations from the combined file into both the training and test files. This way the attributes will be consistent across both files.
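To see why filtering the combined data helps, here is a minimal sketch in plain Java. It is a toy stand-in for a word-vector filter, not the WEKA API: the class `SharedVocabulary` and its methods `buildVocabulary` and `vectorize` are hypothetical names made up for this illustration. The word list is built once from all documents, so every vector has the same length regardless of which set the document came from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy illustration of "filter the combined data, then split".
// Plain Java only; this is NOT WEKA code and all names are hypothetical.
class SharedVocabulary {

    // Collect every distinct lowercase word from the given documents, sorted.
    static List<String> buildVocabulary(List<String> documents) {
        TreeSet<String> words = new TreeSet<>();
        for (String doc : documents) {
            for (String token : doc.toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    words.add(token);
                }
            }
        }
        return new ArrayList<>(words);
    }

    // Turn one document into a 0/1 vector with one slot per vocabulary word.
    // Words that are not in the vocabulary are simply ignored.
    static int[] vectorize(String document, List<String> vocabulary) {
        int[] vector = new int[vocabulary.size()];
        for (String token : document.toLowerCase().split("\\W+")) {
            int index = vocabulary.indexOf(token);
            if (index >= 0) {
                vector[index] = 1;
            }
        }
        return vector;
    }
}
```

If the vocabulary were instead built separately for each set, the two sets would generally end up with different word lists and therefore different attribute counts, which is the situation described in the question.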