Precision and recall in fastText?

Question

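I implemented fastText for text classification by following the tutorial (https://github.com/facebookresearch/fastText/blob/master/tutorials/supervised-learning.md). What do precision@1 (P@1) and P@5 mean? I ran a binary classification and tested different values of k, but I don't understand the results: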

haos-mbp:fastText hao$ ./fasttext test trainmodel.bin train.valid 2
N   312
P@2 0.5
R@2 1
Number of examples: 312
haos-mbp:fastText hao$ ./fasttext test trainmodel.bin train.valid 1
N   312
P@1 0.712
R@1 0.712
Number of examples: 312
haos-mbp:fastText hao$ ./fasttext test trainmodel.bin train.valid 3
N   312
P@3 0.333
R@3 1
Number of examples: 312

Answer

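Precision is the ratio of the number of relevant results to the total number of results the program retrieved. Suppose a document search engine retrieves 100 documents, 90 of which are relevant to the query; the precision is then 90/100 (0.9). Because the precision was computed over 100 results, this is P@100.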

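Recall is the ratio of the relevant results the algorithm retrieved to the total number of relevant results. In the same example, if there are 110 relevant documents in total, the recall is 90/110 (about 0.818).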
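
As a quick check of the arithmetic in the search engine example, here is a minimal Python sketch. The function name precision_recall and its parameter names are made up for illustration and do not come from fastText or any other library.

```python
def precision_recall(relevant_retrieved: int, retrieved: int, relevant_total: int) -> tuple[float, float]:
    """Return (precision, recall) for a single query."""
    # Precision: share of the retrieved results that are relevant.
    precision = relevant_retrieved / retrieved
    # Recall: share of all relevant results that were retrieved.
    recall = relevant_retrieved / relevant_total
    return precision, recall


# Numbers from the example: 100 documents retrieved, 90 of them relevant,
# 110 relevant documents in total.
precision, recall = precision_recall(relevant_retrieved=90, retrieved=100, relevant_total=110)
print(f"P@100 = {precision:.3f}")  # P@100 = 0.900
print(f"R@100 = {recall:.3f}")     # R@100 = 0.818
```

Both ratios share the same numerator (the 90 relevant documents that were retrieved); only the denominator differs.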

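In short, recall measures how complete an information retrieval program is at fetching relevant results, and precision measures how accurate the returned results are.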

For binary classification in fastText, see also https://github.com/facebookresearch/fastText/issues/93
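
The same definitions can be applied to the numbers in the question. The sketch below is not fastText code; it rests on three assumptions, all labeled in the comments: every one of the 312 examples has exactly one true label, precision is computed as hits / (k × number of examples) even when the model has fewer than k labels, and the top prediction is correct for 222 examples (inferred from P@1 = 0.712, since 222/312 ≈ 0.712). With only two labels, asking for k = 2 or more returns both labels, so the true label is always among the predictions.

```python
def precision_recall_at_k(hits: int, n_examples: int, n_true_labels: int, k: int) -> tuple[float, float]:
    """Return (P@k, R@k) given how many true labels appear in the top-k predictions."""
    # Assumption: the precision denominator is k predictions per example,
    # even if the model has fewer than k distinct labels.
    precision = hits / (k * n_examples)
    recall = hits / n_true_labels
    return precision, recall


N = 312          # examples in the validation file (from the output in the question)
TOP1_HITS = 222  # assumption: inferred from P@1 = 0.712 (222 / 312 is about 0.712)

# For k >= 2 a two-label model returns every label, so each example's
# single true label is always found: hits == N.
hits_by_k = {1: TOP1_HITS, 2: N, 3: N}

results = {}
for k, hits in hits_by_k.items():
    # One true label per example (assumption), so n_true_labels == N.
    results[k] = precision_recall_at_k(hits, N, N, k)
    p, r = results[k]
    print(f"P@{k} {p:.3g}  R@{k} {r:.3g}")
```

Under these assumptions the sketch reproduces the reported values: P@1 = R@1 = 0.712 is simply the accuracy, while P@2 = 0.5 and P@3 = 0.333 are 1/2 and 1/3 because only one of the k counted predictions can ever be correct. For a binary classifier, P@1 is therefore the number worth reading.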

07-24 16:13