Question
How do I get a similarity measure for a document using Whoosh?
I want to create a "Related" feature that ranks previously indexed documents by how similar they are to a given document.
Do I input the document as a long query string? Or do I add the document to the index and somehow extract a similarity query result from there?
Thanks.
Recommended answer
The Whoosh Searcher class has a method called 'more_like()'.
It lets you compare an indexed document to the other indexed documents and returns the documents most similar to the given one.
Each result is a whoosh.searching.Hit object, which gives you a rank and a score.