Question
I've read the Lucene 4.0 documentation, and the library now stores some statistics so that it can compute different scoring models, one of them being BM25. Is there a way, besides fetching a document, to fetch its length too?
Recommended answer
You can store whatever you want from FieldInvertState into the norm, and it doesn't have to be an 8-bit float either.
The default is a lossy encoding of the length. If you want the exact length, you could store a short (16 bits) per document, or some other format, instead.
See Similarity.computeNorm.
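To make the lossy-versus-exact trade-off concrete, here is a minimal, self-contained sketch that does not call Lucene at all. The class `NormEncodingSketch` and its methods are hypothetical names, and the 8-bit log-scale bucketing is only a stand-in for a lossy one-byte encoding; it is not Lucene's actual norm format. In a real index, the value would presumably be computed inside a custom Similarity's computeNorm from the field length that FieldInvertState reports, and the exact signature depends on the Lucene 4.x release you are using.

```java
// Illustrative sketch only: these encoders are hypothetical and are NOT Lucene's norm encoding.
final class NormEncodingSketch {

    // Hypothetical lossy 8-bit encoding: bucket the length on a log2 scale,
    // with 8 buckets per doubling. Lengths below 1 are clamped to bucket 0.
    static byte encodeLossy(int length) {
        if (length <= 0) {
            return 0;
        }
        double log2 = Math.log(length) / Math.log(2.0);
        long bucket = Math.round(log2 * 8.0);
        return (byte) Math.min(bucket, 255L);
    }

    // Recover an approximate length from the bucket; nearby lengths share a bucket.
    static int decodeLossy(byte b) {
        int bucket = b & 0xFF; // treat the byte as unsigned
        long approx = Math.round(Math.pow(2.0, bucket / 8.0));
        return (int) Math.min(approx, (long) Integer.MAX_VALUE);
    }

    // Exact 16-bit encoding: the length itself, clamped to the range of a short.
    static short encodeExact(int length) {
        return (short) Math.max(0, Math.min(length, Short.MAX_VALUE));
    }

    static int decodeExact(short s) {
        return s;
    }

    public static void main(String[] args) {
        for (int length : new int[] {100, 101}) {
            System.out.println("length=" + length
                    + " lossy=" + decodeLossy(encodeLossy(length))
                    + " exact=" + decodeExact(encodeExact(length)));
        }
    }
}
```

With the one-byte encoding, neighbouring lengths such as 100 and 101 fall into the same bucket, so the original length cannot be recovered. The short keeps the exact value for any field of up to 32767 terms, at the cost of one extra byte per document.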