Question
Is there built-in functionality in Solr/Lucene to filter out results that fall below a certain score threshold? For example, if I provide a score threshold of 0.2, all documents with a score below 0.2 would be removed from my results. My intuition is that this is possible by updating or customizing Solr or Lucene.
Could you point me in the right direction on how to do this?
Thanks in advance!
Recommended answer
You could write your own Collector that skips any document the scorer places below your threshold. Below is a simple example of this using Lucene.Net 2.9.1.2 and C#. It collects only document IDs, so you'll need to modify it if you want to keep the calculated score.
using System;
using System.Collections.Generic;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Collects the ids of documents whose score is at or above a lower bound.
public class ScoreLimitingCollector : Collector {
    private readonly Single _lowerInclusiveScore;
    private readonly List<Int32> _docIds = new List<Int32>();
    private Scorer _scorer;
    private Int32 _docBase;

    // Index-wide ids of the documents that passed the threshold.
    public IEnumerable<Int32> DocumentIds {
        get { return _docIds; }
    }

    public ScoreLimitingCollector(Single lowerInclusiveScore) {
        _lowerInclusiveScore = lowerInclusiveScore;
    }

    public override void SetScorer(Scorer scorer) {
        _scorer = scorer;
    }

    public override void Collect(Int32 doc) {
        var score = _scorer.Score();
        // Keep the document only if it meets the inclusive lower bound.
        if (_lowerInclusiveScore <= score)
            _docIds.Add(_docBase + doc);  // doc is relative to the current reader
    }

    public override void SetNextReader(IndexReader reader, Int32 docBase) {
        // Remember the offset that turns per-reader ids into index-wide ids.
        _docBase = docBase;
    }

    public override bool AcceptsDocsOutOfOrder() {
        // Collection order does not matter for a simple threshold filter.
        return true;
    }
}
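The answer notes that the collector has to be modified if the calculated score should be kept. The following is a minimal sketch of one way to do that, assuming the same Lucene.Net 2.9 Collector API as the example above. The names ScoreKeepingCollector and Hits are hypothetical and chosen only for illustration; they are not part of Lucene.Net.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical variant of ScoreLimitingCollector that stores each
// surviving document's score next to its id. Not part of Lucene.Net.
public class ScoreKeepingCollector : Collector {
    private readonly Single _lowerInclusiveScore;
    private readonly List<KeyValuePair<Int32, Single>> _hits =
        new List<KeyValuePair<Int32, Single>>();
    private Scorer _scorer;
    private Int32 _docBase;

    // (index-wide document id, score) pairs in collection order;
    // they are NOT sorted by score.
    public IEnumerable<KeyValuePair<Int32, Single>> Hits {
        get { return _hits; }
    }

    public ScoreKeepingCollector(Single lowerInclusiveScore) {
        _lowerInclusiveScore = lowerInclusiveScore;
    }

    public override void SetScorer(Scorer scorer) {
        _scorer = scorer;
    }

    public override void Collect(Int32 doc) {
        // Score once and reuse the value for both the check and the result.
        var score = _scorer.Score();
        if (_lowerInclusiveScore <= score)
            _hits.Add(new KeyValuePair<Int32, Single>(_docBase + doc, score));
    }

    public override void SetNextReader(IndexReader reader, Int32 docBase) {
        _docBase = docBase;
    }

    public override bool AcceptsDocsOutOfOrder() {
        return true;
    }
}
```

Either collector is handed to the searcher in place of a hit count, for example searcher.Search(query, collector), after which the collector's property holds the surviving documents. Keep in mind that a collector sees Lucene's raw scores, which are not normalized to a fixed range, so a threshold such as 0.2 is only meaningful relative to a particular query and index.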