If I understand the Lucene Javadoc correctly, setting a CustomScoreQuery instance to strict should pass the value produced by the FunctionQuery's field source (here a FloatFieldSource) as valSrcScore to CustomScoreProvider's public float customScore(int doc, float subQueryScore, float valSrcScore) without modification (such as normalization). I therefore expected to receive in valSrcScore exactly the float value stored in the document.
However, this does not seem to be the case once the amount of indexed data grows. Here is a simple example to illustrate what I mean:
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.queries.*;
import org.apache.lucene.queries.function.FunctionQuery;
import org.apache.lucene.queries.function.valuesource.FloatFieldSource;
import org.apache.lucene.search.*;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

import java.io.IOException;

public class CustomScoreTest {

    public static void main(String[] args) throws IOException {
        RAMDirectory index = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(Version.LATEST, new StandardAnalyzer());
        IndexWriter writer = new IndexWriter(index, config);

        // prepare dummy text
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) sb.append("abc ");
        String text = sb.toString();

        // add dummy docs
        for (int i = 0; i < 25000; i++) {
            Document doc = new Document();
            doc.add(new FloatField("number", i * 100f, Field.Store.YES));
            doc.add(new TextField("text", text, Field.Store.YES));
            writer.addDocument(doc);
        }
        writer.close();

        IndexReader reader = DirectoryReader.open(index);
        final IndexSearcher searcher = new IndexSearcher(reader);

        Query q1 = new TermQuery(new Term("text", "abc"));
        CustomScoreQuery q2 = new CustomScoreQuery(q1, new FunctionQuery(new FloatFieldSource("number"))) {
            @Override
            protected CustomScoreProvider getCustomScoreProvider(AtomicReaderContext ctx) throws IOException {
                return new CustomScoreProvider(ctx) {
                    @Override
                    public float customScore(int doc, float subQueryScore, float valSrcScore) throws IOException {
                        // compare the value passed in as valSrcScore with the value stored in the document
                        float diff = Math.abs(valSrcScore - searcher.doc(doc).getField("number").numericValue().floatValue());
                        if (diff > 0) throw new IllegalStateException("diff: " + diff);
                        return super.customScore(doc, subQueryScore, valSrcScore);
                    }
                };
            }
        };

        // In strict custom scoring, the ValueSource part does not participate in weight
        // normalization. This may be useful when one wants full control over how scores
        // are modified, and does not care about normalizing by the ValueSource part.
        q2.setStrict(true);

        // Exception in thread "main" java.lang.IllegalStateException: diff: 1490700.0
        searcher.search(q2, 10);
    }
}
As the example shows, the exception is thrown because the value received differs greatly from the actual value stored in the document's "number" field.
However, when I reduce the number of indexed dummy documents to 2,500, it works as expected and the difference from the value in the "number" field is 0.
What am I doing wrong here?
Best Answer
Which version of Lucene are you running? One possibility is that AtomicReaderContext should be replaced with LeafReaderContext as the index size grows. Just a guess.
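For reference, a minimal sketch of how the provider override from the question might look against a Lucene 5.x API, where AtomicReaderContext has been replaced by LeafReaderContext. The wrap helper and class name here are purely illustrative; the rest of the setup is assumed to stay as in the question:

import java.io.IOException;

import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.queries.CustomScoreProvider;
import org.apache.lucene.queries.CustomScoreQuery;
import org.apache.lucene.queries.function.FunctionQuery;
import org.apache.lucene.queries.function.valuesource.FloatFieldSource;
import org.apache.lucene.search.Query;

public class LeafReaderCustomScore {

    // Wraps a sub-query in the same kind of CustomScoreQuery as in the question,
    // but overrides the Lucene 5.x hook, which takes a LeafReaderContext.
    static CustomScoreQuery wrap(Query subQuery) {
        FunctionQuery function = new FunctionQuery(new FloatFieldSource("number"));
        CustomScoreQuery q = new CustomScoreQuery(subQuery, function) {
            @Override
            protected CustomScoreProvider getCustomScoreProvider(LeafReaderContext ctx) throws IOException {
                return new CustomScoreProvider(ctx) {
                    @Override
                    public float customScore(int doc, float subQueryScore, float valSrcScore) throws IOException {
                        // valSrcScore is the value the FloatFieldSource produced for this doc
                        return super.customScore(doc, subQueryScore, valSrcScore);
                    }
                };
            }
        };
        // keep the ValueSource part out of weight normalization, as in the question
        q.setStrict(true);
        return q;
    }
}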
Regarding "java - Lucene CustomScoreQuery does not pass the value from the FunctionQuery's FieldSource", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/28263382/