Lucene payload scoring

Question

I want to figure out how payload scoring works in Lucene. Since I don't understand where PayloadFunction fits in, I don't think I really understand how it works. I tried googling it, but found little beyond advice to read the source. It would be nice if someone could explain it here; otherwise, source code it is :)

Recommended answer

There are three parts to it. First, you generate payloads during analysis. This is done with PayloadAttribute: add the attribute in a TokenFilter and set a payload on each term that should carry one.

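import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.payloads.PayloadHelper;
import org.apache.lucene.analysis.tokenattributes.PayloadAttribute;
import org.apache.lucene.index.Payload;

// Attaches the same float payload (42) to every token. Uses the Lucene 3.x Payload class.
class MyFilter extends TokenFilter {

  private final PayloadAttribute attr;

  public MyFilter(TokenStream input) {
    super(input);  // TokenFilter has no no-arg constructor
    attr = addAttribute(PayloadAttribute.class);
  }

  @Override
  public final boolean incrementToken() throws IOException {
    if (!input.incrementToken()) {
      return false;  // end of stream, nothing to annotate
    }
    attr.setPayload(new Payload(PayloadHelper.encodeFloat(42f)));
    return true;
  }
}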
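To make the payload bytes less opaque, here is a minimal sketch of the round trip that encodeFloat and decodeFloat perform. It assumes, to my understanding of PayloadHelper, that a float is stored as four big-endian bytes; the class PayloadBytesSketch and its two methods are hypothetical stand-ins written with the JDK only, not Lucene code.

```java
import java.nio.ByteBuffer;

public class PayloadBytesSketch {

    // Hypothetical stand-in for PayloadHelper.encodeFloat:
    // assumes the float is written as 4 bytes in big-endian order.
    public static byte[] encodeFloat(float value) {
        return ByteBuffer.allocate(4).putFloat(value).array();
    }

    // Hypothetical stand-in for PayloadHelper.decodeFloat(bytes, offset):
    // reads 4 bytes starting at the given offset.
    public static float decodeFloat(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, 4).getFloat();
    }

    public static void main(String[] args) {
        byte[] payload = encodeFloat(42f);
        System.out.println(payload.length);           // 4
        System.out.println(decodeFloat(payload, 0));  // 42.0
    }
}
```

The offset parameter matters because Lucene hands scorePayload a byte array together with an offset and a length, rather than an array that starts at the payload.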

Then, at search time, use the special query class PayloadTermQuery. It behaves like SpanTermQuery but also keeps track of the payloads stored in the index. With a custom Similarity implementation you can score each payload occurrence in a document.

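import org.apache.lucene.analysis.payloads.PayloadHelper;
import org.apache.lucene.search.DefaultSimilarity;

public class MySimilarity extends DefaultSimilarity {

  // Called once per matching term occurrence; returns that occurrence's payload score.
  @Override
  public float scorePayload(int docID, String fieldName,
                            int start, int end, byte[] payload,
                            int offset, int length) {
    if (payload != null) {
      return PayloadHelper.decodeFloat(payload, offset);
    } else {
      return 1.0f;  // neutral score when the occurrence has no payload
    }
  }
}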

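Finally, PayloadFunction aggregates the per-occurrence payload scores within a document to produce the final document score. The function is passed to the PayloadTermQuery constructor; Lucene ships AveragePayloadFunction, MaxPayloadFunction, and MinPayloadFunction.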
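The aggregation step can be sketched without Lucene on the classpath. The code below is a simplified, hypothetical mirror of the PayloadFunction idea: one method folds each per-occurrence payload score into a running value, and a second turns that running value into the payload part of the document score. The names Aggregator, AVERAGE, MAX, and aggregate are assumptions made for illustration, and the real Lucene methods take additional arguments such as the document ID and field name.

```java
public class PayloadAggregationSketch {

    // Hypothetical, simplified mirror of the PayloadFunction contract.
    public interface Aggregator {
        // Fold one more per-occurrence payload score into the running value.
        float currentScore(int numPayloadsSeen, float currentScore, float currentPayloadScore);

        // Turn the running value into the payload part of the document score.
        float docScore(int numPayloadsSeen, float payloadScore);
    }

    // Averages the payload scores; 1 when no payload was seen.
    public static final Aggregator AVERAGE = new Aggregator() {
        public float currentScore(int seen, float current, float payloadScore) {
            return current + payloadScore;
        }
        public float docScore(int seen, float payloadScore) {
            return seen > 0 ? payloadScore / seen : 1f;
        }
    };

    // Keeps the largest payload score; 1 when no payload was seen.
    public static final Aggregator MAX = new Aggregator() {
        public float currentScore(int seen, float current, float payloadScore) {
            return seen == 0 ? payloadScore : Math.max(current, payloadScore);
        }
        public float docScore(int seen, float payloadScore) {
            return seen > 0 ? payloadScore : 1f;
        }
    };

    // Runs the aggregator over the scores of all occurrences in one document.
    public static float aggregate(Aggregator fn, float[] payloadScores) {
        float running = 0f;
        int seen = 0;
        for (float score : payloadScores) {
            running = fn.currentScore(seen, running, score);
            seen++;
        }
        return fn.docScore(seen, running);
    }

    public static void main(String[] args) {
        float[] scores = {1f, 2f, 3f};  // e.g. values returned by scorePayload
        System.out.println(aggregate(AVERAGE, scores));  // 2.0
        System.out.println(aggregate(MAX, scores));      // 3.0
    }
}
```

In this picture, the values in scores are what a Similarity's scorePayload returns for each occurrence of the term in one document, and the chosen aggregator decides whether a document with many weak payloads outranks one with a single strong payload.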
