This article covers how to get result positions instead of highlighted text from an Elasticsearch query.

Question

I am trying to get positions instead of highlighted text as the result of an Elasticsearch query.

Create the index:

PUT /test/
{
  "mappings": {
    "article": {
      "properties": {
        "text": {
          "type": "text",
          "analyzer": "english"
        },
        "author": {
          "type": "text"
        }
      }
    }
  }
}

Index a document:

PUT /test/article/1
{
  "author": "Just Me",
  "text": "This is just a simple test to demonstrate the audience the purpose of the question!"
}

Search for the document:

GET /test/article/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "text": {
              "query": "simple test",
              "_name": "must"
            }
          }
        }
      ],
      "should": [
        {
          "match_phrase": {
            "text": {
              "query": "need help",
              "_name": "first",
              "slop": 2
            }
          }
        },
        {
          "match_phrase": {
            "text": {
              "query": "purpose question",
              "_name": "second",
              "slop": 3
            }
          }
        },
        {
          "match_phrase": {
            "text": {
              "query": "don't know anything",
              "_name": "third"
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "fields": {
      "text": {}
    }
  }
}

When I run this search, I get a result like this:

This is just a simple test to <em>demonstrate</em> the audience the purpose of the <em>question</em>!

I'm not interested in getting the results wrapped in em tags. Instead, I want to get all the positions of the results, like so:

"hits": [
   { "start_offset": 30, "end_offset": 40 },
   { "start_offset": 74, "end_offset": 81 }
]

I hope you get the idea!

Recommended answer

To get the offset position of a word in a text, add a term vector to the field in your index mapping (see the Elasticsearch term vector documentation). As the documentation states, this parameter has to be enabled at index time:

"term_vector": "with_positions_offsets_payloads"
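
As a sketch of where that setting goes, the mapping from the question could be extended as shown below. This assumes the same `test` index and `article` mapping type used in the question. Because the setting takes effect at index time, an existing index would likely need to be recreated and its documents indexed again. If payloads are not needed, the `with_positions_offsets` value should be sufficient.

```
PUT /test/
{
  "mappings": {
    "article": {
      "properties": {
        "text": {
          "type": "text",
          "analyzer": "english",
          "term_vector": "with_positions_offsets_payloads"
        },
        "author": {
          "type": "text"
        }
      }
    }
  }
}
```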

For the specific query, follow the linked documentation page.
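
For illustration, a term vectors request for the document from the question might look like the following. This is a sketch that assumes the typed endpoint form matching the `article` mapping type used above. Newer Elasticsearch versions without mapping types use `GET /test/_termvectors/1` instead.

```
GET /test/article/1/_termvectors
{
  "fields": ["text"],
  "positions": true,
  "offsets": true,
  "payloads": false,
  "term_statistics": false,
  "field_statistics": false
}
```

The response lists every term of the `text` field, each with a `tokens` array containing `position`, `start_offset` and `end_offset`. Three points to keep in mind. First, the terms appear in their analyzed form, so with the `english` analyzer they are stemmed. Second, the API returns all terms of the document rather than only those matched by the search, so the client still has to pick out the terms it cares about. Third, `end_offset` points one character past the end of the token, whereas the desired output in the question uses the index of the last character.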


08-19 01:58