I indexed the following book:

curl -X PUT localhost:9200/books/book/1 -d '{
    "title": "All Quiet on the Western Front",
    "author": "Erich Maria Remarque",
    "year": 1929
}'

I am trying to implement the phrase suggester using the code from the official docs.

So I tried:
curl -XPOST 'localhost:9200/books/_search' -d '{
  "suggest" : {
    "text" : "al quet",
    "simple_phrase" : {
      "phrase" : {
        "analyzer" : "body",
        "field" : "bigram",
        "size" : 1,
        "real_word_error_likelihood" : 0.95,
        "max_errors" : 0.5,
        "gram_size" : 2,
        "direct_generator" : [ {
          "field" : "title",
          "suggest_mode" : "always",
          "min_word_length" : 1
        } ],
        "highlight": {
          "pre_tag": "<em>",
          "post_tag": "</em>"
        }
      }
    }
  }
}'

I expected this to correct "al quet" to "all quiet".

But I get the following error:
  "error" : {
    "root_cause" : [ {
      "type" : "illegal_argument_exception",
      "reason" : "Analyzer [body] doesn't exists"

If I change "analyzer" : "body" to "analyzer" : "title", I get the same error, this time naming title:
  "error" : {
    "root_cause" : [ {
      "type" : "illegal_argument_exception",
      "reason" : "Analyzer [title] doesn't exists"

If I change "analyzer" : "body" to "analyzer" : "default", that line no longer raises an error, but the next line, "field" : "bigram", does:
  "error" : {
     "root_cause" : [ {
       "type" : "illegal_argument_exception",
       "reason" : "No mapping found for field [bigram]"

The only way I could get this to work was to set "analyzer" : "default" and "field" : "title":
curl -XPOST 'localhost:9200/books/_search?pretty=true' -d '{
  "suggest" : {
    "text" : "al quet",
    "simple_phrase" : {
      "phrase" : {
        "analyzer" : "default",
        "field" : "title",
        "size" : 1,
        "real_word_error_likelihood" : 0.95,
        "max_errors" : 0.5,
        "gram_size" : 2,
        "direct_generator" : [ {
          "field" : "title",
          "suggest_mode" : "always",
          "min_word_length" : 1
        } ],
        "highlight": {
          "pre_tag": "<em>",
          "post_tag": "</em>"
        }
      }
    }
  }
}'

With this I get the following output:
 "suggest" : {
    "simple_phrase" : [ {
      "text" : "al quet",
      "offset" : 0,
      "length" : 7,
      "options" : [ {
        "text" : "al quiet",
        "highlighted" : "al <em>quiet</em>",
        "score" : 0.09049256
      } ]
    } ]
  }

As you can see, it corrects quet to quiet but leaves al unchanged. All my other attempts behaved the same way: only one word gets corrected.

How can I build a phrase suggester that, for the example above, takes "al quet" and returns "all quiet"?

Best answer

You get the first error because the index has no analyzer named body, and none named title either.

The second error is caused by the missing field bigram: the index has only three fields, title, author and year.
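To check which fields and analyzers an index really has, you can ask Elasticsearch for its mapping and settings. This is a minimal sketch that assumes the node from the question at localhost:9200; the helper name show_index_config is only illustrative and not part of Elasticsearch.

```shell
# Base URL of the node; localhost:9200 is taken from the question.
ES_URL="${ES_URL:-localhost:9200}"

# show_index_config is a hypothetical helper name used for illustration.
# It prints the mapping (the fields) and the settings (any custom analyzers)
# of the index given as the first argument.
show_index_config() {
  index="$1"
  curl -s -XGET "$ES_URL/$index/_mapping?pretty"
  curl -s -XGET "$ES_URL/$index/_settings?pretty"
}
```

Calling show_index_config books should list title, author and year in the mapping, with no bigram field and no analyzer named body or title in the settings.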

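With the current setup, you need to give max_errors a higher value for the suggester to work. According to the documentation, max_errors is the maximum percentage of terms that can be considered misspellings in order to form a correction. It accepts a float in the range [0..1) as a fraction of the query terms, or a number >= 1 as an absolute number of query terms. For a two-word input such as "al quet", 0.5 amounts to a single term, so only one word can be corrected.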



So this should give you the desired output (note the higher max_errors value):

{
  "suggest": {
    "text": "al quet",
    "simple_phrase": {
      "phrase": {
        "analyzer": "default",
        "field": "title",
        "size": 1,
        "real_word_error_likelihood": 0.95,
        "max_errors": 0.9,
        "gram_size": 2,
        "direct_generator": [
          {
            "field": "title",
            "suggest_mode": "always",
            "min_word_length": 1
          }
        ],
        "highlight": {
          "pre_tag": "<em>",
          "post_tag": "</em>"
        }
      }
    }
  },
  "size": 0
}
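To see why raising max_errors from 0.5 to 0.9 matters for a two-word input, here is a rough sketch of the arithmetic. The rounding rule (fraction times number of terms, rounded to the nearest integer, with at least one correction allowed) is an assumption about the behaviour, since the documentation only describes the value as a fraction of the query terms. The function name allowed_errors is made up for this example.

```shell
# allowed_errors MAX_ERRORS NUM_TERMS
# Hypothetical helper: estimates how many terms may be corrected.
# Assumption: values below 1 are a fraction of the terms, rounded to the
# nearest integer, and at least one correction is always allowed.
allowed_errors() {
  awk -v e="$1" -v n="$2" 'BEGIN {
    if (e >= 1) k = int(e)            # absolute number of terms
    else        k = int(e * n + 0.5)  # fraction of the terms, rounded
    if (k < 1)  k = 1                 # assumed minimum of one correction
    print k
  }'
}

allowed_errors 0.5 2   # prints 1
allowed_errors 0.9 2   # prints 2
```

Under that assumption, 0.5 lets the suggester fix only one of the two words, while 0.9 lets it fix both. Setting max_errors to 2 asks for the same thing as an absolute count.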

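You will probably want to use shingles for phrases, and collate to return only suggestions that actually match documents in the index. I have given a detailed answer to this question, which may help.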
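A shingle-based setup could look roughly like the sketch below. It is not a tested configuration: the index name books_v2, the filter name shingle_2, the analyzer name bigram, and the helper names create_index and suggest are all made up for illustration. The mapping uses the typed API and the "string" field type to match the question; newer Elasticsearch versions use "text" and drop the book type level. The Content-Type header is required by newer versions and harmless on older ones.

```shell
ES_URL="${ES_URL:-localhost:9200}"

# Index settings: a shingle filter emitting word pairs (plus single words),
# and a title.bigram sub-field analyzed with it.
INDEX_BODY='{
  "settings": {
    "analysis": {
      "filter": {
        "shingle_2": { "type": "shingle", "min_shingle_size": 2, "max_shingle_size": 2 }
      },
      "analyzer": {
        "bigram": { "type": "custom", "tokenizer": "standard", "filter": ["lowercase", "shingle_2"] }
      }
    }
  },
  "mappings": {
    "book": {
      "properties": {
        "title": {
          "type": "string",
          "fields": { "bigram": { "type": "string", "analyzer": "bigram" } }
        }
      }
    }
  }
}'

# Phrase suggester pointed at the shingled sub-field; max_errors is given
# as an absolute number of terms here.
SUGGEST_BODY='{
  "size": 0,
  "suggest": {
    "text": "al quet",
    "simple_phrase": {
      "phrase": {
        "field": "title.bigram",
        "gram_size": 2,
        "max_errors": 2,
        "direct_generator": [
          { "field": "title", "suggest_mode": "always", "min_word_length": 1 }
        ]
      }
    }
  }
}'

# Hypothetical helpers: create the index, then run the suggester against it.
create_index() {
  curl -s -XPUT "$ES_URL/books_v2" -H 'Content-Type: application/json' -d "$INDEX_BODY"
}
suggest() {
  curl -s -XPOST "$ES_URL/books_v2/_search?pretty" -H 'Content-Type: application/json' -d "$SUGGEST_BODY"
}
```

Run create_index, index the book into books_v2, then call suggest. A collate clause can be added to the phrase block afterwards to drop suggestions that match no document; its exact syntax differs between Elasticsearch versions, so check the documentation for the version you run.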
