I'll simplify my question. Suppose I have an index containing 3 documents that I created with Kibana:

PUT /test/vendors/1
{
  "type": "doctor",
  "name": "Phil",
  "works_in": [
      {
        "place": "Chicago"
      },
      {
        "place": "New York"
      }
    ]
}

PUT /test/vendors/2
{
  "type": "lawyer",
  "name": "John",
  "works_in": [
      {
        "place": "Chicago"
      },
      {
        "place": "New Jersey"
      }
    ]
}

PUT /test/vendors/3
{
  "type": "doctor",
  "name": "Jill",
  "works_in": [
      {
        "place": "Chicago"
      }
    ]
}

Now I run a search:
GET /test/_search
{
  "query": {
    "multi_match" : {
      "query":    "doctor in chicago",
      "fields": [ "type", "place" ]
    }
  }
}

I get a good response:
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "test",
        "_type": "vendors",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "type": "doctor",
          "name": "Phil",
          "works_in": [
            {
              "place": "Chicago"
            },
            {
              "place": "New York"
            }
          ]
        }
      },
      {
        "_index": "test",
        "_type": "vendors",
        "_id": "3",
        "_score": 0.2876821,
        "_source": {
          "type": "doctor",
          "name": "Jill",
          "works_in": [
            {
              "place": "Chicago"
            }
          ]
        }
      }
    ]
  }
}

Now things start to get problematic...

Change doctor to doctors:
GET /test/_search
{
  "query": {
    "multi_match" : {
      "query":    "doctors in chicago",
      "fields": [ "type", "place" ]
    }
  }
}

Zero results are found for doctors. Elasticsearch does not know that the plural and the singular are the same word.

Change the query to New York:
GET /test/_search
{
  "query": {
    "multi_match" : {
      "query":    "doctor in new york",
      "fields": [ "type", "place" ]
    }
  }
}

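But the result set gives me the doctor in Chicago in addition to the doctor in New York. The fields are matched with OR...

Another interesting question is what happens if someone types docs, physicians, or health professionals but means doctor. Is there a way to teach Elasticsearch to funnel these terms into "doctor"?

Is there a way to solve this with Elasticsearch alone, so that I don't have to analyze the meaning of the string in my own application and then build a complex, exact Elasticsearch query to match it?

I would appreciate any pointers in the right direction.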

Best answer

I am assuming that the type and place fields are text fields that use the Standard Analyzer.

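To handle singular and plural forms, what you are looking for is the Snowball Token Filter. You add it to a custom analyzer and then assign that analyzer to the field in the mapping.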
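
As a quick check of what the stemmer does, the _analyze API can run an ad hoc filter chain without creating an index first. This is a minimal sketch, assuming an Elasticsearch version that accepts inline filter definitions in _analyze; the sample text is taken from the question.

```
GET /_analyze
{
  "tokenizer": "standard",
  "filter": [
    "lowercase",
    { "type": "snowball", "language": "English" }
  ],
  "text": "doctors in chicago"
}
```

The token for doctors should come back as doctor, which is why the singular and plural forms match each other once the same analyzer is applied at both index time and search time.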

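For the other requirement you mention, for example that physicians should also be treated as equivalent to doctor, you need the Synonym Token Filter.

Below is how your mapping would look. Note that I have added the analyzer to type and place. You can make similar changes to the mappings of other fields.

Mapping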

PUT <your_index_name>
{
   "settings":{
      "analysis":{
         "analyzer":{
            "my_analyzer":{
               "tokenizer":"standard",
               "filter":[
                  "lowercase",
                  "my_snow",
                  "my_synonym"
               ]
            }
         },
         "filter":{
            "my_snow":{
               "type":"snowball",
               "language":"English"
            },
            "my_synonym":{
               "type":"synonym",
               "synonyms":[
                  "docs, physicians, health professionals, doctor"
               ]
            }
         }
      }
   },
   "mappings":{
      "vendors":{
         "properties":{
            "type":{
               "type":"text",
               "analyzer":"my_analyzer"
            },
            "works_in":{
               "properties":{
                  "place":{
                     "type":"text",
                     "analyzer":"my_analyzer"
                  }
               }
            }
         }
      }
   }
}
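
Once the documents have been reindexed into an index that uses this analyzer, the search from the question can be retried. This is only a sketch, with one assumption worth checking: in the sample documents, place sits inside works_in, so the field is addressed here by its full path works_in.place rather than place.

```
GET /<your_index_name>/_search
{
  "query": {
    "multi_match": {
      "query": "doctors in chicago",
      "fields": [ "type", "works_in.place" ]
    }
  }
}
```

multi_match combines terms with OR by default, so documents that match on only one of the two fields are returned as well. The operator parameter can be set to "and" if every term must match, though a filler word such as in would then have to match too.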

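Note that I have added the synonyms inline in the index settings. Instead of doing that, I would suggest keeping the synonyms in a text file, as shown below: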
{
   "type":"synonym",
   "synonyms_path" : "analysis/synonym.txt"
}

According to the link I've shared, the above configures a synonym filter with a path of analysis/synonym.txt (relative to the config location).

Hope this helps!
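
For illustration, such a file could hold the same rule as the inline list in the mapping, one rule per line in the Solr synonym format (the default format for this filter). The contents below are only an example mirroring that mapping; they are not taken from the linked documentation.

```
# analysis/synonym.txt
# Each line lists comma-separated terms that are treated as equivalent.
docs, physicians, health professionals, doctor
```

Keeping the rules in a file makes it easier to maintain a longer synonym list without rewriting the index settings by hand.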

Source: a similar question on Stack Overflow, "elasticsearch - Elastic/Kibana: support plurals in query searches": https://stackoverflow.com/questions/55071467/