ElasticSearch analyzer on a text field

Problem description

Here is my field in Elasticsearch:

"keywordName": {
  "type": "text",
  "analyzer": "custom_stop"
}

Here is my analyzer:

"custom_stop": {
      "type":      "custom",
      "tokenizer": "standard",
      "filter": [
        "my_stop",
        "my_snow",
        "asciifolding"
      ]
    }

Here are my filters:

"my_stop": {
  "type": "stop",
  "stopwords": "_french_"
},
"my_snow": {
  "type": "snowball",
  "language": "French"
}
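For context, the three fragments above could fit together in a single index-creation request. This is only a sketch of a body for `PUT ads`, assuming an Elasticsearch 5.x/6.x cluster (the versions that still use mapping types) and assuming the mapping type is named `keyword`, as the `_type` value in the search response further down suggests. On Elasticsearch 7 and later, mapping types are removed and `properties` would sit directly under `mappings`.

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "my_stop": {
          "type": "stop",
          "stopwords": "_french_"
        },
        "my_snow": {
          "type": "snowball",
          "language": "French"
        }
      },
      "analyzer": {
        "custom_stop": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["my_stop", "my_snow", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "keyword": {
      "properties": {
        "keywordName": {
          "type": "text",
          "analyzer": "custom_stop"
        }
      }
    }
  }
}
```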

Here are the documents in my index (each has a single field, keywordName):

"canne a peche", "canne", "canne a peche telescopique", "iphone 8", "iphone 8 case", "iphone 8 cover", "iphone 8 charger", "iphone 8 new"

When I search for "canne", it gives me the "canne" document, which is what I want:

GET ads/_search
{
   "query": {
    "match": {
      "keywordName": {
        "query": "canne",
        "operator":  "and"
      }
    }
  },
  "size": 1
}

When I search for "canne à pêche", it gives me "canne a peche", which is OK, too. Same for "Cannes à Pêche" -> "canne a peche" -> OK.
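One way to check how a given input is tokenized is the `_analyze` API. The body below is a sketch of a request to `GET ads/_analyze`, assuming the index is named `ads` as in the search above and that `custom_stop` is defined in that index's settings. The response lists the tokens produced after the stop, snowball, and asciifolding filters have run, which can be compared against the tokens produced for the stored value "canne a peche".

```json
{
  "analyzer": "custom_stop",
  "text": "Cannes à Pêche"
}
```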

Here is the tricky part: when I search for "iphone 8", it gives me "iphone 8 cover" instead of "iphone 8". If I set the size to 5 (the query returns 5 results containing "iphone 8"), I see that "iphone 8" is only the 4th result in terms of score. The first is "iphone 8 cover", then "iphone 8 case", then "iphone 8 new", and only then "iphone 8".

Here is the query result:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 1.4009607,
    "hits": [
      {
        "_index": "ads",
        "_type": "keyword",
        "_id": "iphone 8 cover",
        "_score": 1.4009607,
        "_source": {
          "keywordName": "iphone 8 cover"
        }
      },
      {
        "_index": "ads",
        "_type": "keyword",
        "_id": "iphone 8 case",
        "_score": 1.4009607,
        "_source": {
          "keywordName": "iphone 8 case"
        }
      },
      {
        "_index": "ads",
        "_type": "keyword",
        "_id": "iphone 8 new",
        "_score": 0.70293105,
        "_source": {
          "keywordName": "iphone 8 new"
        }
      },
      {
        "_index": "ads",
        "_type": "keyword",
        "_id": "iphone 8",
        "_score": 0.5804671,
        "_source": {
          "keywordName": "iphone 8"
        }
      },
      {
        "_index": "ads",
        "_type": "keyword",
        "_id": "iphone 8 charge",
        "_score": 0.46705723,
        "_source": {
          "keywordName": "iphone 8 charge"
        }
      }
    ]
  }
}
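To see why each hit receives its score, the same search can be sent with `"explain": true`, which adds an `_explanation` section to every hit with the term-frequency, document-frequency, and field-length components of the score. The body below is a sketch for `GET ads/_search`; it assumes the "iphone 8" search used the same `match` query shape as the "canne" search shown earlier, since the original request is not included.

```json
{
  "explain": true,
  "query": {
    "match": {
      "keywordName": {
        "query": "iphone 8",
        "operator": "and"
      }
    }
  },
  "size": 5
}
```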

How can I keep the flexibility for a keyword like "canne a peche" (accents, capital letters, plural forms) but also tell Elasticsearch that, if there is an exact match ("iphone 8" = "iphone 8"), it should return that exact keywordName?

Recommended answer