Searching for part of a word in Elasticsearch returns nothing; it only works for complete words


Problem description

I tried two different approaches to creating the index, and neither returns anything when I search for part of a word. Basically, if I search for the first letters of a word, or for letters in the middle of it, I want to get all matching documents.

First attempt: creating the index this way (based on another, somewhat older Stack Overflow question):

POST correntistas/correntista
{
  "index": {
    "index": "correntistas",
    "type": "correntista",
    "analysis": {
      "index_analyzer": {
        "my_index_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "mynGram"
          ]
        }
      },
      "search_analyzer": {
        "my_search_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "standard",
            "lowercase",
            "mynGram"
          ]
        }
      },
      "filter": {
        "mynGram": {
          "type": "nGram",
          "min_gram": 2,
          "max_gram": 50
        }
      }
    }
  }
}

Second attempt: creating the index this way (based on another, more recent Stack Overflow question):

PUT /correntistas
{
    "settings": {
        "analysis": {
            "filter": {
                "autocomplete_filter": {
                    "type": "edge_ngram",
                    "min_gram": 1,
                    "max_gram": 20
                }
            },
            "analyzer": {
                "autocomplete_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase"
                    ]
                },
                "autocomplete_index": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "autocomplete_filter"
                    ]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "nome": {
                "type": "text",
                "analyzer": "autocomplete_index",
                "search_analyzer": "autocomplete_search"
            }
        }
    }
}

The second attempt failed with:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Root mapping definition has unsupported parameters:  [nome : {search_analyzer=autocomplete_search, analyzer=autocomplete_index, type=text}]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [properties]: Root mapping definition has unsupported parameters:  [nome : {search_analyzer=autocomplete_search, analyzer=autocomplete_index, type=text}]",
    "caused_by": {
      "type": "mapper_parsing_exception",
      "reason": "Root mapping definition has unsupported parameters:  [nome : {search_analyzer=autocomplete_search, analyzer=autocomplete_index, type=text}]"
    }
  },
  "status": 400
}

The first approach created the index without any exception, but it does not work when I type part of the value of the "nome" property.

I added one document this way:

POST /correntistas/correntista/1
    {
        "conta": "1234",
        "sobrenome": "Carvalho1",
        "nome": "Demetrio1"
    }

Now I want to retrieve the above document either by typing its first letters (e.g. De) or by typing part of the word from the middle (e.g. met). But neither of the two searches below retrieves the document.

Simple query:

GET correntistas/correntista/_search
{
    "query": {
        "match": {
            "nome": {
                "query": "De" #### "met" should also work, from my perspective
            }
        }
    }
}

A more elaborate query also fails:

GET correntistas/correntista/_search
{
    "query": {
        "match": {
            "nome": {
                "query": "De",  #### "met" should also work, from my perspective
                "operator": "OR",
                "prefix_length": 0,
                "max_expansions": 50,
                "fuzzy_transpositions": true,
                "lenient": false,
                "zero_terms_query": "NONE",
                "auto_generate_synonyms_phrase_query": true,
                "boost": 1
            }
        }
    }
}

I don't think it is relevant, but here are the versions (I am using this version because it is intended to run in production with spring-data, and there is some "delay" before Spring Data adds support for newer Elasticsearch versions):

elasticsearch and kibana 6.8.4

PS: please don't suggest using regular expressions or wildcards (*).

*** Edited

All steps below were done in the Console (Kibana/Dev Tools).

Step 1:

POST /correntistas/correntista
{
  "settings": {
    "index.max_ngram_diff" :10,
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 8
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard"
      }
    }
  }
}

Result in the right panel:

#! Deprecation: the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template
{
  "_index" : "correntistas",
  "_type" : "correntista",
  "_id" : "alrO-3EBU5lMnLQrXlwB",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

Step 2:

POST /correntistas/correntista/1
{
    "title" : "Demetrio1"
}

Result in the right panel:

{
  "_index" : "correntistas",
  "_type" : "correntista",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

Step 3:

GET correntistas/_search
{
    "query" :{
        "match" :{
            "title" :"met"
        }
    }
}

Result in the right panel:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

In case it is relevant, adding the document type to the GET URL:

GET correntistas/correntista/_search
{
    "query" :{
        "match" :{
            "title" :"met"
        }
    }
}

also returns nothing:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

Searching for the whole title text:

GET correntistas/_search
{
    "query" :{
        "match" :{
            "title" :"Demetrio1"
        }
    }
}

returns the document:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "correntistas",
        "_type" : "correntista",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "title" : "Demetrio1"
        }
      }
    ]
  }
}

Interestingly, looking at the index settings, I don't see the analyzer:

GET /correntistas/_settings

Result in the right panel:

{
  "correntistas" : {
    "settings" : {
      "index" : {
        "creation_date" : "1589067537651",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "jm8Kof16TAW7843YkaqWYQ",
        "version" : {
          "created" : "6080499"
        },
        "provided_name" : "correntistas"
      }
    }
  }
}

How I run Elasticsearch and Kibana:

docker network create eknetwork

docker run -d --name elasticsearch --net eknetwork -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:6.8.4

docker run -d --name kibana --net eknetwork -p 5601:5601 kibana:6.8.4

Recommended answer

In another SO answer of mine, the requirement was a kind of prefix search: for the text Demetrio1, only searches such as de or demet needed to match, and that worked because I created an edge-ngram tokenizer for it. In this question the requirement is infix search, for which we will use ngram (configured below as a token filter) in our custom analyzer.
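The difference between the two gram types can be illustrated outside Elasticsearch. The following is only a rough sketch in plain Python, not the Lucene implementation: `edge_ngrams` and `ngrams` are hypothetical helper names, and the gram sizes are taken from the question's second attempt (1 to 20) and from the index definition used in this answer (2 to 8).

```python
def edge_ngrams(token, min_gram, max_gram):
    """Prefixes of `token` whose length is between min_gram and max_gram."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def ngrams(token, min_gram, max_gram):
    """Every substring of `token` whose length is between min_gram and max_gram."""
    grams = []
    for start in range(len(token)):
        for n in range(min_gram, max_gram + 1):
            if start + n <= len(token):
                grams.append(token[start:start + n])
    return grams


token = "Demetrio1".lower()          # the lowercase filter runs before the gram filter
prefixes = edge_ngrams(token, 1, 20)  # sizes from the question's second attempt
infixes = ngrams(token, 2, 8)         # sizes from this answer's index definition

print("de" in prefixes, "met" in prefixes)
print("de" in infixes, "met" in infixes)
```

Under these assumptions, "met" never appears among the edge grams because it is not a prefix of the word, while the plain n-grams contain both "de" and "met".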

Below is a step-by-step example.

Index definition

{
  "settings": {
    "index.max_ngram_diff" :10,
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "ngram",  --> note this
          "min_gram": 2,
          "max_gram": 8
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard"
      }
    }
  }
}
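The answer does not say which request carries this definition. The sketch below assumes it is sent as an index-creation request (`PUT /correntistas`, as in the question's second attempt) rather than as `POST /correntistas/correntista`. In step 1 of the question that POST returned "result": "created" with a generated `_id`, which suggests the body was stored as a document, and the later `_settings` output shows no analysis section. It further assumes that on 6.8.4 the `properties` block has to be nested under the mapping type name, which is what the "Root mapping definition has unsupported parameters" error points to. `create_index_request` is a hypothetical helper that only builds the request; it does not talk to a cluster.

```python
import json

INDEX = "correntistas"    # index name from the question
DOC_TYPE = "correntista"  # mapping type from the question (assumed to be required on 6.8.4)

settings = {
    "index.max_ngram_diff": 10,
    "analysis": {
        "filter": {
            "autocomplete_filter": {"type": "ngram", "min_gram": 2, "max_gram": 8}
        },
        "analyzer": {
            "autocomplete": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "autocomplete_filter"],
            }
        },
    },
}

properties = {
    "title": {
        "type": "text",
        "analyzer": "autocomplete",     # n-grams at index time
        "search_analyzer": "standard",  # plain tokens at search time
    }
}


def create_index_request(index, doc_type, settings, properties):
    """Return (method, path, body) for creating an index with a typed mapping."""
    body = {
        "settings": settings,
        "mappings": {doc_type: {"properties": properties}},
    }
    return "PUT", "/" + index, body


method, path, body = create_index_request(INDEX, DOC_TYPE, settings, properties)
print(method, path)
print(json.dumps(body, indent=2))
```

If an index of that name was already auto-created by the earlier POST, it would presumably need to be deleted before this request can succeed.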

Index a sample document

{
    "title" : "Demetrio1"
}

Search query

{
    "query" :{
        "match" :{
            "title" :"met"
        }
    }
}

The search result contains the sample document :)

 "hits": [
            {
                "_index": "ngram",
                "_type": "_doc",
                "_id": "1",
                "_score": 0.47766083,
                "_source": {
                    "title": "Demetrio1"
                }
            }
        ]
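Why the query met finds the document can be modelled roughly as follows. This is a simplified sketch, not Elasticsearch's actual analysis chain: the regular expression only approximates the standard tokenizer, scoring is ignored, and all function names are hypothetical. It assumes the index-time analyzer produces lowercase n-grams of 2 to 8 characters and the search-time analyzer produces plain lowercase tokens, as in the index definition above.

```python
import re


def standard_tokens(text):
    """Rough stand-in for the standard tokenizer plus lowercasing."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def ngram_filter(tokens, min_gram=2, max_gram=8):
    """Set of substrings of each token with length min_gram..max_gram."""
    grams = set()
    for token in tokens:
        for start in range(len(token)):
            last_end = min(start + max_gram, len(token))
            for end in range(start + min_gram, last_end + 1):
                grams.add(token[start:end])
    return grams


def index_terms(text):
    """Terms stored for a field analyzed with the 'autocomplete' analyzer."""
    return ngram_filter(standard_tokens(text))


def search_terms(text):
    """Terms produced for the query text (no n-grams at search time)."""
    return standard_tokens(text)


def matches(query, document_text):
    """True if any query term equals one of the indexed terms."""
    indexed = index_terms(document_text)
    return any(term in indexed for term in search_terms(query))


print(matches("met", "Demetrio1"))
print(matches("De", "Demetrio1"))
print(matches("xyz", "Demetrio1"))
```

In this model the first two queries match and the third does not, because the indexed terms include every 2 to 8 character substring of the lowercased word while the query is left as a single term.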


08-06 16:07