Problem description
I tried two different approaches to creating the index, and both return nothing if I search for part of a word. Basically, if I search for the first letters or for letters in the middle of a word, I want to get all the documents.
FIRST ATTEMPT, CREATING THE INDEX THIS WAY (from another, somewhat old Stack Overflow question):
POST correntistas/correntista
{
  "index": {
    "index": "correntistas",
    "type": "correntista",
    "analysis": {
      "index_analyzer": {
        "my_index_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "mynGram"
          ]
        }
      },
      "search_analyzer": {
        "my_search_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "standard",
            "lowercase",
            "mynGram"
          ]
        }
      },
      "filter": {
        "mynGram": {
          "type": "nGram",
          "min_gram": 2,
          "max_gram": 50
        }
      }
    }
  }
}
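A side note (my observation, not from the original post): the `index_analyzer`/`search_analyzer` sections above are pre-2.0 syntax, and in Elasticsearch 6.x analysis configuration must be sent inside a `settings` block of a `PUT` index-creation request; a `POST` to `correntistas/correntista` simply indexes this JSON as a document. A 6.x-compatible sketch of the same analyzers might look like:

```json
PUT /correntistas
{
  "settings": {
    "index.max_ngram_diff": 48,
    "analysis": {
      "filter": {
        "mynGram": {
          "type": "nGram",
          "min_gram": 2,
          "max_gram": 50
        }
      },
      "analyzer": {
        "my_index_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "mynGram"]
        },
        "my_search_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "mynGram"]
        }
      }
    }
  }
}
```

The `index.max_ngram_diff` value is an assumption on my part: with `min_gram: 2` and `max_gram: 50` the gram-size spread is 48, which 6.x warns about and 7.x rejects unless this setting allows it.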
SECOND ATTEMPT, CREATING THE INDEX THIS WAY (from a more recent Stack Overflow question):
PUT /correntistas
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      },
      "analyzer": {
        "autocomplete_search": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase"
          ]
        },
        "autocomplete_index": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "nome": {
        "type": "text",
        "analyzer": "autocomplete_index",
        "search_analyzer": "autocomplete_search"
      }
    }
  }
}
The second attempt failed with:
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Root mapping definition has unsupported parameters: [nome : {search_analyzer=autocomplete_search, analyzer=autocomplete_index, type=text}]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [properties]: Root mapping definition has unsupported parameters: [nome : {search_analyzer=autocomplete_search, analyzer=autocomplete_index, type=text}]",
    "caused_by": {
      "type": "mapper_parsing_exception",
      "reason": "Root mapping definition has unsupported parameters: [nome : {search_analyzer=autocomplete_search, analyzer=autocomplete_index, type=text}]"
    }
  },
  "status": 400
}
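A likely cause of this `mapper_parsing_exception` (my reading, given the 6.8.4 version mentioned below): in Elasticsearch 6.x the `mappings` block must still be keyed by a document type, so `properties` cannot sit directly under `mappings`; typeless mappings only arrived in 7.x. A 6.x-compatible sketch of the second attempt, keeping the `correntista` type used elsewhere in the question:

```json
PUT /correntistas
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      },
      "analyzer": {
        "autocomplete_search": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        },
        "autocomplete_index": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "autocomplete_filter"]
        }
      }
    }
  },
  "mappings": {
    "correntista": {
      "properties": {
        "nome": {
          "type": "text",
          "analyzer": "autocomplete_index",
          "search_analyzer": "autocomplete_search"
        }
      }
    }
  }
}
```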
Although the first approach created the index without any exception, it doesn't work when I type part of the property "nome".
I added one document this way:
POST /correntistas/correntista/1
{
  "conta": "1234",
  "sobrenome": "Carvalho1",
  "nome": "Demetrio1"
}
Now I want to retrieve the above document either by typing the first letters (e.g. De) or part of the word from the middle (e.g. met). But neither of the two searches below retrieves the document.
Simple query:
GET correntistas/correntista/_search
{
  "query": {
    "match": {
      "nome": {
        "query": "De" #### "met" should also work, from my perspective
      }
    }
  }
}
A more elaborate query also fails:
GET correntistas/correntista/_search
{
  "query": {
    "match": {
      "nome": {
        "query": "De", #### "met" should also work, from my perspective
        "operator": "OR",
        "prefix_length": 0,
        "max_expansions": 50,
        "fuzzy_transpositions": true,
        "lenient": false,
        "zero_terms_query": "NONE",
        "auto_generate_synonyms_phrase_query": true,
        "boost": 1
      }
    }
  }
}
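When a match query like this returns nothing, the `_analyze` API is a quick way to check which tokens the field actually produces (a diagnostic sketch; the field name is taken from the attempts above):

```json
GET correntistas/_analyze
{
  "field": "nome",
  "text": "Demetrio1"
}
```

If the response contains only whole-word tokens and no partial grams such as `de` or `met`, the ngram analyzer was never applied to the field.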
I don't think it's relevant, but here are the versions (I use this version because it must work in production with spring-data, and there is some "delay" in newer Elasticsearch versions being supported by Spring Data):
elasticsearch and kibana 6.8.4
PS: please don't suggest using regular expressions or wildcards (*).
*** EDITED
All steps below were done in the Console (Kibana / Dev Tools).
Step 1:
POST /correntistas/correntista
{
  "settings": {
    "index.max_ngram_diff": 10,
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 8
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard"
      }
    }
  }
}
Result in the right panel:
#! Deprecation: the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template
{
  "_index" : "correntistas",
  "_type" : "correntista",
  "_id" : "alrO-3EBU5lMnLQrXlwB",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
Step 2:
POST /correntistas/correntista/1
{
  "title" : "Demetrio1"
}
Result in the right panel:
{
  "_index" : "correntistas",
  "_type" : "correntista",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
Step 3:
GET correntistas/_search
{
  "query": {
    "match": {
      "title": "met"
    }
  }
}
Result in the right panel:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}
In case it's relevant:
Adding the document type to the GET URL
GET correntistas/correntista/_search
{
  "query": {
    "match": {
      "title": "met"
    }
  }
}
also brings back nothing:
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}
Searching for the whole title text
GET correntistas/_search
{
  "query": {
    "match": {
      "title": "Demetrio1"
    }
  }
}
brings back the document:
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "correntistas",
        "_type" : "correntista",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "title" : "Demetrio1"
        }
      }
    ]
  }
}
Looking at the index settings, it is interesting that the analyzer is not there:
GET /correntistas/_settings
Result in the right panel:
{
  "correntistas" : {
    "settings" : {
      "index" : {
        "creation_date" : "1589067537651",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "jm8Kof16TAW7843YkaqWYQ",
        "version" : {
          "created" : "6080499"
        },
        "provided_name" : "correntistas"
      }
    }
  }
}
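This output is the clue (my observation): the settings contain no `analysis` block, so the index was auto-created. The `POST /correntistas/correntista` in Step 1 indexed the settings body as an ordinary document (note the auto-generated `_id` in its response) rather than creating the index. Index creation needs a `PUT` on the index name, after deleting the auto-created index with `DELETE /correntistas`, and on 6.x the mapping must also be keyed by the document type. A sketch:

```json
PUT /correntistas
{
  "settings": {
    "index.max_ngram_diff": 10,
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 8
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "autocomplete_filter"]
        }
      }
    }
  },
  "mappings": {
    "correntista": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "autocomplete",
          "search_analyzer": "standard"
        }
      }
    }
  }
}
```

After this, `GET /correntistas/_settings` should show the `analysis` block, and re-indexing the document from Step 2 should make the `met` search match.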
How I run Elasticsearch and Kibana:
docker network create eknetwork
docker run -d --name elasticsearch --net eknetwork -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:6.8.4
docker run -d --name kibana --net eknetwork -p 5601:5601 kibana:6.8.4
Recommended answer
In my answer to another SO question, the requirement was a prefix-style search, i.e. for the text Demetrio1 only searches such as de or demet were required. That worked because I created an edge-ngram token filter to address it. In this question, however, the requirement is infix search, for which we will use the ngram token filter in our custom analyzer.
Below is a step-by-step example.
Index definition
{
  "settings": {
    "index.max_ngram_diff": 10,
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "ngram", --> note this
          "min_gram": 2,
          "max_gram": 8
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard"
      }
    }
  }
}
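To see why met now matches, you can inspect the tokens the custom analyzer emits (a quick check I would run; it assumes the definition above was applied to an index):

```json
POST /correntistas/_analyze
{
  "analyzer": "autocomplete",
  "text": "Demetrio1"
}
```

With `min_gram: 2` and `max_gram: 8`, the filter emits every 2- to 8-character substring of the lowercased token, including `de`, `dem`, and `met`, so a query analyzed with the `standard` search analyzer can match infixes at search time.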
Index a sample document
{
  "title": "Demetrio1"
}
Search query
{
  "query": {
    "match": {
      "title": "met"
    }
  }
}
Search result brings the sample document :)
"hits": [
  {
    "_index": "ngram",
    "_type": "_doc",
    "_id": "1",
    "_score": 0.47766083,
    "_source": {
      "title": "Demetrio1"
    }
  }
]