Problem description
I have an index with the following settings and mapping:
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "analyzer_keyword": {
            "tokenizer": "keyword",
            "filter": "lowercase"
          }
        }
      }
    }
  },
  "mappings": {
    "product": {
      "properties": {
        "name": {
          "analyzer": "analyzer_keyword",
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
I am struggling to implement a wildcard search on the name field. My example data looks like this:
[
{"name": "SVF-123"},
{"name": "SVF-234"}
]
When I execute the following query:
curl -XPOST 'http://localhost:9200/my_index/product/_search' -d '
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "*SVF-1*"
        }
      }
    }
  }
}'
It returns both SVF-123 and SVF-234. I think it still tokenizes the data. It should return only SVF-123.
Could you help? Thanks in advance.
Recommended answer
My solution adventure
I started with the case you can see in my question. Whenever I changed one part of my settings, one thing started to work but another stopped working. Here is the history of my solution:
1.) I indexed my data with the default settings, which means my data was analyzed by default. This caused a problem on my side. For example:
When a user searches for a keyword like SVF-1, the system runs this query:
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "analyze_wildcard": true,
          "query": "*SVF-1*"
        }
      }
    }
  }
}
and the results are:
SVF-123
SVF-234
This is expected, because the name field of my documents is analyzed. The query is split into the tokens SVF and 1, and SVF matches both documents even though 1 does not. I abandoned this approach and created a mapping that makes my fields not_analyzed:
{
  "mappings": {
    "product": {
      "properties": {
        "name": {
          "type": "string",
          "index": "not_analyzed"
        },
        "site": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
But my problem remained.
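To make the reasoning in step 1 concrete, here is a rough Python model of it. This is not Elasticsearch itself: the `analyze` function is a simplified stand-in for a default analyzer (split on non-alphanumeric characters, lowercase), and `matches_any_token` is a hypothetical helper that assumes the query tokens are combined with OR, as the explanation above implies.

```python
import re

def analyze(text):
    """Simplified stand-in for a default analyzer:
    split on non-alphanumeric characters and lowercase each token."""
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]

def matches_any_token(query, doc_value):
    """True if at least one query token equals a token of the document value."""
    doc_tokens = set(analyze(doc_value))
    return any(tok in doc_tokens for tok in analyze(query))

docs = ["SVF-123", "SVF-234"]
hits = [d for d in docs if matches_any_token("*SVF-1*", d)]
print(hits)  # both documents match, because they share the token "svf"
```

Under this model the dash acts as a token boundary, so the query can never distinguish the two names by the part after it.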
2.) After a lot of research, I wanted to try another way and decided to use a wildcard query. My query is:
{
  "query": {
    "wildcard": {
      "name": {
        "value": "*SVF-1*"
      }
    }
  },
  "filter": {
    "term": {"site": "pro_en_GB"}
  }
}
This query worked, but there is one problem. My fields are now not_analyzed, and I am running a wildcard query, so the search is case sensitive. If I search for svf-1, it returns nothing, and users may well type the query in lowercase.
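The case sensitivity described above can be imitated in plain Python. The sketch below uses `fnmatch.fnmatchcase` from the standard library as a rough analogue of a wildcard query against a whole, unanalyzed term. It is only an approximation: `fnmatchcase` also understands `[...]` character classes, which the Elasticsearch wildcard syntax does not. The helper name `wildcard_match` is made up for this example.

```python
from fnmatch import fnmatchcase

def wildcard_match(pattern, stored_term):
    """Case-sensitive glob match of a pattern against one whole stored term."""
    return fnmatchcase(stored_term, pattern)

names = ["SVF-123", "SVF-234"]

upper_hits = [n for n in names if wildcard_match("*SVF-1*", n)]
lower_hits = [n for n in names if wildcard_match("*svf-1*", n)]

print(upper_hits)  # only SVF-123 matches
print(lower_hits)  # nothing matches, because the stored terms are uppercase
```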
3.) I changed my document structure to:
{
  "mappings": {
    "product": {
      "properties": {
        "name": {
          "type": "string",
          "index": "not_analyzed"
        },
        "nameLowerCase": {
          "type": "string",
          "index": "not_analyzed"
        },
        "site": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
I added one more field alongside name, called nameLowerCase. When I index a document, I set it up like this:
{
  "name": "SVF-123",
  "nameLowerCase": "svf-123",
  "site": "pro_en_GB"
}
Here, I convert the query keyword to lowercase and run the search against the new nameLowerCase field, while displaying the name field.
The final version of my query is:
{
  "query": {
    "wildcard": {
      "nameLowerCase": {
        "value": "*svf-1*"
      }
    }
  },
  "filter": {
    "term": {"site": "pro_en_GB"}
  }
}
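The client side of step 3 might look roughly like the following Python sketch. It only builds the document and the request body as dictionaries and does not talk to Elasticsearch. The function names `to_indexed_doc` and `build_search_body` are hypothetical, chosen for this illustration.

```python
import json

def to_indexed_doc(name, site):
    """Build the document to index, adding a lowercased copy of the name."""
    return {"name": name, "nameLowerCase": name.lower(), "site": site}

def build_search_body(keyword, site):
    """Lowercase the user's keyword and wrap it in wildcards for nameLowerCase."""
    return {
        "query": {
            "wildcard": {
                "nameLowerCase": {"value": "*" + keyword.lower() + "*"}
            }
        },
        "filter": {"term": {"site": site}},
    }

doc = to_indexed_doc("SVF-123", "pro_en_GB")
body = build_search_body("SVF-1", "pro_en_GB")
print(json.dumps(doc))
print(json.dumps(body, indent=2))
```

Because both the stored value and the keyword are lowercased the same way, a user can type SVF-1 or svf-1 and get the same request.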
Now it works. Another way to solve this problem is to use multi_field. My query contains a dash (-), and I ran into some problems with it.
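For reference, a multi_field mapping for this case could look something like the sketch below, written as a Python dictionary so it can be serialized to JSON. This is an assumption about how the alternative might be set up, not something I ran: the sub-field name lowercase is made up, and the mapping assumes the analyzer_keyword analyzer (keyword tokenizer plus lowercase filter) from the question is defined in the index settings.

```python
import json

# Sketch of a multi_field mapping: the default sub-field keeps the exact value,
# while a second (hypothetical) sub-field stores a lowercased single token.
mapping = {
    "mappings": {
        "product": {
            "properties": {
                "name": {
                    "type": "multi_field",
                    "fields": {
                        "name": {"type": "string", "index": "not_analyzed"},
                        "lowercase": {"type": "string", "analyzer": "analyzer_keyword"},
                    },
                }
            }
        }
    }
}

# The wildcard pattern is still lowercased on the client before sending it.
query = {"query": {"wildcard": {"name.lowercase": {"value": "*svf-1*"}}}}

print(json.dumps(mapping, indent=2))
print(json.dumps(query, indent=2))
```

With this layout the document would only need the name field, and the lowercased copy would be produced at index time instead of by the client.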
Many thanks to @Alex Brasetvik for his detailed explanation and effort.