Question
What I'm trying to achieve is partial matching with customized searchable fields per index. I generate a match_phrase_prefix query with the value to search for and, if that value has more than one word, one more match_phrase_prefix per word. (I could use prefix, but it is either buggy or has undocumented settings.)
In this case, I'm trying to look up "belden cable"; the query looks like this:
{
"query":{
"bool":{
"should":
[
{
"indices":{
"indices":["addresss"],
"query":{
"bool":{
"should":
[
{"match_phrase_prefix":{"name":"BELDEN CABLE"}},
{"match_phrase_prefix":{"name":"BELDEN"}},
{"match_phrase_prefix":{"name":"CABLE"}}
]
}
},
"no_match_query":"none"
}
},
{
"indices":{
"indices":["customers"],
"query":{
"bool":{
"should":[
{"match_phrase_prefix":{"_all":"BELDEN CABLE"}},
{"match_phrase_prefix":{"_all":"CABLE"}},
{"match_phrase_prefix":{"_all":"BELDEN"}}
]
}
},
"no_match_query":"none"
}
}
]
}
}
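The clause generation described in the question (one match_phrase_prefix for the full value, plus one per word when there are several words) can be sketched in Python as follows. This is only an illustration: the helper names build_prefix_clauses and build_indices_query are hypothetical, and the code just builds the request body as a dict without talking to Elasticsearch.

```python
import json


def build_prefix_clauses(field, text):
    """Return match_phrase_prefix clauses: the full value first, then one per word."""
    words = text.split()
    clauses = [{"match_phrase_prefix": {field: text}}]
    # Only add per-word clauses when the value has more than one word.
    if len(words) > 1:
        clauses.extend({"match_phrase_prefix": {field: word}} for word in words)
    return clauses


def build_indices_query(index, field, text):
    """Wrap the clauses in an indices query restricted to a single index."""
    return {
        "indices": {
            "indices": [index],
            "query": {"bool": {"should": build_prefix_clauses(field, text)}},
            "no_match_query": "none",
        }
    }


# Same shape as the query in the question: one indices query per index.
query = {
    "query": {
        "bool": {
            "should": [
                build_indices_query("addresss", "name", "BELDEN CABLE"),
                build_indices_query("customers", "_all", "BELDEN CABLE"),
            ]
        }
    }
}

print(json.dumps(query, indent=2))
```

Building the body this way keeps the per-index field choice (name versus _all) in one place and avoids hand-written JSON mistakes such as a missing comma between clauses.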
My goal is to get the results that contain "belden cable" first, followed by the results that match only "belden" or "cable".
Instead, this returns (for example) 4 results that contain "belden cable", then a result that contains only "cable", then more results for "belden cable".
How can I boost the results that contain the full search value ("belden cable")?
I've tried splitting the indices query into one for the full phrase and one for the separate words, but that gives worse relevance results.
I've also tried setting a boost inside the match_phrase_prefix for "belden cable", but the results did not change.
Recommended answer
What you actually need is a different way of analyzing the input data. Treat what follows as a starting point for your final solution, because you still need to consider the full set of requirements for your queries and data analysis. Searching with ES is not only about queries, but also about how you structure and prepare the data.
The idea is that you want your data to be analyzed so that belden cable stays as is. With a mapping of "name": {"type": "string"}, the standard analyzer is used, which means that the list of terms in your index is belden and cable. What you actually need is [belden cable, belden, cable]. So I suggest the shingle token filter.
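To make the effect of shingles concrete, here is a rough Python approximation of what an analyzer chain of standard tokenizer, lowercase and shingle (with the filter's default two-word shingles plus unigrams) emits. It is a simplified stand-in, not Elasticsearch's implementation: the function name shingle_tokens is hypothetical and the regular expression only roughly imitates the standard tokenizer.

```python
import re


def shingle_tokens(text, max_shingle_size=2):
    """Approximate the terms produced by tokenizing, lowercasing and shingling."""
    # Crude stand-in for the standard tokenizer plus the lowercase filter.
    words = re.findall(r"\w+", text.lower())
    tokens = []
    for i, word in enumerate(words):
        tokens.append(word)  # the single word (unigram) is kept
        for size in range(2, max_shingle_size + 1):
            if i + size <= len(words):
                # A shingle is a run of adjacent words joined by a space.
                tokens.append(" ".join(words[i:i + size]))
    return tokens


print(shingle_tokens("Belden Cable"))
# ['belden', 'belden cable', 'cable']
```

Because belden cable is now a single indexed term as well as two separate ones, a document containing the whole phrase has more matching terms than one that contains only belden or only cable.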
DELETE /addresss
PUT /addresss
{
"settings": {
"analysis": {
"analyzer": {
"analyzer_shingle": {
"tokenizer": "standard",
"filter": [
"standard",
"lowercase",
"shingle"
]
}
}
}
},
"mappings": {
"test": {
"properties": {
"name": {
"type": "string",
"analyzer": "analyzer_shingle"
}
}
}
}
}
DELETE /customers
PUT /customers
{
"settings": {
"analysis": {
"analyzer": {
"analyzer_shingle": {
"tokenizer": "standard",
"filter": [
"standard",
"lowercase",
"shingle"
]
}
}
}
},
"mappings": {
"test": {
"_all": {
"analyzer": "analyzer_shingle"
}
}
}
}
POST /addresss/test/_bulk
{"index":{}}
{"name": "belden cable"}
{"index":{}}
{"name": "belden cable yyy"}
{"index":{}}
{"name": "belden cable xxx"}
{"index":{}}
{"name": "belden bla"}
{"index":{}}
{"name": "cable bla"}
POST /customers/test/_bulk
{"index":{}}
{"field1": "belden", "field2": "cable"}
{"index":{}}
{"field1": "belden cable yyy"}
{"index":{}}
{"field2": "belden cable xxx"}
{"index":{}}
{"field2": "belden bla"}
{"index":{}}
{"field2": "cable bla"}
GET /addresss,customers/test/_search
{
"query": {
"bool": {
"should": [
{
"indices": {
"indices": [
"addresss"
],
"query": {
"bool": {
"should": [
{
"match_phrase_prefix": {
"name": "BELDEN CABLE"
}
},
{
"match_phrase_prefix": {
"name": "BELDEN"
}
},
{
"match_phrase_prefix": {
"name": "CABLE"
}
}
]
}
},
"no_match_query": "none"
}
},
{
"indices": {
"indices": [
"customers"
],
"query": {
"bool": {
"should": [
{
"match_phrase_prefix": {
"_all": "BELDEN CABLE"
}
},
{
"match_phrase_prefix": {
"_all": "CABLE"
}
},
{
"match_phrase_prefix": {
"_all": "BELDEN"
}
}
]
}
},
"no_match_query": "none"
}
}
]
}
}
}