Problem description
I'm struggling with this, which I feel should work but maybe I'm doing something stupid. This search:
{
  "query":
  {
    "bool":
    {
      "must":[
        {"match":{"Element.sourceSystem.name":"Source1 Source2"}}
      ]
    }
  }
}
returns data for both Source1 and Source2. Adding a terms search, as below, I would expect to get a subset of the first search with just the Source1 documents. Instead, nothing is returned, whether the terms clause is run together with the first query or on its own.
{
  "query":
  {
    "bool":
    {
      "must":[
        {"match":{"Element.sourceSystem.name":"Source1 Source2"}},
        {"terms":{"Element.sourceSystem.name":["Source1"]}}
      ]
    }
  }
}
I realise this is hard without seeing the documents, but suffice it to say that "Element.sourceSystem.name" exists and is available, since the first search works fine. All input gratefully received.
There are some things that are handled differently in match queries than in terms queries.
First of all, a detour to analyzers:
I assume you are using the standard analyzer of Elasticsearch, which consists of a standard tokenizer and some token filters. The standard tokenizer will tokenize (split your text into terms) on spaces, punctuation marks and some other special characters. Details can be found in the Elasticsearch documentation, so for now let's just say 'each word will be a term'.
The second, very important part of the analyzer is the lowercase filter. It will transform terms into lowercase. This means that, later on, searching for Source1 and source1 should yield the same results.
So a short example:
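The original example did not survive here, so what follows is only a minimal Python sketch, not the analyzer itself. It assumes the input sentence was "This is my input text in English" (inferred from the token list quoted later in this answer) and approximates the standard analyzer with a regex split plus lowercasing. The helper name toy_standard_analyze is made up for illustration and is not an Elasticsearch API.

```python
import re


def toy_standard_analyze(text):
    """Rough stand-in for the standard analyzer: split into words, then lowercase.

    The real tokenizer follows Unicode text segmentation rules; this regex
    only approximates it for plain English text.
    """
    tokens = re.findall(r"\w+", text)
    return [token.lower() for token in tokens]


# Assumed input sentence (hypothetical, reconstructed from the tokens in the answer).
print(toy_standard_analyze("This is my input text in English"))
# ['this', 'is', 'my', 'input', 'text', 'in', 'english']

print(toy_standard_analyze("Source1 Source2"))
# ['source1', 'source2']
```

On a real cluster, the _analyze API with "analyzer": "standard" shows the tokens Elasticsearch actually produces for a given text.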
All of this happens when you index a document into a text field, for example. I assume Element.sourceSystem.name is of this type, since your normal match query seems to work.
Now, when you issue a match query with "Source1 Source2", the same analysis happens and transforms it into the tokens source1 and source2. Internally it then creates two term queries combined in a boolean OR. So either source1 or source2 must match for a document to be a result of your query.
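As a toy model of that behaviour (not how Elasticsearch is implemented), the match query can be sketched in Python as "analyze the query text, then OR together one exact lookup per token". The helpers toy_analyze and toy_match and the three sample documents are hypothetical.

```python
import re


def toy_analyze(text):
    # Simplified stand-in for the standard analyzer: split into words, lowercase.
    return [token.lower() for token in re.findall(r"\w+", text)]


def toy_match(indexed_terms, query_text):
    """Analyze the query text, then accept the document if ANY token is indexed."""
    query_terms = toy_analyze(query_text)
    return any(term in indexed_terms for term in query_terms)


# Hypothetical documents: the field value is analyzed at index time.
doc_a = set(toy_analyze("Source1"))  # {'source1'}
doc_b = set(toy_analyze("Source2"))  # {'source2'}
doc_c = set(toy_analyze("Source3"))  # {'source3'}

print(toy_match(doc_a, "Source1 Source2"))  # True
print(toy_match(doc_b, "Source1 Source2"))  # True
print(toy_match(doc_c, "Source1 Source2"))  # False
```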
Here is the catch with the terms query: it does not analyze the text you provide. It is usually meant to be used on fields of type keyword. Keyword fields are also not analyzed (for further information, please read the documentation of mapping types - it is actually pretty important). So what does this mean?
- If I take my example from above, my index would contain "this", "is", "my", "input", "text", "in", "english".
- A match query with English will match, because it will be analyzed to english.
- A term/s query with English will never match, because there is no term English in my index. It is case sensitive.
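The three points above can be sketched in Python under the same simplifying assumptions: a regex-based stand-in for the standard analyzer and an assumed input sentence of "This is my input text in English". The helpers toy_analyze, toy_match and toy_terms are hypothetical names, not Elasticsearch APIs; the only difference between the last two is whether the query input is analyzed first.

```python
import re


def toy_analyze(text):
    # Simplified stand-in for the standard analyzer: split into words, lowercase.
    return [token.lower() for token in re.findall(r"\w+", text)]


def toy_match(indexed_terms, query_text):
    # match: the query text is analyzed before the lookup.
    return any(term in indexed_terms for term in toy_analyze(query_text))


def toy_terms(indexed_terms, values):
    # terms: no analysis, values are compared to the indexed terms exactly as given.
    return any(value in indexed_terms for value in values)


# Assumed input sentence, analyzed at index time.
indexed = set(toy_analyze("This is my input text in English"))

print(toy_match(indexed, "English"))    # True: analyzed to "english"
print(toy_terms(indexed, ["English"]))  # False: no term "English" in the index
print(toy_terms(indexed, ["english"]))  # True: exact match on the indexed term
```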
I am fairly confident that if you used source1 (lowercase) in your terms query, it would match something. However, I highly doubt that a terms query on this field is the way to go for your use case. Try using normal match queries when querying text fields, and (in general - not always applicable) only use terms queries on keyword fields.
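For illustration, here are three request bodies that follow this advice, written as Python dicts and printed as JSON. This is a sketch under assumptions rather than a verified fix: the first variant relies on the field being analyzed by the standard analyzer, and the third assumes the mapping has a .keyword sub-field, which is the default for dynamically mapped strings in recent versions but may not exist in your index.

```python
import json

FIELD = "Element.sourceSystem.name"

# Variant 1: keep the terms query on the text field, but pass the lowercased
# term as it is stored in the index (assumes the standard analyzer).
terms_lowercase = {
    "query": {"bool": {"must": [{"terms": {FIELD: ["source1"]}}]}}
}

# Variant 2: a plain match query; "Source1" is analyzed to "source1" for you.
match_source1 = {
    "query": {"bool": {"must": [{"match": {FIELD: "Source1"}}]}}
}

# Variant 3: a terms query on an un-analyzed keyword sub-field.
# ASSUMPTION: the ".keyword" sub-field exists; check your own mapping first.
terms_on_keyword = {
    "query": {"bool": {"must": [{"terms": {FIELD + ".keyword": ["Source1"]}}]}}
}

for body in (terms_lowercase, match_source1, terms_on_keyword):
    print(json.dumps(body, indent=2))
```

Variants 1 and 2 query the analyzed text field, so matching is effectively case insensitive. Variant 3 matches the original, case-sensitive value, which is usually what an exact filter on a source name wants.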