Problem description
I have a few documents in my ElasticSearch v1.2.1 index, like:
{
"tempSkipAfterSave": "false",
"variation": null,
"images": null,
"name": "Dolce & Gabbana Short Sleeve Coat",
"sku": "MD01575254-40-WHITE",
"user_id": "123foo",
"creation_date": null,
"changed": 1
}
where sku can be a variation such as MD01575254-40-BlUE or MD01575254-38-WHITE.
I can get my Elasticsearch query to work with this:
{
"size": 1000,
"from": 0,
"filter": {
"and": [
{
"regexp": {
"sku": "md01575254.*"
}
},
{
"term": {
"user_id": "123foo"
}
},
{
"missing": {
"field": "project_id"
}
}
]
},
"query": {
"match_all": {}
}
}
I get back all the variations of sku: MD01575254*
However, the dash '-' is really tripping me up. When I change the regexp to:
"regexp": {
"sku": "md01575254-40.*"
}
I can't get any results back. I've also tried:
- "sku":"md01575254-40.*"
- "sku":"md01575254 \ -40.*"
- "sku":"md01575254-40-.*"
- ...
I just can't seem to make it work. What am I doing wrong here?
Recommended answer
Problem:
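This is because the default analyzer usually tokenizes at -, so your field is most likely stored as separate tokens:
- MD01575254
- 40
- BlUE
The standard analyzer also lowercases each token, which is why the lowercase pattern md01575254.* matches. The regexp filter is tested against each token individually, so a pattern that contains - can never match any single token.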
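The effect can be reproduced outside Elasticsearch with a rough simulation. The sketch below is an approximation only: `approx_standard_analyze` and `term_regexp_matches` are hypothetical helpers, the tokenizer is a simple split on non-alphanumeric characters rather than the real standard analyzer, and Python's `re` stands in for Lucene's regular expression syntax (the two agree for these simple patterns).

```python
import re


def approx_standard_analyze(text):
    """Rough stand-in for the standard analyzer (an assumption, not the
    real implementation): split on non-alphanumerics and lowercase."""
    return [tok.lower() for tok in re.split(r"[^0-9A-Za-z]+", text) if tok]


def term_regexp_matches(pattern, terms):
    """A regexp filter must match an entire indexed term, so each term
    is tested on its own with a full match."""
    return any(re.fullmatch(pattern, term) for term in terms)


sku = "MD01575254-40-WHITE"
analyzed_terms = approx_standard_analyze(sku)  # what an analyzed field holds
raw_terms = [sku]                              # what a not_analyzed field holds

print(analyzed_terms)
print(term_regexp_matches("md01575254.*", analyzed_terms))     # prefix-only pattern
print(term_regexp_matches("md01575254-40.*", analyzed_terms))  # pattern with a dash
print(term_regexp_matches("MD01575254-40.*", raw_terms))       # same idea on the raw value
```

The dash pattern fails on the analyzed terms because no single token contains a dash, while the untouched value matches as long as the pattern uses the original casing.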
Solution:
You can update your mapping to add a sku.raw field that is not analyzed when indexed. This will require you to delete your data and re-index it.
{
"<type>" : {
"properties" : {
...,
"sku" : {
"type": "string",
"fields" : {
"raw" : {"type" : "string", "index" : "not_analyzed"}
}
}
}
}
}
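Then you can query this new field, which is not analyzed. Because sku.raw stores the value exactly as it was indexed, the pattern is case-sensitive and must use the SKU's original casing:
{
"query" : {
"regexp" : {
"sku.raw": "MD01575254-40.*"
}
}
}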
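To keep the user_id and project_id conditions from the question, the same filter structure can point at sku.raw instead of sku. The following is a minimal sketch that only builds the request body; `build_sku_filter_query` is a hypothetical helper, and it assumes the SKU prefix contains no regular expression metacharacters and is written in its original casing.

```python
import json


def build_sku_filter_query(sku_prefix, user_id, size=1000):
    """Build the question's filtered search body, but run the regexp
    against the not_analyzed sku.raw field (hypothetical helper)."""
    return {
        "size": size,
        "from": 0,
        "filter": {
            "and": [
                # sku.raw holds the value verbatim, so the dash and the
                # original casing are both part of the indexed term.
                {"regexp": {"sku.raw": sku_prefix + ".*"}},
                {"term": {"user_id": user_id}},
                {"missing": {"field": "project_id"}},
            ]
        },
        "query": {"match_all": {}},
    }


body = build_sku_filter_query("MD01575254-40", "123foo")
print(json.dumps(body, indent=2))
```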
HTTP endpoints:
The API to delete your current mapping and data is:
DELETE http://localhost:9200/<index>/<type>
The API to add your new mapping, with the raw SKU, is:
PUT http://localhost:9200/<index>/<type>/_mapping
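As a sketch of how these two calls might be scripted, the snippet below prepares both requests with Python's standard library without sending them. The index name `products`, the type name `product`, and the helper `build_remap_requests` are placeholders chosen for illustration; the mapping body lists only the sku property, so any other properties from your real mapping would need to be added.

```python
import json
import urllib.request


def build_remap_requests(base_url, index, doc_type):
    """Prepare (but do not send) the DELETE and PUT requests that drop
    the old type and install a mapping with a not_analyzed sku.raw."""
    type_url = "{}/{}/{}".format(base_url, index, doc_type)
    mapping = {
        doc_type: {
            "properties": {
                # Other properties of the real mapping are omitted here.
                "sku": {
                    "type": "string",
                    "fields": {
                        "raw": {"type": "string", "index": "not_analyzed"}
                    },
                }
            }
        }
    }
    delete_req = urllib.request.Request(type_url, method="DELETE")
    put_req = urllib.request.Request(
        type_url + "/_mapping",
        data=json.dumps(mapping).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    return delete_req, put_req


# "products" and "product" are hypothetical names, not from the question.
delete_req, put_req = build_remap_requests(
    "http://localhost:9200", "products", "product"
)
print(delete_req.get_method(), delete_req.full_url)
print(put_req.get_method(), put_req.full_url)
# To execute against a running node: urllib.request.urlopen(delete_req), etc.
```

Building the requests separately from sending them makes it easy to inspect the URLs and body before anything is deleted.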
Links:
- multiple fields in mapping
- analyzers