This article covers how to handle an Elasticsearch terms aggregation that splits a field value into separate words.

Problem description

I ran an aggregation in the browser plugin (Marvel). Only one document matches the query, but the aggregation result is split on spaces, which doesn't make sense: I want one bucket per distinct document value. In this scenario there should be only one group, with count 1 and key "Drow Ranger". What is the correct way to do this in Elasticsearch?

Recommended answer

It's probably because your heroname field is analyzed, so "Drow Ranger" gets tokenized and indexed as the two terms "drow" and "ranger". A terms aggregation builds its buckets from the indexed terms, which is why you see two buckets instead of one.
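To make the effect concrete, here is a minimal Python sketch that imitates what happens, without talking to Elasticsearch at all. The functions `analyze_standard` and `terms_buckets` are hypothetical helpers written for this illustration. The real standard analyzer is more sophisticated than lowercasing and splitting on whitespace, so treat this only as an approximation of the behavior described above.

```python
from collections import Counter


def analyze_standard(text):
    """Rough stand-in for an analyzed string field (assumption):
    lowercase the text and split it on whitespace."""
    return text.lower().split()


def terms_buckets(docs, field, analyzed):
    """Count documents per indexed term, similar in spirit to the
    doc_count values returned by a terms aggregation."""
    counts = Counter()
    for doc in docs:
        value = doc[field]
        # An analyzed field is indexed as several terms,
        # a not_analyzed field as the single, unchanged value.
        terms = analyze_standard(value) if analyzed else [value]
        # A document contributes at most once to each distinct term.
        for term in set(terms):
            counts[term] += 1
    return dict(counts)


docs = [{"heroname": "Drow Ranger"}]
analyzed_buckets = terms_buckets(docs, "heroname", analyzed=True)
raw_buckets = terms_buckets(docs, "heroname", analyzed=False)
print(analyzed_buckets)
print(raw_buckets)
```

With a single document, the analyzed variant yields one bucket each for "drow" and "ranger", while the unanalyzed variant yields the single bucket "Drow Ranger" that the question expects.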

One way to get around this is to turn your heroname field into a multi-field with an analyzed part (the one you search on with the wildcard query) and a not_analyzed part (the one you aggregate on).

You should create your index like this, specifying the proper mapping for your heroname field:

curl -XPUT localhost:9200/dota2 -d '{
    "mappings": {
        "agust": {
            "properties": {
                "heroname": {
                    "type": "string",
                    "fields": {
                        "raw": {
                            "type": "string",
                            "index": "not_analyzed"
                        }
                    }
                },
                ... your other fields go here
            }
        }
    }
}'

Then you can run your aggregation on the heroname.raw field instead of the heroname field.

Update

If you just want to try this on the heroname field, you can modify that field alone instead of recreating the whole index. The following command simply adds the new heroname.raw sub-field to your existing heroname field. Note that you still have to reindex your data afterwards, because documents indexed before the change have no heroname.raw value.

curl -XPUT localhost:9200/dota2/_mapping/agust -d '{
    "properties": {
        "heroname": {
            "type": "string",
            "fields": {
                "raw": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}'

Then you can keep using heroname in your wildcard query, but your aggregation should target heroname.raw, like this:

{
    "aggs": {
        "asd": {
            "terms": {
                "field": "heroname.raw",
                "size": 0
            }
        }
    }
}
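For completeness, the sketch below assembles a full search body that combines the wildcard query on heroname with the aggregation on heroname.raw. The original wildcard query is not shown in the question, so the pattern `drow*` and the helper `build_search_body` are assumptions made for illustration. The code only builds and prints the JSON; it does not send it to a cluster.

```python
import json


def build_search_body(pattern, agg_name="asd"):
    """Build a search body that queries the analyzed field
    and aggregates on its not_analyzed sub-field."""
    return {
        # Search on the analyzed field; its indexed terms are lowercased.
        "query": {"wildcard": {"heroname": pattern}},
        "aggs": {
            agg_name: {
                # Aggregate on the raw sub-field so values stay whole.
                # "size": 0 is kept from the answer above.
                "terms": {"field": "heroname.raw", "size": 0}
            }
        },
    }


# "drow*" is a hypothetical pattern, not taken from the original question.
body = build_search_body("drow*")
print(json.dumps(body, indent=4))
```

The printed JSON could be passed as the request body of a search against the dota2 index, for example with curl as in the commands above.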
