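I want to compute the sum of the per-group maximums in my Elasticsearch data. For example,
the data is (rows marked * hold the maximum cost of their group):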
id | gId | cost
----|-----|------
1 | 1 | 20
2 | 1 | 15
3 | 2 | 30
4 | 1 | 30 *
5 | 2 | 40 *
6 | 1 | 20
7 | 2 | 30
8 | 3 | 45 *
9 | 1 | 10
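To make the target value concrete, here is a minimal sketch in plain Python, independent of Elasticsearch, that computes the sum of per-group maximums for the sample rows above. The function and variable names are illustrative assumptions, not part of any Elasticsearch API.

```python
# Sample rows from the table above: (id, gId, cost)
rows = [
    (1, 1, 20), (2, 1, 15), (3, 2, 30),
    (4, 1, 30), (5, 2, 40), (6, 1, 20),
    (7, 2, 30), (8, 3, 45), (9, 1, 10),
]

def sum_of_group_max(rows):
    """Return (per-group maximum cost, sum of those maximums)."""
    group_max = {}
    for _id, g_id, cost in rows:
        # Keep the largest cost seen so far for this group
        group_max[g_id] = max(cost, group_max.get(g_id, cost))
    return group_max, sum(group_max.values())

group_max, total = sum_of_group_max(rows)
print(group_max)  # {1: 30, 2: 40, 3: 45}
print(total)      # 115
```

The three maximums correspond to the rows marked * in the table, and their sum, 115, is the value the query is expected to return.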
I use sum_bucket to sum the maximum value of each group. This is my query:
{
  "aggs": {
    "T1": {
      "terms": {
        "field": "gId",
        "size": 3
      },
      "aggs": {
        "MAX_COST": {
          "max": {
            "field": "cost"
          }
        }
      }
    },
    "T2": {
      "sum_bucket": {
        "buckets_path": "T1>MAX_COST"
      }
    }
  },
  "size": 0
}
The query response is shown below. The T1.buckets array is the part I would like Elasticsearch to leave out of the response:
"T1": {
  "doc_count_error_upper_bound": 0,
  "sum_other_doc_count": 0,
  "buckets": [
    {
      "key": 1,
      "doc_count": 5,
      "MAX_COST": {
        "value": 30
      }
    },
    {
      "key": 2,
      "doc_count": 3,
      "MAX_COST": {
        "value": 40
      }
    },
    {
      "key": 3,
      "doc_count": 1,
      "MAX_COST": {
        "value": 45
      }
    }
  ]
},
"T2": {
  "value": 115
}
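T2.value is the desired result. However, I want T1.buckets left out of the query result for network performance reasons, because my data is very large. Setting T1.terms.size to a specific number does not help, because then only that many top buckets contribute to T2.value. How can I write the query so that the result omits T1.buckets, or is there a better query for summing the maximum of each group?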
Best answer
You can use filter_path to return only part of the response:
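var searchResponse = client.Search<Document>(s => s
    // paths to include in the response; aggregations sit under the
    // top-level "aggregations" key of the search response
    .FilterPath(new[] { "aggregations.T2.value" })
    .Aggregations(a => a
        // ... rest of aggs here
    )
);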
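Bear in mind that using filter_path with NEST can sometimes prevent the internal serializer from deserializing the response, because the structure is unexpected. In that case, you can use the low-level client exposed on the high-level client to handle the response:
var searchDescriptor = new SearchDescriptor<Document>()
    .Aggregations(a => a
        // ... rest of aggs here
    );

var searchResponse = client.LowLevel.Search<StringResponse>(
    "index",
    "type",
    PostData.Serializable(searchDescriptor),
    new SearchRequestParameters
    {
        QueryString = new Dictionary<string, object>
        {
            // aggregations sit under the top-level "aggregations" key of the search response
            ["filter_path"] = "aggregations.T2.value"
        }
    });

// do something with JSON string response
var json = searchResponse.Body;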
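What to do with the raw JSON string is the same in any language, so here is a small sketch in Python of pulling the sum out of a filtered response body. It assumes the request was filtered with filter_path=aggregations.T2.value, so that the body contains only that path. The sample body and the helper name extract_sum are hypothetical, made up for illustration.

```python
import json

# Assumed shape of the body when filter_path=aggregations.T2.value is applied
body = '{"aggregations": {"T2": {"value": 115.0}}}'

def extract_sum(body, agg_name="T2"):
    """Return the sum_bucket value, or None if the path is absent from the body."""
    doc = json.loads(body)
    return doc.get("aggregations", {}).get(agg_name, {}).get("value")

print(extract_sum(body))  # 115.0
print(extract_sum("{}"))  # None
```

Returning None for a missing path is a deliberate choice here: a filter_path that matches nothing yields an empty object rather than an error, so the caller should check for that case.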
On the topic of elasticsearch - the sum of per-group maximums in Elasticsearch, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55793748/