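My type has a field that is an array of times in ISO 8601 format. I want to get all listings that have a time on a given day, and then order them by the earliest time at which they occur on that specific day. The problem is that my query orders by the earliest time across all days.
You can reproduce the problem below.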
curl -XPUT 'localhost:9200/listings?pretty'
curl -XPOST 'localhost:9200/listings/listing/_bulk?pretty' -d '
{"index": { } }
{ "name": "second on 6th (3rd on the 5th)", "times": ["2018-12-05T12:00:00","2018-12-06T11:00:00"] }
{"index": { } }
{ "name": "third on 6th (1st on the 5th)", "times": ["2018-12-05T10:00:00","2018-12-06T12:00:00"] }
{"index": { } }
{ "name": "first on the 6th (2nd on the 5th)", "times": ["2018-12-05T11:00:00","2018-12-06T10:00:00"] }
'
# because ES takes time to add them to index
sleep 2
echo "Query listings on the 6th!"
curl -XPOST 'localhost:9200/listings/_search?pretty' -d '
{
"sort": {
"times": {
"order": "asc",
"nested_filter": {
"range": {
"times": {
"gte": "2018-12-06T00:00:00",
"lte": "2018-12-06T23:59:59"
}
}
}
}
},
"query": {
"bool": {
"filter": {
"range": {
"times": {
"gte": "2018-12-06T00:00:00",
"lte": "2018-12-06T23:59:59"
}
}
}
}
}
}'
curl -XDELETE 'localhost:9200/listings?pretty'
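Adding the script above to a .sh file and running it will reproduce the issue. You'll see that the order is based on the 5th rather than the 6th.
Elasticsearch converts the times to an epoch_millis number for sorting; you can see the epoch number in the sort field of each hit, e.g. 1544007600000. When doing an asc sort, it takes the smallest number in the array (the order within the array doesn't matter) and sorts on that. Somehow I need it to order by the earliest time that occurs on the queried day, i.e. the 6th.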
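To make the behaviour described above concrete, here is a small Python sketch that mimics it outside of Elasticsearch. It assumes the zone-less timestamps are read as UTC and that an ascending sort on a multi-valued field uses the smallest value as the sort key; `to_epoch_millis` and `plain_array_sort_key` are hypothetical helper names, not Elasticsearch APIs.

```python
from datetime import datetime, timezone

def to_epoch_millis(iso_time):
    """Parse a zone-less ISO 8601 timestamp as UTC (assumption) and return epoch millis."""
    parsed = datetime.strptime(iso_time, "%Y-%m-%dT%H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)

# The same three documents as in the bulk request of the script.
listings = [
    {"name": "second on 6th (3rd on the 5th)",
     "times": ["2018-12-05T12:00:00", "2018-12-06T11:00:00"]},
    {"name": "third on 6th (1st on the 5th)",
     "times": ["2018-12-05T10:00:00", "2018-12-06T12:00:00"]},
    {"name": "first on the 6th (2nd on the 5th)",
     "times": ["2018-12-05T11:00:00", "2018-12-06T10:00:00"]},
]

def plain_array_sort_key(listing):
    # Assumed model of an asc sort on an array field:
    # the smallest value in the whole array becomes the sort key.
    return min(to_epoch_millis(t) for t in listing["times"])

for listing in sorted(listings, key=plain_array_sort_key):
    print(plain_array_sort_key(listing), listing["name"])
```

Under these assumptions the listings come out in their order on the 5th, and the key for "first on the 6th (2nd on the 5th)" is 1544007600000, the value quoted from the sort field.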
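I'm currently on Elasticsearch 2.4, but even if someone can show me how it's done in the current version, that would be great.
In case it helps, here is their documentation on nested queries and scripting.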
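Best answer
I think the problem here is that nested sorting is meant for nested objects, not for arrays.
If you convert the document into one that uses an array of nested objects instead of a simple array of dates, you can construct a nested filtered sort that works.
The following is for Elasticsearch 6.0. They changed the syntax a little from 6.1 onward, and I'm not sure how much of this syntax is available in 2.x:
Mapping: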
PUT nested-listings
{
"mappings": {
"listing": {
"properties": {
"name": {
"type": "keyword"
},
"openTimes": {
"type": "nested",
"properties": {
"date": {
"type": "date"
}
}
}
}
}
}
}
Data:
POST nested-listings/listing/_bulk
{"index": { } }
{ "name": "second on 6th (3rd on the 5th)", "openTimes": [ { "date": "2018-12-05T12:00:00" }, { "date": "2018-12-06T11:00:00" }] }
{"index": { } }
{ "name": "third on 6th (1st on the 5th)", "openTimes": [ {"date": "2018-12-05T10:00:00"}, { "date": "2018-12-06T12:00:00" }] }
{"index": { } }
{ "name": "first on the 6th (2nd on the 5th)", "openTimes": [ {"date": "2018-12-05T11:00:00" }, { "date": "2018-12-06T10:00:00" }] }
So we have an "openTimes" nested object rather than "nextNexpectionOpenTimes", and each listing contains an array of openTimes.
Now the search:
POST nested-listings/_search
{
"sort": {
"openTimes.date": {
"order": "asc",
"nested_path": "openTimes",
"nested_filter": {
"range": {
"openTimes.date": {
"gte": "2018-12-06T00:00:00",
"lte": "2018-12-06T23:59:59"
}
}
}
}
},
"query": {
"nested": {
"path": "openTimes",
"query": {
"bool": {
"filter": {
"range": {
"openTimes.date": {
"gte": "2018-12-06T00:00:00",
"lte": "2018-12-06T23:59:59"
}
}
}
}
}
}
}
}
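The main difference here is that the query is slightly different, because you need to use a "nested" query to filter on nested objects.
This gives the following results: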
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": null,
"hits": [
{
"_index": "nested-listings",
"_type": "listing",
"_id": "vHH6e2cB28sphqox2Dcm",
"_score": null,
"_source": {
"name": "first on the 6th (2nd on the 5th)"
},
"sort": [
1544090400000
]
},
{
"_index": "nested-listings",
"_type": "listing",
"_id": "unH6e2cB28sphqox2Dcm",
"_score": null,
"_source": {
"name": "second on 6th (3rd on the 5th)"
},
"sort": [
1544094000000
]
},
{
"_index": "nested-listings",
"_type": "listing",
"_id": "u3H6e2cB28sphqox2Dcm",
"_score": null,
"_source": {
"name": "third on 6th (1st on the 5th)"
},
"sort": [
1544097600000
]
}
]
}
}
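As a rough illustration of why the nested filter produces this order, the sketch below mimics the sort in plain Python: each listing's openTimes are first restricted to the queried day, and only then is the smallest value taken. This is a simplified model, not how Elasticsearch is implemented; the helper names are hypothetical, timestamps are assumed to be UTC, and placing a listing with no matching time last is my assumption.

```python
from datetime import datetime, timezone

def to_epoch_millis(iso_time):
    """Parse a zone-less ISO 8601 timestamp as UTC (assumption) and return epoch millis."""
    parsed = datetime.strptime(iso_time, "%Y-%m-%dT%H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)

# Same bounds as the range filter in the search request.
DAY_START = to_epoch_millis("2018-12-06T00:00:00")
DAY_END = to_epoch_millis("2018-12-06T23:59:59")

listings = [
    {"name": "second on 6th (3rd on the 5th)",
     "openTimes": [{"date": "2018-12-05T12:00:00"}, {"date": "2018-12-06T11:00:00"}]},
    {"name": "third on 6th (1st on the 5th)",
     "openTimes": [{"date": "2018-12-05T10:00:00"}, {"date": "2018-12-06T12:00:00"}]},
    {"name": "first on the 6th (2nd on the 5th)",
     "openTimes": [{"date": "2018-12-05T11:00:00"}, {"date": "2018-12-06T10:00:00"}]},
]

def filtered_sort_key(listing):
    # Keep only the nested values that fall inside the queried day ...
    in_range = [
        millis
        for millis in (to_epoch_millis(o["date"]) for o in listing["openTimes"])
        if DAY_START <= millis <= DAY_END
    ]
    # ... then take the smallest one; listings with no match sort last (assumption).
    return min(in_range, default=float("inf"))

for listing in sorted(listings, key=filtered_sort_key):
    print(filtered_sort_key(listing), listing["name"])
```

With the three sample documents, the keys computed this way are 1544090400000, 1544094000000 and 1544097600000, the same values that appear in the sort arrays of the response.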
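I don't think you can actually pick a single value out of an array in ES, so for sorting you are always sorting on all of the values. With a plain array, the best you can do is choose how the array is treated for sorting purposes (use the lowest, the highest, the mean, etc.).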
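To show what those choices amount to for the sample data, here is a hedged Python sketch that computes the sort key a plain array would yield under a lowest, highest or mean treatment. The mode names and the `sort_key` helper are only illustrative, and timestamps are again assumed to be UTC.

```python
from datetime import datetime, timezone

def to_epoch_millis(iso_time):
    """Parse a zone-less ISO 8601 timestamp as UTC (assumption) and return epoch millis."""
    parsed = datetime.strptime(iso_time, "%Y-%m-%dT%H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)

def sort_key(times, mode):
    """Collapse an array of timestamps into one sort key (illustrative modes only)."""
    values = [to_epoch_millis(t) for t in times]
    if mode == "min":
        return min(values)
    if mode == "max":
        return max(values)
    if mode == "avg":
        return sum(values) / len(values)
    raise ValueError("unknown mode: " + mode)

listings = [
    {"name": "second on 6th (3rd on the 5th)",
     "times": ["2018-12-05T12:00:00", "2018-12-06T11:00:00"]},
    {"name": "third on 6th (1st on the 5th)",
     "times": ["2018-12-05T10:00:00", "2018-12-06T12:00:00"]},
    {"name": "first on the 6th (2nd on the 5th)",
     "times": ["2018-12-05T11:00:00", "2018-12-06T10:00:00"]},
]

for mode in ("min", "max", "avg"):
    ordered = sorted(listings, key=lambda l: sort_key(l["times"], mode))
    print(mode, [l["name"] for l in ordered])
```

None of these modes takes a date range into account. In this particular data set "max" happens to order by the 6th, but only because the 6th is the latest day in every array; that would stop working as soon as a listing also had a time on a later day.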