MongoDB - Get the latest non-null field values from documents with timestamps

Problem description

In the MongoDB collection I'm querying, each document represents some data for a parcel at a specific time. Every time I receive an update for a parcel, some fields may be updated (non-null values) while others are not (null values).

To illustrate, consider this example. We received three data sets for a parcel:

/* 1 */
{
    "parcelNum" : "CC123456789FR",
    "datetime" : ISODate("2018-09-05T10:48:38.584Z"),
    "field1" : "value1_1",
    "field2" : "value2_1"
}

/* 2 */
{
    "parcelNum" : "CC123456789FR",
    "datetime" : ISODate("2018-09-05T10:48:40.566Z"),
    "field1" : "value1_2",
    "field2" : null
}

/* 3 */
{
    "parcelNum" : "CC123456789FR",
    "datetime" : ISODate("2018-09-05T10:48:42.777Z"),
    "field1" : null,
    "field2" : "value2_2"
}

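How can I extract the latest non-null value of every field, taking into account the timestamp of the document each value belongs to?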

Using the previous example, this is what I am trying to get:

{
    "parcelNum" : "CC123456789FR",
    "field1" : "value1_2",
    "field2" : "value2_2"
}

I tried this kind of query, but I can't find a way to combine field values from multiple documents:

db.testDB.aggregate([
    {$sort: { datetime: -1 }},
    {$group: { _id: "$parcelNum", 
        field1: {$first: "$field1" },
        field2: {$first: "$field2" }
    }}
])

This gives me:

{
    "_id" : "CC123456789FR",
    "field1" : null,
    "field2" : "value2_2"
}

This is wrong because it only uses the values from the most recent document and does not combine values across all the documents.

I tried another approach suggested by Rishi in another topic. Instead of creating a new document for each revision, he suggested pushing revision sub-documents onto an array and maintaining the latest revision at the parent document.

Something like this:

{
    parcelNum: "CC123456789FR",
    lastUpdated: ISODate("2018-09-05T10:48:42.777Z"),
    field1: "value1_2",
    field2: "value2_2",
    revisions: [
        {
            datetime: ISODate("2018-09-05T10:48:38.584Z"),
            field1: "value1_1",
            field2: "value2_1"
        },
        {
            datetime: ISODate("2018-09-05T10:48:40.566Z"),
            field1: "value1_2",
            field2: null
        },
        {
            datetime: ISODate("2018-09-05T10:48:42.777Z"),
            field1: null,
            field2: "value2_2"
        }
    ]
}

However, maintaining the latest revision is not that easy, because updates are not received in chronological order. I can receive a "new" document that has an older "datetime" value, in which case I must not overwrite the parent fields unless they are still null. To do this properly, I would have to record the last update timestamp for every field.
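The per-field timestamp idea could look roughly like the sketch below. It is plain in-memory JavaScript, not MongoDB code: the `applyUpdate` function, the `{ value, datetime }` shape stored per field, and the sample variable names are all assumptions made for illustration.

```javascript
// Hypothetical helper: merges one update into the current state of a parcel.
// Each tracked field is stored as { value, datetime } so that an update
// arriving late (with an older datetime) cannot overwrite a newer value.
function applyUpdate(state, update, fields) {
  const next = { ...state };
  for (const field of fields) {
    const value = update[field];
    if (value == null) continue; // null or missing means "not updated"
    const current = next[field];
    // Write only if the field is still unset or this update is newer.
    if (!current || update.datetime > current.datetime) {
      next[field] = { value, datetime: update.datetime };
    }
  }
  return next;
}

// The three sample updates from the question, deliberately out of order.
const updates = [
  { datetime: new Date("2018-09-05T10:48:42.777Z"), field1: null, field2: "value2_2" },
  { datetime: new Date("2018-09-05T10:48:38.584Z"), field1: "value1_1", field2: "value2_1" },
  { datetime: new Date("2018-09-05T10:48:40.566Z"), field1: "value1_2", field2: null },
];

const finalState = updates.reduce(
  (state, update) => applyUpdate(state, update, ["field1", "field2"]),
  {}
);
console.log(finalState.field1.value, finalState.field2.value);
```

The cost of this design is one extra timestamp per field, which is exactly the bookkeeping described above.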

Recommended answer

You can try the following:

db.getCollection('test').aggregate([
    // Sort newest first so each pushed array starts with the most recent value
    {$sort: { datetime: -1 }},
    // Group by parcel and collect every value of each field into an array
    {$group: {
        "_id": "$parcelNum",
        "field1": { $push: "$field1" },
        "field2": { $push: "$field2" }
    }},
    // Drop the null entries from each array
    {$project: {
        "field1notNull": {
            $filter: {
                input: "$field1",
                as: "f",
                cond: { $ne: [ "$$f", null ] }
            }
        },
        "field2notNull": {
            $filter: {
                input: "$field2",
                as: "f",
                cond: { $ne: [ "$$f", null ] }
            }
        }
    }},
    // Keep the first remaining element, i.e. the latest non-null value
    {$project: {
        "_id": 0,
        "parcelNum": "$_id",
        "field1": {$arrayElemAt: ["$field1notNull", 0]},
        "field2": {$arrayElemAt: ["$field2notNull", 0]}
    }}
])
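
To sanity-check the logic outside the database, the same "newest non-null value per field" rule can be sketched in plain JavaScript. `latestNonNull` is a hypothetical helper, not a MongoDB API. It assumes `datetime` is a JavaScript `Date` and treats `null` or a missing field as "not updated".

```javascript
// Hypothetical in-memory equivalent of the sort / push / filter / first-element idea.
function latestNonNull(docs, fields) {
  // Newest first, like {$sort: {datetime: -1}}.
  const sorted = [...docs].sort((a, b) => b.datetime - a.datetime);
  const byParcel = new Map();
  for (const doc of sorted) {
    if (!byParcel.has(doc.parcelNum)) {
      byParcel.set(doc.parcelNum, { parcelNum: doc.parcelNum });
    }
    const acc = byParcel.get(doc.parcelNum);
    for (const field of fields) {
      // The first non-null value seen is the newest one, so keep it.
      if (acc[field] === undefined && doc[field] != null) {
        acc[field] = doc[field];
      }
    }
  }
  return [...byParcel.values()];
}

// The three sample documents from the question.
const parcelDocs = [
  { parcelNum: "CC123456789FR", datetime: new Date("2018-09-05T10:48:38.584Z"), field1: "value1_1", field2: "value2_1" },
  { parcelNum: "CC123456789FR", datetime: new Date("2018-09-05T10:48:40.566Z"), field1: "value1_2", field2: null },
  { parcelNum: "CC123456789FR", datetime: new Date("2018-09-05T10:48:42.777Z"), field1: null, field2: "value2_2" },
];

const merged = latestNonNull(parcelDocs, ["field1", "field2"]);
console.log(merged);
```

For the sample data this yields a single entry with `field1` set to "value1_2" and `field2` set to "value2_2". A field that is null in every document is simply absent from the result.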
