json - Is there another way to optimize this Elasticsearch query for multiple nested fields in JSON?

I am new to Elasticsearch. Below is the sample data I need to run the query against. I am trying to fetch those documents whose account_type is "Credit Card" and whose source_name is "SOMEVALUE".

{
"took" : 0,
"timed_out" : false,
"_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
},
"hits" : {
    "total" : {
    "value" : 1,
    "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
    {
        "_index" : "bureau_data",
        "_type" : "_doc",
        "_id" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
        "_score" : 1.0,
        "_source" : {
        "userid" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
        "raw_derived" : {
            "gender" : "MALE",
            "firstname" : "trsqlsz",
            "middlename" : "rgj",
            "lastname" : "ggksb",
            "mobilephone" : "2125954664",
            "dob" : "1988-06-28 00:00:00",
            "applications" : [
            {
                "applicationid" : "c7fb0147-22fd-4a5e-8851-98241de6aa50",
                "createdat" : "2019-06-07 19:28:54",
                "updatedat" : "2019-06-07 19:28:55",
                "source" : "4",
                "source_name" : "EXPERIAN",
                "applicationcreditreportid" : "b67f9180-9bb6-485c-9cfc-e7ccf9a70a69",
                "accounts" : [
                {
                    "applicationcreditreportaccountid" : "c5de28c4-cac9-4390-852a-96f143cb0b62",
                    "currentbalance" : 418288,
                    "institutionid" : "021d58b4-aba5-42c9-8d39-304a78d34aea",
                    "accounttypeid" : "5",
                    "institution_name" : "HDFC BANK",
                    "account_type_name" : "Personal Loan"
                }
                ]
            }
            ]
        }
        }
    }
    ]
}
}

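I have tried the following query and it works fine. I would like to know whether there is a more optimized way to query multiple nested fields.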

GET /my_index/_search
{
"query": {
    "bool": {
    "must": [
        {
        "nested": {
            "path": "raw_derived.applications.accounts",
            "query": {
            "bool": {
                "must": [
                {"match": {
                    "raw_derived.applications.accounts.account_type_name": "Credit Card"
                }}
                ]
            }
            }
        }
        },
        {
        "nested": {
            "path": "raw_derived.applications",
            "query": {
            "bool": {
                "must": [
                {"match": {
                    "raw_derived.applications.source_name": "CIBIL"
                }}
                ]
            }
            }
        }
        }
    ]
    }
}

}

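If I have to query many nested fields, the query becomes very long. Please suggest any other way to query nested fields or to combine multiple AND conditions.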

Best answer

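Well, your optimization should always start with the data model/mapping, because that, and not your query, is what mostly causes performance problems.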

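That said, you can avoid nested queries by flattening your data. A flat data model would result in one document per application and account element.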

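Since Elasticsearch is a non-relational data store, indexing "redundant" data is perfectly fine. This is not a lazy approach but a common way to handle these kinds of data structures.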

Sample document 1:

{
    "_index" : "bureau_data",
    "_type" : "_doc",
    "_id" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
    "_score" : 1.0,
    "_source" : {
      "userid" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
      "gender" : "MALE",
      "firstname" : "trsqlsz",
      "middlename" : "rgj",
      "lastname" : "ggksb",
      "mobilephone" : "2125954664",
      "dob" : "1988-06-28 00:00:00",
      "applicationid" : "c7fb0147-22fd-4a5e-8851-98241de6aa50",
      "createdat" : "2019-06-07 19:28:54",
      "updatedat" : "2019-06-07 19:28:55",
      "source" : "4",
      "source_name" : "EXPERIAN",
      "applicationcreditreportid" : "b67f9180-9bb6-485c-9cfc-e7ccf9a70a69",
      "applicationcreditreportaccountid" : "c5de28c4-cac9-4390-852a-96f143cb0b62",
      "currentbalance" : 418288,
      "institutionid" : "021d58b4-aba5-42c9-8d39-304a78d34aea",
      "accounttypeid" : "5",
      "institution_name" : "HDFC BANK",
      "account_type_name" : "Personal Loan"
    }
}

If the same user creates another account, you would send exactly the same ("redundant") data, except for the other account element/data, like this:

{
    "_index" : "bureau_data",
    "_type" : "_doc",
    "_id" : "another, from es generated id",
    "_score" : 1.0,
    "_source" : {
      "userid" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
      "gender" : "MALE",
      "firstname" : "trsqlsz",
      "middlename" : "rgj",
      "lastname" : "ggksb",
      "mobilephone" : "2125954664",
      "dob" : "1988-06-28 00:00:00",
      "applicationid" : "c7fb0147-22fd-4a5e-8851-98241de6aa50",
      "createdat" : "2019-06-07 19:28:54",
      "updatedat" : "2019-06-07 19:28:55",
      "source" : "4",
      "source_name" : "EXPERIAN",
      "applicationcreditreportid" : "b67f9180-9bb6-485c-9cfc-e7ccf9a70a69",
      "applicationcreditreportaccountid" : "the new id",
      "currentbalance" : 4711,
      "institutionid" : "foo",
      "accounttypeid" : "bar",
      "institution_name" : "foo bar",
      "account_type_name" : "foo baz"
    }
}
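
The flattening shown in the two sample documents can be sketched in Python. This is only an illustration, not part of the original answer: the function name `flatten_user` is hypothetical, the input is assumed to have the shape of the `_source` from the question, and the sample below keeps only a few of its fields. The second account reuses the placeholder values from sample document 2.

```python
def flatten_user(source):
    """Yield one flat document per (application, account) pair.

    Hypothetical helper; assumes the `_source` layout from the question.
    """
    derived = source.get("raw_derived", {})
    # User-level fields: everything in raw_derived except the nested list.
    user_fields = {k: v for k, v in derived.items() if k != "applications"}
    user_fields["userid"] = source.get("userid")

    for app in derived.get("applications", []):
        # Application-level fields, without the nested accounts list.
        app_fields = {k: v for k, v in app.items() if k != "accounts"}
        # Note: an application with no accounts yields no document here.
        for account in app.get("accounts", []):
            # On a key collision the later dict wins; the sample has none.
            yield {**user_fields, **app_fields, **account}


# Reduced sample based on the document in the question.
sample = {
    "userid": "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
    "raw_derived": {
        "gender": "MALE",
        "firstname": "trsqlsz",
        "applications": [
            {
                "applicationid": "c7fb0147-22fd-4a5e-8851-98241de6aa50",
                "source_name": "EXPERIAN",
                "accounts": [
                    {
                        "currentbalance": 418288,
                        "institution_name": "HDFC BANK",
                        "account_type_name": "Personal Loan",
                    },
                    {
                        "currentbalance": 4711,
                        "institution_name": "foo bar",
                        "account_type_name": "foo baz",
                    },
                ],
            }
        ],
    },
}

docs = list(flatten_user(sample))
for doc in docs:
    print(doc)
```

Each resulting dict could then be indexed as its own document, with the user and application fields repeated in every one of them.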

With this data model, you can run a simple query to get your results:

GET /my_index/_search
{
    "query": {
        "bool": {
            "must": [
            {
                "match":{
                    "account_type_name": "Credit Card"
                }
            },
            {
                "match":{
                    "source_name": "CIBIL"
                }
            }
            ]
        }
    }
}
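
Because every condition is now a plain top-level field, adding more AND conditions only means adding more `match` clauses. A minimal sketch of generating the request body from a dict of conditions follows. The helper name `build_and_query` is hypothetical, and the code only builds the JSON body; it does not talk to Elasticsearch.

```python
import json


def build_and_query(conditions):
    """Build a bool query with one match clause per field (all ANDed).

    Hypothetical helper; `conditions` maps a field name to the value to match.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {field: value}}
                    for field, value in conditions.items()
                ]
            }
        }
    }


body = build_and_query({
    "account_type_name": "Credit Card",
    "source_name": "CIBIL",
})
print(json.dumps(body, indent=4))
```

The printed JSON has the same structure as the query above, so it could be sent as the body of `GET /my_index/_search`.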

Regarding "json - Is there another way to optimize this Elasticsearch query for multiple nested fields in JSON?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57408646/