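I am trying to get the unique values in the "description" column. My data contains many identical descriptions, and I only want the distinct ones.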

con.search(index='data', body={
        "aggs": {
            "query": {
                "match": {"description": query_input}
            },
            "size": 30,
            "distinct_description": {
            }
        }


    })

However, this does not work at all.
Any suggestions?

Example:
{id: 1, state: "OP", description: "hot and humid"}
{id: 2, state: "LO", description: "dry"}
{id: 3, state: "WE", description: "hot and humid"}
{id: 4, state: "OP", description: "green and vegetative"}
{id: 5, state: "HP", description: "dry"}

Expected result:
{id: 1, state: "OP", description: "hot and humid"}
{id: 2, state: "LO", description: "dry"}
{id: 4, state: "OP", description: "green and vegetative"}

Best answer

You should try a terms aggregation on the description.keyword sub-field:

body = {
  "query": {
    "match": {"state": query_input}
  },
  "size": 1000,
  "aggs": {
    "distinct_descriptions": {
      "terms": {
        "field": "description.keyword"
      }
    }
  }
}

result = con.search(index='data', body=body)

# One entry per distinct description. A fresh dict is built for every bucket;
# appending one shared dict would leave every list entry pointing at the same object.
occurrences_list = []
for res in result["aggregations"]["distinct_descriptions"]["buckets"]:
    occurrences_list.append(
        {"description": res["key"], "doc_count": res["doc_count"], "score": None}
    )

# Attach the score of the first matching hit to each distinct description.
for res in result["hits"]["hits"]:
    for elem in occurrences_list:
        if res["_source"]["description"] == elem["description"]:
            if elem["score"] is None:
                elem["score"] = res["_score"]
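
If the goal is exactly the result shown in the question (one full document per distinct description), the hits can also be de-duplicated on the client side. The following is only a minimal sketch: first_hit_per_description is a hypothetical helper name, not part of the Elasticsearch client, and sample_response is a hand-written stand-in for a search response built from the question's sample data.

```python
def first_hit_per_description(response):
    """Return the _source of the first hit seen for each distinct description."""
    seen = set()
    unique_docs = []
    for hit in response["hits"]["hits"]:
        doc = hit["_source"]
        if doc["description"] not in seen:
            seen.add(doc["description"])
            unique_docs.append(doc)
    return unique_docs


# Hand-written stand-in for a search response (assumption: usual hits.hits[]._source shape).
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"id": 1, "state": "OP", "description": "hot and humid"}},
            {"_source": {"id": 2, "state": "LO", "description": "dry"}},
            {"_source": {"id": 3, "state": "WE", "description": "hot and humid"}},
            {"_source": {"id": 4, "state": "OP", "description": "green and vegetative"}},
            {"_source": {"id": 5, "state": "HP", "description": "dry"}},
        ]
    }
}

unique = first_hit_per_description(sample_response)
for doc in unique:
    print(doc)
```

Because hits arrive sorted by score, the document kept for each description is the highest-scoring one among those returned.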

Note that the query now also includes a size parameter; otherwise Elasticsearch returns only 10 hits by default.
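
The top-level size only limits the returned hits. The terms aggregation has its own size setting, which also defaults to 10 buckets, so data with more than 10 distinct descriptions may need both raised. A sketch under stated assumptions: build_distinct_query and its parameters max_hits and max_buckets are hypothetical names chosen for illustration.

```python
def build_distinct_query(query_input, max_hits=1000, max_buckets=1000):
    """Build a search body with separate limits for hits and aggregation buckets."""
    return {
        "query": {"match": {"state": query_input}},
        # Caps the number of documents returned under hits.hits.
        "size": max_hits,
        "aggs": {
            "distinct_descriptions": {
                "terms": {
                    "field": "description.keyword",
                    # Caps the number of distinct values (buckets) returned.
                    "size": max_buckets,
                }
            }
        },
    }


# When only the distinct values are needed, the hits can be skipped with size 0.
body = build_distinct_query("OP", max_hits=0)
print(body)
```

The resulting dict can be passed as body to con.search in the same way as the hand-written one above.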

Regarding python - searching for unique values via Elastic Search in Python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61187249/

10-12 17:03