elasticsearch - After restoring an ES 6.4.2 snapshot into ES 7.6.1, why can't index documents be fetched by ID?

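After upgrading my ES cluster from 6.4.2 to 7.6.1 and restoring a snapshot of the old cluster, the documents in one particular index can no longer be fetched by ID.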

After restoring the snapshot, this request no longer returns the document:

GET myindex/_doc/c1d89b00-d030-11e3-bd52-f3718ac695f3

If I index a copy of the document:

PUT myindex/_doc/c1d89b00-d030-11e3-bd52-f3718ac695f3
{
   "name" : "dogs and cats",
   "notes" : "Imported",
   "myid" : "c1d89b00-d030-11e3-bd52-f3718ac695f3" // yes, it's redundant
}

then this suddenly works:

GET myindex/_doc/c1d89b00-d030-11e3-bd52-f3718ac695f3

But now I have two documents with the same ID.

(An update is not possible, because the existing document cannot be fetched by ID.)
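One way to confirm that two copies really exist, and to see which shards hold them, is an `ids` query with `explain` enabled, which (to the best of my knowledge) adds a `_shard` entry to every hit. The following is a minimal Python sketch, not part of the original post: the helper names `ids_query` and `shards_holding` are hypothetical, and the response is assumed to have the usual `hits.hits` shape of a search response.

```python
from collections import Counter


def ids_query(doc_id):
    """Build a search body that looks up one document ID on every shard.

    Unlike GET <index>/_doc/<id>, a search is sent to all shards, so it can
    find copies that the routed GET does not see.
    """
    return {
        "explain": True,  # assumed to add "_shard" and "_node" to each hit
        "query": {"ids": {"values": [doc_id]}},
    }


def shards_holding(response):
    """Count the hits per shard in a parsed search response (a dict)."""
    hits = response["hits"]["hits"]
    return Counter(hit.get("_shard", "unknown") for hit in hits)


# Usage: send ids_query(...) to POST myindex/_search, parse the JSON reply,
# and pass it to shards_holding(). More than one key in the result means
# the same ID lives on more than one shard.
```

If the two copies show up on different shards, that would be consistent with a routing mismatch rather than with corrupted data.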

Index definition:

GET myindex
{
  "myindex" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "merge_id" : {
          "type" : "keyword"
        },
        "name" : {
          "type" : "text",
          "analyzer" : "index_ngram",
          "search_analyzer" : "search_ngram"
        },
        "notes" : {
          "type" : "text",
          "analyzer" : "index_ngram",
          "search_analyzer" : "search_ngram"
        },
        "myid" : {
          "type" : "keyword"
        }
      }
    },
    "settings" : {
      "index" : {
        "max_ngram_diff" : "48",
        "number_of_shards" : "5",
        "provided_name" : "myindex",
        "creation_date" : "1584420860612",
        "analysis" : {
          "filter" : {
            "my_ngram" : {
              "type" : "ngram",
              "min_gram" : "2",
              "max_gram" : "50"
            }
          },
          "analyzer" : {
            "index_ngram" : {
              "filter" : [
                "lowercase",
                "my_ngram"
              ],
              "type" : "custom",
              "tokenizer" : "keyword"
            },
            "default" : {
              "tokenizer" : "keyword"
            },
            "search_ngram" : {
              "filter" : "lowercase",
              "type" : "custom",
              "tokenizer" : "keyword"
            }
          }
        },
        "number_of_replicas" : "0",
        "uuid" : "uyp_WK3xRjucFRGhYDHbcQ",
        "version" : {
          "created" : "7060199"
        }
      }
    }
  }
}

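The most interesting part is that I have other indices (using a different ID format) whose data was restored from the same snapshot, and their documents can still be fetched by ID after the upgrade.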

Best answer

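The problem of documents not being fetchable by ID after restoring the old cluster's snapshot appears to be related, in some way, to the number of shards used by that index.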
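To illustrate why the shard count can matter here: a GET by ID is sent only to the single shard computed from the document's routing value, following the formula in the Elasticsearch `_routing` documentation. The sketch below reproduces that formula in Python under loud assumptions: `zlib.crc32` is only a stand-in for the Murmur3 hash Elasticsearch actually uses, and the shard counts are illustrative values, not settings taken from the cluster in question.

```python
import zlib


def route_to_shard(doc_id, num_primary_shards, num_routing_shards):
    """Shard number a document ID routes to, per the documented formula:

        routing_factor = num_routing_shards / num_primary_shards
        shard_num = (hash(_routing) % num_routing_shards) / routing_factor
    """
    if num_routing_shards % num_primary_shards != 0:
        raise ValueError("num_routing_shards must be a multiple of num_primary_shards")
    routing_factor = num_routing_shards // num_primary_shards
    # Stand-in hash for illustration only; Elasticsearch uses Murmur3.
    hashed = zlib.crc32(doc_id.encode("utf-8"))
    return (hashed % num_routing_shards) // routing_factor


doc_id = "c1d89b00-d030-11e3-bd52-f3718ac695f3"

# With five primaries, the target shard depends on the routing parameters.
print(route_to_shard(doc_id, 5, 5))
print(route_to_shard(doc_id, 5, 640))

# With a single primary, every ID routes to shard 0, whatever the hash.
print(route_to_shard(doc_id, 1, 1))
```

If a document physically sits on a different shard from the one this computation selects, a GET by ID will not find it, while a search (which visits every shard) still will. With only one shard there is nothing to mismatch.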

Shrinking the index into a new index with a single shard, as shown below, resolved the issue:

PUT /myindex/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "instance-0000000000",
    "index.blocks.write": true
  }
}

POST myindex/_shrink/myindex_shrinked
{
  "settings": {
    "index.number_of_replicas": 0,
    "index.number_of_shards": 1,
    "index.codec": "best_compression"
  },
  "aliases": {
    "my_search_indices": {}
  }
}

PUT /myindex_shrinked/_settings
{
  "settings": {
    "index.routing.allocation.require._name": null,
    "index.blocks.write": true
  }
}
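For scripting the same three steps outside the console, here is a minimal Python sketch using only the standard library. It mirrors the requests above one-to-one; the base URL `http://localhost:9200` and the helper names `shrink_requests` and `send` are assumptions for illustration, and authentication and error handling are left out.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: adjust to your cluster


def shrink_requests(source, target, node_name):
    """Return the (method, path, body) triples for the shrink workflow."""
    return [
        # 1. Move a copy of every shard to one node and block writes.
        ("PUT", f"/{source}/_settings", {"settings": {
            "index.routing.allocation.require._name": node_name,
            "index.blocks.write": True,
        }}),
        # 2. Shrink into a new single-shard index.
        ("POST", f"/{source}/_shrink/{target}", {
            "settings": {
                "index.number_of_replicas": 0,
                "index.number_of_shards": 1,
                "index.codec": "best_compression",
            },
            "aliases": {"my_search_indices": {}},
        }),
        # 3. Clear the allocation requirement on the new index.
        #    The write block stays on, as in the answer; use None (JSON null)
        #    instead of True if the new index should accept writes again.
        ("PUT", f"/{target}/_settings", {"settings": {
            "index.routing.allocation.require._name": None,
            "index.blocks.write": True,
        }}),
    ]


def send(method, path, body):
    """Send one JSON request to the cluster and return the parsed reply."""
    request = urllib.request.Request(
        ES_URL + path,
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (not executed here). Wait for shard relocation to finish after the
# first request before sending the shrink request:
#   for method, path, body in shrink_requests(
#           "myindex", "myindex_shrinked", "instance-0000000000"):
#       print(send(method, path, body))
```

The shrink request only succeeds once a copy of every shard of the source index is on the same node, so a real script should poll cluster health or the shard listing between the first and second step.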