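After upgrading my ES cluster from 6.4.2 to 7.6.1 and restoring a snapshot taken on the old cluster, documents in one particular index can no longer be fetched by id.
Since the snapshot was restored, this request no longer returns the document: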
GET myindex/_doc/c1d89b00-d030-11e3-bd52-f3718ac695f3
If I index the same document again:
PUT myindex/_doc/c1d89b00-d030-11e3-bd52-f3718ac695f3
{
  "name" : "dogs and cats",
  "notes" : "Imported",
  "myid" : "c1d89b00-d030-11e3-bd52-f3718ac695f3" // yes, it's redundant
}
then this suddenly works:
GET myindex/_doc/c1d89b00-d030-11e3-bd52-f3718ac695f3
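However, I now have two documents with the same id.
(An update does not work either, because the original document cannot be fetched by id.)
Index definition: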
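One way to confirm the duplicate and see where each copy lives is sketched below. It assumes no custom routing was used when indexing, so the routing value defaults to the document id. The first request searches every shard with an `ids` query and, because of `"explain": true`, each hit carries a `_shard` field. The second asks which shard the id currently routes to, which is the only shard a GET by id looks in.

```console
GET myindex/_search
{
  "explain": true,
  "query": {
    "ids": {
      "values": ["c1d89b00-d030-11e3-bd52-f3718ac695f3"]
    }
  }
}

GET myindex/_search_shards?routing=c1d89b00-d030-11e3-bd52-f3718ac695f3
```

If the search returns two hits on different shards and only one of them matches the shard reported by `_search_shards`, the restored copy sits on a shard that id lookups no longer reach.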
GET myindex
{
  "myindex" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "merge_id" : {
          "type" : "keyword"
        },
        "name" : {
          "type" : "text",
          "analyzer" : "index_ngram",
          "search_analyzer" : "search_ngram"
        },
        "notes" : {
          "type" : "text",
          "analyzer" : "index_ngram",
          "search_analyzer" : "search_ngram"
        },
        "myid" : {
          "type" : "keyword"
        }
      }
    },
    "settings" : {
      "index" : {
        "max_ngram_diff" : "48",
        "number_of_shards" : "5",
        "provided_name" : "myindex",
        "creation_date" : "1584420860612",
        "analysis" : {
          "filter" : {
            "my_ngram" : {
              "type" : "ngram",
              "min_gram" : "2",
              "max_gram" : "50"
            }
          },
          "analyzer" : {
            "index_ngram" : {
              "filter" : [
                "lowercase",
                "my_ngram"
              ],
              "type" : "custom",
              "tokenizer" : "keyword"
            },
            "default" : {
              "tokenizer" : "keyword"
            },
            "search_ngram" : {
              "filter" : "lowercase",
              "type" : "custom",
              "tokenizer" : "keyword"
            }
          }
        },
        "number_of_replicas" : "0",
        "uuid" : "uyp_WK3xRjucFRGhYDHbcQ",
        "version" : {
          "created" : "7060199"
        }
      }
    }
  }
}
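The most interesting part is that I have other indices (using a different id format) whose data was restored from the same snapshot, and their documents can still be fetched by id after the upgrade.
Best answer
The problem of documents not being retrievable by id after restoring the old cluster's snapshot appears to be related to the number of shards the index uses.
Shrinking the index into a new single-shard index, as shown below, resolved it: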
PUT /myindex/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "instance-0000000000",
    "index.blocks.write": true
  }
}

POST myindex/_shrink/myindex_shrinked
{
  "settings": {
    "index.number_of_replicas": 0,
    "index.number_of_shards": 1,
    "index.codec": "best_compression"
  },
  "aliases": {
    "my_search_indices": {}
  }
}

PUT /myindex_shrinked/_settings
{
  "settings": {
    "index.routing.allocation.require._name": null,
    "index.blocks.write": true
  }
}
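
A few follow-up requests can help check the result. This is only a sketch against the index names used above, not part of the original fix. The `_cat/shards` call should list a single primary shard for the new index, and the GET repeats the lookup that failed before. The last request is an assumption about what you may want next: the shrunken index keeps the write block, so setting `index.blocks.write` to `null` is needed only if the index should accept writes again.

```console
GET _cat/shards/myindex_shrinked?v

GET myindex_shrinked/_doc/c1d89b00-d030-11e3-bd52-f3718ac695f3

PUT /myindex_shrinked/_settings
{
  "settings": {
    "index.blocks.write": null
  }
}
```

Note that the shrink request itself succeeds only once a copy of every shard of `myindex` has relocated to the node named in the first settings call, so running `GET _cat/shards/myindex?v` before `_shrink` is a reasonable precaution.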