This article covers how to recover from master node failure in a Kubernetes cluster set up with kubeadm, which may be a useful reference for anyone facing the same problem.

Problem Description

I set up a Kubernetes cluster with a single master node and two worker nodes using kubeadm, and I am trying to figure out how to recover from node failure.

When a worker node fails, recovery is straightforward: I create a new worker node from scratch, run kubeadm join, and everything's fine.
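
For reference, a minimal sketch of that rejoin flow; the master address, token, and hash below are placeholders, and a fresh token can be printed on the master if the original one has expired:

    # On the master: print a current join command
    # (bootstrap tokens expire after 24 hours by default)
    kubeadm token create --print-join-command

    # On the replacement worker: run the printed command, e.g.
    kubeadm join 10.0.0.1:6443 --token <token> \
        --discovery-token-ca-cert-hash sha256:<hash>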

However, I cannot figure out how to recover from master node failure (without interrupting the deployments running on the worker nodes). Do I need to back up and restore the original certificates, or can I just run kubeadm init to create a new master from scratch? How do I rejoin the existing worker nodes?

Recommended Answer

I ended up writing a Kubernetes CronJob that backs up the etcd data. If you are interested, I wrote a blog post about it: https://labs.consol.de/kubernetes/2018/05/25/kubeadm-backup.html
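
The CronJob in that post essentially snapshots etcd on a schedule. A minimal manual equivalent, assuming a kubeadm-provisioned etcd that serves TLS using the client certificates kubeadm generates under /etc/kubernetes/pki/etcd (the /backup destination is just an example):

    # Run on the master: take a point-in-time snapshot of the etcd keyspace
    ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
        --endpoints=https://127.0.0.1:2379 \
        --cacert=/etc/kubernetes/pki/etcd/ca.crt \
        --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
        --key=/etc/kubernetes/pki/etcd/healthcheck-client.key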

In addition to that, you may want to back up all of /etc/kubernetes/pki to avoid issues with secrets (tokens) having to be renewed.
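
A straightforward way to capture that directory (the destination path is arbitrary):

    # Archive the cluster CA and the service-account signing keys (sa.key/sa.pub);
    # a rebuilt master that reuses them keeps existing tokens valid
    tar czf /backup/kubernetes-pki.tar.gz -C /etc/kubernetes pki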

For example, kube-proxy uses a secret to store a token, and that token becomes invalid if you restore only the etcd data and regenerate the certificates: service-account tokens are signed with the cluster's service-account key, so a newly generated key no longer matches the tokens already stored in etcd.
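
On clusters of that era, the token lives in an auto-created secret whose name carries a random suffix; you can inspect it like this (the suffix below is hypothetical):

    # Service-account tokens are signed with /etc/kubernetes/pki/sa.key,
    # which is why that directory belongs in the backup
    kubectl -n kube-system get secrets | grep kube-proxy
    kubectl -n kube-system describe secret kube-proxy-token-abc12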

That concludes this look at recovering from master node failure with kubeadm; hopefully the answer above is helpful.
