Problem description
I'm using incremental checkpoints with RocksDB and saving the checkpoints to a remote destination (S3 in my case). What will happen if someone deletes the JobManager server (where the checkpoint coordinator runs) and reinstalls it? By losing the checkpoint coordinator, do I also lose the option to recover the state from the checkpoints? As far as I know, the coordinator holds all the references to the checkpoints.
Recommended answer
If you run Flink with high availability enabled, Flink stores pointers to its checkpoints in ZooKeeper. If the JobManager fails, Flink recovers all checkpoints from ZooKeeper and can resume the jobs from the latest completed checkpoint.
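As a rough illustration of the setup described above, a Flink configuration file might look something like the following. This is only a sketch: the bucket name, paths, and ZooKeeper hosts are made-up placeholders, and the key names follow older Flink releases (newer releases rename some of them, for example `high-availability.type`), so check the documentation for your version.

```yaml
# flink-conf.yaml (sketch; key names may differ in your Flink version)

# Incremental RocksDB checkpoints written to S3, as in the question
state.backend: rocksdb
state.backend.incremental: true
state.checkpoints.dir: s3://my-bucket/flink/checkpoints   # hypothetical bucket

# ZooKeeper-based high availability
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181   # hypothetical hosts
high-availability.storageDir: s3://my-bucket/flink/ha     # hypothetical path
```

With a setup along these lines, ZooKeeper holds only the pointers, while the JobManager metadata they refer to is written under `high-availability.storageDir`. That directory should therefore live on durable storage that survives a reinstall of the JobManager machine.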