Question
I'm using incremental checkpoints with RocksDB and saving the checkpoints to a remote destination (S3 in my case). What will happen if someone deletes the JobManager server (where the checkpoint coordinator runs) and reinstalls it? By losing the checkpoint coordinator, do I also lose the option to recover the state from the checkpoints? From what I know, the coordinator holds all the references to the checkpoints.
Recommended answer
If you run Flink with high availability enabled, Flink stores pointers to its checkpoints in ZooKeeper, so the references do not live only in the JobManager's memory. If the JobManager fails, Flink recovers all checkpoints from ZooKeeper and can resume the jobs from the latest completed checkpoint.
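As a rough illustration of the setup described above, a configuration along these lines enables ZooKeeper-based high availability together with incremental RocksDB checkpoints on S3. This is only a sketch: the key names follow the Flink 1.x flink-conf.yaml format and may differ in other versions, and the ZooKeeper hosts, bucket paths, and cluster id are hypothetical placeholders, not values from the question.

```yaml
# flink-conf.yaml (sketch; Flink 1.x key names, check the docs for your version)

# High availability: checkpoint pointers are kept in ZooKeeper
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181   # hypothetical hosts
high-availability.storageDir: s3://my-bucket/flink/ha               # hypothetical bucket
high-availability.cluster-id: /my-flink-cluster                     # hypothetical id

# Incremental RocksDB checkpoints written to S3
state.backend: rocksdb
state.backend.incremental: true
state.checkpoints.dir: s3://my-bucket/flink/checkpoints             # hypothetical bucket
```

With a setup like this, a reinstalled JobManager that points at the same ZooKeeper quorum, cluster id, and storage directory should be able to find the latest completed checkpoint again.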