This article covers how to fix under-replicated partitions in Kafka, which may be a useful reference if you are hitting the same problem.

Problem Description

In our production environment, we often see partitions go under-replicated while consuming messages from the topics. We are using Kafka 0.11. From the documentation, what I understand is:

Configuration parameter replica.lag.max.messages was removed. Partition leaders will no longer consider the number of lagging messages when deciding which replicas are in sync.

Configuration parameter replica.lag.time.max.ms now refers not just to the time passed since the last fetch request from the replica, but also to the time since the replica last caught up. Replicas that are still fetching messages from leaders but did not catch up to the latest messages within replica.lag.time.max.ms will be considered out of sync.
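The ISR rule quoted above can be sketched as a toy model (illustrative only, not Kafka's actual implementation; the function and variable names are made up for this example):

```python
# Toy model of the in-sync-replica (ISR) check described in the docs:
# since Kafka 0.9, a follower stays in the ISR only if it fully caught
# up to the leader's log end offset within replica.lag.time.max.ms;
# merely continuing to issue fetch requests is no longer sufficient.

REPLICA_LAG_TIME_MAX_MS = 10_000  # broker default for replica.lag.time.max.ms


def is_in_sync(now_ms: int, last_caught_up_ms: int,
               lag_max_ms: int = REPLICA_LAG_TIME_MAX_MS) -> bool:
    """Return True if the follower last fully caught up to the leader
    no more than lag_max_ms milliseconds ago."""
    return now_ms - last_caught_up_ms <= lag_max_ms


# A replica that caught up 3 s ago is in sync; one that last caught up
# 27 s ago is dropped from the ISR even if it is still fetching.
print(is_in_sync(10_000, 7_000))   # caught up 3 s ago
print(is_in_sync(30_000, 3_000))   # last caught up 27 s ago
```

This is why a replica can show as "lagging" even though it is actively fetching: the clock is measured from the last time it was fully caught up, not from its last fetch.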

How do we fix this issue? What are the different reasons for replicas to go out of sync? In our scenario, we have all the Kafka brokers in a single rack of blade servers, and all of them share the same network with 10 Gbps Ethernet (simplex). I do not see any reason for the replicas to go out of sync due to the network.

Recommended Answer

We faced the same issue.

The solution was:

  1. Restart the Zookeeper leader.
  2. Restart the broker(s) that are not replicating some of the partitions.

No data was lost.
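A sketch of how the steps above might be carried out with the stock Kafka 0.11 and ZooKeeper tooling (host names and the ZooKeeper connect string are placeholders for your own environment):

```shell
# 1. List the partitions that are currently under-replicated
bin/kafka-topics.sh --describe --under-replicated-partitions \
  --zookeeper zk1:2181

# 2. Identify the ZooKeeper leader via the "srvr" four-letter command,
#    then restart that ZooKeeper node
echo srvr | nc zk1 2181 | grep Mode   # "Mode: leader" or "Mode: follower"

# 3. Restart the broker(s) hosting the lagging replicas, one at a time,
#    waiting for the ISR to recover between restarts (re-run step 1 to
#    confirm the under-replicated list shrinks back to empty)
```

Restarting brokers one at a time matters: restarting several at once can briefly take all replicas of a partition offline.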

The issue was due to a faulty state in ZK; there was an open issue on ZK for this, but I don't remember the number.

