Problem description
I have a Kafka topic with 25 partitions, and the cluster has been running for 5 months.
As I understand it, offsets in each partition of a topic start at 0 and increase without bound (0, 1, 2, ...).
The log-end-offset is currently very high (1230628032 right now).
I created a new consumer group with its offset set to earliest, so I expected a client in that consumer group to start reading from offset 0.
The command I used to create the new consumer group with its offset set to earliest:
kafka-consumer-groups --bootstrap-server <IP_address>:9092 --reset-offsets --to-earliest --topic some-topic --group to-earliest-cons --execute
I can see that the consumer group was created. I expected the current-offset to be 0; however, when I described the consumer group, the current offset was very high (1143755193 at the moment).
The record retention period is set to 7 days (the standard value).
My question is: why doesn't a consumer in this consumer group start reading from offset 0? Does it have something to do with data retention?
Can anyone help me understand this?
Recommended answer
It is exactly data retention. Kafka has very likely already deleted the oldest messages, including the one at offset 0, from your partitions, so there is nothing left to read at offset 0. Instead, Kafka sets the offset to the earliest message still available in each partition. You can check those offsets using:
./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list <IP_address>:9092 --topic some-topic --time -2
You will probably see values very close to the offset you are seeing for the new consumer group.
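To illustrate why the earliest offset keeps climbing, here is a minimal toy model of a single partition in Python. It is only a sketch: `PartitionLog` and its methods are hypothetical names invented for this example, not Kafka APIs, and real brokers delete whole log segments rather than individual records. The point it shows is that offsets are assigned once and never reused, so time-based retention moves the earliest available offset forward while the log-end-offset keeps growing.

```python
from collections import deque

DAY_MS = 24 * 60 * 60 * 1000


class PartitionLog:
    """Toy model of one partition (hypothetical, not Kafka's implementation)."""

    def __init__(self, retention_ms):
        self.retention_ms = retention_ms
        self.records = deque()   # (offset, timestamp_ms) pairs, oldest first
        self.log_end_offset = 0  # offset the next appended record will receive

    def append(self, timestamp_ms):
        # Offsets only ever increase; a deleted offset is never handed out again.
        self.records.append((self.log_end_offset, timestamp_ms))
        self.log_end_offset += 1

    def apply_retention(self, now_ms):
        # Drop records older than the retention period (simplified: Kafka
        # actually removes whole segments, not single records).
        while self.records and now_ms - self.records[0][1] > self.retention_ms:
            self.records.popleft()

    def earliest_offset(self):
        # With an empty log, the earliest offset equals the log-end-offset.
        return self.records[0][0] if self.records else self.log_end_offset


# One record per day for 150 days (about 5 months), with 7-day retention.
log = PartitionLog(retention_ms=7 * DAY_MS)
for day in range(150):
    log.append(timestamp_ms=day * DAY_MS)
log.apply_retention(now_ms=150 * DAY_MS)

print(log.earliest_offset(), log.log_end_offset)  # prints "143 150"
```

In this model, resetting a group "to earliest" would land on offset 143, not 0, which mirrors the behavior described above at a much smaller scale.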
You can also try setting the offset explicitly to 0:
./kafka-consumer-groups.sh --bootstrap-server <IP_address>:9092 --reset-offsets --to-offset 0 --topic some-topic --group to-earliest-cons --execute
However, you will see a warning that offset 0 does not exist and that a higher value (the aforementioned earliest available message) will be used instead:
New offset (0) is lower than earliest offset for topic partition some-topic. Value will be set to 1143755193
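The warning above amounts to clamping the requested offset into the range that still exists in the partition. A minimal Python sketch of that idea follows. The function name `resolve_reset_offset` is hypothetical and this is not the tool's actual code. The two figures come from the question; they were observed at different moments and may belong to different partitions, so treat them as illustrative only.

```python
def resolve_reset_offset(requested, earliest, log_end):
    """Clamp a requested offset into [earliest, log_end].

    Hypothetical helper illustrating the behavior in the warning above;
    it is not part of Kafka.
    """
    if requested < earliest:
        return earliest  # data below this offset was removed by retention
    if requested > log_end:
        return log_end   # nothing has been written past the log end yet
    return requested


# Figures taken from the question, used here only as an illustration.
earliest = 1143755193
log_end = 1230628032

print(resolve_reset_offset(0, earliest, log_end))  # prints 1143755193
```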