java - Debezium flush timeout and OutOfMemoryError errors with MySQL

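I'm using Debezium 0.7 to read from MySQL, but I get flush timeout and OutOfMemoryError errors during the initial snapshot phase. Judging by the logs below, the connector seems to be trying to write too many messages in one go: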

WorkerSourceTask{id=accounts-connector-0} flushing 143706 outstanding messages for offset commit   [org.apache.kafka.connect.runtime.WorkerSourceTask]
WorkerSourceTask{id=accounts-connector-0} Committing offsets   [org.apache.kafka.connect.runtime.WorkerSourceTask]
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
WorkerSourceTask{id=accounts-connector-0} Failed to flush, timed out while waiting for producer to flush outstanding 143706 messages   [org.apache.kafka.connect.runtime.WorkerSourceTask]

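I'm wondering what the correct settings (http://debezium.io/docs/connectors/mysql/#connector-properties) are for sizeable databases (> 50GB). I didn't have this issue with smaller databases. Simply increasing the timeout doesn't seem like a good strategy. I'm currently using the default connector settings.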

Update

I changed the settings as suggested in the answer below, and that fixed the problem:

OFFSET_FLUSH_TIMEOUT_MS: 60000  # default 5000
OFFSET_FLUSH_INTERVAL_MS: 15000  # default 60000
MAX_BATCH_SIZE: 32768  # default 2048
MAX_QUEUE_SIZE: 131072  # default 8192
HEAP_OPTS: '-Xms2g -Xmx2g'  # default '-Xms1g -Xmx1g'
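As a rough illustration of why the timeout matters, the sketch below estimates the sustained rate at which the producer would have to acknowledge records to drain the 143706 outstanding messages from the log within the default and the raised offset.flush.timeout.ms. The class and method names are hypothetical, and the calculation is a simplification: it ignores record size, broker latency, and any records that keep arriving during the flush.

```java
// Back-of-the-envelope estimate, not a model of Kafka Connect internals.
final class FlushBudget {

    /** Records per second needed to drain `outstanding` records within `timeoutMs`. */
    static double requiredRate(long outstanding, long timeoutMs) {
        if (timeoutMs <= 0) {
            throw new IllegalArgumentException("timeoutMs must be positive");
        }
        return outstanding * 1000.0 / timeoutMs;
    }

    public static void main(String[] args) {
        long outstanding = 143_706L; // taken from the "flushing ... outstanding messages" log line
        System.out.printf("timeout  5000 ms: %.0f records/s%n", requiredRate(outstanding, 5_000L));
        System.out.printf("timeout 60000 ms: %.0f records/s%n", requiredRate(outstanding, 60_000L));
    }
}
```

Under these assumptions the default timeout demands roughly 28,700 records per second, while the 60-second timeout needs only about 2,400.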

Best answer

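This is a very complex question. First of all, the default memory settings for the Debezium Docker images are quite low, so if you are using them you may need to increase them.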

Next, there are multiple factors at play. I recommend the following steps.

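Increase max.batch.size and max.queue.size - reduces the number of commits

Increase offset.flush.timeout.ms - gives Kafka Connect time to process the accumulated records

Decrease offset.flush.interval.ms - should reduce the number of accumulated offsets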
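The steps above touch two different layers: offset.flush.timeout.ms and offset.flush.interval.ms are Kafka Connect worker settings, while max.batch.size and max.queue.size are Debezium connector properties. The sketch below groups them accordingly, using the values from the question's update. The class and method names are hypothetical, and how the settings actually reach the worker and the connector (environment variables, a worker properties file, or the connector's JSON configuration) depends on your deployment.

```java
import java.util.Properties;

// Illustrative grouping of the tuned settings; values come from the question's update.
final class TuningSketch {

    /** Kafka Connect worker-level settings. */
    static Properties workerSettings() {
        Properties p = new Properties();
        p.setProperty("offset.flush.timeout.ms", "60000");  // more time to flush outstanding records
        p.setProperty("offset.flush.interval.ms", "15000"); // commit offsets more often
        return p;
    }

    /** Debezium connector-level settings. */
    static Properties connectorSettings() {
        Properties p = new Properties();
        p.setProperty("max.batch.size", "32768");
        p.setProperty("max.queue.size", "131072");
        return p;
    }

    /** Debezium documents that max.queue.size should be larger than max.batch.size. */
    static boolean queueLargerThanBatch(Properties connector) {
        int batch = Integer.parseInt(connector.getProperty("max.batch.size"));
        int queue = Integer.parseInt(connector.getProperty("max.queue.size"));
        return queue > batch;
    }

    public static void main(String[] args) {
        System.out.println("worker:    " + workerSettings());
        System.out.println("connector: " + connectorSettings());
        System.out.println("queue larger than batch: " + queueLargerThanBatch(connectorSettings()));
    }
}
```

Larger batches and queues also hold more records in memory at once, so they should be sized together with the heap settings mentioned earlier.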

Unfortunately, issue KAFKA-6551 is lurking in the background and can still wreak havoc.