Problem Description
I have a Postgres Db on AWS RDS and a kafka connect connector (Debezium Postgres) listening on a table. The configuration of the connector:
{
  "name": "my-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.dbname": "my_db",
    "database.user": "my_user",
    "max.queue.size": "32000",
    "slot.name": "my_slot",
    "tasks.max": "1",
    "publication.name": "my_publication",
    "database.server.name": "postgres",
    "heartbeat.interval.ms": "1000",
    "database.port": "my_port",
    "include.schema.changes": "false",
    "plugin.name": "pgoutput",
    "table.whitelist": "public.my_table",
    "tombstones.on.delete": "false",
    "database.hostname": "my_host",
    "database.password": "my_password",
    "name": "my-connector",
    "max.batch.size": "10000",
    "database.whitelist": "my_db",
    "snapshot.mode": "never"
  },
  "tasks": [
    {
      "connector": "my-connector",
      "task": 0
    }
  ],
  "type": "source"
}
The table is not updated as frequently as other tables, which initially led to replication lag like this:
SELECT slot_name,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as replicationSlotLag,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) as confirmedLag,
active
FROM pg_replication_slots;
slot_name | replicationslotlag | confirmedlag | active
-------------------------------+--------------------+--------------+--------
my_slot | 1664 MB | 1664 MB | t
It would grow so large that it threatened to use up all disk space.
I added a heartbeat, and if I log onto a kafka broker and set up a console consumer like this: ./kafka-console-consumer.sh --bootstrap-server my.broker.address:9092 --topic __debezium-heartbeat.postgres --from-beginning --consumer.config=/etc/kafka/consumer.properties
It would dump out all the heartbeat messages, and then show a new one every 1000ms.
However, the slot lag still grows and grows. If I do something like insert a dummy record into the table, the slot drops back to a small lag, so that works.
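The dummy-record workaround can be kept tidy by writing to a dedicated one-row table instead of touching real data. A sketch, assuming a hypothetical table named `public.debezium_heartbeat` that you add to `my_publication` (and to `table.whitelist`) so the connector actually receives the change and can confirm a newer LSN:

```sql
-- One-time setup: a tiny table whose only job is to carry heartbeat writes
CREATE TABLE IF NOT EXISTS public.debezium_heartbeat (
    id integer PRIMARY KEY DEFAULT 1,
    ts timestamptz NOT NULL DEFAULT now()
);
ALTER PUBLICATION my_publication ADD TABLE public.debezium_heartbeat;

-- Run periodically (e.g. from cron): the upsert generates a WAL event the
-- connector consumes, letting Postgres advance restart_lsn and shrink the slot
INSERT INTO public.debezium_heartbeat (id, ts)
VALUES (1, now())
ON CONFLICT (id) DO UPDATE SET ts = now();
```

The table stays at a single row, so the periodic write costs almost nothing on the database side.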
I would like to do this with a heartbeat though. I do not want to insert periodic messages since it sounds like it would add complexity. Why is the heartbeat not reducing the slot size?
Recommended Answer
See https://debezium.io/documentation/reference/1.0/connectors/postgresql.html#wal-disk-space
You really do need to emit periodic messages, but there is now built-in help for it: https://issues.redhat.com/browse/DBZ-1815
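The feature tracked in DBZ-1815 adds a `heartbeat.action.query` option to the connector: on each heartbeat interval, Debezium executes the given statement against the source database itself, so the periodic write is generated for you instead of by an external job. A sketch of the relevant config fragment, assuming a newer Debezium release that includes the feature and a hypothetical `public.debezium_heartbeat` table that is part of the publication:

```json
{
  "heartbeat.interval.ms": "1000",
  "heartbeat.action.query": "INSERT INTO public.debezium_heartbeat (ts) VALUES (now())"
}
```

Because the query runs inside the connector, no cron job or application change is needed; the resulting WAL event flows through the captured publication and lets the connector confirm a fresh LSN.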
That concludes this article on the Debezium Postgres Kafka Connector heartbeat not committing the LSN; we hope the recommended answer is helpful.