环境:
Ubuntu14.04 + Postgresql9.4.
以下是我的设置:(“->”表示物理流复制PSR)
Master1 -> Slave1 (primary) -> Slave2
这表现正确-主1上的变化反映在Slave1中,然后是Slave2。
如果禁用Master1,并使用触发器文件将Slave1升级为Master,则Slave1升级成功-我可以写入Slave1。
但是,新升级的Slave1和Slave2之间的复制将停止。
这是预期的行为吗?我希望复制继续这样:
Slave1 -> Slave2
这样写给Slave1的信就反映在Slave2中
更新
日志:
Slave1推广:
2017-10-03 16:43:20 BST @ LOCATION: libpqrcv_connect, libpqwalreceiver.c:107
2017-10-03 16:43:25 BST @ FATAL: XX000: could not connect to the primary server: could not connect to server: Connection refused
Is the server running on host "192.168.20.55" and accepting
TCP/IP connections on port 5432?
2017-10-03 16:43:25 BST @ LOCATION: libpqrcv_connect, libpqwalreceiver.c:107
2017-10-03 16:43:30 BST @ LOG: 00000: trigger file found: /var/lib/postgresql/9.4/main/failover_trigger.5432
2017-10-03 16:43:30 BST @ LOCATION: CheckForStandbyTrigger, xlog.c:11440
2017-10-03 16:43:30 BST @ LOG: 00000: redo done at 0/19000740
2017-10-03 16:43:30 BST @ LOCATION: StartupXLOG, xlog.c:7032
2017-10-03 16:43:30 BST @ LOG: 00000: last completed transaction was at log time 2017-10-03 16:41:23.430752+01
2017-10-03 16:43:30 BST @ LOCATION: StartupXLOG, xlog.c:7037
2017-10-03 16:43:30 BST @ LOG: 00000: selected new timeline ID: 2
2017-10-03 16:43:30 BST @ LOCATION: StartupXLOG, xlog.c:7153
2017-10-03 16:43:30 BST @ LOG: 00000: archive recovery complete
2017-10-03 16:43:30 BST @ LOCATION: exitArchiveRecovery, xlog.c:5459
2017-10-03 16:43:30 BST @ LOG: 00000: MultiXact member wraparound protections are now enabled
2017-10-03 16:43:30 BST @ LOCATION: DetermineSafeOldestOffset, multixact.c:2619
2017-10-03 16:43:30 BST @ LOG: 00000: database system is ready to accept connections
2017-10-03 16:43:30 BST @ LOCATION: reaper, postmaster.c:2795
2017-10-03 16:43:30 BST @ LOG: 00000: autovacuum launcher started
2017-10-03 16:43:30 BST @ LOCATION: AutoVacLauncherMain, autovacuum.c:431
奴隶2
2017-10-03 16:43:30 BST @ LOG: 00000: replication terminated by primary server
2017-10-03 16:43:30 BST @ DETAIL: End of WAL reached on timeline 1 at 0/190007A8.
2017-10-03 16:43:30 BST @ LOCATION: WalReceiverMain, walreceiver.c:446
2017-10-03 16:43:30 BST @ LOG: 00000: fetching timeline history file for timeline 2 from primary server
2017-10-03 16:43:30 BST @ LOCATION: WalRcvFetchTimeLineHistoryFiles, walreceiver.c:669
2017-10-03 16:43:30 BST @ LOG: 00000: record with zero length at 0/190007A8
2017-10-03 16:43:30 BST @ LOCATION: ReadRecord, xlog.c:4184
2017-10-03 16:43:30 BST @ LOG: 00000: restarted WAL streaming at 0/19000000 on timeline 1
2017-10-03 16:43:30 BST @ LOCATION: WalReceiverMain, walreceiver.c:374
2017-10-03 16:43:30 BST @ LOG: 00000: replication terminated by primary server
2017-10-03 16:43:30 BST @ DETAIL: End of WAL reached on timeline 1 at 0/190007A8.
奴隶1 IP:
192.168.20.56
奴隶2 IP:
192.168.20.53
pg_hba.conf允许Slave2连接到Slave1进行复制:
Slave1 pg_hba.conf段:
host replication replication 192.168.20.53/32 trust
Slave1恢复完成:
standby_mode = 'on'
primary_conninfo = 'user=replication host=192.168.20.55 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'
trigger_file = '/var/lib/postgresql/9.4/main/failover_trigger.5432'
Slave2 recovery.conf:
standby_mode = 'on'
primary_conninfo = 'user=replication host=192.168.20.56 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'
在这方面的任何帮助都是非常感谢的。
更新和解决方案
感谢
@Vao Tsun
应答,在Slave2 recovery.conf中添加设置为“最新”的recovery_target_时间线,并重新启动Slave2 postgresql server(不重新加载),允许复制进程重新启动:standby_mode = 'on'
primary_conninfo = 'user=replication host=192.168.20.56 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'
recovery_target_timeline = 'latest'
最佳答案
在slave1日志中可以看到:
2017-10-03 16:43:30 BST @ LOG: 00000: selected new timeline ID: 2
在奴隶2:
017-10-03 16:43:30 BST @ DETAIL: End of WAL reached on timeline 1 at 0/190007A8.
所以slave2并没有在升级后切换到时间线2。
正如我在评论中所说,您需要
recovery_target_timeline='latest'
在slave2 recovery.conf中https://www.postgresql.org/docs/current/static/recovery-target-settings.html
恢复目标时间线(字符串)指定恢复到
特别的时间表。默认设置是沿同一时间线恢复
在进行基本备份时是最新的。将此设置为
最新恢复到存档中找到的最新时间线,即
在备用服务器中很有用。除此之外你只需要设置这个
参数,在复杂的重新恢复情况下,您需要返回
在时间点恢复后达到的状态。见
第25.3.5节供讨论。
关于postgresql - Postgresql 9.4级联复制故障转移,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/46529323/