PostgreSQL 三节点集群故障模拟及恢复

(postgreSQL9.5.1)

正常状态:

10.2.208.10:node1:master

10.2.208.11:node2:standby1同步

10.2.208.12:node3:standby2异步

10.2.208.13:node4:备用机

故障模拟及恢复

1 主库故障

1 停止主库(master)[M]服务[10.2.208.10]

pg_ctl –D ../data stop –m fast

2 更改S1的postgresql.conf[10.2.208.11]

synchronous_standby_names = 'node3'

3 更改S2 recovery.conf[10.2.208.12]

primary_conninfo = 'host=node2IP[10.2.208.11] port=6432 user=replica password=replica application_name=node3'

4 提升S1为新的主库Master

pg_ctl –D ../data promote[10.2.208.11]

5 重新启动S2[10.2.208.12] 使S2成为新的同步节点

pg_ctl –D ../data restart

6使node4[10.2.208.13]成为异步节点

mkdir data

chmod 0700 data

pg_basebackup -h 10.2.208.11 -P -Fp -x -v -p 6432 -U replica -D /home/replica/data/

7 node4[10.2.208.13]recovery文件处理

mv recovery.done recovery.conf

primary_conninfo = 'host=10.2.208.11 port=6432 user=replica password=replica'

8 node4[10.2.208.13]postgresql.conf文件处理

#synchronous_standby_names = ' '

9 启动node4[10.2.208.13]

pg_ctl -D ../data/ start

2备库1故障

1停止standby1服务

pg_ctl -D ../data/ stop -m fast

2更改node1[10.2.208.10]postgresql.conf

synchronous_standby_names = 'node3'

3更改S2[10.2.208.12]recovery.conf文件

primary_conninfo = 'host=10.2.208.10 port=6432 user=replica password=replica application_name=node3'

4重启主库Master

pg_ctl –D ../data restart

5重启S2成为同步节点

6 node4[10.2.208.13]成为异步节点

pg_basebackup -h 10.2.208.10 -P -Fp -x -v -p 6432 -U replica -D /home/replica/data/

mv recovery.done recovery.conf

primary_conninfo = 'host=10.2.208.10 port=6432 user=replica password=replica '

postgresql.conf文件

#synchronous_standby_names = ' '

7启动node4

pg_ctl –D ../data start

3 备库2故障

1 停止备库2 node3[10.2.208.12]的服务

pg_ctl –D ../data stop –m fast

2 node4[10.2.208.13]成为异步节点

pg_basebackup -h 10.2.208.10 -P -Fp -x -v -p 6432 -U replica -D /home/replica/data/

mv recovery.done recovery.conf

primary_conninfo = 'host=10.2.208.10 port=6432 user=replica password=replica '

postgresql.conf文件

#synchronous_standby_names = ' '

3启动node4

pg_ctl –D ../data start

05-11 15:02