primary:10.189.102.118

standby:10.189.100.195

1. 配置ssh互信机制

  • 在primary主库执行
$ ssh-keygen -t rsa
$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
$ ssh-copy-id [email protected]
  • 在standby备库执行
$ ssh-keygen -t rsa
$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
$ ssh-copy-id [email protected]

2. 在standby服务器安装postgres数据库,不需要初始化.

安装过程详见:http://www.cnblogs.com/ilifeilong/p/6979288.html

3. 在primary服务器创建具有REPLICATION权限的复制用户

postgres=# CREATE ROLE repl WITH REPLICATION PASSWORD 'repl' LOGIN;

4. 允许复制用户远程连接到primary服务器

$ grep "^host" pg_hba.conf
host all all 127.0.0.1/ trust
host replication repl 0.0.0.0/ md5
host all all ::/ trust

5. 在primary服务器设置流复制相关的参数

$ mkdir /usr/local/pgsql/arch
$ egrep "archive_mode|max_wal_senders|wal_keep_segments|archive_command|wal_level|hot_standby" postgresql.conf
al_level = hot_standby # minimal, archive, hot_standby, or logical
archive_mode = on # enables archiving; off, on, or always
archive_command = 'test ! -f /usr/local/pgsql/arch/%f && cp %p /usr/local/pgsql/arch/%f && rsync -a %p [email protected]:/usr/local/pgsql/arch/%f'
max_wal_senders = # max number of walsender processes
wal_keep_segments = # in logfile segments, 16MB each; disables
hot_standby = on # "on" allows queries during recovery
#hot_standby_feedback = off # send info from standby to prevent

6. 重新启动primary服务器进程

$ pg_ctl stop -m fast
$ pg_ctl start

7. 对primary服务器做一个全备并传输到standby服务器

  •    方法一,在primary服务器通过pg_(start|stop)_backup函数进行备份
postgres=# SELECT pg_start_backup('label', true);
pg_start_backup
-----------------
/E6000060
( row)
$ rsync -az --progress ${PGDATA} [email protected]:/usr/local/pgsql/ --exclude postmaster.pid
postgres=# SELECT pg_stop_backup();
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
pg_stop_backup
----------------
/E60005C8
( row)
  •  方法二,在standby服务器通过pg_basebackup命令进行备份,要求standby的PGDATA目录为空
$ pg_basebackup --host=10.189.102.118 --username=repl --port= --label=backup --verbose --progress --pgdata=/usr/local/pgsql/data --checkpoint=fast --format=p --xlog-method=stream
Password:
transaction log start point: /EA000028 on timeline
pg_basebackup: starting background WAL receiver
/ kB (%), / tablespace
transaction log end point: /EA000830
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: base backup completed

8. 设置standby数据库复制相关参数,使得standby失效转移后可以作为主库工作

$ mkdir /usr/local/pgsql/arch
$ grep "hot_standby" postgresql.conf
hot_standby = on # "on" allows queries during recovery
#hot_standby_feedback = off # send info from standby to prevent

9. 在standby文件创建恢复文件

$ cat recovery.conf
restore_command = 'cp /usr/local/pgsql/arch/%f "%p"'
standby_mode = 'on'
primary_conninfo = 'user=repl password=repl host=10.189.102.118 port=5432 sslmode=disable sslcompression=1'
archive_cleanup_command = 'pg_archivecleanup -d /usr/local/pgsql/arch %r >> /usr/local/pgsql/arch/archive_cleanup.log'
trigger_file = '/usr/local/pgsql/data/trigger_active_standby'

10. 启动standby数据库进程,自动启动流复制

$ pg_ctl start -w
waiting for server to start....LOG: could not create IPv6 socket: Address family not supported by protocol
LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory "pg_log".
done
server started

11. 检查primary和standby数据库的状态

  • 通过函数和系统表查看
edbstore=# select * from pg_stat_replication;           #在primary主库查看
-[ RECORD 1 ]----+------------------------------
pid | 15013
usesysid | 19206
usename | repl
application_name | walreceiver
client_addr | 10.189.100.195
client_hostname |
client_port | 56072
backend_start | 2017-06-13 08:10:35.400508-07
backend_xmin |
state | streaming
sent_location | 7/EC01A588
write_location | 7/EC01A588
flush_location | 7/EC01A588
replay_location | 7/EC01A588
sync_priority | 0
sync_state | async edbstore=# SELECT pg_current_xlog_location(); #在primary主库查看
pg_current_xlog_location
--------------------------
7/EC01A588
(1 row) postgres=# select pg_last_xlog_receive_location(),pg_last_xlog_replay_location(),pg_last_xact_replay_timestamp(); #在standby备库查看
pg_last_xlog_receive_location | pg_last_xlog_replay_location | pg_last_xact_replay_timestamp
-------------------------------+------------------------------+-------------------------------
7/EC01A588 | 7/EC01A588 | 2017-06-13 08:25:20.281568-07
(1 row)
  • 通过进程查看
$ ps -ef | grep sender | grep -v grep        #在primary库查看
postgres : ? :: postgres: wal sender process repl 10.189.100.195() streaming /EC01A668 $ ps -ef | grep receiver | grep -v grep #在standby库查看
postgres : ? :: postgres: wal receiver process streaming /EC01A668
  •  查看备库落后主库多少字节的WAl日志
postgres=# select pg_xlog_location_diff(pg_current_xlog_location(),replay_location) from pg_stat_replication ;     #在primary库查看
pg_xlog_location_diff
-----------------------
0
(1 row)
  •  把主库WAL日志位置转换成WAL文件名称和偏移量
postgres=# select pg_current_xlog_location();
pg_current_xlog_location
--------------------------
7/EC021790
(1 row) postgres=# select * from pg_xlogfile_name_offset('7/EC021790');
file_name | file_offset
--------------------------+-------------
000000010000000700000076 | 137104
(1 row)
  • 查看主库备库的状态
postgres=# select pg_is_in_recovery();            #在primary主库查看
pg_is_in_recovery
-------------------
f
(1 row) postgres=# select pg_is_in_recovery(); #在standby库查看
pg_is_in_recovery
-------------------
t
(1 row)
05-08 15:19