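I. Deployment Notes

1.1 Environment

The test environment used in this document is as follows:

PostgreSQL primary: 192.168.1.45

PostgreSQL standby: 192.168.1.50

Software and system versions

PostgreSQL version: PostgreSQL 9.2.4

Linux version: Red Hat 5.8

pgpool version: pgpool-II version 3.3.4 (tokakiboshi)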

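1.2 About This Document

  In the earlier article on configuring PostgreSQL stream replication, we set up stream replication and obtained a hot standby. In production, however, one of the main purposes of a cluster is load balancing. Take our own system as an example: at first the Java application had to switch data sources itself, sending queries to the standby to spread the read load and sending data modifications to the primary. This approach is inflexible, does not really balance the load, and is tedious to maintain. What is needed is a router that dispatches each request to the appropriate database. pgpool is exactly this kind of middleware: it talks to the database cluster and exposes a single access point, which simplifies database access for the application and improves overall system performance. This article covers configuring pgpool load balancing and builds on the previous post: PostgreSQL HA Solutions 1: master/slave and backup.

  To avoid unnecessary trouble during testing, all hosts in this series communicate without passwords, and the firewalls on the primary and standby are turned off. In production, configure passwords as required to harden the cluster.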

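II. Configuration Steps

  1.1 Install pgpool from source. See the Chinese manual: http://pgpool.projects.pgfoundry.org/pgpool-II/doc/pgpool-zh_cn.html#install

  1.2 Install the tools shipped with pgpool: install pgpool_regclass. Note that pgpool_regclass lives in the sql directory of the unpacked pgpool source tree, not in your installation directory.

    pgpool-II uses the pgpool_regclass function internally; without it, tables with the same name in different schemas may not be handled correctly. Read the Chinese manual carefully. Run pgpool-regclass.sql in every database that will be accessed through pgpool-II. Databases created after you run "psql -f pgpool-regclass.sql template1" do not need this step, because new databases are cloned from that template database.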

cd pgpool-II-x.x.x/sql/pgpool-regclass
make
make install
psql -f pgpool-regclass.sql template1

    After installation, the following additional files appear in /usr/local/pgsql/share/extension/:

-rw-r--r--  root root  Sep  : pgpool_regclass.control
-rw-r--r-- root root Sep : pgpool_regclass--1.0.sql
-rw-r--r-- root root Sep : pgpool-regclass.sql

  1.3 Edit the pgpool configuration file.

  Key settings:

listen_addresses="*"
port=
backend_hostname0="localhost"
backend_port0=""
backend_weight0=
backend_hostname1="192.168.57.175"
backend_port1=""
backend_weight1=
replication_mode=off
load_balance_mode=on
master_slave_mode=on
master_slave_sub_mode="stream"
parallel_mode=off
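A misspelled parameter name in pgpool.conf is easy to overlook. The following is a minimal sketch, not part of pgpool, that reports which of the key settings listed above are absent from a given file. The function name check_pgpool_conf and the choice of required keys are assumptions made for illustration.

```shell
#!/bin/sh
# check_pgpool_conf FILE
# Hypothetical helper: prints "missing: NAME" for every key setting that does
# not appear as an uncommented "NAME = value" line in FILE.
check_pgpool_conf() {
    conf="$1"
    status=0
    for key in listen_addresses port backend_hostname0 backend_port0 \
               replication_mode load_balance_mode master_slave_mode \
               master_slave_sub_mode parallel_mode; do
        # Anchor at line start so commented-out lines and longer names do not match.
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$conf"; then
            echo "missing: $key"
            status=1
        fi
    done
    return $status
}
```

Call it as check_pgpool_conf /path/to/pgpool.conf before starting pgpool; a non-zero return status means at least one name was not found.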

Full configuration:

# ----------------------------
# pgPool-II configuration file
# ----------------------------
#
# This file consists of lines of the form:
#
# name = value
#
# Whitespace may be used. Comments are introduced with "#" anywhere on a line.
# The complete list of parameter names and allowed values can be found in the
# pgPool-II documentation.
#
# This file is read on server startup and when the server receives a SIGHUP
# signal. If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, or use "pgpool reload". Some
# parameters, which are marked below, require a server shutdown and restart to
# take effect.
#
#------------------------------------------------------------------------------
# CONNECTIONS
#------------------------------------------------------------------------------
# - pgpool Connection Settings -
#listen_addresses = 'localhost'
listen_addresses = '*'
# Host name or IP address to listen on:
# '*' for all, '' for no TCP/IP connections
# (change requires restart)
port =
# Port number
# (change requires restart)
socket_dir = '/tmp'
# Unix domain socket path
# The Debian package defaults to
# /var/run/postgresql
# (change requires restart)
# - pgpool Communication Manager Connection Settings -
pcp_port =
# Port number for pcp
# (change requires restart)
pcp_socket_dir = '/tmp'
# Unix domain socket path for pcp
# The Debian package defaults to
# /var/run/postgresql
# (change requires restart)
# - Backend Connection Settings -
#backend_hostname0 = 'host1'
backend_hostname0 = '192.168.1.45'
# Host name or IP address to connect to for backend
backend_port0 =
# Port number for backend
backend_weight0 =
# Weight for backend (only in load balancing mode)
#backend_data_directory0 = '/data'
# Data directory for backend
backend_flag0 = 'ALLOW_TO_FAILOVER'
# Controls various backend behavior
# ALLOW_TO_FAILOVER or DISALLOW_TO_FAILOVER
backend_hostname1 = '192.168.1.50'
backend_port1 =
backend_weight1 =
#backend_data_directory1 = '/data1'
backend_flag1 = 'ALLOW_TO_FAILOVER'
# - Authentication -
enable_pool_hba = on
# Use pool_hba.conf for client authentication
pool_passwd = 'pool_passwd'
# File name of pool_passwd for md5 authentication.
# "" disables pool_passwd.
# (change requires restart)
authentication_timeout =
# Delay in seconds to complete client authentication
# means no timeout.
# - SSL Connections -
ssl = off
# Enable SSL support
# (change requires restart)
#ssl_key = './server.key'
# Path to the SSL private key file
# (change requires restart)
#ssl_cert = './server.cert'
# Path to the SSL public certificate file
# (change requires restart)
#ssl_ca_cert = ''
# Path to a single PEM format file
# containing CA root certificate(s)
# (change requires restart)
#ssl_ca_cert_dir = ''
# Directory containing CA root certificate(s)
# (change requires restart)
#------------------------------------------------------------------------------
# POOLS
#------------------------------------------------------------------------------
# - Pool size -
num_init_children =
# Number of pools
# (change requires restart)
max_pool =
# Number of connections per pool
# (change requires restart)
# - Life time -
child_life_time =
# Pool exits after being idle for this many seconds
child_max_connections =
# Pool exits after receiving that many connections
# means no exit
connection_life_time =
# Connection to backend closes after being idle for this many seconds
# means no close
client_idle_limit =
# Client is disconnected after being idle for that many seconds
# (even inside an explicit transactions!)
# means no disconnection
#------------------------------------------------------------------------------
# LOGS
#------------------------------------------------------------------------------
# - Where to log -
log_destination = 'stderr'
# Where to log
# Valid values are combinations of stderr,
# and syslog. Default to stderr.
# - What to log -
print_timestamp = on
# Print timestamp on each line
# (change requires restart)
log_connections = on
# Log connections
log_hostname = on
# Hostname will be shown in ps status
# and in logs if connections are logged
log_statement = on
# Log all statements
log_per_node_statement = off
# Log all statements
# with node and backend informations
log_standby_delay = 'none'
# Log standby delay
# Valid values are combinations of always,
# if_over_threshold, none
# - Syslog specific -
syslog_facility = 'LOCAL0'
# Syslog local facility. Default to LOCAL0
syslog_ident = 'pgpool'
# Syslog program identification string
# Default to 'pgpool'
# - Debug -
debug_level =
# Debug message verbosity level
# means no message, or more mean verbose
#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------
pid_file_name = '/var/run/pgpool/pgpool.pid'
# PID file name
# (change requires restart)
logdir = '/tmp'
# Directory of pgPool status file
# (change requires restart)
#------------------------------------------------------------------------------
# CONNECTION POOLING
#------------------------------------------------------------------------------
connection_cache = on
# Activate connection pools
# (change requires restart)
# Semicolon separated list of queries
# to be issued at the end of a session
# The default is for 8.3 and later
reset_query_list = 'ABORT; DISCARD ALL'
# The following one is for 8.2 and before
#reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'
#------------------------------------------------------------------------------
# REPLICATION MODE
#------------------------------------------------------------------------------
replication_mode = off
# Activate replication mode
# (change requires restart)
replicate_select = off
# Replicate SELECT statements
# when in replication or parallel mode
# replicate_select is higher priority than
# load_balance_mode.
insert_lock = on
# Automatically locks a dummy row or a table
# with INSERT statements to keep SERIAL data
# consistency
# Without SERIAL, no lock will be issued
lobj_lock_table = ''
# When rewriting lo_creat command in
# replication mode, specify table name to
# lock
# - Degenerate handling -
replication_stop_on_mismatch = off
# On disagreement with the packet kind
# sent from backend, degenerate the node
# which is most likely "minority"
# If off, just force to exit this session
failover_if_affected_tuples_mismatch = off
# On disagreement with the number of affected
# tuples in UPDATE/DELETE queries, then
# degenerate the node which is most likely
# "minority".
# If off, just abort the transaction to
# keep the consistency
#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------
load_balance_mode = on
# Activate load balancing mode
# (change requires restart)
ignore_leading_white_space = on
# Ignore leading white spaces of each query
white_function_list = ''
# Comma separated list of function names
# that don't write to database
# Regexp are accepted
black_function_list = 'nextval,setval'
# Comma separated list of function names
# that write to database
# Regexp are accepted
#------------------------------------------------------------------------------
# MASTER/SLAVE MODE
#------------------------------------------------------------------------------
master_slave_mode = on
# Activate master/slave mode
# (change requires restart)
master_slave_sub_mode = 'stream'
# Master/slave sub mode
# Valid values are combinations slony or
# stream. Default is slony.
# (change requires restart)
# - Streaming -
#sr_check_period =
sr_check_period =
# Streaming replication check period
# Disabled () by default
sr_check_user = 'postgres'
# Streaming replication check user
# This is necessary even if you disable
# streaming replication delay check with
# sr_check_period =
sr_check_password = 'postgres123'
# Password for streaming replication check user
delay_threshold =
# Threshold before not dispatching query to standby node
# Unit is in bytes
# Disabled () by default
# - Special commands -
follow_master_command = ''
# Executes this command after master failover
# Special values:
# %d = node id
# %h = host name
# %p = port number
# %D = database cluster path
# %m = new master node id
# %H = hostname of the new master node
# %M = old master node id
# %P = old primary node id
# %r = new master port number
# %R = new master database cluster path
# %% = '%' character
#------------------------------------------------------------------------------
# PARALLEL MODE
#------------------------------------------------------------------------------
parallel_mode = off
# Activates parallel query mode
# (change requires restart)
pgpool2_hostname = ''
# Set pgpool2 hostname
# (change requires restart)
# - System DB info -
system_db_hostname = 'localhost'
# (change requires restart)
system_db_port =
# (change requires restart)
system_db_dbname = 'pgpool'
# (change requires restart)
system_db_schema = 'pgpool_catalog'
# (change requires restart)
system_db_user = 'pgpool'
# (change requires restart)
system_db_password = ''
# (change requires restart)
#------------------------------------------------------------------------------
# HEALTH CHECK
#------------------------------------------------------------------------------
#health_check_period =
health_check_period =
# Health check period
# Disabled () by default
health_check_timeout =
# Health check timeout
# means no timeout
health_check_user = 'postgres'
# Health check user
health_check_password = 'postgres123'
# Password for health check user
health_check_max_retries =
# Maximum number of times to retry a failed health check before giving up.
health_check_retry_delay =
# Amount of time to wait (in seconds) between retries.
#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------
failover_command = '/home/postgres/scripts/failover_stream.sh %d %H /usr/local/pgsql/data/postgresql.trigger.5432'
# Executes this command at failover
# Special values:
# %d = node id
# %h = host name
# %p = port number
# %D = database cluster path
# %m = new master node id
# %H = hostname of the new master node
# %M = old master node id
# %P = old primary node id
# %r = new master port number
# %R = new master database cluster path
# %% = '%' character
failback_command = ''
# Executes this command at failback.
# Special values:
# %d = node id
# %h = host name
# %p = port number
# %D = database cluster path
# %m = new master node id
# %H = hostname of the new master node
# %M = old master node id
# %P = old primary node id
# %r = new master port number
# %R = new master database cluster path
# %% = '%' character
fail_over_on_backend_error = on
# Initiates failover when reading/writing to the
# backend communication socket fails
# If set to off, pgpool will report an
# error and disconnect the session.
search_primary_node_timeout =
# Timeout in seconds to search for the
# primary node when a failover occurs.
# means no timeout, keep searching
# for a primary node forever.
#------------------------------------------------------------------------------
# ONLINE RECOVERY
#------------------------------------------------------------------------------
recovery_user = 'postgres'
# Online recovery user
recovery_password = 'postgres123'
# Online recovery password
recovery_1st_stage_command = ''
# Executes a command in first stage
recovery_2nd_stage_command = ''
# Executes a command in second stage
recovery_timeout =
# Timeout in seconds to wait for the
# recovering node's postmaster to start up
# means no wait
client_idle_limit_in_recovery =
# Client is disconnected after being idle
# for that many seconds in the second stage
# of online recovery
# means no disconnection
# - means immediate disconnection
#------------------------------------------------------------------------------
# WATCHDOG
#------------------------------------------------------------------------------
# - Enabling -
use_watchdog = off
# Activates watchdog
# (change requires restart)
# -Connection to up stream servers -
trusted_servers = ''
# trusted server list which are used
# to confirm network connection
# (hostA,hostB,hostC,...)
# (change requires restart)
ping_path = '/bin'
# ping command path
# (change requires restart)
# - Watchdog communication Settings -
wd_hostname = ''
# Host name or IP address of this watchdog
# (change requires restart)
wd_port =
# port number for watchdog service
# (change requires restart)
wd_authkey = ''
# Authentication key for watchdog communication
# (change requires restart)
# - Virtual IP control Setting -
delegate_IP = ''
# delegate IP address
# If this is empty, virtual IP never bring up.
# (change requires restart)
ifconfig_path = '/sbin'
# ifconfig command path
# (change requires restart)
if_up_cmd = 'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0'
# startup delegate IP command
# (change requires restart)
if_down_cmd = 'ifconfig eth0:0 down'
# shutdown delegate IP command
# (change requires restart)
arping_path = '/usr/sbin'
# arping command path
# (change requires restart)
arping_cmd = 'arping -U $_IP_$ -w 1'
# arping command
# (change requires restart)
# - Behavior on escalation Setting -
clear_memqcache_on_escalation = on
# Clear all the query cache on shared memory
# when standby pgpool escalate to active pgpool
# (= virtual IP holder).
# This should be off if client connects to pgpool
# not using virtual IP.
# (change requires restart)
wd_escalation_command = ''
# Executes this command at escalation on new active pgpool.
# (change requires restart)
# - Lifecheck Setting -
# -- common --
wd_lifecheck_method = 'heartbeat'
# Method of watchdog lifecheck ('heartbeat' or 'query')
# (change requires restart)
wd_interval =
# lifecheck interval (sec) >
# (change requires restart)
# -- heartbeat mode --
wd_heartbeat_port =
# Port number for receiving heartbeat signal
# (change requires restart)
wd_heartbeat_keepalive =
# Interval time of sending heartbeat signal (sec)
# (change requires restart)
wd_heartbeat_deadtime =
# Deadtime interval for heartbeat signal (sec)
# (change requires restart)
heartbeat_destination0 = 'db2'
# Host name or IP address of destination
# for sending heartbeat signal.
# (change requires restart)
heartbeat_destination_port0 =
# Port number of destination for sending
# heartbeat signal. Usually this is the
# same as wd_heartbeat_port.
# (change requires restart)
heartbeat_device0 = ''
# Name of NIC device (such like 'eth0')
# used for sending/receiving heartbeat
# signal to/from destination .
# This works only when this is not empty
# and pgpool has root privilege.
# (change requires restart)
#heartbeat_destination1 = 'host0_ip2'
#heartbeat_destination_port1 =
#heartbeat_device1 = ''
# -- query mode --
wd_life_point =
# lifecheck retry times
# (change requires restart)
wd_lifecheck_query = 'SELECT 1'
# lifecheck query to pgpool from watchdog
# (change requires restart)
wd_lifecheck_dbname = 'template1'
# Database name connected for lifecheck
# (change requires restart)
wd_lifecheck_user = 'nobody'
# watchdog user monitoring pgpools in lifecheck
# (change requires restart)
wd_lifecheck_password = ''
# Password for watchdog user in lifecheck
# (change requires restart)
# - Other pgpool Connection Settings -
#other_pgpool_hostname0 = 'host0'
# Host name or IP address to connect to for other pgpool
# (change requires restart)
#other_pgpool_port0 =
# Port number for other pgpool
# (change requires restart)
#other_wd_port0 =
# Port number for other watchdog
# (change requires restart)
#other_pgpool_hostname1 = 'host1'
#other_pgpool_port1 =
#other_wd_port1 =
#------------------------------------------------------------------------------
# OTHERS
#------------------------------------------------------------------------------
relcache_expire =
# Life time of relation cache in seconds.
# means no cache expiration(the default).
# The relation cache is used for cache the
# query result against PostgreSQL system
# catalog to obtain various information
# including table structures or if it's a
# temporary table or not. The cache is
# maintained in a pgpool child local memory
# and being kept as long as it survives.
# If someone modify the table by using
# ALTER TABLE or some such, the relcache is
# not consistent anymore.
# For this purpose, cache_expiration
# controls the life time of the cache.
relcache_size =
# Number of relation cache
# entry. If you see frequently:
# "pool_search_relcache: cache replacement happend"
# in the pgpool log, you might want to increase this number.
check_temp_table = on
# If on, enable temporary table check in SELECT statements.
# This initiates queries against system catalog of primary/master
# thus increases load of master.
# If you are absolutely sure that your system never uses temporary tables
# and you want to save access to primary/master, you could turn this off.
# Default is on.
#------------------------------------------------------------------------------
# ON MEMORY QUERY MEMORY CACHE
#------------------------------------------------------------------------------
memory_cache_enabled = off
# If on, use the memory cache functionality, off by default
memqcache_method = 'shmem'
# Cache storage method. either 'shmem'(shared memory) or
# 'memcached'. 'shmem' by default
# (change requires restart)
memqcache_memcached_host = 'localhost'
# Memcached host name or IP address. Mandatory if
# memqcache_method = 'memcached'.
# Defaults to localhost.
# (change requires restart)
memqcache_memcached_port =
# Memcached port number. Mandatory if memqcache_method = 'memcached'.
# Defaults to .
# (change requires restart)
memqcache_total_size =
# Total memory size in bytes for storing memory cache.
# Mandatory if memqcache_method = 'shmem'.
# Defaults to 64MB.
# (change requires restart)
memqcache_max_num_cache =
# Total number of cache entries. Mandatory
# if memqcache_method = 'shmem'.
# Each cache entry consumes bytes on shared memory.
# Defaults to ,,(.8MB).
# (change requires restart)
memqcache_expire =
# Memory cache entry life time specified in seconds.
# means infinite life time. by default.
# (change requires restart)
memqcache_auto_cache_invalidation = on
# If on, invalidation of query cache is triggered by corresponding
# DDL/DML/DCL(and memqcache_expire). If off, it is only triggered
# by memqcache_expire. on by default.
# (change requires restart)
memqcache_maxcache =
# Maximum SELECT result size in bytes.
# Must be smaller than memqcache_cache_block_size. Defaults to 400KB.
# (change requires restart)
memqcache_cache_block_size =
# Cache block size in bytes. Mandatory if memqcache_method = 'shmem'.
# Defaults to 1MB.
# (change requires restart)
memqcache_oiddir = '/var/log/pgpool/oiddir'
# Temporary work directory to record table oids
# (change requires restart)
white_memqcache_table_list = ''
# Comma separated list of table names to memcache
# that don't write to database
# Regexp are accepted
black_memqcache_table_list = ''
# Comma separated list of table names not to memcache
# that don't write to database
# Regexp are accepted
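The failover_command setting in the configuration above calls /home/postgres/scripts/failover_stream.sh with %d, %H, and a trigger file path, but the script itself is not shown in this article. The following is only a hypothetical sketch of what such a script might do. It assumes that node 0 is the primary, that the standby's recovery.conf sets trigger_file to the path passed as the third argument, and that the postgres user can ssh between hosts without a password. The function name failover_stream and the RUN variable are invented for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of failover_stream.sh; not the author's actual script.
# Arguments mirror the failover_command line: %d %H <trigger file path>.
failover_stream() {
    failed_node_id="$1"   # %d: id of the node that went down
    new_master="$2"       # %H: host name of the new master node
    trigger_file="$3"     # file the standby is assumed to watch for promotion

    # Assumption: node 0 is the primary, so only its failure needs a promotion.
    if [ "$failed_node_id" != "0" ]; then
        echo "standby node $failed_node_id failed; nothing to promote"
        return 0
    fi

    # Creating the trigger file on the standby asks it to leave recovery.
    # Set RUN=echo to print the command instead of running it (dry run).
    ${RUN:-} ssh -T "postgres@$new_master" "touch $trigger_file"
}
```

A dry run such as RUN=echo failover_stream 0 192.168.1.50 /usr/local/pgsql/data/postgresql.trigger.5432 prints the ssh command without touching the standby, which is a safe way to check the arguments pgpool would pass.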

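  1.4 Start pgpool: pgpool -dn runs pgpool in debug mode without detaching as a daemon. If it runs as a daemon, which is the default, stopping it later is more of a hassle. See pgpool --help for the other options.

Watch the debug output. If there are errors, review the configuration file again. If no errors are found, run psql -p 9999 to enter the database through pgpool. If that works, pgpool is listening correctly on port 9999 and has connected to the database successfully.

  Then use the following command to view information on the databases pgpool is watching. If a node's status is 3 (node down), test network connectivity with ping, telnet, and psql -h, and turn off the firewall.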

postgres=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role
---------+--------------+------+--------+-----------+---------
| 192.168.1.45 | | | 0.500000 | primary
| 192.168.1.50 | | | 0.500000 | standby
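The status column of show pool_nodes is a numeric code. As a quick reference, here is a small sketch that maps the codes used by pgpool-II 3.x to short descriptions. The function name pool_node_status and the wording of the descriptions are my own.

```shell
#!/bin/sh
# pool_node_status CODE
# Hypothetical helper: translate a "status" value from "show pool_nodes".
pool_node_status() {
    case "$1" in
        0) echo "initializing" ;;            # only seen during startup
        1) echo "up, no connections yet" ;;
        2) echo "up, connections pooled" ;;
        3) echo "down" ;;                    # check network and firewall
        *) echo "unknown" ;;
    esac
}
```

For example, pool_node_status 3 prints "down", which is the case where the connectivity checks above apply.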

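  You can also set the logging level on the primary and standby databases to all, so that every operation on the data is logged. Then run a test script such as the one below and check the logs on both the primary and the standby to see whether each recorded the queries.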

for i in {..}; do echo $i; psql -p  -U postgres -h 192.168.1.45 -d postgres -c "select * from test"; done
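Instead of reading both server logs, one can ask each session which backend answered it and count the answers. The sketch below assumes pgpool listens on port 9999 on the local machine and uses PostgreSQL's inet_server_addr() function. The helper name tally_backends and the loop count of 20 are arbitrary choices for illustration.

```shell
#!/bin/sh
# tally_backends: read one backend address per line on stdin and print
# "address: count" for each distinct address.
tally_backends() {
    sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Example use (assumes pgpool on local port 9999; 20 is an arbitrary count):
#   for i in $(seq 1 20); do
#       psql -p 9999 -U postgres -d postgres -Atc "select inet_server_addr()"
#   done | tally_backends
```

With load balancing working and equal weights, both backend addresses should appear in the tally, though the split will not be exactly even on a small sample.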

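  1.5 Check pgpool's status. See the manual: http://pgpool.projects.pgfoundry.org/pgpool-II/doc/pgpool-zh_cn.html#show-commands

  pgpool's status can be viewed from a psql session connected through pgpool. pgpool-II answers these commands itself, so the pgpool_regclass function installed earlier is not what provides them. Run the following commands:

pgpool-II provides some information through the SHOW command. SHOW is a real SQL statement, but when it asks for pgpool-II information, pgpool-II intercepts the command and answers it itself. The available options are:

pool_status: the configuration
pool_nodes: node information
pool_processes: pgpool-II process information
pool_pools: information on all pgpool-II connection pools
pool_version: the pgpool-II version

  For example, pool_processes shows the state of the process pool. You can adjust the pgpool settings based on what it reports.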
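To collect all five reports in one go, a short loop over the option names listed above is enough. This is a sketch under assumptions: pgpool is taken to listen on port 9999, the connection uses the postgres user and database, and the PSQL and PGPOOL_PORT variables are invented here so the command can be swapped out or dry-run.

```shell
#!/bin/sh
# show_pool_info: run every pgpool-II SHOW command listed above through psql.
# PSQL and PGPOOL_PORT are hypothetical overrides; they default to psql and 9999.
show_pool_info() {
    for item in pool_status pool_nodes pool_processes pool_pools pool_version; do
        ${PSQL:-psql} -p "${PGPOOL_PORT:-9999}" -U postgres -d postgres -c "show $item"
    done
}
```

Running PSQL=echo show_pool_info prints the five command lines without connecting, which helps confirm the port and user before pointing it at a live pgpool.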

postgres=# show pool_processes;
pool_pid | start_time | database | username | create_time | pool_counter
----------+---------------------+----------+----------+---------------------+--------------
| -- :: | postgres | postgres | -- :: |
| -- :: | | | |
| -- :: | | | |
| -- :: | | | |
| -- :: | | | |
| -- :: | | | |
| -- :: | | | |
| -- :: | | | |
| -- :: | | | |
| -- :: | | | |
05-15 02:55