MariaDB集群Galera Cluster的研究与测试
Galera Cluster是MariaDB的一个双活多主集群,其可以使得MariDB的所有节点保持同步,Galera为MariaDB提供了同步复制(相对于原生的异步复制),因此其可以保证HA,且其当前仅支持XtraDB/InnoDB存储引擎(扩展支持MyISAM),并且只可在Linux下使用。
Galera Cluster拥有以下特性:
- 真正的多主架构,任何节点都可以进行读写
- 同步复制,各节点间无延迟且节点宕机不会导致数据丢失
- 紧密耦合,所有节点均保持相同状态,节点间无不同数据
- 无需主从切换操作或使用VIP
- 热Standby,在Failover过程中无停机时间(由于不需要Failover)
- 自动节点配置,无需手工备份当前数据库并拷贝至新节点
- 支持InnoDB存储引擎
- 对应于透明,无需更改应用或是进行极小的更改
- 无需进行读写分离
Galera使用基于认证的复制,其流程如下:
其主要思想是在不出现冲突的背景下事务正常执行并持续到commit为止;当客户端发起commit命令时(此时仍然没有发生真正的commit),所有本事务内对数据库的改动与改动数据行的主键都会被搜集到一个写入集(writeset)中,该写入集随后会被复制到其他节点,该写入集会在每个节点上使用搜集到的主键进行确认性认证测试(包括被“提交”事务的当前节点)来判断该写入集是否可以被应用。如果认证测试失败,写入集会被丢弃并且原始事务会被回滚,如果认证成功,事务会被提交并且写入集会被在剩余节点进行应用。
以上的认证测试在Galera集群中的实现取决于全局事务顺序,每个事务在复制期间都会被指派一个全局顺序序列;当一个事务到达提交点时,该事务会知道当前与该事务不冲突的最新已提交事务的顺序序号,在这两个事务的全局顺序序列之间的间隔是不确定区域,在该区域间的事务相互是“看不到”对方的影响的,但所有在这间隔之间的唯物都会被进行主键冲突检测(如果发现冲突认证测试就会失败)。
以下演示如何配置Galera Cluster:
1、首先在各节点上安装MariaDB Galera:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | [root@nd1 ~]# ls -lhtr total 123M -rw-r--r-- 1 root root 340K Mar 27 21:30 socat-1.7.2.4-1.el6.rf.x86_64.rpm -rwxrwxrwx 1 root root 5.8M Jun 7 18:59 MariaDB-5.5.38-centos6-x86_64-devel.rpm -rwxrwxrwx 1 root root 2.7M Jun 7 18:59 MariaDB-5.5.38-centos6-x86_64-compat.rpm -rwxrwxrwx 1 root root 24K Jun 7 18:59 MariaDB-5.5.38-centos6-x86_64-common.rpm -rwxrwxrwx 1 root root 11M Jun 7 18:59 MariaDB-5.5.38-centos6-x86_64-client.rpm -rwxrwxrwx 1 root root 1.2M Jun 7 18:59 MariaDB-5.5.38-centos6-x86_64-shared.rpm -rwxrwxrwx 1 root root 43M Jun 7 18:59 MariaDB-5.5.38-centos6-x86_64-server.rpm -rwxrwxrwx 1 root root 29M Jun 7 18:59 MariaDB-5.5.38-centos6-x86_64-test.rpm -rwxrwxrwx 1 root root 26M Jun 25 06:25 MariaDB-Galera-5.5.38-centos6-x86_64-server.rpm -rwxrwxrwx 1 root root 6.0M Jun 25 06:26 galera-25.3.5-1.rhel6.x86_64.rpm [root@nd1 ~]# yum localinstall socat-1.7.2.4-1.el6.rf.x86_64.rpm \ > MariaDB-5.5.38-centos6-x86_64-common.rpm \ > MariaDB-5.5.38-centos6-x86_64-client.rpm \ > MariaDB-Galera-5.5.38-centos6-x86_64-server.rpm \ > MariaDB-5.5.38-centos6-x86_64-compat.rpm \ > galera-25.3.5-1.rhel6.x86_64.rpm Loaded plugins: fastestmirror Setting up Local Package Process ... |
2、初始化MariaDB数据库并启动MariaDB服务:
1 2 3 4 5 6 7 8 9 10 11 12 | [root@nd1 ~]# mysql_install_db Installing MariaDB/MySQL system tables in '/var/lib/mysql' ... 140809 8:39:48 [Note] WSREP: Read nil XID from storage engines, skipping position init 140809 8:39:48 [Note] WSREP: wsrep_load(): loading provider library 'none' 140809 8:39:48 [Note] [Debug] WSREP: dummy_init 140809 8:40:00 [Note] [Debug] WSREP: dummy_disconnect 140809 8:40:00 [Note] WSREP: Service disconnected. 140809 8:40:01 [Note] WSREP: Some threads may fail to exit. 140809 8:40:01 [Note] [Debug] WSREP: dummy_free OK Filling help tables... ... |
初始化完成后启动MariaDB:
1 2 3 | [root@nd1 ~]# /etc/init.d/mysql start Starting MySQL..... SUCCESS! [root@nd1 ~]# |
此时可以查看到Galera相关的配置选项,可以看出Galera相关的配置参数有36个之多:
1 2 3 4 5 | MariaDB [(none)]> show global variables like 'wsrep%'; 4d63dbf5f8aa375c0a7cb0459cb319ca - 36 rows in set (0.01 sec) MariaDB [(none)]> |
3、在MariaDB实例上创建用于Galera集群的用户,第一个节点建立即可:
1 2 3 4 5 6 7 8 9 10 11 12 13 | MariaDB [(none)]> grant all privileges on *.* to 'wsrep_sst-user'@'nd3' identified by 'password'; Query OK, 0 rows affected (0.01 sec) MariaDB [(none)]> grant all privileges on *.* to 'wsrep_sst-user'@'nd2' identified by 'password'; Query OK, 0 rows affected (0.00 sec) MariaDB [(none)]> grant all privileges on *.* to 'wsrep_sst-user'@'nd1' identified by 'password'; Query OK, 0 rows affected (0.00 sec) MariaDB [(none)]> flush privileges; Query OK, 0 rows affected (0.01 sec) MariaDB [(none)]> |
4、先在一个节点上配置Galera Cluster:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 | [root@nd1 ~]# ip addr show eth0 | grep 'inet.*scope global' | awk '{print $2}' 192.168.192.129/24 [root@nd1 ~]# cat /etc/my.cnf.d/galera.cnf [server] query_cache_size=0 binlog_format=ROW default_storage_engine=innodb innodb_autoinc_lock_mode=2 wsrep_provider=/usr/lib64/galera/libgalera_smm.so #wsrep_cluster_address=gcomm://192.168.192.133,192.168.192.134 wsrep_cluster_address=gcomm:// wsrep_cluster_name='example_cluster' wsrep_node_address='192.168.192.129' wsrep_node_name='nd1' wsrep_sst_method=rsync #wsrep_sst_method=xtrabackup wsrep_sst_auth=wsrep_sst-user:password [root@nd1 ~]# /etc/init.d/mysql bootstrap Bootstrapping the clusterStarting MySQL...... SUCCESS! [root@nd1 ~]# ps -ef | grep mysqld | grep -v grep | wc -l 2 [root@nd1 ~]# |
补充:后续过程发现bootstrap模式初始化后并启动的MariaDB实例不能正常的处理其他节点请求,因此需要重启MariaDB实例;
注意:wsrep_node_address和wsrep_node_name需是相对应的用于指定当前Galera节点主机及IP信息;当第一个节点启动完成后便可以调换以上配置文件中的wsrep_cluster_address参数了;参数wsrep_sst_method可以使用rsync方式或Percona提供的Xtrabackup;
经测发现wsrep_sst_method=xtrabackup时不能正常工作,有待进一步测试;
5、向之前配置的Galera集群(即IP为192.168.192.129的节点)中加入其他新节点:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | [root@nd2 ~]# ip addr show eth0 | grep 'inet.*scope global' | awk '{print $2}' 192.168.192.133/24 [root@nd2 ~]# cat /etc/my.cnf.d/galera.cnf query_cache_size=0 binlog_format=ROW default_storage_engine=innodb innodb_autoinc_lock_mode=2 wsrep_provider=/usr/lib64/galera/libgalera_smm.so wsrep_cluster_address=gcomm://192.168.192.129,192.168.192.134 wsrep_cluster_name='example_cluster' wsrep_node_address='192.168.192.133' wsrep_node_name='nd2' #wsrep_sst_method=xtrabackup wsrep_sst_method=rsync wsrep_sst_auth=wsrep_sst-user:password [root@nd2 ~]# |
第三节点:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | [root@nd3 ~]# cat /etc/my.cnf.d/galera.cnf [server] query_cache_size=0 binlog_format=ROW default_storage_engine=innodb innodb_autoinc_lock_mode=2 wsrep_provider=/usr/lib64/galera/libgalera_smm.so wsrep_cluster_address=gcomm://192.168.192.129,192.168.192.133 wsrep_cluster_name='example_cluster' wsrep_node_address='192.168.192.134' wsrep_node_name='nd3' wsrep_sst_method=rsync wsrep_sst_auth=wsrep_sst-user:password [root@nd3 ~]# |
6、依次启动其他节点,其他节点会根据配置自动加入集群并同步数据:
1 2 3 | [root@nd2 ~]# /etc/init.d/mysql start Starting MySQL......SST in progress, setting sleep higher. SUCCESS! [root@nd2 ~]# |
7、在Galera上查看各集群状态:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 | MariaDB [(none)]> show global status like 'wsrep_cluster%'; +--------------------------+--------------------------------------+ | Variable_name | Value | +--------------------------+--------------------------------------+ | wsrep_cluster_conf_id | 1 | | wsrep_cluster_size | 1 | | wsrep_cluster_state_uuid | 98f4ddfe-205a-11e4-b7a2-7ede18e6364f | | wsrep_cluster_status | Primary | +--------------------------+--------------------------------------+ 4 rows in set (0.00 sec) MariaDB [(none)]> show global status like 'wsrep_cluster%'; +--------------------------+--------------------------------------+ | Variable_name | Value | +--------------------------+--------------------------------------+ | wsrep_cluster_conf_id | 4 | | wsrep_cluster_size | 2 | | wsrep_cluster_state_uuid | 98f4ddfe-205a-11e4-b7a2-7ede18e6364f | | wsrep_cluster_status | Primary | +--------------------------+--------------------------------------+ 4 rows in set (0.01 sec) MariaDB [(none)]> show global status like 'wsrep_cluster%'; +--------------------------+--------------------------------------+ | Variable_name | Value | +--------------------------+--------------------------------------+ | wsrep_cluster_conf_id | 5 | | wsrep_cluster_size | 3 | | wsrep_cluster_state_uuid | 98f4ddfe-205a-11e4-b7a2-7ede18e6364f | | wsrep_cluster_status | Primary | +--------------------------+--------------------------------------+ 4 rows in set (0.00 sec) MariaDB [(none)]> |
从此处可以看出当有其他节点加入到集群中时,wsrep_cluster_size可以反映出当前集群节点数量;
8、在一节点上新建表并插入数据以在其他实例上观测数据是否同步:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | MariaDB [(none)]> create database galera; Query OK, 1 row affected (0.14 sec) MariaDB [(none)]> create table t_galera( -> id int auto_increment primary key, -> name varchar(10)); Query OK, 0 rows affected (0.31 sec) MariaDB [galera]> insert into t_galera values(null,rand()); Query OK, 1 row affected (0.01 sec) MariaDB [galera]> insert into t_galera values(null,rand()),(null,rand()),(null,rand()); Query OK, 3 rows affected (0.07 sec) Records: 3 Duplicates: 0 Warnings: 0 MariaDB [galera]> select *,@@global.hostname from t_galera; +----+------------+-------------------+ | id | name | @@global.hostname | +----+------------+-------------------+ | 3 | 0.47992669 | nd1.galera | | 6 | 0.88471807 | nd1.galera | | 9 | 0.98381029 | nd1.galera | | 12 | 0.26489729 | nd1.galera | | 15 | 0.37305561 | nd1.galera | +----+------------+-------------------+ 5 rows in set (0.00 sec) MariaDB [galera]> |
在其他节点上观测数据:
1 2 3 4 5 6 7 8 9 10 11 12 13 | MariaDB [galera]> select *,@@global.hostname from t_galera; +----+------------+-------------------+ | id | name | @@global.hostname | +----+------------+-------------------+ | 3 | 0.47992669 | nd3.galera | | 6 | 0.88471807 | nd3.galera | | 9 | 0.98381029 | nd3.galera | | 12 | 0.26489729 | nd3.galera | | 15 | 0.37305561 | nd3.galera | +----+------------+-------------------+ 5 rows in set (0.05 sec) MariaDB [galera]> |
1 2 3 4 5 6 7 8 9 10 11 12 13 | MariaDB [galera]> select *,@@global.hostname from t_galera; +----+------------+-------------------+ | id | name | @@global.hostname | +----+------------+-------------------+ | 3 | 0.47992669 | nd2.galera | | 6 | 0.88471807 | nd2.galera | | 9 | 0.98381029 | nd2.galera | | 12 | 0.26489729 | nd2.galera | | 15 | 0.37305561 | nd2.galera | +----+------------+-------------------+ 5 rows in set (0.07 sec) MariaDB [galera]> |
8、测试节点故障机恢复:
首先停止nd1节点(已调整过该节点的wsrep_cluster_address配置):
1 2 3 | [root@nd1 ~]# /etc/init.d/mysql stop Shutting down MySQL.... [ OK ] [root@nd1 ~]# |
在nd2节点上删除并更新数据:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | MariaDB [galera]> select *,@@global.hostname from t_galera; +----+------------+-------------------+ | id | name | @@global.hostname | +----+------------+-------------------+ | 3 | 0.47992669 | nd2.galera | | 6 | 0.88471807 | nd2.galera | | 9 | 0.98381029 | nd2.galera | | 12 | 0.26489729 | nd2.galera | | 15 | 0.37305561 | nd2.galera | +----+------------+-------------------+ 5 rows in set (0.00 sec) MariaDB [galera]> delete from t_galera where id != 12; Query OK, 4 rows affected (0.07 sec) MariaDB [galera]> update t_galera set name=rand() where id in (10,12); Query OK, 1 row affected (0.03 sec) Rows matched: 1 Changed: 1 Warnings: 0 MariaDB [galera]> select *,@@global.hostname from t_galera; +----+------------+-------------------+ | id | name | @@global.hostname | +----+------------+-------------------+ | 12 | 0.00109373 | nd2.galera | +----+------------+-------------------+ 1 row in set (0.00 sec) MariaDB [galera]> |
在nd3节点上查看数据是否一致:
1 2 3 4 5 6 7 8 9 | MariaDB [galera]> select *,@@global.hostname from t_galera; +----+------------+-------------------+ | id | name | @@global.hostname | +----+------------+-------------------+ | 12 | 0.00109373 | nd3.galera | +----+------------+-------------------+ 1 row in set (0.00 sec) MariaDB [galera]> |
启动停止的nd1节点:
1 2 3 | [root@nd1 ~]# /etc/init.d/mysql start Starting MySQL........SST in progress, setting sleep higher[ OK ] [root@nd1 ~]# |
在nd1节点上查看数据是否自动同步:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | MariaDB [galera]> select *,@@global.hostname from t_galera; ERROR 2006 (HY000): MySQL server has gone away No connection. Trying to reconnect... Connection id: 8 Current database: galera +----+------------+-------------------+ | id | name | @@global.hostname | +----+------------+-------------------+ | 12 | 0.00109373 | nd1.galera | +----+------------+-------------------+ 1 row in set (0.12 sec) MariaDB [galera]> |
关于更多的INSERT、UPDATE、DELETE测试道理是相同的,各位可自行完成;当然,如果要在生产环境中使用的话还需要进行更复杂功能性、逻辑性测试。也欢迎各位一起探讨Galera Cluster的使用及测试经验。