I. Introduction
When it comes to high availability, you may first think of the relatively simple Keepalived, the older heartbeat, or perhaps Corosync+Pacemaker. So what are the differences between them?
After v3, Heartbeat was split into several sub-projects: Heartbeat, cluster-glue, Resource Agent, and Pacemaker.
Heartbeat: now only maintains membership information for the cluster nodes and the communication between them.
Cluster-glue: a middle layer that ties heartbeat to the CRM (Pacemaker); it mainly consists of two parts, the LRM and STONITH.
Resource Agent: a collection of scripts for starting, stopping, and monitoring services; the LRM calls these scripts to start, stop, and monitor the various resources.
Pacemaker: the resource manager split out of the original Heartbeat; it is the control center that manages the whole HA cluster, and clients configure, manage, and monitor the cluster through it.
Pacemaker itself provides no low-level heartbeat messaging; to communicate with peer nodes it relies on an underlying messaging layer (the split-out heartbeat, or corosync) to deliver its messages.
Pacemaker's configuration files are not easy to edit by hand; instead you can manage it with the command-line tools crmsh or pcs, or with graphical front ends such as pygui or hawk, according to personal preference.
Differences between Heartbeat and Corosync:
1. From hands-on experience, Heartbeat is simple to configure: mainly three files need editing (ha.cf, haresources, authkeys). However, with more than two nodes split-brain was severe (whether due to my own configuration or something else I am not sure; many blogs say it only supports 2 nodes), and it ships with few service scripts, so many service-monitoring scripts have to be written by hand.
2. Heartbeat can only designate one master node for all resources, whereas corosync allows different masters for different resource groups. Corosync supports multi-node clusters and resource grouping: resources can be grouped, managed per group, assigned a master, and started or stopped independently.
3. Flexibility of resource management: corosync synchronizes its configuration files between nodes by itself; heartbeat has no such feature.
Environment preparation
vip 10.101.11.13
node1 10.101.11.11
node2 10.101.11.12
Prepare the hostname and network (on each node)
sudo su -
ip=`/sbin/ifconfig | grep 'inet ' | grep '10.101.11' | grep -v 'grep' | awk '{print $2}'`
echo $ip
# if the IP is not empty
if [ "$ip" != '' ]; then
    echo "DB-segment IP: ${ip}"
    name_add=bwscdb${ip:10:3}'.bwscdb.local'
    echo $name_add
fi
echo $name_add
hostnamectl set-hostname $name_add
uname -n
systemctl daemon-reload
systemctl restart network
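The `${ip:10:3}` in the script above is a bash substring expansion (offset 10, length up to 3): for a 10.101.11.x address the first 10 characters are the fixed `10.101.11.` prefix, so the expansion yields the final octet, which becomes the host number. A quick standalone illustration:

```shell
# For addresses of the form 10.101.11.x, characters 0-9 are "10.101.11."
# so offset 10 picks up the last octet (1 to 3 digits).
ip=10.101.11.11
octet=${ip:10:3}
name_add=bwscdb${octet}.bwscdb.local
echo "$name_add"   # -> bwscdb11.bwscdb.local
```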
# Record each NIC's MAC address
maceth0=`/sbin/ifconfig eth0 | egrep "ether" | awk '{print $2}'`
echo $maceth0
maceth1=`/sbin/ifconfig eth1 | egrep "ether" | awk '{print $2}'`
echo $maceth1
> /etc/udev/rules.d/90-eno-fix.rules
cat >> /etc/udev/rules.d/90-eno-fix.rules <<EOF
# This file was automatically generated on systemd update
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="$maceth0", NAME="eth0"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="$maceth1", NAME="eth1"
EOF
> /etc/udev/rules.d/70-persistent-net.rules
cat >> /etc/udev/rules.d/70-persistent-net.rules <<EOF
# This file was automatically generated on systemd update
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="$maceth0", NAME="eth0"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="$maceth1", NAME="eth1"
EOF
> /etc/sysconfig/network-scripts/ifcfg-eth0
cat >> /etc/sysconfig/network-scripts/ifcfg-eth0 <<EOF
TYPE=Ethernet
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
NAME=eth0
HWADDR=$maceth0
ONBOOT=yes
IPADDR=$ip
PREFIX=24
GATEWAY=10.101.11.1
DNS1=218.104.111.114
DNS2=223.6.6.6
EOF
> /etc/sysconfig/network-scripts/ifcfg-eth1
cat >> /etc/sysconfig/network-scripts/ifcfg-eth1 <<EOF
TYPE=Ethernet
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
NAME=eth1
DEVICE=eth1
ONBOOT=yes
HWADDR=$maceth1
IPADDR=192.168.11.${ip:10:3}
PREFIX=24
EOF
/etc/init.d/network restart
/usr/bin/chattr -i /etc/passwd
/usr/bin/chattr -i /etc/inittab
/usr/bin/chattr -i /etc/group
/usr/bin/chattr -i /etc/shadow
/usr/bin/chattr -i /etc/gshadow
/usr/bin/chattr -i /etc/hosts
> /etc/hosts
echo '127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4' >> /etc/hosts
echo '::1         localhost localhost.localdomain localhost6 localhost6.localdomain6' >> /etc/hosts
echo '10.101.11.11 bwscdb11.bwscdb.local bwscdb11' >> /etc/hosts
echo '10.101.11.12 bwscdb12.bwscdb.local bwscdb12' >> /etc/hosts
echo '192.168.11.11 bwscdb11_priv bwscdb11_priv' >> /etc/hosts
echo '192.168.11.12 bwscdb12_priv bwscdb12_priv' >> /etc/hosts
echo '10.101.11.13 mysql_vip1 mysql_vip1' >> /etc/hosts
Synchronize the time:
ntpdate ntp6.aliyun.com
Install the dependencies and DRBD:
yum -y install gcc gcc-c++ make glibc kernel-devel kernel-headers
#yum -y install gcc gcc-c++ make glibc flex kernel kernel-devel kernel-headers
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
yum install -y kmod-drbd84 drbd84-utils
Install pacemaker and pcs
ntpdate cn.pool.ntp.org
yum install -y pacemaker pcs psmisc policycoreutils-python
yum -y install corosync pacemaker pcs
systemctl enable pcsd
systemctl enable corosync
systemctl enable pacemaker
systemctl restart pcsd.service
systemctl enable pcsd.service
# Install crmsh: best installed on both nodes to make testing easier
# crmsh is provided by the openSUSE repo: http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/
cd /etc/yum.repos.d/
wget http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/network:ha-clustering:Stable.repo
yum install crmsh -y
# Reboot
reboot
# Load the DRBD module and check that it is present in the kernel (node1, node2)
[root@bwscdb12 ~]# modprobe drbd
[root@bwscdb12 ~]#
[root@bwscdb12 ~]# lsmod | grep drbd
drbd                  397041  0
libcrc32c              12644  4 xfs,drbd,nf_nat,nf_conntrack
# Main configuration files
/etc/drbd.conf                   # main configuration file
/etc/drbd.d/global_common.conf   # global configuration file
# View the main configuration file
[root@node1 ~]# cat /etc/drbd.conf
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
# Add a disk to each of the two machines and partition it
# The LVM layout
Create the LVM volumes (run on every node)
pvcreate /dev/sdb
vgcreate datavg /dev/sdb
lvcreate --size 195G --name drbd_data datavg
# Create the resource configuration file
>/etc/drbd.d/db.res
cat >> /etc/drbd.d/db.res <<EOF
resource drbd_data {
    protocol C;
    startup {
        wfc-timeout 0;
        degr-wfc-timeout 120;
    }
    disk {
        on-io-error detach;
    }
    net {
        timeout 60;
        connect-int 10;
        ping-int 10;
        max-buffers 2048;
        max-epoch-size 2048;
    }
    syncer {
        verify-alg sha1;    # checksum algorithm for online verify
        rate 100M;
    }
    on bwscdb11.bwscdb.local {
        device /dev/drbd1;  # drbd1 is the DRBD device for this resource
        disk /dev/mapper/datavg-drbd_data;
        address 10.101.11.11:7788;
        meta-disk internal;
    }
    on bwscdb12.bwscdb.local {
        device /dev/drbd1;
        disk /dev/mapper/datavg-drbd_data;
        address 10.101.11.12:7788;
        meta-disk internal;
    }
}
EOF
# Wipe any old data on the backing device before creating the metadata
dd if=/dev/zero bs=1M count=1 of=/dev/sdb1
[root@bwscdb11 drbd.d]# drbdadm create-md drbd_data
# wait a moment; "success" means the DRBD metadata block was created successfully
md_offset 1036290879488
al_offset 1036290846720
bm_offset 1036259221504
Found some data
==> This might destroy existing data! <==
Do you want to proceed?
[need to type 'yes' to confirm] yes
initializing activity log
initializing bitmap (30884 KB) to all zero
Writing meta data...
New drbd meta data block successfully created.
success
Note: if "success" does not appear for a long time, press Enter and wait a little longer.
Then run the command again:
# drbdadm create-md drbd_data
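That retry-until-success loop can be automated. A minimal sketch, with the caveat that `retry_for_success` is an illustrative helper name, not a DRBD tool; in practice the command passed in would be `drbdadm create-md drbd_data`:

```shell
# Run a command up to 3 times, succeeding as soon as its output contains
# the word "success" (which drbdadm create-md prints when it finishes).
retry_for_success() {
    local attempt out
    for attempt in 1 2 3; do
        out=$("$@" 2>&1)
        case "$out" in
            *success*) printf '%s\n' "$out"; return 0 ;;
        esac
        sleep 1
    done
    printf '%s\n' "$out"
    return 1
}

# e.g.  retry_for_success drbdadm create-md drbd_data
```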
# Start the DRBD service (node1, node2)
[root@bwscdb12 ~]# systemctl start drbd
[root@bwscdb12 ~]# systemctl status drbd
● drbd.service - DRBD -- please disable. Unless you are NOT using a cluster manager.
   Loaded: loaded (/usr/lib/systemd/system/drbd.service; disabled; vendor preset: disabled)
   Active: active (exited) since Mon 2019-04-29 16:55:05 CST; 4s ago
  Process: 2959 ExecStart=/lib/drbd/drbd start (code=exited, status=0/SUCCESS)
 Main PID: 2959 (code=exited, status=0/SUCCESS)
Apr 29 16:55:04 bwscdb12 drbd[2959]: Starting DRBD resources: [
Apr 29 16:55:04 bwscdb12 drbd[2959]: create res: drbd_data
Apr 29 16:55:04 bwscdb12 drbd[2959]: prepare disk: drbd_data
Apr 29 16:55:04 bwscdb12 drbd[2959]: adjust disk: drbd_data
Apr 29 16:55:04 bwscdb12 drbd[2959]: adjust net: drbd_data
Apr 29 16:55:04 bwscdb12 drbd[2959]: ]
Apr 29 16:55:04 bwscdb12 drbd[2959]: WARN: stdin/stdout is not a TTY; using /dev/console
Apr 29 16:55:04 bwscdb12 drbd[2959]: outdated-wfc-timeout has to be shorter than degr-wfc-timeout
Apr 29 16:55:04 bwscdb12 drbd[2959]: outdated-wfc-timeout implicitly set to degr-wfc-timeout (120s)
Apr 29 16:55:05 bwscdb12 drbd[2959]: .
Apr 29 16:55:05 bwscdb12 systemd[1]: Started DRBD -- please disable. Unless you are NOT using a cluster manager..
[root@bwscdb11 ~]# drbdadm up drbd_data
# Initialize: force this node to become Primary
[root@bwscdb11 ~]# drbdadm --force primary drbd_data
# Check the sync status
[root@bwscdb11 ~]# cat /proc/drbd
version: 8.4.11-1 (api:1/proto:86-101)
GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-11-03 01:26:55
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:446464 nr:0 dw:0 dr:448592 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:41494236
        [>....................] sync'ed:  1.1% (40520/40956)M
        finish: 0:17:01 speed: 40,584 (40,584) K/sec
[root@bwscdb11 resource.d]# drbdadm status
drbd_data role:Primary
  disk:UpToDate
  peer role:Secondary
    replication:SyncSource peer-disk:Inconsistent done:91.18
After the sync completes:
[root@bwscdb11 ~]# cat /proc/drbd    # check the status
version: 8.4.11-1 (api:1/proto:86-101)
GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-11-03 01:26:55
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:41618688 nr:0 dw:792108 dr:41486240 al:210 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@bwscdb11 drbd_data]# drbdadm status
drbd_data role:Primary
  disk:UpToDate
  peer role:Secondary
    replication:Established peer-disk:UpToDate
[root@bwscdb11 resource.d]# drbdadm dstate
drbd_data
UpToDate/UpToDate
# Run on the peer node
[root@bwscdb12 ~]# modprobe drbd
[root@bwscdb12 ~]# lsmod | grep drbd
drbd                  397041  0
libcrc32c              12644  4 xfs,drbd,nf_nat,nf_conntrack
[root@bwscdb12 ~]#
[root@bwscdb12 ~]# drbdadm up drbd_data
[root@bwscdb12 ~]# cat /proc/drbd
version: 8.4.11-1 (api:1/proto:86-101)
GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-11-03 01:26:55
 0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:2400256 dw:2397184 dr:0 al:8 bm:0 lo:3 pe:2 ua:3 ap:0 ep:1 wo:f oos:39543516
        [>...................] sync'ed:  5.8% (38616/40956)M
        finish: 0:19:00 speed: 34,652 (37,456) want: 27,960 K/sec
[root@bwscdb12 ~]# cat /proc/drbd
version: 8.4.11-1 (api:1/proto:86-101)
GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-11-03 01:26:55
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:41618688 dw:42273884 dr:0 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@bwscdb12 ~]# drbdadm status
drbd_data role:Secondary
  disk:UpToDate
  peer role:Primary
    replication:Established peer-disk:UpToDate
Note: ro shows Primary/Secondary on the primary and Secondary/Primary on the secondary.
When ds shows UpToDate/UpToDate, the primary/secondary pair is configured correctly (initialization and sync take time; wait until the status above appears before running the following steps).
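That wait-for-UpToDate check can be scripted. A hedged sketch: `drbd_in_sync` is an illustrative helper, not a DRBD utility; it only inspects one resource line in the /proc/drbd format shown above:

```shell
# Succeed only when both sides of the ds: field report UpToDate,
# i.e. the initial sync has finished and it is safe to continue.
drbd_in_sync() {
    case "$1" in
        *"ds:UpToDate/UpToDate"*) return 0 ;;
        *)                        return 1 ;;
    esac
}

# In practice one would poll the live file, for example:
#   until drbd_in_sync "$(grep 'ds:' /proc/drbd)"; do sleep 10; done
```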
[root@bwscdb11 ~]# cat /proc/drbd
version: 8.4.11-1 (api:1/proto:86-101)
GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-11-03 01:26:55
 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:5242684 nr:0 dw:0 dr:5244780 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@bwscdb11 ~]#
[root@bwscdb11 ~]# drbdadm status
drbd_data role:Primary
  disk:UpToDate
  peer role:Secondary
    replication:Established peer-disk:UpToDate
[root@bwsc45 ~]# cat /proc/drbd
version: 8.4.11-1 (api:1/proto:86-101)
GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-11-03 01:26:55
 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:5242684 dw:5242684 dr:0 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@bwsc45 ~]#
[root@bwsc45 ~]# drbdadm status
drbd_data role:Secondary
  disk:UpToDate
  peer role:Primary
    replication:Established peer-disk:UpToDate
Sync complete.
# Mount the DRBD device (node1; note: node1 only)
# The status above showed empty mounted/fstype fields, so now mount DRBD at /drbd_data
(create the /drbd_data mount point on every node; run mkfs and mount on node1 only)
mkdir -p /drbd_data
mkfs.xfs /dev/drbd1
mount /dev/drbd1 /drbd_data
[root@bwsc44 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        36G  5.6G   30G  16% /
devtmpfs        1.9G     0  1.9G   0% /dev
tmpfs           1.9G     0  1.9G   0% /dev/shm
tmpfs           1.9G  9.5M  1.9G   1% /run
tmpfs           1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/sda1       497M  207M  291M  42% /boot
tmpfs           379M   12K  379M   1% /run/user/42
tmpfs           379M     0  379M   0% /run/user/0
/dev/drbd1        3G   33M    3G   1% /drbd_data
Note: no operations are allowed on the DRBD device on the Secondary node, including mounting; all reads and writes can only happen on the Primary.
Only when the Primary fails can the Secondary be promoted to Primary, mount the DRBD device, and carry on serving.
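A defensive wrapper for the mount step might look like the sketch below. The `mount_if_primary` helper and its echoed command are illustrative; the role string is what `drbdadm role drbd_data` prints:

```shell
# Only mount the DRBD device when the local role (the part before the
# slash) is Primary; on a Secondary, refuse instead of failing the mount.
mount_if_primary() {
    local role="$1"          # e.g. "Primary/Secondary"
    case "$role" in
        Primary/*) echo "mount /dev/drbd1 /drbd_data" ;;
        *)         echo "refusing to mount: local role is ${role%%/*}" >&2
                   return 1 ;;
    esac
}

# e.g.  mount_if_primary "$(drbdadm role drbd_data)"
```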
## On both nodes, set the password for the hacluster user (the user name is fixed and cannot be changed)
[root@cml1 ~]# echo 000000 | passwd --stdin hacluster
## Authenticate the pcs cluster nodes (using the default hacluster user and its password):
[root@bwscdb11 ~]# pcs cluster auth bwscdb11.bwscdb.local bwscdb12.bwscdb.local
Username: hacluster
Password: 000000
bwscdb11: Authorized
bwscdb12: Authorized
## Set up the cluster from the two authenticated nodes
[root@bwscdb11 ~]# pcs cluster setup --name mycluster bwscdb11.bwscdb.local bwscdb12.bwscdb.local --force
Destroying cluster on nodes: bwscdb11, bwscdb12...
bwscdb12: Stopping Cluster (pacemaker)...
bwscdb11: Stopping Cluster (pacemaker)...
bwscdb12: Successfully destroyed cluster
bwscdb11: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'bwscdb11', 'bwscdb12'
bwscdb11: successful distribution of the file 'pacemaker_remote authkey'
bwscdb12: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
bwscdb11: Succeeded
bwscdb12: Succeeded
Synchronizing pcsd certificates on nodes bwscdb11, bwscdb12...
bwscdb11: Success
bwscdb12: Success
Restarting pcsd on the nodes in order to reload the certificates...
bwscdb11: Success
bwscdb12: Success
A corosync configuration file has now been generated on the nodes:
[root@bwscdb11 ~]# cd /etc/corosync/
[root@bwscdb11 corosync]# ls
corosync.conf  corosync.conf.example  corosync.conf.example.udpu  corosync.xml.example  uidgid.d
Take a look at the generated file:
[root@bwscdb11 corosync]# cat corosync.conf
totem {
    version: 2
    cluster_name: mycluster
    secauth: off
    transport: udpu
}

nodelist {
    node {
        ring0_addr: bwscdb11.bwscdb.local
        nodeid: 1
    }
    node {
        ring0_addr: bwscdb12.bwscdb.local
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
}
[root@bwscdb11 corosync]#
Start the cluster: start all cluster services and enable them.
[root@bwscdb11 corosync]# pcs cluster start --all
bwscdb11: Starting Cluster (corosync)...
bwscdb12: Starting Cluster (corosync)...
bwscdb12: Starting Cluster (pacemaker)...
bwscdb11: Starting Cluster (pacemaker)...
[root@bwscdb11 corosync]# pcs cluster enable --all
bwscdb11: Cluster Enabled
bwscdb12: Cluster Enabled
[root@bwscdb11 corosync]#
## This effectively starts pacemaker and corosync:
[root@bwscdb11 corosync]# ps -ef | grep corosync
root      3038     1  1 21:02 ?        00:00:00 corosync
root      3149  1995  0 21:02 pts/0    00:00:00 grep --color=auto corosync
[root@bwscdb11 corosync]# ps -ef | grep pacemaker
root      3057     1  0 21:02 ?        00:00:00 /usr/sbin/pacemakerd -f
haclust+  3058  3057  0 21:02 ?        00:00:00 /usr/libexec/pacemaker/cib
root      3059  3057  0 21:02 ?        00:00:00 /usr/libexec/pacemaker/stonithd
root      3060  3057  0 21:02 ?        00:00:00 /usr/libexec/pacemaker/lrmd
haclust+  3061  3057  0 21:02 ?        00:00:00 /usr/libexec/pacemaker/attrd
haclust+  3062  3057  0 21:02 ?        00:00:00 /usr/libexec/pacemaker/pengine
haclust+  3063  3057  0 21:02 ?        00:00:00 /usr/libexec/pacemaker/crmd
root      3167  1995  0 21:02 pts/0    00:00:00 grep --color=auto pacemaker
Check the ring status on each node ("no faults" means everything is OK):
[root@bwscdb11 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
        id      = 10.101.11.11
        status  = ring 0 active with no faults
[root@bwscdb12 yum.repos.d]# corosync-cfgtool -s
Printing ring status.
Local node ID 2
RING ID 0
        id      = 10.101.11.12
        status  = ring 0 active with no faults
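Checking for the "no faults" marker can also be scripted, e.g. for monitoring. A small sketch, where `ring_ok` is an illustrative helper name that simply scans the `corosync-cfgtool -s` output:

```shell
# Succeed when the ring status output reports "no faults".
ring_ok() {
    case "$1" in
        *"no faults"*) return 0 ;;
        *)             return 1 ;;
    esac
}

# e.g.  ring_ok "$(corosync-cfgtool -s)" || echo "ring fault on $(hostname)"
```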
Check the cluster configuration for errors:
[root@bwscdb11 corosync]# crm_verify -L -V
   error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
## Because no STONITH devices are configured, STONITH has to be disabled.
Disable STONITH with the following pcs command:
[root@bwscdb11 corosync]# pcs property set stonith-enabled=false
[root@bwscdb11 corosync]# crm_verify -L -V
[root@bwscdb12 ~]# pcs property set stonith-enabled=false
# crm_verify no longer reports any errors
[root@bwscdb12 ~]# crm_verify -L -V
# Next, since this is a two-node cluster, tell it to ignore loss of quorum:
#pcs property set no-quorum-policy=ignore
[root@bwscdb11 corosync]# pcs property list
Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: mycluster
 dc-version: 1.1.19-8.el7_6.4-c3c624ea3d
 have-watchdog: false
 stonith-enabled: false
[root@bwscdb11 corosync]#
MySQL installation and configuration
## Here MySQL was installed directly with yum; a source build works the same way, in which case fetch the build tools first:
wget http://yum.bwceshi.top/SE_tools/cmake-2.8.8.tar.gz
The installation steps themselves are not detailed here.
PS: the mysql user and the environment variables must be identical on both nodes; MySQL itself only needs to be installed on one side.
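Since the datadir moves between nodes via DRBD, the mysql uid/gid really must match on both hosts, or the files become unreadable after a failover. A small sketch for checking that; the helper name and the ssh usage in the comment are illustrative:

```shell
# Compare the `id mysql` output captured on two nodes; any uid/gid
# mismatch would break file ownership on the shared DRBD device.
same_mysql_identity() {
    [ "$1" = "$2" ]
}

# e.g.
#   same_mysql_identity "$(ssh bwscdb11 id mysql)" "$(ssh bwscdb12 id mysql)" \
#       || echo "mysql uid/gid mismatch between nodes"
```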
> /lib/systemd/system/mysqld.service
cat >> /lib/systemd/system/mysqld.service <<EOF
[Unit]
Description=MySQL Community Server
After=network.target
After=syslog.target

[Install]
WantedBy=multi-user.target
Alias=mysql.service

[Service]
User=mysql
Group=mysql
# Execute pre and post scripts as root
PermissionsStartOnly=true
# Needed to create system tables etc.
#ExecStartPre=/usr/bin/mysql-systemd-start pre
# Start main service
ExecStart=/drbd_data/mysql56/bin/mysqld_safe
# Don't signal startup success before a ping works
#ExecStartPost=/usr/bin/mysql-systemd-start post
# Give up if ping don't get an answer
TimeoutSec=600
Restart=always
EOF

systemctl daemon-reload
systemctl start mysqld
systemctl status mysqld
systemctl stop mysqld
systemctl restart mysqld
systemctl status mysqld
Log in to the database with mysql -uroot:
[root@bwscdb11 bin]# mysql -uroot -p000000
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.16-log Source distribution

Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
## Before configuring the cluster, make sure the services are stopped
systemctl stop mysqld
fuser -m -v -k /drbd_data
umount /dev/drbd1
systemctl stop drbd
systemctl enable mysqld
#### mysqld must be registered with systemd (enabled at boot as above) for the systemd:mysqld resource configured below to be available.
Crmsh installation and resource management
Configure MySQL high availability with crmsh
[root@bwscdb11 ~]# crm
crm(live)# status
Stack: corosync
Current DC: bwscdb11 (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition with quorum
Last updated: Tue Apr 30 06:13:11 2019
Last change: Mon Apr 29 21:05:49 2019 by root via cibadmin on bwscdb11
2 nodes configured
0 resources configured
Online: [ bwscdb11.bwscdb.local bwscdb12.bwscdb.local ]
No resources
crm(live)#
## Before configuring, verify once more that the services are stopped
[root@bwscdb11 ~]# systemctl stop mysqld
[root@bwscdb11 ~]# fuser -m -v -k /dev/drbd1
[root@bwscdb11 ~]# umount /dev/drbd1
[root@bwscdb11 ~]# systemctl stop drbd
#[root@bwscdb11 ~]# systemctl enable mysqld
Created symlink from /etc/systemd/system/multi-user.target.wants/mysqld.service to /usr/lib/systemd/system/mysqld.service.
[root@bwscdb11 ~]# crm
crm(live)# configure
# disable STONITH
crm(live)configure# property stonith-enabled=false
# quorum policy
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# property migration-limit=1
crm(live)configure# verify
# Create the DRBD resource
crm(live)configure# primitive mysqldrbd ocf:linbit:drbd params drbd_resource=drbd_data op start timeout=240 op stop timeout=100 op monitor role=Master interval=20 timeout=30 op monitor role=Slave interval=30 timeout=30
crm(live)configure#
# Create the master/slave resource
crm(live)configure# ms ms_mysqldrbd mysqldrbd meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
crm(live)configure# verify
crm(live)configure# show
node 1: bwscdb11.bwscdb.local
node 2: bwscdb12.bwscdb.local
primitive mysqldrbd ocf:linbit:drbd \
    params drbd_resource=drbd_data \
    op start timeout=240 interval=0 \
    op stop timeout=100 interval=0 \
    op monitor role=Master interval=20 timeout=30 \
    op monitor role=Slave interval=30 timeout=30
ms ms_mysqldrbd mysqldrbd \
    meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
property cib-bootstrap-options: \
    have-watchdog=false \
    dc-version=1.1.19-8.el7_6.4-c3c624ea3d \
    cluster-infrastructure=corosync \
    cluster-name=mycluster \
    stonith-enabled=false \
    no-quorum-policy=ignore \
    migration-limit=1
crm(live)configure# commit
Add the filesystem resource:
crm(live)configure# primitive mystore ocf:heartbeat:Filesystem params device=/dev/drbd1 directory=/drbd_data fstype=xfs op start timeout=60 op stop timeout=60
crm(live)configure#
crm(live)configure# verify
Bind the filesystem resource to the DRBD master with a colocation constraint (a score of inf means the resources must stay together; a negative score keeps them apart).
### Colocation between resources
crm(live)configure# colocation mystore_with_ms_mysqldrbd inf: mystore ms_mysqldrbd:Master
crm(live)configure# verify
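For reference, the score in a colocation constraint works in both directions. These hypothetical examples (resA/resB are not resources in this cluster, just illustrations of the crmsh syntax) show the other common forms:

```
colocation never_together -inf: resA resB   # resA may never run where resB runs
colocation prefer_together 100: resA resB   # prefer the same node, but allow separation
```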
Add an ordering constraint so the filesystem is mounted only after DRBD has been promoted:
# Resource ordering
crm(live)configure# order mystore_after_ms_mysqldrbd mandatory: ms_mysqldrbd:promote mystore:start
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node 1: bwscdb11.bwscdb.local
node 2: bwscdb12.bwscdb.local
primitive mysqldrbd ocf:linbit:drbd \
    params drbd_resource=drbd_data \
    op start timeout=240 interval=0 \
    op stop timeout=100 interval=0 \
    op monitor role=Master interval=20 timeout=30 \
    op monitor role=Slave interval=30 timeout=30
primitive mystore Filesystem \
    params device="/dev/drbd1" directory="/drbd_data" fstype=xfs \
    op start timeout=60 interval=0 op stop timeout=60 interval=0
ms ms_mysqldrbd mysqldrbd \
    meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
order mystore_after_ms_mysqldrbd Mandatory: ms_mysqldrbd:promote mystore:start
colocation mystore_with_ms_mysqldrbd inf: mystore ms_mysqldrbd:Master
property cib-bootstrap-options: \
    have-watchdog=false \
    dc-version=1.1.19-8.el7_6.4-c3c624ea3d \
    cluster-infrastructure=corosync \
    cluster-name=mycluster \
    stonith-enabled=false \
    no-quorum-policy=ignore \
    migration-limit=1
Check the mount:
[root@bwscdb12 src]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        36G  5.6G   30G  16% /
devtmpfs        1.9G     0  1.9G   0% /dev
tmpfs           1.9G   38M  1.9G   2% /dev/shm
tmpfs           1.9G  9.5M  1.9G   1% /run
tmpfs           1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/sda1       497M  207M  291M  42% /boot
tmpfs           379M   12K  379M   1% /run/user/42
tmpfs           379M     0  379M   0% /run/user/0
/dev/drbd1       40G   49M   38G   1% /drbd_data
Add the mysql resource, colocated so the resources start together
[root@bwscdb11 ~]# crm
crm(live)# configure
# create the resource
crm(live)configure# primitive mysqld systemd:mysqld
# colocation between the resources
crm(live)configure# colocation mysqld_with_mystore inf: mysqld mystore
crm(live)configure# verify
WARNING: mysqld: default timeout 20s for start is smaller than the advised 100
WARNING: mysqld: default timeout 20s for stop is smaller than the advised 100
## warnings appear; fix them by editing the resource (crm configure edit):
primitive mysqld systemd:mysqld op start timeout=100 interval=0 op stop timeout=100 interval=0
crm(live)configure# commit
[root@bwscdb11 ~]# systemctl status mysqld
● mysqld.service - Cluster Controlled mysqld
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; disabled; vendor preset: disabled)
  Drop-In: /run/systemd/system/mysqld.service.d
           └─50-pacemaker.conf
   Active: active (running) since Mon 2019-05-20 17:19:16 CST; 11s ago
 Main PID: 8787 (mysqld_safe)
   CGroup: /system.slice/mysqld.service
           ├─8787 /bin/sh /drbd_data/mysql56/bin/mysqld_safe
           └─9534 /drbd_data/mysql56/bin/mysqld --basedir=/drbd_data/mysql56 --datadir=/drbd_data/mysqldata --plugin-dir=/drbd_data/mysql56/lib/plugin --log-error=/drbd_data/mysql56/txy_m...
May 20 17:19:16 bwscdb11.bwscdb.local systemd[1]: Started Cluster Controlled mysqld.
May 20 17:19:16 bwscdb11.bwscdb.local mysqld_safe[8787]: 190520 17:19:16 mysqld_safe Logging to '/drbd_data/mysql56/txy_mysql_error.log'.
May 20 17:19:16 bwscdb11.bwscdb.local mysqld_safe[8787]: 190520 17:19:16 mysqld_safe Starting mysqld daemon with databases from /drbd_data/mysqldata
Add an ordering constraint: mount the filesystem first, then start the mysqld resource
crm(live)configure# order mysqld_after_mystore mandatory: mystore mysqld
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node 1: bwscdb11.bwscdb.local
node 2: bwscdb12.bwscdb.local
primitive mysqld systemd:mysqld
primitive mysqldrbd ocf:linbit:drbd \
    params drbd_resource=drbd_data \
    op start timeout=240 interval=0 \
    op stop timeout=100 interval=0 \
    op monitor role=Master interval=20 timeout=30 \
    op monitor role=Slave interval=30 timeout=30
primitive mystore Filesystem \
    params device="/dev/drbd1" directory="/drbd_data" fstype=xfs \
    op start timeout=60 interval=0 op stop timeout=60 interval=0
ms ms_mysqldrbd mysqldrbd \
    meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
order mysqld_after_mystore Mandatory: mystore mysqld
colocation mysqld_with_mystore inf: mysqld mystore
order mystore_after_ms_mysqldrbd Mandatory: ms_mysqldrbd:promote mystore:start
colocation mystore_with_ms_mysqldrbd inf: mystore ms_mysqldrbd:Master
property cib-bootstrap-options: \
    have-watchdog=false \
    dc-version=1.1.19-8.el7_6.4-c3c624ea3d \
    cluster-infrastructure=corosync \
    cluster-name=mycluster \
    stonith-enabled=false \
    no-quorum-policy=ignore \
    migration-limit=1
Check the resources, and verify on which node mysqld is running:
[root@bwscdb11 yum.repos.d]# crm
crm(live)# status
Stack: corosync
Current DC: bwscdb12 (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition with quorum
Last updated: Tue Apr 30 23:06:55 2019
Last change: Tue Apr 30 23:06:36 2019 by root via cibadmin on bwscdb11
2 nodes configured
4 resources configured
Online: [ bwscdb11.bwscdb.local bwscdb12.bwscdb.local ]
Full list of resources:
 Master/Slave Set: ms_mysqldrbd [mysqldrbd]
     Masters: [ bwscdb11 ]
     Slaves: [ bwscdb12 ]
 mystore    (ocf::heartbeat:Filesystem):    Started bwscdb11
 mysqld     (systemd:mysqld):       Started bwscdb11
Add the VIP resource for virtual-IP failover. For reference, the generic form is:
crm configure primitive eth0_virtual ocf:heartbeat:IPaddr params ip="200.zzz.z.162" nic="eth0" cidr_netmask="24" broadcast="200.zzz.z.255" op monitor interval="10s" timeout="20s"
crm(live)#
crm(live)# configure
crm(live)configure# primitive myvip ocf:heartbeat:IPaddr params ip="10.101.11.13" nic="eth0" cidr_netmask="24" op monitor interval=20 timeout=20 on-fail=restart
# colocation so that myvip starts together with the mysqld resource
crm(live)configure# colocation vip_with_ms_mysqldrbd inf: myvip mysqld
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node 1: bwscdb11.bwscdb.local
node 2: bwscdb12.bwscdb.local
primitive mysqld systemd:mysqld
primitive mysqldrbd ocf:linbit:drbd \
    params drbd_resource=drbd_data \
    op start timeout=240 interval=0 \
    op stop timeout=100 interval=0 \
    op monitor role=Master interval=20 timeout=30 \
    op monitor role=Slave interval=30 timeout=30
primitive mystore Filesystem \
    params device="/dev/drbd1" directory="/drbd_data" fstype=xfs \
    op start timeout=60 interval=0 op stop timeout=60 interval=0
primitive myvip IPaddr \
    params ip=10.101.11.13 nic=eth0 cidr_netmask=24 \
    op monitor interval=20 timeout=20 on-fail=restart
ms ms_mysqldrbd mysqldrbd \
    meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
order mysqld_after_mystore Mandatory: mystore mysqld
colocation mysqld_with_mystore inf: mysqld mystore
order mystore_after_ms_mysqldrbd Mandatory: ms_mysqldrbd:promote mystore:start
colocation mystore_with_ms_mysqldrbd inf: mystore ms_mysqldrbd:Master
colocation vip_with_ms_mysqldrbd inf: myvip mysqld
property cib-bootstrap-options: \
    have-watchdog=false \
    dc-version=1.1.19-8.el7_6.4-c3c624ea3d \
    cluster-infrastructure=corosync \
    cluster-name=mycluster \
    stonith-enabled=false \
    no-quorum-policy=ignore \
    migration-limit=1
[root@bwscdb11 ~]# ip addr | grep 10.101.11
    inet 10.101.11.11/24 brd 10.101.11.255 scope global eth0
    inet 10.101.11.13/24 brd 10.101.11.255 scope global secondary eth0
# Test
[root@bwscdb11 ~]# mysql -uroot -p000000
Welcome to the mysqld monitor.  Commands end with ; or \g.
Your mysqld connection id is 5
Server version: 5.5.60-mysqld mysqld Server

Copyright (c) 2000, 2018, Oracle, mysqld Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysqld [(none)]> use mysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysqld [mysql]> select database();
+------------+
| database() |
+------------+
| mysql      |
+------------+
1 row in set (0.00 sec)

mysqld [mysql]> select User,Host,Password from user;
+------+-----------+-------------------------------------------+
| User | Host      | Password                                  |
+------+-----------+-------------------------------------------+
| root | localhost | *032197AE5731D4664921A6CCAC7CFCE6A0698693 |
| root | bwscdb11  |                                           |
| root | 127.0.0.1 |                                           |
| root | ::1       |                                           |
|      | localhost |                                           |
|      | bwscdb11  |                                           |
+------+-----------+-------------------------------------------+
6 rows in set (0.00 sec)

mysqld [mysql]> delete from mysql.user where user='';
Query OK, 2 rows affected (0.00 sec)

mysqld [mysql]> delete from mysql.user where password='';
Query OK, 3 rows affected (0.00 sec)

mysqld [mysql]> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysqld [mysql]> SET PASSWORD FOR 'root'@'localhost' = PASSWORD('000000');
Query OK, 0 rows affected (0.00 sec)

mysqld [mysql]> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysqld [mysql]> use mysql
Database changed
mysqld [mysql]> update user set host='%' where user='root';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysqld [mysql]> GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY '000000' WITH GRANT OPTION;
Query OK, 0 rows affected (0.00 sec)

mysqld [mysql]> SET PASSWORD FOR 'root'@'%' = PASSWORD('000000');
ERROR 1133 (42000): Can't find any matching row in the user table
mysqld [mysql]> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysqld [mysql]> exit
Bye
[root@bwscdb11 ~]# mysql -uroot -p000000
Welcome to the mysqld monitor.  Commands end with ; or \g.
Your mysqld connection id is 6
Server version: 5.5.60-mysqld mysqld Server

Copyright (c) 2000, 2018, Oracle, mysqld Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysqld [(none)]>
### Now put bwscdb11 into standby, then access the database from bwscdb12:
# standby
[root@bwscdb11 ~]# crm node standby bwscdb11
[root@bwscdb11 ~]# crm status
Stack: corosync
Current DC: bwscdb12 (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition with quorum
Last updated: Tue Apr 30 23:23:07 2019
Last change: Tue Apr 30 23:22:53 2019 by root via crm_attribute on bwscdb11
2 nodes configured
5 resources configured
Node bwscdb11: standby
Online: [ bwscdb12 ]
Full list of resources:
 Master/Slave Set: ms_mysqldrbd [mysqldrbd]
     Masters: [ bwscdb12 ]
     Stopped: [ bwscdb11 ]
 mystore    (ocf::heartbeat:Filesystem):    Started bwscdb12
 mysqld     (systemd:mysqld):       Started bwscdb12
 myvip      (ocf::heartbeat:IPaddr):        Started bwscdb12
Access from bwscdb12:
[root@bwscdb12 yum.repos.d]# mysql -uroot -p000000
Welcome to the mysqld monitor.  Commands end with ; or \g.
Your mysqld connection id is 2
Server version: 5.5.60-mysqld mysqld Server

Copyright (c) 2000, 2018, Oracle, mysqld Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysqld [(none)]> select User,Host,Password from user;
ERROR 1046 (3D000): No database selected
mysqld [(none)]> use mysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysqld [mysql]> select User,Host,Password from user;
+------+-----------+-------------------------------------------+
| User | Host      | Password                                  |
+------+-----------+-------------------------------------------+
| root | %         | *032197AE5731D4664921A6CCAC7CFCE6A0698693 |
| root | localhost | *032197AE5731D4664921A6CCAC7CFCE6A0698693 |
+------+-----------+-------------------------------------------+
2 rows in set (0.00 sec)

mysqld [mysql]> exit

# bring node1 back online, put node2 into standby to fail back, then bring node2 online again
crm node online bwscdb11
sleep 5
crm node standby bwscdb12
sleep 10
crm node online bwscdb12

[root@bwscdb11 ~]# crm status
Stack: corosync
Current DC: bwscdb12 (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition with quorum
Last updated: Tue Apr 30 23:26:48 2019
Last change: Tue Apr 30 23:26:44 2019 by root via crm_attribute on bwscdb11
2 nodes configured
5 resources configured
Online: [ bwscdb11.bwscdb.local bwscdb12.bwscdb.local ]
Full list of resources:
 Master/Slave Set: ms_mysqldrbd [mysqldrbd]
     Masters: [ bwscdb11 ]
     Slaves: [ bwscdb12 ]
 mystore    (ocf::heartbeat:Filesystem):    Started bwscdb11
 mysqld     (systemd:mysqld):       Started bwscdb11
 myvip      (ocf::heartbeat:IPaddr):        Started bwscdb11
[root@bwscdb11 yum.repos.d]# mysql -uroot -p000000
Welcome to the mysqld monitor.  Commands end with ; or \g.
Your mysqld connection id is 2
Server version: 5.5.60-mysqld mysqld Server

Copyright (c) 2000, 2018, Oracle, mysqld Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysqld [(none)]>
Switching primary and standby by moving the resource:
[root@bwsc45 yum.repos.d]# pcs resource move ms_mysqldrbd bwscdb12.bwscdb.local
[root@bwsc45 yum.repos.d]# crm status