MongoDB Replica Set: Switching Between PRIMARY and SECONDARY
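When a MongoDB (mdb) replica set is used as a high-availability solution, a primary/secondary switchover is either planned (active) or forced (passive), depending on how it is triggered. A planned switchover is initiated from the PRIMARY itself. A forced switchover is used when the PRIMARY cannot be reached, and the SECONDARY is forcibly activated directly.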
PRIMARY:192.168.56.11
SECONDARY:192.168.56.10
MDB:3.0.3
Check the current replication status:
rep-test:PRIMARY> rs.status()
{
"set" : "rep-test",
"date" : ISODate("2016-02-26T03:15:36.810Z"),
"myState" : 1,
"members" : [
{
"_id" : 1,
"name" : "192.168.56.11:63105",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 51164,
"optime" : Timestamp(1456456239, 1),
"optimeDate" : ISODate("2016-02-26T03:10:39Z"),
"electionTime" : Timestamp(1456456241, 1),
"electionDate" : ISODate("2016-02-26T03:10:41Z"),
"configVersion" : 300282,
"self" : true
},
{
"_id" : 2,
"name" : "192.168.56.10:63105",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 295,
"optime" : Timestamp(1456456239, 1),
"optimeDate" : ISODate("2016-02-26T03:10:39Z"),
"lastHeartbeat" : ISODate("2016-02-26T03:15:35.483Z"),
"lastHeartbeatRecv" : ISODate("2016-02-26T03:15:35.483Z"),
"pingMs" : 0,
"configVersion" : 300282
}
],
"ok" : 1
}
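The full status document is long, and for a switchover only a few fields per member matter. As a minimal sketch, the helper below (a hypothetical name, not part of the mongo shell) reduces an rs.status()-shaped document to one line per member. It is written in plain ES5-style JavaScript with the intent that it can also be pasted into the mongo shell; the sample object is a trimmed copy of the output above.

```javascript
// Hypothetical helper: summarize an rs.status()-shaped document,
// one line per member. Only fields visible in the output above are read.
function summarizeMembers(status) {
  return status.members.map(function (m) {
    var healthy = m.health === 1 ? "up" : "down";
    return m._id + " " + m.name + " " + m.stateStr + " " + healthy;
  });
}

// Trimmed copy of the status shown above, used only for illustration.
var sampleStatus = {
  set: "rep-test",
  myState: 1,
  members: [
    { _id: 1, name: "192.168.56.11:63105", health: 1, state: 1, stateStr: "PRIMARY" },
    { _id: 2, name: "192.168.56.10:63105", health: 1, state: 2, stateStr: "SECONDARY" }
  ]
};

var summaryLines = summarizeMembers(sampleStatus);
// In the mongo shell the equivalent call would be:
//   summarizeMembers(rs.status())
```

Against a live replica set, the same function would be called with rs.status() instead of the sample object.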
A planned switchover is straightforward: run rs.stepDown() on the primary.
rep-test:PRIMARY> rs.stepDown()
The log shows that the node has transitioned to SECONDARY:
2016-02-26T11:17:08.912+0800 I COMMAND [conn313] Attempting to step down in response to replSetStepDown command
2016-02-26T11:17:08.912+0800 I REPL [ReplicationExecutor] transition to SECONDARY
2016-02-26T11:17:08.913+0800 I NETWORK [conn313] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:41741]
2016-02-26T11:17:08.920+0800 I NETWORK [initandlisten] connection accepted from 127.0.0.1:41754 #329 (2 connections now open)
2016-02-26T11:17:09.583+0800 I NETWORK [conn328] end connection 192.168.56.10:58104 (1 connection now open)
2016-02-26T11:17:09.584+0800 I NETWORK [initandlisten] connection accepted from 192.168.56.10:58105 #330 (2 connections now open)
2016-02-26T11:17:10.379+0800 I REPL [ReplicationExecutor] replSetElect voting yea for 192.168.56.10:63105 (2)
2016-02-26T11:17:11.587+0800 I REPL [ReplicationExecutor] Member 192.168.56.10:63105 is now in state PRIMARY
rep-test:SECONDARY> rs.status()
{
"set" : "rep-test",
"date" : ISODate("2016-02-26T03:18:11.642Z"),
"myState" : 2,
"members" : [
{
"_id" : 1,
"name" : "192.168.56.11:63105",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 51319,
"optime" : Timestamp(1456456239, 1),
"optimeDate" : ISODate("2016-02-26T03:10:39Z"),
"configVersion" : 300282,
"self" : true
},
{
"_id" : 2,
"name" : "192.168.56.10:63105",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 450,
"optime" : Timestamp(1456456239, 1),
"optimeDate" : ISODate("2016-02-26T03:10:39Z"),
"lastHeartbeat" : ISODate("2016-02-26T03:18:11.640Z"),
"lastHeartbeatRecv" : ISODate("2016-02-26T03:18:11.639Z"),
"pingMs" : 1,
"electionTime" : Timestamp(1456456239, 2),
"electionDate" : ISODate("2016-02-26T03:10:39Z"),
"configVersion" : 300282
}
],
"ok" : 1
}
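To confirm the switchover without reading the whole document, it is enough to look for the member whose state is 1 (PRIMARY). The following is a small sketch with a hypothetical helper name; the sample object is a trimmed copy of the status above.

```javascript
// Hypothetical helper: return the name of the member reporting
// state 1 (PRIMARY), or null when no member is primary.
function findPrimaryName(status) {
  for (var i = 0; i < status.members.length; i++) {
    if (status.members[i].state === 1) {
      return status.members[i].name;
    }
  }
  return null;
}

// Trimmed copy of the status after rs.stepDown(), for illustration.
var statusAfterStepDown = {
  set: "rep-test",
  myState: 2,
  members: [
    { _id: 1, name: "192.168.56.11:63105", health: 1, state: 2, stateStr: "SECONDARY" },
    { _id: 2, name: "192.168.56.10:63105", health: 1, state: 1, stateStr: "PRIMARY" }
  ]
};

var newPrimary = findPrimaryName(statusAfterStepDown);
// In the mongo shell: findPrimaryName(rs.status())
```

A null result means no member currently reports PRIMARY, for example while an election is still in progress.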
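A forced switchover takes a bit more work. Suppose the current PRIMARY can no longer be reached. With only two members, the surviving SECONDARY cannot see a majority of the set, so it will not promote itself, and the unreachable member has to be removed from the configuration by hand.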
rep-test:SECONDARY> rs.status()
{
"set" : "rep-test",
"date" : ISODate("2016-01-17T06:54:11.083Z"),
"myState" : 2,
"members" : [
{
"_id" : 1,
"name" : "192.168.56.11:63105",
"health" : 0,
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : Timestamp(0, 0),
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2016-01-17T06:54:10.781Z"),
"lastHeartbeatRecv" : ISODate("2016-01-17T06:54:10.772Z"),
"pingMs" : 0,
"lastHeartbeatMessage" : "Failed attempt to connect to 192.168.56.11:63105; couldn't connect to server 192.168.56.11:63105 (192.168.56.11), connection attempt failed",
"configVersion" : -1
},
{
"_id" : 2,
"name" : "192.168.56.10:63105",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 60691,
"optime" : Timestamp(1456456239, 1),
"optimeDate" : ISODate("2016-02-26T03:10:39Z"),
"configVersion" : 300282,
"self" : true
}
],
"ok" : 1
}
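The status above shows one reachable member out of two, which is why the survivor stays SECONDARY: an election needs a majority of the set. The arithmetic can be sketched as follows. The helper name is hypothetical, and the sketch assumes every member listed carries exactly one vote, which matches the "votes" : 1 default but is not read from rs.status() itself.

```javascript
// Hypothetical helper. Assumption: each member in status.members
// has exactly one vote (rs.status() does not report votes).
function electionOutlook(status) {
  var total = status.members.length;
  var reachable = status.members.filter(function (m) {
    return m.health === 1;
  }).length;
  // A strict majority of all members, not just of the reachable ones.
  var majority = Math.floor(total / 2) + 1;
  return {
    total: total,
    reachable: reachable,
    majority: majority,
    electable: reachable >= majority
  };
}

// Trimmed copy of the status above: old primary down, survivor up.
var degradedStatus = {
  members: [
    { _id: 1, name: "192.168.56.11:63105", health: 0, state: 8 },
    { _id: 2, name: "192.168.56.10:63105", health: 1, state: 2 }
  ]
};

var outlook = electionOutlook(degradedStatus);
```

For this two-member set the required majority is 2 while only 1 member is reachable, so no primary can be elected without changing the configuration.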
As the status shows, the old PRIMARY can no longer be reached. The second member (currently SECONDARY) has to be force-activated directly:
rep-test:SECONDARY> cfg=rs.conf()
rep-test:SECONDARY> cfg.members=[cfg.members[1]]
[
{
"_id" : 2,
"host" : "192.168.56.10:63105",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {
},
"slaveDelay" : 0,
"votes" : 1
}
]
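The new configuration keeps only one member. Note that cfg.members is a zero-based array indexed by position, not by _id: in this example the member with _id 2 is the second element, so its index is 1.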
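Because a hard-coded index is easy to get wrong when _id values and array positions differ, the surviving member can also be selected by host. This is a sketch only: keepOnlyHost is a hypothetical helper, and the sample configuration is an illustrative stand-in for what rs.conf() returns, with most fields omitted.

```javascript
// Hypothetical helper: keep only the member whose host matches,
// and refuse to continue unless exactly one member matches.
function keepOnlyHost(cfg, host) {
  var kept = cfg.members.filter(function (m) {
    return m.host === host;
  });
  if (kept.length !== 1) {
    throw new Error("expected exactly one member with host " + host +
                    ", found " + kept.length);
  }
  cfg.members = kept;
  return cfg;
}

// Illustrative stand-in for rs.conf(); real output has more fields.
var sampleCfg = {
  _id: "rep-test",
  members: [
    { _id: 1, host: "192.168.56.11:63105" },
    { _id: 2, host: "192.168.56.10:63105" }
  ]
};

var singleMemberCfg = keepOnlyHost(sampleCfg, "192.168.56.10:63105");
// In the mongo shell this would be used as:
//   cfg = keepOnlyHost(rs.conf(), "192.168.56.10:63105")
// and then passed to rs.reconfig(cfg, {force: true}).
```

The explicit check for exactly one match is a design choice: a mistyped host raises an error instead of producing an empty member list.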
rep-test:SECONDARY> rs.reconfig(cfg, {force: true});
{ "ok" : 1 }
This forces the replica set to be reconfigured. Check the status again:
rep-test:SECONDARY> rs.status()
{
"set" : "rep-test",
"date" : ISODate("2016-01-17T06:56:33.838Z"),
"myState" : 1,
"members" : [
{
"_id" : 2,
"name" : "192.168.56.10:63105",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 60833,
"optime" : Timestamp(1456456239, 1),
"optimeDate" : ISODate("2016-02-26T03:10:39Z"),
"electionTime" : Timestamp(1456456239, 3),
"electionDate" : ISODate("2016-02-26T03:10:39Z"),
"configVersion" : 339433,
"self" : true
}
],
"ok" : 1
}
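As you can see, the member with _id 2 has been successfully activated as PRIMARY.
Note: both planned and forced switchovers cause a brief interruption for the application.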
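One common way for an application to ride out such a brief interruption is to retry a failed operation a few times. The sketch below is generic and not tied to any MongoDB driver API: withRetry is a hypothetical helper, and the failing operation is simulated. The optional pause callback is an assumption for illustration; in the mongo shell it could wrap the built-in sleep(ms).

```javascript
// Hypothetical helper: call operation(), retrying on any exception,
// up to maxAttempts times. pause(attempt) is optional and is called
// between attempts (for example to wait before trying again).
function withRetry(operation, maxAttempts, pause) {
  var lastError = null;
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return operation();
    } catch (e) {
      lastError = e;
      if (pause && attempt < maxAttempts) {
        pause(attempt);
      }
    }
  }
  throw lastError;
}

// Simulated operation: fails twice, as it might during a switchover,
// then succeeds on the third call.
var calls = 0;
var result = withRetry(function () {
  calls += 1;
  if (calls < 3) {
    throw new Error("simulated failure during switchover");
  }
  return "written";
}, 5);
```

Retrying blindly is only safe for operations that can be repeated without side effects, so whether this pattern fits depends on the application.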