Case 1: GTID is enabled and the slave is out of sync with the master.

1. Check the replication status on the slave

 
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 10.120.141.168
Master_User: sys_repl
Master_Port: 3306
Connect_Retry: 10
Master_Log_File:
Read_Master_Log_Pos: 4
Relay_Log_File: mysqld-relay-bin.000001
Relay_Log_Pos: 4
......
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
......
Retrieved_Gtid_Set: 79212d47-7122-11e7-8641-0050569f788a:13579-17760
Executed_Gtid_Set: 07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,
635227c2-af6c-11e8-a447-5254003471ec:1-297,
66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,
79212d47-7122-11e7-8641-0050569f788a:1-17760,
9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
 

The slave reports error 1236. Next, check the master's status on the master:

 
 mysql> show master status \G
*************************** 1. row ***************************
File: mysql-bin.000480
Position: 10751
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set: 07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,
635227c2-af6c-11e8-a447-5254003471ec:1-297,
66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,
79212d47-7122-11e7-8641-0050569f788a:1-17797,
9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6,
ac29dfbf-aa66-11e8-9d1e-5254003471ec:1
1 row in set (0.00 sec)

mysql> show variables like '%gtid_purged%';
+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name | Value |
+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| gtid_purged | 07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,
635227c2-af6c-11e8-a447-5254003471ec:1-297,
66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,
79212d47-7122-11e7-8641-0050569f788a:1-15338,
9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6,
ac29dfbf-aa66-11e8-9d1e-5254003471ec:1 |
+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
 

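It turns out that the master has purged a GTID the slave has not yet executed: ac29dfbf-aa66-11e8-9d1e-5254003471ec:1 is in the master's gtid_purged but not in the slave's Executed_Gtid_Set. Since the slave carries no business traffic and serves only as a backup, I simply reset it directly: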

 
mysql> stop slave;
mysql> reset master;
mysql> set global gtid_purged = '07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,
'> 635227c2-af6c-11e8-a447-5254003471ec:1-297,
'> 66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,
'> 79212d47-7122-11e7-8641-0050569f788a:1-15223,
'> 9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6,
'> ac29dfbf-aa66-11e8-9d1e-5254003471ec:1'; # the master's gtid_purged
mysql> change master to master_host='10.120.141.136',master_user='sys_replication',master_password='x!Jkz@SIe',master_port=3306,master_auto_position=1;
mysql> start slave;
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.120.141.136
Master_User: sys_replication
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000480
Read_Master_Log_Pos: 10272
Relay_Log_File: mysqld-relay-bin.000034
Relay_Log_Pos: 10285
Relay_Master_Log_File: mysql-bin.000480
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 10272
Relay_Log_Space: 10580
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 141135
Master_UUID: 79212d47-7122-11e7-8641-0050569f788a
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 79212d47-7122-11e7-8641-0050569f788a:15224-17793
Executed_Gtid_Set: 07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,
635227c2-af6c-11e8-a447-5254003471ec:1-297,
66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,
79212d47-7122-11e7-8641-0050569f788a:1-17793,
9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6,
ac29dfbf-aa66-11e8-9d1e-5254003471ec:1
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
 

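Summary:

Before replication starts, the master uses GTIDs to confirm that, once replication begins, the slave will be able to replay every transaction executed on the master (including those applied by the slave SQL thread), so that it ends up fully consistent with the master.
The check is simple and rests on two conditions:
1. The master must not have purged transactions that the slave has not yet executed (that is, the slave's Executed_Gtid_Set must contain everything in the master's gtid_purged).
2. The master's transaction numbers must not be lower than the slave's (that is, the last transaction in the slave's Executed_Gtid_Set must fall within the master's Executed_Gtid_Set).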
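The first condition is a set-containment check, and it can be reproduced offline from the values pasted above. The following is a minimal Python sketch, not MySQL's own implementation; the helper names (parse_gtid_set, subtract_ranges, gtid_subtract) are hypothetical and introduced only for illustration. It computes which GTIDs are in the master's gtid_purged but missing from the slave's Executed_Gtid_Set in case 1.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-7:9, uuid2:1-3' into {uuid: [(first, last), ...]}."""
    parsed = {}
    for entry in text.split(","):
        entry = entry.strip()  # also drops the line breaks of pasted output
        if not entry:
            continue
        uuid, *intervals = entry.split(":")
        ranges = parsed.setdefault(uuid.lower(), [])
        for interval in intervals:
            first, _, last = interval.partition("-")
            ranges.append((int(first), int(last or first)))
    return parsed


def subtract_ranges(ranges, removed):
    """Return the parts of `ranges` not covered by any interval in `removed`."""
    remaining = list(ranges)
    for r_first, r_last in removed:
        next_remaining = []
        for first, last in remaining:
            if r_last < first or r_first > last:
                next_remaining.append((first, last))  # no overlap, keep whole
                continue
            if first < r_first:
                next_remaining.append((first, r_first - 1))  # part before
            if last > r_last:
                next_remaining.append((r_last + 1, last))  # part after
        remaining = next_remaining
    return sorted(remaining)


def gtid_subtract(a, b):
    """GTIDs present in parsed set `a` but absent from parsed set `b`."""
    missing = {}
    for uuid, ranges in a.items():
        left = subtract_ranges(ranges, b.get(uuid, []))
        if left:
            missing[uuid] = left
    return missing


# Values copied from the case 1 output above.
master_purged = parse_gtid_set("""
07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,
635227c2-af6c-11e8-a447-5254003471ec:1-297,
66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,
79212d47-7122-11e7-8641-0050569f788a:1-15338,
9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6,
ac29dfbf-aa66-11e8-9d1e-5254003471ec:1
""")
slave_executed = parse_gtid_set("""
07cfd8f2-b30c-11e7-8909-000d3a80115c:1-7,
635227c2-af6c-11e8-a447-5254003471ec:1-297,
66d902a5-b546-11e7-b1d4-000d3a80115c:1-13,
79212d47-7122-11e7-8641-0050569f788a:1-17760,
9d5436f6-7122-11e7-8e0c-0050569f19f6:1-6
""")

# A non-empty result means condition 1 is violated and error 1236 is expected.
missing = gtid_subtract(master_purged, slave_executed)
print(missing)
```

On a live server the same comparison can be made with MySQL's built-in GTID_SUBTRACT() and GTID_SUBSET() functions; the sketch is only meant for reasoning about pasted output.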

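2. The architecture is dual-master (one master and one slave, each replicating from the other). Business traffic and applications run on the master; the slave is used for backup and carries almost no traffic.

Replication from the slave (s) to the master (m) works fine, but when the master (m) is pointed at the slave (s) it fails with error 1236.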

 
mysql> show slave status \G  # status on the master (m)
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 10.120.141.168
Master_User: sys_repl
Master_Port: 3306
Connect_Retry: 10
Master_Log_File:
Read_Master_Log_Pos: 4
Relay_Log_File: mysqld-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File:
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 0
Relay_Log_Space: 154
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 141168
Master_UUID: 055a9521-4906-11e8-8cdb-0050569f3621
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp: 181119 14:27:58
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set: 055a9521-4906-11e8-8cdb-0050569f3621:1-164859,
2f7211a3-fba0-11e5-b668-0050569f3621:1-627,
30d4f3f6-b56b-11e7-acf1-000d3a801c2f:1-14,
375942c7-0723-11e6-b55c-0050569f3621:1-16630,
61edd40b-af6c-11e8-a4f6-525400adeb6d:1-5059,
971844d6-d7ca-11e6-8d01-0050569f6058:1-11929269,
d408633e-fb9f-11e5-8de2-0050569f6058:1-1317354
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
 

Now look at gtid_purged on the slave (s):

 
mysql> show variables like '%gtid%purged%'\G
*************************** 1. row ***************************
Variable_name: gtid_purged
Value: 055a9521-4906-11e8-8cdb-0050569f3621:1-157766,
2f7211a3-fba0-11e5-b668-0050569f3621:1-627,
30d4f3f6-b56b-11e7-acf1-000d3a801c2f:1-14,
375942c7-0723-11e6-b55c-0050569f3621:1-16633,
971844d6-d7ca-11e6-8d01-0050569f6058:1-9581086,
d408633e-fb9f-11e5-8de2-0050569f6058:1-1317354
1 row in set (0.00 sec)
 

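The slave (s) has purged GTIDs that the master (m) has not executed: for 375942c7-0723-11e6-b55c-0050569f3621, s has purged 1-16633 while m has executed only 1-16630. (Replication from the slave to the master is already set up; the error occurs while pointing the master at the slave, so in this direction m plays the slave role and s the master role.)

By the rule summarized earlier, the master (here s) must not have purged transactions that the slave (here m) has not yet executed (that is, m's Executed_Gtid_Set must contain s's gtid_purged).

That is why error 1236 is reported. Because m still serves business traffic and applications, a blunt reset master is not an option, so the only way is to bring m's executed GTID set up to date.

My approach was to skip these three transactions. (This is not the only solution, and it becomes impractical when many transactions are missing. In this architecture the error most likely means that some application executed transactions on the slave; if the slave has executed many of them, the cause needs to be investigated.)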
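Skipping a transaction amounts to injecting one empty transaction per missing GTID, so the statements can be generated rather than typed by hand. The following is a minimal Python sketch under that assumption; the function name empty_transaction_statements is hypothetical, and the output is meant to be reviewed before it is run in a mysql session on m.

```python
def empty_transaction_statements(server_uuid, first, last):
    """Build SQL that injects one empty transaction per GTID in first..last."""
    statements = ["stop slave;"]
    for number in range(first, last + 1):
        # Claim this GTID for the next transaction, then commit nothing.
        statements.append(f"set gtid_next='{server_uuid}:{number}';")
        statements.append("begin;")
        statements.append("commit;")
    # Hand GTID assignment back to the server before resuming replication.
    statements.append("set gtid_next='AUTOMATIC';")
    statements.append("start slave;")
    return statements


# From the output above: s has purged 375942c7-...:1-16633,
# while m has executed only 1-16630.
purged_last = 16633
executed_last = 16630

script = empty_transaction_statements(
    "375942c7-0723-11e6-b55c-0050569f3621",
    executed_last + 1,
    purged_last,
)
print("\n".join(script))
```

Generating the statements keeps the GTID numbers consistent with the observed gap, but it does not recover the skipped data: the three transactions are simply marked as executed on m.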

 
stop slave;
set gtid_next='375942c7-0723-11e6-b55c-0050569f3621:16631'; -- the GTID of the next transaction, i.e. the one to skip
begin;
commit; -- inject an empty transaction
set gtid_next='375942c7-0723-11e6-b55c-0050569f3621:16632';
begin;
commit;
set gtid_next='375942c7-0723-11e6-b55c-0050569f3621:16633';
begin;
commit;
set gtid_next='AUTOMATIC'; -- return to automatic GTID assignment
start slave; -- resume replication
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.120.141.168
Master_User: sys_repl
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: mysql-bin.000026
Read_Master_Log_Pos: 185245755
Relay_Log_File: mysqld-relay-bin.000002
Relay_Log_Pos: 414
Relay_Master_Log_File: mysql-bin.000026
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 185245755
Relay_Log_Space: 622
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 141168
Master_UUID: 055a9521-4906-11e8-8cdb-0050569f3621
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set: 055a9521-4906-11e8-8cdb-0050569f3621:1-164859,
2f7211a3-fba0-11e5-b668-0050569f3621:1-627,
30d4f3f6-b56b-11e7-acf1-000d3a801c2f:1-14,
375942c7-0723-11e6-b55c-0050569f3621:1-16633,
61edd40b-af6c-11e8-a4f6-525400adeb6d:1-5059,
971844d6-d7ca-11e6-8d01-0050569f6058:1-11929305,
d408633e-fb9f-11e5-8de2-0050569f6058:1-1317354
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
 

With that, replication is working again.

04-26 13:22