We have master-master replication. After some start/stop issues, I found some oddities in replication.
First, both boxes report this error:
160226 10:08:59 [Note] Slave I/O thread: connected to master 'user@172.**.**.51:3306',replication started in log 'mysql-bin.000210' at position 107
160226 10:08:59 [ERROR] Error reading packet from server: Could not find first log file name in binary log index file ( server_errno=1236)
160226 10:08:59 [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file', Error_code: 1236
160226 10:08:59 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000210', position 107
Status on both nodes also says:
Master_Log_File: mysql-bin.000210
Read_Master_Log_Pos: 107
Relay_Log_File: relay.000629
Relay_Log_Pos: 4
Relay_Master_Log_File: mysql-bin.000210
...
Exec_Master_Log_Pos: 107
Relay_Log_Space: 107
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
I found a possible fix for the 1236 error: just move replication to the first position of the last binlog, i.e.:
STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_POS = 0;
CHANGE MASTER TO MASTER_LOG_FILE = 'mysql-bin.0000LASTLOGNUM';
START SLAVE;
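The same step could probably be written as a single CHANGE MASTER TO statement. Below is a minimal sketch that only prints the statements for a given binlog file name. `change_master_sql` is a hypothetical helper name, and the default position 4 is my assumption (as far as I know, that is where the first event in a binlog file starts).

```shell
# Hypothetical helper: print the statements that reposition a slave
# at the start of a given binlog file. It does not talk to MySQL itself.
# Assumption: position 4 is the first event position in a binlog file.
change_master_sql() {
    cm_file=$1          # binlog file name on the peer, e.g. mysql-bin.0000LASTLOGNUM
    cm_pos=${2:-4}      # optional position, defaults to 4
    printf "STOP SLAVE;\n"
    printf "CHANGE MASTER TO MASTER_LOG_FILE = '%s', MASTER_LOG_POS = %s;\n" "$cm_file" "$cm_pos"
    printf "START SLAVE;\n"
}
```

The output could then be reviewed and piped into the `mysql` client, e.g. `change_master_sql mysql-bin.0000LASTLOGNUM | mysql`.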
But here is what confused me.
Master_1:
# ls -l /var/log/mysql/mysql-bin.*
-rw-rw---- 1 mysql adm 126 Feb 26 09:56 /var/log/mysql/mysql-bin.000213
-rw-rw---- 1 mysql adm 126 Feb 26 10:15 /var/log/mysql/mysql-bin.000214
-rw-rw---- 1 mysql adm 1054 Feb 26 11:13 /var/log/mysql/mysql-bin.000215
-rw-rw---- 1 mysql adm 96 Feb 26 10:15 /var/log/mysql/mysql-bin.index
While Master_2 has:
# ls -l /var/log/mysql/mysql-bin.*
-rw-rw---- 1 mysql adm 126 Feb 26 07:26 /var/log/mysql/mysql-bin.000212
-rw-rw---- 1 mysql adm 107 Feb 26 09:39 /var/log/mysql/mysql-bin.000213
-rw-rw---- 1 mysql adm 64 Feb 26 09:39 /var/log/mysql/mysql-bin.index
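Neither listing contains mysql-bin.000210, the file both slave I/O threads ask for, which seems to match the 1236 error. To double-check against the index file rather than the directory listing, something like the sketch below could be run on each box. `binlog_in_index` is a hypothetical helper name, and it assumes the index holds one binlog path per line.

```shell
# Hypothetical helper: succeed if a binlog file name is listed in an index file.
# Assumption: the index contains one binlog path per line.
binlog_in_index() {
    bi_index=$1     # path to the index, e.g. /var/log/mysql/mysql-bin.index
    bi_wanted=$2    # bare file name, e.g. mysql-bin.000210
    # Read every entry, including a last line without a trailing newline.
    while IFS= read -r bi_entry || [ -n "$bi_entry" ]; do
        # Strip the directory part and compare the bare file name.
        if [ "${bi_entry##*/}" = "$bi_wanted" ]; then
            return 0
        fi
    done < "$bi_index"
    return 1
}
```

For example, `binlog_in_index /var/log/mysql/mysql-bin.index mysql-bin.000210 && echo listed || echo "not listed"`.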
What is the correct way then?
Set CHANGE MASTER TO MASTER_LOG_FILE = 'mysql-bin.000215';
on Master_1?
And what should I do on Master_2 then?
P.S. In fact, I can just drop and recreate the databases on both hosts: there is no sensitive data at the moment, and this should "reload" the replication process (right?). But out of curiosity, what is wrong with replication here?