I am working on a replication problem that completely baffles me! This client has two MySQL replication clusters on HUGE bare metal HW. See Environment below.
The IO_thread of the slave is WAY behind, several hours or more. Yes, the IO_thread, not the SQL_thread. Why is it so hard to download binlog records that are not all that large and write them to disk? I have tried to find a resource bottleneck, but given the massive HW I cannot find one.
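For clarity, this is how I separate IO_thread lag from SQL_thread lag; a minimal check using the standard SHOW SLAVE STATUS fields in 5.6 (the grep pattern is just the fields I compare):

    # On the slave: Read_Master_Log_Pos trailing the master's position means the
    # IO_thread is behind; Exec_Master_Log_Pos trailing Read_Master_Log_Pos would
    # mean the SQL_thread is behind.
    mysql -e "SHOW SLAVE STATUS\G" | egrep \
      'Slave_IO_Running|Slave_SQL_Running|Master_Log_File|Read_Master_Log_Pos|Relay_Master_Log_File|Exec_Master_Log_Pos|Seconds_Behind_Master'

    # On the master, for comparison:
    mysql -e "SHOW MASTER STATUS"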
The only strange observation is that the slave has 8x the IO OPS of the master, but even that does not really overload the SSD. Packet traces show the SLAVE often setting the TCP window to zero. Why, when there are plenty of resources?
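This is roughly how the traces were captured; a sketch, where eth0, port 3306 and the master address 192.0.2.10 are placeholders for my actual values:

    # On the slave: capture replication traffic and show only segments where the
    # slave advertises a zero receive window (tcp[14:2] is the TCP window field).
    tcpdump -i eth0 -nn 'host 192.0.2.10 and tcp port 3306 and tcp[14:2] = 0'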
Does anyone have ideas about what could be causing this strange behaviour? Why do I have more IO on the slave? What can cause the IO_thread to slow down?
Environment: Both machines: bare metal DELL, MySQL 5.6.30, 12 CPUs, 128 GB mem, datadir on SSD, Net I/F: Emulex 10Gb, ROW-based binlog format
Symptoms:
MASTER: CPU: 67%, 1 processor lightly used; MEM: 70% used, 30% free; IO OPS: ~2500 tps, 30% util on SSD; slave client thread: sending binlog to the slave.
SLAVE: CPU: 40%, 1 processor lightly used; MEM: 70% used, 30% free; IO OPS: ~16000 tps, 70% util on SSD; error counters on the net I/F are 0 (zero); TCP window is often set to 0 on the IO_thread connection; the slave IO_thread is VERY slow and lags by more than an hour!
Another slave on the SAME master has no trouble at all, even though that slave has a much lower HW spec!
The trouble is downloading the master's binlog. Why this insanely high IO rate?
Stopping the slave also stops the IO OPS, so as expected the OPS come from MySQL.
Copying large amounts of data from master to slave over the network (using ncat) shows the expected throughput; see the sketch below for the checks I ran.
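For completeness, these are roughly the checks behind the numbers above; a sketch where the device name sda, port 9999 and the ~10 GB test size are arbitrary choices:

    # Per-device IO rate and utilisation, run on both master and slave.
    iostat -x 1 sda

    # Confirm the IO OPS come from replication: stop only the IO_thread,
    # watch iostat drop back to idle, then start it again.
    mysql -e "STOP SLAVE IO_THREAD"
    mysql -e "START SLAVE IO_THREAD"

    # Raw network throughput test, master -> slave.
    # On the slave:
    ncat -l 9999 > /dev/null
    # On the master (<slave-ip> is a placeholder):
    dd if=/dev/zero bs=1M count=10000 | ncat <slave-ip> 9999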
Other observations:
When the roles are reversed, the problem stays the same.
Another replication cluster with the same HW has no trouble; in that cluster the slave's IO OPS are slightly lower than the master's. That cluster uses STATEMENT-based binlog format.
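Since the binlog format is the only obvious difference from the healthy cluster, this is how I compare the binlog/relay-log settings on both clusters; a sketch, and the variable list is just the settings I thought could drive extra slave-side writes:

    # Run on the master and slave of both clusters and diff the output.
    mysql -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN
              ('binlog_format','binlog_row_image','sync_binlog','sync_relay_log',
               'sync_relay_log_info','sync_master_info',
               'relay_log_info_repository','master_info_repository')"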