Be cautious when using virtualized systems with your database

A customer raised a support case with a problem on his Master-Master set-up. The second Master claimed to have a problem:

master2> SHOW SLAVE STATUS\G
                ...
    Master_Log_File: master1-bin.000014
Read_Master_Log_Pos: 97975045
   Slave_IO_Running: No
  Slave_SQL_Running: Yes
Exec_Master_Log_Pos: 97975045
      Last_IO_Errno: 1236
      Last_IO_Error: Got fatal error 1236 from master when reading data from
                     binary log: 'Client requested master to start replication
                     from impossible position'

What has happened?

When we look at the actual Master's binary logs, we see the following situation:

master1> SHOW BINARY LOGS;
+--------------------+-----------+
| Log_name           | File_size |
+--------------------+-----------+
| master1-bin.000013 |  68757710 |
| master1-bin.000014 |  97973131 |
| master1-bin.000015 |    626440 |
+--------------------+-----------+

That looks really odd: the Slave wants to read from a position beyond the end of the Master's binary log! It asks for position 97975045, but master1-bin.000014 is only 97973131 bytes long.

Looking at the Master's binary log we cannot see anything suspicious: binary log #14 was closed and rotated properly, and the size of the file matches exactly the size shown in the output above.

When we asked the customer in more detail what had happened, he told us that both Masters had crashed because their SAN had a problem and all virtual machines went down.

Master 1 was located on the SAN, with the volume mounted through VMware, whereas Master 2 had direct-attached disks mounted directly into the VM.

So it looks like the active Master 1 lost some data during the crash which had already arrived on Master 2. After everything came back online, the application continued to write to Master 1.

So Master 1 was missing some data which were present on Master 2, and in addition Master 1 had some data which had not yet been replicated to Master 2 because replication had broken.

To make the replication work again we first pointed Master 2 to the beginning of Master 1's next binary log file:

master2> CHANGE MASTER TO master_log_file='master1-bin.000015', master_log_pos=4;

Then we started the Slave on Master 2 again. The replication caught up within seconds without any further problems. Fortunately, the application was built in a way that replication could just continue without running into conflicts.
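
For completeness, the whole sequence on Master 2 looks roughly like this (a sketch only; CHANGE MASTER TO requires the slave threads to be stopped first):

master2> STOP SLAVE;
master2> CHANGE MASTER TO
           master_log_file='master1-bin.000015',
           master_log_pos=4;
master2> START SLAVE;
-- Verify: Slave_IO_Running and Slave_SQL_Running should both say Yes again
master2> SHOW SLAVE STATUS\G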

Now we had the situation that the Master-Master replication was working again, the application was running fine on Master 1, we knew for sure that Master 2 held more data than Master 1, and we were not 100% sure whether Master 2 itself was missing some data.

Luckily there is the Maatkit [1] toolkit, which provides the two scripts mk-table-checksum [2] and mk-table-sync [3]. These scripts help in exactly such a situation.

To find the differences we ran the following on Master 1:

mk-table-checksum --chunk-size=100000 \
  --create-replicate-table --replicate=test.checksum --empty-replicate-table \
  u=root,p=secret,h=localhost,P=3306
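
This writes one row per chunk of 100,000 rows into the test.checksum table on Master 1. Because the checksum statements go through the binary log, they are re-executed on Master 2 against its own data (assuming statement-based replication), so the table on the Slave ends up holding both the Master's and the Slave's checksums side by side. To get a feeling for what was written, one can peek at the table (a sketch assuming the default Maatkit replicate table columns):

-- One row per (db, tbl, chunk); this_crc/this_cnt are the local values,
-- master_crc/master_cnt are the values computed on the Master
master1> SELECT db, tbl, chunk, this_cnt, this_crc, master_cnt, master_crc
         FROM test.checksum
         LIMIT 3;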

Then we had to wait until everything had replicated to Master 2, for example until Seconds_Behind_Master in SHOW SLAVE STATUS was back to 0. When Master 2 had caught up, we executed the following on Master 2 (the Slave):

mk-table-checksum --replicate=test.checksum --replicate-check=2 \
  u=root,h=localhost,P=3306,p=secret

And got the following output:

Differences on P=3306,h=master2
DB         TBL          CHUNK CNT_DIFF CRC_DIFF BOUNDARIES
sales      accounting       6        3        1 `account_id` >= 1694287 AND
                                                `account_id` < 1976668
sales      accounting       7        7        1 `account_id` >= 1976668 AND
                                                `account_id` < 2259049
sales      notification     0       -1        0 1=1
monitoring server_export    5        8        1 `server_date` >= "2011-06-01"

Now we can see that we have inconsistencies in 3 different tables in 2 different schemata, with 18 rows affected in total.
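
The same differences can also be pulled straight out of the checksum table on Master 2 with a query along these lines (a sketch, again assuming the default Maatkit replicate table layout; this is roughly what --replicate-check does for you):

-- Chunks where the Slave's row count or checksum deviates from the Master's
master2> SELECT db, tbl, chunk, boundaries
         FROM test.checksum
         WHERE this_cnt <> master_cnt
            OR this_crc <> master_crc
            OR ISNULL(this_crc) <> ISNULL(master_crc);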

To sync the tables again, you can run the following commands. Make sure that you run them against the right Master:

mk-table-sync --sync-to-master --print \
h=localhost,u=root,p=secret,P=3306,D=sales,t=accounting
mk-table-sync --sync-to-master --execute \
h=localhost,u=root,p=secret,P=3306,D=sales,t=accounting

If you run it on the wrong side, it will suggest a DELETE instead of a REPLACE.
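
With --print, mk-table-sync only prints the statements it would run, so you can sanity-check them before using --execute. On the correct side the output consists of REPLACE statements roughly of the following shape; on the wrong side you would see DELETEs instead (column names and values here are made up for illustration, they are not the actual output from this case):

-- Correct side: missing or differing rows are rewritten with REPLACE (idempotent)
REPLACE INTO `sales`.`accounting` (`account_id`, `amount`) VALUES (1694301, 42.50);
-- Wrong side: the extra rows would simply be deleted, which is not what we want here
DELETE FROM `sales`.`accounting` WHERE `account_id` = 1694301 LIMIT 1;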

We found that the mk-table-sync script had some problems with FLOAT values, resulting in empty REPLACE statements. Those rows we fixed manually. Luckily it was only a handful of rows and not zillions.
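
Fixing such a row manually just means issuing the REPLACE yourself on the server that holds the correct data, so that the change flows through replication to the other Master. A hypothetical example (table and column names made up, not the actual statements used in this case):

-- Run on the Master that has the correct row; REPLACE is a no-op there
-- and inserts/overwrites the row on the other Master via replication
REPLACE INTO `sales`.`accounting` (`account_id`, `exchange_rate`)
VALUES (1976712, 1.3627);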

Learnings:

  • Be careful with virtualization solutions. In some situations they can swallow your precious data.
  • A SAN can be a Single Point of Failure (SPoF). When it goes down you lose all your virtual instances!

Comments

That's why I rant against VMware, Xen and so on \o/ They tend to cache fsyncs. So I suspect the position was there but never synced, so with the crash it was gone. But the Slave already knew the position :)
Erkan Yanar

I've seen this happen a lot when the master has crashed and sync_binlog isn't turned on.
mikeh

I have seen this issue on physical servers as well. sync_binlog can help to avoid such problems.
Wasif