News

MySQL table Point-in-Time-Recovery from mysqldump backup

Shinguz - Sun, 2015-01-25 19:42
Taxonomy upgrade extras: Backup, Restore, Recovery, mysqldump, point-in-time-recovery, pitr

Sometimes we face the situation where we have a full MySQL database backup done with mysqldump and then have to restore and recover just one single table out of our huge mysqldump file.
Furthermore, our mysqldump backup was taken hours ago, so we want to recover all changes made to that table since the backup was taken, up to a certain point in time.

In this blog article we cover all the steps needed to achieve this goal for MySQL and MariaDB.

Recommendation: It is recommended to do these steps on a testing system and then dump and restore your table back to the production system. If you do it directly on your production system, you have to know exactly what you are doing...
Furthermore, this process should be tested carefully and regularly to get familiar with it and to ensure your backup/restore/recovery procedure works properly.

The table we want to recover is called test.test and comes from our backup full_dump.sql.gz. As a first step we have to restore this table to our test database with the following command:

shell> zcat full_dump.sql.gz | extract_table.py --database=test --table=test | mysql -u root

The script extract_table.py is part of the FromDual Recovery Manager to extract one single table from a mysqldump backup.
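
If the FromDual tool is not at hand, a rough equivalent can be sketched with sed, relying on the table-delimiting comments that mysqldump writes. This is only a hypothetical one-liner and should be verified against your own dump before trusting it (it also assumes the table name is unique within the dump):

shell> zcat full_dump.sql.gz | sed -n '/^-- Table structure for table `test`/,/^-- Table structure for table `/p' | mysql -u root test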

As a next step we have to extract the binary log file and its position where to start recovery from out of our dump:

shell> zcat full_dump.sql.gz | head -n 25 | grep CHANGE
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000216', MASTER_LOG_POS=1300976;

Then we have to find out where we want to stop our Point-in-Time-Recovery. The need for recovery is possibly due to a TRUNCATE TABLE command or a similar operation executed on the wrong system, or it is just a point in time somebody has asked us to recover to. The position to stop at can be found as follows:

shell> mysqlbinlog -v mysql-bin.000216 | grep -B5 TRUNCATE --color
#150123 19:53:14 server id 35622  end_log_pos 1302950 CRC32 0x24471494  Xid = 3803
COMMIT/*!*/;
# at 1302950
#150123 19:53:14 server id 35622  end_log_pos 1303036 CRC32 0xf9ac63a6  Query  thread_id=54  exec_time=0  error_code=0
SET TIMESTAMP=1422039194/*!*/;
TRUNCATE TABLE test

And as a last step we have to apply all the changes from the binary log to our testing database:

shell> mysqlbinlog --disable-log-bin --database=test --start-position=1300976 --stop-position=1302950 mysql-bin.000216 | mysql -u root --force
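
If only a point in time (rather than a binary log position) is known, the same replay can be bounded with mysqlbinlog's --stop-datetime (and --start-datetime) options instead; the timestamp below is purely illustrative:

shell> mysqlbinlog --disable-log-bin --database=test --start-position=1300976 --stop-datetime='2015-01-23 19:53:13' mysql-bin.000216 | mysql -u root --force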

Now the table test.test is recovered to the wanted point in time and we can dump and restore it to its final location back to the production database.

shell> mysqldump --user=root --host=testing test test | mysql --user=root --host=production test

This process has been tested on MySQL 5.1.73, 5.5.38, 5.6.22 and 5.7.5 and also on MariaDB 10.0.10 and 10.1.0.

FromDual Backup and Recovery Manager for MySQL 1.2.1 has been released

FromDual.en - Thu, 2015-01-22 21:12

FromDual has the pleasure to announce the release of the new version 1.2.1 of the popular Backup and Recovery Manager for MySQL and MariaDB (fromdual_brman).

You can download the FromDual Backup and Recovery Manager from here.

In the inconceivable case that you find a bug in the Backup and Recovery Manager please report it to our Bugtracker.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Upgrade from 1.2.0 to 1.2.1

# cd ${HOME}/product
# tar xf /download/fromdual_brman-1.2.1.tar.gz
# rm -f fromdual_brman
# ln -s fromdual_brman-1.2.1 fromdual_brman
Changes in FromDual Backup Manager 1.2.1

It contains mainly one very critical bug fix, one bug fix for using the FromDual Backup Manager in combination with MariaDB and xtrabackup, and several minor bug fixes and small improvements.

We STRONGLY RECOMMEND to upgrade immediately to v1.2.1 if you are using FromDual Backup Manager v1.2.0-RC1 or newer. In those versions, errors of mysqldump in combination with the --direct-compress option are not handled correctly and a return code of 0 may be returned even though the backup failed. This leads to incomplete backups and to loss of data during a restore.

You can verify your current FromDual Backup Manager version with the following command:

fromdual_bman --version
  • Merged myEnv.inc with the version from MyEnv.
  • Bug with MariaDB using xtrabackup fixed (PM).
  • Lock file handling improved.
  • Very important: max_allowed_packet bug fixed which was introduced in v1.2.0-RC1.
  • Schema names with special characters are now handled properly.
  • Log directory is automatically created if it does not exist.
  • Error comment for --blocking-backup improved.
Taxonomy upgrade extras: Backup, Restore, Recovery, pitr, fromdual_brman, release, brman, bman, rman

FromDual Backup Manager for MySQL 1.2.1 has been released

FromDual.en - Thu, 2015-01-22 21:12
Taxonomy upgrade extras: Backup, Restore, Recovery, fromdual_brman, mysql_bman

FromDual has the pleasure to announce the release of the new version 1.2.1 of the popular Backup Manager for MySQL and MariaDB (fromdual_bman).

You can download the FromDual Backup Manager from here.

In the inconceivable case that you find a bug in the Backup Manager please report it to our Bugtracker.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Upgrade from 1.2.0 to 1.2.1

# cd ${HOME}/product
# tar xf /download/fromdual_brman-1.2.1.tar.gz
# rm -f fromdual_brman
# ln -s fromdual_brman-1.2.1 fromdual_brman
Changes in FromDual Backup Manager 1.2.1

It contains mainly one very critical bug fix, one bug fix for using the FromDual Backup Manager in combination with MariaDB and xtrabackup, and several minor bug fixes and small improvements.

We STRONGLY RECOMMEND to upgrade immediately to v1.2.1 if you are using FromDual Backup Manager v1.2.0-RC1 or newer. In those versions, errors of mysqldump in combination with the --direct-compress option are not handled correctly and a return code of 0 may be returned even though the backup failed. This leads to incomplete backups and to loss of data during a restore.

You can verify your current FromDual Backup Manager version with the following command:

fromdual_bman --version
  • Merged myEnv.inc with the version from MyEnv.
  • Bug with MariaDB using xtrabackup fixed (PM).
  • Lock file handling improved.
  • Very important: max_allowed_packet bug fixed which was introduced in v1.2.0-RC1.
  • Schema names with special characters are now handled properly.
  • Log directory is automatically created if it does not exist.
  • Error comment for --blocking-backup improved.

Introducing Myself: Jörg Brühe

Jörg Brühe - Mon, 2015-01-19 20:56
For some time already, FromDual's "Our Team" page lists me, and it even reveals that I joined in September, 2014. Also for some time, the list of FromDual blogs contains an entry "Jörg's Blog", but it doesn't lead to any entries. It is high time to fix this and create entries, starting with an introduction of myself.

Often, in such introductions people use the phrase of "the new kid on the block". I won't. If I am to use those words, I will arrange them as "the kid on the new block". The reason is that I don't feel as a new kid in the MySQL village (or is it a city?), let alone in DBMS country.

Ever since I left university (Technical University of Berlin, Germany), I have been involved in SQL DBMS development. After my previous product's team had been dissolved in Berlin and maintenance moved to Riga, Latvia, as a cost-cutting measure, I joined MySQL AB in 2004 as a member of the Build Team. Some of you will remember my name from MySQL release announcements, others may have seen it in bug reports, newsgroup entries, or other MySQL communication. Together with the other MySQL colleagues, I got acquired by Sun in 2008 and then by Oracle in 2010.

In 2012, I decided I wanted a culture change, from a highly (and very strictly) organized corporation to a smaller, more flexible entity. That brought me to an internet site, where I worked as a DBA, concentrating on their (more than 100) MySQL installations. It was very interesting to view MySQL from the DBA side now; I got a big load of new experiences and learned a lot about using, running, and automating MySQL (rather than building and testing it).

That might have continued, but as usual the world is changing, and my DBA job was no exception: Company strategy was to shift those tasks more to the various application teams and to do without a central point for MySQL. It is a sign how far MySQL knowledge has spread, and how little effort may be sufficient to run MySQL, that this worked. OTOH, this invalidated the assumptions on which my job was based, and I did not really fit the alternatives left in that company.

In that situation, my contact to FromDual got renewed, and both the company and I felt we should be a good match. Until now, there is no indication we were wrong: I like the work, I feel I'm productive and make customers happy, the tasks are manifold and interesting (so interesting that writing the first blog post got delayed quite a bit), and I expect this will continue.

"Ad multos annos!"

Impacts of max_allowed_packet size problems on your MySQL database

Shinguz - Sun, 2015-01-18 11:18
Taxonomy upgrade extras: max_allowed_packet, connection, Backup, Restore, dump

We recently ran into some trouble with max_allowed_packet size problems during backups with the FromDual Backup/Recovery Manager, so I investigated the symptoms of such problems a bit further.

Read more about: max_allowed_packet.

A general rule for max_allowed_packet size to avoid problems is: All clients and the server should have set the same value for max_allowed_packet size!
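
To follow this rule, the value can be aligned in the configuration file for the server and for the client tools. This is only an illustrative sketch; 16M is an arbitrary value and has to be chosen larger than your biggest expected packet/row:

# /etc/my.cnf (illustrative values)
[mysqld]
max_allowed_packet = 16M

[mysql]
max_allowed_packet = 16M

[mysqldump]
max_allowed_packet = 16M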

I prepared some data for the test which looked as follows:

mysql> SELECT id, LEFT(data, 30), LENGTH(data), ts FROM test;
+----+--------------------------------+--------------+------+
| id | left(data, 30)                 | length(data) | ts   |
+----+--------------------------------+--------------+------+
|  1 | Anhang                         |            6 | NULL |
|  2 | Anhang                         |            6 | NULL |
|  3 | Anhangblablablablablablablabla |      2400006 | NULL |
|  4 | Anhang                         |            6 | NULL |
+----+--------------------------------+--------------+------+

max_allowed_packet was then set to a value that was too small:

mysql> SHOW GLOBAL VARIABLES WHERE variable_name = 'max_allowed_packet';
+--------------------+---------+
| Variable_name      | Value   |
+--------------------+---------+
| max_allowed_packet | 1048576 |
+--------------------+---------+

The first test was to retrieve the too big row:

mysql> SELECT * FROM test WHERE id = 3;
ERROR 2020 (HY000): Got packet bigger than 'max_allowed_packet' bytes

mysql> SELECT CURRENT_USER();
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    6
Current database: test

We got an error message AND we were disconnected from the server. This is indicated by the message MySQL server has gone away, which is basically wrong: it was we who were disconnected, the server has not died or anything similar in this case.

A further symptom is that we get an entry in the MySQL error log about this incident:

[Warning] Aborted connection 3 to db: 'test' user: 'root' host: 'localhost' (Got an error writing communication packets)

So watching carefully such error messages in your MySQL error log with the script check_error_log_mysql.pl from our Nagios/Icinga plugins would be a good idea...

The mysqldump utility basically does the same as a SELECT command so I tried this out and got the same error:

shell> mysqldump -u root test > /tmp/test_dump.sql
mysqldump: Error 2020: Got packet bigger than 'max_allowed_packet' bytes when dumping table `test` at row: 2

And again we get an error message in the error log! This is also a good indicator to detect whether a backup made with mysqldump has failed.

To get a proper dump we have to configure the mysqldump utility properly:

shell> mysqldump --max-allowed-packet=5000000 -u root test > /tmp/test_dump.sql

After the backup we tried to restore the data:

shell> mysql -u root test < /tmp/test_dump.sql
ERROR 2006 (HY000) at line 40: MySQL server has gone away

Again we got an error on the command line and in the MySQL error log:

[Warning] Aborted connection 11 to db: 'test' user: 'root' host: 'localhost' (Got a packet bigger than 'max_allowed_packet' bytes)

and further the data are only partially loaded:

mysql> SELECT * FROM test;
+----+--------+------+
| id | data   | ts   |
+----+--------+------+
|  1 | Angang | NULL |
|  2 | Angang | NULL |
+----+--------+------+
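
To let the restore go through, the server-side limit has to be raised as well before re-running the import. A hedged example (the value is arbitrary and should exceed your biggest row; it can also be made permanent in my.cnf):

mysql> SET GLOBAL max_allowed_packet = 16777216;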

Another symptom we can see here is that the MySQL status variable aborted_clients is increased in all 3 situations:

mysql> SHOW GLOBAL STATUS WHERE variable_name = 'aborted_clients';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| Aborted_clients | 10    |
+-----------------+-------+

One positive aspect is that with MySQL 5.7.5 the first 2 symptoms do not appear any more...

Further information can be found here: Communication Errors and Aborted Connections.

Avoid temporary disk tables with MySQL

Shinguz - Fri, 2014-12-19 07:38
Taxonomy upgrade extras: temporary table, disk, select, query tuning

For processing SELECT queries MySQL sometimes needs the help of temporary tables. These temporary tables can be created either in memory or on disk.

The number of creations of such temporary tables can be found with the following command:

mysql> SHOW GLOBAL STATUS LIKE 'created_tmp%tables';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 4     |
| Created_tmp_tables      | 36    |
+-------------------------+-------+
There are 2 different reasons why MySQL is creating a temporary disk table instead of a temporary memory table:
  • The result is bigger than the smaller one of the MySQL variables max_heap_table_size and tmp_table_size.
  • The result contains columns of type BLOB or TEXT.
In the following example we can see how the temporary disk table can be avoided without changing the column types:

mysql> CREATE TABLE test (
         id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
       , data TEXT
       , type TINYINT UNSIGNED
       );

mysql> INSERT INTO test VALUES (NULL, 'State is green', 1), (NULL, 'State is green', 1)
                             , (NULL, 'State is red', 3), (NULL, 'State is red', 3)
                             , (NULL, 'State is red', 3), (NULL, 'State is orange', 2);

mysql> EXPLAIN SELECT data, COUNT(*) FROM test GROUP BY data;
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
|  1 | SIMPLE      | test  | ALL  | NULL          | NULL | NULL    | NULL |    6 | Using temporary; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+

mysql> SHOW SESSION STATUS LIKE 'created_tmp%tables';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 0     |
| Created_tmp_tables      | 3     |
+-------------------------+-------+

mysql> SELECT data, COUNT(*) FROM test GROUP BY data;
+-----------------+----------+
| data            | count(*) |
+-----------------+----------+
| State is green  |        2 |
| State is orange |        1 |
| State is red    |        3 |
+-----------------+----------+

mysql> SHOW SESSION STATUS LIKE 'created_tmp%tables';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 1     |
| Created_tmp_tables      | 4     |
+-------------------------+-------+

mysql> SELECT SUBSTR(data, 1, 32), COUNT(*) FROM test GROUP BY SUBSTR(data, 1, 32);

mysql> SHOW SESSION STATUS LIKE 'created_tmp%tables';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 1     |
| Created_tmp_tables      | 5     |
+-------------------------+-------+

This method can be used if changing the table structure from TEXT to VARCHAR or the use of a RAM disk are not possible solutions.
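
For the first reason listed above (the result exceeding the configured limits) the in-memory limits themselves can be raised instead; note that this does not help for BLOB or TEXT columns. A hedged example, where both variables should be raised because the smaller of the two is the effective limit:

mysql> SET GLOBAL tmp_table_size      = 64*1024*1024;
mysql> SET GLOBAL max_heap_table_size = 64*1024*1024;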

Making HAProxy High Available for MySQL Galera Cluster

Shinguz - Sun, 2014-12-14 18:37
Taxonomy upgrade extras: HAProxy, load balancer, Galera Cluster, VIP, virtual IP, High Availability, ha

After properly installing and testing a Galera Cluster we see that the set-up is not finished yet. It needs something in front of the Galera Cluster that balances the load over all nodes.
So we install a load balancer in front of the Galera Cluster. Typically HAProxy is chosen for this purpose nowadays. But then we find that the whole set-up is still not highly available in case the load balancer fails or dies. So we need a second load balancer for high availability.
But how should we properly fail over when the HAProxy load balancer dies? For this purpose we put a Virtual IP (VIP) in front of the HAProxy load balancer pair. The Virtual IP is controlled and failed over by Keepalived.

Installation of HAProxy and Keepalived

First some preparations: For installing socat we need the repoforge repository:

shell> cd /tmp
shell> wget http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.3-1.el6.rf.x86_64.rpm
shell> yum localinstall rpmforge-release-0.5.3-1.el6.rf.x86_64.rpm
shell> yum update
shell> yum install socat

Then we can start installing HAProxy and Keepalived:

shell> yum install haproxy keepalived
shell> chkconfig haproxy on
shell> chkconfig keepalived on

We can check the installed HAProxy and Keepalived versions as follows:

shell> haproxy -v
HA-Proxy version 1.5.2 2014/07/12

shell> keepalived --version
Keepalived v1.2.13 (10/15,2014)
Configuration of HAProxy

More details can be found in the HAProxy documentation.

shell> cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak
shell> cat << _EOF >/etc/haproxy/haproxy.cfg
#
# /etc/haproxy/haproxy.cfg
#

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------

global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events. This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #    file. A line like the following can be added to
    #    /etc/sysconfig/syslog
    #
    #    local2.*    /var/log/haproxy.log
    #
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     1020            # See also: ulimit -n
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats.sock mode 600 level admin
    stats timeout 2m

#---------------------------------------------------------------------
# common defaults that all the 'frontend' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------

defaults
    mode    tcp
    log     global
    option  dontlognull
    option  redispatch
    retries 3
    timeout queue   45s
    timeout connect 5s
    timeout client  1m
    timeout server  1m
    timeout check   10s
    maxconn 1020

#---------------------------------------------------------------------
# HAProxy statistics backend
#---------------------------------------------------------------------

listen haproxy-monitoring *:80
    mode    http
    stats   enable
    stats   show-legends
    stats   refresh   5s
    stats   uri       /
    stats   realm     Haproxy\ Statistics
    stats   auth      monitor:AdMiN123
    stats   admin     if TRUE

frontend haproxy1                   # change on 2nd HAProxy
    bind    *:3306
    default_backend galera-cluster

backend galera-cluster
    balance roundrobin
    server  nodeA 192.168.1.61:5201 maxconn 151 check
    server  nodeB 192.168.1.61:5202 maxconn 151 check
    server  nodeC 192.168.1.61:5203 maxconn 151 check
_EOF
Starting and testing HAProxy

The HAProxy can be started as follows:

shell> service haproxy start

and then be checked either over the socket:

shell> socat /var/lib/haproxy/stats.sock readline
prompt
> show info
> show stat
> help

or over your favourite web browser, entering the username and password (monitor:AdMiN123) specified in the configuration file above.

To check the application over the load balancer we can run the following command:

shell> mysql --user=app --password=secret --host=192.168.1.38 --port=3306 --exec="SELECT @@wsrep_node_name;"
+-------------------+
| @@wsrep_node_name |
+-------------------+
| Node C            |
+-------------------+

shell> mysql --user=app --password=secret --host=192.168.1.38 --port=3306 --exec="SELECT @@wsrep_node_name;"
+-------------------+
| @@wsrep_node_name |
+-------------------+
| Node A            |
+-------------------+

shell> mysql --user=app --password=secret --host=192.168.1.38 --port=3306 --exec="SELECT @@wsrep_node_name;"
+-------------------+
| @@wsrep_node_name |
+-------------------+
| Node B            |
+-------------------+
Configuration of a Virtual IP (VIP) with Keepalived

Now we have 2 HAProxy load balancers. But what happens if one of them fails? We do not want to have to reconfigure our application to get it working properly again; the fail-over should happen automatically. For this we need a Virtual IP which automatically fails over.

More details can be found in the Keepalived documentation and the Keepalived user guide.

shell> cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
cat << _EOF >/etc/keepalived/keepalived.conf
#
# /etc/keepalived/keepalived.conf
#

global_defs {

  notification_email {

    remote-dba@fromdual.com
    root@localhost
  }
  # Change email from on lb2:
  notification_email_from lb1@haproxy1
  router_id HAPROXY
}

vrrp_script chk_haproxy {

  script "killall -0 haproxy"
  interval 2
  weight 2
}

vrrp_instance GALERA_VIP {

  interface eth1
  virtual_router_id 42
  # Higher priority on other node
  priority 101   # 102
  advert_int 1
  # notify "/usr/local/bin/VRRP-notification.sh"

  virtual_ipaddress {
    192.168.1.99/32 dev eth1
  }

  track_script {
    chk_haproxy
  }

  authentication {
    auth_type PASS
    auth_pass secret
  }
}
_EOF
Starting and testing Keepalived

To test Keepalived we can run the following command:

shell> keepalived -f /etc/keepalived/keepalived.conf --dont-fork --log-console --log-detail
^C

To finally start it, the following command will do:

shell> service keepalived start

To check the Virtual IP the following command will help:

shell> ip addr show eth1

And then we can check our application over the VIP:

shell> mysql --user=app --password=secret --host=192.168.1.99 --port=3306 --exec="SELECT @@wsrep_node_name;"
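
To verify that the fail-over actually works, a simple (hypothetical) test is to stop HAProxy on the node currently holding the VIP and watch the address move:

shell> service haproxy stop    # chk_haproxy now fails on this node
shell> ip addr show eth1       # the VIP 192.168.1.99 should have disappeared here and show up on the other node
shell> mysql --user=app --password=secret --host=192.168.1.99 --port=3306 --exec="SELECT @@wsrep_node_name;"

Afterwards HAProxy can be started again with service haproxy start and the VIP should move back according to the configured priorities.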

failed MySQL DDL commands and Galera replication

Shinguz - Tue, 2014-12-09 15:45
Taxonomy upgrade extras: galera, replication, DDL, TOI, RSU

We have recently seen a case where the following command was executed on a Galera Cluster node:

SQL> GRANT SUPER ON userdb.* TO root@127.0.0.111;
ERROR 1221 (HY000): Incorrect usage of DB GRANT and GLOBAL PRIVILEGES
2014-12-09 14:53:55 7457 [Warning] Did not write failed 'GRANT SUPER ON userdb.* TO root@127.0.0.111' into binary log while granting/revoking privileges in databases.
2014-12-09 14:53:55 7457 [ERROR] Slave SQL: Error 'Incorrect usage of DB GRANT and GLOBAL PRIVILEGES' on query. Default database: ''. Query: 'GRANT SUPER ON userdb.* TO root@127.0.0.111', Error_code: 1221
2014-12-09 14:53:55 7457 [Warning] WSREP: RBR event 1 Query apply warning: 1, 17
2014-12-09 14:53:55 7457 [Warning] WSREP: Ignoring error for TO isolated action: source: c5e54ef5-7faa-11e4-97b0-5e5c695f08a5 version: 3 local: 0 state: APPLYING flags: 65 conn_id: 4 trx_id: -1 seqnos (l: 4, g: 17, s: 15, d: 15, ts: 113215863294782)

According to the error message it looks like this command was executed in Total Order Isolation (TOI) mode rather than as a Rolling Schema Upgrade (RSU).
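
To verify in which schema upgrade mode a node is currently operating, the wsrep_OSU_method variable can be checked (TOI is the default); a minimal sketch:

SQL> SHOW GLOBAL VARIABLES LIKE 'wsrep_OSU_method';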

The error log message was written, and a GRA_*.log file was created, only on the nodes which did NOT receive this wrong command directly.

Analysis of the GRA_*.log (failed transactions) files:

hexdump -C GRA_2_16.log
00000000  f3 fe 86 54 02 53 14 00  00 76 00 00 00 76 00 00  |...T.S...v...v..|
00000010  00 00 00 04 00 00 00 00  00 00 00 00 00 00 2a 00  |..............*.|
00000020  00 00 00 00 00 01 00 00  00 40 00 00 00 00 06 03  |.........@......|
00000030  73 74 64 04 21 00 21 00  08 00 0b 04 72 6f 6f 74  |std.!.!.....root|
00000040  09 6c 6f 63 61 6c 68 6f  73 74 00 67 72 61 6e 74  |.localhost.grant|
00000050  20 53 55 50 45 52 20 6f  6e 20 75 73 65 72 64 62  | SUPER on userdb|
00000060  2e 2a 20 74 6f 20 72 6f  6f 74 40 31 32 37 2e 30  |.* to root@127.0|
00000070  2e 30 2e 31 31 31                                 |.0.111          |
dd if=bin-log.000001 of=binlog.header bs=1 count=120
cat binlog.header GRA_2_17.log > GRA_2_17.binlog_events
mysqlbinlog GRA_2_17.binlog_events
...
# at 120
#141209 15:04:54 server id 5201  end_log_pos 118 CRC32 0x3432312e  Query  thread_id=45  exec_time=0  error_code=0
SET TIMESTAMP=1418133894/*!*/;
SET @@session.pseudo_thread_id=4/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1073741824/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
grant SUPER on userdb.* to root@127.0.0.111
/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

It further looks like this command was issued by the connection with ID number 4: conn_id: 4.

How to recover deleted tablespace?

Abdel-Mawla Gharieb - Fri, 2014-11-14 22:56

Sometimes, MySQL tablespace file(s) might be deleted by mistake, e.g. deleting the shared tablespace (ibdata1) or an individual tablespace (table_name.ibd).

In this post I will show you how to recover those files (on Linux) under only one condition: the MySQL service must still be running. If the MySQL service was stopped after deleting that file, this method will not work, so it is extremely important to act as quickly as possible to avoid data loss.

The following is a simple table creation (innodb_file_per_table is enabled) and the records count inside that table:

SQL> SHOW CREATE TABLE t\G
*************************** 1. row ***************************
       Table: t
Create Table: CREATE TABLE `t` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=23 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

SQL> SELECT COUNT(*) FROM t;
+----------+
| COUNT(*) |
+----------+
|       22 |
+----------+
1 row in set (0.02 sec)

Now, let's delete the individual tablespace for that table:

shell> rm -rf /var/lib/mysql/test/t.ibd

At this time, we can still select and modify that table!!

SQL> INSERT INTO t VALUES (NULL);
Query OK, 1 row affected (0.00 sec)

SQL> SELECT COUNT(*) FROM t;
+----------+
| COUNT(*) |
+----------+
|       23 |
+----------+
1 row in set (0.00 sec)

To be more accurate, rm does not actually delete the file, rather it removes the directory entry pointing to the file's inode. The inode - and in consequence the file - will be removed only if this is the last reference, but as long as the MySQL server process has the file opened, there is another reference which is the open file handle (that's why the "mysqld" server process must still be running).

To list the opened files we can use the Linux command lsof (we filter the output to get only the deleted tablespace information):

shell> lsof |grep t.ibd
COMMAND  PID    USER   FD    TYPE  DEVICE  SIZE/OFF  NODE     NAME
mysqld   11401  mysql  25uW  REG   7,0     98304     1010691  /var/lib/mysql/test/t.ibd (deleted)

The file has a tag of (deleted) which means that the directory entry pointing to the file's inode was deleted but there is another reference(s) to that inode, otherwise it won't be listed by the above command.
Now the question is, how can we get the on-disk path to that opened file if the directory entry was removed?

We can use the "/proc" interface to the running processes and their file handles by the following formula:

  • File path = /proc/PID/fd/FD-number

According to the above formula and using the output of the "lsof" command, the file we just deleted is located here:

shell> ll /proc/11401/fd/25
lrwx------ 1 mysql mysql 64 Oct 28 16:14 /proc/11401/fd/25 -> /var/lib/mysql/test/t.ibd (deleted)

To make sure that this is the on-disk path for the file we deleted, check the reference: it still points to the original path.
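
If lsof is not available, the same information can be gathered directly from /proc. A hedged sketch (it assumes a single mysqld process is running):

shell> PID=$(pidof mysqld)
shell> ls -l /proc/${PID}/fd | grep deleted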

How can we recover that file??
  • First, we should make sure that no other queries are modifying that table:

    SQL> LOCK TABLE t READ;
    Query OK, 0 rows affected (0.00 sec)

  • Then we copy the data blocks (/proc/11401/fd/25) to a new file (we use the original file path) and change the ownership to the MySQL system user (mysql):

    shell> cp /proc/11401/fd/25 /var/lib/mysql/test/t.ibd
    shell> chown mysql:mysql /var/lib/mysql/test/t.ibd

  • Restart MySQL service (if we didn't restart MySQL service directly after recovering the tablespace all changes on that table will still be redirected to the open file handle not the just recovered copy and thus will be lost after the restart):

    shell> service mysql restart
    ..... SUCCESS!
    ..... SUCCESS!

  • The tablespace is now recovered and we can modify the table normally:

    SQL> SELECT COUNT(*) FROM t;
    +----------+
    | COUNT(*) |
    +----------+
    |       23 |
    +----------+
    1 row in set (0.00 sec)

    SQL> INSERT INTO t VALUES (NULL);
    Query OK, 1 row affected (0.00 sec)

    SQL> INSERT INTO t VALUES (NULL);
    Query OK, 1 row affected (0.00 sec)

    SQL> select COUNT(*) from t;
    +----------+
    | COUNT(*) |
    +----------+
    |       25 |
    +----------+
    1 row in set (0.00 sec)

Notes:

  • We can use the same procedures above to recover the shared tablespace (ibdata1) but we should lock all tables before the recovery process by using the SQL command "FLUSH TABLES WITH READ LOCK;"
  • If the MySQL server had to deal with more files (.frm, .ibd, .MYI, .MYD, ...) than its open_files_limit allows, it might happen that it will close this handle. In that case, the file will also cease to exist, even though the process is still running, and that's why it is extremely important to act as quickly as possible.
  • The same procedure can be used to recover MyISAM files (.MYI and .MYD) but note that the file handle will be released if "FLUSH TABLES;" SQL command was executed.
  • The same procedure can be used as well to recover binary logs, general logs, ... etc but note that the file handle will be released if "FLUSH LOGS;" SQL command was executed.
  • This method can be used to recover any deleted file on Linux, not only MySQL files, as long as the file's inode still has other references (check with lsof).

Real life case:

One of our customers enabled the general query log on his production system. He noticed that the file was continuously growing, and in order not to consume the available free disk space on his server he removed that file with "rm /path/to/general_query.log". However, the available free space was still being consumed while he couldn't see the general log file. The customer thought that the file was deleted, but in fact the file handle was still held open by the MySQL server process.
To get the problem solved we only issued the SQL command "FLUSH LOGS;" - which the customer should have issued after removing the file - then the file handle was closed, thus the inode was deleted and the consumed disk space was freed back to the system.

Taxonomy upgrade extras: innodb, tablespace, ibdata1, general query log

Things you should consider before using GTID

Abdel-Mawla Gharieb - Fri, 2014-11-14 16:50

Global Transaction ID (GTID) is one of the major features introduced in MySQL 5.6 and it provides a lot of benefits.
I talked about the GTID concept, implementation and possible troubleshooting at Percona Live London 2014; you can download the slides from our presentations repository or from my session page at Percona Live.

On the other hand, there are some important things you should consider before deploying GTID in production; I'm going to list them in this blog post.

Migration to GTID replication

It is required to shut down the MySQL service on all servers in the replication setup in order to perform the migration from classic replication (based on binary log file and position) to transaction-based (GTID) replication, which means that the migration process requires downtime.

The online migration to GTID replication is not yet available.
Facebook and Booking.com provided some MySQL patches for this, but they are not yet contained in Oracle's binaries.
So, if you can't afford a downtime during the migration process, then you might not be able to make the change.

Non-transactionally safe statements will raise errors now

It is required to enable the system variable enforce_gtid_consistency on all servers inside the GTID replication setup, which prevents executing non-transactionally safe statements (check the GTID restrictions) like:

  • CREATE TABLE .. SELECT.
  • CREATE TEMPORARY TABLE (inside a transaction).
  • Statements that update non-transactional tables inside a transaction.

So, you will have to fix your application first if it contains any of the above statements before using GTID replication.

MySQL Performance in GTID

It is required to enable the variables log_bin and log_slave_updates on - at least - the slave servers, which affects the performance of those slaves negatively.

So, the performance should be tested very well before the production migration to GTID replication.
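
Taken together, the GTID-related settings mentioned above typically end up in my.cnf roughly like this on every server of the replication setup (an illustrative sketch for MySQL 5.6, not a complete configuration):

[mysqld]
gtid_mode                = ON
enforce_gtid_consistency = 1
log_bin                  = binlog
log_slave_updates        = 1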

mysql_upgrade script

The mysql_upgrade script's problem when executed on a server having gtid_mode=on has been fixed since MySQL 5.6.7, but it is still not recommended to execute mysql_upgrade when gtid_mode=on as it might change system tables that are using MyISAM, which is non-transactional.

Errant transactions!

Transactions which are executed on a slave apart from the replication transactions (i.e. not executed on the master) are called "errant transactions"; those transactions cause trouble if that slave is later promoted to be the new master in a fail-over process.
Once the other slaves connect to the new master, they send the value of gtid_executed and the master in turn checks those values, compares them with its own gtid_executed set and sends back all missing transactions (the errant transactions) to the slaves, which leads to one of the following problems:
  • If those transactions still exist in the new master's binary log files, they will be replicated to the other slave which was not intentional when those were executed only on the slave (new master).
  • If those transactions no longer exist in the new master's binary log files, replication will break on all slaves.
How to avoid such problem?
  • Choose some slaves to be possible candidates for promotion in case of fail-over. Thus, stand alone transactions (which are not coming from the master) should NOT be executed there.
  • Use one of the MySQL utilities (mysqlfailover or mysqlrpladmin) to find out if there are any errant transactions on the slave before the promotion or not.
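
Besides the MySQL utilities, a manual check is possible with the GTID functions: if the slave's gtid_executed set is not a subset of the master's, the slave has errant transactions. A hedged sketch (the slave's value has to be pasted into the query executed on the master):

slave>  SELECT @@GLOBAL.gtid_executed;
master> SELECT GTID_SUBSET('<gtid_executed value from the slave>', @@GLOBAL.gtid_executed) AS slave_is_subset;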
Filtration on the slave

In some cases we might need to filter replication on the slave(s), i.e. not all table or database changes are propagated to the slave, by using the system variables replicate_ignore_db or replicate_ignore_table. When the slave receives transactions from the master which modify those ignored tables or databases, it simply skips executing them. When the slave is restarted it sends its gtid_executed to the master, the master finds the missing transactions (those for the ignored tables or databases) and sends them back to the slave.

Again, that leads to one of the following two conditions:

  • If those transactions still exist in the master's binary log files, then no problem as the slave will skip executing them again!!
  • If those transactions are no longer there in the master's binary log files, the replication will break on the slave.

Well, the above problem is supposed to be fixed in MySQL 5.6.18 (Bug #70048). The fix injects empty transactions on the slave for those transactions modifying ignored tables or databases instead of just skipping them, so after a slave restart they won't be sent back again from the master.

I listed the above problem although it should be fixed now because I want to mention that having MySQL always updated to the recent release is a good practice to avoid such problems and to get the most bug fixes.

Conclusion

The following are the main things which should be considered before using GTID:

  • Migration from classic replication to transaction-based (GTID) replication requires downtime.
  • Non-transactionally safe statements will not be executed in GTID replication.
  • MySQL performance is a little bit slower in GTID replication, especially on the slaves.
  • mysql_upgrade script might cause troubles on a server having GTID_MODE=ON and it should be tested first.
  • Errant transactions might break the replication in the fail-over process, thus, planning slaves for promotion will avoid falling in such cases.
  • Some GTID bugs are fixed now (like slave filtration issue), thus MySQL server should be updated to the latest version once there is a new release.
  • New bugs are expected to be discovered, so the application should be tested very well with GTID before performing the migration on production.
Taxonomy upgrade extras: GTID, replication

Galera Cluster and XA Transactions

Abdel-Mawla Gharieb - Thu, 2014-10-23 23:47

A few weeks ago, we received an interesting Galera Cluster support case from one of our customers: the application was not working well and they faced a lot of trouble in their Galera Cluster setup.

After some investigation, we found a lot of INSERT queries in state "query end" which lasted for a long time without completing. There were also other queries which had been sleeping for a long time and showed "XA COMMIT" in their info:

SQL> SHOW PROCESSLIST;
27 user host:33214 foodmart Query 14440 sleeping XA COMMIT 0x31302e31312e31332e34372e746d30303336383030303031,0x31302e31312e31332e34372e746d333638,0x

SQL> SHOW ENGINE INNODB STATUS;
TRANSACTIONS
============
---TRANSACTION 2DE71D, ACTIVE 14459 sec
9 lock struct(s), heap size 1248, 1 row lock(s), undo log entries 115
MySQL thread id 27, OS thread handle 0x7fc21a42c700, query id 96187 host host-ip foodmart sleeping
XA COMMIT 0x31302e31312e31332e34372e746d30303336383030303031,0x31302e31312e31332e34372e746d333638,0x41544f4d ONE PHASE

XA stands for eXtended Architecture, and the "XA COMMIT" statement is one of the distributed transaction (XA transactions) commands. XA transactions are clearly NOT supported in Galera Cluster and are one of its known limitations, because of the possible rollback on commit.
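
For context, this is roughly what a distributed transaction looks like on a plain MySQL server (a minimal sketch; the transaction identifier 'trx1' and the table are purely hypothetical):

SQL> XA START 'trx1';
SQL> INSERT INTO foodmart.t1 VALUES (1);   -- any regular statement, hypothetical table
SQL> XA END 'trx1';
SQL> XA PREPARE 'trx1';
SQL> XA COMMIT 'trx1';                     -- or XA COMMIT 'trx1' ONE PHASE directly after XA END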

The following command can be used to check whether XA transactions are being used by your application:

SQL> SHOW GLOBAL STATUS LIKE 'Com_xa%';
+-------------------+---------+
| Variable_name     | Value   |
+-------------------+---------+
| Com_xa_commit     | 2828094 |
| Com_xa_end        | 2828221 |
| Com_xa_prepare    | 0       |
| Com_xa_recover    | 2205697 |
| Com_xa_rollback   | 42      |
| Com_xa_start      | 2828305 |
+-------------------+---------+
6 rows in set (0.00 sec)

There are only two possible solutions for this problem:

  • Get rid of all XA transactions in the application to get Galera Cluster to work.
  • Use another HA solution (active/passive, master/slave, etc.), but not Galera Cluster.

Conclusion

  • XA transactions cannot be supported in Galera Cluster, and this is clearly stated in the Galera Cluster limitations.
  • Before moving to Galera Cluster, it is advisable to go through all its limitations and check whether your application can cope with them.
