You are here
Abdel-Mawla Gharieb
How to recover deleted tablespace?
Sometimes, MySQL tablespace file(s) might be deleted by mistake, e.g. delete the shared tablespace (ibdata1) or an individual tablespace (table_name.ibd).
In this post I will show you how to recover those files (on Linux OS) having only one condition, MySQL service should still be running. If MySQL service stopped after deleting that file, this method will not work, so it is extremely important to act as quick as possible to avoid data loss.
The following is a simple table creation (innodb_file_per_table is enabled) and the records count inside that table:
SQL> SHOW CREATE TABLE t\G *************************** 1. row *************************** Table: t Create Table: CREATE TABLE `t` ( `id` int(11) NOT NULL AUTO_INCREMENT, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=23 DEFAULT CHARSET=latin1 1 row in set (0.00 sec) SQL> SELECT COUNT(*) FROM t; +----------+ | COUNT(*) | +----------+ | 22 | +----------+ 1 row in set (0.02 sec)Now, lets delete the individual tablespace for that table:
shell> rm -rf /var/lib/mysql/test/t.ibdAt this time, we can still select and modify that table!!
SQL> INSERT INTO t VALUES (NULL); Query OK, 1 row affected (0.00 sec) SQL> SELECT COUNT(*) FROM t; +----------+ | COUNT(*) | +----------+ | 23 | +----------+ 1 row in set (0.00 sec)To be more accurate, rm does not actually delete the file, rather it removes the directory entry pointing to the file's inode. The inode - and in consequence the file - will be removed only if this is the last reference, but as long as the MySQL server process has the file opened, there is another reference which is the open file handle (that's why the "mysqld" server process must still be running).
To list the opened files we can use the Linux command lsof (we filter the output to get only the deleted tablespace information):
shell> lsof |grep t.ibd COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME mysqld 11401 mysql 25uW REG 7,0 98304 1010691 /var/lib/mysql/test/t.ibd (deleted)The file has a tag of (deleted) which means that the directory entry pointing to the file's inode was deleted but there is another reference(s) to that inode, otherwise it won't be listed by the above command.
Now the question is, how can we get the on-disk path to that opened file if the directory entry was removed?
We can use the "/proc" interface to the running processes and their file handles by the following formula:
- File path = /proc/PID/fd/FD-number
According to the above formula and using the output of the "lsof" command, the file we just deleted is located here:
shell> ll /proc/11401/fd/25 lrwx------ 1 mysql mysql 64 Oct 28 16:14 /proc/11401/fd/25 -> /var/lib/mysql/test/t.ibd (deleted)To make sure that this is the on-disk path for the file we deleted, check the reference: it still points to the original path.
How can we recover that file??- First, we should make sure that no other queries are modifying that table: SQL> LOCK TABLE t READ; Query OK, 0 rows affected (0.00 sec)
- Then we copy the data blocks (/proc/11401/fd/25) to a new file (we use the original file path) and change the ownership to the MySQL system user (mysql): shell> cp /proc/11401/fd/25 /var/lib/mysql/test/t.ibd shell> chown mysql:mysql /var/lib/mysql/test/t.ibd
- Restart MySQL service (if we didn't restart MySQL service directly after recovering the tablespace all changes on that table will still be redirected to the open file handle not the just recovered copy and thus will be lost after the restart): shell> service mysql restart ..... SUCCESS! ..... SUCCESS!
- The tablespace is now recovered and we can modify the table normally: SQL> SELECT COUNT(*) FROM t; +----------+ | COUNT(*) | +----------+ | 23 | +----------+ 1 row in set (0.00 sec) SQL> INSERT INTO t VALUES (NULL); Query OK, 1 row affected (0.00 sec) SQL> INSERT INTO t VALUES (NULL); Query OK, 1 row affected (0.00 sec) SQL> select COUNT(*) from t; +----------+ | COUNT(*) | +----------+ | 25 | +----------+ 1 row in set (0.00 sec)
Notes:
- We can use the same procedures above to recover the shared tablespace (ibdata1) but we should lock all tables before the recovery process by using the SQL command "FLUSH TABLES WITH READ LOCK;"
- If the MySQL server had to deal with more files (.frm, .ibd, .MYI, .MYD, ...) than its "open_file_limit", it might happen that it will close this handle. In that case, the file will also cease to exist, even though the process is still running and that's why it is extremely important to act as quick as possible.
- The same procedure can be used to recover MyISAM files (.MYI and .MYD) but note that the file handle will be released if "FLUSH TABLES;" SQL command was executed.
- The same procedure can be used as well to recover binary logs, general logs, ... etc but note that the file handle will be released if "FLUSH LOGS;" SQL command was executed.
- This method can be used to recover any deleted file on Linux not only MySQL files but if the file's inode has other references (lsof).
Real life case:
One of our customers was enabling the general query log on his production system, he noticed that the file was continuously growing and to not to consume the available free disk space on his server he removed that file by "rm /path/to/general_query.log". However, the available free space was still being consumed while he couldn't see the general log file. The customer thought that the file was deleted but in fact, the file handle was still opened by MySQL server process.
To get the problem solved we only issued the SQL command "FLUSH LOGS;" - which the customer should have issued after removing the file - then the file handle was closed, thus the inode was deleted and the consumed disk space freed back to the system.
How to recover deleted tablespace?
Sometimes, MySQL tablespace file(s) might be deleted by mistake, e.g. delete the shared tablespace (ibdata1) or an individual tablespace (table_name.ibd).
In this post I will show you how to recover those files (on Linux OS) having only one condition, MySQL service should still be running. If MySQL service stopped after deleting that file, this method will not work, so it is extremely important to act as quick as possible to avoid data loss.
The following is a simple table creation (innodb_file_per_table is enabled) and the records count inside that table:
SQL> SHOW CREATE TABLE t\G *************************** 1. row *************************** Table: t Create Table: CREATE TABLE `t` ( `id` int(11) NOT NULL AUTO_INCREMENT, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=23 DEFAULT CHARSET=latin1 1 row in set (0.00 sec) SQL> SELECT COUNT(*) FROM t; +----------+ | COUNT(*) | +----------+ | 22 | +----------+ 1 row in set (0.02 sec)Now, lets delete the individual tablespace for that table:
shell> rm -rf /var/lib/mysql/test/t.ibdAt this time, we can still select and modify that table!!
SQL> INSERT INTO t VALUES (NULL); Query OK, 1 row affected (0.00 sec) SQL> SELECT COUNT(*) FROM t; +----------+ | COUNT(*) | +----------+ | 23 | +----------+ 1 row in set (0.00 sec)To be more accurate, rm does not actually delete the file, rather it removes the directory entry pointing to the file's inode. The inode - and in consequence the file - will be removed only if this is the last reference, but as long as the MySQL server process has the file opened, there is another reference which is the open file handle (that's why the "mysqld" server process must still be running).
To list the opened files we can use the Linux command lsof (we filter the output to get only the deleted tablespace information):
shell> lsof |grep t.ibd COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME mysqld 11401 mysql 25uW REG 7,0 98304 1010691 /var/lib/mysql/test/t.ibd (deleted)The file has a tag of (deleted) which means that the directory entry pointing to the file's inode was deleted but there is another reference(s) to that inode, otherwise it won't be listed by the above command.
Now the question is, how can we get the on-disk path to that opened file if the directory entry was removed?
We can use the "/proc" interface to the running processes and their file handles by the following formula:
- File path = /proc/PID/fd/FD-number
According to the above formula and using the output of the "lsof" command, the file we just deleted is located here:
shell> ll /proc/11401/fd/25 lrwx------ 1 mysql mysql 64 Oct 28 16:14 /proc/11401/fd/25 -> /var/lib/mysql/test/t.ibd (deleted)To make sure that this is the on-disk path for the file we deleted, check the reference: it still points to the original path.
How can we recover that file??- First, we should make sure that no other queries are modifying that table: SQL> LOCK TABLE t READ; Query OK, 0 rows affected (0.00 sec)
- Then we copy the data blocks (/proc/11401/fd/25) to a new file (we use the original file path) and change the ownership to the MySQL system user (mysql): shell> cp /proc/11401/fd/25 /var/lib/mysql/test/t.ibd shell> chown mysql:mysql /var/lib/mysql/test/t.ibd
- Restart MySQL service (if we didn't restart MySQL service directly after recovering the tablespace all changes on that table will still be redirected to the open file handle not the just recovered copy and thus will be lost after the restart): shell> service mysql restart ..... SUCCESS! ..... SUCCESS!
- The tablespace is now recovered and we can modify the table normally: SQL> SELECT COUNT(*) FROM t; +----------+ | COUNT(*) | +----------+ | 23 | +----------+ 1 row in set (0.00 sec) SQL> INSERT INTO t VALUES (NULL); Query OK, 1 row affected (0.00 sec) SQL> INSERT INTO t VALUES (NULL); Query OK, 1 row affected (0.00 sec) SQL> select COUNT(*) from t; +----------+ | COUNT(*) | +----------+ | 25 | +----------+ 1 row in set (0.00 sec)
Notes:
- We can use the same procedures above to recover the shared tablespace (ibdata1) but we should lock all tables before the recovery process by using the SQL command "FLUSH TABLES WITH READ LOCK;"
- If the MySQL server had to deal with more files (.frm, .ibd, .MYI, .MYD, ...) than its "open_file_limit", it might happen that it will close this handle. In that case, the file will also cease to exist, even though the process is still running and that's why it is extremely important to act as quick as possible.
- The same procedure can be used to recover MyISAM files (.MYI and .MYD) but note that the file handle will be released if "FLUSH TABLES;" SQL command was executed.
- The same procedure can be used as well to recover binary logs, general logs, ... etc but note that the file handle will be released if "FLUSH LOGS;" SQL command was executed.
- This method can be used to recover any deleted file on Linux not only MySQL files but if the file's inode has other references (lsof).
Real life case:
One of our customers was enabling the general query log on his production system, he noticed that the file was continuously growing and to not to consume the available free disk space on his server he removed that file by "rm /path/to/general_query.log". However, the available free space was still being consumed while he couldn't see the general log file. The customer thought that the file was deleted but in fact, the file handle was still opened by MySQL server process.
To get the problem solved we only issued the SQL command "FLUSH LOGS;" - which the customer should have issued after removing the file - then the file handle was closed, thus the inode was deleted and the consumed disk space freed back to the system.
How to recover deleted tablespace?
Sometimes, MySQL tablespace file(s) might be deleted by mistake, e.g. delete the shared tablespace (ibdata1) or an individual tablespace (table_name.ibd).
In this post I will show you how to recover those files (on Linux OS) having only one condition, MySQL service should still be running. If MySQL service stopped after deleting that file, this method will not work, so it is extremely important to act as quick as possible to avoid data loss.
The following is a simple table creation (innodb_file_per_table is enabled) and the records count inside that table:
SQL> SHOW CREATE TABLE t\G *************************** 1. row *************************** Table: t Create Table: CREATE TABLE `t` ( `id` int(11) NOT NULL AUTO_INCREMENT, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=23 DEFAULT CHARSET=latin1 1 row in set (0.00 sec) SQL> SELECT COUNT(*) FROM t; +----------+ | COUNT(*) | +----------+ | 22 | +----------+ 1 row in set (0.02 sec)Now, lets delete the individual tablespace for that table:
shell> rm -rf /var/lib/mysql/test/t.ibdAt this time, we can still select and modify that table!!
SQL> INSERT INTO t VALUES (NULL); Query OK, 1 row affected (0.00 sec) SQL> SELECT COUNT(*) FROM t; +----------+ | COUNT(*) | +----------+ | 23 | +----------+ 1 row in set (0.00 sec)To be more accurate, rm does not actually delete the file, rather it removes the directory entry pointing to the file's inode. The inode - and in consequence the file - will be removed only if this is the last reference, but as long as the MySQL server process has the file opened, there is another reference which is the open file handle (that's why the "mysqld" server process must still be running).
To list the opened files we can use the Linux command lsof (we filter the output to get only the deleted tablespace information):
shell> lsof |grep t.ibd COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME mysqld 11401 mysql 25uW REG 7,0 98304 1010691 /var/lib/mysql/test/t.ibd (deleted)The file has a tag of (deleted) which means that the directory entry pointing to the file's inode was deleted but there is another reference(s) to that inode, otherwise it won't be listed by the above command.
Now the question is, how can we get the on-disk path to that opened file if the directory entry was removed?
We can use the "/proc" interface to the running processes and their file handles by the following formula:
- File path = /proc/PID/fd/FD-number
According to the above formula and using the output of the "lsof" command, the file we just deleted is located here:
shell> ll /proc/11401/fd/25 lrwx------ 1 mysql mysql 64 Oct 28 16:14 /proc/11401/fd/25 -> /var/lib/mysql/test/t.ibd (deleted)To make sure that this is the on-disk path for the file we deleted, check the reference: it still points to the original path.
How can we recover that file??- First, we should make sure that no other queries are modifying that table: SQL> LOCK TABLE t READ; Query OK, 0 rows affected (0.00 sec)
- Then we copy the data blocks (/proc/11401/fd/25) to a new file (we use the original file path) and change the ownership to the MySQL system user (mysql): shell> cp /proc/11401/fd/25 /var/lib/mysql/test/t.ibd shell> chown mysql:mysql /var/lib/mysql/test/t.ibd
- Restart MySQL service (if we didn't restart MySQL service directly after recovering the tablespace all changes on that table will still be redirected to the open file handle not the just recovered copy and thus will be lost after the restart): shell> service mysql restart ..... SUCCESS! ..... SUCCESS!
- The tablespace is now recovered and we can modify the table normally: SQL> SELECT COUNT(*) FROM t; +----------+ | COUNT(*) | +----------+ | 23 | +----------+ 1 row in set (0.00 sec) SQL> INSERT INTO t VALUES (NULL); Query OK, 1 row affected (0.00 sec) SQL> INSERT INTO t VALUES (NULL); Query OK, 1 row affected (0.00 sec) SQL> select COUNT(*) from t; +----------+ | COUNT(*) | +----------+ | 25 | +----------+ 1 row in set (0.00 sec)
Notes:
- We can use the same procedures above to recover the shared tablespace (ibdata1) but we should lock all tables before the recovery process by using the SQL command "FLUSH TABLES WITH READ LOCK;"
- If the MySQL server had to deal with more files (.frm, .ibd, .MYI, .MYD, ...) than its "open_file_limit", it might happen that it will close this handle. In that case, the file will also cease to exist, even though the process is still running and that's why it is extremely important to act as quick as possible.
- The same procedure can be used to recover MyISAM files (.MYI and .MYD) but note that the file handle will be released if "FLUSH TABLES;" SQL command was executed.
- The same procedure can be used as well to recover binary logs, general logs, ... etc but note that the file handle will be released if "FLUSH LOGS;" SQL command was executed.
- This method can be used to recover any deleted file on Linux not only MySQL files but if the inode files have other references (lsof).
Real life case:
One of our customers was enabling the general query log on his production system, he noticed that the file was continuously growing and to not to consume the available free disk space on his server he removed that file by "rm /path/to/general_query.log". However, the available free space was still being consumed while he couldn't see the general log file. The customer thought that the file was deleted but in fact, the file handle was still opened by MySQL server process.
To get the problem solved we only issued the SQL command "FLUSH LOGS;" - which the customer should have issued after removing the file - then the file handle was closed, thus the inode was deleted and the consumed disk space freed back to the system.
Things you should consider before using GTID
Global Transaction ID (GTID) is one of the major features that were introduced in MySQL 5.6 which provides a lot of benefits.
I have talked about the GTID concept, implementation and possible troubleshooting at Percona Live London 2014, you can download the slides from our presentations repository or from my session at Percona Live.
On the other hand, there are some important things you should consider before deploying GTID in production, I'm going to list them here in this blog post.
Table of Content- Migration to GTID replication
- Non transactionally safe statement will raise errors now
- MySQL Performance in GTID
- mysql_upgrade script
- Errant transactions!
- Filtration on the slave
- Conclusion
It is required to shutdown MySQL service on all servers in the replication setup in order to perform the migration from classic replication (based on binary logs information) to the transaction-based (GTID) replication which means that the migration process requires downtime.
The online migration to GTID replication is not yet available.
Facebook and Booking.com provided some MySQL patches for this, but they are not yet contained in Oracle's binaries.
So, if you can't afford a downtime during the migration process, then you might not be able to make the change.
It is required to enable the system variable (enforce_gtid_consistency) on all servers inside the GTID replication setup which prevent executing the non transactionally safe statements (check GTID restrictions) like:
- CREATE TABLE .. SELECT.
- CREATE TEMPORARY TABLE (inside a transaction).
- Statements that update non-transactional tables inside a transaction.
So, you will have to fix your application first if it contains any of the above statements before using GTID replication.
MySQL Performance in GTIDIt is required to enable the variables (bin_log and log_slave_updates) on - at least - the slave servers which affects the performance on those slaves negatively.
So, the performance should be tested very well before the production migration to GTID replication.
mysql_upgrade scriptThe mysql_upgrade script's problem when executed on a server having gtid_mode=on has been fixed since MySQL 5.6.7, but it is still not recommended to execute mysql_upgrade when gtid_mode=on as it might change system tables that is using MyISAM, which is non transactional.
Errant transactions! Transactions which are executed on a slave apart from the replication transactions (i.e. not executed on the master) are called "Errant transactions", those transactions cause trouble if that slave later is promoted to be a new master in the a fail-over process.Once the other slaves connect to the new master, they send the value of gtid_executed and the master in turn checks those values, compares it with its own gtid_executed set and sends back all missing transactions (the errant transactions) to the slaves which leads to one of the following problems:
- If those transactions still exist in the new master's binary log files, they will be replicated to the other slave which was not intentional when those were executed only on the slave (new master).
- If those transactions do no longer exist in the new master's binary log file, the replication will break on all slaves.
- Choose some slaves to be possible candidates for promotion in case of fail-over. Thus, stand alone transactions (which are not coming from the master) should NOT be executed there.
- Use one of the MySQL utilities (mysqlfailover or mysqlrpladmin) to find out if there are any errant transactions on the slave before the promotion or not.
In some cases we might need to make filtration to the replication on the slave(s) i.e. not all tables' or databases' changes are propagated to the slave by using the system variables (replicate_ignore_db or replicate_ignore_table). When the slave receives transactions from the master which modify those ignored tables or databases, it simply skips executing them and when the slave restarted it sends the gtid_excuted to the master and the master finds the missing transactions (those for the ignored tables or databases) and sends them back to the slave.
Again, that leads to one of the following two conditions:
- If those transactions still exist in the master's binary log files, then no problem as the slave will skip executing them again!!
- If those transactions are no longer there in the master's binary log files, the replication will break on the slave.
Well, the above problem is supposed to be fixed in MySQL 5.6.18 (Bug #70048) The fix is injecting empty transactions on the slave for those ones modifying ignored tables or databases instead of just skipping them, and when the slave restarted they won't be sent back again from the master.
I listed the above problem although it should be fixed now because I want to mention that having MySQL always updated to the recent release is a good practice to avoid such problems and to get the most bug fixes.
ConclusionThe following are the main things which should be considered before using GTID:
- Migration from classic replication to transaction-based (GTID) replication requires downtime.
- Non-transactionally safe statements will not be executed in GTID replication.
- MySQL performance is a little bit slower in GTID replication, especially, on the slaves.
- mysql_upgrade script might cause troubles on a server having GTID_MODE=ON and it should be tested first.
- Errant transactions might break the replication in the fail-over process, thus, planning slaves for promotion will avoid falling in such cases.
- Some GTID bugs are fixed now (like slave filtration issue), thus MySQL server should be updated to the latest version once there is a new release.
- New bugs are expected to be discovered, so the application should be tested very well with GTID before performing the migration on production.
Things you should consider before using GTID
Global Transaction ID (GTID) is one of the major features that were introduced in MySQL 5.6 which provides a lot of benefits.
I have talked about the GTID concept, implementation and possible troubleshooting at Percona Live London 2014, you can download the slides from our presentations repository or from my session at Percona Live.
On the other hand, there are some important things you should consider before deploying GTID in production, I'm going to list them here in this blog post.
Table of Content- Migration to GTID replication
- Non transactionally safe statement will raise errors now
- MySQL Performance in GTID
- mysql_upgrade script
- Errant transactions!
- Filtration on the slave
- Conclusion
It is required to shutdown MySQL service on all servers in the replication setup in order to perform the migration from classic replication (based on binary logs information) to the transaction-based (GTID) replication which means that the migration process requires downtime.
The online migration to GTID replication is not yet available.
Facebook and Booking.com provided some MySQL patches for this, but they are not yet contained in Oracle's binaries.
So, if you can't afford a downtime during the migration process, then you might not be able to make the change.
It is required to enable the system variable (enforce_gtid_consistency) on all servers inside the GTID replication setup which prevent executing the non transactionally safe statements (check GTID restrictions) like:
- CREATE TABLE .. SELECT.
- CREATE TEMPORARY TABLE (inside a transaction).
- Statements that update non-transactional tables inside a transaction.
So, you will have to fix your application first if it contains any of the above statements before using GTID replication.
MySQL Performance in GTIDIt is required to enable the variables (bin_log and log_slave_updates) on - at least - the slave servers which affects the performance on those slaves negatively.
So, the performance should be tested very well before the production migration to GTID replication.
mysql_upgrade scriptThe mysql_upgrade script's problem when executed on a server having gtid_mode=on has been fixed since MySQL 5.6.7, but it is still not recommended to execute mysql_upgrade when gtid_mode=on as it might change system tables that is using MyISAM, which is non transactional.
Errant transactions! Transactions which are executed on a slave apart from the replication transactions (i.e. not executed on the master) are called "Errant transactions", those transactions cause trouble if that slave later is promoted to be a new master in the a fail-over process.Once the other slaves connect to the new master, they send the value of gtid_executed and the master in turn checks those values, compares it with its own gtid_executed set and sends back all missing transactions (the errant transactions) to the slaves which leads to one of the following problems:
- If those transactions still exist in the new master's binary log files, they will be replicated to the other slave which was not intentional when those were executed only on the slave (new master).
- If those transactions do no longer exist in the new master's binary log file, the replication will break on all slaves.
- Choose some slaves to be possible candidates for promotion in case of fail-over. Thus, stand alone transactions (which are not coming from the master) should NOT be executed there.
- Use one of the MySQL utilities (mysqlfailover or mysqlrpladmin) to find out if there are any errant transactions on the slave before the promotion or not.
In some cases we might need to make filtration to the replication on the slave(s) i.e. not all tables' or databases' changes are propagated to the slave by using the system variables (replicate_ignore_db or replicate_ignore_table). When the slave receives transactions from the master which modify those ignored tables or databases, it simply skips executing them and when the slave restarted it sends the gtid_excuted to the master and the master finds the missing transactions (those for the ignored tables or databases) and sends them back to the slave.
Again, that leads to one of the following two conditions:
- If those transactions still exist in the master's binary log files, then no problem as the slave will skip executing them again!!
- If those transactions are no longer there in the master's binary log files, the replication will break on the slave.
Well, the above problem is supposed to be fixed in MySQL 5.6.18 (Bug #70048) The fix is injecting empty transactions on the slave for those ones modifying ignored tables or databases instead of just skipping them, and when the slave restarted they won't be sent back again from the master.
I listed the above problem although it should be fixed now because I want to mention that having MySQL always updated to the recent release is a good practice to avoid such problems and to get the most bug fixes.
ConclusionThe following are the main things which should be considered before using GTID:
- Migration from classic replication to transaction-based (GTID) replication requires downtime.
- Non-transactionally safe statements will not be executed in GTID replication.
- MySQL performance is a little bit slower in GTID replication, especially, on the slaves.
- mysql_upgrade script might cause troubles on a server having GTID_MODE=ON and it should be tested first.
- Errant transactions might break the replication in the fail-over process, thus, planning slaves for promotion will avoid falling in such cases.
- Some GTID bugs are fixed now (like slave filtration issue), thus MySQL server should be updated to the latest version once there is a new release.
- New bugs are expected to be discovered, so the application should be tested very well with GTID before performing the migration on production.
Things you should consider before using GTID
Global Transaction ID (GTID) is one of the major features that were introduced in MySQL 5.6 which provides a lot of benefits (I have talked about the GTID concept, implementation and possible troubleshooting at Percona Live London 2014, you can download the slides from our presentations repository or from my session at Percona Live.
On the other hand, there are some important things you should consider before deploying GTID in production, I'm going to list them here in this blog post.
- Migration to GTID replication
- Non transactionally safe statement will raise errors now
- MySQL Performance in GTID
- mysql_upgrade script
- Errant transactions!
- Filtration on the slave
- Conclusion
It is required to shutdown MySQL service on all servers in the replication setup in order to perform the migration from classic replication (based on binary logs information) to the transaction-based (GTID) replication which means that the migration process requires downtime.
The online migration to GTID replication is not yet available.
Facebook and Booking.com provided some MySQL patches for this, but they are not yet contained in Oracle's binaries.
So, if you can't afford a downtime during the migration process, then you might not be able to make the change.
It is required to enable the system variable (enforce_gtid_consistency) on all servers inside the GTID replication setup which prevent executing the non transactionally safe statements (check GTID restrictions) like:
- CREATE TABLE .. SELECT.
- CREATE TEMPORARY TABLE (inside a transaction).
- Statements that update non-transactional tables inside a transaction.
So, you will have to fix your application first if it contains any of the above statements before using GTID replication.
MySQL Performance in GTIDIt is required to enable the variables (bin_log and log_slave_updates) on - at least - the slave servers which affects the performance on those slaves negatively.
So, the performance should be tested very well before the production migration to GTID replication.
mysql_upgrade scriptThe mysql_upgrade script's problem when executed on a server having gtid_mode=on has been fixed since MySQL 5.6.7, but it is still not recommended to execute mysql_upgrade when gtid_mode=on as it might change system tables that is using MyISAM, which is non transactional.
Errant transactions! Transactions which are executed on a slave apart from the replication transactions (i.e. not executed on the master) are called "Errant transactions", those transactions cause trouble if that slave later is promoted to be a new master in the a fail-over process.Once the other slaves connect to the new master, they send the value of gtid_executed and the master in turn checks those values, compares it with its own gtid_executed set and sends back all missing transactions (the errant transactions) to the slaves which leads to one of the following problems:
- If those transactions still exist in the new master's binary log files, they will be replicated to the other slave which was not intentional when those were executed only on the slave (new master).
- If those transactions do no longer exist in the new master's binary log file, the replication will break on all slaves.
- Choose some slaves to be possible candidates for promotion in case of fail-over. Thus, stand alone transactions (which are not coming from the master) should NOT be executed there.
- Use one of the MySQL utilities (mysqlfailover or mysqlrpladmin) to find out if there are any errant transactions on the slave before the promotion or not.
In some cases we might need to make filtration to the replication on the slave(s) i.e. not all tables' or databases' changes are propagated to the slave by using the system variables (replicate_ignore_db or replicate_ignore_table). When the slave receives transactions from the master which modify those ignored tables or databases, it simply skips executing them and when the slave restarted it sends the gtid_excuted to the master and the master finds the missing transactions (those for the ignored tables or databases) and sends them back to the slave.
Again, that leads to one of the following two conditions:
- If those transactions still exist in the master's binary log files, then no problem as the slave will skip executing them again!!
- If those transactions are no longer there in the master's binary log files, the replication will break on the slave.
Well, the above problem is supposed to be fixed in MySQL 5.6.18 (Bug #70048) The fix is injecting empty transactions on the slave for those ones modifying ignored tables or databases instead of just skipping them, and when the slave restarted they won't be sent back again from the master.
I listed the above problem although it should be fixed now because I want to mention that having MySQL always updated to the recent release is a good practice to avoid such problems and to get the most bug fixes.
Conclusion:The following are the main things which should be considered before using GTID:
- Migration from classic replication to transaction-based (GTID) replication requires downtime.
- Non-transactionally safe statements will not be executed in GTID replication.
- MySQL performance is a little bit slower in GTID replication, especially, on the slaves.
- mysql_upgrade script might cause troubles on a server having GTID_MODE=ON and it should be tested first.
- Errant transactions might break the replication in the fail-over process, thus, planning slaves for promotion will avoid falling in such cases.
- Some GTID bugs are fixed now (like slave filtration issue), thus MySQL server should be updated to the latest version once there is a new release.
- New bugs are expected to be discovered, so the application should be tested very well with GTID before performing the migration on production.
Galera Cluster and XA Transactions
A few weeks ago, we received an interesting Galera Cluster support case from one of our customers that the application is not working well and they face a lot of troubles in their Galera Cluster setup.
After some investigations, we found a lot of insert queries in state "query end" and lasting for long time without being completed. Also some other queries which were sleeping for long time having the info of "XA COMMIT":
SQL> SHOW PROCESSLIST; 27 user host:33214 foodmart Query 14440 sleeping XA COMMIT 0x31302e31312e31332e34372e746d30303336383030303031,0x31302e31312e31332e34372e746d333638,0x SQL> SHOW ENGINE INNODB STATUS; TRANSACTIONS ============ ---TRANSACTION 2DE71D, ACTIVE 14459 sec 9 lock struct(s), heap size 1248, 1 row lock(s), undo log entries 115 MySQL thread id 27, OS thread handle 0x7fc21a42c700, query id 96187 host host-ip foodmart sleeping XA COMMIT 0x31302e31312e31332e34372e746d30303336383030303031,0x31302e31312e31332e34372e746d333638,0x41544f4d ONE PHASEXA means eXtended Architecture and "XA COMMIT" statement is one of the distributed transactions (XA Transactions) commands which are clearly NOT supported in Galera Cluster and one of its limitations because of possible rollback on commit.
The following command can be used to check if XA Transactions are being used by your application or not:
SQL> SHOW GLOBAL STATUS LIKE 'Com_xa%'; +-------------------+---------+ | Variable_name | Value | +-------------------+---------+ | Com_xa_commit | 2828094 | | Com_xa_end | 2828221 | | Com_xa_prepare | 0 | | Com_xa_recover | 2205697 | | Com_xa_rollback | 42 | | Com_xa_start | 2828305 | +-------------------+---------+ 6 rows in set (0.00 sec)There are only two possible solutions for this problem:
- Get rid of all XA transactions in the application to get the Galera Cluster work.
- Use another HA solution (Active/passive, Master/Slave, ... etc) but not Galera Cluster.
- XA transactions can not be supported in Galera Cluster and that is already stated clearly in the Galera Cluster Limitations.
- Before Moving to Galera Cluster, it is preferred to go through all its limitations and check whether your application can cope with them or not.
Galera Cluster and XA Transactions
A few weeks ago, we received an interesting Galera Cluster support case from one of our customers that the application is not working well and they face a lot of troubles in their Galera Cluster setup.
After some investigations, we found a lot of insert queries in state "query end" and lasting for long time without being completed. Also some other queries which were sleeping for long time having the info of "XA COMMIT":
SQL> SHOW PROCESSLIST; 27 user host:33214 foodmart Query 14440 sleeping XA COMMIT 0x31302e31312e31332e34372e746d30303336383030303031,0x31302e31312e31332e34372e746d333638,0x SQL> SHOW ENGINE INNODB STATUS; TRANSACTIONS ============ ---TRANSACTION 2DE71D, ACTIVE 14459 sec 9 lock struct(s), heap size 1248, 1 row lock(s), undo log entries 115 MySQL thread id 27, OS thread handle 0x7fc21a42c700, query id 96187 host host-ip foodmart sleeping XA COMMIT 0x31302e31312e31332e34372e746d30303336383030303031,0x31302e31312e31332e34372e746d333638,0x41544f4d ONE PHASEXA means eXtended Architecture and "XA COMMIT" statement is one of the distributed transactions (XA Transactions) commands which are clearly NOT supported in Galera Cluster and one of its limitations because of possible rollback on commit.
The following command can be used to check if XA Transactions are being used by your application or not:
SQL> SHOW GLOBAL STATUS LIKE 'Com_xa%'; +-------------------+---------+ | Variable_name | Value | +-------------------+---------+ | Com_xa_commit | 2828094 | | Com_xa_end | 2828221 | | Com_xa_prepare | 0 | | Com_xa_recover | 2205697 | | Com_xa_rollback | 42 | | Com_xa_start | 2828305 | +-------------------+---------+ 6 rows in set (0.00 sec)There are only two possible solutions for this problem:
- Get rid of all XA transactions in the application to get the Galera Cluster work.
- Use another HA solution (Active/passive, Master/Slave, ... etc) but not Galera Cluster.
- XA transactions can not be supported in Galera Cluster and that is already stated clearly in the Galera Cluster Limitations.
- Before Moving to Galera Cluster, it is preferred to go through all its limitations and check whether your application can cope with them or not.
Galera Cluster and XA Transactions
A few weeks ago, we received an interesting Galera Cluster support case from one of our customers that the application is not working well and they face a lot of troubles in their Galera Cluster setup.
After some investigations, we found a lot of insert queries in state "query end" and lasting for long time without being completed. Also some other queries which were sleeping for long time having the info of "XA COMMIT":
SQL> SHOW PROCESSLIST; 27 user host:33214 foodmart Query 14440 sleeping XA COMMIT 0x31302e31312e31332e34372e746d30303336383030303031,0x31302e31312e31332e34372e746d333638,0x SQL> SHOW ENGINE INNODB STATUS; TRANSACTIONS ============ ---TRANSACTION 2DE71D, ACTIVE 14459 sec 9 lock struct(s), heap size 1248, 1 row lock(s), undo log entries 115 MySQL thread id 27, OS thread handle 0x7fc21a42c700, query id 96187 host host-ip foodmart sleeping XA COMMIT 0x31302e31312e31332e34372e746d30303336383030303031,0x31302e31312e31332e34372e746d333638,0x41544f4d ONE PHASEXA means eXtended Architecture and "XA COMMIT" statement is one of the distributed transactions (XA Transactions) commands which are clearly NOT supported in Galera Cluster and one of its limitations because of possible rollback on commit.
The following command can be used to check if XA Transactions are being used by your application or not:
SQL> SHOW GLOBAL STATUS LIKE 'Com_xa%'; +-------------------+---------+ | Variable_name | Value | +-------------------+---------+ | Com_xa_commit | 2828094 | | Com_xa_end | 2828221 | | Com_xa_prepare | 0 | | Com_xa_recover | 2205697 | | Com_xa_rollback | 42 | | Com_xa_start | 2828305 | +-------------------+---------+ 6 rows in set (0.00 sec)There are only two possible solutions for this problem:
- Get rid of all XA transactions in the application to get the Galera Cluster work.
- Use another HA solution (Active/passive, Master/Slave, ... etc) but not Galera Cluster.
- XA transactions can not be supported in Galera Cluster and that is already stated clearly in the Galera Cluster Limitations.
- Before Moving to Galera Cluster, it is preferred to go through all its limitations and check whether your application can cope with them or not.
GTID Replication talk at Percona Live London 2014
Global Transaction ID (GTID) is a new feature coming with MySQL 5.6 which introduced a new MySQL replication method called Transaction-based Replication that is depend on GTID.
In a few weeks, I will be speaking at Percona Live London 2014 (November 3-4) about "Transaction-based REPLICATION (GTID) - IMPLEMENTATION AND TROUBLESHOOTING". I'll talk about how to implement GTID replication in different scenarios and how to troubleshoot most of the common issues in GTID replication.
Anyone interested in learning more about GTID replication or planing to go with GTID replication in production should attend this talk.
Face-to-face meetings are very welcome, please let me know (abdel-mawla.gharieb@fromdual.com) if you are interested so we can schedule a meeting.
Looking forward to seeing you there!!
GTID Replication talk at Percona Live London 2014
Global Transaction ID (GTID) is a new feature coming with MySQL 5.6 which introduced a new MySQL replication method called Transaction-based Replication that is depend on GTID.
In a few weeks, I will be speaking at Percona Live London 2014 (November 3-4) about "Transaction-based REPLICATION (GTID) - IMPLEMENTATION AND TROUBLESHOOTING". I'll talk about how to implement GTID replication in different scenarios and how to troubleshoot most of the common issues in GTID replication.
Anyone interested in learning more about GTID replication or planing to go with GTID replication in production should attend this talk.
Face-to-face meetings are very welcome, please let me know (abdel-mawla.gharieb@fromdual.com) if you are interested so we can schedule a meeting.
Looking forward to seeing you there!!
GTID Replication talk at Percona Live London 2014
In a few weeks, I'll be speaking at Percona Live London 2014 (November 3-4) about "Transaction-based REPLICATION (GTID) - IMPLEMENTATION AND TROUBLESHOOTING". I'll talk about how to implement GTID replication in different scenarios and how to troubleshoot most of the common issues in GTID replication.
Anyone interested in learning more about GTID replication or planing to go with GTID replication in production should attend this talk.
Face-to-face meetings are very welcome, please let me know (abdel-mawla.gharieb@fromdual.com) if you are interested so we can schedule a meeting.
Looking forward to seeing you there!!
How to install multiple MySQL instances on a single host using MyEnv?
We have been asked several times by MySQL users about how to install multiple MySQL instances on a single host.
Typically, this is required when testing different MySQL versions or MySQL servers (MySQL server, Percona server and MariaDB server) while no available resources are available.
Sometimes, it is even required to install multiple MySQL instances on a single production server.
In this article, I'll go through the steps needed to install multiple MySQL instances on a single host (using the tar balls binaries) and how our popular tool MyEnv can make such process so easy.
Prepare MySQL environment [root@centos-temp ~]# groupadd mysql [root@centos-temp ~]# useradd -g mysql mysql [root@centos-temp ~]# su - mysql [mysql@centos-temp:~]$ mkdir ~/product [mysql@centos-temp:~]$ mkdir ~/data [mysql@centos-temp ~]$ cd /downloads [mysql@centos-temp ~]$ wget http://dev.mysql.com/get/Downloads/MySQL-5.6/mysql-5.6.21-linux-glibc2.5-x86_64.tar.gz [mysql@centos-temp ~]$ cd ~/product [mysql@centos-temp ~]$ tar xf /downloads/mysql-5.6.21-linux-glibc2.5-x86_64.tar.gz [mysql@centos-temp ~]$ ln -s mysql-5.6.21-linux-glibc2.5-x86_64 mysql-5.6.21 Install MyEnvMyEnv can be downloaded from here and the installation steps are listed here.
Install the first instance (named master)- Launch the myenv installer: [mysql@centos-temp ~]$ ~/product/myenv/bin/installMyEnv.sh PHP is installed on /usr/bin/php Starting MyEnv installer: /home/mysql/product/myenv-1.1.2/bin/installMyEnv.php Configuration file /etc/myenv/myenv.conf does NOT exist. Copy from template or abort (T, a):
- Since this is the first instance, the myenv config file does not exist yet, we take the template (t): Copy from template or abort (T, a): t Copy /home/mysql/product/myenv-1.1.2/etc/myenv.conf.template to /etc/myenv/myenv.conf
- Then MyEnv will detect that no instances are there, we choose the first option (a) to add a new instance: No instance exists yet. An instance is the same as a mysqld process. What do you want to do next? o Add a new instance, o change an existing instance, o delete an existing instance, o save configuration and exit or o quit without saving (A/c/d/s/q)? a
- MyEnv will ask for the new instance name (master): You have to enter a new instance name. An instance name can consist of 0-9, a-z, A-Z, - and _. Examples: test, prod, dev, [mysqld1] mysqld-3306 Enter a new instance name: master
- MyEnv will ask for the MySQL basedir and datadir (it will create tha datadir if it does not exist): Changing instance master: Set basedir. The basedir is the directory where your MySQL binaries are located. Example: /home/mysql/product/myenv /home/mysql/product/myenv-1.1.2 /home/mysql/product/mysql-5.6.21 /home/mysql/product/mysql-5.6.21-linux-glibc2.5-x86_64 /usr/local/mysql /opt/mysql [/usr] basedir = /home/mysql/product/mysql-5.6.21 /home/mysql/product/mysql-5.6.21 seems to exist and has already some files in it . Omitting installation of binary tar ball. Set datadir. The datadir is the directory where your MySQL data are located. Example: /usr/local/mysql/data, /opt/mysql/data or /var/lib/mysql or /home/mysql/product/mysql-5.6.21/data or [/home/mysql/data/master] or /home/mysql/data/master datadir = /home/mysql/data/master
- MyEnv will detect that no mysql database is created yet in the datadir and ask if it should install it for us or not (using the normal mysql_install_db script) Shall I install a mysql database under /home/mysql/data/master (Y/n)? y Installing MySQL instance. Please be patient, this can take a while... /home/mysql/product/mysql-5.6.21/scripts/mysql_install_db --datadir=/home/mysql/data/master --basedir=/home/mysql/product/mysql-5.6.21
- For the new instance, we should specify the port, the socket, and the location of my.cnf (MyEnv will generate a my.cnf file including our choices in the specified location). All these must be specific per instance: Set port. Example: 3307 or [3306] port = 3306 Set socket. Example: /tmp/mysql-3306.sock /tmp/mysql-master.sock [/tmp/mysql.sock] /var/run/mysqld/mysql.sock socket = /home/mysql/data/master/master.sock Choose location of my.cnf: Example: /etc/my.cnf /etc/mysql/my.cnf /home/mysql/data/master/my-3306.cnf [/home/mysql/data/master/my.cnf] my.cnf = /home/mysql/data/master/my.cnf
- MyEnv will list the just added instance and ask what should be the next step, we should save and exit (s): The following instances are available: master An instance is the same as a mysqld process. What do you want to do next? o Add a new instance, o change an existing instance, o delete an existing instance, o save configuration and exit or o quit without saving (a/c/d/S/q)? s Writing the configuration file... Rename /etc/myenv/myenv.conf to /etc/myenv/myenv.conf.2014-10-15-05-50-12 Write /etc/myenv/myenv.conf
- MyEnv will propose to add the following lines to the .bash_profile file, we should confirm (y): Do you want to add the following code to your ~/.bash_profile file? . /etc/myenv/MYENV_BASE MYENV_PWD=`pwd` cd $MYENV_BASE/bin . myenv.profile cd $MYENV_BASE; $MYENV_BASE/bin/showMyEnvStatus.php; cd - > /dev/null cd $MYENV_PWD (Y/n)? y Writing /etc/myenv/MYENV_BASE
- We should add the myenv startup script as shown here into the appropriate path (the provided commands should be executed manually from the root/privileged-user after that): Please copy yourself manually the MyEnv start/stop script to the following location: shell> sudo cp /home/mysql/product/myenv-1.1.2/bin/myenv.server /etc/init.d/myenv done (Y)? y and link it to your O/S runlevels: RedHat/CentOS: shell> sudo chkconfig --add myenv; chkconfig myenv on Ubuntu/Debian: shell> sudo update-rc.d myenv defaults SLES/OpenSuSE: shell> sudo chkconfig --add myenv done (Y)? y Now source your profile as follows: . ~/.bash_profile
- Now, we should logout and login so that the .bash_profile code will be executed (we can copy the startup script in the meantime): [mysql@centos-temp ~]$> exit [root@centos-temp ~]# cp /home/mysql/product/myenv-1.1.2/bin/myenv.server /etc/init.d/myenv [root@centos-temp ~]# chkconfig --add myenv [root@centos-temp ~]# su - mysql Up : Down : master (5.6.21) master ( 3306) : test mysql@centos-temp:~ [master, 3306]>
- Now, we can start the just installed instance and check if it is started correctly or not: mysql@centos-temp:~ [master, 3306]> start . SUCCESS! mysql@centos-temp:~ [master, 3306]> up Up : master (5.6.21) Down : master ( 3306) : test mysql@centos-temp:~/data/master [master, 3306]>
Cool! We have got the first instance installed. let's install the second one ...
Install the second instance (named slave1)- Launch the MyEnv installer and add the new instance: mysql@centos-temp:~ [master, 3306]> product/myenv/bin/installMyEnv.sh PHP is installed on /usr/bin/php Starting MyEnv installer: /home/mysql/product/myenv-1.1.2/bin/installMyEnv.php Configuration file /etc/myenv/myenv.conf already exists. Use this, overwrite with template or abort (U,t,a): u The following instances are available: master An instance is the same as a mysqld process. What do you want to do next? o Add a new instance, o change an existing instance, o delete an existing instance, o save configuration and exit or o quit without saving (A/c/d/s/q)? a You have to enter a new instance name. An instance name can consist of 0-9, a-z, A-Z, - and _. Examples: test, prod, dev, [mysqld1] mysqld-3306 Enter a new instance name: slave1 Changing instance slave1: Set basedir. The basedir is the directory where your MySQL binaries are located. Example: /home/mysql/product/myenv /home/mysql/product/myenv-1.1.2 /home/mysql/product/mysql-5.6.21 /home/mysql/product/mysql-5.6.21-linux-glibc2.5-x86_64 /usr/local/mysql /opt/mysql [/usr] basedir = /home/mysql/product/mysql-5.6.21 /home/mysql/product/mysql-5.6.21 seems to exist and has already some files in it. Omitting installation of binary tar ball. Set datadir. The datadir is the directory where your MySQL data are located. Example: /usr/local/mysql/data, /opt/mysql/data or /var/lib/mysql or /home/mysql/product/mysql-5.6.21/data or [/home/mysql/data/slave1] or /home/mysql/data/slave1 datadir = /home/mysql/data/slave1 Shall I install a mysql database under /home/mysql/data/slave1 (Y/n)? y Installing MySQL instance. Please be patient, this can take a while... /home/mysql/product/mysql-5.6.21/scripts/mysql_install_db --datadir=/home/mysql/data/slave1 --basedir=/home/mysql/product/mysql-5.6.21 Set port. Example: 3307 or [3306] port = 3307 Set socket. Example: /tmp/mysql-3307.sock /tmp/mysql-slave1.sock [/tmp/mysql.sock] /var/run/mysqld/mysql.sock socket = /home/mysql/data/slave1/slave1.sock Choose location of my.cnf: Example: /etc/my.cnf /etc/mysql/my.cnf /home/mysql/data/slave1/my-3307.cnf [/home/mysql/data/slave1/my.cnf] my.cnf = /home/mysql/data/slave1/my.cnf The following instances are available: master slave1 An instance is the same as a mysqld process. What do you want to do next? o Add a new instance, o change an existing instance, o delete an existing instance, o save configuration and exit or o quit without saving (a/c/d/S/q)? s Writing the configuration file... Rename /etc/myenv/myenv.conf to /etc/myenv/myenv.conf.2014-10-15-06-44-04 Write /etc/myenv/myenv.conf Now source your profile as follows: . ~/.bash_profile
- Then we logout and login again to load the new configs: mysql@centos-temp:~ [master, 3306]> exit logout [root@centos-temp product]# su - mysql Up : master (5.6.21) Down : slave1 (5.6.21) master ( 3306) : test slave1 ( 3307) : test mysql@centos-temp:~ [master, 3306]>
MyEnv says that we have two instances (master and slave1), master is up and running while slave1 is stopped. Let's start it then ..
- First, all commands we execute are sent to the master instance, we should change to the slave1 instance: mysql@centos-temp:~ [master, 3306]> slave1 mysql@centos-temp:~ [slave1, 3307]> start . SUCCESS! mysql@centos-temp:~ [slave1, 3307]>
- Both instances are now up and running ... mysql@centos-temp:~ [slave1, 3307]> up Up : master (5.6.21) slave1 (5.6.21) Down : master ( 3306) : test slave1 ( 3307) : test mysql@centos-temp:~ [slave1, 3307]>
After we installed the two instances we need to know how can we navigate between them to open MySQL connections.
- Open a MySQL connection to the master instance: mysql@centos-temp:~ [master, 3306]> mysql -u root Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 7 Server version: 5.6.21 MySQL Community Server (GPL) Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. root@localhost master [(none)] SQL>
- While the instance name is already printed in the terminal we need to double check that by a MySQL command (the used port) and exit after that: root@localhost master [(none)] SQL> SHOW GLOBAL VARIABLES LIKE 'port'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | port | 3306 | +---------------+-------+ 1 row in set (0.00 sec) root@localhost master [(none)] SQL> exit Bye mysql@centos-temp:~ [master, 3306]>
- Open a MySQL connection to the slave1 instance and check the used port: mysql@centos-temp:~ [master, 3306]> slave1 mysql@centos-temp:~ [slave1, 3307]> mysql -u root Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 5 Server version: 5.6.21 MySQL Community Server (GPL) Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. root@localhost slave1 [(none)] SQL> SHOW GLOBAL VARIABLES LIKE 'port'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | port | 3307 | +---------------+-------+ 1 row in set (0.00 sec) root@localhost slave1 [(none)] SQL>
- Navigate back to the master instance: mysql@centos-temp:~ [slave1, 3307]> master mysql@centos-temp:~ [master, 3306]>
More information about MyEnv commands can be checked here.
Note:Using the same procedures above we can add several instances using either the same binaries or some others.
Taxonomy upgrade extras: MyEnvenvironmentvirtualizationconsolidationmulti-instanceSaaSHow to install multiple MySQL instances on a single host using MyEnv?
We have been asked several times by MySQL users about how to install multiple MySQL instances on a single host.
Typically, this is required when testing different MySQL versions or MySQL servers (MySQL server, Percona server and MariaDB server) while no available resources are available.
Sometimes, it is even required to install multiple MySQL instances on a single production server.
In this article, I'll go through the steps needed to install multiple MySQL instances on a single host (using the tar balls binaries) and how our popular tool MyEnv can make such process so easy.
Prepare MySQL environment [root@centos-temp ~]# groupadd mysql [root@centos-temp ~]# useradd -g mysql mysql [root@centos-temp ~]# su - mysql [mysql@centos-temp:~]$ mkdir ~/product [mysql@centos-temp:~]$ mkdir ~/data [mysql@centos-temp ~]$ cd /downloads [mysql@centos-temp ~]$ wget http://dev.mysql.com/get/Downloads/MySQL-5.6/mysql-5.6.21-linux-glibc2.5-x86_64.tar.gz [mysql@centos-temp ~]$ cd ~/product [mysql@centos-temp ~]$ tar xf /downloads/mysql-5.6.21-linux-glibc2.5-x86_64.tar.gz [mysql@centos-temp ~]$ ln -s mysql-5.6.21-linux-glibc2.5-x86_64 mysql-5.6.21 Install MyEnvMyEnv can be downloaded from here and the installation steps are listed here.
Install the first instance (named master)- Launch the myenv installer: [mysql@centos-temp ~]$ ~/product/myenv/bin/installMyEnv.sh PHP is installed on /usr/bin/php Starting MyEnv installer: /home/mysql/product/myenv-1.1.2/bin/installMyEnv.php Configuration file /etc/myenv/myenv.conf does NOT exist. Copy from template or abort (T, a):
- Since this is the first instance, the myenv config file does not exist yet, we take the template (t): Copy from template or abort (T, a): t Copy /home/mysql/product/myenv-1.1.2/etc/myenv.conf.template to /etc/myenv/myenv.conf
- Then MyEnv will detect that no instances are there, we choose the first option (a) to add a new instance: No instance exists yet. An instance is the same as a mysqld process. What do you want to do next? o Add a new instance, o change an existing instance, o delete an existing instance, o save configuration and exit or o quit without saving (A/c/d/s/q)? a
- MyEnv will ask for the new instance name (master): You have to enter a new instance name. An instance name can consist of 0-9, a-z, A-Z, - and _. Examples: test, prod, dev, [mysqld1] mysqld-3306 Enter a new instance name: master
- MyEnv will ask for the MySQL basedir and datadir (it will create tha datadir if it does not exist): Changing instance master: Set basedir. The basedir is the directory where your MySQL binaries are located. Example: /home/mysql/product/myenv /home/mysql/product/myenv-1.1.2 /home/mysql/product/mysql-5.6.21 /home/mysql/product/mysql-5.6.21-linux-glibc2.5-x86_64 /usr/local/mysql /opt/mysql [/usr] basedir = /home/mysql/product/mysql-5.6.21 /home/mysql/product/mysql-5.6.21 seems to exist and has already some files in it . Omitting installation of binary tar ball. Set datadir. The datadir is the directory where your MySQL data are located. Example: /usr/local/mysql/data, /opt/mysql/data or /var/lib/mysql or /home/mysql/product/mysql-5.6.21/data or [/home/mysql/data/master] or /home/mysql/data/master datadir = /home/mysql/data/master
- MyEnv will detect that no mysql database is created yet in the datadir and ask if it should install it for us or not (using the normal mysql_install_db script) Shall I install a mysql database under /home/mysql/data/master (Y/n)? y Installing MySQL instance. Please be patient, this can take a while... /home/mysql/product/mysql-5.6.21/scripts/mysql_install_db --datadir=/home/mysql/data/master --basedir=/home/mysql/product/mysql-5.6.21
- For the new instance, we should specify the port, the socket, and the location of my.cnf (MyEnv will generate a my.cnf file including our choices in the specified location). All these must be specific per instance: Set port. Example: 3307 or [3306] port = 3306 Set socket. Example: /tmp/mysql-3306.sock /tmp/mysql-master.sock [/tmp/mysql.sock] /var/run/mysqld/mysql.sock socket = /home/mysql/data/master/master.sock Choose location of my.cnf: Example: /etc/my.cnf /etc/mysql/my.cnf /home/mysql/data/master/my-3306.cnf [/home/mysql/data/master/my.cnf] my.cnf = /home/mysql/data/master/my.cnf
- MyEnv will list the just added instance and ask what should be the next step, we should save and exit (s): The following instances are available: master An instance is the same as a mysqld process. What do you want to do next? o Add a new instance, o change an existing instance, o delete an existing instance, o save configuration and exit or o quit without saving (a/c/d/S/q)? s Writing the configuration file... Rename /etc/myenv/myenv.conf to /etc/myenv/myenv.conf.2014-10-15-05-50-12 Write /etc/myenv/myenv.conf
- MyEnv will propose to add the following lines to the .bash_profile file, we should confirm (y): Do you want to add the following code to your ~/.bash_profile file? . /etc/myenv/MYENV_BASE MYENV_PWD=`pwd` cd $MYENV_BASE/bin . myenv.profile cd $MYENV_BASE; $MYENV_BASE/bin/showMyEnvStatus.php; cd - > /dev/null cd $MYENV_PWD (Y/n)? y Writing /etc/myenv/MYENV_BASE
- We should add the myenv startup script as shown here into the appropriate path (the provided commands should be executed manually from the root/privileged-user after that): Please copy yourself manually the MyEnv start/stop script to the following location: shell> sudo cp /home/mysql/product/myenv-1.1.2/bin/myenv.server /etc/init.d/myenv done (Y)? y and link it to your O/S runlevels: RedHat/CentOS: shell> sudo chkconfig --add myenv; chkconfig myenv on Ubuntu/Debian: shell> sudo update-rc.d myenv defaults SLES/OpenSuSE: shell> sudo chkconfig --add myenv done (Y)? y Now source your profile as follows: . ~/.bash_profile
- Now, we should logout and login so that the .bash_profile code will be executed (we can copy the startup script in the meantime): [mysql@centos-temp ~]$> exit [root@centos-temp ~]# cp /home/mysql/product/myenv-1.1.2/bin/myenv.server /etc/init.d/myenv [root@centos-temp ~]# chkconfig --add myenv [root@centos-temp ~]# su - mysql Up : Down : master (5.6.21) master ( 3306) : test mysql@centos-temp:~ [master, 3306]>
- Now, we can start the just installed instance and check if it is started correctly or not: mysql@centos-temp:~ [master, 3306]> start . SUCCESS! mysql@centos-temp:~ [master, 3306]> up Up : master (5.6.21) Down : master ( 3306) : test mysql@centos-temp:~/data/master [master, 3306]>
Cool! We have got the first instance installed. let's install the second one ...
Install the second instance (named slave1)- Launch the MyEnv installer and add the new instance: mysql@centos-temp:~ [master, 3306]> product/myenv/bin/installMyEnv.sh PHP is installed on /usr/bin/php Starting MyEnv installer: /home/mysql/product/myenv-1.1.2/bin/installMyEnv.php Configuration file /etc/myenv/myenv.conf already exists. Use this, overwrite with template or abort (U,t,a): u The following instances are available: master An instance is the same as a mysqld process. What do you want to do next? o Add a new instance, o change an existing instance, o delete an existing instance, o save configuration and exit or o quit without saving (A/c/d/s/q)? a You have to enter a new instance name. An instance name can consist of 0-9, a-z, A-Z, - and _. Examples: test, prod, dev, [mysqld1] mysqld-3306 Enter a new instance name: slave1 Changing instance slave1: Set basedir. The basedir is the directory where your MySQL binaries are located. Example: /home/mysql/product/myenv /home/mysql/product/myenv-1.1.2 /home/mysql/product/mysql-5.6.21 /home/mysql/product/mysql-5.6.21-linux-glibc2.5-x86_64 /usr/local/mysql /opt/mysql [/usr] basedir = /home/mysql/product/mysql-5.6.21 /home/mysql/product/mysql-5.6.21 seems to exist and has already some files in it. Omitting installation of binary tar ball. Set datadir. The datadir is the directory where your MySQL data are located. Example: /usr/local/mysql/data, /opt/mysql/data or /var/lib/mysql or /home/mysql/product/mysql-5.6.21/data or [/home/mysql/data/slave1] or /home/mysql/data/slave1 datadir = /home/mysql/data/slave1 Shall I install a mysql database under /home/mysql/data/slave1 (Y/n)? y Installing MySQL instance. Please be patient, this can take a while... /home/mysql/product/mysql-5.6.21/scripts/mysql_install_db --datadir=/home/mysql/data/slave1 --basedir=/home/mysql/product/mysql-5.6.21 Set port. Example: 3307 or [3306] port = 3307 Set socket. Example: /tmp/mysql-3307.sock /tmp/mysql-slave1.sock [/tmp/mysql.sock] /var/run/mysqld/mysql.sock socket = /home/mysql/data/slave1/slave1.sock Choose location of my.cnf: Example: /etc/my.cnf /etc/mysql/my.cnf /home/mysql/data/slave1/my-3307.cnf [/home/mysql/data/slave1/my.cnf] my.cnf = /home/mysql/data/slave1/my.cnf The following instances are available: master slave1 An instance is the same as a mysqld process. What do you want to do next? o Add a new instance, o change an existing instance, o delete an existing instance, o save configuration and exit or o quit without saving (a/c/d/S/q)? s Writing the configuration file... Rename /etc/myenv/myenv.conf to /etc/myenv/myenv.conf.2014-10-15-06-44-04 Write /etc/myenv/myenv.conf Now source your profile as follows: . ~/.bash_profile
- Then we logout and login again to load the new configs: mysql@centos-temp:~ [master, 3306]> exit logout [root@centos-temp product]# su - mysql Up : master (5.6.21) Down : slave1 (5.6.21) master ( 3306) : test slave1 ( 3307) : test mysql@centos-temp:~ [master, 3306]>
MyEnv says that we have two instances (master and slave1), master is up and running while slave1 is stopped. Let's start it then ..
- First, all commands we execute are sent to the master instance, we should change to the slave1 instance: mysql@centos-temp:~ [master, 3306]> slave1 mysql@centos-temp:~ [slave1, 3307]> start . SUCCESS! mysql@centos-temp:~ [slave1, 3307]>
- Both instances are now up and running ... mysql@centos-temp:~ [slave1, 3307]> up Up : master (5.6.21) slave1 (5.6.21) Down : master ( 3306) : test slave1 ( 3307) : test mysql@centos-temp:~ [slave1, 3307]>
After we installed the two instances we need to know how can we navigate between them to open MySQL connections.
- Open a MySQL connection to the master instance: mysql@centos-temp:~ [master, 3306]> mysql -u root Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 7 Server version: 5.6.21 MySQL Community Server (GPL) Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. root@localhost master [(none)] SQL>
- While the instance name is already printed in the terminal we need to double check that by a MySQL command (the used port) and exit after that: root@localhost master [(none)] SQL> SHOW GLOBAL VARIABLES LIKE 'port'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | port | 3306 | +---------------+-------+ 1 row in set (0.00 sec) root@localhost master [(none)] SQL> exit Bye mysql@centos-temp:~ [master, 3306]>
- Open a MySQL connection to the slave1 instance and check the used port: mysql@centos-temp:~ [master, 3306]> slave1 mysql@centos-temp:~ [slave1, 3307]> mysql -u root Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 5 Server version: 5.6.21 MySQL Community Server (GPL) Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. root@localhost slave1 [(none)] SQL> SHOW GLOBAL VARIABLES LIKE 'port'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | port | 3307 | +---------------+-------+ 1 row in set (0.00 sec) root@localhost slave1 [(none)] SQL>
- Navigate back to the master instance: mysql@centos-temp:~ [slave1, 3307]> master mysql@centos-temp:~ [master, 3306]>
More information about MyEnv commands can be checked here.
Note:Using the same procedures above we can add several instances using either the same binaries or some others.
Taxonomy upgrade extras: MyEnvenvironmentvirtualizationconsolidationmulti-instanceSaaSHow to install multiple MySQL instances on a single host using MyEnv?
We have been asked several times by MySQL users about how to install multiple MySQL instances on a single host.
Typically, this is required when testing different MySQL versions or MySQL servers (MySQL server, Percona server and MariaDB server) while no available resources are available.
Sometimes, it is even required to install multiple MySQL instances on a single production server.
In this article, I'll go through the steps needed to install multiple MySQL instances on a single host (using the tar balls binaries) and how our popular tool MyEnv can make such process so easy.
Prepare MySQL environment [root@centos-temp ~]# groupadd mysql [root@centos-temp ~]# useradd -g mysql mysql [root@centos-temp ~]# su - mysql [mysql@centos-temp:~]$ mkdir ~/product [mysql@centos-temp:~]$ mkdir ~/data [mysql@centos-temp ~]$ cd /downloads [mysql@centos-temp ~]$ wget http://dev.mysql.com/get/Downloads/MySQL-5.6/mysql-5.6.21-linux-glibc2.5-x86_64.tar.gz [mysql@centos-temp ~]$ cd ~/product [mysql@centos-temp ~]$ tar xf /downloads/mysql-5.6.21-linux-glibc2.5-x86_64.tar.gz [mysql@centos-temp ~]$ ln -s mysql-5.6.21-linux-glibc2.5-x86_64 mysql-5.6.21 Install MyEnvMyEnv can be downloaded from here and the installation steps are listed here.
Install the first instance (named master)- Prepare the first instance: [mysql@centos-temp ~]$ mkdir ~/data/master [mysql@centos-temp ~]$ vi ~/data/master/my.cnf [mysqld] port = 3306 basedir = /home/mysql/product/mysql-5.6.21 datadir = /home/mysql/data/master socket = /home/mysql/data/master/master.sock user = mysql
- Launch the myenv installer: [mysql@centos-temp ~]$ ~/product/myenv/bin/installMyEnv.sh PHP is installed on /usr/bin/php Starting MyEnv installer: /home/mysql/product/myenv-1.1.2/bin/installMyEnv.php Configuration file /etc/myenv/myenv.conf does NOT exist. Copy from template or abort (T, a):
- Since this is the first instance, the myenv config file does not exist yet, we take the template (t): Copy from template or abort (T, a): t Copy /home/mysql/product/myenv-1.1.2/etc/myenv.conf.template to /etc/myenv/myenv.conf
- Then MyEnv will detect that no instances are there, we choose the first option (a) to add a new instance: No instance exists yet. An instance is the same as a mysqld process. What do you want to do next? o Add a new instance, o change an existing instance, o delete an existing instance, o save configuration and exit or o quit without saving (A/c/d/s/q)? a
- MyEnv will ask for the new instance name (master): You have to enter a new instance name. An instance name can consist of 0-9, a-z, A-Z, - and _. Examples: test, prod, dev, [mysqld1] mysqld-3306 Enter a new instance name: master
- MyEnv will ask for the MySQL basedir and datadir: Changing instance master: Set basedir. The basedir is the directory where your MySQL binaries are located. Example: /home/mysql/product/myenv /home/mysql/product/myenv-1.1.2 /home/mysql/product/mysql-5.6.21 /home/mysql/product/mysql-5.6.21-linux-glibc2.5-x86_64 /usr/local/mysql /opt/mysql [/usr] basedir = /home/mysql/product/mysql-5.6.21 /home/mysql/product/mysql-5.6.21 seems to exist and has already some files in it . Omitting installation of binary tar ball. Set datadir. The datadir is the directory where your MySQL data are located. Example: /usr/local/mysql/data, /opt/mysql/data or /var/lib/mysql or /home/mysql/product/mysql-5.6.21/data or [/home/mysql/data/master] or /home/mysql/data/master datadir = /home/mysql/data/master WARNING: directory /home/mysql/data/master already exist.
- MyEnv will detect that no mysql database is created yet in the datadir and ask if it should install it for us or not (using the normal mysql_install_db script) Shall I install a mysql database under /home/mysql/data/master (Y/n)? y Installing MySQL instance. Please be patient, this can take a while... /home/mysql/product/mysql-5.6.21/scripts/mysql_install_db --datadir=/home/mysql/data/master --basedir=/home/mysql/product/mysql-5.6.21
- For the new instance, we should specify the port, the socket, and the location of my.cnf (all these must be specific per instance): Set port. Example: 3307 or [3306] port = 3306 Set socket. Example: /tmp/mysql-3306.sock /tmp/mysql-master.sock [/tmp/mysql.sock] /var/run/mysqld/mysql.sock socket = /home/mysql/data/master/master.sock Choose location of my.cnf: Example: /etc/my.cnf /etc/mysql/my.cnf /home/mysql/data/master/my-3306.cnf [/home/mysql/data/master/my.cnf] my.cnf = /home/mysql/data/master/my.cnf
- MyEnv will list the just added instance and ask what should be the next step, we should save and exit (s): The following instances are available: master An instance is the same as a mysqld process. What do you want to do next? o Add a new instance, o change an existing instance, o delete an existing instance, o save configuration and exit or o quit without saving (a/c/d/S/q)? s Writing the configuration file... Rename /etc/myenv/myenv.conf to /etc/myenv/myenv.conf.2014-10-15-05-50-12 Write /etc/myenv/myenv.conf
- MyEnv will propose to add the following lines to the .bash_profile file, we should confirm (y): Do you want to add the following code to your ~/.bash_profile file? . /etc/myenv/MYENV_BASE MYENV_PWD=`pwd` cd $MYENV_BASE/bin . myenv.profile cd $MYENV_BASE; $MYENV_BASE/bin/showMyEnvStatus.php; cd - > /dev/null cd $MYENV_PWD (Y/n)? y Writing /etc/myenv/MYENV_BASE
- We should add the myenv startup script as shown here into the appropriate path (the provided commands should be executed manually from the root/privileged-user after that): Please copy yourself manually the MyEnv start/stop script to the following location: shell> sudo cp /home/mysql/product/myenv-1.1.2/bin/myenv.server /etc/init.d/myenv done (Y)? y and link it to your O/S runlevels: RedHat/CentOS: shell> sudo chkconfig --add myenv; chkconfig myenv on Ubuntu/Debian: shell> sudo update-rc.d myenv defaults SLES/OpenSuSE: shell> sudo chkconfig --add myenv done (Y)? y Now source your profile as follows: . ~/.bash_profile
- Now, we should logout and login so that the .bash_profile code will be executed (we can copy the startup script in the meantime): [mysql@centos-temp ~]$> exit [root@centos-temp ~]# cp /home/mysql/product/myenv-1.1.2/bin/myenv.server /etc/init.d/myenv [root@centos-temp ~]# chkconfig --add myenv [root@centos-temp ~]# su - mysql Up : Down : master (5.6.21) master ( 3306) : test mysql@centos-temp:~ [master, 3306]>
- Now, we can start the just installed instance and check if it is started correctly or not: mysql@centos-temp:~ [master, 3306]> start . SUCCESS! mysql@centos-temp:~ [master, 3306]> up Up : master (5.6.21) Down : master ( 3306) : test mysql@centos-temp:~/data/master [master, 3306]>
Cool! We have got the first instance installed. let's install the second one ...
Install the second instance (named slave1)- Prepare the second instance: mysql@centos-temp:~ [master, 3306]> mkdir ~/data/slave1 mysql@centos-temp:~ [master, 3306]> vi ~/data/slave1/my.cnf [mysqld] port = 3307 basedir = /home/mysql/product/mysql-5.6.21 datadir = /home/mysql/data/slave1 socket = /home/mysql/data/slave1/slave1.sock user = mysql
- Launch the MyEnv installer and add the new instance: mysql@centos-temp:~ [master, 3306]> product/myenv/bin/installMyEnv.sh PHP is installed on /usr/bin/php Starting MyEnv installer: /home/mysql/product/myenv-1.1.2/bin/installMyEnv.php Configuration file /etc/myenv/myenv.conf already exists. Use this, overwrite with template or abort (U,t,a): u The following instances are available: master An instance is the same as a mysqld process. What do you want to do next? o Add a new instance, o change an existing instance, o delete an existing instance, o save configuration and exit or o quit without saving (A/c/d/s/q)? a You have to enter a new instance name. An instance name can consist of 0-9, a-z, A-Z, - and _. Examples: test, prod, dev, [mysqld1] mysqld-3306 Enter a new instance name: slave1 Changing instance slave1: Set basedir. The basedir is the directory where your MySQL binaries are located. Example: /home/mysql/product/myenv /home/mysql/product/myenv-1.1.2 /home/mysql/product/mysql-5.6.21 /home/mysql/product/mysql-5.6.21-linux-glibc2.5-x86_64 /usr/local/mysql /opt/mysql [/usr] basedir = /home/mysql/product/mysql-5.6.21 /home/mysql/product/mysql-5.6.21 seems to exist and has already some files in it. Omitting installation of binary tar ball. Set datadir. The datadir is the directory where your MySQL data are located. Example: /usr/local/mysql/data, /opt/mysql/data or /var/lib/mysql or /home/mysql/product/mysql-5.6.21/data or [/home/mysql/data/slave1] or /home/mysql/data/slave1 datadir = /home/mysql/data/slave1 WARNING: directory /home/mysql/data/slave1 already exist. Shall I install a mysql database under /home/mysql/data/slav1 (Y/n)? y Installing MySQL instance. Please be patient, this can take a while... /home/mysql/product/mysql-5.6.21/scripts/mysql_install_db --datadir=/home/mysql/data/slave1 --basedir=/home/mysql/product/mysql-5.6.21 Set port. Example: 3307 or [3306] port = 3307 Set socket. Example: /tmp/mysql-3307.sock /tmp/mysql-slave1.sock [/tmp/mysql.sock] /var/run/mysqld/mysql.sock socket = /home/mysql/data/slave1/slave1.sock Choose location of my.cnf: Example: /etc/my.cnf /etc/mysql/my.cnf /home/mysql/data/slave1/my-3307.cnf [/home/mysql/data/slave1/my.cnf] my.cnf = /home/mysql/data/slave1/my.cnf The following instances are available: master slave1 An instance is the same as a mysqld process. What do you want to do next? o Add a new instance, o change an existing instance, o delete an existing instance, o save configuration and exit or o quit without saving (a/c/d/S/q)? s Writing the configuration file... Rename /etc/myenv/myenv.conf to /etc/myenv/myenv.conf.2014-10-15-06-44-04 Write /etc/myenv/myenv.conf Now source your profile as follows: . ~/.bash_profile
- Then we logout and login again to load the new configs: mysql@centos-temp:~ [master, 3306]> exit logout [root@centos-temp product]# su - mysql Up : master (5.6.21) Down : slave1 (5.6.21) master ( 3306) : test slave1 ( 3307) : test mysql@centos-temp:~ [master, 3306]>
MyEnv says that we have two instances (master and slave1), master is up and running while slave1 is stopped. Let's start it then ..
- First, all commands we execute are sent to the master instance, we should change to the slave1 instance: mysql@centos-temp:~ [master, 3306]> slave1 mysql@centos-temp:~ [slave1, 3307]> start . SUCCESS! mysql@centos-temp:~ [slave1, 3307]>
- Both instances are now up and running ... mysql@centos-temp:~ [slave1, 3307]> up Up : master (5.6.21) slave1 (5.6.21) Down : master ( 3306) : test slave1 ( 3307) : test mysql@centos-temp:~ [slave1, 3307]>
After we installed the two instances we need to know how can we navigate between them to open MySQL connections.
- Open a MySQL connection to the master instance: mysql@centos-temp:~ [master, 3306]> mysql -u root Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 7 Server version: 5.6.21 MySQL Community Server (GPL) Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. root@localhost master [(none)] SQL>
- While the instance name is already printed in the terminal we need to double check that by a MySQL command (the used port) and exit after that: root@localhost master [(none)] SQL> SHOW GLOBAL VARIABLES LIKE 'port'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | port | 3306 | +---------------+-------+ 1 row in set (0.00 sec) root@localhost master [(none)] SQL> exit Bye mysql@centos-temp:~ [master, 3306]>
- Open a MySQL connection to the slave1 instance and check the used port: mysql@centos-temp:~ [master, 3306]> slave1 mysql@centos-temp:~ [slave1, 3307]> mysql -u root Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 5 Server version: 5.6.21 MySQL Community Server (GPL) Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. root@localhost slave1 [(none)] SQL> SHOW GLOBAL VARIABLES LIKE 'port'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | port | 3307 | +---------------+-------+ 1 row in set (0.00 sec) root@localhost slave1 [(none)] SQL>
- Navigate back to the master instance: mysql@centos-temp:~ [slave1, 3307]> master mysql@centos-temp:~ [master, 3306]>
More information about MyEnv commands can be checked here.
Notes:- Using the same procedures above we can add several instances using either the same binaries or some others.
- We used the smallest configurations in the my.cnf file, you are free to make your own changes.
Galera Cluster VS PXC VS MariaDB Galera Cluster - Benchmarking
It is not clear for many MySQL users that Percona XtraDB Cluster (PXC) and MariaDB Galera Cluster depend on the same Galera library i.e used in Galera Cluster for MySQL which is provided by Codership team:
- Galera Cluster: MySQL Server (by Oracle) + Galera library.
- Percona XtraDB Cluster: Percona Server + Galera library.
- MariaDB Galera Cluster: MariaDB Server + Galera library.
But the question is, are there any performance differences between the three of them ?
Let's discover that by doing some simple benchmark to test MySQL write performance in Galera Cluster, PXC and MariaDB Galera Cluster installations.
System Information: HW configurations (AWS Servers): Nodes Servers HW configurations:- CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz (# of cores 8, # of threads 16, HT enabled).
- Memory: 16GB RAM.
- Storage: HDD 120GB/ 5400RPM.
- CPU: Intel(R) Xeon(R) CPU E5-2651 v2 @ 1.80GHz (# of cores 4, # of threads 8, HT enabled).
- Memory: 16GB RAM.
- Storage: HDD 10GB/ 5400RPM.
- CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz (# of cores 16, # of threads 32, HT enabled).
- Memory: 32GB RAM.
- Storage: HDD 10GB/ 5400RPM.
- OS : Red Hat Enterprise Linux Server release 6.5 (Santiago)
- Sysbench : 0.5.3
- GLB : 1.0.0
- Galera Cluster : 5.5.34 and 5.6.16
- Percona XtraDB Cluster : 5.5.37 and 5.6.19
- MariaDB Galera Cluster : 5.5.38 and 10.0.12
- Galera Library : 3.5
- The testing environment consists of 5 AWS servers, three servers for a three-node cluster (each node is installed on a single server), one server for the load balancer and the final server for the load generator in which sysbench is installed to send requests to the load balancer from.
- Sysbench command: sysbench --num-threads=64 --max-requests=1000 --db-driver=mysql --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-table-engine=InnoDB --mysql-user=dev --mysql-password='test' --mysql-host=load_balancer_ip run .
- Table structure which was used by sysbench tests: mysql> show create table sbtest.sbtest\G CREATE TABLE `sbtest` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `k` int(10) unsigned NOT NULL DEFAULT '0', `c` char(120) NOT NULL DEFAULT '', `pad` char(60) NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `k` (`k`) ) ENGINE=InnoDB AUTO_INCREMENT=8574 DEFAULT CHARSET=latin1
- The my.cnf used is something like: [mysqld] key_buffer_size = 16M max_allowed_packet = 16M thread_stack = 192K thread_cache_size = 8 innodb_buffer_pool_size = 8G innodb_flush_log_at_trx_commit = 0 expire_logs_days = 10 max_binlog_size = 100M server-id = 1 log-bin = mysql-bin binlog_format = ROW auto_increment_increment = 3 auto_increment_offset = 1 log_slave_updates default_storage_engine = InnoDB # Path to Galera library wsrep_provider = /usr/lib64/galera/libgalera_smm.so # Cluster connection URL contains the IPs of node#1, node#2 and node#3 wsrep_cluster_address = gcomm://nodeB-IP,nodeC-IP innodb_autoinc_lock_mode = 2 # Node #1 address wsrep_node_address = nodeA-IP # Cluster name wsrep_cluster_name = test_cluster # SST method wsrep_sst_method = rsync # Authentication for SST method wsrep_sst_auth = "sst:password"
Notes:
- The number of threads used in this test is 64 as it generated the highest throughput on all cluster installations.
- Each throughput value for each test case is generated by the average of ten (10) times execution.
The raw results in Transactions / Sec might be useful:
sync_binlog=0innodb_flush_log_ at_trx_commitGalera Cluster 5.5.34PXC 5.5.37MariaDB Galera Cluster 5.5.38Galera Cluster 5.6.16PXC 5.6.15MariaDB Galera Cluster 10.0.120525.119534.022534.249519.575532.19520.7361125.615131.748341.384157.001162.783174.972526.761528.858524.039511.817526.06521.024sync_binlog=10242.201249.622262.516220.313229.807220.97196.82996.759148.815111.995114.8113.0562224.476210.904217.142209.139201.596214.311
Conclusion
According to the above results:
- innodb_flush_log_at_trx_commit = 1 significantly slows down Galera.
- sync_binlog also cuts in half the throughput.
- All other are more or less equal in throughput.
Galera Cluster VS PXC VS MariaDB Galera Cluster - Benchmarking
It is not clear for many MySQL users that Percona XtraDB Cluster (PXC) and MariaDB Galera Cluster depend on the same Galera library i.e used in Galera Cluster for MySQL which is provided by Codership team:
- Galera Cluster: MySQL Server (by Oracle) + Galera library.
- Percona XtraDB Cluster: Percona Server + Galera library.
- MariaDB Galera Cluster: MariaDB Server + Galera library.
But the question is, are there any performance differences between the three of them ?
Let's discover that by doing some simple benchmark to test MySQL write performance in Galera Cluster, PXC and MariaDB Galera Cluster installations.
System Information: HW configurations (AWS Servers): Nodes Servers HW configurations:- CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz (# of cores 8, # of threads 16, HT enabled).
- Memory: 16GB RAM.
- Storage: HDD 120GB/ 5400RPM.
- CPU: Intel(R) Xeon(R) CPU E5-2651 v2 @ 1.80GHz (# of cores 4, # of threads 8, HT enabled).
- Memory: 16GB RAM.
- Storage: HDD 10GB/ 5400RPM.
- CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz (# of cores 16, # of threads 32, HT enabled).
- Memory: 32GB RAM.
- Storage: HDD 10GB/ 5400RPM.
- OS : Red Hat Enterprise Linux Server release 6.5 (Santiago)
- Sysbench : 0.5.3
- GLB : 1.0.0
- Galera Cluster : 5.5.34 and 5.6.16
- Percona XtraDB Cluster : 5.5.37 and 5.6.19
- MariaDB Galera Cluster : 5.5.38 and 10.0.12
- Galera Library : 3.5
- The testing environment consists of 5 AWS servers, three servers for a three-node cluster (each node is installed on a single server), one server for the load balancer and the final server for the load generator in which sysbench is installed to send requests to the load balancer from.
- Sysbench command: sysbench --num-threads=64 --max-requests=1000 --db-driver=mysql --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-table-engine=InnoDB --mysql-user=dev --mysql-password='test' --mysql-host=load_balancer_ip run .
- Table structure which was used by sysbench tests: mysql> show create table sbtest.sbtest\G CREATE TABLE `sbtest` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `k` int(10) unsigned NOT NULL DEFAULT '0', `c` char(120) NOT NULL DEFAULT '', `pad` char(60) NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `k` (`k`) ) ENGINE=InnoDB AUTO_INCREMENT=8574 DEFAULT CHARSET=latin1
- The my.cnf used is something like: [mysqld] key_buffer_size = 16M max_allowed_packet = 16M thread_stack = 192K thread_cache_size = 8 innodb_buffer_pool_size = 8G innodb_flush_log_at_trx_commit = 0 expire_logs_days = 10 max_binlog_size = 100M server-id = 1 log-bin = mysql-bin binlog_format = ROW auto_increment_increment = 3 auto_increment_offset = 1 log_slave_updates default_storage_engine = InnoDB # Path to Galera library wsrep_provider = /usr/lib64/galera/libgalera_smm.so # Cluster connection URL contains the IPs of node#1, node#2 and node#3 wsrep_cluster_address = gcomm://nodeB-IP,nodeC-IP innodb_autoinc_lock_mode = 2 # Node #1 address wsrep_node_address = nodeA-IP # Cluster name wsrep_cluster_name = test_cluster # SST method wsrep_sst_method = rsync # Authentication for SST method wsrep_sst_auth = "sst:password"
Notes:
- The number of threads used in this test is 64 as it generated the highest throughput on all cluster installations.
- Each throughput value for each test case is generated by the average of ten (10) times execution.
The raw results in Transactions / Sec might be useful:
sync_binlog=0innodb_flush_log_ at_trx_commitGalera Cluster 5.5.34PXC 5.5.37MariaDB Galera Cluster 5.5.38Galera Cluster 5.6.16PXC 5.6.15MariaDB Galera Cluster 10.0.120525.119534.022534.249519.575532.19520.7361125.615131.748341.384157.001162.783174.972526.761528.858524.039511.817526.06521.024sync_binlog=10242.201249.622262.516220.313229.807220.97196.82996.759148.815111.995114.8113.0562224.476210.904217.142209.139201.596214.311
Conclusion
According to the above results:
- innodb_flush_log_at_trx_commit = 1 significantly slows down Galera.
- sync_binlog also cuts in half the throughput.
- All other are more or less equal in throughput.
Galera Cluster VS PXC VS MariaDB Galera Cluster - Benchmarking
It is not clear for many MySQL users that Percona XtraDB Cluster (PXC) and MariaDB Galera Cluster depend on the same Galera library i.e used in Galera Cluster for MySQL which is provided by Codership team:
- Galera Cluster: MySQL Server (by Oracle) + Galera library.
- Percona XtraDB Cluster: Percona Server + Galera library.
- MariaDB Galera Cluster: MariaDB Server + Galera library.
But the question is, are there any performance differences between the three of them ?
Let's discover that by doing some simple benchmark to test MySQL write performance in Galera Cluster, PXC and MariaDB Galera Cluster installations.
System Information: HW configurations (AWS Servers): Nodes Servers HW configurations:- CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz (# of cores 8, # of threads 16, HT enabled).
- Memory: 16GB RAM.
- Storage: HDD 120GB/ 5400RPM.
- CPU: Intel(R) Xeon(R) CPU E5-2651 v2 @ 1.80GHz (# of cores 4, # of threads 8, HT enabled).
- Memory: 16GB RAM.
- Storage: HDD 10GB/ 5400RPM.
- CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz (# of cores 16, # of threads 32, HT enabled).
- Memory: 32GB RAM.
- Storage: HDD 10GB/ 5400RPM.
- OS : Red Hat Enterprise Linux Server release 6.5 (Santiago)
- Sysbench : 0.5.3
- GLB : 1.0.0
- Galera Cluster : 5.5.34 and 5.6.16
- Percona XtraDB Cluster : 5.5.37 and 5.6.19
- MariaDB Galera Cluster : 5.5.38 and 10.0.12
- Galera Library : 3.5
- The testing environment consists of 5 AWS servers, three servers for a three-node cluster (each node is installed on a single server), one server for the load balancer and the final server for the load generator in which sysbench is installed to send requests to the load balancer from.
- Sysbench command: sysbench --num-threads=64 --max-requests=1000 --db-driver=mysql --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-table-engine=InnoDB --mysql-user=dev --mysql-password='test' --mysql-host=load_balancer_ip run .
- Table structure which was used by sysbench tests: mysql> show create table sbtest.sbtest\G CREATE TABLE `sbtest` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `k` int(10) unsigned NOT NULL DEFAULT '0', `c` char(120) NOT NULL DEFAULT '', `pad` char(60) NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `k` (`k`) ) ENGINE=InnoDB AUTO_INCREMENT=8574 DEFAULT CHARSET=latin1
- The my.cnf used is something like: [mysqld] key_buffer_size = 16M max_allowed_packet = 16M thread_stack = 192K thread_cache_size = 8 innodb_buffer_pool_size = 8G innodb_flush_log_at_trx_commit = 0 expire_logs_days = 10 max_binlog_size = 100M server-id = 1 log-bin = mysql-bin binlog_format = ROW auto_increment_increment = 3 auto_increment_offset = 1 log_slave_updates default_storage_engine = InnoDB # Path to Galera library wsrep_provider = /usr/lib64/galera/libgalera_smm.so # Cluster connection URL contains the IPs of node#1, node#2 and node#3 wsrep_cluster_address = gcomm://nodeB-IP,nodeC-IP innodb_autoinc_lock_mode = 2 # Node #1 address wsrep_node_address = nodeA-IP # Cluster name wsrep_cluster_name = test_cluster # SST method wsrep_sst_method = rsync # Authentication for SST method wsrep_sst_auth = "sst:password"
Notes:
- The number of threads used in this test is 64 as it generated the highest throughput on all cluster installations.
- Each throughput value for each test case is generated by the average of ten (10) times execution.
The raw results in Transactions / Sec might be useful:
sync_binlog=0innodb_flush_log_ at_trx_commitGalera Cluster 5.5.34PXC 5.5.37MariaDB Galera Cluster 5.5.38Galera Cluster 5.6.16PXC 5.6.15MariaDB Galera Cluster 10.0.120525.119534.022534.249519.575532.19520.7361125.615131.748341.384157.001162.783174.972526.761528.858524.039511.817526.06521.024sync_binlog=10242.201249.622262.516220.313229.807220.97196.82996.759148.815111.995114.8113.0562224.476210.904217.142209.139201.596214.311
Conclusion
According to the above results:
- innodb_flush_log_at_trx_commit = 1 significantly slows down Galera.
- sync_binlog also cuts in half the throughput.
- All other are more or less equal in throughput.
Replication Troubleshooting - Classic VS GTID
In previous posts, I was talking about how to set up MySQL replication, Classic Replication (based on binary logs information) and Transaction-based Replication (based on GTID). In this article I'll summarize how to troubleshoot MySQL replication for the most common issues we might face with a simple comparison how can we get them solved in the different replication methods (Classic VS GTID).
There are two main operations we might need to do in a replication setup:
- Skip or ignore a statement that causes the replication to stop.
- Re-initialize a slave when the Replication is broke and could not be started anymore.
Basically, the slave should be always synchronized with its master having the same copy of data, but for some reasons there might be inconsistency between both of them (unsafe statement in SBR, Slave is not read_only and was modified apart of replication queries, .. etc) which causes errors and stops the replication, e.g. if the master inserted a record which was already inserted on the slave (Duplicate entry) or updated/deleted a row which was not exist on the slave, ... etc.
To solve this issue, we have to either reverse what we have done on the slave (e.g. delete the inserted rows) if that was made by mistake and is known or we can skip executing those statements on the slave and continue the replication again (I'll focus on skipping a statement in this post as it needs different interaction in Classic and GTID replication).
Sample error messages (from SHOW SLAVE STATUS output): Last_SQL_Error: Could not execute Write_rows event on table test.t1; Duplicate entry '4' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000304, end_log_pos 285 Last_SQL_Error: Could not execute Update_rows event on table test.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000304, end_log_pos 492 Last_SQL_Error: Could not execute Delete_rows event on table test.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000304, end_log_pos 688 How to solve that issue ?CLASSIC REPLICATION
Solving this problem is a straight forward process in the classic replication setup, what only we need is to issue the following SQL commands on the slave's:
SQL> SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; SQL> START SLAVE;GTID REPLICATION
Solving this problem is not a straight forward in GTID replication like it is in the Classic replication and the variable SQL_SLAVE_SKIP_COUNTER wont be useful in this area anymore.
To get this problem solved in a GTID replication we will need to inject an empty transaction as follows:
- Check which transaction is causing the problem:
SQL> SHOW SLAVE STATUS\G
.
.
Retrieved_Gtid_Set: b9b4712a-df64-11e3-b391-60672090eb04:1-7
Executed_Gtid_Set: 4f6d62ed-df65-11e3-b395-60672090eb04:1,
b9b4712a-df64-11e3-b391-60672090eb04:1-6
Auto_Position: 1
Retrieved_Gtid_Set means the retrieved GTIDs from the master
Executed_Gtid_Set means the executed GTIDs on the slave.
According to the above output, the slave retrieved GTIDs from 1:7 (b9b4712a-df64-11e3-b391-60672090eb04:1-7) and executed only from 1:6 (b9b4712a-df64-11e3-b391-60672090eb04:1-6), so the problem is in transaction number 7.
- Inject an empty transaction:
SQL> SET GTID_NEXT='b9b4712a-df64-11e3-b391-60672090eb04:7';
SQL> BEGIN;COMMIT;
SQL> SET GTID_NEXT='AUTOMATIC';
SQL> START SLAVE;
BE CAUTIOUS: The first part of Executed_Gtid_Set (4f6d62ed-df65-11e3-b395-60672090eb04:1) is the local executed GTIDs (not received from the master) while the second part (b9b4712a-df64-11e3-b391-60672090eb04:1-6) is the executed GTIDs which retrieved from the master (check the master's UUID by either checking the UUID value in "Retrieved_Gtid_Set" which is basically for the master's UUID or by issuing SHOW GLOBAL VARIABLES LIKE 'server_uuid'; on the master server). So we should make sure that we are using the master's UUID when injecting an empty transaction, otherwise, the problem will still remain and the slave wont be started.
Note:
After starting the slave successfully in either classic or GTID replication we might need to use a combination of Percona tools pt-table-checksum and pt-table-sync to fix the inconsistency problem.
Re-initialize/ re-build a slaveFor many reasons, we might end up with only re-build a slave to get the replication working, e.g. if we stopped a slave for a while where the master purged the binary log file that is needed by that slave or there are many duplicate entry errors so that pt-table-checksum and pt-table-sync could not be used then we have to re-initialize the slave from the beginning by having a fresh backup from the master server and restore it on the slave. Lets check how can we do that in both replication methods.
How to solve that issue ?CLASSIC REPLICATION
Sample error message:
Last_IO_Errno: 1236 Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
Fix steps:
- Backup the master server by the following command: shell> mysqldump -u root -p --all-databases --flush-privileges --single-transaction --master-data=2 --flush-logs --triggers --routines --events --hex-blob >/path/to/backupdir/full_backup-$TIMESTAMP.sql
- Restore the backup file on the slave: shell> mysql -u root -p < /path/to/backupdir/full_backup-$TIMESTAMP.sql
- Get the binary logs information when the backup was taken: shell> head -n 50 /path/to/backupdir/full_backup-$TIMESTAMP.sql|grep "CHANGE MASTER TO" CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000011', MASTER_LOG_POS=120;
- Issue the "CHANGE MASTER TO" command using the new information: SQL> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000011', MASTER_LOG_POS=120;
- Start the slave: SQL> START SLAVE;
NOTE:
Xtrabackup tool could be used instead of mysqldump,especially, if the database size is big. Check out this link for more information.
GTID REPLICATIONSample error message:
Last_IO_Errno: 1236 Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
Fix steps:
- Backup the master server by the following command: shell> mysqldump -u root -p --all-databases --flush-privileges --single-transaction --flush-logs --triggers --routines --events --hex-blob >/path/to/backupdir/full_backup-$TIMESTAMP.sql
- Check the GTID value when the backup was taken: shell> head -n 50 /path/to/backupdir/full_backup-$TIMESTAMP.sql|grep PURGED SET @@GLOBAL.GTID_PURGED='b9b4712a-df64-11e3-b391-60672090eb04:1-8';
- Reset the GTID_EXECUTED and GTID_PURGED values on the slave: SQL> RESET MASTER;
- Restore the backup file on the slave: shell> mysql -u root -p < /path/to/backupdir/full_backup-$TIMESTAMP.sql
- Make sure that the values of GTID_EXEUCTED and GTID_PURGED are the correct ones: SQL> SHOW GLOBAL VARIABLES LIKE 'gtid_executed'; +---------------+------------------------------------------+ | Variable_name | Value | +---------------+------------------------------------------+ | gtid_executed | b9b4712a-df64-11e3-b391-60672090eb04:1-8 | +---------------+------------------------------------------+ 1 row in set (0.00 sec) SHOW GLOBAL VARIABLES LIKE 'gtid_purged'; +---------------+------------------------------------------+ | Variable_name | Value | +---------------+------------------------------------------+ | gtid_purged | b9b4712a-df64-11e3-b391-60672090eb04:1-8 | +---------------+------------------------------------------+ 1 row in set (0.01 sec)
- Start the slave: SQL> START SLAVE;
NOTES:
- If we didn't reset the GTID_EXECUTED and GTID_PURGED values on the slave before restoring the backup file, the following error will be appeared:
shell> mysql -u root -p < /path/to/backupdir/full_backup-$TIMESTAMP.sql. ERROR 1840 (HY000): @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.The above error indicates that the statement at the beginning of the backup file - which is "SET @@GLOBAL.GTID_PURGED='b9b4712a-df64-11e3-b391-60672090eb04:1-8';" - failed because GTID_PURGED cannot be set unless GTID_EXECUTED is empty. Since GTID_EXECUTED is a read only variable, the only way to empty its value is to issue "RESET MASTER" on the slave server before restoring the backup file.
- Xtrabackup tool could be used as well instead of mysqldump to get this problem solved and without the need to reset GTID_EXECUTED and GTID_PURGED values . Check out this link for more information.
While GTID provides many benefits over the classic replication but it has different troubleshooting and fix strategies which must be known first before deploying GTID in production systems.
Taxonomy upgrade extras: GTIDreplicationReplication Troubleshooting - Classic VS GTID
In previous posts, I was talking about how to set up MySQL replication, Classic Replication (based on binary logs information) and Transaction-based Replication (based on GTID). In this article I'll summarize how to troubleshoot MySQL replication for the most common issues we might face with a simple comparison how can we get them solved in the different replication methods (Classic VS GTID).
There are two main operations we might need to do in a replication setup:
- Skip or ignore a statement that causes the replication to stop.
- Re-initialize a slave when the Replication is broke and could not be started anymore.
Basically, the slave should be always synchronized with its master having the same copy of data, but for some reasons there might be inconsistency between both of them (unsafe statement in SBR, Slave is not read_only and was modified apart of replication queries, .. etc) which causes errors and stops the replication, e.g. if the master inserted a record which was already inserted on the slave (Duplicate entry) or updated/deleted a row which was not exist on the slave, ... etc.
To solve this issue, we have to either reverse what we have done on the slave (e.g. delete the inserted rows) if that was made by mistake and is known or we can skip executing those statements on the slave and continue the replication again (I'll focus on skipping a statement in this post as it needs different interaction in Classic and GTID replication).
Sample error messages (from SHOW SLAVE STATUS output): Last_SQL_Error: Could not execute Write_rows event on table test.t1; Duplicate entry '4' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000304, end_log_pos 285 Last_SQL_Error: Could not execute Update_rows event on table test.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000304, end_log_pos 492 Last_SQL_Error: Could not execute Delete_rows event on table test.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000304, end_log_pos 688 How to solve that issue ?CLASSIC REPLICATION
Solving this problem is a straight forward process in the classic replication setup, what only we need is to issue the following SQL commands on the slave's:
SQL> SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; SQL> START SLAVE;GTID REPLICATION
Solving this problem is not a straight forward in GTID replication like it is in the Classic replication and the variable SQL_SLAVE_SKIP_COUNTER wont be useful in this area anymore.
To get this problem solved in a GTID replication we will need to inject an empty transaction as follows:
- Check which transaction is causing the problem:
SQL> SHOW SLAVE STATUS\G
.
.
Retrieved_Gtid_Set: b9b4712a-df64-11e3-b391-60672090eb04:1-7
Executed_Gtid_Set: 4f6d62ed-df65-11e3-b395-60672090eb04:1,
b9b4712a-df64-11e3-b391-60672090eb04:1-6
Auto_Position: 1
Retrieved_Gtid_Set means the retrieved GTIDs from the master
Executed_Gtid_Set means the executed GTIDs on the slave.
According to the above output, the slave retrieved GTIDs from 1:7 (b9b4712a-df64-11e3-b391-60672090eb04:1-7) and executed only from 1:6 (b9b4712a-df64-11e3-b391-60672090eb04:1-6), so the problem is in transaction number 7.
- Inject an empty transaction:
SQL> SET GTID_NEXT='b9b4712a-df64-11e3-b391-60672090eb04:7';
SQL> BEGIN;COMMIT;
SQL> SET GTID_NEXT='AUTOMATIC';
SQL> START SLAVE;
BE CAUTIOUS: The first part of Executed_Gtid_Set (4f6d62ed-df65-11e3-b395-60672090eb04:1) is the local executed GTIDs (not received from the master) while the second part (b9b4712a-df64-11e3-b391-60672090eb04:1-6) is the executed GTIDs which retrieved from the master (check the master's UUID by either checking the UUID value in "Retrieved_Gtid_Set" which is basically for the master's UUID or by issuing SHOW GLOBAL VARIABLES LIKE 'server_uuid'; on the master server). So we should make sure that we are using the master's UUID when injecting an empty transaction, otherwise, the problem will still remain and the slave wont be started.
Note:
After starting the slave successfully in either classic or GTID replication we might need to use a combination of Percona tools pt-table-checksum and pt-table-sync to fix the inconsistency problem.
Re-initialize/ re-build a slaveFor many reasons, we might end up with only re-build a slave to get the replication working, e.g. if we stopped a slave for a while where the master purged the binary log file that is needed by that slave or there are many duplicate entry errors so that pt-table-checksum and pt-table-sync could not be used then we have to re-initialize the slave from the beginning by having a fresh backup from the master server and restore it on the slave. Lets check how can we do that in both replication methods.
How to solve that issue ?CLASSIC REPLICATION
Sample error message:
Last_IO_Errno: 1236 Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
Fix steps:
- Backup the master server by the following command: shell> mysqldump -u root -p --all-databases --flush-privileges --single-transaction --master-data=2 --flush-logs --triggers --routines --events --hex-blob >/path/to/backupdir/full_backup-$TIMESTAMP.sql
- Restore the backup file on the slave: shell> mysql -u root -p < /path/to/backupdir/full_backup-$TIMESTAMP.sql
- Get the binary logs information when the backup was taken: shell> head -n 50 /path/to/backupdir/full_backup-$TIMESTAMP.sql|grep "CHANGE MASTER TO" CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000011', MASTER_LOG_POS=120;
- Issue the "CHANGE MASTER TO" command using the new information: SQL> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000011', MASTER_LOG_POS=120;
- Start the slave: SQL> START SLAVE;
NOTE:
Xtrabackup tool could be used instead of mysqldump,especially, if the database size is big. Check out this link for more information.
GTID REPLICATIONSample error message:
Last_IO_Errno: 1236 Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
Fix steps:
- Backup the master server by the following command: shell> mysqldump -u root -p --all-databases --flush-privileges --single-transaction --flush-logs --triggers --routines --events --hex-blob >/path/to/backupdir/full_backup-$TIMESTAMP.sql
- Check the GTID value when the backup was taken: shell> head -n 50 /path/to/backupdir/full_backup-$TIMESTAMP.sql|grep PURGED SET @@GLOBAL.GTID_PURGED='b9b4712a-df64-11e3-b391-60672090eb04:1-8';
- Reset the GTID_EXECUTED and GTID_PURGED values on the slave: SQL> RESET MASTER;
- Restore the backup file on the slave: shell> mysql -u root -p < /path/to/backupdir/full_backup-$TIMESTAMP.sql
- Make sure that the values of GTID_EXEUCTED and GTID_PURGED are the correct ones: SQL> SHOW GLOBAL VARIABLES LIKE 'gtid_executed'; +---------------+------------------------------------------+ | Variable_name | Value | +---------------+------------------------------------------+ | gtid_executed | b9b4712a-df64-11e3-b391-60672090eb04:1-8 | +---------------+------------------------------------------+ 1 row in set (0.00 sec) SHOW GLOBAL VARIABLES LIKE 'gtid_purged'; +---------------+------------------------------------------+ | Variable_name | Value | +---------------+------------------------------------------+ | gtid_purged | b9b4712a-df64-11e3-b391-60672090eb04:1-8 | +---------------+------------------------------------------+ 1 row in set (0.01 sec)
- Start the slave: SQL> START SLAVE;
NOTES:
- If we didn't reset the GTID_EXECUTED and GTID_PURGED values on the slave before restoring the backup file, the following error will be appeared:
shell> mysql -u root -p < /path/to/backupdir/full_backup-$TIMESTAMP.sql. ERROR 1840 (HY000): @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.The above error indicates that the statement at the beginning of the backup file - which is "SET @@GLOBAL.GTID_PURGED='b9b4712a-df64-11e3-b391-60672090eb04:1-8';" - failed because GTID_PURGED cannot be set unless GTID_EXECUTED is empty. Since GTID_EXECUTED is a read only variable, the only way to empty its value is to issue "RESET MASTER" on the slave server before restoring the backup file.
- Xtrabackup tool could be used as well instead of mysqldump to get this problem solved and without the need to reset GTID_EXECUTED and GTID_PURGED values . Check out this link for more information.
While GTID provides many benefits over the classic replication but it has different troubleshooting and fix strategies which must be known first before deploying GTID in production systems.
Taxonomy upgrade extras: GTIDreplication