FromDual TechFeed (en)
Replication in a star
Most of you know that both MySQL and MariaDB support replication in a hierarchical master-slave setup to propagate changes across all connected servers.
With the latest releases, however, a slave can have more than one master, which makes it possible to use more than just two servers in a multi-master-like setup.
The keyword: Multi-Source replication
It is supported from MySQL 5.7 and MariaDB 10.0 on, and this article describes how to set it up.
What does Multi-Source mean?
Multi-Source means that you can take two or more masters and replicate them to one slave, where their changes are merged. This works just like regular MySQL/MariaDB replication.
Well, we are going to exploit that a little. It is still possible to configure a Master-Master setup, which basically allows the following configuration.
Multiple servers are joined in a cluster, where every server is a master and the replication happens over one central node.
Why should we use Multi-Source Replication if it's just Master-Master?
With a Master-Master setup, you are limited to a ring topology or to two servers. With Multi-Source, you can use more than two servers without having to rely on a ring topology, which might break and cause the replication to halt.
Furthermore, it is possible to back everything up in one place, without the risk of interrupting access to the databases.
Layout
The logical topology is a star. The following image shows a possible setup, with servers located in different countries. Thanks to asynchronous replication, which does not require a broadband connection to work properly, this is possible without a problem.
What has to be considered?
- The problems of any other Master-Master setup apply here as well.
- If the replication in the cluster stalls, the problem usually affects more than one server, maybe even the entire cluster.
- Although the replication is asynchronous and does not cause a lot of network traffic by itself, replicating large or heavily accessed databases will cause noticeable traffic at the central node.
The setup is like any other Master-Master replication, except that you will have more masters in the cluster.
1) Set up a standard installation of MySQL 5.7 or MariaDB 10.0 or above.
2) Prepare the configuration on all servers:
- On all the outer nodes (in the layout: all except server 1):
log_slave_updates = 0
- On the central node (in the layout: server 1):
log_slave_updates = 1
NOTE:
- The central server must forward everything it receives, so that changes originating on one outer node also reach all the other outer nodes.
- The outer nodes must not forward such changes, because they would loop through the cluster forever.
- Give each server a unique ID!
- Create the user which will be used for the replication:
CREATE USER 'replicator'@'localhost' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'' IDENTIFIED BY 'password';
- On MySQL 5.7 you have to enable the use of GTIDs (see step 3).
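As an illustration, the relevant my.cnf pieces could look roughly like the following sketch. The server IDs, the binary log name and the split between outer and central node are example values, not taken from the original setup:
# my.cnf on an outer node (here: server 2; server_id must be unique per server)
[mysqld]
server_id         = 2
log_bin           = binlog        # every node acts as a master, so it needs a binary log
log_slave_updates = 0             # do not forward changes received from the central node

# my.cnf on the central node (server 1)
[mysqld]
server_id         = 1
log_bin           = binlog
log_slave_updates = 1             # forward everything, so all outer nodes get each other's changes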
3) Set up the replication with all the outer nodes:
I will start with MySQL 5.7:
Source:
- MySQL Multi-Source-Replication
- Online enable GTIDs
Although there is a paragraph stating that it is possible with file and position, I have experienced otherwise, which forced me to enable GTIDs.
You have to change the repositories for "master info" and "relay log" to the format "TABLE":
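A sketch of the corresponding settings; they can go into my.cnf as shown here, or be changed with SET GLOBAL at runtime while no slave threads are running:
master_info_repository    = TABLE
relay_log_info_repository = TABLE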
Further, it is necessary to enable GTIDs. From MySQL 5.7.6 on, this is possible without restarting the server.
mysql> SET GLOBAL ENFORCE_GTID_CONSISTENCY = ON;
mysql> SET GLOBAL gtid_mode = OFF_PERMISSIVE;
mysql> SET GLOBAL gtid_mode = ON_PERMISSIVE;
mysql> SET GLOBAL gtid_mode = ON;
The value of "ENFORCE_GTID_CONSISTENCY" has to be "ON", otherwise the slave will return an error and refuse to work. And don't forget to make those changes persistent in your "my.cnf".
Create a new link to a master:
mysql> CHANGE MASTER TO MASTER_HOST='', MASTER_PORT=, MASTER_USER='', MASTER_PASSWORD='', MASTER_AUTO_POSITION=1 FOR CHANNEL 'mysql-master';
Start the slave:
mysql> START SLAVE [thread_type] [FOR CHANNEL 'channel-name'];
Stop the slave:
mysql> STOP SLAVE [thread_type] [FOR CHANNEL 'channel-name'];
How to modify a link:
mysql> STOP SLAVE FOR CHANNEL 'mysql-master';
mysql> CHANGE MASTER TO MASTER_HOST='', MASTER_PORT=, MASTER_USER='', MASTER_PASSWORD='', MASTER_AUTO_POSITION=1 FOR CHANNEL 'mysql-master';
mysql> START SLAVE FOR CHANNEL 'mysql-master';
mysql> SHOW SLAVE STATUS FOR CHANNEL 'mysql-master'\G
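For illustration, here is what the link to the central node could look like with concrete values filled in; host, port, user and password are hypothetical placeholders, not values from this setup:
mysql> CHANGE MASTER TO
         MASTER_HOST='192.0.2.10',   -- hypothetical address of the central node
         MASTER_PORT=3306,
         MASTER_USER='replicator',
         MASTER_PASSWORD='password',
         MASTER_AUTO_POSITION=1
       FOR CHANNEL 'mysql-master';
mysql> START SLAVE FOR CHANNEL 'mysql-master';
mysql> SHOW SLAVE STATUS FOR CHANNEL 'mysql-master'\G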
Now MariaDB 10.0:
Source: MariaDB Multi-Source-Replication
Create a new link to a master (ATTENTION: MariaDB has a different syntax, compared to MySQL):
mariadb> CHANGE MASTER 'maria-master' TO MASTER_HOST='', MASTER_PORT=, MASTER_USER='', MASTER_PASSWORD='';
Start all slaves on this server at the same time:
mariadb> START ALL SLAVES;
Check the status of all slaves on this server:
mariadb> SHOW ALL SLAVES STATUS\G
If you want to manage a single slave on its own, you can use the regular commands, but you have to make its connection the default one.
Here is an example of what you have to do if you want to modify the connection 'maria-master':
mariadb> STOP SLAVE 'maria-master';
mariadb> CHANGE MASTER 'maria-master' TO MASTER_HOST='', MASTER_PORT=, MASTER_USER='', MASTER_PASSWORD='';
mariadb> START SLAVE 'maria-master';
mariadb> SHOW SLAVE 'maria-master' STATUS\G
After you have done this, you have a normal Master-Slave replication between two servers (as shown in the following picture). To complete the star, repeat those steps on each server you want to connect.
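As with MySQL above, a filled-in sketch of such a connection might look like this; host, port and credentials are hypothetical placeholders, and the GTID option is optional:
mariadb> CHANGE MASTER 'maria-master' TO
           MASTER_HOST='192.0.2.10',   -- hypothetical address of the central node
           MASTER_PORT=3306,
           MASTER_USER='replicator',
           MASTER_PASSWORD='password',
           MASTER_USE_GTID=slave_pos;  -- optional: let this connection use MariaDB GTIDs
mariadb> START SLAVE 'maria-master';
mariadb> SHOW SLAVE 'maria-master' STATUS\G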
Is it possible to expand an existing setup?
It is no problem to expand an existing Master-Master or Master-Slave setup. However, I would recommend creating new connections, so that the replication configuration is consistent with the other connections.
How do I break it?
Just like any other Master-Master setup. So be careful with writing on more than one server. In general, if a bad command is committed, it will be replicated to the other nodes, and this will cause the cluster to stall.
Another problem is auto_increment. Duplicate IDs will cause a "Duplicate Key Error" and stall each server it happens on. This can be prevented by setting "auto_increment_increment" and "auto_increment_offset". In this scenario, "auto_increment_increment" should be 7 (the number of servers in the cluster) and "auto_increment_offset" should be a value from 1 to 7 that is unique per server.
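A minimal sketch of what this could look like in the my.cnf of, say, the third server of a seven-node star; the numbering is illustrative:
# my.cnf on server 3 of 7 (illustrative values)
auto_increment_increment = 7   # step size = number of servers in the cluster
auto_increment_offset    = 3   # unique per server, between 1 and auto_increment_increment
With these settings, server 3 generates the IDs 3, 10, 17, ... while server 4 generates 4, 11, 18, ..., so two servers can never hand out the same auto_increment value.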
How do I fix it?
Sadly, that is not so easy. In addition, the statements are still replicated across the cluster, which usually causes more than just one server to stall; most of the time it is the entire cluster.
You have to execute every action on each affected connection. This gets tedious if you have a large number of nodes.
Let's assume you have a duplicate key error, because two inserts happened at the same time.
MySQL 5.7:
If you experience a stalled replication on MySQL, you have to skip the GTID of the transaction which caused the stall.
First, stop the slave:
mysql> STOP SLAVE FOR CHANNEL 'failed-transactions-channel-name';
Retrieve the next GTID:
mysql> SHOW SLAVE STATUS\G
The line "Retrieved_Gtid_Set" contains the next GTID which would be executed. Copy the value and tell the server to execute that GTID:
mysql> SET GTID_NEXT="dd57a411-b477-11e5-b518-005056244454";Execute a blank transaction:
mysql> BEGIN; COMMIT;
Tell the server to take control of the GTIDs:
mysql> SET GTID_NEXT="AUTOMATIC";And restart the slave:
mysql> START SLAVE FOR CHANNEL 'failed-transactions-channel-name';
MariaDB 10.0:
Stop the slave for the connection:
mariadb> STOP SLAVE 'maria-master';
Define which connection you would like to edit:
mariadb> SET @@default_master_connection='maria-master'; # No, it's not a joke.
Skip the failed statement:
mariadb> SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
Restart the slave:
mariadb> START SLAVE 'maria-master';
Check if the slave is running:
mariadb> SHOW SLAVE 'maria-master' STATUS\G
Please note the "@@default_master_connection". It really is necessary to set this variable, or you will not be able to change the counter of the connection you want. I'm not sure if I should be surprised or shocked that this is the solution recommended by MariaDB.
Hot expand
What is needed to add a new node? Is it required to stop the entire cluster? Or can I just add a new server?
If you want to add a new node to the cluster, it is best to take a dump from another node, including the master data. You then import that dump into the new node, make sure the link to the master is configured correctly, and start it. If everything is set up the right way, you should have no problems at all.
It is not necessary to stop the entire cluster, since you can add the links between the nodes while the servers are up.
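A sketch of how such a dump could be taken on an existing node and loaded on the new one; host and file names are hypothetical, and on a GTID-based MySQL 5.7 setup you may additionally want to look at the --set-gtid-purged option of mysqldump:
# On an existing node: dump everything, including the master coordinates
mysqldump --all-databases --single-transaction --master-data=2 \
          --triggers --routines --events > full_dump.sql

# On the new node: import the dump, then configure and start the link to the central node
mysql < full_dump.sql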
Backup and Restore
Method 1
The thought behind Multi-Source replication was to make the administrator's life easier when it comes to backups. Instead of backing up all data at their respective locations, you can gather everything on one server and do the backup on that machine, without interrupting the work of the "productive" servers.
To obtain a backup of all the replicated databases of the cluster, it is best to do it on the central server, because everything has to go over it anyway. This method is suitable to protect yourself from losing all data of the cluster. The downside is that when one node dies, you have to fetch the backup from the central location and transfer it to the failed server.
Method 2
If you would like to keep your data safe at the location of each server, you can set up a slave at each location and replicate the local master to it. For a restore, you could use the existing slave as the new node of the star while the old server is rebuilt. This setup keeps the downtime of the service as short as possible. Further, you are not required to transfer the backup from one location to another, since it is already stored close to the failed server.
Versions used:
MariaDB: 10.1.10
MySQL: 5.7.10
How to Get a Galera Cluster Into Split Brain
"Split Brain" is the term commonly used for a cluster whose nodes have different contents, rather than identical as they should have. Typically, a "split brain" situation is the DBA's nightmare, and the Galera software is designed to avoid it. Galera is very successful in that avoidance, and it needs some special steps by the DBA to achieve "split brain". Here is how to do it - or, for most DBAs, what to avoid doing to not get a split-brain cluster.
Galera's Design
First, let's remember how Galera operates:
- The Galera software ensures that all nodes participating in a cluster will start from identical contents, by doing a "snapshot state transfer" (SST) of all current data to a newly joining node.
- When the cluster is running, Galera transfers all changes (transactions) to all cluster nodes and applies them (or rolls back and ignores, in the case of a conflict).
- If some connections get lost, all nodes check whether they "have quorum" (belong to a majority), and stop serving requests if they don't.
- When a disconnected node re-joins the cluster, it gets all meantime changes transferred ("incremental state transfer" IST) and so makes its contents current.
- Should that be impossible, because some of those changes have become unavailable (log purge), a full transfer (SST) is done.
Of course, the quorum is a well-known concept. The old term for it is "majority consensus", and the approach is built on a simple principle:
In any set (of cluster nodes), there cannot be two (or more) non-overlapping subsets which both contain a majority of the elements.
So if some loss of connectivity splits the cluster into subsets, at most one of them can "have quorum", all others will stop serving requests, and there cannot be two (or more) different directions in which the contents (data) changes.
What Galera introduced (compared to previous designs of distributed DBMSs) is the efficient transfer of changes and conflict detection / resolution ("certification" in Galera terms) at "commit" time that makes the system fast, while previous designs used "distributed locking" or other principles which added latency to many commands and so made their systems slow.
The Story
Let's get back to the "split brain" issue, of which I said that Galera avoids it, and also said it can be reached. Sounds contradictory? Well, there are more active components than just Galera. Here is a real-world case, as happened to (I won't say "achieved by") a customer:
Originally, they had set up a Galera cluster of three nodes; let's call them A, B, and C. This is started by bringing up node A as a stand-alone node, running MySQL with the "wsrep" plugin. Then, one after the other, nodes B and C are configured to join node A in forming a cluster, and started. As a result, there are three nodes communicating with each other that form the cluster. In addition, HAproxy is running somewhere; it will direct the clients to an active cluster node.
So far, so good: The cluster is running, clients connect and issue transactions, everything is ok, and the DBA/s turn/s to other tasks ...
Some time later, node A must be stopped to do some hardware maintenance. No problem, nodes B and C are running fine, they have quorum (2 of 3 is a majority), so the system is still available and operations continue. HAproxy detects that node A does not respond, so it directs all clients to B or C. The cluster architecture is serving its purpose of continuous availability even during a maintenance period.
Maintenance is done, node A is rebooted, its MySQL+Galera server process restarts. It comes up, HAproxy detects it as running and directs clients to it. All seems fine ...
Three hours later, someone has become suspicious, detects trouble, and node A is stopped. Why? What has happened?
Some
large
gap
is
appropriate
to
give
readers
a
break
and
let
them
consider
the
situation.
What has happened?
Remember what I wrote about the cluster setup: It was
... started by bringing up node A as a stand-alone node, ...
... nodes B and C are configured to join node A ...
... three nodes communicating with each other that form the cluster.
These steps were sufficient to get the nodes up and running. What was missing, however, was to re-configure A from "stand-alone" to "member of cluster with B and C".
As a consequence, when A was restarted after the maintenance, it again came up stand-alone. It did not try to join the cluster (which would have triggered an IST of the meantime changes) or check for quorum (a stand-alone node is self-sufficient).
Based on A's configuration (as read from disk), all was fine.
Based on the concept of the A+B+C cluster, it was a plain, simple split-brain:
Some changes were done on B+C (which still had quorum), while others were done on A only (which was mis-configured).
Lesson to learn (or rather, to bring back into active brain memory):
If some configuration is changed at run-time, this change must also be done in the configuration files so that it is used on restart.
The typical example for such changes is a "set global" command modifying some dynamic variable, like "max_connections".
But in a Galera cluster, a node joining the others is also a dynamic configuration change, and it should ASAP be reflected in all configuration files.
If this isn't done, the consequences might be as described above.
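For a Galera node, the part of the configuration that must match the runtime cluster membership is the wsrep section. A minimal sketch, with hypothetical paths and addresses:
# my.cnf on every node of the A+B+C cluster (illustrative values)
[mysqld]
wsrep_provider        = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name    = abc_cluster
# List all members, so a restarted node rejoins the cluster instead of coming up stand-alone:
wsrep_cluster_address = gcomm://192.0.2.1,192.0.2.2,192.0.2.3
Node A was typically bootstrapped with an empty gcomm:// address; once the cluster exists, the full member list belongs into every node's configuration file, including A's.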
Now, most stories have a happy ending, and this one shouldn't be an exception:
Luckily, the application uses self-generated keys, similar to UUIDs, so the entries created on A did not conflict with those of B+C. Also, there were no changes of existing data, just inserts. So the situation could be corrected by extracting all new data from A, inserting them on B+C, and then resetting A's state so that it asked B+C for an SST. Uff!
Operational Advice
There are some tools available that will do such a transfer. However, it can be done completely with standard parts coming with MySQL:
- "mysqldump" will extract the data from A.
- Suitable options will make sure this extract does not contain "drop" or "create" commands, and generates "insert ignore".
Check the documentation for the options "--no-create-db", "--no-create-info", "--skip-add-drop-table", and "--insert-ignore".
- If the old, common data are deleted first, both dumping and loading become faster, and duplicates are reduced / avoided.
- "mysql" can be used to load these data into B or C.
Had they used auto-increment keys, or had they modified existing data, it would have been much more complicated, and it might even have been impossible to combine all changes without losing some. I leave it to your imagination to think of such scenarios.
To repeat the lesson in DBMS / DBA terms:
- The most important property of a database is consistency; it must be kept up at all times.
- For database operations, the configuration on disk (in files) must be consistent with that in RAM (of the running processes), so any runtime changes must be reflected in the configuration files on disk to maintain consistency.
Percona's "pt-config-diff" can be used to compare a node's current variables to its configuration file.
Take care!
Migration of SQLite to MySQL
In my first blog post ever, I am going to cover the migration of an SQLite database to MySQL. The tool used is MySQL Workbench, which you can download from the MySQL website. In this particular case, it is about the upgrade of mocenter 0.2 to 0.3.
In the Workbench on the right side, you have the button “Database Migration”. Once you click on it, the introduction to the migration wizard will show up.
Setting up source and target
At the bottom of the screen there is a button called “Start Migration”. Click it to get to the source selection.
In the first drop-down menu, choose “SQLite”. The menu will now change and give you the possibility to load a file. After you have done this, you can test the connection with the button on the bottom left, or go to the bottom right and click next.
You get a new menu where you can choose the target of the migration. You can use a stored connection or a new one. If you have no idea which connection type to use, TCP/IP usually works fine. Here I recommend testing the connection, so you can see whether the target is reachable. Then click next.
If the schema fetch does not return any error, you get to the schema selection.
There you have to select a target schema before you can click next and go on to fetch the source. Once you have done that too, click next once more.
Object selection
Now the database is ready to copy.
If you want to remove some of the tables from the migration, you can do this under “Show Selection”.
In case you see some warnings, you can ignore them.
Now click next and make sure there are no errors or failures, until you can select how to create the target database.
If you want to, you can create the SQL file to import the database structure somewhere else. I am just going to put it onto my server.
Clicking next after this step will create the database on the server. If you chose to create the SQL file, it will be created now as well.
Click next when the creation is done, and you should get something that looks like this picture. If you select a line, you will be shown the command that was executed.
This was the structure. Now to the data:
To transfer your data from the SQLite directly into the MySQL-Database, you can make an online copy. If you want to, or have to, do it from the command line you could also make a bash file.
If you check the option “Truncate target tables”, all the tables that already exist will be cleared of any data, so be careful with that checkbox.
Once complete, you should get a report that looks somewhat like this.
------------------------------------------------------------------------------------
MySQL Workbench Migration Wizard Report
Date: Sun Oct 18 17:36:53 2015
Source: SQLite 1.0.0
Target: MySQL 5.6.24
------------------------------------------------------------------------------------

I. Migration

1. Summary
Number of migrated schemas: 1
1. mocenter
Source Schema: mocenter
- Tables: 10
- Triggers: 0
- Views: 0
- Stored Procedures: 0
- Functions: 0

2. Migration Issues
- versions: warning: Source table has a PRIMARY KEY allowing NULL values, which is not supported by MySQL. Column was changed to NOT NULL.
- PRIMARY: warning: Truncated key column length for column from 0 to 255
- nodes: warning: Source table has a PRIMARY KEY allowing NULL values, which is not supported by MySQL. Column was changed to NOT NULL.
- name_ui: warning: Truncated key column length for column from 0 to 255
- clusters: warning: Source table has a PRIMARY KEY allowing NULL values, which is not supported by MySQL. Column was changed to NOT NULL.
- cluster_name: warning: Truncated key column length for column from 0 to 255
- vips: warning: Source table has a PRIMARY KEY allowing NULL values, which is not supported by MySQL. Column was changed to NOT NULL.
- ip_address: warning: Truncated key column length for column from 0 to 255
- servers: warning: Source table has a PRIMARY KEY allowing NULL values, which is not supported by MySQL. Column was changed to NOT NULL.
- server_name: warning: Truncated key column length for column from 0 to 255
- users: warning: Source table has a PRIMARY KEY allowing NULL values, which is not supported by MySQL. Column was changed to NOT NULL.
- login_name: warning: Truncated key column length for column from 0 to 255
- moc_identifier: warning: Source table has a PRIMARY KEY allowing NULL values, which is not supported by MySQL. Column was changed to NOT NULL.
- PRIMARY: warning: Truncated key column length for column from 0 to 255
- checks: warning (reported twice): Source table has a PRIMARY KEY allowing NULL values, which is not supported by MySQL. Column was changed to NOT NULL.
- PRIMARY: warning: Truncated key column length for column from 0 to 255
- name_dc: warning: Truncated key column length for column from 0 to 255
- jobs: warning: Source table has a PRIMARY KEY allowing NULL values, which is not supported by MySQL. Column was changed to NOT NULL.

3. Object Creation Issues

4. Migration Details
4.1. Table mocenter.versions (versions)
Columns: version LONGTEXT; mr_version INT NULL; timestamp INT NULL
Indices: PRIMARY (version(255))
4.2. Table mocenter.licenses (licenses)
Columns: license LONGTEXT NULL
4.3. Table mocenter.nodes (nodes)
Columns: node_id INT; name LONGTEXT NULL; last_change_ts INT NULL; status LONGTEXT NULL; node_type LONGTEXT NULL; hostname LONGTEXT NULL; basedir LONGTEXT NULL; datadir LONGTEXT NULL; my_cnf LONGTEXT NULL; port INT NULL; database_user LONGTEXT NULL; database_user_password LONGTEXT NULL; error_log LONGTEXT NULL; pid_file LONGTEXT NULL; read_only INT NULL; server_id INT NULL; role_id INT NULL; cluster_id INT NULL; master_id INT NULL
Indices: PRIMARY (node_id); name_ui (name(255))
4.4. Table mocenter.clusters (clusters)
Columns: cluster_id INT; name LONGTEXT NULL; last_change_ts INT NULL; type INT NULL
Indices: PRIMARY (cluster_id); cluster_name (name(255))
4.5. Table mocenter.vips (vips)
Columns: vip_id INT; ip_address LONGTEXT NULL; name LONGTEXT NULL; ipaddr_type INT NULL; interface LONGTEXT NULL; alias INT NULL; primary_id INT NULL; failover_id INT NULL; location_id INT NULL; cluster_id INT NULL; last_change_ts INT NULL; fo_sync_only INT NULL; fo_wait_sync INT NULL
Indices: PRIMARY (vip_id); ip_address (ip_address(255))
4.6. Table mocenter.servers (servers)
Columns: server_id INT; name LONGTEXT NULL; default_ip LONGTEXT NULL; os_user LONGTEXT NULL; cluster_id INT NULL; last_change_ts INT NULL; myenv_basedir LONGTEXT NULL
Indices: PRIMARY (server_id); server_name (name(255))
4.7. Table mocenter.users (users)
Columns: user_id INT; login_name LONGTEXT NULL; password_hash LONGTEXT NULL; email_address LONGTEXT NULL; first_name LONGTEXT NULL; last_name LONGTEXT NULL; mobile LONGTEXT NULL; role_id INT NULL
Indices: PRIMARY (user_id); login_name (login_name(255))
4.8. Table mocenter.moc_identifier (moc_identifier)
Columns: moc_identifier LONGTEXT
Indices: PRIMARY (moc_identifier(255))
4.9. Table mocenter.checks (checks)
Columns: unit_id INT; type LONGTEXT NULL; name LONGTEXT; last_check_ts INT NULL; last_check_status LONGTEXT NULL; last_successful_check_ts INT NULL; last_successful_check_status LONGTEXT NULL
Indices: PRIMARY (unit_id, name(255)); name_dc (name(255))
4.10. Table mocenter.jobs (jobs)
Columns: job_id INT; name LONGTEXT NULL; server LONGTEXT NULL; pid INT NULL; start_ts INT NULL; status LONGTEXT NULL; check_interval INT NULL; last_check_ts INT NULL; end_ts INT NULL; error_code INT NULL; error_message LONGTEXT NULL; command LONGTEXT NULL
Indices: PRIMARY (job_id)
(None of the migrated tables have foreign keys.)

II. Data Copy
- `mocenter`.`moc_identifier`: Succeeded: copied 1 of 1 rows from "mocenter"."moc_identifier"
- `mocenter`.`clusters`: Succeeded: copied 1 of 1 rows from "mocenter"."clusters"
- `mocenter`.`jobs`: Succeeded: copied 0 of 0 rows from "mocenter"."jobs"
- `mocenter`.`users`: Succeeded: copied 1 of 1 rows from "mocenter"."users"
- `mocenter`.`vips`: Succeeded: copied 1 of 1 rows from "mocenter"."vips"
- `mocenter`.`versions`: Succeeded: copied 2 of 2 rows from "mocenter"."versions"
- `mocenter`.`servers`: Succeeded: copied 2 of 2 rows from "mocenter"."servers"
- `mocenter`.`checks`: Succeeded: copied 46 of 46 rows from "mocenter"."checks"
- `mocenter`.`licenses`: Succeeded: copied 1 of 1 rows from "mocenter"."licenses"

Conclusion
Migrating from SQLite to MySQL is very easy when using MySQL Workbench.
Max_used_connections per user/account
How many connections can be opened concurrently against a MySQL or MariaDB database is configured, and can be checked, with the following command:
SHOW GLOBAL VARIABLES LIKE 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 505   |
+-----------------+-------+

Whether this limit was ever reached in the past can be checked with:

SHOW GLOBAL STATUS LIKE 'max_use%';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| Max_used_connections | 23    |
+----------------------+-------+

But on MySQL instances with many different applications (= databases/schemas) and thus many different users, it is a bit more complicated to find out which of these users have connected how many times concurrently. We can configure how many connections one specific user can have open at the same time with:

SHOW GLOBAL VARIABLES LIKE 'max_user_connections';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| max_user_connections | 500   |
+----------------------+-------+

Further we can limit one specific user with:

GRANT USAGE ON *.* TO 'repl'@'%' WITH MAX_CONNECTIONS_PER_HOUR 100 MAX_USER_CONNECTIONS 10;

and check with:

SELECT User, Host, max_connections, max_user_connections FROM mysql.user;
+------+---------------+-----------------+----------------------+
| User | Host          | max_connections | max_user_connections |
+------+---------------+-----------------+----------------------+
| root | localhost     |               0 |                    0 |
| repl | %             |             100 |                   10 |
| repl | 192.168.1.139 |               0 |                    0 |
+------+---------------+-----------------+----------------------+

But we currently have no way to check whether this limit was reached, or nearly reached, in the past...
A feature request for this was opened at MySQL with bug #77888.
Solution

If you cannot wait for the implementation, here we have a little workaround:
DROP TABLE IF EXISTS mysql.`max_used_connections`;

CREATE TABLE mysql.`max_used_connections` (
  `USER` char(16) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `HOST` char(60) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `MAX_USED_CONNECTIONS` bigint(20) NOT NULL,
  PRIMARY KEY (`USER`, `HOST`) USING HASH
) ENGINE=MEMORY DEFAULT CHARSET=utf8;

DROP EVENT IF EXISTS mysql.gather_max_used_connections;

-- event_scheduler = on
CREATE DEFINER=root@localhost EVENT mysql.gather_max_used_connections
ON SCHEDULE EVERY 10 SECOND
DO
  INSERT INTO mysql.max_used_connections
  SELECT user, host, current_connections
    FROM performance_schema.accounts
   WHERE user IS NOT NULL
     AND host IS NOT NULL
      ON DUPLICATE KEY
  UPDATE max_used_connections = IF(current_connections > max_used_connections, current_connections, max_used_connections)
;

SELECT * FROM mysql.max_used_connections;
+--------+-----------+----------------------+
| USER   | HOST      | MAX_USED_CONNECTIONS |
+--------+-----------+----------------------+
| root   | localhost |                    4 |
| zabbix | localhost |                   21 |
+--------+-----------+----------------------+

Caution: Because we used a MEMORY table, those values are reset at every MySQL restart (as it happens with the PERFORMANCE_SCHEMA or the INFORMATION_SCHEMA).
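To relate the gathered values to the configured per-account limits, a rough follow-up query can help. This is only a sketch on top of the workaround above: it assumes the mysql.max_used_connections table exists and joins on the user name only, so accounts with several host entries need manual interpretation:

SELECT m.USER, m.HOST, m.MAX_USED_CONNECTIONS, u.max_user_connections
  FROM mysql.max_used_connections AS m
  JOIN mysql.user AS u ON u.User = m.USER    -- rough join on the user name only
 WHERE u.max_user_connections > 0            -- only accounts with an explicit limit
   AND m.MAX_USED_CONNECTIONS >= 0.8 * u.max_user_connections;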
The Upcoming Leap Second
The press, be it the general daily newspaper or the computer magazines, is currently informing the public about an upcoming leap second, which will be taken in the night from June 30 to July 1 at 00:00:00 UTC. While we Europeans will enjoy our well-deserved sleep then, this will be at 5 PM (17:00) local time on June 30 for people in California, and during the morning of July 1 for people in China, Japan, Korea, or Australia. (Other countries not mentioned for the sake of brevity.) This is different from last time, when the leap second was taken in the night from Saturday to Sunday (2012-July-1 00:00:00 UTC), so it was a weekend everywhere on the globe.
We have got several requests from our customers about this upcoming leap second, whether they need to take any special precautions or whether they "are safe". Well, obviously nobody is "safe" from the leap second in the sense that it would circumvent them, everybody will encounter it on their systems. The concern is whether they have to expect any trouble.
For the people operating MySQL (or any other DBMS), this issue is threefold: it concerns the operating system, the database server itself, and the application using the database.
Let us look at the operating system first, dealing with Linux only.
(All other operating systems don't show up with significant numbers in our customer base.)
Linux measures the time in seconds (since Jan 1, 1970, 00:00:00 UTC), and it does not include the leap seconds in this counting.
When the leap second is taken, the timestamp value (those seconds) will simply not be advanced at the end of the regular second, but it will re-use its current value in the leap second.
As a result, the conversion of timestamps into common time reckoning will still produce values from 0 to 59 for the minute as well as for the second, there will not be a 60.
Another consequence is that there is no way to tell the leap second from the preceding regular one.
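A small illustration of this (my own example, not taken from the article): with the session time zone set to UTC, the last regular second of June 30 and the leap second share the same timestamp value, and the conversion back never shows a second value of 60:

SET time_zone = '+00:00';   -- interpret and display times as UTC for this example
SELECT UNIX_TIMESTAMP('2015-06-30 23:59:59') AS ts_last_regular_second
     , FROM_UNIXTIME(1435708799)             AS converted_back;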
MySQL always takes the time information from the Linux kernel, so it will also use the same timestamp value (or "now()" result) for both the regular and the leap second. The MySQL manual describes this on its own leap second page, which has become inaccurate by some recent code changes: We could not reproduce the results given there (and will probably file a documentation bug about this). However, that difference does not affect the principle of the page's first paragraph.
About the application, it is hard to claim anything - there are too many of them. However, it is obvious that an application might run into trouble if it managed to store a new timestamp value every second and assumed they are distinct: they will not be (unless the application manages to skip the leap second). Let's hope any application programmer creating such a high-resolution application was aware of the problem.
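As a purely hypothetical sketch of such an application (table and column names are invented here), consider a job that inserts one row per second, keyed on the Unix timestamp; the row written during the leap second collides with the one from the preceding second:

CREATE TABLE metrics (
  ts    INT UNSIGNED NOT NULL PRIMARY KEY   -- one row per second, keyed on the Unix timestamp
, value DOUBLE
);

-- Run once per second; during the leap second UNIX_TIMESTAMP() repeats its
-- value, so this INSERT fails with a duplicate key error (1062):
INSERT INTO metrics (ts, value) VALUES (UNIX_TIMESTAMP(), RAND());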
So does it all look fine?
Not completely:
Following the last leap second (just three years ago), several administrators noticed that their machines became extraordinarily busy, which even made the power consumption rise significantly.
This was caused by a bug in the Linux kernel: it let user processes resume immediately if they were trying to wait on a high-resolution timer.
The exact details are beyond the scope of this article, your favourite search engine will provide you with more than enough references if you ask it.
For our purposes, the important fact is that this also happened to MySQL server processes ("mysqld") because InnoDB was doing such operations.
Back then, Sheeri K. Cabral (well-known in the MySQL blogosphere) published a cure which her team had discovered:
This loop can be broken by simply setting the Linux kernel's clock to the current time.
date -s "`date`"
While this looks like a no-op at first sight, I assume that it will reset the sub-second time information and so will have a small effect, which obviously is sufficient to terminate that loop.
(In practice, you should stop your NTP daemon before changing the date, and restart it afterwards. The exact command will depend on your Linux distribution.)
Well, this was three years ago. The issue was reported to kernel developers, a fix was developed (which gives credit to Sheeri's report) and applied to newer kernels, and it was also backported to older kernels. From my search, it seems that all reasonably maintained installations should have received that fix. For example, I have seen notices that Ubuntu 12.04 (since kernel 3.2.0-29.46) and 14.04 do have it, as does RedHat 6.4 (since kernel 2.6.32-298); for other distributions, I did not search.
In addition to a current kernel, you obviously need current packages including the leap second information. Typically, this would be "ntp", "ntpdate", "tzdata", and related ones. Don't forget to restart your NTP daemon after installing these updates!
So I wish all our readers that the leap second may not get them into trouble, or that (if worst comes to worst) the cure published by Sheeri may get them out.
However, I have to remind you of the old truth which is attributed to Mark Twain (sometimes also to Niels Bohr):
"It is difficult to make predictions, especially about the future."
FromDual Backup Manager for MySQL 1.2.2 has been released
FromDual has the pleasure to announce the release of the new version 1.2.2 of the popular Backup Manager for MySQL and MariaDB (fromdual_bman).
You can download the FromDual Backup Manager from here.
In the inconceivable case that you find a bug in the Backup Manager please report it to our Bugtracker.
Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.
Upgrade from 1.2.x to 1.2.2

# cd ${HOME}/product
# tar xf /download/fromdual_brman-1.2.2.tar.gz
# rm -f fromdual_brman
# ln -s fromdual_brman-1.2.2 fromdual_brman

Changes in FromDual Backup Manager 1.2.2

FromDual Backup Manager
It mainly contains fixes related to the brman catalog and physical backups.
You can verify your current FromDual Backup Manager version with the following command:
fromdual_bman --version

- Archiving with physical backup bug fixed.
- Connect replaced by OO style and error exit fixed.
- Create catalog fixed.
- Archivedir without archive option does not make sense.
We are looking for you: MySQL/MariaDB DBA for FromDual Support
FromDual is the leading independent consulting and services company for MySQL, Galera Cluster, MariaDB and Percona Server in Europe, with headquarters in Switzerland.
Our customers come mainly from Europe and range from small start-ups to European top-500 companies. They get support from us for database problems, direct interventions as remote DBA, training for their DBAs and developers, as well as consulting on architecture and design decisions. In addition, we develop tools around MySQL, write blog articles and give talks at conferences.
Since our high-quality services attract more and more customers, we need colleagues (m/f) who want to grow themselves and with us.
Job description
We are looking for German-speaking employees (her or him) at junior or senior level for full-time services around MySQL (mainly support and remote-DBA work). Primarily, you should make sure that the business-critical MySQL databases of our customers run like clockwork, and if they do not, get them running again quickly...
Our "ideal candidate"
- has experience in operating critical and highly available production databases, mainly on Linux,
- knows replication in all its variations from daily work,
- knows how the most widespread MySQL HA set-ups work and how to repair them efficiently when a problem occurs,
- is rock-solid in SQL,
- brings experience with Galera Cluster,
- can script in Bash and write simple programs in at least one widespread programming/scripting language (PHP, Bash, ...).
We are looking for reinforcement that is on the way from solid foundations towards this ideal.
What we expect from you:
- knowledge of MySQL, Percona Server or MariaDB, or the willingness to acquire it
- knowing how to operate critical database systems
- an understanding of what can go wrong when operating databases
- an independent way of working (remote) with communication via IRC, Skype, mail and telephone
- knowledge of the Linux system
DBA or DevOps experience, for example, would be a good technical basis.
You appreciate direct contact with customers, have a good feeling for their problems, can listen and quickly find the actual problems. You are used to acting proactively before something happens, and you lead the customer back onto the right path.
To be able to do your work, you work in a European time zone. You can arrange your working hours flexibly, according to the operational situation. We expect you to contribute your share to the on-call duty. FromDual will most likely not have office premises at your place of residence. However, moving is not necessary: we enable you to work from home or support you in finding suitable work premises near you. Good written and spoken English skills are required.
What we offer you:
- A salary appropriate to your performance.
- The opportunity to develop yourself into a top MySQL database specialist.
- Independent work.
- Taking responsibility for your projects and customers.
- Good camaraderie in the team, as well as a relaxed and pleasant working atmosphere.
- Job-related further training opportunities.
- Participation in open source events.
- Working from your preferred place of residence.
You should be able to work, think and act independently most of the time and to acquire new knowledge on your own (through web searches, the MySQL documentation, trying things out, etc.). Should you nevertheless get stuck at some point, your colleagues at FromDual will be happy to help you.
If you need someone to hold your hand all the time, FromDual is not the right choice.
What happens next
If you are interested in this opportunity and think that you are the right candidate, we would be pleased to hear from you. We know that nobody fits this job description 100%!
Please send your unembellished CV including your salary expectations to jobs@fromdual.com. If you would like to learn more about this position or talk to me personally, please call me at +41 79 830 09 33 (Oli Sennhauser, CTO). Applicants only please, NO headhunters!
After you have sent us your CV, you get to prove your skills in a small MySQL test. After passing the test we will invite you to the final interviews.
Controlling worldwide manufacturing plants with MySQL
A MySQL customer of FromDual has different manufacturing plants spread across the globe. They are operated by local companies. FromDual's customer wants to maintain the manufacturing receipts centrally in a MySQL database in the Head Quarter in Europe. Each manufacturing plant should only see its specific data.
Manufacturing log information should be reported back to the European Head Quarter MySQL database.
The process was designed as follows:
Preparation of the Proof of Concept (PoC)

To simulate all cases we need different schemas: some which should be replicated, and some which should NOT be replicated:
CREATE DATABASE finance;

CREATE TABLE finance.accounting (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `data` varchar(255) DEFAULT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `data_rename` (`data`)
);

CREATE DATABASE crm;

CREATE TABLE crm.customer (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `data` varchar(255) DEFAULT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `data_rename` (`data`)
);

CREATE DATABASE erp;

-- Avoid specifying Storage Engine here!!!
CREATE TABLE erp.manufacturing_data (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT
, manufacture_plant VARCHAR(32)
, manufacture_info VARCHAR(255)
, PRIMARY KEY (id)
, KEY (manufacture_plant)
);

CREATE TABLE erp.manufacturing_log (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT
, manufacture_plant VARCHAR(32)
, log_data VARCHAR(255)
, PRIMARY KEY (id)
, KEY (manufacture_plant)
);

MySQL replication architecture
Before you start with such complicated MySQL set-ups it is recommended to make a little sketch of what you want to build:
Preparing the Production Master database (Prod M1)

To make use of all the new and cool features of MySQL we used the new GTID replication. First we set up a Master (Prod M1) and its fail-over system (Prod M2) in the customer's Head Quarter:

# /etc/my.cnf

[mysqld]
binlog_format            = row          # optional
log_bin                  = binary-log   # mandatory, also on Slave!
log_slave_updates        = on           # mandatory
gtid_mode                = on           # mandatory
enforce_gtid_consistency = on           # mandatory
server-id                = 39           # mandatory

This step requires a system restart (one minute downtime).
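After the restart it is worth double-checking that the settings are really active; a quick sanity check could look like this:

SHOW GLOBAL VARIABLES LIKE 'gtid_mode';
SHOW GLOBAL VARIABLES LIKE 'enforce_gtid_consistency';
SHOW GLOBAL VARIABLES LIKE 'log_slave_updates';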
Preparing the Production Master standby database (Prod M2)

On Master (Prod M1):

GRANT REPLICATION SLAVE ON *.* TO 'replication'@'192.168.1.%' IDENTIFIED BY 'secret';

mysqldump -u root --set-gtid-purged=on --master-data=2 --all-databases --triggers --routines --events > /tmp/full_dump.sql

On Slave (Prod M2):

CHANGE MASTER TO MASTER_HOST='192.168.1.39', MASTER_PORT=3306
, MASTER_USER='replication', MASTER_PASSWORD='secret'
, MASTER_AUTO_POSITION=1;

RESET MASTER;   -- On SLAVE!

system mysql -u root < /tmp/full_dump.sql

START SLAVE;

To make it easier for a Slave to connect to its master we set a VIP in front of those 2 database servers (VIP Prod). This VIP should be used by all applications in the head quarter and also the filter engines.
Set-up filter engines (Filter BR and Filter CN)

To make sure every manufacturing plant sees only the data it is allowed to see we need a filtering engine between the production site and the manufacturing plant (Filter BR and Filter CN).
To keep this filter engine lean we use a MySQL instance with all tables converted to the Blackhole Storage Engine:
# /etc/my.cnf

[mysqld]
binlog_format            = row          # optional
log_bin                  = binary-log   # mandatory, also on Slave!
log_slave_updates        = on           # mandatory
gtid_mode                = on           # mandatory
enforce_gtid_consistency = on           # mandatory
server-id                = 36           # mandatory
default_storage_engine   = blackhole

On the production master (Prod M1) we get the data as follows:

mysqldump -u root --set-gtid-purged=on --master-data=2 --triggers --routines --events --no-data --databases erp > /tmp/erp_dump_nd.sql

The Filter Engines (Filter BR and CN) are set up as follows:

-- Here we can use the VIP!
CHANGE MASTER TO master_host='192.168.1.33', master_port=3306
, master_user='replication', master_password='secret'
, master_auto_position=1;

RESET MASTER;   -- On SLAVE!

system cat /tmp/erp_dump_nd.sql | sed 's/ ENGINE=[a-zA-Z]*/ ENGINE=blackhole/' | mysql -u root

START SLAVE;

Do not forget to also create the replication user on the filter engines:

GRANT REPLICATION SLAVE ON *.* TO 'replication'@'192.168.1.%' IDENTIFIED BY 'secret';

Filtering out all non-ERP schemata
We only want the erp schema to be replicated to the manufacturing plants, not the crm or the finance application. This we achieve with the following option on the filter engines:
# /etc/my.cnf

[mysqld]
replicate_do_db        = erp
replicate_ignore_table = erp.manufacturing_log

MySQL row filtering
To achieve row filtering we use TRIGGERS. Make sure they are not replicated further down the hierarchy:
SET SESSION SQL_LOG_BIN = 0;

use erp

DROP TRIGGER IF EXISTS filter_row;

delimiter //

CREATE TRIGGER filter_row BEFORE INSERT ON manufacturing_data
FOR EACH ROW
BEGIN
  IF ( NEW.manufacture_plant != 'China' ) THEN
    SIGNAL SQLSTATE '45000'
       SET MESSAGE_TEXT      = 'Row was filtered out.'
         , CLASS_ORIGIN      = 'FromDual filter trigger'
         , SUBCLASS_ORIGIN   = 'filter_row'
         , CONSTRAINT_SCHEMA = 'erp'
         , CONSTRAINT_NAME   = 'filter_row'
         , SCHEMA_NAME       = 'erp'
         , TABLE_NAME        = 'manufacturing_data'
         , COLUMN_NAME       = ''
         , MYSQL_ERRNO       = 1644
    ;
  END IF;
END;
//

delimiter ;

SET SESSION SQL_LOG_BIN = 1;

Up to now this would cause replication to stop for every filtered row. To avoid this we tell the Filtering Slaves to skip this error number:
# /etc/my.cnf

[mysqld]
slave_skip_errors = 1644

Attaching production manufacturing Slaves (Man BR M1 and Man CN M1)
When we have finished everything on our head quarter site, we can start with the manufacturing sites (BR and CN):
On Master (Prod M1):
mysqldump -u root --set-gtid-purged=on --master-data=2 --triggers --routines --events --where='manufacture_plant="Brazil"' --databases erp > /tmp/erp_dump_br.sql

mysqldump -u root --set-gtid-purged=on --master-data=2 --triggers --routines --events --where='manufacture_plant="China"' --databases erp > /tmp/erp_dump_cn.sql

On the Manufacturing Masters (Man BR M1 and Man CN M1); here we do NOT use a VIP because we think a blackhole storage engine is robust enough as master:

CHANGE MASTER TO master_host='192.168.1.43', master_port=3306
, master_user='replication', master_password='secret'
, master_auto_position=1;

RESET MASTER;   -- On SLAVE!

system cat /tmp/erp_dump_br.sql | mysql -u root

START SLAVE;

The standby manufacturing databases (Man BR M2 and Man CN M2) are created in the same way as the production manufacturing database on the master.
Testing replication from HQ to manufacturing plants

First we make sure that crm and finance are not replicated out and that replication does not stop (on Prod M1):
INSERT INTO finance.accounting VALUES (NULL, 'test data over VIP', NULL);
INSERT INTO finance.accounting VALUES (NULL, 'test data over VIP', NULL);
INSERT INTO crm.customer VALUES (NULL, 'test data over VIP', NULL);
INSERT INTO crm.customer VALUES (NULL, 'test data over VIP', NULL);

UPDATE finance.accounting SET data = 'Changed data';
UPDATE crm.customer SET data = 'Changed data';

DELETE FROM finance.accounting WHERE id = 1;
DELETE FROM crm.customer WHERE id = 1;

SELECT * FROM finance.accounting;
SELECT * FROM crm.customer;

SHOW SLAVE STATUS\G

The schema filter seems to work correctly. Then we check whether the row filter also works correctly. For this we have to run the queries in statement based replication (SBR)! Otherwise the trigger would not fire:
use mysql

INSERT INTO erp.manufacturing_data VALUES (NULL, 'China', 'Highly secret manufacturing info as RBR.');
INSERT INTO erp.manufacturing_data VALUES (NULL, 'Brazil', 'Highly secret manufacturing info as RBR.');

-- This needs SUPER privilege... :-(
SET SESSION binlog_format = STATEMENT;

-- Caution: those rows will NOT be replicated!!!
-- See filter rules for SBR
INSERT INTO erp.manufacturing_data VALUES (NULL, 'China', 'Highly secret manufacturing info as SBR lost.');
INSERT INTO erp.manufacturing_data VALUES (NULL, 'Brazil', 'Highly secret manufacturing info as SBR lost.');

use erp

INSERT INTO manufacturing_data VALUES (NULL, 'China', 'Highly secret manufacturing info as SBR.');
INSERT INTO manufacturing_data VALUES (NULL, 'Brazil', 'Highly secret manufacturing info as SBR.');
INSERT INTO manufacturing_data VALUES (NULL, 'Germany', 'Highly secret manufacturing info as SBR.');
INSERT INTO manufacturing_data VALUES (NULL, 'Switzerland', 'Highly secret manufacturing info as SBR.');

SET SESSION binlog_format = ROW;

SELECT * FROM erp.manufacturing_data;

Production data back to head quarter
Now we have to take care of the production data on their way back to the HQ. To achieve this we use the new MySQL 5.7 feature called multi-source replication. For multi-source replication the replication repositories must be kept in tables instead of files:
# /etc/my.cnf

[mysqld]
master_info_repository    = TABLE   # mandatory
relay_log_info_repository = TABLE   # mandatory

Then we have to configure 2 replication channels from Prod M1 to their specific manufacturing masters over the VIP (VIP BR and VIP CN):

CHANGE MASTER TO MASTER_HOST='192.168.1.98', MASTER_PORT=3306
, MASTER_USER='replication', MASTER_PASSWORD='secret'
, MASTER_AUTO_POSITION=1
FOR CHANNEL "manu_br";

CHANGE MASTER TO MASTER_HOST='192.168.1.99', MASTER_PORT=3306
, MASTER_USER='replication', MASTER_PASSWORD='secret'
, MASTER_AUTO_POSITION=1
FOR CHANNEL "manu_cn";

START SLAVE FOR CHANNEL 'manu_br';
START SLAVE FOR CHANNEL 'manu_cn';

SHOW SLAVE STATUS FOR CHANNEL 'manu_br'\G
SHOW SLAVE STATUS FOR CHANNEL 'manu_cn'\G

Avoid configuring and activating the channels on Prod M2 as well.
Testing back replication from manufacturing plants

Brazil on Man BR M1:

INSERT INTO manufacturing_log VALUES (1, 'Production data from Brazil', 'data');

China on Man CN M1:

INSERT INTO manufacturing_log VALUES (2, 'Production data from China', 'data');

For testing:

SELECT * FROM manufacturing_log;
SELECT * FROM manufacturing_log;Make sure you do not run into conflicts (Primary Key, AUTO_INCREMENTS). Make sure filtering is defined correctly!
To check the different channel states you can use the following command:
SHOW SLAVE STATUS\G

or

SELECT ras.channel_name, ras.service_state AS 'SQL_thread', ras.remaining_delay
     , CONCAT(user, '@', host, ':', port) AS user
     , rcs.service_state AS IO_thread
     , REPLACE(received_transaction_set, '\n', '') AS received_transaction_set
  FROM performance_schema.replication_applier_status AS ras
  JOIN performance_schema.replication_connection_configuration AS rcc ON rcc.channel_name = ras.channel_name
  JOIN performance_schema.replication_connection_status AS rcs ON ras.channel_name = rcs.channel_name
;

Troubleshooting

Inject empty transaction
If you try to skip a transaction as you did earlier (SQL_SLAVE_SKIP_COUNTER) you will face some problems:
STOP SLAVE;
ERROR 1858 (HY000): sql_slave_skip_counter can not be set when the server is running with @@GLOBAL.GTID_MODE = ON. Instead, for each transaction that you want to skip, generate an empty transaction with the same GTID as the transaction

To skip the next transaction you have to find the ones applied so far:

SHOW SLAVE STATUS\G
...
Executed_Gtid_Set: c3611091-f80e-11e4-99bc-28d2445cb2e9:1-20

then tell MySQL to skip this by injecting a new empty transaction:

SET SESSION GTID_NEXT='c3611091-f80e-11e4-99bc-28d2445cb2e9:21';
BEGIN;
COMMIT;
SET SESSION GTID_NEXT='AUTOMATIC';

SHOW SLAVE STATUS\G
...
Executed_Gtid_Set: c3611091-f80e-11e4-99bc-28d2445cb2e9:1-21

START SLAVE;

Revert from GTID-based replication to file/position-based replication
If you want to fall back from MySQL GTID-based replication to file/position-based replication, this is quite simple:
CHANGE MASTER TO MASTER_AUTO_POSITION = 0;

MySQL Support and Engineering
If you need some help or support our MySQL support and engineering team is happy to help you.
Logging Galera Cluster conflicts
We typically suggest that our customers use our MySQL/Galera Cluster my.cnf configuration template to avoid MySQL configuration and performance problems.
And we are paranoid as well. Thus we enable all useful logging:
wsrep_log_conflicts = 1

But this also has some consequences in the form of more visibility...
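To double-check on a running node that this option is really active, a simple query is enough:

SHOW GLOBAL VARIABLES LIKE 'wsrep_log_conflicts';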
If you monitor your Galera Cluster carefully, for example with the FromDual Performance Monitor for MySQL and MariaDB, you will probably see some strange values increasing from time to time:
mysql> SHOW GLOBAL STATUS LIKE 'wsrep_local_%r_s';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| wsrep_local_cert_failures | 42    |
| wsrep_local_bf_aborts     | 13    |
+---------------------------+-------+

Those values are indicators that some transactions (Galera write sets) did not succeed and were aborted by Galera. In this case the paranoid logging helps to find what exactly was aborted, and possibly helps to find out if this can or should be fixed:
150410  1:44:18 [Note] WSREP: cluster conflict due to certification failure for threads:
150410  1:44:18 [Note] WSREP: Victim thread:
   THD: 151856, mode: local, state: executing, conflict: cert failure, seqno: 30399304
   SQL: UPDATE login SET lTsexpire = UNIX_TIMESTAMP(NOW()) + lTimeout WHERE lSessionId = 'va3ta7besku82k56ncv3bnhlj5'
*** Priority TRANSACTION:
TRANSACTION 464359568, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
1 lock struct(s), heap size 360, 0 row lock(s)
MySQL thread id 4, OS thread handle 0x7f1c0916c700, query id 8190690
Update_rows_log_event::find_row(30399302)
*** Victim TRANSACTION:
TRANSACTION 464359562, ACTIVE 0 sec
mysql tables in use 1, locked 1
2 lock struct(s), heap size 360, 1 row lock(s), undo log entries 1
MySQL thread id 151856, OS thread handle 0x7f1c09091700, query id 8190614 172.20.100.11 sam_angiz query end
UPDATE login SET lTsexpire = UNIX_TIMESTAMP(now()) + lTimeout WHERE lSessionId = 'va3ta7besku82k56ncv3bnhlj5'
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 835205 page no 3 n bits 72 index `PRIMARY` of table `fromdual`.`login` trx table locks 1 total table locks 2 trx id 464359562 lock_mode X locks rec but not gap lock hold time 0 wait time before grant 0
150410  1:44:18 [Note] WSREP: cluster conflict due to high priority abort for threads:
150410  1:44:18 [Note] WSREP: Winning thread:
   THD: 4, mode: applier, state: executing, conflict: no conflict, seqno: 30399302
   SQL: (null)
150410  1:44:18 [Note] WSREP: Victim thread:
   THD: 151856, mode: local, state: committing, conflict: no conflict, seqno: -1
   SQL: UPDATE login SET lTsexpire = UNIX_TIMESTAMP(now()) + lTimeout WHERE lSessionId = 'va3ta7besku82k56ncv3bnhlj5'

In the above Galera conflict two login transactions were running at the same time. They both came with the same session ID and wanted to update the expiry timestamp. Now, how to solve or fix this:
- First check if this table has a Primary Key (tables without a PK cause full table scans, which can last a long time and increase the chance of conflicts).
- Second check if there is a (UNIQUE?) index on lSessionId. A missing index leads to full table scans, which increases the chance of conflicts (see the sketch after this list).
- Third check WHY 2 logins with the same session ID can arrive at the same time (within 1 second) on 2 different Galera nodes (Ajax requests, etc.). Try to avoid such situations.
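A minimal sketch for the second check, assuming the table and column names from the conflict log above (whether the index can be UNIQUE depends on your application):

-- Does login already have a suitable index on lSessionId?
SHOW CREATE TABLE fromdual.login\G

-- If not, add one:
ALTER TABLE fromdual.login ADD INDEX idx_lsessionid (lSessionId);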
Galera Cluster last inactive check and VMware snapshots
From time to time, during Galera Cluster customer engagements, we see the following (to me scary) warning in the MySQL error log:
[Warning] WSREP: last inactive check more than PT1.5S ago (PT7.06159S), skipping check

We mostly see this in VMware set-ups. Some further enquiry with the Galera developers did not give a satisfying answer:
This can be seen on bare metal as well - with poorly configured mysqld, O/S, or simply being overloaded. All it means is that this thread could not get CPU time for 7.1 seconds. You can imagine that access to resources in virtual machines is even harder (especially I/O) than on bare metal, so you will see this in virtual machines more often.
This is not a Galera specific issue (it just reports being stuck, other mysqld threads are equally stuck) so there are no configuration options for that. You simply must make sure that your system and mysqld are properly configured, that there is enough RAM (buffer pool not over provisioned), that there is swap, that there are proper I/O drivers installed on the guest and so on.
Basically, Galera runs as well in virtual machines as virtual machines approximate bare metal.
We still suspected that this was somehow VMware related. This week we had the chance to investigate... At 01:36 am node Galera2 lost connection to the Cluster and became NON-PRIMARY. This is basically a bad sign:
150401  1:36:15 [Warning] WSREP: last inactive check more than PT1.5S ago (PT5.08325S), skipping check
150401  1:36:15 [Note] WSREP: (09c6b2f2, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://192.168.42.2:4567
150401  1:36:16 [Note] WSREP: view(view_id(NON_PRIM,09c6b2f2,30) memb { 09c6b2f2,0 } joined { } left { } partitioned { ce6bf2e1,0 d1f9bee0,0 })
150401  1:36:16 [Note] WSREP: view(view_id(NON_PRIM,09c6b2f2,31) memb { 09c6b2f2,0 } joined { } left { } partitioned { ce6bf2e1,0 d1f9bee0,0 })
150401  1:36:16 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
150401  1:36:16 [Note] WSREP: Flow-control interval: [16, 16]
150401  1:36:16 [Note] WSREP: Received NON-PRIMARY.
150401  1:36:16 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 26304132)
150401  1:36:16 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
150401  1:36:16 [Note] WSREP: Flow-control interval: [16, 16]
150401  1:36:16 [Note] WSREP: Received NON-PRIMARY.
150401  1:36:16 [Warning] WSREP: Send action {(nil), 328, TORDERED} returned -107 (Transport endpoint is not connected)
150401  1:36:16 [Note] WSREP: New cluster view: global state: dcca768c-b5ad-11e3-bbc0-fb576fb3c451:26304132, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3
150401  1:36:17 [Note] WSREP: (09c6b2f2, 'tcp://0.0.0.0:4567') reconnecting to d1f9bee0 (tcp://192.168.42.1:4567), attempt 0

I suspected, after some investigation with the FromDual Performance Monitor for MySQL and MariaDB, that the database backup (mysqldump) could be the reason. It was not. But the customer explained that after the database backup they do a VMware snapshot.
And when we compared our problem with the backup log file:
2015/04/01 01:35:08 [3] backup.fromdual.com: Creating a snapshot of galera3
2015/04/01 01:35:16 [3] backup.fromdual.com: Created a snapshot of galera3
2015/04/01 01:35:23 [3] backup.fromdual.com: galera3: backup the changed blocks of disk 'Festplatte 1' using NBD transport
2015/04/01 01:36:10 [3] backup.fromdual.com: galera3: saving the Change Block Tracking's reference for disk 'Festplatte 1'
2015/04/01 01:36:10 [3] backup.fromdual.com: Removing Arkeia's snapshot of galera3

we can see that our problem pretty much started with the end of the VMware snapshot (01:36:10 + 5.08 = 1:36:15). By the way: for this kind of investigation it is always good to have an ntp daemon running for time synchronization. Otherwise problem investigation becomes much harder...
Some more and deeper investigation shows that we lose nodes from time to time during VMware snapshots (galera3). But they recover quickly because they can do an IST. In the worst case we can lose 2 nodes and then the whole Galera Cluster is gone.
192.168.42.3 / node Galera3

2015-04-10 01:44:00 [3] backup.fromdual.com: Creating a snapshot of galera3
2015-04-10 01:44:08 [3] backup.fromdual.com: Created a snapshot of galera3
2015-04-10 01:44:15 [3] backup.fromdual.com: galera3: backup the changed blocks of disk 'Festplatte 1' using NBD transport
2015-04-10 01:45:39 [3] backup.fromdual.com: galera3: saving the Change Block Tracking's reference for disk 'Festplatte 1'
2015-04-10 01:45:39 [3] backup.fromdual.com: Removing Arkeia's snapshot of galera3

150410  1:44:07 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera1:4567 tcp://galera2:4567
150410  1:44:07 [Warning] WSREP: last inactive check more than PT1.5S ago (PT7.06159S), skipping check
150410  1:44:08 [Note] WSREP: Received NON-PRIMARY.
150410  1:44:10 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 30399299)
150410  1:44:11 [Warning] WSREP: Gap in state sequence. Need state transfer.
150410  1:44:11 [Note] WSREP: Prepared IST receiver, listening at: tcp://galera3:4568
150410  1:44:11 [Note] WSREP: Member 0.0 (galera3) requested state transfer from '*any*'. Selected 2.0 (galera2)(SYNCED) as donor.
150410  1:44:11 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 30399309)
150410  1:44:11 [Note] WSREP: Requesting state transfer: success, donor: 2
150410  1:44:11 [Note] WSREP: 2.0 (galera2): State transfer to 0.0 (galera3) complete.
150410  1:44:11 [Note] WSREP: Member 2.0 (galera2) synced with group.
150410  1:44:11 [Note] WSREP: Receiving IST: 8 writesets, seqnos 30399291-30399299
150410  1:44:11 [Note] WSREP: IST received: dcca768c-b5ad-11e3-bbc0-fb576fb3c451:30399299
150410  1:44:11 [Note] WSREP: 0.0 (galera3): State transfer from 2.0 (galera2) complete.
150410  1:44:11 [Note] WSREP: Shifting JOINER -> JOINED (TO: 30399309)
150410  1:44:11 [Note] WSREP: Member 0.0 (galera3) synced with group.
150410  1:44:11 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 30399309)
150410  1:44:11 [Note] WSREP: Synchronized with group, ready for connections
150410  1:44:13 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting off
150410  1:45:42 [Warning] WSREP: last inactive check more than PT1.5S ago (PT2.47388S), skipping check
150410  1:45:43 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera1:4567 tcp://galera2:4567
150410  1:45:44 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') reconnecting to 54de92f8 (tcp://galera1:4567), attempt 0
150410  1:45:44 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') reconnecting to c9d964d3 (tcp://galera2:4567), attempt 0
150410  1:45:48 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting off
150410  1:47:26 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera1:4567
150410  1:47:27 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') reconnecting to 54de92f8 (tcp://galera1:4567), attempt 0
150410  1:47:31 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting off
192.168.42.1 / node Galera1

2015-04-10 01:47:24 [3] backup.fromdual.com: Creating a snapshot of galera1
2015-04-10 01:47:29 [3] backup.fromdual.com: Created a snapshot of galera1
2015-04-10 01:47:40 [3] backup.fromdual.com: galera1: backup the changed blocks of disk 'Festplatte 1' using NBD transport
2015-04-10 01:48:43 [3] backup.fromdual.com: galera1: saving the Change Block Tracking's reference for disk 'Festplatte 1'
2015-04-10 01:48:44 [3] backup.fromdual.com: Removing Arkeia's snapshot of galera1

150410  1:44:02 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567
150410  1:44:04 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') reconnecting to 158f71de (tcp://galera3:4567), attempt 0
150410  1:44:12 [Note] WSREP: Member 0.0 (galera3) requested state transfer from '*any*'. Selected 2.0 (galera2)(SYNCED) as donor.
150410  1:45:43 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567
150410  1:45:44 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') reconnecting to 158f71de (tcp://galera3:4567), attempt 0
150410  1:45:48 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting off
150410  1:47:27 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.66452S), skipping check
150410  1:47:27 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567
150410  1:47:30 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting off
192.168.42.2 / node Galera2

2015-04-10 02:09:55 [3] backup.fromdual.com: Creating a snapshot of galera2
2015-04-10 02:09:58 [3] backup.fromdual.com: Created a snapshot of galera2
2015-04-10 02:10:05 [3] backup.fromdual.com: galera2: backup the changed blocks of disk 'Festplatte 1' using NBD transport
2015-04-10 02:10:53 [3] backup.fromdual.com: galera2: saving the Change Block Tracking's reference for disk 'Festplatte 1'
2015-04-10 02:10:54 [3] backup.fromdual.com: Removing Arkeia's snapshot of galera2
150410  1:44:02 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567
150410  1:44:03 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') reconnecting to 158f71de (tcp://galera3:4567), attempt 0
150410  1:44:08 [Warning] WSREP: discarding established (time wait) 158f71de (tcp://192.168.42.3:4567)
150410  1:44:11 [Note] WSREP: Member 0.0 (galera3) requested state transfer from '*any*'. Selected 2.0 (galera2)(SYNCED) as donor.
150410  1:44:13 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting off
150410  1:45:43 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567
150410  1:45:44 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') reconnecting to 158f71de (tcp://galera3:4567), attempt 0
150410  1:45:48 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting off
150410  1:47:26 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera1:4567
150410  1:47:27 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') reconnecting to 54de92f8 (tcp://galera1:4567), attempt 0
150410  1:47:30 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting off
150410  2:09:57 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.83618S), skipping check
The backups are done with 2 options enabled.
Possibly this is the reason, and one should disable those features in combination with Galera. Further investigation is going on. In the worst case, VMware snapshotting should be avoided in combination with Galera.
Rename MySQL Partition
Before I forget it and have to search again, here is a short note about how to rename a MySQL partition:
My dream:
ALTER TABLE history RENAME PARTITION p2015_kw10 INTO p2015_kw09;

In reality:

ALTER TABLE history
  REORGANIZE PARTITION p2015_kw10 INTO (
    PARTITION p2015_kw09 VALUES LESS THAN (UNIX_TIMESTAMP('2015-03-02 00:00:00'))
  );
Caution: REORGANIZE PARTITION causes a full copy of the whole partition!
Hint: I assume it would be very easy for MySQL or MariaDB to make this DDL command an in-place operation...
MySQL Partitioning was introduced in MySQL 5.1.
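For completeness, here is a minimal, self-contained sketch. The table layout is an assumption for illustration only (a Zabbix-like history table), not a real schema:

CREATE TABLE history (
  itemid BIGINT UNSIGNED NOT NULL,
  clock  INT UNSIGNED    NOT NULL,
  value  DOUBLE,
  KEY (itemid, clock)
)
PARTITION BY RANGE (clock) (
  PARTITION p2015_kw10 VALUES LESS THAN (UNIX_TIMESTAMP('2015-03-02 00:00:00')),
  PARTITION p_max      VALUES LESS THAN (MAXVALUE)
);

-- "Rename" p2015_kw10 to p2015_kw09 by reorganizing it into one partition
-- with the new name but the identical boundary (causes a full copy!):
ALTER TABLE history
  REORGANIZE PARTITION p2015_kw10 INTO (
    PARTITION p2015_kw09 VALUES LESS THAN (UNIX_TIMESTAMP('2015-03-02 00:00:00'))
  );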
MySQL Enterprise Incremental Backup simplified
MySQL Enterprise Backup (MEB) has the capability to make real incremental (differential and cumulative?) backups. The current releases are quite cool and you should really have a look at them...
Unfortunately the original MySQL documentation is much too complicated for my simple mind. So I did some testing and simplified it a bit for our customers...
If you want to dive into the original documentation please look here: Making an Incremental Backup .
If you want to use MySQL Enterprise Backup, please let us know and we will send you a quote...
Prepare MySQL Backup infrastructure

mkdir /backup/full /backup/incremental1 /backup/incremental2

Full MySQL Backup

mysqlbackup --defaults-file=/etc/my.cnf --user=root --backup-dir=/backup/full backup
mysqlbackup --defaults-file=/etc/my.cnf --user=root --backup-dir=/backup/full apply-log
First MySQL Incremental Backup

mysqlbackup --defaults-file=/etc/my.cnf --user=root --incremental --incremental-base=dir:/backup/full --incremental-backup-dir=/backup/incremental1 backup
mysqlbackup --defaults-file=/etc/my.cnf --user=root --backup-dir=/backup/full --incremental-backup-dir=/backup/incremental1 apply-incremental-backup
Second MySQL Incremental Backup

mysqlbackup --defaults-file=/etc/my.cnf --user=root --incremental --incremental-base=dir:/backup/full --incremental-backup-dir=/backup/incremental2 backup
mysqlbackup --defaults-file=/etc/my.cnf --user=root --backup-dir=/backup/full --incremental-backup-dir=/backup/incremental2 apply-incremental-backup
and so on...
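As far as we understand the MEB documentation, --incremental-base does not have to point to the full backup every time: you can also base the next incremental on the previous incremental directory, which gives you a chain of smaller incrementals instead of ever-growing cumulative ones. A sketch, reusing the directories from above (the incrementals then have to be applied to the full backup in order):

mysqlbackup --defaults-file=/etc/my.cnf --user=root --incremental --incremental-base=dir:/backup/incremental1 --incremental-backup-dir=/backup/incremental2 backup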
MySQL Restore

mysqlbackup --defaults-file=/etc/my.cnf --user=root --backup-dir=/backup/full copy-back

Have fun with MySQL Enterprise Backup. If you need any help with your MySQL Backup concept, please let us know.