You are here
MariaDB MaxScale Load Balancer with Master/Master Replication
For this how-to we were working with a MariaDB 10.6.5 database server and the newest MariaDB MaxScale 6.3.0 (aka 2.6) load balancer/proxy.
As a starting point we had a working MariaDB Master/Master replication with one active Master (Ma) and one passive Master (Mp). Whereas the passive Master was set to read_only
. As a replication user we used the user replication
.
Creating database accounts for the MaxScale Load Balancer
The MaxScale load balancer connects itself with the application users to the database. To do this it needs to retrieve the available users from the database first. To get these users the MaxScale account needs some privileges:
CREATE USER 'maxscale_admin'@'%' IDENTIFIED BY 'secret'; GRANT SELECT ON mysql.user TO 'maxscale_admin'@'%'; GRANT SELECT ON mysql.db TO 'maxscale_admin'@'%'; GRANT SELECT ON mysql.tables_priv TO 'maxscale_admin'@'%'; GRANT SELECT ON mysql.columns_priv TO 'maxscale_admin'@'%'; GRANT SELECT ON mysql.procs_priv TO 'maxscale_admin'@'%'; GRANT SELECT ON mysql.proxies_priv TO 'maxscale_admin'@'%'; GRANT SELECT ON mysql.roles_mapping TO 'maxscale_admin'@'%'; GRANT SHOW DATABASES ON *.* TO 'maxscale_admin'@'%';
For monitoring the replication and doing proper switchover and failover MaxScale further needs a monitoring account:
CREATE USER 'maxscale_monitor'@'%' IDENTIFIED BY 'secret'; GRANT REPLICATION CLIENT on *.* to 'maxscale_monitor'@'%'; GRANT REPLICATION SLAVE on *.* to 'maxscale_monitor'@'%'; GRANT SUPER, RELOAD on *.* to 'maxscale_monitor'@'%';
And last we need an application account for our test application:
CREATE USER 'app'@'%' IDENTIFIED BY 'secret'; GRANT ALL on test.* to 'app'@'%';
Starting MariaDB MaxScale Load Balancer
Because we do not use the provided DEB/RPM packages on our systems but generic binary tarballs, which are not available, we have to start the MaxScale Load Balancer a bit over-complicated:
# export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/home/mysql/product/maxscale/usr/lib/x86_64-linux-gnu/maxscale # ldd ./maxscale # ./maxscale --help # ./maxscale --config=/home/mysql/etc/maxscale_mm.cnf --datadir=/home/mysql/tmp --basedir=/home/mysql/product/maxscale --logdir=/home/mysql/log --config-check # ./maxscale --config=/home/mysql/etc/maxscale_mm.cnf --datadir=/home/mysql/tmp --basedir=/home/mysql/product/maxscale --logdir=/home/mysql/log
The MaxScale configuration file we used in these tests looks as follows:
# # maxscale_mm.cnf # [maxscale] threads = auto [master1] type = server address = 192.168.1.11 port = 3306 protocol = MariaDBBackend [master2] type = server address = 192.168.1.12 port = 3306 protocol = MariaDBBackend [MultiMasterMonitor] type = monitor module = mariadbmon servers = master1,master2 user = maxscale_monitor password = secret enforce_read_only_slaves = true auto_rejoin = true # auto_failover = true [WriteService] type = service router = readconnroute router_options = master servers = master1,master2 user = maxscale_admin password = secret [ReadService] type = service router = readconnroute router_options = slave servers = master1,master2 user = maxscale_admin password = secret [WriteListener] type = listener service = WriteService protocol = MariaDBClient port = 3306 [ReadListener] type = listener service = ReadService protocol = MariaDBClient port = 3307
If the start was successful can be seen for example with:
# ps -ef | grep maxscale
If you start the MariaDB MaxScale the first time no user/password is needed. So we can connect the the MaxScale Load Balancer with the maxctrl
client:
# ./maxctrl maxctrl> list listeners ┌──────────────────┬──────┬──────┬─────────┬──────────────┐ │ Name │ Port │ Host │ State │ Service │ ├──────────────────┼──────┼──────┼─────────┼──────────────┤ │ WriteListener │ 3306 │ :: │ Running │ WriteService │ ├──────────────────┼──────┼──────┼─────────┼──────────────┤ │ MaxAdminListener │ 3307 │ :: │ Running │ ReadService │ └──────────────────┴──────┴──────┴─────────┴──────────────┘ maxctrl> list services ┌──────────────┬───────────────┬─────────────┬───────────────────┬──────────────────┐ │ Service │ Router │ Connections │ Total Connections │ Servers │ ├──────────────┼───────────────┼─────────────┼───────────────────┼──────────────────┤ │ WriteService │ readconnroute │ 0 │ 0 │ master1, master2 │ ├──────────────┼───────────────┼─────────────┼───────────────────┼──────────────────┤ │ ReadService │ readconnroute │ 0 │ 0 │ master1, master2 │ └──────────────┴───────────────┴─────────────┴───────────────────┴──────────────────┘ maxctrl> list servers ┌─────────┬──────────────┬──────┬─────────────┬──────────────────────────────┬────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ ├─────────┼──────────────┼──────┼─────────────┼──────────────────────────────┼────────────┤ │ master1 │ 192.168.1.11 │ 3306 │ 0 │ Master, Running │ 0-3308-134 │ ├─────────┼──────────────┼──────┼─────────────┼──────────────────────────────┼────────────┤ │ master2 │ 192.168.1.12 │ 3306 │ 0 │ Relay Master, Slave, Running │ 0-3308-134 │ └─────────┴──────────────┴──────┴─────────────┴──────────────────────────────┴────────────┘ maxctrl> list sessions ┌────────┬──────┬─────────────────────┬──────────────────────────┬──────┬─────────────┐ │ Id │ User │ Host │ Connected │ Idle │ Service │ ├────────┼──────┼─────────────────────┼──────────────────────────┼──────┼─────────────┤ │ 746468 │ app │ ::ffff:192.168.1.99 │ Fri May 13 15:28:55 2022 │ 0.4 │ ReadService │ └────────┴──────┴─────────────────────┴──────────────────────────┴──────┴─────────────┘
To not allow everybody to access MariaDB MaxScale we set a password for the admin
user:
maxctrl> alter user admin secret OK
Then you can connect with username and password:
# ./maxctrl --user=admin --password=secret maxctrl> list users ┌───────┬──────┬────────────┐ │ Name │ Type │ Privileges │ ├───────┼──────┼────────────┤ │ admin │ inet │ admin │ └───────┴──────┴────────────┘
Testing connections over the MaxScale Load Balancer
To test if connections are ending up on the correct database instances we used the following commands:
# mariadb --user=app --host=192.168.1.1 --port=3306 --password=secret test --execute='SELECT @@hostname, @@port' # mariadb --user=app --host=192.168.1.1 --port=3307 --password=secret test --execute='SELECT @@hostname, @@port'
Monitoring of nodes in MariaDB MaxScale
To see what is going on inside the MariaDB MaxScale Load Balancer we used the following command to create a simple real time monitor:
# watch -d -n 1 ./maxctrl --user=admin --password=secret list servers
Switchover and Failover with MariaDB MaxScale
To use MariaDB MaxScales switchover and failover capabilities the Master/Master Replication must be configured to use Global Transaction IDs (GTID). To change to GTID based replication you can use the following commands on both masters:
SQL> STOP SLAVE; SQL> CHANGE MASTER TO MASTER_USE_GTID = SLAVE_POS; SQL> START SLAVE;
Switchover in MariaDB MaxScale
A graceful switchover is used in a controlled situation if we want to switch the roles between the active master and the passive master and vice versa... This can be used for example before a maintenance operation: We do the maintenance operation on the passive Master first, then we switch the roles and then we can do the maintenance operation on the now new passive Master:
maxctrl> call command mariadbmon switchover MultiMasterMonitor
After switching for and back a few times we found out, that the MariaDB MaxScale Monitor has replaced our replication user replication
by his own user maxscale_monitor
. This is not documented? and I do really not like it, especially if it is done silently...
Failover in MariaDB MaxScale
To provoke/simulate a failover situation we stopped the active Master. Then we see in our monitor that the active Master is down and the passive Master is still running:
maxctrl> list servers ┌─────────┬──────────────┬──────┬─────────────┬────────────────┬────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ ├─────────┼──────────────┼──────┼─────────────┼────────────────┼────────────┤ │ master1 │ 192.168.1.11 │ 3306 │ 0 │ Down │ 0-3309-142 │ ├─────────┼──────────────┼──────┼─────────────┼────────────────┼────────────┤ │ master2 │ 192.168.1.12 │ 3306 │ 1 │ Slave, Running │ 0-3309-142 │ └─────────┴──────────────┴──────┴─────────────┴────────────────┴────────────┘
We further observe, that our application (the famous insert_test.sh
) is not working any more but throwing errors... Then we trigger a manual failover:
maxctrl> call command mariadbmon failover MultiMasterMonitor
It looks like an automatic failover is possible (auto_failover = true
) but we do not recommend this set-up and thus we did not further investigate in this feature.
After the manual failover the former passive Master becomes active Master:
maxctrl> list servers ┌─────────┬──────────────┬──────┬─────────────┬─────────────────┬───────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ ├─────────┼──────────────┼──────┼─────────────┼─────────────────┼───────────────┤ │ master1 │ 192.168.1.11 │ 3306 │ 0 │ Down │ 0-3308-365829 │ ├─────────┼──────────────┼──────┼─────────────┼─────────────────┼───────────────┤ │ master2 │ 192.168.1.12 │ 3306 │ 0 │ Master, Running │ 0-3308-370235 │ └─────────┴──────────────┴──────┴─────────────┴─────────────────┴───────────────┘
Now we simulate the repair of the former failed active Master by just restarting it:
maxctrl> list servers ┌─────────┬──────────────┬──────┬─────────────┬─────────────────┬───────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ ├─────────┼──────────────┼──────┼─────────────┼─────────────────┼───────────────┤ │ master1 │ 192.168.1.11 │ 3306 │ 0 │ Slave, Running │ 0-3308-401309 │ ├─────────┼──────────────┼──────┼─────────────┼─────────────────┼───────────────┤ │ master2 │ 192.168.1.12 │ 3306 │ 1 │ Master, Running │ 0-3308-401309 │ └─────────┴──────────────┴──────┴─────────────┴─────────────────┴───────────────┘
It comes back into the MariaDB MaxScale but just as a simple Slave. So a failover seems to break our Master/Master replication for some reasons. This must be repaired afterwards. Whereas a simple switchover seems to work properly. If this is a bug or intended behaviour I do not know... With the following command on master2
the loop is closed again:
SQL> CHANGE MASTER TO master_host='192.168.1.11', master_port=3306, master_user='maxscale_monitor', master_password='secret', master_use_gtid=slave_pos; SQL> START SLAVE;
Now everything is fine and working as expected again:
maxctrl> list servers ┌─────────┬──────────────┬──────┬─────────────┬──────────────────────────────┬───────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ ├─────────┼──────────────┼──────┼─────────────┼──────────────────────────────┼───────────────┤ │ master1 │ 192.168.1.11 │ 3306 │ 1 │ Relay Master, Slave, Running │ 0-3308-440194 │ ├─────────┼──────────────┼──────┼─────────────┼──────────────────────────────┼───────────────┤ │ master2 │ 192.168.1.12 │ 3306 │ 0 │ Master, Running │ 0-3308-440194 │ └─────────┴──────────────┴──────┴─────────────┴──────────────────────────────┴───────────────┘
Stopping and starting the passive Master had not effect at all.
Switchover with a lagging passive master
To test this scenario we created an artificial lag of the passive Master by setting innodb_flush_log_at_trx_commit = 1
and sync_binlog = 1
. When the passive master was lagging far enough (about 30 seconds) we tried an switchover:
maxctrl> call command mariadbmon switchover MultiMasterMonitor Error: timeout of 10000ms exceeded
So it is not really clear what happens in this case and the error message is not really telling us about the problem. Further this operations somehow breaks Master/Master replication again.
Failover with a lagging passive master
If we try a failover instead of a switchover we get at least a bit a more meaningful error message:
Error: Server at http://127.0.0.1:8989 responded with 403 Forbidden to `POST maxscale/modules/mariadbmon/failover?MultiMasterMonitor` { "links": { "self": "http://127.0.0.1:8989/v1/maxscale/modules/mariadbmon/failover/" }, "meta": { "errors": [ { "detail": "Can not select 'master1' as a demotion target for failover because it is a running master." }, { "detail": "Failover cancelled." } ] } }
Draining a node with MariaDB MaxScalse
If we try to drain a passive Master we get some warnings but it seems like the result is the expected:
maxctrl> drain server master2 Warning: Saving runtime modifications to 'ReadService' in '/home/mysql/product/maxscale/var/lib/maxscale/maxscale.cnf.d/ReadService.cnf'. The modified values will override the values found in the static configuration files.;Saving runtime modifications to 'WriteService' in '/home/mysql/product/maxscale/var/lib/maxscale/maxscale.cnf.d/WriteService.cnf'. The modified values will override the values found in the static configuration files. To hide these warnings, run: export MAXCTRL_WARNINGS=0 Warning: Saving runtime modifications to 'ReadService' in '/home/mysql/product/maxscale/var/lib/maxscale/maxscale.cnf.d/ReadService.cnf'. The modified values will override the values found in the static configuration files.;Saving runtime modifications to 'WriteService' in '/home/mysql/product/maxscale/var/lib/maxscale/maxscale.cnf.d/WriteService.cnf'. The modified values will override the values found in the static configuration files. To hide these warnings, run: export MAXCTRL_WARNINGS=0 OK maxctrl> list servers ┌─────────┬──────────────┬──────┬─────────────┬──────────────────────┬───────────────┐ │ Server │ Address │ Port │ Connections │ State │ GTID │ ├─────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┤ │ master1 │ 192.168.1.11 │ 3306 │ 0 │ Master, Running │ 0-3308-631119 │ ├─────────┼──────────────┼──────┼─────────────┼──────────────────────┼───────────────┤ │ master2 │ 192.168.1.12 │ 3306 │ 0 │ Maintenance, Running │ 0-3308-631119 │ └─────────┴──────────────┴──────┴─────────────┴──────────────────────┴───────────────┘
Un-drain a node with MariaDB MaxScale
For un-draining the node we tried:
maxctrl> clear server master2 drain OK
On the first look everything seems to be OK. But the State of master2
was still in Maintenance
. The command:
maxctrl> clear server master2 maintenance
did the job... If this is a bug or should be considered as an operator error I do not know...
Draining the active Master caused a switchover.
Sources
- Multi-Master Monitor
- Common Monitor Parameters
- Automatic Failover With MariaDB Monitor
- MariaDB Monitor
- MariaDB MaxScale Configuration Guide
- MaxScale 6 MaxCtrl - Drain Server
- Shinguz's blog
- Log in or register to post comments