How does Galera Cluster behave with many nodes?
Recently I had a large number of Linux systems (VMs with Rocky Linux 9) from one of our regular Galera Cluster trainings all to myself for a week, with MariaDB 11.4.4 and Galera Cluster already installed on the machines.
Since I had long wanted to find out how a Galera Cluster behaves with an increasing number of nodes, this was the opportunity to do so.
The following questions were to be answered:
- How does the throughput of a Galera cluster behave depending on the number of Galera nodes?
- Which configuration gives us the highest throughput?
A total of 5 different test parameters were experimented with:
- Number of Galera nodes.
- Number of client machines (= instances).
- Number of threads per client (`--threads`).
- Number of Galera threads (`wsrep_slave_threads`).
- Runtime of the tests. This parameter was varied because some tests were aborted during the run. It might be possible to eliminate this parameter with a lower request rate (`--rate`) in the load test (a sketch follows below). As it turned out, the runtime did have an influence on the measured throughput (e.g. tests 4b and 5, or 18 and 19).
A total of 35 different tests were run. See raw data.
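As mentioned in the parameter list above, a lower fixed request rate could make the runtime factor irrelevant: as long as the system can keep up, the measured throughput is pinned to the rate and no longer depends on how long the test runs. A minimal sketch, assuming the same sysbench setup as in the benchmark section below (the concrete `--rate` value of 500 is a hypothetical example, not one of the tested settings):

```
# sysbench oltp_read_write --time=300 --rate=500 --db-driver=mysql --mysql-host=${GALERA_IP} --mysql-user=app --mysql-password=secret --mysql-db=${DATABASE} --threads=8 --report-interval=1 run
```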
Throughput as a function of the number of Galera nodes
Test | # Galera nodes | # threads/client | runtime [s] | tps |
---|---|---|---|---|
7 | 1 | 8 | 180 | 596.3 |
8 | 2 | 8 | 180 | 567.8 |
9 | 3 | 8 | 180 | 531.9 |
11 | 4 | 8 | 180 | 495.2 |
12 | 5 | 8 | 180 | 492.2 |
13 | 6 | 8 | 180 | 502.9 |
14 | 7 | 8 | 180 | 459.5 |
15 | 8 | 8 | 180 | 458.6 |
16 | 9 | 8 | 180 | 429.2 |
When the number of nodes was increased from 1 to 9, the throughput of the Galera cluster decreased from roughly 600 tps to 430 tps, i.e. by about 28%.
Throughput as a function of the number of connections
The main variation here was in the number of client machines and the number of threads per client. The optimum in this setup seems to be at around 30 to 40 connections. Varying the number of Galera threads (`wsrep_slave_threads`) does not seem to have had much effect in our case. The system does not seem to be able to deliver much more than 1200 tps; in particular, the machines of the Galera nodes did not have much CPU idle time left.
Test | # client nodes | # threads/client | # connections total | # Galera threads | runtime [s] | tps |
---|---|---|---|---|---|---|
16 | 1 | 8 | 8 | 1 | 180 | 429.2 |
17 | 2 | 8 | 16 | 1 | 180 | 684.5 |
18 | 3 | 8 | 24 | 1 | 180 | 603.8 |
19 | 3 | 8 | 24 | 1 | 120 | 925.2 |
20 | 3 | 8 | 24 | 1 | 120 | 919.8 |
21 | 4 | 8 | 32 | 1 | 120 | 1081.1 |
22 | 5 | 8 | 40 | 1 | 120 | 1196.0 |
23 | 5 | 8 | 40 | 4 | 120 | 1132.2 |
23b | 5 | 8 | 40 | 8 | 120 | 1106.0 |
24 | 5 | 16 | 80 | 4 | 120 | 1233.8 |
25 | 5 | 32 | 160 | 4 | 120 | 1095.7 |
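A side note on varying `wsrep_slave_threads` (tests 23, 23b and 24): the variable is dynamic in MariaDB, so it can be changed between test runs without restarting the nodes. A minimal sketch using the standard Galera applier status variables:

```
SQL> -- Change the number of Galera applier threads at runtime (on every node)
SQL> SET GLOBAL wsrep_slave_threads = 4;

SQL> -- Check how much parallelism the appliers actually achieve
SQL> SHOW GLOBAL STATUS LIKE 'wsrep_apply_%';
```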
Throughput as a function of all possible parameters
By varying the parameters further, in particular by reducing the number of Galera nodes from 9 to 3, the throughput could be increased from just under 1200 to just over 1400 tps.
Test | # Galera nodes | # client nodes | # threads/client | # connections total | tps |
---|---|---|---|---|---|
23 | 9 | 5 | 8 | 40 | 1132.2 |
23b | 9 | 5 | 8 | 40 | 1106.0 |
24 | 9 | 5 | 16 | 80 | 1233.8 |
25 | 9 | 5 | 32 | 160 | 1095.7 |
26 | 8 | 5 | 32 | 160 | 1132.4 |
27 | 7 | 5 | 32 | 160 | 1207.6 |
28 | 6 | 5 | 16 | 80 | 1333.3 |
29 | 5 | 5 | 8 | 40 | 1278.6 |
30 | 5 | 5 | 8 | 40 | 1281.5 |
31 | 4 | 5 | 8 | 40 | 1374.1 |
32 | 3 | 5 | 8 | 40 | 1304.3 |
33 | 3 | 6 | 8 | 48 | 1428.9 |
With the given hardware, there seems to be an optimum at around 3 Galera nodes and approx. 40 connections. Further investigation would be interesting here...
Statistical Design of Experiments (DoE)
It would be interesting to apply the method of statistical design of experiments here to determine this optimum more precisely, or to find it more quickly.
- Mettler Toledo: DoE - Statistical Design of Experiments - A statistical approach to reaction optimisation.
- Wikipedia (en): Statistical design of experiments.
- Design of Experiments - DoE.
- Novustat: Quality Engineering: Statistical Design of Experiments - Our ultimate overview!
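Even without the full DoE machinery, the parameter space can at least be swept systematically. Below is a minimal full-factorial sketch over the two client-side factors, assuming the sysbench setup from the benchmark section further down; a real DoE approach (e.g. a fractional factorial design) would get by with considerably fewer runs:

```
#!/bin/bash

# Hypothetical full-factorial sweep: threads per client x Galera applier
# threads. The number of Galera nodes is assumed to be reconfigured
# manually between sweeps. Setting the global variable requires a user
# with the appropriate privilege.

for THREADS in 8 16 32 ; do
    for WSREP_THREADS in 1 4 8 ; do
        mariadb --host=${GALERA_IP} --user=app --password=secret \
                --execute="SET GLOBAL wsrep_slave_threads = ${WSREP_THREADS};"
        echo "threads=${THREADS} wsrep_slave_threads=${WSREP_THREADS}"
        sysbench oltp_read_write --time=120 --db-driver=mysql \
            --mysql-host=${GALERA_IP} --mysql-user=app --mysql-password=secret \
            --mysql-db=${DATABASE} --threads=${THREADS} run \
          | grep 'transactions:'
    done
done
```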
Hardware specification
VMs from Hetzner: CX22 (2 vCPUs, 4 Gibyte RAM (effectively 3.5 Gibyte; why is that?), 40 Gibyte disk)
```
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         40 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  2
  On-line CPU(s) list:   0,1
Vendor ID:               GenuineIntel
  BIOS Vendor ID:        QEMU
  Model name:            Intel Xeon Processor (Skylake, IBRS, no TSX)
    BIOS Model name:     NotSpecified
    CPU family:          6
    Model:               85
    Thread(s) per core:  1
    Core(s) per socket:  2
    Socket(s):           1
    Stepping:            4
    BogoMIPS:            4589.21
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault pti ssbd ibrs ibpb fsgsbase bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat pku ospke md_clear
Virtualization features:
  Hypervisor vendor:     KVM
  Virtualization type:   full
Caches (sum of all):
  L1d:                   64 KiB (2 instances)
  L1i:                   64 KiB (2 instances)
  L2:                    8 MiB (2 instances)
  L3:                    16 MiB (1 instance)
```
Benchmark tool / load generator
sysbench
was used as a load generator.
```
# dnf install epel-release
# dnf install sysbench
```
Each client runs against its own schema to avoid conflicts in the Galera cluster. In reality this is not always the case, but it is the optimal case for Galera.
```
SQL> CREATE DATABASE sbtest<n>;
```
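The sysbench calls below assume a database user `app` that can access these schemata. Creating this user is not shown in the setup above; a hypothetical sketch of how it could look:

```
SQL> CREATE USER 'app'@'%' IDENTIFIED BY 'secret';
SQL> GRANT ALL PRIVILEGES ON `sbtest%`.* TO 'app'@'%';
```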
Each client connects to a different Galera node (1 - 6 clients distributed over 1 - 9 Galera nodes).
```
GALERA_IP=<galera_ip>
DATABASE=sbtest<n>

# sysbench oltp_common --mysql-host=${GALERA_IP} --mysql-user=app --mysql-password=secret --mysql-db=${DATABASE} --db-driver=mysql prepare

# sysbench oltp_read_write --time=180 --db-driver=mysql --mysql-host=${GALERA_IP} --mysql-user=app --mysql-password=secret --mysql-db=${DATABASE} --threads=8 --rate=1000 --report-interval=1 run

# sysbench oltp_common --mysql-host=${GALERA_IP} --mysql-user=app --mysql-password=secret --mysql-db=${DATABASE} --db-driver=mysql cleanup
```
MariaDB and Galera configuration
```
[server]

binlog_format                  = row
innodb_autoinc_lock_mode       = 2
innodb_flush_log_at_trx_commit = 2
query_cache_size               = 0
query_cache_type               = 0

wsrep_on                       = on
wsrep_provider                 = /usr/lib64/galera-4/libgalera_smm.so
wsrep_cluster_address          = "gcomm://10.0.0.2,10.0.0.3,10.0.0.4,10.0.0.5,10.0.0.6,10.0.0.7,10.0.0.8,10.0.0.9,10.0.0.10,10.0.0.11,10.0.0.12,10.0.0.13,10.0.0.14,10.0.0.15,10.0.0.16,10.0.0.17"
wsrep_cluster_name             = 'Galera Cluster'
wsrep_node_address             = 10.0.0.2
wsrep_sst_method               = rsync
wsrep_sst_auth                 = sst:secret
```
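To verify after start-up that all nodes have actually joined the cluster, the standard Galera status variables can be queried on any node:

```
SQL> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';
SQL> SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
```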