You are here
Do not underestimate performance impacts of swapping on NUMA database systems
If your MariaDB or MySQL database system is swapping it can have a significant impact on your database query performance! Further it can also slow down your database shutdown and thus influence the whole reboot of your machine. This is especially painful if you have only short maintenance windows or if you do not want to spend the whole night with operation tasks.
When we do reviews of our customer MariaDB or MySQL database systems one of the items to check is Swap Space and swapping. With the free
command you can find if your system has Swap Space enabled at all and how much of your Swap Space is used:
# free total used free shared buff/cache available Mem: 16106252 3300424 697284 264232 12108544 12011972 Swap: 31250428 1701792 29548636
With the command:
# swapon --show NAME TYPE SIZE USED PRIO /dev/sdb2 partition 29.8G 1.6G -1
you can show on which disk drive your Swap Space is physically located. And with the following 3 commands you can find if your system is currently swapping or not:
# vmstat 1 procs ------------memory------------ ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 1 0 1701784 692580 355716 11757864 2 12 685 601 237 146 9 3 86 2 0 0 0 1701784 692472 355720 11757840 0 0 0 196 679 2350 2 1 97 1 0 0 0 1701784 692720 355728 11757332 0 0 0 104 606 2170 0 1 98 1 0 # sar -W 1 15:44:30 pswpin/s pswpout/s 15:44:31 0.00 0.00 15:44:32 0.00 0.00 15:44:33 0.00 0.00 # sar -S 1 15:43:02 kbswpfree kbswpused %swpused kbswpcad %swpcad 15:43:03 29548648 1701780 5.45 41552 2.44 15:43:04 29548648 1701780 5.45 41552 2.44 15:43:05 29548648 1701780 5.45 41552 2.44
Side note: Recent Linux distributions tend to use Swap Files instead of Swap Partitions. The performance impact seems to be negligible compared to the operational advantages of Swap Files... [ 1 ] [ 2 ] [ 3 ] [ 4 ]
What is Swap Space on a Linux system
Modern Operating Systems like Linux manage Virtual Memory (VM) which consists of RAM (fast) and Disk (HDD very slow and SSD slow). If the Operating System is short in fast RAM it tries to write some "old" pages to slow disk to get more free fast RAM for "new" pages and/or for the file system cache. This technique enables the Operating System to keep more and/or bigger processes running than physical RAM is available (overcommitment of RAM).
If one of those "old" pages is needed again it has to be swapped in which technically is a physical random disk read (which is slow, this is also called a major page fault).
If this block is a MariaDB or MySQL database block this disk read to RAM will slow down your SELECT
queries but also INSERT
, UPDATE
and DELETE
when you do write queries. This can severely slow down for example your clean-up jobs which have to remove "old" data (located on disk possibly in Swap Space).
Sizing of Swap Space for database systems
A rule of thumb for Swap Space is: Have always Swap Space but never use it (disk is cheap nowadays)!
A reasonable Swap Space sizing for database systems is the following:
Amount of RAM | Swap Space |
---|---|
4 GiB of RAM or less | a minimum of 4 GiB of Swap Space, is this really a Database server? |
8 GiB to 16 GiB of RAM | a minimum of once the amount of RAM of Swap Space |
24 GiB to 64 GiB of RAM | a minimum of half the amount of RAM of Swap Space |
more than 64 GiB of RAM | a minimum of 32 GiB of Swap Space |
If you have a close look at your Swap usage and if you monitor your Swap Space precisely and if you know exactly what you are doing you can lower these values...
It is NOT recommended to disable Swap Space
Some people tend to disable Swap Space. We see this mainly in virtualized environments (virtual machines) and cloud servers. From the VM/Cloud administrator point of view I can even understand why they disable Swap. But from the MariaDB / MySQL DBA point of view this is a bad idea.
If you do proper MariaDB / MySQL configuration (innodb_buffer_pool_size = 75% of RAM) the server should not swap a lot. But if you exaggerate with memory configuration the system starts swapping heavily. Till to the end the OOM-Killer will be activated by your Linux killing the troublemaker (typically the database process). If you have sufficient Swap Space enabled you get some time to detect a bad database configuration and act accordingly. If you have Swap Space disabled completely you do not get this safety buffer and OOM killer will act immediately and kill your database process when you run out of RAM. This really cannot be in the interest of the DBA.
Some literature to read further about Swap: In defence of swap: common misconceptions
Influence swapping - Swappiness
The Linux kernel documentation tells us the following about swappiness:
swappiness
This control is used to define how aggressive the kernel will swap memory pages. Higher values will increase aggressiveness, lower values decrease the amount of swap. A value of 0 instructs the kernel not to initiate swap until the amount of free and file-backed pages is less than the high water mark in a zone.
The default value is 60.
Source: Documentation for /proc/sys/vm/*
A informative article on StackExchange: Why is swappiness set to 60 by default?
To change your swappiness the following commands will help:
# sysctl vm.swappiness vm.swappiness = 60 # sysctl vm.swappiness=1 # sysctl vm.swappiness vm.swappiness = 1
To make these changes persistent you have to write it to some kind of configuration file dependent on your Operating System:
# # /etc/sysctl.d/99-sysctl.conf # vm.swappiness=1
Who is using the Swap Space?
For further analysing your Swap Space and to find who is using your Swap Space please see our article MariaDB and MySQL swap analysis.
What if your system is still swapping? - NUMA!
If you did everything correctly until here and your system is still swapping you possibly missed one point: NUMA systems behave a bit tricky related to Databases and swapping. The first person who wrote extensively about this problem in the MySQL ecosystem was Jeremy Cole in 2010 in his two well written articles which you can find here:
- The MySQL “swap insanity” problem and the effects of the NUMA architecture and
- A brief update on NUMA and MySQL
What NUMA is you can find here: Non-uniform memory access.
If you have spent your money for an expensive NUMA system you can find with the following command:
# lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 56 On-line CPU(s) list: 0-55 Thread(s) per core: 2 Core(s) per socket: 14 Socket(s): 2 NUMA node(s): 2 Vendor ID: GenuineIntel CPU family: 6 Model: 79 Model name: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz Stepping: 1 CPU MHz: 2600.000 CPU max MHz: 2600.0000 CPU min MHz: 1200.0000 BogoMIPS: 5201.37 Virtualization: VT-x Hypervisor vendor: vertical Virtualization type: full L1d cache: 32K L1i cache: 32K L2 cache: 256K L3 cache: 35840K NUMA node0 CPU(s): 0-13,28-41 NUMA node1 CPU(s): 14-27,42-55
If you are now in the unfortunate situation of having such a huge box with several sockets you can do different things:
- Configuring your MariaDB / MySQL database to allocate memory evenly on both sockets with the parameter
innodb_numa_interleave
. This works since MySQL 5.6.27, MySQL 5.7.9 and MariaDB 10.2.4 but there were various bugs in this area in Debian and CentOS packages (e.g. #80288, #78953, #79354 and MDEV-18660). - Disable NUMA support in your BIOS (Node Interleaving = enabled). Then there is no NUMA presentation to the Operating System any more.
- Start your MariaDB / MySQL database with
numactl --interleave all
as described here: MySQL and NUMA. - Set
innodb_buffer_pool_size
to 75% of half of your RAM. Sad for having too much of RAM. - Playing around with the following Linux settings could help to decrease swapping:
vm.zone_reclaim_mode=0
andkernel.numa_balancing=0
.
Some NUMA map examples
Download: numa-maps-summary.pl
un-balanced NUMA system | balanced NUMA sytem |
---|---|
./numa-maps-summary.pl < /proc/$(pidof mysqld)/numa_maps N0 : 28978412 (110.54 GB) N1 : 10777305 ( 41.11 GB) active : 35345437 (134.83 GB) anon : 39751779 (151.64 GB) dirty : 39751779 (151.64 GB) kernelpagesize_kB: 23320 ( 0.09 GB) mapmax : 367 ( 0.00 GB) mapped : 4033 ( 0.02 GB) |
./numa-maps-summary.pl < /proc/$(pidof mysqld)/numa_maps N0 : 11819082 ( 45.09 GB) N1 : 8917524 ( 34.02 GB) active : 247 ( 0.00 GB) anon : 20732465 ( 79.09 GB) dirty : 20732465 ( 79.09 GB) kernelpagesize_kB: 24600 ( 0.09 GB) mapmax : 398 ( 0.00 GB) mapped : 4236 ( 0.02 GB) |
Sources
Some further information about Swap Space you can find here:
- Shinguz's blog
- Log in or register to post comments
Comments
A different opinion
Kernel version 3.5 and newer: disables swappiness
swappiness
can have a value between 0 and 100.swappiness=0
:swappiness=1
:swappiness=100
:Source: How do I configure swappiness?
InnoDB NUMA interleave with MariaDB