dbstat for MariaDB (and MySQL)
An idea that I have been thinking about for a long time and have now, thanks to a customer, finally tackled, is dbstat for MariaDB/MySQL. The idea is based on sar/sysstat by Sebastien Godard:
sar - Collect, report, or save system activity information.
and Oracle Statspack:
Statspack is a performance tuning tool ... to quickly gather detailed analysis of the performance of that database instance.
Functionality of dbstat
Although we have had the Performance Schema for some time now, it does not cover some points that we see as problems in practice and that customers ask for:
- The `table_size` module collects data on the growth of tables. This allows statements to be made about the growth of individual tables, databases, future MariaDB Catalogs or the entire instance. This is interesting for users who run multi-tenant systems or are otherwise struggling with uncontrolled growth.
- The `processlist` module takes a snapshot of the process list at regular intervals and saves it. This information is useful for post-mortem analyses if the user was too slow to save their process list, or to understand how a problem built up.
- Problems are often caused by long-running transactions, row locks or metadata locks. These are recorded and saved by the `trx_and_lck` and `metadata_lock` modules. This means we can see problems that we did not even notice before, or we can see what led to the problem after the accident (analogous to a tachograph in a vehicle).
- Another question that we sometimes encounter in practice is: When was which database variable changed, and what did it look like before? This is covered by the `global_variables` module. Unfortunately, it is not possible to find out who changed the variable or why; operational processes are required for that.
- The last module, `global_status`, actually covers what sar/sysstat does. It collects the values from `SHOW GLOBAL STATUS;` and saves them for later analysis or to simply create graphs.
How does dbstat work
`dbstat` uses the database Event Scheduler as its scheduler. On MariaDB this must first be switched on (`event_scheduler = ON`); on MySQL it is already enabled by default. The Event Scheduler has the advantage that we can schedule the jobs at a finer granularity, for example every 10 s, which would not be possible with the crontab.
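Switching it on at runtime is a one-liner; to survive restarts, the setting should also go into the server configuration (a minimal sketch):

-- Enable the Event Scheduler at runtime (MariaDB; MySQL has it on by default):
SET GLOBAL event_scheduler = ON;
-- To persist across restarts, also set it in the server configuration:
-- [mysqld]
-- event_scheduler = ON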
The Event Scheduler then executes SQL/PSM code, on the one hand to collect the data and on the other hand to delete it again, so that the `dbstat` database does not grow immeasurably.
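Purely as an illustration (the real event and table definitions are shipped in the `create_*.sql` files, and the column list of the dbstat tables may differ), a collection job for the processlist module could look roughly like this:

-- Illustrative sketch only, NOT the shipped definition:
CREATE EVENT IF NOT EXISTS processlist_gather
ON SCHEDULE EVERY 1 MINUTE
DO
  INSERT INTO processlist (connection_id, ts, user, host, db, command, time, state, query)
  SELECT id, NOW(), user, host, db, command, time, state, info
    FROM information_schema.PROCESSLIST;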
The following jobs are currently planned:
| Module | Collect | Delete | Quantity structure | Remarks |
|---|---|---|---|---|
| table_size | 1/d at 02:04 | 12/h, 1000 rows, > 31 d | 1000 tab × 31 d = 31k rows | Should work up to 288k tables. |
| processlist | 1/min | 1/min, 1000 rows, > 7 d | 1000 con × 1440 min × 7 d = 10M rows | Should work up to 1000 concurrent connections. |
| trx_and_lck | 1/min | 1/min, 1000 rows, > 7 d | 100 lck × 1440 min × 7 d = 1M rows | Depends very much on the application. |
| metadata_lock | 1/min | 12/h, 1000 rows, > 30 d | 100 mdl × 1440 min × 30 d = 4M rows | Depends very much on the application. |
| global_variables | 1/min | never | 1000 rows | Normally this table should not grow. |
| global_status | 1/min | 1/min, 1000 rows, > 30 d | 1000 rows × 1440 min × 30 d = 40M rows | Rows can become large? |
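The delete jobs work along the same lines. A purge event in the spirit of the table above could look like this (again only a sketch, not the shipped code):

-- Sketch of a purge job: remove at most 1000 processlist rows older than 7 days per run.
CREATE EVENT IF NOT EXISTS processlist_purge
ON SCHEDULE EVERY 1 MINUTE
DO
  DELETE FROM processlist
   WHERE ts < NOW() - INTERVAL 7 DAY
   LIMIT 1000;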
How to install dbstat
`dbstat` can be downloaded from GitHub and is licensed under the GPLv2.
The installation is simple: First execute the SQL file `create_user_and_db.sql`. Then execute the corresponding `create_*.sql` files for the respective modules in the `dbstat` database. There are currently no direct dependencies between the modules. If you want to use a different user or a different database than `dbstat`, you have to take care of this yourself.
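In the mariadb/mysql client this boils down to something like the following (the module file names below, apart from `create_user_and_db.sql`, are examples; use the `create_*.sql` files from the repository):

-- Installation sketch; run in the mariadb client as a user with sufficient privileges.
SOURCE create_user_and_db.sql
USE dbstat
-- Load one create_*.sql file per module you want to enable, for example:
SOURCE create_table_size.sql
SOURCE create_processlist.sql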
Querying dbstat
Some possible queries on the data have already been prepared. They can be found in the `query_*.sql` files. Here are a few examples:
table_size
SELECT `table_schema`, `table_name`, `ts`, `table_rows`, `data_length`, `index_length`
FROM `table_size`
WHERE `table_catalog` = 'def'
AND `table_schema` = 'dbstat'
AND `table_name` = 'table_size'
ORDER BY `ts` ASC
;
+--------------+------------+---------------------+------------+-------------+--------------+
| table_schema | table_name | ts | table_rows | data_length | index_length |
+--------------+------------+---------------------+------------+-------------+--------------+
| dbstat | table_size | 2024-03-09 20:01:00 | 0 | 16384 | 16384 |
| dbstat | table_size | 2024-03-10 17:26:33 | 310 | 65536 | 16384 |
| dbstat | table_size | 2024-03-11 08:28:12 | 622 | 114688 | 49152 |
| dbstat | table_size | 2024-03-12 08:02:38 | 934 | 114688 | 49152 |
| dbstat | table_size | 2024-03-13 08:08:55 | 1247 | 278528 | 81920 |
+--------------+------------+---------------------+------------+-------------+--------------+
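Building on the same table, growth can also be aggregated, for example per schema and day (a sketch):

-- Total data and index volume per schema and day (sketch).
SELECT table_schema, DATE(ts) AS day
     , SUM(data_length + index_length) AS total_bytes
  FROM table_size
 GROUP BY table_schema, DATE(ts)
 ORDER BY table_schema, day ASC
;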
processlist
SELECT connection_id, ts, time, state, SUBSTR(REGEXP_REPLACE(REPLACE(query, "\n", ' '), '\ +', ' '), 1, 64) AS query
FROM processlist
WHERE command != 'Sleep'
AND connection_id = @connection_id
ORDER BY ts ASC
LIMIT 5
;
+---------------+---------------------+---------+---------------------------------+---------------------------------------------+
| connection_id | ts | time | state | query |
+---------------+---------------------+---------+---------------------------------+---------------------------------------------+
| 14956 | 2024-03-09 20:21:12 | 13.042 | Waiting for table metadata lock | update test set data = 'bla' where id = 100 |
| 14956 | 2024-03-09 20:22:12 | 73.045 | Waiting for table metadata lock | update test set data = 'bla' where id = 100 |
| 14956 | 2024-03-09 20:23:12 | 133.044 | Waiting for table metadata lock | update test set data = 'bla' where id = 100 |
| 14956 | 2024-03-09 20:24:12 | 193.044 | Waiting for table metadata lock | update test set data = 'bla' where id = 100 |
| 14956 | 2024-03-09 20:25:12 | 253.041 | Waiting for table metadata lock | update test set data = 'bla' where id = 100 |
+---------------+---------------------+---------+---------------------------------+---------------------------------------------+
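Aggregated over all connections, the same table shows, for example, how many non-idle connections were in which state at each snapshot (a sketch):

-- Number of non-idle connections per state and snapshot (sketch).
SELECT ts, state, COUNT(*) AS connections
  FROM processlist
 WHERE command != 'Sleep'
 GROUP BY ts, state
 ORDER BY ts ASC
;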
trx_and_lck
SELECT * FROM trx_and_lck\G
*************************** 1. row ***************************
machine_name:
connection_id: 14815
trx_id: 269766
ts: 2024-03-09 20:05:57
user: root
host: localhost
db: test
command: Query
time: 41.000
running_since: 2024-03-09 20:05:16
state: Statistics
info: select * from test where id = 6 for update
trx_state: LOCK WAIT
trx_started: 2024-03-09 20:05:15
trx_requested_lock_id: 269766:821:5:7
trx_tables_in_use: 1
trx_tables_locked: 1
trx_lock_structs: 2
trx_rows_locked: 1
trx_rows_modified: 0
lock_mode: X
lock_type: RECORD
lock_table_schema: test
lock_table_name: test
lock_index: PRIMARY
lock_space: 821
lock_page: 5
lock_rec: 7
lock_data: 6
*************************** 2. row ***************************
machine_name:
connection_id: 14817
trx_id: 269760
ts: 2024-03-09 20:05:57
user: root
host: localhost
db: test
command: Sleep
time: 60.000
running_since: 2024-03-09 20:04:57
state:
info:
trx_state: RUNNING
trx_started: 2024-03-09 20:04:56
trx_requested_lock_id: NULL
trx_tables_in_use: 0
trx_tables_locked: 1
trx_lock_structs: 2
trx_rows_locked: 1
trx_rows_modified: 1
lock_mode: X
lock_type: RECORD
lock_table_schema: test
lock_table_name: test
lock_index: PRIMARY
lock_space: 821
lock_page: 5
lock_rec: 7
lock_data: 6
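If only the waiting side is of interest, the snapshot can be filtered on the transaction state (a sketch):

-- Transactions that were waiting for a lock at snapshot time (sketch).
SELECT ts, connection_id, trx_id, trx_started, trx_rows_locked
     , lock_mode, lock_type, lock_table_schema, lock_table_name, lock_index
  FROM trx_and_lck
 WHERE trx_state = 'LOCK WAIT'
 ORDER BY ts ASC
;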
metadata_lock
SELECT lock_mode, ts, user, host, lock_type, table_schema, table_name, time, started, state, query
FROM metadata_lock
WHERE connection_id = 14347
ORDER BY started DESC
LIMIT 5
;
+-------------------------+---------------------+------+-----------+----------------------+--------------+------------+-------+---------------------+----------------+------------------------------------------------------+
| lock_mode | ts | user | host | lock_type | table_schema | table_name | time | started | state | query |
+-------------------------+---------------------+------+-----------+----------------------+--------------+------------+-------+---------------------+----------------+------------------------------------------------------+
| MDL_SHARED_WRITE | 2024-03-13 10:27:33 | root | localhost | Table metadata lock | test | test | 1.000 | 2024-03-13 10:27:32 | Updating | UPDATE test set data3 = MD5(id) |
| MDL_BACKUP_TRANS_DML | 2024-03-13 10:27:33 | root | localhost | Backup lock | | | 1.000 | 2024-03-13 10:27:32 | Updating | UPDATE test set data3 = MD5(id) |
| MDL_BACKUP_ALTER_COPY | 2024-03-13 10:22:33 | root | localhost | Backup lock | | | 0.000 | 2024-03-13 10:22:33 | altering table | ALTER TABLE test DROP INDEX ts, ADD INDEX (ts, data) |
| MDL_SHARED_UPGRADABLE | 2024-03-13 10:22:33 | root | localhost | Table metadata lock | test | test | 0.000 | 2024-03-13 10:22:33 | altering table | ALTER TABLE test DROP INDEX ts, ADD INDEX (ts, data) |
| MDL_INTENTION_EXCLUSIVE | 2024-03-13 10:22:33 | root | localhost | Schema metadata lock | test | | 0.000 | 2024-03-13 10:22:33 | altering table | ALTER TABLE test DROP INDEX ts, ADD INDEX (ts, data) |
+-------------------------+---------------------+------+-----------+----------------------+--------------+------------+-------+---------------------+----------------+------------------------------------------------------+
global_variables
SELECT variable_name, COUNT(*) AS cnt
FROM global_variables
GROUP BY variable_name
HAVING COUNT(*) > 1
;
+-------------------------+-----+
| variable_name | cnt |
+-------------------------+-----+
| innodb_buffer_pool_size | 7 |
+-------------------------+-----+
SELECT variable_name, ts, variable_value
FROM global_variables
WHERE variable_name = 'innodb_buffer_pool_size'
;
+-------------------------+---------------------+----------------+
| variable_name | ts | variable_value |
+-------------------------+---------------------+----------------+
| innodb_buffer_pool_size | 2024-03-09 21:36:28 | 134217728 |
| innodb_buffer_pool_size | 2024-03-09 21:40:25 | 268435456 |
| innodb_buffer_pool_size | 2024-03-09 21:48:14 | 134217728 |
+-------------------------+---------------------+----------------+
global_status
SELECT s1.ts
, s1.variable_value AS 'table_open_cache_misses'
, s2.variable_value AS 'table_open_cache_hits'
FROM global_status AS s1
JOIN global_status AS s2 ON s1.ts = s2.ts
WHERE s1.variable_name = 'table_open_cache_misses'
AND s2.variable_name = 'table_open_cache_hits'
AND s1.ts BETWEEN '2024-03-13 11:55:00' AND '2024-03-13 12:05:00'
ORDER BY ts ASC
;
+---------------------+-------------------------+-----------------------+
| ts | table_open_cache_misses | table_open_cache_hits |
+---------------------+-------------------------+-----------------------+
| 2024-03-13 11:55:47 | 1001 | 60711 |
| 2024-03-13 11:56:47 | 1008 | 61418 |
| 2024-03-13 11:57:47 | 1015 | 62125 |
| 2024-03-13 11:58:47 | 1022 | 62829 |
| 2024-03-13 11:59:47 | 1029 | 63533 |
| 2024-03-13 12:00:47 | 1036 | 64237 |
| 2024-03-13 12:01:47 | 1043 | 64944 |
| 2024-03-13 12:02:47 | 1050 | 65651 |
| 2024-03-13 12:03:47 | 1057 | 66355 |
| 2024-03-13 12:04:47 | 1064 | 67059 |
+---------------------+-------------------------+-----------------------+
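Since most global status variables are cumulative counters, the deltas between consecutive snapshots are usually more telling than the absolute values. With window functions (MariaDB 10.2 and newer) this can be sketched as follows:

-- Per-interval increase of a counter between consecutive snapshots (sketch).
SELECT ts, variable_value
     , variable_value - LAG(variable_value) OVER (ORDER BY ts) AS delta
  FROM global_status
 WHERE variable_name = 'table_open_cache_misses'
 ORDER BY ts ASC
;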
Testing
We have currently rolled out `dbstat` on our test and production systems to test it and to see whether our assumptions regarding stability and the calculations of the quantity structure are correct. In addition, using it ourselves is the best way to find out if something is missing or if the handling is impractical (eat your own dog food).