Shinguz

FromDual RSS feed about MySQL, Galera Cluster, MariaDB and Percona Server

FromDual Ops Center for MariaDB and MySQL 0.9.2 has been released

Tue, 2019-08-13 16:50

FromDual is pleased to announce the release of version 0.9.2 of its popular FromDual Ops Center for MariaDB and MySQL (focmm).

The FromDual Ops Center for MariaDB and MySQL (focmm) helps DBAs and system administrators manage MariaDB and MySQL database farms. Ops Center makes the life of DBAs and admins easier!

The main task of Ops Center is to support you in your daily MySQL and MariaDB operation tasks. More information about FromDual Ops Center can be found here.

Download

The new FromDual Ops Center for MariaDB and MySQL (focmm) can be downloaded from here. How to install and use focmm is documented in the Ops Center User Guide.

In the inconceivable case that you find a bug in the FromDual Ops Center for MariaDB and MySQL, please report it to the FromDual bug tracker or just send us an email.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Installation of Ops Center 0.9.2

A complete guide on how to install FromDual Ops Center can be found in the Ops Center User Guide.

Upgrade from 0.9.x to 0.9.2

The upgrade from 0.9.x to 0.9.2 should happen automatically. Please back up your Ops Center instance before you upgrade! Please also check Upgrading.

Changes in Ops Center 0.9.2

Instance
  • Replication: MariaDB 10.4 problem with mysql.user table fixed.
  • Restore: Clean-up work done.
  • Restore: Point-in-Time-Recovery is implemented.
  • Operations: Binary Log state added.

Security
  • Potential XSS hole fixed.
  • Bootstrap CSS updated to 4.3.1.

General
  • Newest MyEnv Library added.
  • Function array_pop replaced by array_shift.
  • Tests against MariaDB 10.4 were successful. MariaDB 10.4 is officially supported now.

Build and Packaging
  • Dependency bug in Debian control file fixed.

FromDual Backup and Recovery Manager for MariaDB and MySQL 2.2.1 has been released

Tue, 2019-08-06 09:12

FromDual is pleased to announce the release of version 2.2.1 of its popular Backup and Recovery Manager for MariaDB and MySQL (brman).

The new FromDual Backup and Recovery Manager can be downloaded from here. The FromDual Repositories were updated. How to install and use the Backup and Recovery Manager is described in the FromDual Backup and Recovery Manager (brman) installation guide.

In the inconceivable case that you find a bug in the FromDual Backup and Recovery Manager, please report it to the FromDual Bugtracker or just send us an email.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Upgrade from 2.x to 2.2.1

shell> cd ${HOME}/product
shell> tar xf /download/brman-2.2.1.tar.gz
shell> rm -f brman
shell> ln -s brman-2.2.1 brman

Changes in FromDual Backup and Recovery Manager 2.2.1

This release is a new minor release. It contains only bug fixes. We have tried to maintain backward compatibility with the 1.2, 2.0 and 2.1 release series, but you should test the new release thoroughly!

You can verify your current FromDual Backup Manager version with the following command:

shell> fromdual_bman --version
shell> bman --version
shell> rman --version

General
  • No changes.

FromDual Backup Manager
  • Workaround for MariaDB binlog purge problem added to binlog backup.

FromDual Recovery Manager
  • PiTR did not find binary logs if they were from a foreign instance.

FromDual brman Catalog
  • No changes.

Subscriptions for commercial use of FromDual Backup and Recovery Manager are available from us.

FromDual Backup and Recovery Manager for MariaDB and MySQL 2.2.0 has been released

Mon, 2019-07-29 16:51

FromDual is pleased to announce the release of version 2.2.0 of its popular Backup and Recovery Manager for MariaDB and MySQL (brman).

The new FromDual Backup and Recovery Manager can be downloaded from here. The FromDual Repositories were updated. How to install and use the Backup and Recovery Manager is described in the FromDual Backup and Recovery Manager (brman) installation guide.

In the inconceivable case that you find a bug in the FromDual Backup and Recovery Manager, please report it to the FromDual Bugtracker or just send us an email.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Upgrade from 2.x to 2.2.0

shell> cd ${HOME}/product
shell> tar xf /download/brman-2.2.0.tar.gz
shell> rm -f brman
shell> ln -s brman-2.2.0 brman

Changes in FromDual Backup and Recovery Manager 2.2.0

This release is a new major release series. It contains some new features. We have tried to maintain backward compatibility with the 1.2, 2.0 and 2.1 release series, but you should test the new release thoroughly!

You can verify your current FromDual Backup Manager version with the following command:

shell> fromdual_bman --version
shell> bman --version
shell> rman --version

General
  • Automated testing improved.
  • Packaging optimized. Especially unnecessary dependencies removed.
  • Code redundancy removed; code made more readable and partially reorganized.

FromDual Backup Manager
  • Requirement checks improved. Warnings do not spoil rc any more.
  • Bug with --pass-through and --per-schema fixed (MGB).
  • Backup size is now correctly reported to fpmmm. And message is written to the log.
  • Timestamp format was wrong in start backup tag.
  • Globals removed from function doPrivilegeBackup.
  • Columns of table mysql.user quoted for MariaDB 10.4 compatibility.
  • Test on fpmmm-cache-file fixed from dir to file.
  • Some bugs in function binlogBackup simulation fixed.
  • Too verbose debugging disabled.
  • Sftp archiving did not fail or complain if directory does not exist.
  • Better input validation for option --archivedestination.
  • Function checkArguments renamed to checkBmanOptions.
  • Wrong error message for servers without binary log removed.
  • Gathering of binlog file and position is skipped if the binary log is not enabled.

FromDual Recovery Manager
  • Missing --stop-instance option added to usage.
  • PHP error for omitted option --policy fixed for logical and physical restore. This allows us to restore from conventional backup infrastructure as well.
  • Tables slave_master_info and slave_relay_log_info in mysql schema are omitted in restore progress output.
  • Error message for missing --backup-name option improved.
  • Error handling and messages improved as preparation for PiTR.
  • Binlog restore type removed and check made accordingly that it is not possible any more.
  • PiTR pre Recover test implemented.
  • Various tests for test automation improved.
  • Binlog position discovery possible also for compressed files.
  • Execution time information added.
  • Option --debug added to rman.
  • Point-in-Time-Recovery for logical and physical backup implemented.
  • Instance starting for physical backup was implemented.
  • Option --simulate was implemented everywhere.
  • Option --disable-log-bin added and error handling improved.
  • Option --config activated in rman.
  • Fast PiTR is implemented.
  • Only full physical restore is supported at the moment. This was not caught correctly.
  • Schema PiTR also possible.

FromDual brman Catalog
  • No changes.

Subscriptions for commercial use of FromDual Backup and Recovery Manager are available from us.

MariaDB/MySQL Advanced Training in August 2019 in Cologne

Mon, 2019-07-29 14:29

Summer holiday season: for all those who stayed at home, things should hopefully be a little quieter now. Time for some further training? There is not much time left to use up the annual training budget!

From 19 to 23 August, FromDual is once again holding the MariaDB and MySQL Advanced training in Cologne. See also our training schedule.

There are still seats available! You can register directly with our training partner, GFU Cyrus AG.

This MariaDB/MySQL training is aimed at all DBAs, DevOps engineers and system administrators who look after MariaDB and MySQL databases and would like to understand how to do this even better.

In this training we cover backup, restore and point-in-time recovery of both a small and a large database, as well as setting up highly available MariaDB and MySQL databases using master/slave replication and Galera Cluster, including switch-over techniques. Finally, we spend two days on database performance tuning (hardware, O/S, DB configuration, schema tuning, SQL query tuning, etc.).

All of this comes with numerous exercises, so that you can put what you have learned into practice right away!

The training is held in German.

The detailed content of this five-day MySQL/MariaDB training can be found here.

If you have further questions, please contact us.

Who else is using my memory - File System Cache analysis

Fri, 2019-07-19 15:52

See also our former articles:

When we analyse MariaDB database servers we also check the available memory (RAM and swap):

# free --kilo --wide
              total        used        free      shared     buffers       cache   available
Mem:       16106252     4329952      703356      199008      307872    10765072    11042748
Swap:      31250428      528684    30721744

The values for buffers and especially for cache can sometimes be quite big. In this case they use about 10 GiB. So let us have a look at what these things called buffers and cache, which use our valuable RAM, actually are. When we check the man page of free we find:

# man free
...
buffers     Memory used by kernel buffers (Buffers in /proc/meminfo)
cache       Memory used by the page cache and slabs (Cached and Slab in /proc/meminfo)
buff/cache  Sum of buffers and cache

So let us check the more fine-grained information in /proc/meminfo, which is an interface to the kernel data structures:

# cat /proc/meminfo | grep -e ^Cached -e Slab -e Buffers
Buffers:          307872 kB
Cached:         10155156 kB
Slab:             609916 kB

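Same values! Cached plus Slab (10155156 kB + 609916 kB = 10765072 kB) is exactly the cache value reported by free, and Buffers matches as well. Then let us have a look at what the man page of proc says about these values: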

# man proc
...
Buffers  Relatively temporary storage for raw disk blocks that shouldn't get tremendously large (20MB or so).
Cached   In-memory cache for files read from the disk (the page cache). Doesn't include SwapCached.
Slab     In-kernel data structures cache.

So it looks like we have a raw I/O cache (called Buffer Cache) and a file system I/O cache (called Page Cache). So how does this work? What is raw I/O? And is file system I/O cached once (Cached) or twice (Cached and Buffers)?

When we dig a bit deeper we find that prior to Linux kernel 2.4 the two caches were distinct, which was a waste of memory (RAM). It seems that today this is no longer the case [1], [2], [3], and that the man pages are a bit out of date or at least not very precise.
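
The relation between the cache column of free and /proc/meminfo noted above can be checked with a few lines of Python. This is only a minimal sketch: the helper names parse_meminfo and free_cache_kb are made up for this illustration, and the formula (Cached + Slab) follows the free man page quoted above. Other versions of free may count slabs differently.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


def free_cache_kb(meminfo):
    """Cache column of free as documented in the man page quoted above.

    Assumption: cache = Cached + Slab.
    """
    return meminfo["Cached"] + meminfo["Slab"]


# Sample values taken from the /proc/meminfo output shown above.
SAMPLE = """Buffers:          307872 kB
Cached:         10155156 kB
Slab:             609916 kB
"""

sample = parse_meminfo(SAMPLE)
print("buffers:", sample["Buffers"], "kB")
print("cache:  ", free_cache_kb(sample), "kB")
```

On a live system the same function can be fed with the contents of /proc/meminfo, for example parse_meminfo(open("/proc/meminfo").read()).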

Analysing the Linux Page Cache

A very good source when it comes to Linux performance tuning and measuring is Brendan Gregg's website. To measure the Linux Page Cache hit ratio he provides a tool called cachestat, which is part of the perf-tools collection on GitHub.

With cachestat we get per-second statistics of the Buffer Cache and the Page Cache (without Slabs): cache hits, cache misses, dirty buffer entries in the cache and the cache hit ratio:

# sudo cachestat
Counting cache functions... Output every 1 seconds.
    HITS   MISSES  DIRTIES    RATIO   BUFFERS_MB   CACHE_MB
    1419        8        0    99.4%          338       9406
    1368        0        0   100.0%          338       9406
    1391        0        0   100.0%          338       9406
    8558        0       29   100.0%          338       9406
   31870        0      163   100.0%          338       9406
    1374        0       24   100.0%          338       9406
    1388        0        0   100.0%          338       9406
    1370        0        0   100.0%          338       9406
    1388        0        0   100.0%          338       9406
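
The RATIO column can be reproduced from the HITS and MISSES columns. The following minimal Python sketch assumes that the ratio is simply hits divided by all cache accesses (hits plus misses), which matches the numbers shown above. The function name hit_ratio is made up for this illustration and is not part of cachestat.

```python
def hit_ratio(hits, misses):
    """Return the cache hit ratio in percent, or None if there was no access.

    Assumption: ratio = hits / (hits + misses).
    """
    total = hits + misses
    if total == 0:
        return None
    return 100.0 * hits / total


# First two samples (HITS, MISSES) from the cachestat output above.
for hits, misses in [(1419, 8), (1368, 0)]:
    print(f"{hits:8d} {misses:8d} {hit_ratio(hits, misses):7.1f}%")
```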

Brendan Gregg also mentions a tool called pcstat (on GitHub) by Al Tobey, which gets Page Cache statistics for files. Unfortunately I had some problems building it on my Ubuntu 16.04 with Go version 1.6. So I built it on Ubuntu 18.04 (Go 1.10) and copied it over to Ubuntu 16.04:

# export GOPATH=/tmp/
# cd $GOPATH
# go get golang.org/x/sys/unix
# go get github.com/tobert/pcstat/pcstat
# bin/pcstat $GOPATH/bin/pcstat

Then I tried pcstat out against a MariaDB 10.4 instance. In the output we can see how big the files are in bytes, how many pages of 4 KiB this corresponds to, how many of these 4 KiB pages are cached and the percentage of pages cached:

# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 14642          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 13631488       | 3328       | 0         | 000.000 |
+------------------------------------------------------+----------------+------------+-----------+---------+

When we run pcstat over time with the famous watch command we can even see how the Page Cache is heating up:

# watch -d -n 1 'pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test* ; free -w'
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 14642          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 2416      | 012.418 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 3165      | 004.829 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 5890      | 008.987 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 1         | 100.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 13631488       | 3328       | 1164      | 034.976 |
+------------------------------------------------------+----------------+------------+-----------+---------+
              total        used        free      shared     buffers       cache   available
Mem:       16106252     4329952      703356      199008      307872    10765072    11042748
Swap:      31250428      528684    30721744
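
The Pages and Percent columns follow directly from the file size and the number of cached pages. Here is a minimal Python sketch of this arithmetic, assuming a page size of 4 KiB as in the output above. The names PAGE_SIZE, pages_for and cached_percent are made up for this illustration and are not taken from pcstat.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as in the pcstat output above


def pages_for(size_bytes, page_size=PAGE_SIZE):
    """Number of pages needed to hold a file (rounded up to whole pages)."""
    return (size_bytes + page_size - 1) // page_size


def cached_percent(cached_pages, total_pages):
    """Percentage of a file's pages that are in the Page Cache."""
    if total_pages == 0:
        return 0.0
    return 100.0 * cached_pages / total_pages


# (file, size in bytes, cached pages) taken from the output above.
samples = [
    ("ib_buffer_pool", 14642, 0),
    ("ibdata1", 79691776, 2416),
    ("test.ibd", 13631488, 1164),
]
for name, size, cached in samples:
    pages = pages_for(size)
    print(f"{name:15s} {pages:6d} pages {cached_percent(cached, pages):7.3f}% cached")
```

The page size of the running system can be queried with os.sysconf("SC_PAGE_SIZE") instead of hard-coding 4096.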

Another tool which was discussed on Brendan's website is vmtouch - the Virtual Memory Toucher (on GitHub, Documentation). With vmtouch we can see, for example, how much of the directory /home/mysql/database/mariadb-104/data (datadir) is currently in the cache:

# vmtouch -f /home/mysql/database/mariadb-104/data
           Files: 503
     Directories: 9
  Resident Pages: 29356/231060  114M/902M  12.7%
         Elapsed: 0.009668 seconds

Or, more fine-grained, how much of the InnoDB system files is currently in memory:

# vmtouch -f -v /home/mysql/database/mariadb-104/data/ib*
/home/mysql/database/mariadb-104/data/ib_buffer_pool
[ ] 0/4
/home/mysql/database/mariadb-104/data/ibdata1
[oOooooo ooooo ] 2416/19456
/home/mysql/database/mariadb-104/data/ib_logfile0
[o oOO] 3165/65536
/home/mysql/database/mariadb-104/data/ib_logfile1
[OOOOOOOOOOOOOOOOOOOOOo ] 23192/65536
/home/mysql/database/mariadb-104/data/ibtmp1
[ ] 0/3072

           Files: 5
     Directories: 0
  Resident Pages: 28773/153604  112M/600M  18.7%
         Elapsed: 0.005499 seconds

A further question to answer is: Can I see all files cached in the Page Cache? It seems that this is not easily possible:

There is no efficient search mechanism for doing the reverse - getting a file name belonging to a data block would require reading all inodes and indirect blocks on the file system. If you need to know about every single file's blocks stored in the page cache, you would need to supply a list of all files on your file system(s) to fincore. But that again is likely to spoil the measurement as a large amount of data would be read traversing the directories and getting all inodes and indirect blocks - putting them into the page cache and evicting the very page cache data you were trying to examine. [5]

In the same article we can also read about the Linux File Tools (linux-ftools) by Google. They seem to be a bit more complicated to get working, so I left it at that.

How is the Page Cache related to MariaDB

After all this technical O/S discussion, how is the Linux Page Cache related to your MariaDB Database? Your MariaDB Database caches Data and Indexes as well. For the InnoDB Storage Engine this is the InnoDB Buffer Pool and for the Aria Storage Engine this is the Aria Page Cache Buffer. So if your MariaDB Database caches pages and your Linux O/S caches pages as well, the probability is high that they cache the same data twice and thus waste valuable RAM! Fortunately InnoDB can be configured so that it does NOT cache InnoDB files in the Page Cache. This is controlled with the InnoDB Server Variable innodb_flush_method.

When we look at InnoDB Files which were opened in a "normal" way (default: innodb_flush_method = fsync) we get the following information about how the files were opened (man 2 open and [6]):

# lsof +fg ./ib*
COMMAND  PID USER   FD   TYPE FILE-FLAG     DEVICE  SIZE/OFF    NODE NAME
mysqld  2098 mysql   7uW  REG RW,LG,0x80000    8,1  79691776 9175185 ./ibdata1
mysqld  2098 mysql  11uW  REG RW,LG,0x80000    8,1 268435456 9175186 ./ib_logfile0
mysqld  2098 mysql  12uW  REG RW,LG,0x80000    8,1 268435456 9175187 ./ib_logfile1
mysqld  2098 mysql  13uW  REG RW,LG,0x80000    8,1  12582912 9175280 ./ibtmp1

The interesting column here is the FILE-FLAG column which indicates (man lsof):

# man lsof
FILE-FLAG  when g or G has been specified to +f, this field contains
           the contents of the f_flag[s] member of the kernel file
           structure and the kernel's per-process open file flags (if
           available); `G' causes them to be displayed in hexadecimal;
           `g', as short-hand names; two lists may be displayed with
           entries separated by commas, the lists separated by a
           semicolon (`;'); the first list may contain short-hand names
           for f_flag[s] values from the following table:

                DIR       direct
                LG        large file
                RW        read and write access

This output is not yet completely clear, so we also want to see the open file flags in hexadecimal notation:

# lsof +fG ./ib*
COMMAND  PID USER   FD   TYPE FILE-FLAG   DEVICE  SIZE/OFF    NODE NAME
mysqld  2098 mysql   7uW  REG 0x88002;0x0    8,1  79691776 9175185 ./ibdata1
mysqld  2098 mysql  11uW  REG 0x88002;0x0    8,1 268435456 9175186 ./ib_logfile0
mysqld  2098 mysql  12uW  REG 0x88002;0x0    8,1 268435456 9175187 ./ib_logfile1
mysqld  2098 mysql  13uW  REG 0x88002;0x0    8,1  12582912 9175280 ./ibtmp1

The Linux Kernel open file flags can be found here: fcntl.h. I have extracted the most relevant open file flags for our examination:

#define O_RDWR       00000002  (oct, 0x00002)
#define O_DIRECT     00040000  (oct, 0x04000)  /* direct disk access hint */
#define O_LARGEFILE  00100000  (oct, 0x08000)
#define O_CLOEXEC    02000000  (oct, 0x80000)  /* set close_on_exec */

So we can see that these 4 InnoDB files were opened with O_RDWR (RW), O_LARGEFILE (LG) and O_CLOEXEC (not available (yet?) in lsof translation output).

Now let us start the MariaDB Database with the server variable set to: innodb_flush_method = O_DIRECT and check how the files were opened:

# lsof +fg ./ib*
COMMAND  PID USER   FD   TYPE FILE-FLAG         DEVICE  SIZE/OFF    NODE NAME
mysqld  2098 mysql   7uW  REG RW,DIR,LG,0x80000    8,1  79691776 9175185 ./ibdata1
mysqld  2098 mysql  11uW  REG RW,LG,0x80000        8,1 268435456 9175186 ./ib_logfile0
mysqld  2098 mysql  12uW  REG RW,LG,0x80000        8,1 268435456 9175187 ./ib_logfile1
mysqld  2098 mysql  13uW  REG RW,DIR,LG,0x80000    8,1  12582912 9175280 ./ibtmp1

# lsof +fG ./ib*
COMMAND  PID USER   FD   TYPE FILE-FLAG   DEVICE  SIZE/OFF    NODE NAME
mysqld  2098 mysql   7uW  REG 0x8c002;0x0    8,1  79691776 9175185 ./ibdata1
mysqld  2098 mysql  11uW  REG 0x88002;0x0    8,1 268435456 9175186 ./ib_logfile0
mysqld  2098 mysql  12uW  REG 0x88002;0x0    8,1 268435456 9175187 ./ib_logfile1
mysqld  2098 mysql  13uW  REG 0x8c002;0x0    8,1  12582912 9175280 ./ibtmp1

We can see a new flag DIR or 0x04000 which means the files were opened with O_DIRECT. But this only applies to the InnoDB Temporary Table Tablespace and the InnoDB System Tablespace, not to the two InnoDB Transaction Logs.

Translation of hex to oct: 0x8c002 = 02140002.
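The decoding done by hand above can also be scripted. The following is a minimal sketch in POSIX shell, assuming the flag values from the fcntl.h extract (x86 / asm-generic Linux; other architectures use different values). The function name decode_open_flags is made up for illustration and only knows the four flags discussed here:

```shell
# Decode the subset of open(2) flags discussed in this article.
# Assumption: flag values as on x86 / asm-generic Linux (see fcntl.h extract).
decode_open_flags() {
    flags=$1
    # The access mode occupies the two lowest bits (O_ACCMODE = 03).
    case $(( $flags & 03 )) in
        0) names="O_RDONLY" ;;
        1) names="O_WRONLY" ;;
        2) names="O_RDWR" ;;
        *) names="(invalid access mode)" ;;
    esac
    # Each remaining flag is a single bit, given here in octal as in fcntl.h.
    if [ $(( $flags & 040000 )) -ne 0 ]; then names="$names O_DIRECT"; fi
    if [ $(( $flags & 0100000 )) -ne 0 ]; then names="$names O_LARGEFILE"; fi
    if [ $(( $flags & 02000000 )) -ne 0 ]; then names="$names O_CLOEXEC"; fi
    echo "$names"
}
```

Called as decode_open_flags 0x8c002 it lists O_DIRECT in addition to the three flags found in 0x88002. The hex to octal translation itself can be done with printf '%#o\n' 0x8c002.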

But what does O_DIRECT mean? Looking at the open(2) man pages we can find:

# man 2 open
O_DIRECT (since Linux 2.4.10)
       Try to minimize cache effects of the I/O to and from this file.
       In general this will degrade performance, but it is useful in
       special situations, such as when applications do their own
       caching. File I/O is done directly to/from user-space buffers.
       The O_DIRECT flag on its own makes an effort to transfer data
       synchronously, but does not give the guarantees of the O_SYNC
       flag that data and necessary metadata are transferred. To
       guarantee synchronous I/O, O_SYNC must be used in addition to
       O_DIRECT.

So O_DIRECT is exactly what we want in this case: Bypassing the File System Page Cache to not cache the Database blocks twice!

To verify the impact we run pcstat again:

# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 16020          | 4          | 4         | 100.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 140       | 000.720 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 36844     | 056.219 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 65536     | 100.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 1         | 100.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 67108864       | 16384      | 13400     | 081.787 |
+------------------------------------------------------+----------------+------------+-----------+---------+

But... part of the InnoDB Tablespace files is still cached! Also checking the total amount of Buffers and Cache shows the same:

# free
              total        used        free      shared  buff/cache   available
Mem:       16106252     4401788      368200      456716    11336264    10691792
Swap:      31250428     1348440    29901988

So restarting the MariaDB database does not purge the Page Cache! Note: This matters because bypassing the Page Cache avoids wasting valuable RAM, but it also makes a Database restart much more costly, because the Page Cache no longer helps to warm up the InnoDB Buffer Pool!

Then let us clear the Linux Page Cache and check the result:

# echo 1 > /proc/sys/vm/drop_caches

# free -w
              total        used        free      shared     buffers       cache   available
Mem:       16106252     4395892    10539864      441708         696     1169800    10882984
Swap:      31250428     1348428    29902000
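According to the Linux kernel documentation for vm.drop_caches, writing 1 frees the Page Cache, 2 frees reclaimable Slab objects (dentries and inodes) and 3 frees both. Dirty pages are not freed, so it is common to run sync first. The following is a minimal sketch of a small wrapper; the function name drop_caches and its optional second argument (a target file, which exists only so the sketch can be tried without root) are assumptions made for illustration:

```shell
# Write a drop_caches mode after flushing dirty pages.
#   $1: mode, 1 = page cache, 2 = dentries and inodes, 3 = both (default: 1)
#   $2: target file (hypothetical parameter, default: /proc/sys/vm/drop_caches)
drop_caches() {
    mode=${1:-1}
    target=${2:-/proc/sys/vm/drop_caches}
    case $mode in
        1|2|3) ;;
        *) echo "usage: drop_caches [1|2|3] [target]" >&2; return 2 ;;
    esac
    # Dirty pages cannot be dropped, so write them back first.
    sync
    echo "$mode" > "$target"
}
```

Run as root, drop_caches 1 does the same as the echo command above, with a preceding sync.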

Checking with pcstat shows now that all InnoDB pages are wiped out of the Page Cache:

# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 16020          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 67108864       | 16384      | 0         | 000.000 |
+------------------------------------------------------+----------------+------------+-----------+---------+

And after a while running traffic on the test table we can see that InnoDB Transaction Log Files are cached again in the Page Cache but NOT the InnoDB Tablespace files:

# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 16020          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 3012      | 004.596 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 71303168       | 17408      | 0         | 000.000 |
+------------------------------------------------------+----------------+------------+-----------+---------+

Also with vmtouch we can see the difference:

# ./vmtouch -f -v /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
/home/mysql/database/mariadb-104/data/ib_buffer_pool
[ ] 0/4
/home/mysql/database/mariadb-104/data/ibdata1
[ ] 0/19456
/home/mysql/database/mariadb-104/data/ib_logfile0
[o oOOOo ] 4252/65536
/home/mysql/database/mariadb-104/data/ib_logfile1
[ ] 0/65536
/home/mysql/database/mariadb-104/data/ibtmp1
[ ] 0/3072
/home/mysql/database/mariadb-104/data/test/test.frm
[ ] 0/1
/home/mysql/database/mariadb-104/data/test/test.ibd
[ ] 0/17408

           Files: 7
     Directories: 0
  Resident Pages: 4252/171013  16M/668M  2.49%
         Elapsed: 0.003264 seconds

And also cachestat shows the effect of a flushed Buffer Cache and Page Cache:

# ./cachestat
Counting cache functions... Output every 1 seconds.
    HITS   MISSES  DIRTIES    RATIO   BUFFERS_MB   CACHE_MB
  677882       19      740   100.0%           67       1087
  679213       10      700   100.0%           67       1087
  677236        0      732   100.0%           67       1087
  685673       11      932   100.0%           67       1088
  677933        5      703   100.0%           67       1088

Caution: Depending on your underlying I/O system it can nevertheless make sense to run your MariaDB Database with innodb_flush_method = fsync in certain cases! See also PostgreSQL behaviour.

Note: This information could also be interesting for PostgreSQL DBAs because they do redundant buffering with their shared_buffers (why plural? It is just one!?!) and the O/S Page Cache as well!

What is Slab

Besides the Buffer Cache and the Page Cache itself, there is a third item in the /proc/meminfo statistics, listed as Slab. So what are Slabs? Slab seems to be a specific memory management (allocation) mechanism. It is used for frequently used objects in the Linux Kernel (buffer heads, inodes, dentries, etc.) [7-15]. So it contains something like other Linux Kernel Buffers and Kernel Caches.

Which other Linux Kernel Buffers and Kernel Caches exist can be found with the following command:

# sudo cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
nf_conntrack_1 14183 15275 320 25 2 : tunables 0 0 0 : slabdata 611 611 0
ext4_groupinfo_4k 8575 8596 144 28 1 : tunables 0 0 0 : slabdata 307 307 0
i915_gem_vma 523 950 320 25 2 : tunables 0 0 0 : slabdata 38 38 0
UDPv6 120 120 1088 30 8 : tunables 0 0 0 : slabdata 4 4 0
tw_sock_TCPv6 2668 2668 280 29 2 : tunables 0 0 0 : slabdata 92 92 0
request_sock_TCPv6 24 72 328 24 2 : tunables 0 0 0 : slabdata 3 3 0
TCPv6 68 105 2112 15 8 : tunables 0 0 0 : slabdata 7 7 0
cfq_queue 391 442 232 17 1 : tunables 0 0 0 : slabdata 26 26 0
mqueue_inode_cache 72 72 896 18 4 : tunables 0 0 0 : slabdata 4 4 0
fuse_request 20 40 400 20 2 : tunables 0 0 0 : slabdata 2 2 0
fuse_inode 1 21 768 21 4 : tunables 0 0 0 : slabdata 1 1 0
fat_cache 102 408 40 102 1 : tunables 0 0 0 : slabdata 4 4 0
hugetlbfs_inode_cache 28 84 584 28 4 : tunables 0 0 0 : slabdata 3 3 0
squashfs_inode_cache 25 50 640 25 4 : tunables 0 0 0 : slabdata 2 2 0
jbd2_journal_handle 340 340 48 85 1 : tunables 0 0 0 : slabdata 4 4 0
jbd2_journal_head 2040 2040 120 34 1 : tunables 0 0 0 : slabdata 60 60 0
jbd2_revoke_table_s 260 512 16 256 1 : tunables 0 0 0 : slabdata 2 2 0
jbd2_revoke_record_s 1152 1408 32 128 1 : tunables 0 0 0 : slabdata 11 11 0
ext4_inode_cache 208751 210840 1072 30 8 : tunables 0 0 0 : slabdata 7028 7028 0
ext4_free_data 320 448 64 64 1 : tunables 0 0 0 : slabdata 7 7 0
ext4_allocation_cont 128 128 128 32 1 : tunables 0 0 0 : slabdata 4 4 0
ext4_io_end 392 560 72 56 1 : tunables 0 0 0 : slabdata 10 10 0
ext4_extent_status 64412 77928 40 102 1 : tunables 0 0 0 : slabdata 764 764 0
dquot 144 160 256 16 1 : tunables 0 0 0 : slabdata 10 10 0
mbcache 226 292 56 73 1 : tunables 0 0 0 : slabdata 4 4 0
dio 273 350 640 25 4 : tunables 0 0 0 : slabdata 14 14 0
pid_namespace 42 42 2224 14 8 : tunables 0 0 0 : slabdata 3 3 0
ip4-frags 32 64 248 16 1 : tunables 0 0 0 : slabdata 4 4 0
RAW 396 396 896 18 4 : tunables 0 0 0 : slabdata 22 22 0
UDP 68 68 960 17 4 : tunables 0 0 0 : slabdata 4 4 0
tw_sock_TCP 10750 11136 280 29 2 : tunables 0 0 0 : slabdata 384 384 0
request_sock_TCP 96 96 328 24 2 : tunables 0 0 0 : slabdata 4 4 0
TCP 119 136 1920 17 8 : tunables 0 0 0 : slabdata 8 8 0
blkdev_queue 27 48 1336 24 8 : tunables 0 0 0 : slabdata 2 2 0
blkdev_requests 394 506 368 22 2 : tunables 0 0 0 : slabdata 23 23 0
blkdev_ioc 516 546 104 39 1 : tunables 0 0 0 : slabdata 14 14 0
user_namespace 104 104 304 26 2 : tunables 0 0 0 : slabdata 4 4 0
dmaengine-unmap-256 15 15 2112 15 8 : tunables 0 0 0 : slabdata 1 1 0
sock_inode_cache 1707 1950 640 25 4 : tunables 0 0 0 : slabdata 78 78 0
file_lock_cache 665 665 208 19 1 : tunables 0 0 0 : slabdata 35 35 0
net_namespace 40 40 7296 4 8 : tunables 0 0 0 : slabdata 10 10 0
shmem_inode_cache 3315 3432 656 24 4 : tunables 0 0 0 : slabdata 143 143 0
taskstats 96 96 328 24 2 : tunables 0 0 0 : slabdata 4 4 0
proc_inode_cache 6895 7072 624 26 4 : tunables 0 0 0 : slabdata 272 272 0
sigqueue 100 100 160 25 1 : tunables 0 0 0 : slabdata 4 4 0
bdev_cache 29 76 832 19 4 : tunables 0 0 0 : slabdata 4 4 0
kernfs_node_cache 43625 44982 120 34 1 : tunables 0 0 0 : slabdata 1323 1323 0
mnt_cache 518 546 384 21 2 : tunables 0 0 0 : slabdata 26 26 0
inode_cache 17519 17668 568 28 4 : tunables 0 0 0 : slabdata 631 631 0
dentry 424185 439992 192 21 1 : tunables 0 0 0 : slabdata 20952 20952 0
buffer_head 1112865 1112865 104 39 1 : tunables 0 0 0 : slabdata 28535 28535 0
vm_area_struct 53945 55300 200 20 1 : tunables 0 0 0 : slabdata 2765 2765 0
files_cache 260 299 704 23 4 : tunables 0 0 0 : slabdata 13 13 0
signal_cache 509 630 1088 30 8 : tunables 0 0 0 : slabdata 21 21 0
sighand_cache 346 405 2112 15 8 : tunables 0 0 0 : slabdata 27 27 0
task_struct 1189 1269 3584 9 8 : tunables 0 0 0 : slabdata 141 141 0
Acpi-Operand 5703 5824 72 56 1 : tunables 0 0 0 : slabdata 104 104 0
Acpi-Parse 1314 1314 56 73 1 : tunables 0 0 0 : slabdata 18 18 0
Acpi-State 204 204 80 51 1 : tunables 0 0 0 : slabdata 4 4 0
Acpi-Namespace 4077 4182 40 102 1 : tunables 0 0 0 : slabdata 41 41 0
anon_vma 19831 21522 80 51 1 : tunables 0 0 0 : slabdata 422 422 0
numa_policy 170 170 24 170 1 : tunables 0 0 0 : slabdata 1 1 0
radix_tree_node 321937 327740 584 28 4 : tunables 0 0 0 : slabdata 11705 11705 0
trace_event_file 3985 4002 88 46 1 : tunables 0 0 0 : slabdata 87 87 0
ftrace_event_field 86541 88570 48 85 1 : tunables 0 0 0 : slabdata 1042 1042 0
idr_layer_cache 533 555 2096 15 8 : tunables 0 0 0 : slabdata 37 37 0
kmalloc-8192 1246 1246 8192 4 8 : tunables 0 0 0 : slabdata 502 502 0
kmalloc-4096 658 720 4096 8 8 : tunables 0 0 0 : slabdata 90 90 0
kmalloc-2048 1955 2144 2048 16 8 : tunables 0 0 0 : slabdata 134 134 0
kmalloc-1024 44217 44384 1024 16 4 : tunables 0 0 0 : slabdata 2774 2774 0
kmalloc-512 3037 3808 512 16 2 : tunables 0 0 0 : slabdata 238 238 0
kmalloc-256 17465 20384 256 16 1 : tunables 0 0 0 : slabdata 1274 1274 0
kmalloc-192 27708 28665 192 21 1 : tunables 0 0 0 : slabdata 1365 1365 0
kmalloc-128 140581 143744 128 32 1 : tunables 0 0 0 : slabdata 4492 4492 0
kmalloc-96 168044 168378 96 42 1 : tunables 0 0 0 : slabdata 4009 4009 0
kmalloc-64 117533 123264 64 64 1 : tunables 0 0 0 : slabdata 1926 1926 0
kmalloc-32 80425 90368 32 128 1 : tunables 0 0 0 : slabdata 706 706 0
kmalloc-16 9513 11264 16 256 1 : tunables 0 0 0 : slabdata 44 44 0
kmalloc-8 6616 7168 8 512 1 : tunables 0 0 0 : slabdata 14 14 0
kmem_cache_node 320 320 64 64 1 : tunables 0 0 0 : slabdata 5 5 0
kmem_cache 208 208 256 16 1 : tunables 0 0 0 : slabdata 13 13 0
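How much memory each Slab cache occupies can be estimated from /proc/slabinfo as num_slabs x pagesperslab x page size. The following is a minimal sketch, assuming the slabinfo 2.1 column order (pagesperslab in column 6, num_slabs in column 15) and 4 KiB pages; the function name slab_cache_sizes is made up for illustration:

```shell
# Estimate the size of each slab cache from slabinfo-formatted text on stdin.
#   $1: page size in KiB (assumption: 4)
# Prints "<KiB> KiB  <cache name>", largest cache first.
slab_cache_sizes() {
    LC_ALL=C awk -v page_kib="${1:-4}" '
        /^#/ || /^slabinfo/ { next }    # skip the two header lines
        NF >= 16 { printf "%10d KiB  %s\n", $15 * $6 * page_kib, $1 }
    ' | sort -rn
}
```

A possible call is sudo cat /proc/slabinfo | slab_cache_sizes | head. For the buffer_head row above this gives 28535 x 1 x 4 KiB = 114140 KiB.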

If you want to see the most frequently used (hottest) Slabs you can see them top-like with slabtop. If you press c you can sort the Slabs by CACHE_SIZE:

# sudo slabtop
 Active / Total Objects (% used)    : 2249113 / 2280136 (98.6%)
 Active / Total Slabs (% used)      : 70256 / 70256 (100.0%)
 Active / Total Caches (% used)     : 86 / 121 (71.1%)
 Active / Total Size (% used)       : 597547.86K / 605445.30K (98.7%)
 Minimum / Average / Maximum Object : 0.01K / 0.26K / 18.56K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
294308 289889  98%    0.57K  10511       28    168176K radix_tree_node
105030 104435  99%    1.05K   3501       30    112032K ext4_inode_cache
745446 745446 100%    0.10K  19114       39     76456K buffer_head
 59984  59909  99%    1.00K   3749       16     59984K ecryptfs_inode_cache
 47520  47157  99%    1.00K   2970       16     47520K kmalloc-1024
215166 214987  99%    0.19K  10246       21     40984K dentry
139744 138452  99%    0.12K   4367       32     17468K kmalloc-128
179508 179011  99%    0.09K   4274       42     17096K kmalloc-96
 47140  45768  97%    0.20K   2357       20      9428K vm_area_struct
 14700  14700 100%    0.55K    525       28      8400K inode_cache
...

Literature

Taxonomy upgrade extras: memory, RAM, cache, file system, o_direct, innodb_flush_method, postgresql, innodb, tablespace

Who else is using my memory - File System Cache analysis

Fri, 2019-07-19 15:52

See also our former articles:

When we do analysis of MariaDB Database servers we also check the memory (RAM and Swap) available:

# free --kilo --wide
              total        used        free      shared     buffers       cache   available
Mem:       16106252     4329952      703356      199008      307872    10765072    11042748
Swap:      31250428      528684    30721744

The values for buffers and especially for cache can sometimes be quite big. In this case they use about 10 GiB. So let us have a look at what these things called buffers and cache, which use our valuable RAM, actually are... When we check the man pages of free we find:

# man free
...
buffers     Memory used by kernel buffers (Buffers in /proc/meminfo)
cache       Memory used by the page cache and slabs (Cached and Slab in /proc/meminfo)
buff/cache  Sum of buffers and cache

So let us check more fine-grained information in /proc/meminfo, which is an interface to the kernel data structures:

# cat /proc/meminfo | grep -e ^Cached -e Slab -e Buffers
Buffers:          307872 kB
Cached:         10155156 kB
Slab:             609916 kB

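Same values! Buffers matches the buffers column of free, and Cached + Slab (10155156 + 609916 = 10765072 kB) matches the cache column. Then let us have a look at the man pages of proc to see what we can find about these values: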

# man proc
...
Buffers  Relatively temporary storage for raw disk blocks that shouldn't get tremendously large (20MB or so).
Cached   In-memory cache for files read from the disk (the page cache). Doesn't include SwapCached.
Slab     In-kernel data structures cache.
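The relationship between the free columns and /proc/meminfo noted above can be recomputed with a few lines of awk. This is a minimal sketch that follows the man page excerpt of free quoted above (cache = Cached + Slab); newer versions of free may count only the reclaimable part of Slab, so treat the formula as an assumption. The function name meminfo_cache_summary is made up for illustration:

```shell
# Recompute the "buffers" and "cache" columns of free from meminfo text on stdin.
# Assumption: cache = Cached + Slab, as in the man page excerpt quoted above.
meminfo_cache_summary() {
    LC_ALL=C awk '
        $1 == "Buffers:" { buffers = $2 }
        $1 == "Cached:"  { cached  = $2 }
        $1 == "Slab:"    { slab    = $2 }
        END { printf "buffers=%d cache=%d\n", buffers, cached + slab }
    '
}
```

On a live system it can be called as meminfo_cache_summary < /proc/meminfo. With the three values shown above it prints buffers=307872 cache=10765072.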

So it looks like we have a raw I/O Cache (called Buffer Cache) and a File System I/O Cache (called Page Cache). So how does this work? What is raw I/O? And is File System I/O cached once (Cached) or twice (Cached and Buffers)?

When we dig a bit deeper we find that prior to Linux Kernel 2.4 the two Caches were distinct, which was a waste of memory (RAM). It seems that today this is no longer the case [1], [2], [3]. So are the man pages a bit out of date, or at least not very precise?

Analysing the Linux Page Cache

A very good source when it comes to Linux Performance Tuning and Measuring is Brendan Gregg's Website. To measure Linux Page Cache Hit Ratio he provides a tool called cachestat which is part of the perf-tools collection on GitHub.

With cachestat we get per-second statistics of the Buffer Cache and the Page Cache (without Slabs): Cache Hits, Cache Misses, Dirty Buffer Entries in the Cache and a Cache Hit Ratio:

# sudo cachestat
Counting cache functions... Output every 1 seconds.
    HITS   MISSES  DIRTIES    RATIO   BUFFERS_MB   CACHE_MB
    1419        8        0    99.4%          338       9406
    1368        0        0   100.0%          338       9406
    1391        0        0   100.0%          338       9406
    8558        0       29   100.0%          338       9406
   31870        0      163   100.0%          338       9406
    1374        0       24   100.0%          338       9406
    1388        0        0   100.0%          338       9406
    1370        0        0   100.0%          338       9406
    1388        0        0   100.0%          338       9406
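The RATIO column appears to be the share of hits among all cache accesses (hits plus misses). The following minimal sketch recomputes it for one row. The formula is an assumption that matches the rows shown above, and cache_hit_ratio is a hypothetical helper name, not part of cachestat:

```shell
# Recompute a cache hit ratio in percent from hits and misses.
#   $1: hits, $2: misses
cache_hit_ratio() {
    LC_ALL=C awk -v h="$1" -v m="$2" 'BEGIN {
        if (h + m == 0) { print "n/a"; exit }    # no cache accesses in this interval
        printf "%.1f%%\n", 100 * h / (h + m)
    }'
}
```

For the first row above, cache_hit_ratio 1419 8 prints 99.4%. With only one decimal place, a handful of misses among hundreds of thousands of hits is rounded to 100.0%.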

Brendan Gregg also mentions a tool called pcstat (on GitHub) by Al Tobey which gets Page Cache Statistics for Files. Unfortunately I had some problems building it on my Ubuntu 16.04 with Go version 1.6. So I built it on an Ubuntu 18.04 (Go 1.10) and copied it over to Ubuntu 16.04:

# export GOPATH=/tmp/
# cd $GOPATH
# go get golang.org/x/sys/unix
# go get github.com/tobert/pcstat/pcstat
# bin/pcstat $GOPATH/bin/pcstat

Then I tried pcstat out against a MariaDB 10.4 instance. In the output we can see how big the files are in bytes, how many pages of 4 KiB this corresponds to, how many of these 4 KiB pages are cached and the percentage of pages cached:

# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 14642          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 13631488       | 3328       | 0         | 000.000 |
+------------------------------------------------------+----------------+------------+-----------+---------+
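The Pages and Percent columns can be reproduced from the file size and the number of cached pages. The following is a minimal sketch, assuming a page size of 4096 bytes (check with getconf PAGESIZE); the function name pcstat_percent is made up for illustration and is not part of pcstat:

```shell
# Reproduce the Pages and Percent columns of pcstat.
#   $1: file size in bytes, $2: number of cached pages
#   $3: page size in bytes (assumption: 4096)
pcstat_percent() {
    size=$1
    cached=$2
    page=${3:-4096}
    # A partly filled last page still occupies a whole page, so round up.
    pages=$(( ($size + $page - 1) / $page ))
    LC_ALL=C awk -v c="$cached" -v p="$pages" 'BEGIN {
        pct = (p > 0) ? 100 * c / p : 0
        printf "%d pages, %.3f%% cached\n", p, pct
    }'
}
```

With a file size of 79691776 bytes (ibdata1) and, say, 2416 cached pages, pcstat_percent 79691776 2416 prints 19456 pages, 12.418% cached.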

When we run pcstat over time with the famous watch command we can even see how the Page Cache is heating up:

# watch -d -n 1 'pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test* ; free -w' +------------------------------------------------------+----------------+------------+-----------+---------+ | Name | Size (bytes) | Pages | Cached | Percent | |------------------------------------------------------+----------------+------------+-----------+---------| | /home/mysql/database/mariadb-104/data/ib_buffer_pool | 14642 | 4 | 0 | 000.000 | | /home/mysql/database/mariadb-104/data/ibdata1 | 79691776 | 19456 | 2416 | 012.418 | | /home/mysql/database/mariadb-104/data/ib_logfile0 | 268435456 | 65536 | 3165 | 004.829 | | /home/mysql/database/mariadb-104/data/ib_logfile1 | 268435456 | 65536 | 5890 | 008.987 | | /home/mysql/database/mariadb-104/data/ibtmp1 | 12582912 | 3072 | 0 | 000.000 | | /home/mysql/database/mariadb-104/data/test/test.frm | 1097 | 1 | 1 | 100.000 | | /home/mysql/database/mariadb-104/data/test/test.ibd | 13631488 | 3328 | 1164 | 034.976 | +------------------------------------------------------+----------------+------------+-----------+---------+ total used free shared buffers cache available Mem: 16106252 4329952 703356 199008 307872 10765072 11042748 Swap: 31250428 528684 30721744

An other tool which was discussed on Brendans Website was vmtouch - the Virtual Memory Toucher (on GitHub, Documentation). With vmtouch we can see for example how much of the directory /home/mysql/database/mariadb-104/data (datadir) is currently in cache:

# vmtouch -f /home/mysql/database/mariadb-104/data Files: 503 Directories: 9 Resident Pages: 29356/231060 114M/902M 12.7% Elapsed: 0.009668 seconds

Or more fine grained how much of InnoDB System Files are currently in memory:

# vmtouch -f -v /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/ib_buffer_pool [ ] 0/4 /home/mysql/database/mariadb-104/data/ibdata1 [oOooooo ooooo ] 2416/19456 /home/mysql/database/mariadb-104/data/ib_logfile0 [o oOO] 3165/65536 /home/mysql/database/mariadb-104/data/ib_logfile1 [OOOOOOOOOOOOOOOOOOOOOo ] 23192/65536 /home/mysql/database/mariadb-104/data/ibtmp1 [ ] 0/3072 Files: 5 Directories: 0 Resident Pages: 28773/153604 112M/600M 18.7% Elapsed: 0.005499 seconds

A further question to answer is: Can I see all files cached in the Page Cache? So it seem like this is not easily possible:

There is no efficient search mechanism for doing the reverse - getting a file name belonging to a data block would require reading all inodes and indirect blocks on the file system. If you need to know about every single file's blocks stored in the page cache, you would need to supply a list of all files on your file system(s) to fincore. But that again is likely to spoil the measurement as a large amount of data would be read traversing the directories and getting all inodes and indirect blocks - putting them into the page cache and evicting the very page cache data you were trying to examine. [5]

Also in this article we can read about the Linux File Tools (linux-ftools) by Google. It seems to be a bit more complicated to make them work. So I let it be.

How is the Page Cache related to MariaDB

After all this technical O/S discussion, how is Linux Page Cache related to your MariaDB Database? Your MariaDB Database caches Data and Indexes as well. For the InnoDB Storage Engine this is the InnoDB Buffer Pool and for the Aria Storage Engine this is the Aria Page Cache Buffer. So if your MariaDB Database caches pages and if your Linux O/S caches pages the probability is high they cache the same data twice and thus waste valuable RAM! Fortunately InnoDB is configurable in a way it does NOT cache InnoDB files in the Page Cache. This is controlled with the InnoDB Server Variable innodb_flush_method.

When we look at InnoDB Files which were opened in a "normal" way (default: innodb_flush_method = fsync) we get the following information about how the files were opened (man 2 open and [6]):

# lsof +fg ./ib* COMMAND PID USER FD TYPE FILE-FLAG DEVICE SIZE/OFF NODE NAME mysqld 2098 mysql 7uW REG RW,LG,0x80000 8,1 79691776 9175185 ./ibdata1 mysqld 2098 mysql 11uW REG RW,LG,0x80000 8,1 268435456 9175186 ./ib_logfile0 mysqld 2098 mysql 12uW REG RW,LG,0x80000 8,1 268435456 9175187 ./ib_logfile1 mysqld 2098 mysql 13uW REG RW,LG,0x80000 8,1 12582912 9175280 ./ibtmp1

The interesting column here is the FILE-FLAG column which indicates (man lsof):

# man lsof FILE-FLAG when g or G has been specified to +f, this field contains the contents of the f_flag[s] member of the kernel file structure and the kernel's per-process open file flags (if available); `G' causes them to be displayed in hexadecimal; `g', as short-hand names; two lists may be displayed with entries separated by commas, the lists separated by a semicolon (`;'); the first list may contain short-hand names for f_flag[s] values from the following table: DIR direct LG large file RW read and write access

The output is not so clear or completely understandable yet thus we want to have the open file flags in hexadecimal notation:

# lsof +fG ./ib* COMMAND PID USER FD TYPE FILE-FLAG DEVICE SIZE/OFF NODE NAME mysqld 2098 mysql 7uW REG 0x88002;0x0 8,1 79691776 9175185 ./ibdata1 mysqld 2098 mysql 11uW REG 0x88002;0x0 8,1 268435456 9175186 ./ib_logfile0 mysqld 2098 mysql 12uW REG 0x88002;0x0 8,1 268435456 9175187 ./ib_logfile1 mysqld 2098 mysql 13uW REG 0x88002;0x0 8,1 12582912 9175280 ./ibtmp1

The Linux Kernel open file flags can be found here: fcntl.h. I have extracted the most relevant open file flags for our examination:

#define O_RDWR 00000002 (oct, 0x00002) #define O_DIRECT 00040000 (oct, 0x04000) /* direct disk access hint */ #define O_LARGEFILE 00100000 (oct, 0x08000) #define O_CLOEXEC 02000000 (oct, 0x80000) /* set close_on_exec */

So we can see that these 4 InnoDB files where opened with O_RDWR (RW), O_LARGE_FILE (LG) and O_CLOEXEC (not available (yet?) in lsof translation output).

Now let us start the MariaDB Database with the server variable set to: innodb_flush_method = O_DIRECT and check how the files where opened:

# lsof +fg ./ib* COMMAND PID USER FD TYPE FILE-FLAG DEVICE SIZE/OFF NODE NAME mysqld 2098 mysql 7uW REG RW,DIR,LG,0x80000 8,1 79691776 9175185 ./ibdata1 mysqld 2098 mysql 11uW REG RW,LG,0x80000 8,1 268435456 9175186 ./ib_logfile0 mysqld 2098 mysql 12uW REG RW,LG,0x80000 8,1 268435456 9175187 ./ib_logfile1 mysqld 2098 mysql 13uW REG RW,DIR,LG,0x80000 8,1 12582912 9175280 ./ibtmp1 # lsof +fG ./ib* COMMAND PID USER FD TYPE FILE-FLAG DEVICE SIZE/OFF NODE NAME mysqld 2098 mysql 7uW REG 0x8c002;0x0 8,1 79691776 9175185 ./ibdata1 mysqld 2098 mysql 11uW REG 0x88002;0x0 8,1 268435456 9175186 ./ib_logfile0 mysqld 2098 mysql 12uW REG 0x88002;0x0 8,1 268435456 9175187 ./ib_logfile1 mysqld 2098 mysql 13uW REG 0x8c002;0x0 8,1 12582912 9175280 ./ibtmp1

We can see a new flag DIR or 0x04000 which means the files where opened with O_DIRECT. But only the InnoDB Temporary Table Tablespace and the InnoDB System Tablespace but not the two InnoDB Transaction Logs.

Translation of hex to oct: 0x8c002 = 02140002.

But what does O_DIRECT mean? Looking at the open(2) man pages we can find:

# man 2 open O_DIRECT (since Linux 2.4.10) Try to minimize cache effects of the I/O to and from this file. In general this will degrade performance, but it is useful in special situations, such as when applications do their own caching. File I/O is done directly to/from user-space buffers. The O_DIRECT flag on its own makes an effort to transfer data synchronously, but does not give the guarantees of the O_SYNC flag that data and necessary meta‐ data are transferred. To guarantee synchronous I/O, O_SYNC must be used in addition to O_DIRECT.

So O_DIRECT is exactly what we want in this case: Bypassing the File System Page Cache to not cache the Database blocks twice!

To verify the impact we run pcstat again:

# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test* +------------------------------------------------------+----------------+------------+-----------+---------+ | Name | Size (bytes) | Pages | Cached | Percent | |------------------------------------------------------+----------------+------------+-----------+---------| | /home/mysql/database/mariadb-104/data/ib_buffer_pool | 16020 | 4 | 4 | 100.000 | | /home/mysql/database/mariadb-104/data/ibdata1 | 79691776 | 19456 | 140 | 000.720 | | /home/mysql/database/mariadb-104/data/ib_logfile0 | 268435456 | 65536 | 36844 | 056.219 | | /home/mysql/database/mariadb-104/data/ib_logfile1 | 268435456 | 65536 | 65536 | 100.000 | | /home/mysql/database/mariadb-104/data/ibtmp1 | 12582912 | 3072 | 0 | 000.000 | | /home/mysql/database/mariadb-104/data/test/test.frm | 1097 | 1 | 1 | 100.000 | | /home/mysql/database/mariadb-104/data/test/test.ibd | 67108864 | 16384 | 13400 | 081.787 | +------------------------------------------------------+----------------+------------+-----------+---------+

But... part of the InnoDB Tablespace files is still cached! Also checking the total amount of Buffers and Cache shows the same:

# free
              total        used        free      shared  buff/cache   available
Mem:       16106252     4401788      368200      456716    11336264    10691792
Swap:      31250428     1348440    29901988

So restarting the MariaDB database does not purge the Page Cache! Note: This matters because bypassing the Page Cache avoids wasting valuable RAM, but it also makes a database restart much more costly: the Page Cache can no longer help to warm up the InnoDB Buffer Pool!

Then let us clear the Linux Page Cache and check the result:

# echo 1 > /proc/sys/vm/drop_caches
# free -w
              total        used        free      shared     buffers       cache   available
Mem:       16106252     4395892    10539864      441708         696     1169800    10882984
Swap:      31250428     1348428    29902000

Checking with pcstat shows now that all InnoDB pages are wiped out of the Page Cache:

# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 16020          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 67108864       | 16384      | 0         | 000.000 |
+------------------------------------------------------+----------------+------------+-----------+---------+

And after a while running traffic on the test table we can see that InnoDB Transaction Log Files are cached again in the Page Cache but NOT the InnoDB Tablespace files:

# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 16020          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 3012      | 004.596 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 71303168       | 17408      | 0         | 000.000 |
+------------------------------------------------------+----------------+------------+-----------+---------+

Also with vmtouch we can see the difference:

# ./vmtouch -f -v /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
/home/mysql/database/mariadb-104/data/ib_buffer_pool
[ ] 0/4
/home/mysql/database/mariadb-104/data/ibdata1
[ ] 0/19456
/home/mysql/database/mariadb-104/data/ib_logfile0
[o oOOOo ] 4252/65536
/home/mysql/database/mariadb-104/data/ib_logfile1
[ ] 0/65536
/home/mysql/database/mariadb-104/data/ibtmp1
[ ] 0/3072
/home/mysql/database/mariadb-104/data/test/test.frm
[ ] 0/1
/home/mysql/database/mariadb-104/data/test/test.ibd
[ ] 0/17408
           Files: 7
     Directories: 0
  Resident Pages: 4252/171013  16M/668M  2.49%
         Elapsed: 0.003264 seconds

And also cachestat shows the effect of a flushed Buffer Cache and Page Cache:

# ./cachestat
Counting cache functions... Output every 1 seconds.
    HITS   MISSES  DIRTIES    RATIO   BUFFERS_MB   CACHE_MB
  677882       19      740   100.0%           67       1087
  679213       10      700   100.0%           67       1087
  677236        0      732   100.0%           67       1087
  685673       11      932   100.0%           67       1088
  677933        5      703   100.0%           67       1088

Caution: Depending on your underlying I/O system, it can nevertheless make sense in certain cases to run your MariaDB Database with innodb_flush_method = fsync! See also PostgreSQL behaviour.

Note: This information could also be interesting for PostgreSQL DBAs because they do redundant buffering with their shared_buffers (why plural? It is just one!?!) and the O/S Page Cache as well!

What is Slab

Besides the Buffer Cache and the Page Cache itself we have a third item in the /proc/meminfo statistics, listed as Slab. So what are Slabs? Slab is a specific memory management (allocation) mechanism used for frequently used objects in the Linux Kernel (buffer heads, inodes, dentries, etc.) [7-15]. So it contains something like other Linux Kernel Buffers and Kernel Caches.

Which other Linux Kernel Buffers and Kernel Caches exist can be found with the following command:

# sudo cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
nf_conntrack_1 14183 15275 320 25 2 : tunables 0 0 0 : slabdata 611 611 0
ext4_groupinfo_4k 8575 8596 144 28 1 : tunables 0 0 0 : slabdata 307 307 0
i915_gem_vma 523 950 320 25 2 : tunables 0 0 0 : slabdata 38 38 0
UDPv6 120 120 1088 30 8 : tunables 0 0 0 : slabdata 4 4 0
tw_sock_TCPv6 2668 2668 280 29 2 : tunables 0 0 0 : slabdata 92 92 0
request_sock_TCPv6 24 72 328 24 2 : tunables 0 0 0 : slabdata 3 3 0
TCPv6 68 105 2112 15 8 : tunables 0 0 0 : slabdata 7 7 0
cfq_queue 391 442 232 17 1 : tunables 0 0 0 : slabdata 26 26 0
mqueue_inode_cache 72 72 896 18 4 : tunables 0 0 0 : slabdata 4 4 0
fuse_request 20 40 400 20 2 : tunables 0 0 0 : slabdata 2 2 0
fuse_inode 1 21 768 21 4 : tunables 0 0 0 : slabdata 1 1 0
fat_cache 102 408 40 102 1 : tunables 0 0 0 : slabdata 4 4 0
hugetlbfs_inode_cache 28 84 584 28 4 : tunables 0 0 0 : slabdata 3 3 0
squashfs_inode_cache 25 50 640 25 4 : tunables 0 0 0 : slabdata 2 2 0
jbd2_journal_handle 340 340 48 85 1 : tunables 0 0 0 : slabdata 4 4 0
jbd2_journal_head 2040 2040 120 34 1 : tunables 0 0 0 : slabdata 60 60 0
jbd2_revoke_table_s 260 512 16 256 1 : tunables 0 0 0 : slabdata 2 2 0
jbd2_revoke_record_s 1152 1408 32 128 1 : tunables 0 0 0 : slabdata 11 11 0
ext4_inode_cache 208751 210840 1072 30 8 : tunables 0 0 0 : slabdata 7028 7028 0
ext4_free_data 320 448 64 64 1 : tunables 0 0 0 : slabdata 7 7 0
ext4_allocation_cont 128 128 128 32 1 : tunables 0 0 0 : slabdata 4 4 0
ext4_io_end 392 560 72 56 1 : tunables 0 0 0 : slabdata 10 10 0
ext4_extent_status 64412 77928 40 102 1 : tunables 0 0 0 : slabdata 764 764 0
dquot 144 160 256 16 1 : tunables 0 0 0 : slabdata 10 10 0
mbcache 226 292 56 73 1 : tunables 0 0 0 : slabdata 4 4 0
dio 273 350 640 25 4 : tunables 0 0 0 : slabdata 14 14 0
pid_namespace 42 42 2224 14 8 : tunables 0 0 0 : slabdata 3 3 0
ip4-frags 32 64 248 16 1 : tunables 0 0 0 : slabdata 4 4 0
RAW 396 396 896 18 4 : tunables 0 0 0 : slabdata 22 22 0
UDP 68 68 960 17 4 : tunables 0 0 0 : slabdata 4 4 0
tw_sock_TCP 10750 11136 280 29 2 : tunables 0 0 0 : slabdata 384 384 0
request_sock_TCP 96 96 328 24 2 : tunables 0 0 0 : slabdata 4 4 0
TCP 119 136 1920 17 8 : tunables 0 0 0 : slabdata 8 8 0
blkdev_queue 27 48 1336 24 8 : tunables 0 0 0 : slabdata 2 2 0
blkdev_requests 394 506 368 22 2 : tunables 0 0 0 : slabdata 23 23 0
blkdev_ioc 516 546 104 39 1 : tunables 0 0 0 : slabdata 14 14 0
user_namespace 104 104 304 26 2 : tunables 0 0 0 : slabdata 4 4 0
dmaengine-unmap-256 15 15 2112 15 8 : tunables 0 0 0 : slabdata 1 1 0
sock_inode_cache 1707 1950 640 25 4 : tunables 0 0 0 : slabdata 78 78 0
file_lock_cache 665 665 208 19 1 : tunables 0 0 0 : slabdata 35 35 0
net_namespace 40 40 7296 4 8 : tunables 0 0 0 : slabdata 10 10 0
shmem_inode_cache 3315 3432 656 24 4 : tunables 0 0 0 : slabdata 143 143 0
taskstats 96 96 328 24 2 : tunables 0 0 0 : slabdata 4 4 0
proc_inode_cache 6895 7072 624 26 4 : tunables 0 0 0 : slabdata 272 272 0
sigqueue 100 100 160 25 1 : tunables 0 0 0 : slabdata 4 4 0
bdev_cache 29 76 832 19 4 : tunables 0 0 0 : slabdata 4 4 0
kernfs_node_cache 43625 44982 120 34 1 : tunables 0 0 0 : slabdata 1323 1323 0
mnt_cache 518 546 384 21 2 : tunables 0 0 0 : slabdata 26 26 0
inode_cache 17519 17668 568 28 4 : tunables 0 0 0 : slabdata 631 631 0
dentry 424185 439992 192 21 1 : tunables 0 0 0 : slabdata 20952 20952 0
buffer_head 1112865 1112865 104 39 1 : tunables 0 0 0 : slabdata 28535 28535 0
vm_area_struct 53945 55300 200 20 1 : tunables 0 0 0 : slabdata 2765 2765 0
files_cache 260 299 704 23 4 : tunables 0 0 0 : slabdata 13 13 0
signal_cache 509 630 1088 30 8 : tunables 0 0 0 : slabdata 21 21 0
sighand_cache 346 405 2112 15 8 : tunables 0 0 0 : slabdata 27 27 0
task_struct 1189 1269 3584 9 8 : tunables 0 0 0 : slabdata 141 141 0
Acpi-Operand 5703 5824 72 56 1 : tunables 0 0 0 : slabdata 104 104 0
Acpi-Parse 1314 1314 56 73 1 : tunables 0 0 0 : slabdata 18 18 0
Acpi-State 204 204 80 51 1 : tunables 0 0 0 : slabdata 4 4 0
Acpi-Namespace 4077 4182 40 102 1 : tunables 0 0 0 : slabdata 41 41 0
anon_vma 19831 21522 80 51 1 : tunables 0 0 0 : slabdata 422 422 0
numa_policy 170 170 24 170 1 : tunables 0 0 0 : slabdata 1 1 0
radix_tree_node 321937 327740 584 28 4 : tunables 0 0 0 : slabdata 11705 11705 0
trace_event_file 3985 4002 88 46 1 : tunables 0 0 0 : slabdata 87 87 0
ftrace_event_field 86541 88570 48 85 1 : tunables 0 0 0 : slabdata 1042 1042 0
idr_layer_cache 533 555 2096 15 8 : tunables 0 0 0 : slabdata 37 37 0
kmalloc-8192 1246 1246 8192 4 8 : tunables 0 0 0 : slabdata 502 502 0
kmalloc-4096 658 720 4096 8 8 : tunables 0 0 0 : slabdata 90 90 0
kmalloc-2048 1955 2144 2048 16 8 : tunables 0 0 0 : slabdata 134 134 0
kmalloc-1024 44217 44384 1024 16 4 : tunables 0 0 0 : slabdata 2774 2774 0
kmalloc-512 3037 3808 512 16 2 : tunables 0 0 0 : slabdata 238 238 0
kmalloc-256 17465 20384 256 16 1 : tunables 0 0 0 : slabdata 1274 1274 0
kmalloc-192 27708 28665 192 21 1 : tunables 0 0 0 : slabdata 1365 1365 0
kmalloc-128 140581 143744 128 32 1 : tunables 0 0 0 : slabdata 4492 4492 0
kmalloc-96 168044 168378 96 42 1 : tunables 0 0 0 : slabdata 4009 4009 0
kmalloc-64 117533 123264 64 64 1 : tunables 0 0 0 : slabdata 1926 1926 0
kmalloc-32 80425 90368 32 128 1 : tunables 0 0 0 : slabdata 706 706 0
kmalloc-16 9513 11264 16 256 1 : tunables 0 0 0 : slabdata 44 44 0
kmalloc-8 6616 7168 8 512 1 : tunables 0 0 0 : slabdata 14 14 0
kmem_cache_node 320 320 64 64 1 : tunables 0 0 0 : slabdata 5 5 0
kmem_cache 208 208 256 16 1 : tunables 0 0 0 : slabdata 13 13 0

If you want to see the most frequently used (hottest) Slabs you can see them top-like with slabtop. If you press c you can sort the Slabs by CACHE_SIZE:

# sudo slabtop
 Active / Total Objects (% used)    : 2249113 / 2280136 (98.6%)
 Active / Total Slabs (% used)      : 70256 / 70256 (100.0%)
 Active / Total Caches (% used)     : 86 / 121 (71.1%)
 Active / Total Size (% used)       : 597547.86K / 605445.30K (98.7%)
 Minimum / Average / Maximum Object : 0.01K / 0.26K / 18.56K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
294308 289889  98%    0.57K  10511       28    168176K radix_tree_node
105030 104435  99%    1.05K   3501       30    112032K ext4_inode_cache
745446 745446 100%    0.10K  19114       39     76456K buffer_head
 59984  59909  99%    1.00K   3749       16     59984K ecryptfs_inode_cache
 47520  47157  99%    1.00K   2970       16     47520K kmalloc-1024
215166 214987  99%    0.19K  10246       21     40984K dentry
139744 138452  99%    0.12K   4367       32     17468K kmalloc-128
179508 179011  99%    0.09K   4274       42     17096K kmalloc-96
 47140  45768  97%    0.20K   2357       20      9428K vm_area_struct
 14700  14700 100%    0.55K    525       28      8400K inode_cache
...
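The CACHE SIZE column of slabtop can be reproduced from the /proc/slabinfo fields. The following is only a sketch, assuming 4 KiB pages and the slabinfo version 2.1 layout, where the cache size is taken as num_slabs × pagesperslab × page size (this matches the listing, for example 10511 slabs × 4 pages × 4 KiB = 168176K for radix_tree_node). The function name slab_cache_sizes is made up for this illustration.

```shell
# Print the memory footprint per slab cache in KiB, largest first.
# Reads /proc/slabinfo (version 2.1) formatted lines from stdin.
# Whitespace-separated fields: $1 = name, $6 = pagesperslab, $15 = num_slabs.
# Assumption: 4 KiB pages (page_kib=4).
slab_cache_sizes() {
    awk -v page_kib=4 '
        /^#/ || /^slabinfo/ { next }                     # skip the two header lines
        NF >= 16 { printf "%d %s\n", $15 * $6 * page_kib, $1 }
    ' | sort -rn
}

# Usage on a live system (reading /proc/slabinfo needs root):
#   sudo cat /proc/slabinfo | slab_cache_sizes | head
```

Reading from stdin keeps the sketch usable on a saved copy of the output as well.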
Literature

Who else is using my memory - File System Cache analysis

Fri, 2019-07-19 15:52

See also our former articles:

When we do analysis of MariaDB Database servers we also check the memory (RAM and Swap) available:

# free --kilo --wide
              total        used        free      shared     buffers       cache   available
Mem:       16106252     4329952      703356      199008      307872    10765072    11042748
Swap:      31250428      528684    30721744

The values for buffers and especially for cache can sometimes be quite big. In this case they use about 10 GiB. So let us have a look at what these things called buffers and cache, which use our valuable RAM, actually are. When we check the man page of free we find:

# man free
...
buffers     Memory used by kernel buffers (Buffers in /proc/meminfo)
cache       Memory used by the page cache and slabs (Cached and Slab in /proc/meminfo)
buff/cache  Sum of buffers and cache

So let us check more fine-grained information in /proc/meminfo, which is an interface to the kernel data structures:

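# cat /proc/meminfo | grep -e ^Cached -e Slab -e Buffers
Buffers:          307872 kB
Cached:         10155156 kB
Slab:             609916 kB

Same values: Buffers matches the buffers column of free, and Cached plus Slab (10155156 kB + 609916 kB = 10765072 kB) matches the cache column. Then let us have a look at what the man page of proc says about these values: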

# man proc
...
Buffers  Relatively temporary storage for raw disk blocks that shouldn't get tremendously large (20MB or so).
Cached   In-memory cache for files read from the disk (the page cache). Doesn't include SwapCached.
Slab     In-kernel data structures cache.
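The relation between free and /proc/meminfo can be checked with a few lines of shell. This is just a sketch following the free man page quoted above (cache = Cached + Slab); newer procps versions may count only the reclaimable part of the Slab. The function name meminfo_summary is made up for this illustration.

```shell
# Sum the /proc/meminfo fields that free reports as "buffers" and "cache".
# Reads meminfo-formatted text from stdin, so it also works on saved output.
# Values are in kB, as in /proc/meminfo.
meminfo_summary() {
    awk '
        $1 == "Buffers:" { buffers = $2 }
        $1 == "Cached:"  { cached  = $2 }   # exact match, so SwapCached is ignored
        $1 == "Slab:"    { slab    = $2 }
        END { printf "buffers=%d cache=%d\n", buffers, cached + slab }
    '
}

# Usage on a live system:
#   meminfo_summary < /proc/meminfo
```

With the three values shown above this prints buffers=307872 cache=10765072, which are the buffers and cache columns of free.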

So it looks like we have a raw I/O Cache (called Buffer Cache) and a File System I/O Cache (called Page Cache). So how does this work? What is raw I/O? And is File System I/O cached once (Cached) or twice (Cached and Buffers)?

When we dig a bit deeper we find that prior to Linux Kernel 2.4 the two Caches were distinct, which was a waste of memory (RAM). It seems that today this is not the case any more [1], [2], [3]. So the man pages are a bit out of date, or at least not very precise.

Analysing the Linux Page Cache

A very good source when it comes to Linux Performance Tuning and Measuring is Brendan Gregg's Website. To measure Linux Page Cache Hit Ratio he provides a tool called cachestat which is part of the perf-tools collection on GitHub.

With cachestat we get per-second statistics of the Buffer Cache and the Page Cache (without Slabs): Cache Hits, Cache Misses, Dirty Buffer Entries in the Cache and a Cache Hit Ratio:

# sudo cachestat
Counting cache functions... Output every 1 seconds.
    HITS   MISSES  DIRTIES    RATIO   BUFFERS_MB   CACHE_MB
    1419        8        0    99.4%          338       9406
    1368        0        0   100.0%          338       9406
    1391        0        0   100.0%          338       9406
    8558        0       29   100.0%          338       9406
   31870        0      163   100.0%          338       9406
    1374        0       24   100.0%          338       9406
    1388        0        0   100.0%          338       9406
    1370        0        0   100.0%          338       9406
    1388        0        0   100.0%          338       9406
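The RATIO column can be recomputed from HITS and MISSES. A small sketch, assuming the ratio is hits / (hits + misses), which is consistent with the first sample line (1419 hits, 8 misses, 99.4%). The function name cache_hit_ratio is made up for this illustration.

```shell
# Page Cache hit ratio in percent: 100 * hits / (hits + misses).
# Prints "n/a" when there was no cache activity at all.
cache_hit_ratio() {
    awk -v h="$1" -v m="$2" 'BEGIN {
        if (h + m == 0) { print "n/a"; exit }
        printf "%.1f%%\n", 100 * h / (h + m)
    }'
}

# Usage:
#   cache_hit_ratio 1419 8
```

A handful of misses among hundreds of thousands of hits is rounded away, which is why such lines show 100.0%.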

Brendan Gregg also mentions a tool called pcstat (on GitHub) by Al Tobey which gets Page Cache Statistics for Files. Unfortunately I had some problems building it on my Ubuntu 16.04 with Go version 1.6. So I built it on an Ubuntu 18.04 (Go 1.10) and copied it over to the Ubuntu 16.04:

# export GOPATH=/tmp/
# cd $GOPATH
# go get golang.org/x/sys/unix
# go get github.com/tobert/pcstat/pcstat
# bin/pcstat $GOPATH/bin/pcstat

Then I tried pcstat out against a MariaDB 10.4 instance. In the output we can see how big the files are in bytes, how many pages of 4 KiB this corresponds to, how many of these 4 KiB pages are cached and the percentage of pages cached:

# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 14642          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 13631488       | 3328       | 0         | 000.000 |
+------------------------------------------------------+----------------+------------+-----------+---------+

When we run pcstat over time with the famous watch command we can even see how the Page Cache is heating up:

# watch -d -n 1 'pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test* ; free -w'
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 14642          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 2416      | 012.418 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 3165      | 004.829 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 5890      | 008.987 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 1         | 100.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 13631488       | 3328       | 1164      | 034.976 |
+------------------------------------------------------+----------------+------------+-----------+---------+
              total        used        free      shared     buffers       cache   available
Mem:       16106252     4329952      703356      199008      307872    10765072    11042748
Swap:      31250428      528684    30721744
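The Pages and Percent columns of pcstat can be recomputed by hand. A minimal sketch, assuming 4 KiB pages as stated above; the function names pages_for_size and cached_percent are made up for this illustration and are not part of pcstat.

```shell
PAGE_SIZE=4096   # assumption: 4 KiB pages

# Number of pages needed to hold a file of the given size in bytes (rounded up).
pages_for_size() {
    echo $(( ($1 + PAGE_SIZE - 1) / PAGE_SIZE ))
}

# Percentage of cached pages, zero-padded with three decimals like the Percent column.
cached_percent() {
    awk -v c="$1" -v p="$2" 'BEGIN {
        if (p == 0) { print "000.000"; exit }
        printf "%07.3f\n", 100 * c / p
    }'
}

# Usage:
#   pages_for_size 79691776     # size of ibdata1 in bytes
#   cached_percent 2416 19456   # cached pages and total pages of ibdata1
```

Rounding up explains why the 1097 byte test.frm file still occupies one full page.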

Another tool discussed on Brendan's Website is vmtouch - the Virtual Memory Toucher (on GitHub, Documentation). With vmtouch we can see for example how much of the directory /home/mysql/database/mariadb-104/data (datadir) is currently in cache:

# vmtouch -f /home/mysql/database/mariadb-104/data
           Files: 503
     Directories: 9
  Resident Pages: 29356/231060  114M/902M  12.7%
         Elapsed: 0.009668 seconds

Or, more fine-grained, how much of the InnoDB System Files is currently in memory:

# vmtouch -f -v /home/mysql/database/mariadb-104/data/ib*
/home/mysql/database/mariadb-104/data/ib_buffer_pool
[ ] 0/4
/home/mysql/database/mariadb-104/data/ibdata1
[oOooooo ooooo ] 2416/19456
/home/mysql/database/mariadb-104/data/ib_logfile0
[o oOO] 3165/65536
/home/mysql/database/mariadb-104/data/ib_logfile1
[OOOOOOOOOOOOOOOOOOOOOo ] 23192/65536
/home/mysql/database/mariadb-104/data/ibtmp1
[ ] 0/3072
           Files: 5
     Directories: 0
  Resident Pages: 28773/153604  112M/600M  18.7%
         Elapsed: 0.005499 seconds

A further question to answer is: Can I see all files cached in the Page Cache? It seems that this is not easily possible:

There is no efficient search mechanism for doing the reverse - getting a file name belonging to a data block would require reading all inodes and indirect blocks on the file system. If you need to know about every single file's blocks stored in the page cache, you would need to supply a list of all files on your file system(s) to fincore. But that again is likely to spoil the measurement as a large amount of data would be read traversing the directories and getting all inodes and indirect blocks - putting them into the page cache and evicting the very page cache data you were trying to examine. [5]

In this article we can also read about the Linux File Tools (linux-ftools) by Google. It seems to be a bit more complicated to make them work, so I left it at that.

How is the Page Cache related to MariaDB

After all this technical O/S discussion, how is Linux Page Cache related to your MariaDB Database? Your MariaDB Database caches Data and Indexes as well. For the InnoDB Storage Engine this is the InnoDB Buffer Pool and for the Aria Storage Engine this is the Aria Page Cache Buffer. So if your MariaDB Database caches pages and if your Linux O/S caches pages the probability is high they cache the same data twice and thus waste valuable RAM! Fortunately InnoDB is configurable in a way it does NOT cache InnoDB files in the Page Cache. This is controlled with the InnoDB Server Variable innodb_flush_method.

When we look at InnoDB Files which were opened in a "normal" way (default: innodb_flush_method = fsync) we get the following information about how the files were opened (man 2 open and [6]):

# lsof +fg ./ib*
COMMAND  PID  USER   FD   TYPE FILE-FLAG       DEVICE  SIZE/OFF    NODE NAME
mysqld  2098 mysql    7uW  REG RW,LG,0x80000      8,1  79691776 9175185 ./ibdata1
mysqld  2098 mysql   11uW  REG RW,LG,0x80000      8,1 268435456 9175186 ./ib_logfile0
mysqld  2098 mysql   12uW  REG RW,LG,0x80000      8,1 268435456 9175187 ./ib_logfile1
mysqld  2098 mysql   13uW  REG RW,LG,0x80000      8,1  12582912 9175280 ./ibtmp1

The interesting column here is the FILE-FLAG column which indicates (man lsof):

# man lsof
FILE-FLAG when g or G has been specified to +f, this field contains the contents of the f_flag[s] member of the kernel file structure and the kernel's per-process open file flags (if available); `G' causes them to be displayed in hexadecimal; `g', as short-hand names; two lists may be displayed with entries separated by commas, the lists separated by a semicolon (`;'); the first list may contain short-hand names for f_flag[s] values from the following table:
DIR  direct
LG   large file
RW   read and write access

The output is not yet completely clear, so let us display the open file flags in hexadecimal notation:

# lsof +fG ./ib*
COMMAND  PID  USER   FD   TYPE FILE-FLAG       DEVICE  SIZE/OFF    NODE NAME
mysqld  2098 mysql    7uW  REG 0x88002;0x0        8,1  79691776 9175185 ./ibdata1
mysqld  2098 mysql   11uW  REG 0x88002;0x0        8,1 268435456 9175186 ./ib_logfile0
mysqld  2098 mysql   12uW  REG 0x88002;0x0        8,1 268435456 9175187 ./ib_logfile1
mysqld  2098 mysql   13uW  REG 0x88002;0x0        8,1  12582912 9175280 ./ibtmp1

The Linux Kernel open file flags can be found here: fcntl.h. I have extracted the most relevant open file flags for our examination:

#define O_RDWR       00000002 (oct, 0x00002)
#define O_DIRECT     00040000 (oct, 0x04000) /* direct disk access hint */
#define O_LARGEFILE  00100000 (oct, 0x08000)
#define O_CLOEXEC    02000000 (oct, 0x80000) /* set close_on_exec */

So we can see that these 4 InnoDB files were opened with O_RDWR (RW), O_LARGEFILE (LG) and O_CLOEXEC (0x80000, not available (yet?) in the lsof translation output): 0x88002 = 0x80000 + 0x08000 + 0x00002.

Now let us start the MariaDB Database with the server variable set to innodb_flush_method = O_DIRECT and check how the files were opened:

# lsof +fg ./ib*
COMMAND  PID  USER   FD   TYPE FILE-FLAG         DEVICE  SIZE/OFF    NODE NAME
mysqld  2098 mysql    7uW  REG RW,DIR,LG,0x80000    8,1  79691776 9175185 ./ibdata1
mysqld  2098 mysql   11uW  REG RW,LG,0x80000        8,1 268435456 9175186 ./ib_logfile0
mysqld  2098 mysql   12uW  REG RW,LG,0x80000        8,1 268435456 9175187 ./ib_logfile1
mysqld  2098 mysql   13uW  REG RW,DIR,LG,0x80000    8,1  12582912 9175280 ./ibtmp1
# lsof +fG ./ib*
COMMAND  PID  USER   FD   TYPE FILE-FLAG         DEVICE  SIZE/OFF    NODE NAME
mysqld  2098 mysql    7uW  REG 0x8c002;0x0          8,1  79691776 9175185 ./ibdata1
mysqld  2098 mysql   11uW  REG 0x88002;0x0          8,1 268435456 9175186 ./ib_logfile0
mysqld  2098 mysql   12uW  REG 0x88002;0x0          8,1 268435456 9175187 ./ib_logfile1
mysqld  2098 mysql   13uW  REG 0x8c002;0x0          8,1  12582912 9175280 ./ibtmp1

We can see a new flag, DIR (0x04000), which means the files were opened with O_DIRECT. However, this applies only to the InnoDB Temporary Tablespace (ibtmp1) and the InnoDB System Tablespace (ibdata1), not to the two InnoDB Transaction Logs.

Translation of hex to oct: 0x8c002 = 02140002.
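The translation can also be done in the shell. The following sketch converts the hexadecimal value reported by lsof into the octal notation of fcntl.h and decodes it into flag names. It only knows the four flags listed above, with the values from the fcntl.h excerpt (they can differ on other architectures). The function names hex_to_oct and decode_open_flags are made up for this illustration.

```shell
# Convert the hexadecimal notation of lsof into the octal notation of fcntl.h.
hex_to_oct() {
    printf '0%o\n' "$1"
}

# Decode an open(2) flag word into flag names.
# Accepts hexadecimal (0x8c002) or octal (02140002) input.
# The body runs in a subshell, so its variables stay local.
decode_open_flags() (
    flags=$(( $1 ))
    # The two lowest bits hold the access mode, not independent flags.
    case $(( flags & 3 )) in
        0) names="O_RDONLY" ;;
        1) names="O_WRONLY" ;;
        2) names="O_RDWR" ;;
        *) names="(invalid access mode)" ;;
    esac
    if [ $(( flags & 040000 )) -ne 0 ];   then names="$names O_DIRECT";    fi
    if [ $(( flags & 0100000 )) -ne 0 ];  then names="$names O_LARGEFILE"; fi
    if [ $(( flags & 02000000 )) -ne 0 ]; then names="$names O_CLOEXEC";   fi
    echo "$names"
)

# Usage:
#   hex_to_oct 0x8c002
#   decode_open_flags 0x8c002
#   decode_open_flags 0x88002
```

The only difference between 0x8c002 and 0x88002 is the O_DIRECT bit (0x04000).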

But what does O_DIRECT mean? Looking at the open(2) man pages we can find:

# man 2 open
O_DIRECT (since Linux 2.4.10)
Try to minimize cache effects of the I/O to and from this file. In general this will degrade performance, but it is useful in special situations, such as when applications do their own caching. File I/O is done directly to/from user-space buffers. The O_DIRECT flag on its own makes an effort to transfer data synchronously, but does not give the guarantees of the O_SYNC flag that data and necessary metadata are transferred. To guarantee synchronous I/O, O_SYNC must be used in addition to O_DIRECT.

So O_DIRECT is exactly what we want in this case: Bypassing the File System Page Cache to not cache the Database blocks twice!

To verify the impact we run pcstat again:

# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 16020          | 4          | 4         | 100.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 140       | 000.720 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 36844     | 056.219 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 65536     | 100.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 1         | 100.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 67108864       | 16384      | 13400     | 081.787 |
+------------------------------------------------------+----------------+------------+-----------+---------+

But... part of the InnoDB Tablespace files is still cached! Also checking the total amount of Buffers and Cache shows the same:

# free
              total        used        free      shared  buff/cache   available
Mem:       16106252     4401788      368200      456716    11336264    10691792
Swap:      31250428     1348440    29901988

So restarting the MariaDB database does not purge the Page Cache! Note: This matters because bypassing the Page Cache avoids wasting valuable RAM, but it also makes a database restart much more costly: the Page Cache can no longer help to warm up the InnoDB Buffer Pool!

Then let us clear the Linux Page Cache and check the result:

# echo 1 > /proc/sys/vm/drop_caches
# free -w
              total        used        free      shared     buffers       cache   available
Mem:       16106252     4395892    10539864      441708         696     1169800    10882984
Swap:      31250428     1348428    29902000
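The value written to drop_caches decides what gets released. As a sketch based on the Linux kernel documentation of the vm sysctls: 1 frees the page cache, 2 frees reclaimable slab objects (dentries and inodes), 3 frees both, and dirty pages are not freed, so a sync beforehand is advisable. The function names drop_caches_meaning and drop_caches are made up for this illustration.

```shell
# Describe what writing a given value to /proc/sys/vm/drop_caches releases.
drop_caches_meaning() {
    case "$1" in
        1) echo "page cache" ;;
        2) echo "reclaimable slab objects (dentries and inodes)" ;;
        3) echo "page cache and reclaimable slab objects" ;;
        *) echo "unsupported value: $1" >&2; return 1 ;;
    esac
}

# Write dirty pages to disk first, then drop the clean caches (needs root).
drop_caches() {
    drop_caches_meaning "$1" >/dev/null || return 1
    sync
    echo "$1" > /proc/sys/vm/drop_caches
}

# Usage (as root):
#   drop_caches 1
```

Dropping caches is non-destructive, but everything read afterwards has to come from disk again, so it is better tried on a test system first.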

Checking with pcstat shows now that all InnoDB pages are wiped out of the Page Cache:

# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 16020          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 67108864       | 16384      | 0         | 000.000 |
+------------------------------------------------------+----------------+------------+-----------+---------+

And after a while running traffic on the test table we can see that InnoDB Transaction Log Files are cached again in the Page Cache but NOT the InnoDB Tablespace files:

# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 16020          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 3012      | 004.596 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 71303168       | 17408      | 0         | 000.000 |
+------------------------------------------------------+----------------+------------+-----------+---------+

Also with vmtouch we can see the difference:

# ./vmtouch -f -v /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
/home/mysql/database/mariadb-104/data/ib_buffer_pool
[ ] 0/4
/home/mysql/database/mariadb-104/data/ibdata1
[ ] 0/19456
/home/mysql/database/mariadb-104/data/ib_logfile0
[o oOOOo ] 4252/65536
/home/mysql/database/mariadb-104/data/ib_logfile1
[ ] 0/65536
/home/mysql/database/mariadb-104/data/ibtmp1
[ ] 0/3072
/home/mysql/database/mariadb-104/data/test/test.frm
[ ] 0/1
/home/mysql/database/mariadb-104/data/test/test.ibd
[ ] 0/17408
           Files: 7
     Directories: 0
  Resident Pages: 4252/171013  16M/668M  2.49%
         Elapsed: 0.003264 seconds

And also cachestat shows the effect of a flushed Buffer Cache and Page Cache:

# ./cachestat
Counting cache functions... Output every 1 seconds.
    HITS   MISSES  DIRTIES    RATIO   BUFFERS_MB   CACHE_MB
  677882       19      740   100.0%           67       1087
  679213       10      700   100.0%           67       1087
  677236        0      732   100.0%           67       1087
  685673       11      932   100.0%           67       1088
  677933        5      703   100.0%           67       1088

Caution: Depending on your underlying I/O system, it can nevertheless make sense in certain cases to run your MariaDB Database with innodb_flush_method = fsync! See also PostgreSQL behaviour.

Note: This information could also be interesting for PostgreSQL DBAs because they do redundant buffering with their shared_buffers (why plural? It is just one!?!) and the O/S Page Cache as well!

What is Slab

Besides the Buffer Cache and the Page Cache itself we have a third item in the /proc/meminfo statistics, listed as Slab. So what are Slabs? Slab is a specific memory management (allocation) mechanism used for frequently used objects in the Linux Kernel (buffer heads, inodes, dentries, etc.) [7-15]. So it contains something like other Linux Kernel Buffers and Kernel Caches.

Which other Linux Kernel Buffers and Kernel Caches exist can be found with the following command:

# sudo cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
nf_conntrack_1 14183 15275 320 25 2 : tunables 0 0 0 : slabdata 611 611 0
ext4_groupinfo_4k 8575 8596 144 28 1 : tunables 0 0 0 : slabdata 307 307 0
i915_gem_vma 523 950 320 25 2 : tunables 0 0 0 : slabdata 38 38 0
UDPv6 120 120 1088 30 8 : tunables 0 0 0 : slabdata 4 4 0
tw_sock_TCPv6 2668 2668 280 29 2 : tunables 0 0 0 : slabdata 92 92 0
request_sock_TCPv6 24 72 328 24 2 : tunables 0 0 0 : slabdata 3 3 0
TCPv6 68 105 2112 15 8 : tunables 0 0 0 : slabdata 7 7 0
cfq_queue 391 442 232 17 1 : tunables 0 0 0 : slabdata 26 26 0
mqueue_inode_cache 72 72 896 18 4 : tunables 0 0 0 : slabdata 4 4 0
fuse_request 20 40 400 20 2 : tunables 0 0 0 : slabdata 2 2 0
fuse_inode 1 21 768 21 4 : tunables 0 0 0 : slabdata 1 1 0
fat_cache 102 408 40 102 1 : tunables 0 0 0 : slabdata 4 4 0
hugetlbfs_inode_cache 28 84 584 28 4 : tunables 0 0 0 : slabdata 3 3 0
squashfs_inode_cache 25 50 640 25 4 : tunables 0 0 0 : slabdata 2 2 0
jbd2_journal_handle 340 340 48 85 1 : tunables 0 0 0 : slabdata 4 4 0
jbd2_journal_head 2040 2040 120 34 1 : tunables 0 0 0 : slabdata 60 60 0
jbd2_revoke_table_s 260 512 16 256 1 : tunables 0 0 0 : slabdata 2 2 0
jbd2_revoke_record_s 1152 1408 32 128 1 : tunables 0 0 0 : slabdata 11 11 0
ext4_inode_cache 208751 210840 1072 30 8 : tunables 0 0 0 : slabdata 7028 7028 0
ext4_free_data 320 448 64 64 1 : tunables 0 0 0 : slabdata 7 7 0
ext4_allocation_cont 128 128 128 32 1 : tunables 0 0 0 : slabdata 4 4 0
ext4_io_end 392 560 72 56 1 : tunables 0 0 0 : slabdata 10 10 0
ext4_extent_status 64412 77928 40 102 1 : tunables 0 0 0 : slabdata 764 764 0
dquot 144 160 256 16 1 : tunables 0 0 0 : slabdata 10 10 0
mbcache 226 292 56 73 1 : tunables 0 0 0 : slabdata 4 4 0
dio 273 350 640 25 4 : tunables 0 0 0 : slabdata 14 14 0
pid_namespace 42 42 2224 14 8 : tunables 0 0 0 : slabdata 3 3 0
ip4-frags 32 64 248 16 1 : tunables 0 0 0 : slabdata 4 4 0
RAW 396 396 896 18 4 : tunables 0 0 0 : slabdata 22 22 0
UDP 68 68 960 17 4 : tunables 0 0 0 : slabdata 4 4 0
tw_sock_TCP 10750 11136 280 29 2 : tunables 0 0 0 : slabdata 384 384 0
request_sock_TCP 96 96 328 24 2 : tunables 0 0 0 : slabdata 4 4 0
TCP 119 136 1920 17 8 : tunables 0 0 0 : slabdata 8 8 0
blkdev_queue 27 48 1336 24 8 : tunables 0 0 0 : slabdata 2 2 0
blkdev_requests 394 506 368 22 2 : tunables 0 0 0 : slabdata 23 23 0
blkdev_ioc 516 546 104 39 1 : tunables 0 0 0 : slabdata 14 14 0
user_namespace 104 104 304 26 2 : tunables 0 0 0 : slabdata 4 4 0
dmaengine-unmap-256 15 15 2112 15 8 : tunables 0 0 0 : slabdata 1 1 0
sock_inode_cache 1707 1950 640 25 4 : tunables 0 0 0 : slabdata 78 78 0
file_lock_cache 665 665 208 19 1 : tunables 0 0 0 : slabdata 35 35 0
net_namespace 40 40 7296 4 8 : tunables 0 0 0 : slabdata 10 10 0
shmem_inode_cache 3315 3432 656 24 4 : tunables 0 0 0 : slabdata 143 143 0
taskstats 96 96 328 24 2 : tunables 0 0 0 : slabdata 4 4 0
proc_inode_cache 6895 7072 624 26 4 : tunables 0 0 0 : slabdata 272 272 0
sigqueue 100 100 160 25 1 : tunables 0 0 0 : slabdata 4 4 0
bdev_cache 29 76 832 19 4 : tunables 0 0 0 : slabdata 4 4 0
kernfs_node_cache 43625 44982 120 34 1 : tunables 0 0 0 : slabdata 1323 1323 0
mnt_cache 518 546 384 21 2 : tunables 0 0 0 : slabdata 26 26 0
inode_cache 17519 17668 568 28 4 : tunables 0 0 0 : slabdata 631 631 0
dentry 424185 439992 192 21 1 : tunables 0 0 0 : slabdata 20952 20952 0
buffer_head 1112865 1112865 104 39 1 : tunables 0 0 0 : slabdata 28535 28535 0
vm_area_struct 53945 55300 200 20 1 : tunables 0 0 0 : slabdata 2765 2765 0
files_cache 260 299 704 23 4 : tunables 0 0 0 : slabdata 13 13 0
signal_cache 509 630 1088 30 8 : tunables 0 0 0 : slabdata 21 21 0
sighand_cache 346 405 2112 15 8 : tunables 0 0 0 : slabdata 27 27 0
task_struct 1189 1269 3584 9 8 : tunables 0 0 0 : slabdata 141 141 0
Acpi-Operand 5703 5824 72 56 1 : tunables 0 0 0 : slabdata 104 104 0
Acpi-Parse 1314 1314 56 73 1 : tunables 0 0 0 : slabdata 18 18 0
Acpi-State 204 204 80 51 1 : tunables 0 0 0 : slabdata 4 4 0
Acpi-Namespace 4077 4182 40 102 1 : tunables 0 0 0 : slabdata 41 41 0
anon_vma 19831 21522 80 51 1 : tunables 0 0 0 : slabdata 422 422 0
numa_policy 170 170 24 170 1 : tunables 0 0 0 : slabdata 1 1 0
radix_tree_node 321937 327740 584 28 4 : tunables 0 0 0 : slabdata 11705 11705 0
trace_event_file 3985 4002 88 46 1 : tunables 0 0 0 : slabdata 87 87 0
ftrace_event_field 86541 88570 48 85 1 : tunables 0 0 0 : slabdata 1042 1042 0
idr_layer_cache 533 555 2096 15 8 : tunables 0 0 0 : slabdata 37 37 0
kmalloc-8192 1246 1246 8192 4 8 : tunables 0 0 0 : slabdata 502 502 0
kmalloc-4096 658 720 4096 8 8 : tunables 0 0 0 : slabdata 90 90 0
kmalloc-2048 1955 2144 2048 16 8 : tunables 0 0 0 : slabdata 134 134 0
kmalloc-1024 44217 44384 1024 16 4 : tunables 0 0 0 : slabdata 2774 2774 0
kmalloc-512 3037 3808 512 16 2 : tunables 0 0 0 : slabdata 238 238 0
kmalloc-256 17465 20384 256 16 1 : tunables 0 0 0 : slabdata 1274 1274 0
kmalloc-192 27708 28665 192 21 1 : tunables 0 0 0 : slabdata 1365 1365 0
kmalloc-128 140581 143744 128 32 1 : tunables 0 0 0 : slabdata 4492 4492 0
kmalloc-96 168044 168378 96 42 1 : tunables 0 0 0 : slabdata 4009 4009 0
kmalloc-64 117533 123264 64 64 1 : tunables 0 0 0 : slabdata 1926 1926 0
kmalloc-32 80425 90368 32 128 1 : tunables 0 0 0 : slabdata 706 706 0
kmalloc-16 9513 11264 16 256 1 : tunables 0 0 0 : slabdata 44 44 0
kmalloc-8 6616 7168 8 512 1 : tunables 0 0 0 : slabdata 14 14 0
kmem_cache_node 320 320 64 64 1 : tunables 0 0 0 : slabdata 5 5 0
kmem_cache 208 208 256 16 1 : tunables 0 0 0 : slabdata 13 13 0

If you want to see the most frequently used (hottest) Slabs, you can view them in a top-like display with slabtop. Pressing c sorts the Slabs by CACHE_SIZE:

# sudo slabtop
 Active / Total Objects (% used)    : 2249113 / 2280136 (98.6%)
 Active / Total Slabs (% used)      : 70256 / 70256 (100.0%)
 Active / Total Caches (% used)     : 86 / 121 (71.1%)
 Active / Total Size (% used)       : 597547.86K / 605445.30K (98.7%)
 Minimum / Average / Maximum Object : 0.01K / 0.26K / 18.56K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
294308 289889  98%    0.57K  10511       28    168176K radix_tree_node
105030 104435  99%    1.05K   3501       30    112032K ext4_inode_cache
745446 745446 100%    0.10K  19114       39     76456K buffer_head
 59984  59909  99%    1.00K   3749       16     59984K ecryptfs_inode_cache
 47520  47157  99%    1.00K   2970       16     47520K kmalloc-1024
215166 214987  99%    0.19K  10246       21     40984K dentry
139744 138452  99%    0.12K   4367       32     17468K kmalloc-128
179508 179011  99%    0.09K   4274       42     17096K kmalloc-96
 47140  45768  97%    0.20K   2357       20      9428K vm_area_struct
 14700  14700 100%    0.55K    525       28      8400K inode_cache
...

Enable General Query Log per Connection in MariaDB

Wed, 2019-07-10 20:27

The General Query Log in MariaDB is a log (file or table) that records every statement sent from the application to the MariaDB database server, from the Connect command to the Quit command. Here is a simple example:

bin/mysqld, Version: 10.4.6-MariaDB-log (MariaDB Server). started with:
Tcp port: 3306  Unix socket: /var/run/mysqld/mysql.sock
Time                Id Command  Argument
190709 21:27:30   3872 Connect  root@localhost as anonymous on test
                  3872 Query    INSERT INTO test VALUES (NULL, CONCAT('Test data insert from laptop5 on ', @@hostname), CURRENT_TIMESTAMP())
                  3872 Quit

The General Query Log can be enabled dynamically with the following command:

SQL> SET GLOBAL general_log = 1;

and is written either to a file (defined with general_log_file) or a table (mysql.general_log, defined with log_output=TABLE).
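As a small sketch of the two output targets: the statements below check where the General Query Log currently goes, switch it to the table, and read the latest entries. The column list is that of the standard mysql.general_log table; the LIMIT of 10 is an arbitrary choice for illustration.

```sql
-- Where does the General Query Log go right now?
SHOW GLOBAL VARIABLES LIKE 'general_log%';
SHOW GLOBAL VARIABLES LIKE 'log_output';

-- Write to the table mysql.general_log instead of a file
SET GLOBAL log_output = 'TABLE';

-- Look at the most recent entries (10 is just an example value)
SELECT event_time, user_host, thread_id, command_type, argument
  FROM mysql.general_log
 ORDER BY event_time DESC
 LIMIT 10;
```

Setting log_output back to 'FILE' returns to the file defined with general_log_file.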

But unfortunately the General Query Log logs all commands from all connections. So it can grow quickly and you lose track of what is happening...

With the MariaDB Server System Variable sql_log_off you can exclude a connection from the General Query Log. This is the opposite of what we want. But with a clever combination of general_log and sql_log_off we can log only specific connections to the General Query Log.

Preparation for Connection Logging to the General Query Log

First of all we have to enable the General Query Log but disable it for all Connections so nothing is written to the General Query Log:

SQL> SET GLOBAL sql_log_off = on;   -- 1 = no logging
SQL> SET SESSION sql_log_off = on;  -- also for my current session!
SQL> SET GLOBAL general_log = 1;    -- 1 = enable general log file
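To double-check the result of this step, a quick look at the three values could be done as follows. This is only a verification sketch; the column aliases are chosen here for readability.

```sql
-- All three values should be 1 (ON) after the statements above
SELECT @@global.sql_log_off  AS global_sql_log_off,
       @@session.sql_log_off AS session_sql_log_off,
       @@global.general_log  AS general_log;
```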

Then we need a place to specify which user from which host should be logged, so that we can control General Query Log logging dynamically. For this purpose we create a table called general_log_settings in the sys Schema (as used by the MariaDB sys Schema):

SQL> CREATE SCHEMA IF NOT EXISTS `sys`;
SQL> use `sys`;

SQL> DROP TABLE IF EXISTS `general_log_settings`;
SQL> CREATE TABLE `general_log_settings` (
  `HOST` char(60) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '%',
  `USER` char(32) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '%',
  `ENABLED` enum('YES','NO') NOT NULL DEFAULT 'YES'
) ENGINE=Aria DEFAULT CHARSET=utf8;

SQL> INSERT INTO `general_log_settings` (`HOST`, `USER`, `ENABLED`) VALUES ('%', '%', 'NO');

Now we need a mechanism to enable General Query Log logging for every user. For this we use a Stored Procedure:

SQL> use `sys`;

SQL> DROP PROCEDURE IF EXISTS `enable_general_log`;

SQL> DELIMITER //

SQL> CREATE OR REPLACE DEFINER=`root`@`localhost` PROCEDURE `enable_general_log`()
DETERMINISTIC READS SQL DATA SQL SECURITY DEFINER
BEGIN
  -- The v_ prefix keeps the local variables distinct from the HOST and USER columns
  DECLARE v_user VARCHAR(32);
  DECLARE v_host VARCHAR(60);
  DECLARE cnt SMALLINT;

  -- Split the connecting account (user@host) into its two parts
  SELECT SUBSTRING_INDEX(SESSION_USER(), '@', 1) INTO v_user;
  SELECT SUBSTRING_INDEX(SESSION_USER(), '@', -1) INTO v_host;

  -- Count the enabled rules that match this account
  SELECT COUNT(*) INTO cnt
    FROM `general_log_settings`
   WHERE `ENABLED` = 'YES'
     AND (`HOST` = v_host OR `HOST` = '%')
     AND (`USER` = v_user OR `USER` = '%')
  ;

  IF cnt > 0 THEN
    SET SESSION sql_log_off = off;
    DO 'Connect';
  END IF;
END;
//

SQL> DELIMITER ;

Then we have to grant ALL users without the SUPER privilege the privilege to execute this procedure. These users can be found with the following query:

SQL> SELECT user, host, super_priv FROM mysql.user WHERE super_priv != 'Y';
+------+-----------+------------+
| User | Host      | Super_priv |
+------+-----------+------------+
| oli  | localhost | N          |
| app  | %         | N          |
+------+-----------+------------+

To grant the privilege to execute the procedure you must run the queries as follows:

SQL> GRANT EXECUTE ON PROCEDURE `sys`.`enable_general_log` TO 'oli'@'localhost';
SQL> GRANT EXECUTE ON PROCEDURE `sys`.`enable_general_log` TO 'app'@'%';

Caution: If you do NOT grant the execute privilege to all users their new connections will fail as soon as you have enabled the logon trigger!
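With many accounts it is easy to miss one. As a sketch, the GRANT statements can be generated from mysql.user with the same filter as above and then reviewed and executed by hand. The alias grant_stmt is only an illustrative name.

```sql
-- Generate one GRANT statement per account without the SUPER privilege.
-- The output is only text: copy, review and run the statements yourself.
SELECT CONCAT('GRANT EXECUTE ON PROCEDURE `sys`.`enable_general_log` TO ',
              QUOTE(user), '@', QUOTE(host), ';') AS grant_stmt
  FROM mysql.user
 WHERE super_priv != 'Y';
```

Accounts created later need the same grant before they can connect while the logon trigger is active.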

Then you have to add the users you want to log to the General Query Log to the general_log_settings table:

SQL> INSERT INTO `sys`.`general_log_settings` VALUES ('%', 'app', 'YES');
SQL> SELECT * FROM `sys`.`general_log_settings`;
+------+------+---------+
| HOST | USER | ENABLED |
+------+------+---------+
| %    | %    | NO      |
| %    | app  | YES     |
+------+------+---------+
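To preview whether a given account would be logged, the matching rule from the Stored Procedure can be run by hand with literal values. The account app@localhost below is just an example, and matching_rules is an illustrative alias.

```sql
-- Same condition as in enable_general_log, with the account written as literals.
-- A result greater than 0 means the connection would be logged.
SELECT COUNT(*) AS matching_rules
  FROM `sys`.`general_log_settings`
 WHERE `ENABLED` = 'YES'
   AND (`HOST` = 'localhost' OR `HOST` = '%')
   AND (`USER` = 'app' OR `USER` = '%');
```

With the two rows shown above this should return 1 for the user app and 0 for the user oli. Note that the rule uses plain equality, so '%' in the table is a catch-all marker and not a LIKE pattern.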

You can test the logging to the General Query Log in your Connection with the following command now:

SQL> CALL `sys`.`enable_general_log`();

If everything is done correctly you should see some entries in your General Query Log as follows:

190709 23:51:21   6695 Query    DO 'Connect'
                  6695 Query    select count(*) from test.test
                  6695 Quit

And you should notice that the Connect Command is missing (see above). This is because enabling the General Query Log happens after the Connect. To give an equivalent entry tag in the General Query Log we have added the DO 'Connect' tag in the Stored Procedure.

Up to this point none of these actions should have had any impact (neither operationally nor performance-wise) on your MariaDB database system.

Activating General Query Log per Connection

Now we want to activate General Query Log logging for every new connection to your MariaDB database server. This step can have an impact on your MariaDB database system, so you should know how to revert it and be prepared to do so:

SQL> SET GLOBAL init_connect = 'CALL `sys`.`enable_general_log`();';

The init_connect Server System Variable specifies a command or a list of commands that will be executed for each new client connection. We also call this a Logon Trigger. These statements are only executed for users without the SUPER privilege. So you cannot track admin accounts with this method.
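A possible way to revert this step is sketched below. It assumes that init_connect was empty before (the server default) and that the General Query Log should be switched off completely again. If init_connect held other statements before, restore that value instead.

```sql
-- Remove the logon trigger: new connections no longer call the procedure
SET GLOBAL init_connect = '';

-- Optionally return to the defaults for the General Query Log as well
SET GLOBAL general_log = 0;
SET GLOBAL sql_log_off = OFF;

-- Or, to stop logging one account only and keep everything else in place:
UPDATE `sys`.`general_log_settings`
   SET `ENABLED` = 'NO'
 WHERE `USER` = 'app' AND `HOST` = '%';
```

Because the procedure runs at connect time, a change in general_log_settings only affects connections opened afterwards.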

Further readings

Please let us know if you think we should add this tool to the MariaDB sys Schema or not.


MariaDB sys Schema

Thu, 2019-06-27 12:25

The MySQL sys Schema is a cool thing and was mostly developed by Mark Leith and Jesper Wisborg Krogh from Oracle/MySQL. The sys Schema is based on the somewhat cryptic MySQL PERFORMANCE_SCHEMA which was developed by Oracle/MySQL as well.

The sys Schema is: "A collection of views, functions and procedures to help MySQL administrators get insight in to MySQL Database usage."

Unfortunately the company MariaDB Corp. has not ported the sys Schema from MySQL to MariaDB so far. So this is a good opportunity to act as a community...

In the beginning the sys Schema was hosted on GitHub: mysql/mysql-sys. So we could take the sys Schema from there for MariaDB. But for 3 years no more changes have appeared on GitHub. It seems like the project was abandoned.

According to a well-informed source the project on GitHub is not dead, but the developers have just been working on other priorities. The source also announced that another release would come soon (they were working on it at that time). This statement was made in March 2018 and since then no progress has been made.

We can further see that in the meantime the MySQL sys Schema has changed from version 1.6.0 to 2.0.0 without any changes on GitHub. So we think the project on GitHub really is abandoned and we consider it dead.

Because we have had many discussions with customers about the sys Schema for MariaDB in the past, FromDual decided to fork the MySQL sys Schema project.

Yesterday we forked the MySQL sys Schema to the MariaDB sys Schema. Over time we intend to port all possible changes to the MariaDB sys Schema.

You can find our current work on GitHub at: FromDual/mariadb-sys.

The current version contains a working fork of the MySQL 5.6 sys Schema / v1.5.1. To make it work we had to fix a few little things, but now the MariaDB sys Schema works for MariaDB from version 10.0 up to version 10.4.

Community: Please try out the MariaDB sys Schema, find bugs, describe how to reproduce them, report them on GitHub and ideally also provide some fixes...

Our first experiments with the MariaDB sys Schema in March 2018 are described here: MySQL sys Schema in MariaDB 10.2.

If you need any MariaDB related help please contact our MariaDB consulting team...

