Feed aggregator

Increase file limit of a running process

Shinguz - Fri, 2020-06-19 18:53

Asking stupid questions and googling for the answers is fun sometimes...

Today I was asking myself whether one could raise the open files limit of a running MariaDB mysqld process online, without restarting the database instance.

And I found an answer on Server Fault: Set max file limit on a running process:

shell> PID=$(pidof mysqld)

shell> grep -e 'Max open files' -e Limit /proc/${PID}/limits
Limit                     Soft Limit           Hard Limit           Units
Max open files            1024                 4096                 files

shell> prlimit --pid $PID | grep -e NOFILE -e DESC
RESOURCE   DESCRIPTION                 SOFT  HARD  UNITS
NOFILE     max number of open files    1024  4096  files

shell> prlimit --nofile --output RESOURCE,SOFT,HARD --pid ${PID}
RESOURCE SOFT HARD
NOFILE   1024 4096

shell> sudo prlimit --nofile=2048:8192 --pid ${PID}

shell> prlimit --nofile --output RESOURCE,SOFT,HARD --pid ${PID}
RESOURCE SOFT HARD
NOFILE   2048 8192

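To compare the limits before and after such a change in a script, the Max open files row of /proc/<PID>/limits can be parsed directly. This is only a minimal sketch, assuming the Linux /proc layout shown above; the function name nofile_limits is made up for illustration and is not part of any package.

```shell
# Print "<soft> <hard>" for the 'Max open files' row of a
# /proc/<PID>/limits style listing read from standard input.
# nofile_limits is a hypothetical helper name.
nofile_limits() {
    awk '/^Max open files/ { print $4, $5 }'
}

# Usage on a live system (assumes exactly one running mysqld process):
#   nofile_limits < /proc/$(pidof mysqld)/limits
```

Reading from standard input keeps the helper independent of a running database, so it can be tried on a saved copy of the limits file as well.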
Literature

prlimit(1)

Taxonomy upgrade extras: open_files_limit, LimitNOFILE, file handles

FromDual Performance Monitor for MariaDB 1.2.0 has been released

Shinguz - Fri, 2020-06-12 16:47

FromDual has the pleasure to announce the release of the new version 1.2.0 of its popular Database Performance Monitor for MariaDB and Galera Cluster fpmmm.

The FromDual Performance Monitor for MariaDB (fpmmm) enables DBAs and System Administrators to monitor and understand what is going on inside their MariaDB database instances and on the machines where the databases reside.

You can find more detailed information in the fpmmm Installation Guide.

Download

The new FromDual Performance Monitor for MariaDB (fpmmm) can be downloaded from here or you can use our FromDual repositories. How to install and use fpmmm is documented in the fpmmm Installation Guide.

In the inconceivable case that you find a bug in the FromDual Performance Monitor for MariaDB please report it to the FromDual Bug-tracker or just send us an email.

Any feedback, statements and testimonials are welcome as well! Please send them to us.

Monitoring as a Service (MaaS)

You do not want to set up your database monitoring yourself? No problem: Choose our MariaDB Monitoring as a Service (MaaS) program to save time and costs!

Installation of Performance Monitor 1.2.0

You can find a complete guide on how to install FromDual Performance Monitor in the fpmmm Installation Guide.

Upgrade of fpmmm tarball from 1.0.x to 1.2.0

Upgrade with DEB/RPM packages should happen automatically. For tarballs, proceed as follows:

shell> cd /opt
shell> tar xf /download/fpmmm-1.2.0.tar.gz
shell> rm -f fpmmm
shell> ln -s fpmmm-1.2.0 fpmmm

Changes in FromDual Performance Monitor for MariaDB 1.2.0

This release contains new features and various bug fixes.

You can verify your current FromDual Performance Monitor for MariaDB version with the following command:

shell> fpmmm --version

General
  • MariaDB 10.5 problems fixed. Fpmmm supports MariaDB 10.5 now!
  • Naming convention for Type changed from host to machine and mysqld to instance, including downwards compatibility.
  • New host screens added.
  • All Screens removed because they are customer specific, we have host screens now.
  • Zabbix templates adapted to the more flexible trigger URL.
  • Renamed all files to make it more agnostic.

Server
  • Code made more robust for cloud databases.
  • Free file descriptors removed because it is always zero, trigger added for 80% file descriptors used.
  • Cache file base bug in getIostat fixed.
  • Server graph for file descriptors improved.
  • I/O queue ymin set to 0.
  • Server template optimized.
  • Iostat graphs added to server template.
  • Integrated iostat data into fpmmm.
  • All registered devices and bug on svctm fixed.
  • More info added when server module is called with --debug option.
  • Disk status items cleaned-up and filesystem names added for creating new items.
  • Interface eth1 removed but list of all interfaces added.
  • NUMA trigger added.
  • Macros for network interfaces added.

Data
  • Data module added to measure schema and instance size.
  • Code made ready for cloud databases.

Galera
  • Galera cluster size graph y axis set to 0.
  • 3 Galera graphs added, URL added to some triggers, title of one trigger changed.
  • gmcast segment added.
  • Locale state name of item fixed, local status removed, cluster and local state added, gmcast segment added to template.
  • wsrep version fixed to new format.
  • Item name change for cluster_conf_id.
  • Dirty code fixed, found on cloud databases.

User
  • Module for per user data added.
  • Dirty code fixed, found on cloud databases.
  • User info for transactions added.
  • Tmp disk tables and sort merge passes per user information added.

Agent
  • Output format zabbix, icinga, nagios and centreon should be supported now.
  • Error messages for connect improved.
  • Option --debug added, one message was not handled correctly in verbosity level.
  • Parameters in function goThroughAllSections cleaned-up.
  • Option -h added, info more clear when wrong options were used.
  • URLs added to fpmmm template.
  • fpmmm check and trigger improved.
  • Made error handling better after test of 1.1.0 on CentOS 7.
  • fpmmm trigger error message improved.

InnoDB
  • NUMA information and warning trigger added to InnoDB module.
  • Trigger for innodb_force_recovery made repeatable.
  • Alert level for innodb_force_recovery increased, InnoDB non default page size alert added.
  • Alert added for disabled InnoDB deadlock detection.
  • innodb_metrics only works with SUPER privilege, fixed.
  • InnoDB Log Buffer much too small trigger added on Innodb_log_waits item.
  • Items and graphs for InnoDB temporary tables added.

MyISAM
  • Items and graphs for MyISAM temporary tables added.

Aria
  • Items and graphs for Aria temporary tables added.

Security
  • Expired user added for MariaDB including alert.

Slave
  • URL added and two triggers made repeatable.

Backup
  • Backup will report EVERY failure and URL is now useful!

For subscriptions of commercial use of fpmmm please get in contact with us.

Taxonomy upgrade extras: performance, monitor, monitoring, fpmmm, maas, release, graph

FromDual Performance Monitor for MySQL 1.2.0 has been released

Shinguz - Fri, 2020-06-12 16:42

FromDual has the pleasure to announce the release of the new version 1.2.0 of its popular Database Performance Monitor for MySQL fpmmm.

The FromDual Performance Monitor for MySQL (fpmmm) enables DBAs and System Administrators to monitor and understand what is going on inside their MySQL database instances and on the machines where the databases reside.

You can find more detailed information in the fpmmm Installation Guide.

Download

The new FromDual Performance Monitor for MySQL (fpmmm) can be downloaded from here or you can use our FromDual repositories. How to install and use fpmmm is documented in the fpmmm Installation Guide.

In the inconceivable case that you find a bug in the FromDual Performance Monitor for MySQL please report it to the FromDual Bug-tracker or just send us an email.

Any feedback, statements and testimonials are welcome as well! Please send them to us.

Monitoring as a Service (MaaS)

You do not want to set up your database monitoring yourself? No problem: Choose our MySQL Monitoring as a Service (MaaS) program to save time and costs!

Installation of Performance Monitor 1.2.0

You can find a complete guide on how to install FromDual Performance Monitor in the fpmmm Installation Guide.

Upgrade of fpmmm tarball from 1.0.x to 1.2.0

Upgrade with DEB/RPM packages should happen automatically. For tarballs, proceed as follows:

shell> cd /opt
shell> tar xf /download/fpmmm-1.2.0.tar.gz
shell> rm -f fpmmm
shell> ln -s fpmmm-1.2.0 fpmmm

Changes in FromDual Performance Monitor for MySQL 1.2.0

This release contains new features and various bug fixes.

You can verify your current FromDual Performance Monitor for MySQL version with the following command:

shell> fpmmm --version

General
  • MySQL 8.0 problems fixed. Fpmmm supports MySQL 8.0 now!
  • Naming convention for Type changed from host to machine and mysqld to instance, including downwards compatibility.
  • New host screens added.
  • All Screens removed because they are customer specific, we have host screens now.
  • Zabbix templates adapted to the more flexible trigger URL.
  • Renamed all files to make it more agnostic.

Server
  • Code made more robust for cloud databases.
  • Free file descriptors removed because it is always zero, trigger added for 80% file descriptors used.
  • Cache file base bug in getIostat fixed.
  • Server graph for file descriptors improved.
  • I/O queue ymin set to 0.
  • Server template optimized.
  • Iostat graphs added to server template.
  • Integrated iostat data into fpmmm.
  • All registered devices and bug on svctm fixed.
  • More info added when server module is called with --debug option.
  • Disk status items cleaned-up and filesystem names added for creating new items.
  • Interface eth1 removed but list of all interfaces added.
  • NUMA trigger added.
  • Macros for network interfaces added.

Data
  • Data module added to measure schema and instance size.
  • Code made ready for cloud databases.

User
  • Module for per user data added.
  • Dirty code fixed, found on cloud databases.
  • User info for transactions added.
  • Tmp disk tables and sort merge passes per user information added.

Agent
  • MySQL 8.0 compatibility issue fixed with user privileges.
  • Output format zabbix, icinga, nagios and centreon should be supported now.
  • Error messages for connect improved.
  • Option --debug added, one message was not handled correctly in verbosity level.
  • Parameters in function goThroughAllSections cleaned-up.
  • Option -h added, info more clear when wrong options were used.
  • URLs added to fpmmm template.
  • fpmmm check and trigger improved.
  • Made error handling better after test of 1.1.0 on CentOS 7.
  • fpmmm trigger error message improved.

InnoDB
  • NUMA information and warning trigger added to InnoDB module.
  • Trigger for innodb_force_recovery made repeatable.
  • Alert level for innodb_force_recovery increased, InnoDB non default page size alert added.
  • Alert added for disabled InnoDB deadlock detection.
  • innodb_metrics only works with SUPER privilege, fixed.
  • InnoDB Log Buffer much too small trigger added on Innodb_log_waits item.
  • Items and graphs for InnoDB temporary tables added.

MySQL
  • Metadata Lock (MDL) error message improved.
  • Metadata Lock (MDL) naming improved.
  • Metadata Lock (MDL) counters, checks, graphs, triggers and Metadata Lock itself added.
  • Connection graphs yellow made a bit darker.

MyISAM
  • Items and graphs for MyISAM temporary tables added.

Aria
  • Items and graphs for Aria temporary tables added.

Security
  • Expired user added for MySQL including alert.

Slave
  • URL added and two triggers made repeatable.

Backup
  • Backup will report EVERY failure and URL is now useful!

For subscriptions of commercial use of fpmmm please get in contact with us.

Taxonomy upgrade extras: performance, monitor, monitoring, fpmmm, maas, release, graph

Stupid Error Messages

Shinguz - Fri, 2020-05-22 11:02

Very often I see some stupid error messages as a (power-)user. I do not know if this is because of lazy developers or managers not having enough focus on more useful error messages.

If error messages were clearer, I as a power-user could fix my problems faster, and possibly on my own, instead of asking questions or even opening support cases.

That would also save costs on the support side if end-users were enabled to fix their problems themselves. If this is what software vendors really want...

Sometimes strace helps to understand the problem. But why do I need external tools to do the job?

Some examples

Bad: Could not add A-record.

Better: Could not add A-record lamp-database.org because it already exists.

Bad: Error 2.

OK I can help myself with:

shell> perror 2
OS error code 2: No such file or directory

but still bad.

Better: No such file or directory. File I was looking for: /tmp/doesnotexist.txt
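The second example can be sketched in a few lines of shell. The only point is that the message names the object the program was looking for. The function name read_config is made up for illustration, and the return code 2 simply mirrors the OS error code from the example above.

```shell
# Hypothetical helper: print a file, or fail with a message that
# names the file instead of just reporting a bare error number.
read_config() {
    if [ ! -e "$1" ]; then
        echo "No such file or directory. File I was looking for: $1" >&2
        return 2
    fi
    cat "$1"
}

# Usage:
#   read_config /tmp/doesnotexist.txt
```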

Unique Error Code Policy

FromDual has introduced a unique error code policy: Whenever you get an error code of any FromDual product we can tell you within seconds in which product at what place in the code (function) you hit the error (at least in theory and in 98% in practice).

shell> grep -r '3170' *
focmm/lib/Node.inc: $rc = 3170;

Our system is not perfect and could be done better. The concept has evolved over time and was not there from the beginning, so today we would possibly make it a bit more sophisticated. For example: error codes can be reused (in the same product) when they are not needed any more. As a consequence we also need to know the version of the product to match an error number exactly to the code. And in some projects we will soon run out of error numbers because the range we chose is too small. 1000 error numbers are NOT enough for medium-size products.
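Whether a code is really unique within one source tree can be checked mechanically. The following is only a minimal sketch, assuming that codes are always assigned in the form $rc = NNNN; as in the grep output above (any other assignment style would need a different pattern). It relies on GNU grep, and the function name duplicate_error_codes is made up for illustration.

```shell
# Print every error code that is assigned in more than one place
# below the given directory.
# Assumption: codes are assigned as "$rc = <number>;".
duplicate_error_codes() {
    grep -rhoE '\$rc = [0-9]+;' "$1" | grep -oE '[0-9]+' | sort | uniq -d
}

# Usage:
#   duplicate_error_codes focmm/lib
```

An empty output means that no code was found twice.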

I think Microsoft has a similar concept?

Some developers tend to just throw stack traces at the user. Java developers are specialists in this. This is completely useless for the (power-)user: most of them do not know what to do with a stack trace, and the real problem gets lost in all the stack trace noise in the error log file. So the real information is overlooked. And information filtering is not everybody's strength...

I may add some more examples here in the future...

Taxonomy upgrade extras: error, developer, software

FromDual Ops Center for MySQL and compatible databases 1.0.0 has been released

Shinguz - Mon, 2020-05-11 15:58

FromDual has the pleasure to announce the release of the new version 1.0.0 of its popular FromDual Ops Center focmm, a Graphical User Interface (GUI) for MySQL and compatible databases.

The FromDual Ops Center for MySQL and compatible databases (focmm) helps DBAs and System Administrators to better manage their MySQL and compatible database farms. Ops Center makes the life of DBAs and admins easier!

The main task of Ops Center is to support you in your daily MySQL and compatible databases operation tasks. You can find more information about FromDual Ops Center here.

Download

The new FromDual Ops Center for MySQL and compatible databases (focmm) can be downloaded from here. How to install and use focmm is documented in the Ops Center User Guide.

In the inconceivable case that you find a bug in the FromDual Ops Center for MySQL and compatible databases please report it to the FromDual bug tracker or just send us an email.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Installation of Ops Center 1.0.0

You can find a complete guide on how to install FromDual Ops Center in the Ops Center User Guide.

Upgrade from 0.9.x to 1.0.0

Upgrade from 0.9.x to 1.0.0 should happen automatically. Please back up your Ops Center instance before you upgrade! Please also check Upgrading.

Changes in Ops Center 1.0.0

Machine
  • Expression Server replaced by Machine.
  • Better error handling in getOperatingSystem function.
  • Machine does not belong to a Cluster any more: Column cluster_id from table machine removed.
  • Machine rename prepared.
  • 2 bugs on refresh and edit Machine fixed.
  • Instances added to Machine overview.

Instance
  • Expression Node replaced by Instance.
  • Instance added to Machine and Cluster overview.
  • Database Account management implemented.
  • Bug in create instance (server_id) fixed.
  • Bugs fixed in Instance backup, Instance restore, Instance performance and create Account.
  • Variable skip_name_resolve added to create Instance. This makes error probability smaller for accounts with same username but different host.
  • Menu title Configuration renamed to Settings.
  • Configuration file management implemented.
  • Deleting Instance belonging to a Cluster is not possible any more. Cluster must first release instance before instance can be deleted.
  • Column instance_type removed.
  • Menu Title Variables renamed to Configuration.
  • Show Processlist refurbished, filter fields smaller and daemon and system users filtered out.
  • Schema management implemented.
  • MySQL compatible database version 10.4 defaults for Variables added.
  • Missing backup-target is only ERR and not EMERG any more.
  • Instance performance advice added for sar.
  • Function copyFileFromRemoteServer should preserve timestamp now.
  • Connection changed from connect to real_connect to allow SSL connections.
  • SSL connection from Ops Center to Target Database and Repository Database is possible now.

Cluster
  • CHANGE MASTER TO and Cluster change fixed for systems with more than one Cluster.
  • CHANGE MASTER TO password is hidden now as well, all passwords can be viewed and copied.
  • Cluster bugs fixed.
  • Cluster functions cleaned-up.
  • Galera safe to bootstrap improved.
  • Machine removed from Cluster dashboard.
  • Instance added to Cluster overview.

Load-Balancer
  • Better error handling for Load-Balancer.
  • MaxScale Load-Balancer added in overview.
  • ProxySQL Load-Balancer basics implemented.

Virtual IP (VIP)/Floating IP
  • Put Edit VIP buttons in right order.

Tools
  • Tool for File Transfer added.
  • Tool config-diff compare button added also to top of table.

Configuration
  • No changes.

Database-as-a-Service (DBaaS)
  • Customer group replaced by Resource group.
  • FK added to Resource costs.
  • User Group added.
  • Logout User put into own function.
  • Tabindex for User edit/add added.
  • User cannot delete itself any more.
  • User comment and User group responsibility added.
  • User is logged out from focmm when deleted.

Building and Packaging
  • No changes.

Themes / UI
  • Keyboard usage and Tab Index implemented in some forms.
  • jQuery upgraded from 3.4.1 to 3.5.0.

General
  • All HTML code removed in code.
  • Constant clean-up.
  • Machine, Instance and Cluster overview are sorted now.
  • Menu Instance, Machine and Cluster are sorted now, getXxxx function extended to make this possible.
  • Order of object and function changed in url.
  • Spell check error fixed.
  • Title moved a bit more to the right.
  • All tests passed for MySQL compatible database 10.5 and MySQL 8.0. Both can be considered as supported now.
  • Repository version 124 added, create separated for next major step.

Taxonomy upgrade extras: Operations, release, FromDual Ops Center, ops center, dbaas

Shutdown with MySQL 8

Shinguz - Wed, 2020-04-01 16:52

On StackExchange for Database Administrators I recently have seen a question which attracted my interest.

The question puzzled me a bit because the answer seems too easy. Furthermore, the question was not very clear. And all these factors smell dangerous...

About time - was, is and will be

How can I find out if the database "was" shut down slowly? This is quite easy: Look into your MySQL Error Log and there you will find a log sequence similar to the following:

2020-03-30T08:03:36.928017Z 0 [System] [MY-010910] [Server] /home/mysql/product/mysql-8.0.19-linux-glibc2.12-x86_64/bin/mysqld: Shutdown complete (mysqld 8.0.19) MySQL Community Server - GPL.

Oops! There are no more "shutting down ..." messages like in MySQL 5.7:

2020-03-30T08:04:49.898254Z 0 [Note] Giving 1 client threads a chance to die gracefully
2020-03-30T08:04:49.898266Z 0 [Note] Shutting down slave threads
2020-03-30T08:04:51.898389Z 0 [Note] Forcefully disconnecting 1 remaining clients
2020-03-30T08:04:51.898433Z 0 [Warning] bin/mysqld: Forcing close of thread 115 user: 'enswitch'
2020-03-30T08:04:51.898512Z 0 [Note] Event Scheduler: Purging the queue. 0 events
2020-03-30T08:04:51.924644Z 0 [Note] Binlog end
2020-03-30T08:04:51.938518Z 0 [Note] Shutting down plugin 'ngram'
...
2020-03-30T08:04:53.296239Z 0 [Note] Shutting down plugin 'binlog'
2020-03-30T08:04:53.296805Z 0 [Note] bin/mysqld: Shutdown complete

So you cannot find out any more when the shutdown started, and thus you cannot say how long it took to shut down MySQL. So MySQL messed it up somehow in 8.0. Too much clean-up!

If you want to get the old behaviour back you can stop MySQL 8 as follows:

SQL> SET GLOBAL log_error_verbosity = 3;
SQL> SHUTDOWN;

or just add the variable to your MySQL configuration file (my.cnf).

Then you will find the old shutdown sequence in your error log as before:

2020-03-30T08:13:55.071627Z 9 [System] [MY-013172] [Server] Received SHUTDOWN from user root. Shutting down mysqld (Version: 8.0.19).
2020-03-30T08:13:55.178119Z 0 [Note] [MY-010067] [Server] Giving 1 client threads a chance to die gracefully
2020-03-30T08:13:55.178210Z 0 [Note] [MY-010117] [Server] Shutting down slave threads
...
2020-03-30T08:13:56.588574Z 0 [System] [MY-010910] [Server] /home/mysql/product/mysql-8.0.19-linux-glibc2.12-x86_64/bin/mysqld: Shutdown complete (mysqld 8.0.19) MySQL Community Server - GPL.
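With both the first and the last message of this sequence in the log, the shutdown duration is simply the difference of the two timestamps. The following is only a minimal sketch. It assumes GNU date, one log entry per line, and that the shutdown was issued in a way that logs the two message IDs shown above (MY-013172 for the start, MY-010910 for the end). The function names are made up for illustration.

```shell
# Convert an error log timestamp such as 2020-03-30T08:13:55.071627Z
# into seconds since the epoch, keeping the fractional part.
# Requires a date command that understands -d (for example GNU date).
log_ts_to_epoch() {
    local base frac
    base=${1%%.*}                          # 2020-03-30T08:13:55
    frac=${1#*.}                           # 071627Z
    frac=${frac%Z}                         # 071627
    base=$(echo "$base" | tr 'T' ' ')      # 2020-03-30 08:13:55
    echo "$(date -u -d "$base" +%s).$frac"
}

# Print the duration in seconds between the last "Received SHUTDOWN"
# message and the last "Shutdown complete" message in the given log.
shutdown_duration() {
    local start end
    start=$(grep -F '[MY-013172]' "$1" | tail -n 1 | cut -d ' ' -f 1)
    end=$(grep -F '[MY-010910]' "$1" | tail -n 1 | cut -d ' ' -f 1)
    [ -n "$start" ] && [ -n "$end" ] || return 1
    LC_ALL=C awk -v s="$(log_ts_to_epoch "$start")" \
                 -v e="$(log_ts_to_epoch "$end")" \
                 'BEGIN { printf "%.3f\n", e - s }'
}

# Usage:
#   shutdown_duration /home/mysql/database/mysql-80/log/error.log
```

For the sequence above this gives roughly one and a half seconds.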

If you want to know where your MySQL Error Log File is located you can find it like this:

SQL> SHOW GLOBAL VARIABLES LIKE 'log_error';
+---------------+---------------------------------------------+
| Variable_name | Value                                       |
+---------------+---------------------------------------------+
| log_error     | /home/mysql/database/mysql-80/log/error.log |
+---------------+---------------------------------------------+

Typical locations are: /var/lib/mysql/<hostname>.log or /var/log/mysqld.log or /var/log/mysql/mysqld.log or /var/log/messages or similar.

Now about the "is"

While MySQL is shutting down it is already too late to find out when the shutdown started: you cannot connect to the database any more to change the settings, and you do not see anything in the MySQL Error Log. Possibly you can look at the error log with stat to see when the last message was written to it, and so find the start of the shutdown.

shell> stat error.log
  File: error.log
  Size: 29929       Blocks: 64         IO Block: 4096   regular file
Device: 801h/2049d  Inode: 5373953     Links: 1
Access: (0640/-rw-r-----)  Uid: ( 1001/   mysql)   Gid: ( 1001/   mysql)
Access: 2020-03-30 10:13:59.491446560 +0200
Modify: 2020-03-30 10:13:56.587467485 +0200
Change: 2020-03-30 10:13:56.587467485 +0200
 Birth: -
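The Modify time can also be turned into "seconds since the last message" for use in a script. This is only a minimal sketch, assuming GNU stat (option -c); the function name seconds_since_last_write is made up for illustration.

```shell
# Print how many seconds ago the given file was last modified.
# Requires GNU stat for the -c option.
seconds_since_last_write() {
    local now mtime
    now=$(date +%s)
    mtime=$(stat -c %Y "$1") || return 1
    echo $((now - mtime))
}

# Usage:
#   seconds_since_last_write error.log
```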

Symptoms of a shutdown in progress are either heavy writing to disk (iostat -xk 1) or heavy swapping in (vmstat). You have to wait until it has finished. Some brave people use kill -9 in such a case, if they have InnoDB only and if they know exactly what they are doing and how much time the subsequent crash recovery will take.

And finally about the question "how long will the shutdown take" (will be)

This is not so easy to predict. It depends on several things:

  • How many memory blocks of your database are swapped out.
  • How many pages are dirty and must be written to disk.
  • How fast is your I/O system (IOPS).

You can gather this information as follows:

  • How much swap must be swapped in you can find here.
  • The number of dirty pages (pages modified but not yet written to disk) you can find with: SQL> SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';
  • For the IOPS you have to consult the spec sheet of your hardware (150 to 100'000 IOPS).
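For the dirty page part alone, a rough back-of-the-envelope estimate is the number of dirty pages divided by the IOPS of the storage. This is a simplification and not a formula from MySQL: it assumes one I/O operation per dirty page and ignores the swap-in time and everything else the server does during shutdown. The function name is made up for illustration.

```shell
# Estimate the seconds needed to write out the dirty pages.
# $1: value of Innodb_buffer_pool_pages_dirty
# $2: sustained write IOPS of the storage
# Assumption: one I/O operation per dirty page.
estimate_flush_seconds() {
    LC_ALL=C awk -v pages="$1" -v iops="$2" 'BEGIN {
        if (iops <= 0) { exit 1 }
        printf "%.0f\n", pages / iops
    }'
}

# Usage:
#   estimate_flush_seconds 100000 1000
```

With 100'000 dirty pages on storage that sustains 1000 IOPS this gives about 100 seconds, which can at best serve as an order of magnitude.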

I hope that with this answer I have covered all the possible questions about shutting down MySQL 8.0.

Taxonomy upgrade extras: mysql, shutdown, slow

FromDual trainings despite Corona - simply online and remote

Oli Sennhauser - Mon, 2020-03-23 15:43

Due to the current Corona pandemic, all FromDual on-site trainings, trainings at our training partners and FromDual consulting engagements are suspended until further notice.

So that you do not have to go entirely without further education this spring, we have been testing alternatives in the form of online trainings since last week. The first test trainings were already held at the end of last week and at the beginning of this week.

In the coming weeks you may have the opportunity to attend one or the other of our online trainings from your home office. The following training dates are expected to be affected:

  • 2 to 4 April: Galera Cluster for MariaDB/MySQL at the Linuxhotel in Essen
  • 23 to 24 April: Galera Cluster for MariaDB/MySQL at the Heinlein Academy in Berlin
  • 4 to 8 May: MariaDB/MySQL for Advanced Users at GfU Cyrus in Cologne
  • 11 to 15 May: MariaDB/MySQL for Advanced Users at the Heinlein Academy in Berlin

Whether later training dates are affected as well remains to be seen...

Remote works better - FromDual remote-DBA services

The Corona pandemic may also cause a different or new load pattern on your database:

  • bookings are cancelled (travel portals),
  • discount campaigns are launched (outdoor activities),
  • more shopping is done online (web shops),
  • more communication takes place (VoIP, video conferencing, screen sharing), or
  • healthcare platforms are used more heavily than usual,
  • etc.

If this situation leads to operational problems or performance bottlenecks on your side, we are happy to help you with a remote-DBA engagement instead of an on-site consulting engagement. We have been using these technologies for years and will gladly show you remotely how to get your problems under control.

Taxonomy upgrade extras: schulung, remote-dba, remote, beratung, consulting, training

innodb_deadlock_detect - Rather Hands off!

Shinguz - Mon, 2020-03-23 11:24

Recently we had a new customer who from time to time had massive database problems which he did not understand. When we reviewed the MySQL configuration file (my.cnf) we found that this customer had disabled InnoDB deadlock detection (innodb_deadlock_detect).

Because we have always advised against doing this, but I had never stumbled upon this problem in practice, I investigated the MySQL variable innodb_deadlock_detect a bit more.

The MySQL documentation tells us the following [1]:

Disabling Deadlock Detection
On high concurrency systems, deadlock detection can cause a slowdown when numerous threads wait for the same lock. At times, it may be more efficient to disable deadlock detection and rely on the innodb_lock_wait_timeout setting for transaction rollback when a deadlock occurs. Deadlock detection can be disabled using the innodb_deadlock_detect configuration option.

And about the parameter innodb_deadlock_detect itself [2]:

This option is used to disable deadlock detection. On high concurrency systems, deadlock detection can cause a slowdown when numerous threads wait for the same lock. At times, it may be more efficient to disable deadlock detection and rely on the innodb_lock_wait_timeout setting for transaction rollback when a deadlock occurs.

The problem is that every time MySQL requests a (row) lock or a table lock which cannot be granted immediately, it checks whether this lock causes a deadlock. This check is quite expensive. By the way: the feature to disable InnoDB deadlock detection was developed by Facebook for WebScaleSQL [3].

The relevant functions can be found in [4]:

class DeadlockChecker, method check_and_resolve (DeadlockChecker::check_and_resolve)

Every InnoDB (row) Lock (for mode LOCK_S or LOCK_X) and type ORed with LOCK_GAP or LOCK_REC_NOT_GAP, ORed with LOCK_INSERT_INTENTION
Enqueue a waiting request for a lock which cannot be granted immediately.
lock_rec_enqueue_waiting()

and

Every (InnoDB) Table Lock
Enqueues a waiting request for a table lock which cannot be granted immediately. Checks for deadlocks.
lock_table_enqueue_waiting()

This means: if the variable innodb_deadlock_detect is enabled (= default), every waiting lock (row or table) is checked for whether it causes a deadlock. If the variable is disabled, the check is NOT done (which is faster) and the transaction hangs in the (dead)lock until the lock is freed or the time innodb_lock_wait_timeout (default 50 seconds) is exceeded. Then the InnoDB Lock Wait Timeout (detector?) strikes and rolls back the waiting statement (the whole transaction only if innodb_rollback_on_timeout is enabled).

SQL> SHOW GLOBAL VARIABLES LIKE 'innodb_lock_wait%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_lock_wait_timeout | 50    |
+--------------------------+-------+
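
To illustrate why this check costs something, deadlock detection can be thought of as a cycle search in a wait-for graph: before a transaction starts waiting, the server checks whether the lock holder is, directly or indirectly, already waiting for the requester. The following Python sketch shows this idea only. It is not InnoDB's DeadlockChecker implementation, and the function would_deadlock and the transaction names are made up for illustration.

```python
from typing import Dict, Set


def would_deadlock(waits_for: Dict[str, Set[str]], waiter: str, holder: str) -> bool:
    """Return True if letting `waiter` wait for `holder` closes a cycle."""
    # Depth-first search starting at the holder: if the waiter can be
    # reached again, the new wait would complete a cycle (a deadlock).
    stack = [holder]
    seen = set()
    while stack:
        trx = stack.pop()
        if trx == waiter:
            return True
        if trx in seen:
            continue
        seen.add(trx)
        stack.extend(waits_for.get(trx, ()))
    return False


# Hypothetical situation: trx_a already waits for a lock held by trx_b.
waits_for = {"trx_a": {"trx_b"}}

# trx_b now requests a lock held by trx_a: this closes the cycle.
deadlock = would_deadlock(waits_for, "trx_b", "trx_a")
# trx_c waiting for trx_a is just an ordinary lock wait.
plain_wait = would_deadlock(waits_for, "trx_c", "trx_a")
print(deadlock, plain_wait)
```

The search has to walk through all transactions that are transitively waiting, so its cost grows with the number of waiters. With the check disabled, the first case above would simply hang until innodb_lock_wait_timeout expires.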

This means that deactivating InnoDB deadlock detection is interesting if you have many (like Facebook!?!) short and small transactions where you expect little to no conflicts.
Further, it is recommended to set the MySQL variable innodb_lock_wait_timeout to a very small value (a few seconds).

Most of our customers are not the size of Facebook, and they tend to have few but long transactions (with probably many locks and therefore a high deadlock probability) rather than many concurrent short and small transactions. So I can imagine that disabling this parameter was responsible for the hiccup (locks piling up) of the customer system. This in turn leads to max_connections being exceeded, and finally the whole system gets stuck.

Therefore I strongly recommend leaving InnoDB deadlock detection enabled, unless you know exactly what you are doing (after about 2 weeks of extensive testing and measuring).

Taxonomy upgrade extras: innodb, deadlock, lock, performance, locking, block

Rather hands off: innodb_deadlock_detect

Oli Sennhauser - Fri, 2020-03-06 15:27

Recently, while reviewing the MySQL configuration file (my.cnf) of one of our customers who occasionally has massive database problems, we noticed that he had disabled InnoDB deadlock detection (innodb_deadlock_detect).

Because we have always advised against this so far, but I had never actually stumbled upon this problem, I looked into the matter a bit further and did some research on the variable innodb_deadlock_detect.

The MySQL documentation says the following about it [1]:

Disabling Deadlock Detection
On high concurrency systems, deadlock detection can cause a slowdown when numerous threads wait for the same lock. At times, it may be more efficient to disable deadlock detection and rely on the innodb_lock_wait_timeout setting for transaction rollback when a deadlock occurs. Deadlock detection can be disabled using the innodb_deadlock_detect configuration option.

And about the parameter innodb_deadlock_detect itself [2]:

This option is used to disable deadlock detection. On high concurrency systems, deadlock detection can cause a slowdown when numerous threads wait for the same lock. At times, it may be more efficient to disable deadlock detection and rely on the innodb_lock_wait_timeout setting for transaction rollback when a deadlock occurs.

The problem is that every time MySQL takes a (row) lock or table lock, it checks whether this lock causes a deadlock, which is correspondingly expensive. By the way, this feature was developed by Facebook [3].

The corresponding functions can be found in [4]:

class DeadlockChecker, method check_and_resolve (DeadlockChecker::check_and_resolve)

Every InnoDB (row) Lock (for mode LOCK_S or LOCK_X) and type ORed with LOCK_GAP or LOCK_REC_NOT_GAP, ORed with LOCK_INSERT_INTENTION
Enqueue a waiting request for a lock which cannot be granted immediately.
lock_rec_enqueue_waiting()

and

Every (InnoDB) Table Lock
Enqueues a waiting request for a table lock which cannot be granted immediately. Checks for deadlocks.
lock_table_enqueue_waiting()

This means that if the variable innodb_deadlock_detect is switched on (= default), every lock (row or table) is checked for whether it causes a deadlock. If the variable is switched off, this check is NOT performed (which is faster) and the transaction hangs in the (dead)lock until the lock is released or the time innodb_lock_wait_timeout (default 50 seconds) is exceeded. Then the InnoDB Lock Wait Timeout (detector?) strikes.

SQL> SHOW GLOBAL VARIABLES LIKE 'innodb_lock_wait%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_lock_wait_timeout | 50    |
+--------------------------+-------+

This means that disabling InnoDB deadlock detection is interesting if you have very many (like Facebook!?!) short/small transactions with few to no conflicts.
Furthermore, it is recommended to set innodb_lock_wait_timeout to a very low value (a few seconds) at the same time.

However, most of our customers are not in Facebook's league and tend to have few long transactions (probably with many locks and therefore a high deadlock probability) rather than many concurrent short/small transactions. So I can well imagine that disabling this parameter is responsible for the customer system choking (locks piling up), which subsequently leads to max_connections being reached and finally to nothing working any more.

Therefore I would strongly recommend leaving InnoDB deadlock detection enabled, unless you know exactly what you are doing (after about 2 weeks of testing and measuring).

Taxonomy upgrade extras: innodb, deadlock, lock, performance, locking, block

FromDual is 10 years old

Shinguz - Mon, 2020-03-02 10:03

On 1 March 2020 FromDual turned 10 years old! Our sincere thanks go to all our customers, partners and interested parties for their support and good cooperation over the last 10 years. We would be pleased to advise and support you competently again in the coming 10 years.

Your FromDual Team

Picture by kalhh on Pixabay

Taxonomy upgrade extras: fromdual

FromDual is 10 years old

Oli Sennhauser - Mon, 2020-03-02 10:01

On 1 March 2020 FromDual GmbH turned 10 years old! We would like to sincerely thank all customers, partners and interested parties for their support and good cooperation over the past 10 years. We would be delighted to continue advising and supporting you competently over the coming 10 years.

Your FromDual Team

Picture by kalhh on Pixabay

Taxonomy upgrade extras: fromdual

InnoDB Page Cleaner intended loop takes too long

Shinguz - Tue, 2020-02-18 17:50

Recently we migrated a database system from MySQL 5.7 to MariaDB 10.3. Everything went fine so far, except that the following message started to pop up in the MariaDB Error Log File with the severity Note:

InnoDB: page_cleaner: 1000ms intended loop took 4674ms. The settings might not be optimal. (flushed=102 and evicted=0, during the time.)

I remember that this message also appeared in earlier MySQL 5.7 releases but somehow disappeared in later releases. I assume MySQL has just disabled the Note?

You can find various pieces of advice on the Internet on how to get rid of this Note:

innodb_lru_scan_depth = 1024, 256
innodb_buffer_pool_instances = 1, 8
innodb_io_capacity = 100, 200 or 1000
innodb_page_cleaners = 1, 4 or 8

But none of these changes made the Note go away in our case. I only found one voice claiming that an external cause could make this message appear. Because we are actually running on a cloud machine, the appearance of this message could really be an effect of the cloud and not caused by the database or the application.

We further know that our MariaDB Database has a more or less uniform workload over the day. Further it is a Master/Master (active/passive) set-up. So both nodes should see more or less the same write load at the same time.

But our investigation clearly shows that the Note does not appear at the same time on both nodes. So I strongly assume it is a noisy-neighbour problem.

First we tried to find any trend or correlation between these 2 Master/Master Databases maas1 and maas2:

What we can see here is that the message appeared on different days on maas1 and maas2. The database maas2 had a problem at the beginning of December and at the end of January. Database maas1 had far fewer problems in general, but at the end of December there was a problem.

At night both instances seem to have fewer problems than during the day. And maas2 has more problems in the afternoon and evening.

If we look at the distribution per minute we can see that maas2 has some problems around xx:45 to xx:50 and maas1 more at xx:15.

Then we had a closer look at 28 January at about 12:00 to 15:00 on maas2:

We cannot see any anomalies which would explain a huge increase of dirty pages and a stuck page_cleaner.

The only thing we could see at the specified time is that I/O latency increased significantly on the server side. Because we neither caused more load nor over-saturated the system, it must have been triggered externally:

This correlates quite well to the Notes we see in the MariaDB Error Log on maas2:

2020-01-28 12:45:27 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5760ms. The settings might not be optimal. (flushed=101 and evicted=0, during the time.)
2020-01-28 12:46:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 6908ms. The settings might not be optimal. (flushed=4 and evicted=0, during the time.)
2020-01-28 12:46:32 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5339ms. The settings might not be optimal. (flushed=17 and evicted=0, during the time.)
2020-01-28 12:47:36 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4379ms. The settings might not be optimal. (flushed=101 and evicted=0, during the time.)
2020-01-28 12:48:08 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5053ms. The settings might not be optimal. (flushed=7 and evicted=0, during the time.)
2020-01-28 12:48:42 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5760ms. The settings might not be optimal. (flushed=102 and evicted=0, during the time.)
2020-01-28 12:49:38 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4202ms. The settings might not be optimal. (flushed=100 and evicted=0, during the time.)
2020-01-28 12:57:28 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4615ms. The settings might not be optimal. (flushed=18 and evicted=0, during the time.)
2020-01-28 12:58:01 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5593ms. The settings might not be optimal. (flushed=4 and evicted=0, during the time.)
2020-01-28 12:58:34 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5442ms. The settings might not be optimal. (flushed=101 and evicted=0, during the time.)
2020-01-28 12:59:31 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4327ms. The settings might not be optimal. (flushed=101 and evicted=0, during the time.)
2020-01-28 13:00:05 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5154ms. The settings might not be optimal. (flushed=82 and evicted=0, during the time.)
2020-01-28 13:08:01 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4321ms. The settings might not be optimal. (flushed=4 and evicted=0, during the time.)
2020-01-28 13:10:46 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 21384ms. The settings might not be optimal. (flushed=100 and evicted=0, during the time.)
2020-01-28 13:14:16 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4180ms. The settings might not be optimal. (flushed=20 and evicted=0, during the time.)
2020-01-28 13:14:49 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4935ms. The settings might not be optimal. (flushed=4 and evicted=0, during the time.)
2020-01-28 13:15:20 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4472ms. The settings might not be optimal. (flushed=25 and evicted=0, during the time.)
2020-01-28 13:15:47 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4358ms. The settings might not be optimal. (flushed=9 and evicted=0, during the time.)
2020-01-28 13:48:31 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 6212ms. The settings might not be optimal. (flushed=9 and evicted=0, during the time.)
2020-01-28 13:55:44 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4280ms. The settings might not be optimal. (flushed=101 and evicted=0, during the time.)
2020-01-28 13:59:43 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5817ms. The settings might not be optimal. (flushed=4 and evicted=0, during the time.)
2020-01-28 14:00:16 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5384ms. The settings might not be optimal. (flushed=100 and evicted=0, during the time.)
2020-01-28 14:00:52 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 9460ms. The settings might not be optimal. (flushed=5 and evicted=0, during the time.)
2020-01-28 14:01:25 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 7727ms. The settings might not be optimal. (flushed=103 and evicted=0, during the time.)
2020-01-28 14:01:57 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 7154ms. The settings might not be optimal. (flushed=5 and evicted=0, during the time.)
2020-01-28 14:02:29 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 7501ms. The settings might not be optimal. (flushed=5 and evicted=0, during the time.)
2020-01-28 14:03:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4322ms. The settings might not be optimal. (flushed=78 and evicted=0, during the time.)
2020-01-28 14:32:02 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4927ms. The settings might not be optimal. (flushed=4 and evicted=0, during the time.)
2020-01-28 14:32:34 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4506ms. The settings might not be optimal. (flushed=101 and evicted=0, during the time.)
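
Distributions like the per-hour and per-minute counts discussed above can be derived from the error log with a few lines of Python. This is a minimal sketch and not the tooling used for the graphs in this post: the regular expression is written against the Note format shown above, summarize is a hypothetical helper, and the three sample lines are copied from the maas2 log.

```python
import re
from collections import Counter

# Matches the page_cleaner Note as it appears in the MariaDB error log.
NOTE_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}) (\d{2}):(\d{2}):\d{2} .*"
    r"page_cleaner: \d+ms intended loop took (\d+)ms"
)


def summarize(lines):
    """Count Notes per hour of day and per minute of hour, collect durations."""
    per_hour, per_minute, durations = Counter(), Counter(), []
    for line in lines:
        m = NOTE_RE.search(line)
        if m is None:
            continue  # not a page_cleaner Note
        per_hour[m.group(2)] += 1
        per_minute[m.group(3)] += 1
        durations.append(int(m.group(4)))
    return per_hour, per_minute, durations


sample = [
    "2020-01-28 12:45:27 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5760ms. The settings might not be optimal. (flushed=101 and evicted=0, during the time.)",
    "2020-01-28 12:46:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 6908ms. The settings might not be optimal. (flushed=4 and evicted=0, during the time.)",
    "2020-01-28 13:10:46 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 21384ms. The settings might not be optimal. (flushed=100 and evicted=0, during the time.)",
]
per_hour, per_minute, durations = summarize(sample)
print(dict(per_hour), max(durations))
```

In practice you would pass the open error log file of each node to summarize() instead of the sample list and compare the resulting counters of maas1 and maas2.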

Taxonomy upgrade extras: innodb, page cleaner, dirty pages, migration, flushing, noisy neighbours
