Storing BLOBs in the database
We sometimes have discussions with our customers about whether or not to store LOBs (Large Objects) in the database. To avoid rephrasing the arguments again and again, I have summarized them in the following lines.
The following points are more or less valid for all large data types (BLOB, TEXT and theoretically also JSON and GIS columns) stored in a MySQL or MariaDB (or any other relational) database.
The idea of a relational, table-based data store is to hold structured data (numbers, dates and short character strings) and to provide quick write and read access to it.
And yes, you can also store other things like videos, huge texts (PDFs, emails) or similar in an RDBMS, but it is principally not designed for such a job and is thus not optimal for the task. Software vendors implement such features not mainly because they make sense, but because users want them and the vendors want to attract users (or their managers) with such features (USP, Unique Selling Proposition). This also calls for one of my mantras: Use the right tool for the right task.
The main topics to discuss related to LOBs are operations, performance, economical reasons and technical limitations.
Disadvantages of storing LOBs in the database
- The database will grow fast. Operations will become more costly and complicated.
- Backup and restore will become more costly and complicated for the admin because of the increased size caused by LOBs.
- Backup and restore will take longer for the same reason.
- Database and table management functions (OPTIMIZE, ALTER, etc.) will take longer on big LOB tables.
- Smaller databases need less RAM/disk space and are thus cheaper.
- Smaller databases fit better into your RAM and are thus potentially faster (RAM vs disk access).
- RDBMS are a relatively slow technology (compared to others). Reading LOBs from the database is significantly slower than reading LOBs from, for example, a filer.
- LOBs stored in the database will pollute your database cache (InnoDB Buffer Pool) and thus possibly slow down other queries (this does not necessarily happen with more sophisticated RDBMS).
- LOB size is limited to 1 Gbyte in practice (max_allowed_packet; the theoretical limit is 4 Gbyte) for MySQL/MariaDB - see the snippet after this list.
- Expensive, fast database storage (RAID-10, SSD) is wasted on something that could be stored better on cheap, slow file storage (RAID-5, HDD).
- It is programmatically often more complicated to get LOBs from a database than from a filer (this depends on your libraries).
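To check where you stand with respect to the size limit mentioned above, you can query the server directly. A minimal sketch (standard MySQL/MariaDB variables, nothing product-specific assumed):

mysql> SHOW GLOBAL VARIABLES LIKE 'max_allowed_packet';
-- 1 Gbyte (1073741824 bytes) is the maximum value this variable accepts,
-- so no single LOB value larger than that can travel in one packet:
mysql> SET GLOBAL max_allowed_packet = 1073741824;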
Advantages of storing LOBs in the database
- Atomicity between data and LOB is guaranteed by transactions (is it really in MySQL/MariaDB?) - see the sketch after this list.
- There are no dangling links (reference from data to LOB) between data and LOB.
- Data and LOB are from the same point in time and can be included in the same backup.
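To illustrate the atomicity argument, here is a minimal sketch, assuming an InnoDB table; the table and file names are made up for this example:

mysql> CREATE TABLE document (
         id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
         name VARCHAR(255) NOT NULL,
         body LONGBLOB NOT NULL
       ) ENGINE = InnoDB;
mysql> START TRANSACTION;
mysql> INSERT INTO document (name, body)
       VALUES ('contract.pdf', LOAD_FILE('/tmp/contract.pdf'));
-- LOAD_FILE() requires the FILE privilege and a permissive secure_file_priv setting.
mysql> COMMIT;

Either both the metadata and the LOB become visible with the COMMIT, or neither does after a ROLLBACK. With a filer, you would have to clean up the orphaned file yourself.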
Conclusion
So basically, you have to balance the advantages against the disadvantages of storing LOBs in the database and decide which arguments are more important in your case.
If you have more good arguments pro or contra storing LOBs in the database, please let me know.
Literature
Check also various articles on Google.
MySQL Enterprise Backup Incremental Cumulative and Differential Backup
Preparing the MySQL Enterprise Administrator Training, I found that the MySQL Enterprise Backup incremental backup is not described very well. Thus I tried it out and wrote down this how-to:
Incremental Differential Backup

Full Backup

mysqlbackup --user=root --backup-dir=/tape/full backup-and-apply-log

grep end_lsn /tape/full/meta/backup_variables.txt
end_lsn=2583666

Incremental Backups

mysqlbackup --user=root --incremental-backup-dir=/tape/inc1 --start-lsn=2583666 --incremental backup

grep end_lsn /tape/inc1/meta/backup_variables.txt
end_lsn=2586138

mysqlbackup --user=root --incremental-backup-dir=/tape/inc2 --start-lsn=2586138 --incremental backup

grep end_lsn /tape/inc2/meta/backup_variables.txt
end_lsn=2589328

mysqlbackup --user=root --incremental-backup-dir=/tape/inc3 --start-lsn=2589328 --incremental backup

grep end_lsn /tape/inc3/meta/backup_variables.txt
end_lsn=2592519
Binary Log Backups

cp /var/lib/binlog/binlog.* /tape/binlog/
Restore
This step will modify the original full backup!
mysqlbackup --incremental-backup-dir=/tape/inc1 --backup-dir=/tape/full apply-incremental-backup
mysqlbackup --incremental-backup-dir=/tape/inc2 --backup-dir=/tape/full apply-incremental-backup
mysqlbackup --incremental-backup-dir=/tape/inc3 --backup-dir=/tape/full apply-incremental-backup
mysqlbackup --user=root --datadir=/var/lib/mysql --backup-dir=/tape/full copy-back

Point-in-Time-Recovery

grep binlog_position /tape/inc3/meta/backup_variables.txt
/tape/inc3/meta/backup_variables.txt:binlog_position=binlog.000001:7731

cd /tape/binlog
mysqlbinlog --disable-log-bin --start-position=7731 binlog.000001 | mysql -uroot
Incremental Cumulative Backup

Full Backup

mysqlbackup --user=root --backup-dir=/tape/full backup-and-apply-log

grep end_lsn /tape/full/meta/backup_variables.txt
end_lsn=2602954

Incremental Backups

mysqlbackup --user=root --incremental-backup-dir=/tape/inc1 --start-lsn=2602954 --incremental backup
mysqlbackup --user=root --incremental-backup-dir=/tape/inc2 --start-lsn=2602954 --incremental backup
mysqlbackup --user=root --incremental-backup-dir=/tape/inc3 --start-lsn=2602954 --incremental backup
Binary Log Backups

cp /home/mysql/database/mysql-5.7/binlog/binlog.* /tape/binlog/
Restore
This step will modify the original full backup!
mysqlbackup --incremental-backup-dir=/tape/inc3 --backup-dir=/tape/full apply-incremental-backup
mysqlbackup --user=root --datadir=/var/lib/mysql --backup-dir=/tape/full copy-back

Point-in-Time-Recovery

grep binlog_position /tape/*/meta/backup_variables.txt
/tape/inc3/meta/backup_variables.txt:binlog_position=binlog.000001:7731

cd /tape/binlog
mysqlbinlog --disable-log-bin --start-position=7731 binlog.000001 | mysql -uroot
I very much dislike that the backup is modified during my restore. So if I make a mistake during the restore, my backup is gone and I am doomed.
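A minimal way to protect yourself, assuming you have the disk space for it: apply the incrementals to a scratch copy of the full backup, so the original stays untouched (paths as in the examples above):

# work on a copy so a failed restore cannot destroy the original full backup
cp -a /tape/full /tape/full_work
mysqlbackup --incremental-backup-dir=/tape/inc3 --backup-dir=/tape/full_work apply-incremental-backup
mysqlbackup --user=root --datadir=/var/lib/mysql --backup-dir=/tape/full_work copy-back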
FromDual Backup and Recovery Manager for MySQL 1.2.5 has been released
FromDual has the pleasure to announce the release of the new version 1.2.5 of the popular Backup and Recovery Manager for MySQL and MariaDB (fromdual_brman).
You can download the FromDual Backup and Recovery Manager from here.
In the inconceivable case that you find a bug in the FromDual Backup and Recovery Manager please report it to our Bugtracker or just send us an email.
Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.
Upgrade from 1.2.x to 1.2.5

# cd ${HOME}/product
# tar xf /download/fromdual_brman-1.2.5.tar.gz
# rm -f fromdual_brman
# ln -s fromdual_brman-1.2.5 fromdual_brman

Changes in FromDual Backup Manager 1.2.5
This release contains mainly fixes related to the backup catalog and the backup type cleanup.
You can verify your current FromDual Backup Manager version with the following command:
shell> fromdual_bman --version

FromDual Backup Manager
- Docker container is ready.
- Testing infrastructure improved.
- Better option checking to be usage compliant.
- Option handling and checking refactored.
- PHP variable variables_order is set within the program; no php.ini change is required any more.
- Connection error made more verbose.
- Downstream myEnv.inc library from rev873.
- Incompatibility change: Default --backupdir location changed from ./bck to $HOME/bck.
- Catalog creation and upgrade messages put into right order.
- Backup catalog should be upgraded directly to newest release when created.
- Only do catalog version check if --upgrade is not specified.
- Incompatibility change: Catalog operation options (--create and --upgrade) made more consistent (--type=catalog). This could lead to different behaviour than earlier!
- Backup cleanup in catalog made nicer.
- Catalog version changed from 010 to 020.
- Archive cleanup output empty line removed.
- Cleanup should delete only instance backups but not all backups if specified (bug #166).
- Omit warning for non-transactional tables if option --blocking-backup is set.
- Function getDumpOverview now also includes empty schemata.
- Transactional storage engine selection made ready for MariaDB 10.1/10.2 and MySQL 8.0.
How the Lack of a Primary Key May Effectively Stop the Slave
Most (relational) DBAs and DB application developers know the concept of a primary key ("PK") and what it is good for. However, much too often one still encounters table definitions without a PK. True, the relational theory based on sets does not need a PK, and all operations (insert, select, update, delete) can also be done on tables for which no PK was defined. If performance doesn't matter (or the data volume is small, a typical situation in tests), the lack of a PK does not immediately cause negative consequences.
But recently, we had several customers who (independently of each other) had big tables without a PK in a replication setup, tables whose contents they wanted to delete, and they all suffered severely from that: Their replication did not progress, the slave lag grew larger and larger, and all this without reporting any error. When I say "larger and larger", I mean it: I'm not talking about minutes or even hours, I'm talking about days!
Observations

In all cases, the situation could be summarized as:
- They were running a traditional, asynchronous replication using the "row" format.
- They had a table without PK with many entries, say a million rows.
- This table wasn't needed any more, and someone issued a "delete from T" statement on the master.
- From this moment, the slave did not show any sign of progress: The output of "show slave status \G"
- listed both the IO and the SQL slave thread as running,
- reported the same log positions without any change,
- did not mention any slave error,
- but the slave lag grew in sync with the passing time.
Looking at the slave's machine load, say using "vmstat", you could see a certain amount of CPU load in user mode, possibly some "waiting for IO", and some read IO (group "io", column "bi" = "block input"). Checking with "top", you could see that the MySQL server process was the only significant CPU load, and closer inspection would reveal that it used roughly one CPU core.
(A side remark: The numbers of "vmstat" and "top" are not directly comparable: While "vmstat" reports the overall CPU usage as a percentage of the total available CPU power, "top" reports the CPU usage of the individual processes as the percentage of a single CPU core. So if you had a 4-core machine with only one active process running an infinite loop, "top" would show this process as using 100%, while "vmstat" would report the CPU as running 25% in user mode and being 75% idle. If you don't know it, look at "/proc/cpuinfo" to find the number of CPU cores.)
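For reference, a minimal way to take these readings (standard Linux tools, no special options assumed):

# overall CPU and IO: watch the "us" (user mode) and "bi" (block input) columns
vmstat 5
# per-process CPU, as a percentage of a single core: look for the mysqld line
top
# number of CPU cores
grep -c ^processor /proc/cpuinfo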
What was the Issue?

From "top", we could tell that the MySQL server was using one CPU core at full speed. From "vmstat", we could tell that it might access some data from disk, but didn't write any significant amount. This sure looked like an infinite loop or a close relative of it, executed by only one thread.

Some of you may have heard that (under some conditions) replication may do a full table scan: This is it!
When the slave has to apply a change (for this discussion: a delete or an update) which is provided in row format, and the table has no PK (or other suitable index), the SQL thread does a full table scan searching for the matching row. Note that it doesn't stop on the first match, but rather continues the scan throughout the rest of the table.
To be honest: I don't understand why it doesn't stop after the first match. I also ran a modified test where some master rows had two matches on the slave (I had duplicated them by "insert select"): The "delete" was still replicated as before, and these additional slave rows were still present after the delete.
Until now, you may consider this "full table scan" a claim without proof, and I agree with all demands for a more thorough check.
The Experiment

For tests, I have a setup of two VMs, each running a MySQL server process, and they are configured for a traditional master-slave replication. In fact, I have it for the versions MySQL 5.5.44, 5.6.24, and 5.7.17. (Yes, I might add newer versions.)

So I designed an experiment:
- Start both master and slave, make sure replication is running.
- On the master, create a table without any key / index, neither primary nor secondary:

CREATE TABLE for_delete (
  inc_num int,       # ascending numbers 1 .. 10,000, repeated for larger tables
  clock timestamp,   # now(); counted: 129 different values with 200 k rows
  bubble char(250)   # repeat('Abcd', 60)
) DEFAULT CHARACTER SET latin1;
- Insert the desired number of rows into that table (I had runs with 10 k, 20 k, 100 k, and 200 k).
- On both master and slave, run "SELECT ... FROM information_schema.global_status WHERE ..." to get status counters (one possible form of this query is sketched after this list).
- Delete all rows of that table.
- Sleep for a time depending on the row count (you will later see why).
- Get new status counters.
- Compute the status counter differences.
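The exact WHERE clause is not quoted in this article, so the following sketch simply lists the counters discussed below; note also that on MySQL 5.7 with the default show_compatibility_56 = OFF you would query performance_schema.global_status instead:

SELECT variable_name, variable_value
  FROM information_schema.global_status
 WHERE variable_name IN ('HANDLER_DELETE', 'INNODB_ROWS_DELETED',
                         'INNODB_ROWS_READ', 'INNODB_BUFFER_POOL_READ_REQUESTS');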
This experiment reliably reproduced the symptoms which the customers had reported: When the "delete" had been done on the master, the slave became busy for a long time, depending on the row count, and in this time it did not show any progress indicators. This holds for all three versions I checked.
So what did the slave do? It worked as assumed: For each row, it did a full table scan, continuing even after it had found a matching row. This can be easily seen in the status counters, which you will find in the result tables below.
In this article, I will show the results of tests with 200 k rows only, as they are the most impressive. As described above, the table was created anew for each run. Then, 10,000 rows were inserted, and this was repeated 20 times. Each such group had the value of "inc_num" go up from 1 to 10,000.
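The insert statements themselves are not shown in this article; a plausible sketch of one such group of 10,000 rows, assuming a helper table seq_10k holding the numbers 1 to 10,000 (repeat the INSERT 20 times to reach 200 k rows):

-- seq_10k(n) is an assumed helper table with the values 1 .. 10,000
INSERT INTO for_delete (inc_num, clock, bubble)
SELECT n, now(), repeat('Abcd', 60)
  FROM seq_10k;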
In all runs, the status counters "HANDLER_DELETE", "INNODB_ROWS_DELETED", and "INNODB_ROWS_READ" showed the value of 200,000, the table size. So the handler calls did not show any scan, it must be below that interface (= within the handler).
If the server does a table scan, it must access all data pages of the table, so there must be read requests for them to guarantee they are in the buffer pool. And yes, the status counters showed these read requests:

Table 1: No Key or Index
200,000 rows, status counters on       Master        Slave

Version 5.5.44
INNODB_BUFFER_POOL_READ_REQUESTS    2,174,121  380,132,972
Ratio "read_requests" Slave/Master                   174.8
"read_requests" per "rows_deleted"       10.9      1,900.7

Version 5.6.24
INNODB_BUFFER_POOL_READ_REQUESTS    2,318,685  379,762,583
Ratio "read_requests" Slave/Master                   163.8
"read_requests" per "rows_deleted"       11.6      1,898.8

Version 5.7.17
INNODB_BUFFER_POOL_READ_REQUESTS    1,598,745  379,538,695
Ratio "read_requests" Slave/Master                   237.4
"read_requests" per "rows_deleted"        8.0      1,897.7
While master and slave (from a logical point of view) do the same work (delete all rows from a 200 k rows table), the slave accesses many more pages than the master - depending on the version, the factor ranges from 164 to 237. (Note that in version 5.7 the slave didn't do worse than in earlier versions, rather the master did better. This shows clearly in the ratio of "innodb buffer pool read requests" to "rows deleted" on the master: This is about 11 in 5.5 and 5.6, but only 8 in 5.7.)
I won't bother you with the detailed results for 100 k rows. It is enough to say that for half as many rows, the master had half as many read requests, while the slave had a quarter. This means that the effort on the master grows proportional to table size, but on the slave it grows with the square of the table size. Accepting the fact that the slave always scans the full table, this is easy to understand: If the table has twice as many rows, there are twice as many pages to scan, and the number of scans also doubles.
Quadratic growth means that an increase of the table size by a factor ten lets the effort grow by a factor hundred! Now imagine I had tested with a million of rows ...
All this isn't new: In the public MySQL bug database, this is listed as bug #53375. The bug was reported on May 3, 2010, for MySQL 5.1, and it has been closed since December 15, 2011. This does not mean the full table scans were stopped - obviously, they are considered unavoidable, so the bug fix was to write an explaining message into the error log. In my tests, I encountered it only once; here it is:
[Note] The slave is applying a ROW event on behalf of a DELETE statement on table for_delete and is currently taking a considerable amount of time (61 seconds). This is due to the fact that it is scanning an index while looking up records to be processed. Consider adding a primary key (or unique key) to the table to improve performance.

Avoiding the Issue

Of course, this problem is the consequence of sloppy table design: If there were a primary key, it would not occur. But this may not help those who currently have tables without a PK and dare not add one, because it would require prior testing and a maintenance window to alter the table on the master.
Luckily, there are remedies, and even more than the log message (quoted above) mentions: In my tests, I found that the slave will use any index on that table if one is available. (It even uses an index with low selectivity, which may increase the page request count.) I did 5 tests:
- Create an index on "clock" (129 different values).
- Create an index on "inc_num" (10,000 different values).
- Create an index on both "clock" and "inc_num" (all combinations are distinct).
- As before, and declare it as "unique".
- Define the combination of "clock" and "inc_num" to be the primary key.
The "alter table" statement for this was always issued on the slave only, after the inserts, but before the delete.
In all three versions, the slave used any index available. The index on "clock" of course had many different rows per value, so it caused even more page requests than the no-index case: between 2,800 and 3,165 per row deleted. Obviously, this makes the problem even worse. But all other indexes were huge improvements; the primary key of course was the best choice. Here are the numbers:

Table 2: With Added Key or Index
200,000 rows, status counters on     Master        Slave        Slave       Slave       Slave       Slave
"Alter table ..." on slave                      Index on     Index on     Index       Unique I.   Primary key
before "Delete":                                "clock",     "inc_num",   on both     on both     on both
                                                129 values   10,000 v.

Version 5.5.44
INNODB_BUFFER_POOL_READ_REQUESTS  2,174,121  633,132,032   10,197,554   4,429,243   4,421,357   1,810,734
Ratio "read_requests" Slave/Master                 291.2          4.7         2.0         2.0         0.8
"read_requests" per "rows_deleted"     10.9      3,165.7         51.0        22.1        22.1         9.1

Version 5.6.24
INNODB_BUFFER_POOL_READ_REQUESTS  2,318,685  610,482,315   10,327,132   4,375,106   4,073,543   1,794,524
Ratio "read_requests" Slave/Master                 342.1          5.0         2.1         2.0         0.9
"read_requests" per "rows_deleted"     11.6      3,052.4         51.6        21.9        20.4         9.0

Version 5.7.17
INNODB_BUFFER_POOL_READ_REQUESTS  1,598,745  559,398,633    9,531,444   3,859,343   3,836,390   1,720,731
Ratio "read_requests" Slave/Master                 354.8          6.0         2.3         2.5         1.1
"read_requests" per "rows_deleted"      8.0      2,797.0         47.7        19.3        19.2         8.6
But one very important fact should be mentioned in addition to the numbers: The fact that it worked! By adding an index, I made the slave's schema differ from the master's. The primary key even totally changed the B-tree in which the rows are stored, and it made the slave drop the internal row ID which InnoDB had added on the master. Still, replication using the row format could handle these differences without problems.
The Way Out

So the good message is:
- Even if you have a table without indexes or primary key, you can add these on the slave without breaking the replication.
- If you suffer from slow replication on such a table, adding a good index or (even better) the PK will solve this problem.
- In a replication setup, you can improve the schema on the slave and then do a failover, effectively improving it for all accesses without any maintenance window - just the short failover time.
This might help many DBAs who otherwise don't see a chance to improve a bad schema once it is used in production.
Take care!
MySQL and MariaDB authentication against pam_unix
The PAM authentication plug-in is an extension included in MySQL Enterprise Edition (since 5.5) and in MariaDB (since 5.2).
MySQL authentication against pam_unix

Check if the plug-in is available:

# ll lib/plugin/auth*so
-rwxr-xr-x 1 mysql mysql 42937 Sep 18  2015 lib/plugin/authentication_pam.so
-rwxr-xr-x 1 mysql mysql 25643 Sep 18  2015 lib/plugin/auth.so
-rwxr-xr-x 1 mysql mysql 12388 Sep 18  2015 lib/plugin/auth_socket.so
-rwxr-xr-x 1 mysql mysql 25112 Sep 18  2015 lib/plugin/auth_test_plugin.so

Install the PAM plug-in:

mysql> INSTALL PLUGIN authentication_pam SONAME 'authentication_pam.so';

Check the plug-in information:

mysql> SELECT * FROM information_schema.plugins WHERE plugin_name = 'authentication_pam'\G
*************************** 1. row ***************************
           PLUGIN_NAME: authentication_pam
        PLUGIN_VERSION: 1.1
         PLUGIN_STATUS: ACTIVE
           PLUGIN_TYPE: AUTHENTICATION
   PLUGIN_TYPE_VERSION: 1.1
        PLUGIN_LIBRARY: authentication_pam.so
PLUGIN_LIBRARY_VERSION: 1.7
         PLUGIN_AUTHOR: Georgi Kodinov
    PLUGIN_DESCRIPTION: PAM authentication plugin
        PLUGIN_LICENSE: PROPRIETARY
           LOAD_OPTION: ON

It seems like this set-up is persisted and survives a database restart because of this mysql schema table:

mysql> SELECT * FROM mysql.plugin;
+--------------------+-----------------------+
| name               | dl                    |
+--------------------+-----------------------+
| authentication_pam | authentication_pam.so |
+--------------------+-----------------------+

Configuring PAM on Ubuntu/Debian:
#%PAM-1.0
#
# /etc/pam.d/mysql
#
@include common-auth
@include common-account
@include common-session-noninteractive

Create the database user matching the O/S user:

mysql> CREATE USER 'oli'@'localhost' IDENTIFIED WITH authentication_pam AS 'mysql';
mysql> GRANT ALL PRIVILEGES ON test.* TO 'oli'@'localhost';

Verifying the user in the database:

mysql> SELECT user, host, authentication_string FROM `mysql`.`user` WHERE user = 'oli';
+------+-----------+-----------------------+
| user | host      | authentication_string |
+------+-----------+-----------------------+
| oli  | localhost | mysql                 |
+------+-----------+-----------------------+

mysql> SHOW CREATE USER 'oli'@'localhost';
+------------------------------------------------------------------------------------------------------------------------------------+
| CREATE USER for oli@localhost                                                                                                      |
+------------------------------------------------------------------------------------------------------------------------------------+
| CREATE USER 'oli'@'localhost' IDENTIFIED WITH 'authentication_pam' AS 'mysql' REQUIRE NONE PASSWORD EXPIRE DEFAULT ACCOUNT UNLOCK |
+------------------------------------------------------------------------------------------------------------------------------------+

Connection tests:
# mysql --user=oli --host=localhost
ERROR 2059 (HY000): Authentication plugin 'mysql_clear_password' cannot be loaded: plugin not enabled

# mysql --user=oli --host=localhost --enable-cleartext-plugin --password=wrong
ERROR 1045 (28000): Access denied for user 'oli'@'localhost' (using password: YES)

# tail /var/log/auth.log
Feb 13 15:15:14 chef unix_chkpwd[31600]: check pass; user unknown
Feb 13 15:15:14 chef unix_chkpwd[31600]: password check failed for user (oli)

# mysql --user=oli --host=localhost --enable-cleartext-plugin --password=right
ERROR 1045 (28000): Access denied for user 'oli'@'localhost' (using password: YES)

# tail /var/log/auth.log
Feb 13 15:15:40 chef unix_chkpwd[31968]: check pass; user unknown
Feb 13 15:15:40 chef unix_chkpwd[31968]: password check failed for user (oli)

Some research led to the following result: The non-privileged mysql user is not allowed to access the file /etc/shadow, thus it should be added to the group shadow to make it work:
# ll /sbin/unix_chkpwd
-rwxr-sr-x 1 root shadow 35536 Mar 16  2016 /sbin/unix_chkpwd

# usermod -a -G shadow mysql

Connection tests:

# mysql --user=oli --host=localhost --enable-cleartext-plugin --password=right
mysql> SELECT USER(), CURRENT_USER(), @@proxy_user;
+---------------+----------------+--------------+
| USER()        | CURRENT_USER() | @@proxy_user |
+---------------+----------------+--------------+
| oli@localhost | oli@localhost  | NULL         |
+---------------+----------------+--------------+

MariaDB authentication against pam_unix
Check if plug-in is available:
# ll lib/plugin/auth*so -rwxr-xr-x 1 mysql mysql 12462 Nov 4 14:37 lib/plugin/auth_0x0100.so -rwxr-xr-x 1 mysql mysql 33039 Nov 4 14:37 lib/plugin/auth_gssapi_client.so -rwxr-xr-x 1 mysql mysql 80814 Nov 4 14:37 lib/plugin/auth_gssapi.so -rwxr-xr-x 1 mysql mysql 19015 Nov 4 14:37 lib/plugin/auth_pam.so -rwxr-xr-x 1 mysql mysql 13028 Nov 4 14:37 lib/plugin/auth_socket.so -rwxr-xr-x 1 mysql mysql 23521 Nov 4 14:37 lib/plugin/auth_test_plugin.soInstall PAM plug-in:
mysql> INSTALL SONAME 'auth_pam';
Check plug-in information:
mysql> SELECT * FROM information_schema.plugins WHERE plugin_name = 'pam'\G
*************************** 1. row ***************************
           PLUGIN_NAME: pam
        PLUGIN_VERSION: 1.0
         PLUGIN_STATUS: ACTIVE
           PLUGIN_TYPE: AUTHENTICATION
   PLUGIN_TYPE_VERSION: 2.0
        PLUGIN_LIBRARY: auth_pam.so
PLUGIN_LIBRARY_VERSION: 1.11
         PLUGIN_AUTHOR: Sergei Golubchik
    PLUGIN_DESCRIPTION: PAM based authentication
        PLUGIN_LICENSE: GPL
           LOAD_OPTION: ON
       PLUGIN_MATURITY: Stable
   PLUGIN_AUTH_VERSION: 1.0
Configuring PAM on Ubuntu/Debian:
#%PAM-1.0
#
# /etc/pam.d/mysql
#
@include common-auth
@include common-account
@include common-session-noninteractive
Create the database user matching the O/S user:
mysql> CREATE USER 'oli'@'localhost' IDENTIFIED VIA pam USING 'mariadb';
mysql> GRANT ALL PRIVILEGES ON test.* TO 'oli'@'localhost';
Verifying user in the database:
mysql> SELECT user, host, authentication_string FROM `mysql`.`user` WHERE user = 'oli';
+------+-----------+-----------------------+
| user | host      | authentication_string |
+------+-----------+-----------------------+
| oli  | localhost | mariadb               |
+------+-----------+-----------------------+
Connection tests:
# mysql --user=oli --host=localhost --password=wrong
ERROR 2059 (HY000): Authentication plugin 'dialog' cannot be loaded: /usr/local/mysql/lib/plugin/dialog.so: cannot open shared object file: No such file or directory
# tail /var/log/auth.log
Feb 13 17:11:16 chef mysqld: pam_unix(mariadb:auth): unexpected response from failed conversation function
Feb 13 17:11:16 chef mysqld: pam_unix(mariadb:auth): conversation failed
Feb 13 17:11:16 chef mysqld: pam_unix(mariadb:auth): auth could not identify password for [oli]
Feb 13 17:11:16 chef mysqld: pam_winbind(mariadb:auth): getting password (0x00000388)
Feb 13 17:11:16 chef mysqld: pam_winbind(mariadb:auth): Could not retrieve user's password
# mysql --user=oli --host=localhost --password=wrong --plugin-dir=$PWD/lib/plugin
ERROR 1045 (28000): Access denied for user 'oli'@'localhost' (using password: NO)
Feb 13 17:11:30 chef mysqld: pam_unix(mariadb:auth): authentication failure; logname= uid=1001 euid=1001 tty= ruser= rhost= user=oli
Feb 13 17:11:30 chef mysqld: pam_winbind(mariadb:auth): getting password (0x00000388)
Feb 13 17:11:30 chef mysqld: pam_winbind(mariadb:auth): pam_get_item returned a password
Feb 13 17:11:30 chef mysqld: pam_winbind(mariadb:auth): request wbcLogonUser failed: WBC_ERR_AUTH_ERROR, PAM error: PAM_USER_UNKNOWN (10), NTSTATUS: NT_STATUS_NO_SUCH_USER, Error message was: No such user
Add the mysql user to the shadow group:
# ll /sbin/unix_chkpwd
-rwxr-sr-x 1 root shadow 35536 Mar 16  2016 /sbin/unix_chkpwd
# usermod -a -G shadow mysql
Connection tests:
# mysql --user=oli --host=localhost --password=right --plugin-dir=$PWD/lib/plugin
mysql> SELECT USER(), CURRENT_USER(), @@proxy_user;
+---------------+----------------+--------------+
| USER()        | CURRENT_USER() | @@proxy_user |
+---------------+----------------+--------------+
| oli@localhost | oli@localhost  | NULL         |
+---------------+----------------+--------------+
Taxonomy upgrade extras: authentication, pam, security, plugin, plug-in
FromDual Performance Monitor for MySQL and MariaDB 1.0.1 has been released
FromDual has the pleasure to announce the release of the new version 1.0.1 of its popular Database Performance Monitor for MySQL, MariaDB, Galera Cluster and Percona Server, fpmmm.
You can download fpmmm from here.
In the inconceivable case that you find a bug in fpmmm please report it to our Bug-tracker.
Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.
This release contains various bug fixes.
Changes in fpmmm v1.0.1
fpmmm agent
- fpmmm suppresses the server has gone away message to stdout.
- fpmmm now behaves correctly when the database is started after the fpmmm agent.
- MyEnv library synced from MyEnv project.
- Added LaunchDaemon configuration for fpmmm on Mac OS X (darwin).
- Added some more support for Mac OS X (darwin).
For subscriptions for commercial use of fpmmm please get in contact with us.
Taxonomy upgrade extras: mysql, performance, monitor, monitoring, fpmmm, maas, performance monitor, release
Is your MySQL software Cluster ready?
When we do Galera Cluster consulting we always discuss with the customer whether their software is Galera Cluster ready. This basically means: can the software cope with the Galera Cluster specifics?
If it is a software product developed outside of the company we recommend asking the software vendor whether the software supports Galera Cluster or not.
We typically see 3 different answers:
- We do not know. Then they are at least honest.
- Yes, we do support Galera Cluster. Then they hopefully know what they are talking about, but you cannot be sure and should test carefully.
- No, we do not. Then they most probably know what they are talking about.
If the software is developed in-house it becomes a bit more tricky because the responsibility for this statement has to be taken by you or some of your colleagues.
Thus it is good to know the characteristics and limitations of a cluster like Galera Cluster for MySQL.
Most of the Galera restrictions and limitations can be found here.
DDL statements cause TOI operations
DDL and DCL statements (like CREATE, ALTER, TRUNCATE, OPTIMIZE, DROP, GRANT, REVOKE, etc.) are by default executed in Total Order Isolation (TOI), the default Online Schema Upgrade (OSU) method. To apply such a schema change consistently on all nodes, Galera takes a global Cluster lock.
It is obvious that those DDL operations should be short and infrequent so they do not constantly block your Galera Cluster. Changing your table structure must therefore be planned and done carefully to not impact your daily business operation.
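If a long-running schema change cannot be avoided, Galera also offers the Rolling Schema Upgrade (RSU) method, which applies the DDL only locally on the node where it is issued, without a Cluster lock, so it has to be repeated on every node. A minimal sketch, to be tested carefully before use in production (table t1 and column c1 are illustrative):
mysql> SHOW GLOBAL VARIABLES LIKE 'wsrep_OSU_method';
mysql> SET SESSION wsrep_OSU_method = 'RSU';
mysql> ALTER TABLE t1 ADD COLUMN c1 INT;   -- executed on this node only
mysql> SET SESSION wsrep_OSU_method = 'TOI';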
But there are also some not so obvious DDL statements causing TOI operations (and Cluster locks).
- TRUNCATE TABLE ... This operation is NOT a DML statement (like DELETE) but a DDL statement and thus causes a TOI operation with a Cluster lock.
- CREATE TABLE IF NOT EXISTS ... This operation is clearly a DDL statement, but one might think that it does NOT cause a TOI operation if the table already exists. This is wrong: the statement always causes a TOI operation, whether the table exists or not. If you run this statement very frequently it can potentially cause trouble for your Galera Cluster.
- CREATE TABLE yournameit_tmp ... The intention is clear: the developer wants to create a temporary table. But this is NOT a temporary table, just a normal table whose name ends in _tmp. So it causes a TOI operation as well. What you should do in this case is create a real temporary table: CREATE TEMPORARY TABLE yournameit_tmp ... This DDL statement is only executed locally and does not cause a TOI operation.
How to check?
You can check the impact of this problem with the following sequence of statements:
mysql> SHOW GLOBAL STATUS LIKE 'Com_create_table%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| Com_create_table | 4     |
+------------------+-------+
mysql> CREATE TABLE t1_tmp (id INT);
mysql> SHOW GLOBAL STATUS LIKE 'Com_create_table%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| Com_create_table | 5     |   --> Also changes on the Slave nodes!
+------------------+-------+
mysql> CREATE TEMPORARY TABLE t2_tmp (id INT);
mysql> SHOW GLOBAL STATUS LIKE 'Com_create_table%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| Com_create_table | 6     |   --> Does NOT change on the Slave nodes!
+------------------+-------+
mysql> CREATE TABLE IF NOT EXISTS t1_tmp (id INT);
mysql> SHOW GLOBAL STATUS LIKE 'Com_create_table%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| Com_create_table | 7     |   --> Also changes on the Slave nodes!
+------------------+-------+
Find out in advance
If you want to find out, before migrating to Galera Cluster, whether you are affected by this problem, you can either run:
mysql> SHOW GLOBAL STATUS
 WHERE variable_name LIKE 'Com_create%'
    OR variable_name LIKE 'Com_alter%' OR variable_name LIKE 'Com_drop%'
    OR variable_name LIKE 'Com_truncate%' OR variable_name LIKE 'Com_grant%'
    OR variable_name LIKE 'Com_revoke%' OR variable_name LIKE 'Com_optimize%'
    OR variable_name LIKE 'Com_rename%' OR variable_name LIKE 'Uptime';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| Com_create_db        | 2     |
| Com_create_table     | 6     |
| Com_optimize         | 1     |
| Uptime               | 6060  |
+----------------------+-------+
Or, if you want to know exactly who was running the query, from the PERFORMANCE_SCHEMA:
SELECT user, host, SUBSTR(event_name, 15) AS event_name, count_star
  FROM performance_schema.events_statements_summary_by_account_by_event_name
 WHERE count_star > 0
   AND ( event_name LIKE 'statement/sql/create%'
      OR event_name LIKE 'statement/sql/alter%'
      OR event_name LIKE 'statement/sql/drop%'
      OR event_name LIKE 'statement/sql/rename%'
      OR event_name LIKE 'statement/sql/grant%'
      OR event_name LIKE 'statement/sql/revoke%'
      OR event_name LIKE 'statement/sql/optimize%'
      OR event_name LIKE 'statement/sql/truncate%'
      OR event_name LIKE 'statement/sql/repair%'
      OR event_name LIKE 'statement/sql/check%' );
+------+-----------+--------------+------------+
| user | host      | event_name   | count_star |
+------+-----------+--------------+------------+
| root | localhost | create_table | 4          |
| root | localhost | create_db    | 2          |
| root | localhost | optimize     | 1          |
+------+-----------+--------------+------------+
If you need help to make your application Galera Cluster ready we will be glad to assist you.
Taxonomy upgrade extras: Galera Cluster, TOI, DDL, create, temporary table, DCL, drop, alter, truncate
MySQL replication with filtering is dangerous
From time to time we see in customer engagements that MySQL Master/Slave replication is set up with schema or table level replication filtering. This can be done either on the Master or on the Slave. If filtering is done on the Master (by the binlog_{do|ignore}_db settings), the binary log becomes incomplete and cannot be used for a proper Point-in-Time-Recovery. Therefore FromDual recommends AGAINST this approach.
The replication filtering rules vary depending on the binary log format (ROW and STATEMENT). See also: How Servers Evaluate Replication Filtering Rules.
For reasons of data consistency between Master and Slave FromDual recommends to use only the binary log format ROW. This is also stated in the MySQL documentation: "All changes can be replicated. This is the safest form of replication." Especially dangerous is binary log filtering with the binary log format MIXED; FromDual strongly discourages the use of this format.
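Enforcing the ROW format is straightforward; a minimal sketch, to be set at runtime and persisted in the configuration (the configuration file path is an assumption, adjust to your system):
mysql> SET GLOBAL binlog_format = 'ROW';   -- affects new sessions only

# /etc/mysql/my.cnf
[mysqld]
binlog_format = ROW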
The binary log format ROW affects only DML statements (UPDATE, INSERT, DELETE, etc.) but NOT DDL statements (CREATE, ALTER, DROP, etc.) and NOT DCL statements (CREATE USER, GRANT, REVOKE, DROP USER, etc.). So how are those statements replicated? They are replicated in STATEMENT binary log format even though binlog_format is set to ROW. This has the consequence that the binary log filtering rules of STATEMENT based replication, and not the ones of ROW based replication, apply when running one of those DDL or DCL statements.
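You can observe this directly in the binary log; a sketch (the user demo and the binary log file name binlog.000001 are examples, the file name will differ on your system):
mysql> CREATE USER 'demo'@'%';
mysql> SHOW BINLOG EVENTS IN 'binlog.000001';
In the output the CREATE USER appears with an Event_type of Query, i.e. as a statement, whereas an INSERT under binlog_format = ROW appears as Table_map plus Write_rows events.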
This can easily cause problems. If you are lucky, they will cause the replication to break sooner or later, which you can detect and fix - but they may also cause inconsistencies between Master and Slave which may remain undetected for a long time.
Let us show what happens in 2 similar scenarios:
Scenario A: Filtering on the mysql schema
On the Slave we set the replication filter as follows:
replicate_ignore_db = mysql
and verify it:
mysql> SHOW SLAVE STATUS\G
...
Replicate_Ignore_DB: mysql
...
The intention of this filter setting is to not replicate user creations or modifications from the Master to the Slave.
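As an aside: if a schema really must be filtered on the Slave, a table-level wildcard filter is usually the safer choice, because it is evaluated against the actual table name and behaves the same under STATEMENT and ROW based replication (see the filtering rules linked above). A sketch for the Slave's configuration:
# my.cnf on the Slave
[mysqld]
replicate_wild_ignore_table = mysql.%
Statements that reference no table at all (such as CREATE USER) follow separate evaluation rules, so test such a set-up carefully as well.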
On the Master we verify that binlog_format is set to the wanted value:
mysql> SHOW GLOBAL VARIABLES LIKE 'binlog_format';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+
Now we do the following on the Master:
mysql> use mysql
mysql> CREATE USER 'inmysql'@'%';
mysql> use test
mysql> CREATE USER 'intest'@'%';
and verify the result on the Master:
mysql> SELECT user, host FROM mysql.user;
+-----------+-----------+
| user      | host      |
+-----------+-----------+
| inmysql   | %         |
| intest    | %         |
| mysql.sys | localhost |
| root      | localhost |
+-----------+-----------+
and on the Slave:
mysql> SELECT user, host FROM mysql.user;
+-----------+-----------+
| user      | host      |
+-----------+-----------+
| intest    | %         |
| mysql.sys | localhost |
| root      | localhost |
+-----------+-----------+
We see that the user intest was replicated while the user inmysql was not. We clearly have an unwanted data inconsistency between Master and Slave.
If we want to drop the inmysql user some time later on the Master:
mysql> use myapp;
mysql> DROP USER 'inmysql'@'%';
we get the following error message on the Slave and are left wondering why this user or the query appears on the Slave at all:
mysql> SHOW SLAVE STATUS\G
...
Last_SQL_Errno: 1396
Last_SQL_Error: Error 'Operation DROP USER failed for 'inmysql'@'%'' on query. Default database: 'test'. Query: 'DROP USER 'inmysql'@'%''
...
A similar problem happens when we connect to NO database on the Master, as follows, and change the user's password:
shell> mysql -uroot
mysql> SELECT DATABASE();
+------------+
| database() |
+------------+
| NULL       |
+------------+
mysql> ALTER USER 'innone'@'%' IDENTIFIED BY 'secret';
This works perfectly on the Master. But what happens on the Slave:
mysql> SHOW SLAVE STATUS\G
...
Last_SQL_Errno: 1396
Last_SQL_Error: Error 'Operation ALTER USER failed for 'innone'@'%'' on query. Default database: ''. Query: 'ALTER USER 'innone'@'%' IDENTIFIED WITH 'mysql_native_password' AS '*14E65567ABDB5135D0CFD9A70B3032C179A49EE7''
...
The Slave wants to tell us, in a complicated way, that the user innone does not exist on the Slave...
Scenario B: Filtering on a tmp or similar schema
Another scenario we have seen recently: a customer was filtering out tables with temporary data located in the tmp schema. Similar scenarios are cache, session or log tables. He did it as follows on the Master:
mysql> use tmp;
mysql> TRUNCATE TABLE tmp.test;
As he had learned in FromDual trainings, he emptied the table with the TRUNCATE TABLE command instead of the much less efficient DELETE FROM tmp.test. What he did not consider is that TRUNCATE TABLE is a DDL command, not a DML command, and thus the STATEMENT based replication filtering rules apply. His filtering rules on the Slave were as follows:
mysql> SHOW SLAVE STATUS\G
...
Replicate_Ignore_DB: tmp
...
When we do the check on the Master we get an empty set as expected:
mysql> SELECT * FROM tmp.test;
Empty set (0.00 sec)
When we add new data on the Master:
mysql> INSERT INTO tmp.test VALUES (NULL, 'new data', CURRENT_TIMESTAMP());
mysql> SELECT * FROM tmp.test;
+----+----------+---------------------+
| id | data     | ts                  |
+----+----------+---------------------+
|  1 | new data | 2017-01-11 18:00:11 |
+----+----------+---------------------+
we get a different result set on the Slave:
mysql> SELECT * FROM tmp.test;
+----+----------+---------------------+
| id | data     | ts                  |
+----+----------+---------------------+
|  1 | old data | 2017-01-11 17:58:55 |
+----+----------+---------------------+
and in addition the replication stops working with the following error:
mysql> SHOW SLAVE STATUS\G
...
Last_Errno: 1062
Last_Error: Could not execute Write_rows event on table tmp.test; Duplicate entry '1' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log laptop4_qa57master_binlog.000042, end_log_pos 1572
...
See also our earlier bug report on a similar topic: Option "replicate_do_db" does not cause "create table" to replicate ('row' log)
Conclusion
Binary log filtering is extremely dangerous when you care about data consistency, and thus FromDual recommends avoiding binary log filtering by all means. If you really have to do binary log filtering, you should know exactly what you are doing: carefully test your set-up, check your application and your maintenance jobs, and also review your future code changes regularly. Otherwise you risk data inconsistencies in your MySQL Master/Slave replication.
Taxonomy upgrade extras: replication, binary log, filter, filtering, row filtering, statement, binlog_format, row
FromDual Backup and Recovery Manager for MySQL 1.2.4 has been released
FromDual has the pleasure to announce the release of the new version 1.2.4 of the popular Backup and Recovery Manager for MySQL and MariaDB (fromdual_brman).
You can download the FromDual Backup and Recovery Manager from here.
In the inconceivable case that you find a bug in the Backup and Recovery Manager please report it to our Bugtracker or just send us an email.
Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.
Upgrade from 1.2.x to 1.2.4
# cd ${HOME}/product
# tar xf /download/fromdual_brman-1.2.4.tar.gz
# rm -f fromdual_brman
# ln -s fromdual_brman-1.2.4 fromdual_brman
Changes in FromDual Backup Manager 1.2.4
This release contains mainly fixes related to the backup type cleanup.
You can verify your current FromDual Backup Manager version with the following command:
shell> fromdual_bman --version
FromDual Backup Manager
- Library myEnv.inc synced from myEnv project.
- Ignore warning for non-empty backup directories in cleanup.
- Check if the directory exists before going into recursive lookup in cleanup.
- Function rmdir is now also covered by the simulate option in cleanup.
- Backup type cleanup now also considers physical backup and catalog.