You are here
Disadvantages of explicitly NOT using InnoDB Primary Keys?
We recently had the case with one of our customers where we got externally generated random hash values (up to 70 bytes) and they were used as Primary Keys in InnoDB.
As we know, this is not a very good idea because the size of all secondary indexes becomes large and because a random hash value as a Primary Key gives us a bad locality of our rows in the table 1.
If we do not specify a Primary Key and have no Unique Key InnoDB generates a hidden Clustered Index based on the Row ID which is a 6 byte field.
This Row ID is increasing monotonically 1. That would fit our needs.
So we were thinking about getting completely rid of any Primary Key of that InnoDB table.
This has the following advantages:
- It is sorted by time (more or less).
- The Primary Key is shorter (6 bytes instead of 30-70 bytes + something).
- We do not run into the
AUTO_INCREMENT
bottleneck of Bug #16979 2 in MySQL 5.0 2, 3. Yes we are still running on 5.0. - We exclude possible other
AUTO_INCREMENT
contention situations.
To get more opinions about this idea I was writing to some of my colleagues from the vibrant MySQL Community and asked for their opinion.
And I got a lot of feedback. Please find below the summary:
- An implicit Primary Key (= no PK) causes a Full Table Scan on the Slave for every row that is being deleted or updated when Row Based Replication (RBR) is in use 4. The reason for this behaviour is that the implicit Primary Key is not sent to the Slave. And thus the Slave does not know what has to be changed.
- MySQL Bug #16979 2: This bug still exists in 5.1 if you use multi row insert. The 5.1 fix for single row inserts is very good. The problem completely goes away for all inserts with Row-based-replication.
- You lose the ability to use tools like on-line schema change 5.
- Checking for table drift is harder 6.
- You have no way to write a
WHERE
clause that actually identifies a row.
Conclusion
InnoDB tables WITHOUT a Primary Key on a Master/Slave replication and RBR (5.1) seems to be definitely a nogo. For all other use cases one has to try out and benchmark. And one should only use this mean when there are big benefits.
So: Do not do InnoDB tables without a Primary Key except you know exactly what you are doing...
Thanks to all who replied personally to my question!
With the following query you can find the tables not having a Primary Key:
SELECT t.TABLE_SCHEMA, t.TABLE_NAME, t.ENGINE FROM information_schema.tables AS t LEFT JOIN information_schema.TABLE_CONSTRAINTS AS tc ON tc.TABLE_SCHEMA = t.TABLE_SCHEMA AND tc.TABLE_NAME = t.TABLE_NAME AND tc.CONSTRAINT_TYPE = 'PRIMARY KEY' WHERE t.TABLE_TYPE = 'BASE TABLE' AND tc.CONSTRAINT_TYPE IS NULL AND t.TABLE_SCHEMA != 'mysql' ; +--------------+---------------------+--------+ | TABLE_SCHEMA | TABLE_NAME | ENGINE | +--------------+---------------------+--------+ | cms | antispam_spam_marks | MyISAM | | cms | search_dataset | MyISAM | | cms | search_index | MyISAM | | test | test | InnoDB | | zabbix | history | InnoDB | | zabbix | history_str | InnoDB | | zabbix | history_uint | InnoDB | | zabbix | node_cksum | InnoDB | +--------------+---------------------+--------+
By the way. The execution plan of the above query does not look too good! :-(
Sources
- 1 Clustered and Secondary Indexes
- 2 MySQL Bug #16979:
AUTO_INC
lock in InnoDB works a table level lock - 3
AUTO_INCREMENT
Handling in InnoDB - 4 MySQL Bug #53375: Too easy to decommission a slave with RBR and unindexed table on master
- 5 Online Schema Change for MySQL
- 6 Maatkit: Table Checksum, alternative: pt-table-checksum
- 7 Kristian Köhntopp, 23 August 2020: MySQL: GIPK (InnoDB and Primary Keys)
- 8 Jeremy Cole, 2 May 2013: How does InnoDB behave without a Primary Key?
- 9 Ovais Tariq, 18 October 2013: InnoDB scalability issues due to tables without primary keys
- 10 FromDual, 20 January 2022: Learning from the Bugs Database
- 11 Jörg Brühe, 2 May 2017: How the Lack of a Primary Key May Effectively Stop the Slave
- 12 FromDual: Tables without a Primary Key
- 13 FromDual, 16 August 2012: Deadlocks, indexing and Primary Key's
- Shinguz's blog
- Log in or register to post comments
Comments
Alternative query