Disadvantages of explicitly NOT using InnoDB Primary Keys?

Mon, 2010-09-20 15:36 — Shinguz

We recently had the case with one of our customers where we got externally generated random hash values (up to 70 bytes) and they were used as Primary Keys in InnoDB.

As we know, this is not a very good idea because the size of all secondary indexes becomes large and because a random hash value as a Primary Key gives us a bad locality of our rows in the table ¹.

If we do not specify a Primary Key and have no Unique Key InnoDB generates a hidden Clustered Index based on the Row ID which is a 6 byte field.

This Row ID is increasing monotonically ¹. That would fit our needs.

So we were thinking about getting completely rid of any Primary Key of that InnoDB table.

This has the following advantages:

It is sorted by time (more or less).
The Primary Key is shorter (6 bytes instead of 30-70 bytes + something).
We do not run into the AUTO_INCREMENT bottleneck of Bug #16979 ² in MySQL 5.0 ², ³. Yes we are still running on 5.0.
We exclude possible other AUTO_INCREMENT contention situations.

To get more opinions about this idea I was writing to some of my colleagues from the vibrant MySQL Community and asked for their opinion.

And I got a lot of feedback. Please find below the summary:

An implicit Primary Key (= no PK) causes a Full Table Scan on the Slave for every row that is being deleted or updated when Row Based Replication (RBR) is in use ⁴. The reason for this behaviour is that the implicit Primary Key is not sent to the Slave. And thus the Slave does not know what has to be changed.
MySQL Bug #16979 ²: This bug still exists in 5.1 if you use multi row insert. The 5.1 fix for single row inserts is very good. The problem completely goes away for all inserts with Row-based-replication.
You lose the ability to use tools like on-line schema change ⁵.
Checking for table drift is harder ⁶.
You have no way to write a WHERE clause that actually identifies a row.

Conclusion

InnoDB tables WITHOUT a Primary Key on a Master/Slave replication and RBR (5.1) seems to be definitely a nogo. For all other use cases one has to try out and benchmark. And one should only use this mean when there are big benefits.

So: Do not do InnoDB tables without a Primary Key except you know exactly what you are doing...

Thanks to all who replied personally to my question!

With the following query you can find the tables not having a Primary Key:

SELECT t.TABLE_SCHEMA, t.TABLE_NAME, t.ENGINE
  FROM information_schema.tables AS t
  LEFT JOIN information_schema.TABLE_CONSTRAINTS AS tc
         ON tc.TABLE_SCHEMA = t.TABLE_SCHEMA
        AND tc.TABLE_NAME = t.TABLE_NAME
        AND tc.CONSTRAINT_TYPE = 'PRIMARY KEY'
 WHERE t.TABLE_TYPE = 'BASE TABLE'
   AND tc.CONSTRAINT_TYPE IS NULL
   AND t.TABLE_SCHEMA != 'mysql'
;
+--------------+---------------------+--------+
| TABLE_SCHEMA | TABLE_NAME          | ENGINE |
+--------------+---------------------+--------+
| cms          | antispam_spam_marks | MyISAM |
| cms          | search_dataset      | MyISAM |
| cms          | search_index        | MyISAM |
| test         | test                | InnoDB |
| zabbix       | history             | InnoDB |
| zabbix       | history_str         | InnoDB |
| zabbix       | history_uint        | InnoDB |
| zabbix       | node_cksum          | InnoDB |
+--------------+---------------------+--------+

By the way. The execution plan of the above query does not look too good! :-(

Sources

¹ Clustered and Secondary Indexes
² MySQL Bug #16979: AUTO_INC lock in InnoDB works a table level lock
³ AUTO_INCREMENT Handling in InnoDB
⁴ MySQL Bug #53375: Too easy to decommission a slave with RBR and unindexed table on master
⁵ Online Schema Change for MySQL
⁶ Maatkit: Table Checksum, alternative: pt-table-checksum
⁷ Kristian Köhntopp, 23 August 2020: MySQL: GIPK (InnoDB and Primary Keys)
⁸ Jeremy Cole, 2 May 2013: How does InnoDB behave without a Primary Key?
⁹ Ovais Tariq, 18 October 2013: InnoDB scalability issues due to tables without primary keys
¹⁰ FromDual, 20 January 2022: Learning from the Bugs Database
¹¹ Jörg Brühe, 2 May 2017: How the Lack of a Primary Key May Effectively Stop the Slave
¹² FromDual: Tables without a Primary Key
¹³ FromDual, 16 August 2012: Deadlocks, indexing and Primary Key's

Taxonomy upgrade extras:

innodb

primary key

locality

Shinguz's blog
Log in or register to post comments

Comments

Alternative query

SELECT t.table_schema, t.table_name, t.engine
  FROM information_schema.tables AS t
  LEFT JOIN information_schema.key_column_usage AS i ON t.table_schema = i.table_schema AND t.table_name = i.table_name AND i.constraint_name = 'PRIMARY'
 WHERE i.constraint_name IS NULL
   AND t.table_schema NOT IN ('information_schema', 'mysql')
;

oli

Thu, 2012-08-02 09:40

Languages

Login

You are here

Disadvantages of explicitly NOT using InnoDB Primary Keys?

Conclusion

Sources

Comments

Alternative query

Navigation

Trademarks

Languages

Login

Search form

You are here

Disadvantages of explicitly NOT using InnoDB Primary Keys?

Conclusion

Sources

Comments

Alternative query

Navigation

Trademarks