You are here
Impact of column types on MySQL JOIN performance
In our MySQL trainings and consulting engagements we tell our customers always to use the smallest possible data type to get better query performance. Especially for the JOIN
columns. This advice is supported as well by the MySQL documentation in the chapter Optimizing Data Types:
Use the most efficient (smallest) data types possible. MySQL has many specialized types that save disk space and memory. For example, use the smaller integer types if possible to get smaller tables. MEDIUMINT
is often a better choice than INT
because a MEDIUMINT
column uses 25% less space.
I remember somewhere the JOIN
columns where explicitly mentioned but I cannot find it any more.
Test set-up
To get numbers we have created a little test set-up:
CREATE TABLE `a` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT , `data` varchar(64) DEFAULT NULL , `ts` timestamp NOT NULL , PRIMARY KEY (`id`) ) ENGINE=InnoDB CHARSET=latin1 |
CREATE TABLE `b` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT , `data` varchar(64) DEFAULT NULL , `ts` timestamp NOT NULL , `a_id` int(10) unsigned DEFAULT NULL , PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
1048576 rows | 16777216 rows |
EXPLAIN SELECT * FROM a JOIN b ON b.a_id = a.id WHERE a.id BETWEEN 10000 AND 15000; +----+-------------+-------+--------+---------------+---------+---------+-------------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+---------------+---------+---------+-------------+----------+-------------+ | 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 16322446 | Using where | | 1 | SIMPLE | a | eq_ref | PRIMARY | PRIMARY | 4 | test.b.a_id | 1 | NULL | +----+-------------+-------+--------+---------------+---------+---------+-------------+----------+-------------+
And yes: I know this query could be more optimal by setting an index on b.a_id
.
Results
The whole workload was executed completely in memory and thus CPU bound (we did not want to measure the speed of our I/O system).
SE | JOIN column | bytes | query time | Gain | Space | Character set | |
---|---|---|---|---|---|---|---|
InnoDB | MEDIUMINT | 3 | 5.28 s | 96% | 4% faster | 75% | |
InnoDB | INT | 4 | 5.48 s | 100% | 100% | 100% | |
InnoDB | BIGINT | 8 | 5.65 s | 107% | 7% slower | 200% | |
InnoDB | NUMERIC(7, 2) | ~4 | 6.77 s | 124% | 24% slower | ~100% | |
InnoDB | VARCHAR(7) | 7-8 | 6.44 s | 118% | 18% slower | ~200% | latin1 |
InnoDB | VARCHAR(16) | 7-8 | 6.44 s | 118% | 18% slower | ~200% | latin1 |
InnoDB | VARCHAR(32) | 7-8 | 6.42 s | 118% | 18% slower | ~200% | latin1 |
InnoDB | VARCHAR(128) | 7-8 | 6.46 s | 118% | 18% slower | ~200% | latin1 |
InnoDB | VARCHAR(256) | 8-9 | 6.17 s | 114% | 14% slower | ~225% | latin1 |
InnoDB | VARCHAR(16) | 7-8 | 6.96 s | 127% | 27% slower | ~200% | utf8 |
InnoDB | VARCHAR(128) | 7-8 | 6.82 s | 124% | 24% slower | ~200% | utf8 |
InnoDB | CHAR(16) | 16 | 6.85 s | 125% | 25% slower | 400% | latin1 |
InnoDB | CHAR(128) | 128 | 9.68 s | 177% | 77% slower | 3200% | latin1 |
InnoDB | TEXT | 8-9 | 10.7 s | 195% | 95% slower | ~225% | latin1 |
MyISAM | INT | 4 | 3.16 s | 58% | 42% faster | ||
TokuDB | INT | 4 | 4.52 s | 82% | 18% faster |
Some comments to the tests:
- MySQL 5.6.13 was used for most of the tests.
- TokuDB v7.1.0 was tested with MySQL 5.5.30.
- As results the optimistic cases were taken. In reality the results can be slightly worse.
- We did not take into consideration that bigger data types will eventually cause more I/O which is very slow!
Commands
ALTER TABLE a CONVERT TO CHARACTER SET latin1; ALTER TABLE b CONVERT TO CHARACTER SET latin1; ALTER TABLE a MODIFY COLUMN id INT UNSIGNED NOT NULL; ALTER TABLE b MODIFY COLUMN a_id INT UNSIGNED NOT NULL;
- Shinguz's blog
- Log in or register to post comments
Comments
5.5 vs 5.6 single threaded performance
MySQL slower over time?
Hi Morgan,
Thanks!
I have planed such a test already a while ago but did not find the time to do it yet...
In my mind I have some numbers showing that since MySQL 4.0 single query performance has decreased significantly...
This leads to the question: Where are all the MySQL branches and forks heading to and is this in the interest of the majority of the MySQL users or just in the interest of a few ones like Facebook, LinkeId, Google, Booking.com etc...?
Regards,
Oli
MediumInt same size as INT ?
Hi Oli,
thanks for the nice comparison chart !
In InnoDB, isn't (Tiny|Small|Medium)-int internally aligned to 4 bytes, and thus stored as 4bytes int ?
How many iterations did you run the query ?
Thanks in advance!
/Joffrey
InnoDB 4 byte alignment
Hi Joffrey,
The easy answer first: 5 - 15 iterations until it got a stable response time and then I took the most optimistic value. Not very scientific, I know... But I think good enough for a reliable statement.
About InnoDB 4 byte alignment. No, it is not.
Some proves beside consulting various documentations from InnoDB table monitor:
And from any
hexdump
utility:NDB Alignment
Hi Joffrey,
I think NDB tables use 4-byte alignment not InnoDB:
NDB
tables use 4-byte alignment; allNDB
data storage is done in multiples of 4 bytes. Thus, a column value that would typically take 15 bytes requires 16 bytes in anNDB
table. For example, inNDB
tables, theTINYINT
,SMALLINT
,MEDIUMINT
, andINTEGER
(INT
) column types each require 4 bytes storage per record due to the alignment factor.Check this manual page for reference.
Thanks,
Abdel-Mawla