You are here
SQL Query Tuning - Performance
How could the following SQL queries be improved performance wise and otherwise and can you also explain why your change is more optimal?
Please consider, when testing, that your results are not confused by the Query Cache or by reading data from your I/O system which are an order of magnitude faster (Query Cache) or slower (I/O) than the in-memory behaviour.
Table of Contents
- Covering function
LIKE
search- Covering function twice
- Compare 2 indexed columns
- Function and 2 indexed columns
- Time series problem
- Wrong built index or query
- OR optimization
- Nested subquery
DELETE
- Wrong casting
- Non normalized join column
- Optimizer is not considering ICP
- OR optimization with Materialized View
SQL Query Tuning - Covering function
Difficulty: Very easy.
Query:
SELECT * FROM bills WHERE UNIX_TIMESTAMP(due_date) < 1601560170 ;
Table and data:
CREATE TABLE bills ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY , amount DECIMAL (10,2) NOT NULL , due_date DATETIME NOT NULL , customer_id SMALLINT UNSIGNED NOT NULL , KEY `due_date` (`due_date`) );
INSERT INTO bills SELECT NULL, ROUND(RAND()*10000, 2), FROM_UNIXTIME(UNIX_TIMESTAMP()+RAND()*86400*30), ROUND(RAND()*10000, 0); INSERT INTO bills SELECT NULL, ROUND(RAND()*10000, 2), FROM_UNIXTIME(UNIX_TIMESTAMP()+RAND()*86400*30), ROUND(RAND()*10000, 0) FROM bills; -- Repeat this command about 14 times
Query Execution Plan (EXPLAIN
) before improvement:
+------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | bills | ALL | NULL | NULL | NULL | NULL | 16300 | Using where | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
Query runtime before improvement: 11 - 12 ms.
Query Execution Plan (EXPLAIN
) after improvement:
+------+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------+ | 1 | SIMPLE | bills | range | due_date | due_date | 5 | NULL | 1 | Using index condition | +------+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------+
Query runtime after improvement: About 1 ms.
SQL Query Tuning - LIKE
search
Difficulty: Very easy to pretty costly.
Query:
SELECT * FROM customers WHERE email LIKE '%hans.meier%' ;
Table and data:
CREATE TABLE customers ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY , first_name VARCHAR(63) , last_name VARCHAR(127) , email VARCHAR(255) , zip_code VARCHAR(9) , city VARCHAR(63) , KEY `email` (`email`) , INDEX (city, last_name, first_name) );
INSERT INTO customers SELECT NULL, MD5(RAND()), MD5(RAND()), CONCAT(CRC32(RAND()), '@', MD5(RAND())), ROUND(RAND()*9999, 0), CRC32(RAND()) ; INSERT INTO customers SELECT NULL, MD5(RAND()), MD5(RAND()), CONCAT(CRC32(RAND()), '@', MD5(RAND())), ROUND(RAND()*9999, 0), CRC32(RAND()) FROM customers; -- Repeat this command about 14 times
Query Execution Plan (EXPLAIN
) before improvement:
+------+-------------+-----------+------+---------------+------+---------+------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-----------+------+---------------+------+---------+------+--------+-------------+ | 1 | SIMPLE | customers | ALL | NULL | NULL | NULL | NULL | 324988 | Using where | +------+-------------+-----------+------+---------------+------+---------+------+--------+-------------+
Query runtime before improvement: About 150 ms.
Query Execution Plan (EXPLAIN
) after improvement:
+------+-------------+-----------+-------+---------------+-------+---------+------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-----------+-------+---------------+-------+---------+------+------+-----------------------+ | 1 | SIMPLE | customers | range | email | email | 258 | NULL | 1 | Using index condition | +------+-------------+-----------+-------+---------------+-------+---------+------+------+-----------------------+
Query runtime after improvement: About 1 ms.
SQL Query Tuning - Covering function twice
Difficulty: Very easy to pretty costly.
Query:
SELECT * FROM customers WHERE LOWER(email) = LOWER('hans.meier@fromdual.com') ;
Table and data: See above.
Query Execution Plan (EXPLAIN
) before improvement:
+------+-------------+-----------+------+---------------+------+---------+------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-----------+------+---------------+------+---------+------+--------+-------------+ | 1 | SIMPLE | customers | ALL | NULL | NULL | NULL | NULL | 519800 | Using where | +------+-------------+-----------+------+---------------+------+---------+------+--------+-------------+
Query runtime before improvement: About 140 ms.
Query Execution Plan (EXPLAIN
) after improvement:
+------+-------------+-----------+------+---------------+-------+---------+-------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-----------+------+---------------+-------+---------+-------+------+-----------------------+ | 1 | SIMPLE | customers | ref | email | email | 258 | const | 1 | Using index condition | +------+-------------+-----------+------+---------------+-------+---------+-------+------+-----------------------+
Query runtime after improvement: About 1 ms.
SQL Query Tuning - Compare 2 indexed columns
Difficulty: Medium.
Query:
SELECT COUNT(*) FROM cdrs WHERE start >= end ;
Table and data:
CREATE TABLE `cdrs` ( `uniqueid` varchar(40) NOT NULL, `callid` varchar(40) NOT NULL, `asteriskid` varchar(20) NOT NULL DEFAULT '', `machine` int(11) NOT NULL DEFAULT 0, `status` varchar(15) NOT NULL DEFAULT '', `start` TIMESTAMP(6) NOT NULL, `end` TIMESTAMP(6) NOT NULL, `scustomer` int(11) NOT NULL DEFAULT 0, `stype` varchar(30) NOT NULL DEFAULT '', `snumber` varchar(255) NOT NULL DEFAULT '', `dcustomer` int(11) NOT NULL DEFAULT 0, `dtype` varchar(30) NOT NULL DEFAULT '', `dnumber` varchar(255) NOT NULL DEFAULT '', PRIMARY KEY (`uniqueid`), KEY `server` (`callid`), KEY `start` (`start`), KEY `scustomer` (`scustomer`,`start`), KEY `dcustomer` (`dcustomer`,`start`), KEY `asteriskid` (`asteriskid`) );
INSERT INTO cdrs SELECT UUID(), MD5(RAND()), '', 0, '', FROM_UNIXTIME(ROUND(RAND()*UNIX_TIMESTAMP(), 6)), 0, 0, '', '', 0, '', ''; INSERT INTO cdrs SELECT UUID(), MD5(RAND()), '', 0, '', FROM_UNIXTIME(ROUND(RAND()*UNIX_TIMESTAMP(), 6)), 0, 0, '', '', 0, '', '' FROM cdrs; -- Repeat this command about 20 times... UPDATE cdrs SET end = FROM_UNIXTIME(UNIX_TIMESTAMP(start)+RAND()*3600);
Query Execution Plan (EXPLAIN
) before improvement:
+------+-------------+-------+------+---------------+------+---------+------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+--------+-------------+ | 1 | SIMPLE | cdrs | ALL | NULL | NULL | NULL | NULL | 520304 | Using where | +------+-------------+-------+------+---------------+------+---------+------+--------+-------------+
Query runtime before improvement: About 110 ms.
Query Execution Plan (EXPLAIN
) after improvement:
+------+-------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+ | 1 | SIMPLE | cdrs | index | NULL | start_2 | 16 | NULL | 260250 | Using where; Using index | +------+-------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
Query runtime after improvement: About 105 ms. Not so much improvement achieved. But should be come much better when you have memory pressure.
SQL Query Tuning - Function and 2 indexed columns
Difficulty: Medium.
Query:
SELECT uniqueid FROM cdrs WHERE CURRENT_TIMESTAMP() BETWEEN start AND end ;
Table and data: See above.
Query Execution Plan (EXPLAIN
) before improvement:
+------+-------------+-------+------+---------------+------+---------+------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+--------+-------------+ | 1 | SIMPLE | cdrs | ALL | start | NULL | NULL | NULL | 520304 | Using where | +------+-------------+-------+------+---------------+------+---------+------+--------+-------------+
Query runtime before improvement: About 120 ms.
Query Execution Plan (EXPLAIN
) after improvement:
+------+-------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+ | 1 | SIMPLE | cdrs | range | start,start_2 | start_2 | 7 | NULL | 260152 | Using where; Using index | +------+-------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
Query runtime after improvement: About 140 ms. Worse than original. Depends heavily on data distribution!
SQL Query Tuning - Time series problem
Difficulty: High.
Query:
SELECT * FROM history_of_int_values WHERE item_id = 42 AND timestamp BETWEEN '2020-10-01 00:00:00' AND '2010-10-31 23:59:59' ;
Table and data:
CREATE TABLE `history_of_int_values` ( id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, `item_id` bigint(20) unsigned NOT NULL, `timestamp` timestamp(6) NOT NULL, `value` bigint(20) signed NOT NULL DEFAULT '0', UNIQUE KEY (`item_id`,`timestamp`), KEY (timestamp) ) ENGINE = InnoDB;
INSERT INTO history_of_int_values SELECT NULL, ROUND(RAND()*100, 0), FROM_UNIXTIME(UNIX_TIMESTAMP(CURRENT_TIMESTAMP(6))-(86400*365*rand())), ROUND(RAND()*1000000, 0); INSERT IGNORE INTO history_of_int_values SELECT NULL, ROUND(RAND()*100, 0), FROM_UNIXTIME(UNIX_TIMESTAMP(CURRENT_TIMESTAMP(6))-(86400*365*rand())), ROUND(RAND()*1000000, 0) FROM history_of_int_values;
Query Execution Plan (EXPLAIN
) before improvement:
+------+-------------+-----------------------+-------+-------------------+---------+---------+------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-----------------------+-------+-------------------+---------+---------+------+------+-----------------------+ | 1 | SIMPLE | history_of_int_values | range | item_id,timestamp | item_id | 15 | NULL | 859 | Using index condition | +------+-------------+-----------------------+-------+-------------------+---------+---------+------+------+-----------------------+
Query runtime before improvement: 11 ms.
Query Execution Plan (EXPLAIN
) after improvement:
+------+-------------+-----------------------+-------+-------------------+---------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-----------------------+-------+-------------------+---------+---------+------+------+-------------+ | 1 | SIMPLE | history_of_int_values | range | PRIMARY,timestamp | PRIMARY | 15 | NULL | 859 | Using where | +------+-------------+-----------------------+-------+-------------------+---------+---------+------+------+-------------+
Query runtime after improvement: About 1 ms. Has dramatic impact if you run out of RAM!
SQL Query Tuning - Wrong built index or query
Difficulty: Medium.
Query:
SELECT * FROM customers WHERE last_name = 'Maier' AND first_name = 'Hans' ;
Table and data: See above.
Query Execution Plan (EXPLAIN
) before improvement:
+------+-------------+-----------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-----------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | customers | ALL | NULL | NULL | NULL | NULL | 65044 | Using where | +------+-------------+-----------+------+---------------+------+---------+------+-------+-------------+
Query runtime before improvement: About 20 ms.
Here we have 2 different possible solutions which come to my mind:
a) Query Execution Plan (EXPLAIN
) after improvement:
+------+-------------+-----------+------+---------------+-----------+---------+-------------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-----------+------+---------------+-----------+---------+-------------+------+-----------------------+ | 1 | SIMPLE | customers | ref | last_name | last_name | 196 | const,const | 1 | Using index condition | +------+-------------+-----------+------+---------------+-----------+---------+-------------+------+-----------------------+
a) Query runtime after improvement: About 1 ms.
b) Query Execution Plan (EXPLAIN
) after improvement:
+------+-------------+-----------+------+---------------+------+---------+-------------------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-----------+------+---------------+------+---------+-------------------+------+-----------------------+ | 1 | SIMPLE | customers | ref | city | city | 262 | const,const,const | 1 | Using index condition | +------+-------------+-----------+------+---------------+------+---------+-------------------+------+-----------------------+
b) Query runtime after improvement: About 1 ms.
SQL Query Tuning - OR optimization
Difficulty: Medium.
Query:
SELECT * FROM customers WHERE last_name = 'Meier' OR first_name = 'Hans' ;
Table and data: See above.
Query Execution Plan (EXPLAIN
) before improvement:
+------+-------------+-----------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-----------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | customers | ALL | NULL | NULL | NULL | NULL | 65044 | Using where | +------+-------------+-----------+------+---------------+------+---------+------+-------+-------------+
Query runtime before improvement: About 20 ms.
Here we have 2 different possible solutions which come to my mind:
a) Query Execution Plan (EXPLAIN
) after improvement:
+------+-------------+-----------+-------------+----------------------+----------------------+---------+------+------+------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-----------+-------------+----------------------+----------------------+---------+------+------+------------------------------------------------+ | 1 | SIMPLE | customers | index_merge | last_name,first_name | last_name,first_name | 130,66 | NULL | 2 | Using union(last_name,first_name); Using where | +------+-------------+-----------+-------------+----------------------+----------------------+---------+------+------+------------------------------------------------+
a) Query runtime after improvement: About 2 ms.
b) Query Execution Plan (EXPLAIN
) after improvement:
+------+--------------+------------+------+---------------+------------+---------+-------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------+------------+------+---------------+------------+---------+-------+------+-----------------------+ | 1 | PRIMARY | customers | ref | last_name | last_name | 130 | const | 1 | Using index condition | | 2 | UNION | customers | ref | first_name | first_name | 66 | const | 1 | Using index condition | | NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | | +------+--------------+------------+------+---------------+------------+---------+-------+------+-----------------------+
b) Query runtime after improvement: About 1 ms.
Nested subquery DELETE
Difficulty: Medium.
Comment: Problem does NOT happen with equivalent SELECT
! It happens up to MySQL 5.7 and MariaDB 10.7. It seems to be fixed in the MySQL 8.0 Optimizer. UPDATE
behaves the same like the DELETE
statement:
Query:
DELETE FROM test WHERE id IN (SELECT id FROM t2) ;
Query Execution Plan (EXPLAIN
) before improvement:
+------+--------------------+-------+-----------------+---------------+---------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------------+-------+-----------------+---------------+---------+---------+------+---------+-------------+ | 1 | PRIMARY | test | ALL | NULL | NULL | NULL | NULL | 8021228 | Using where | | 2 | DEPENDENT SUBQUERY | t2 | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | Using index | +------+--------------------+-------+-----------------+---------------+---------+---------+------+---------+-------------+
Query runtime before improvement: 9.859 sec
Query Execution Plan (EXPLAIN
) after improvement:
+------+-------------+-------+--------+---------------+---------+---------+------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+--------+---------------+---------+---------+------------+------+-------------+ | 1 | SIMPLE | t2 | index | PRIMARY | PRIMARY | 4 | NULL | 6 | Using index | | 1 | SIMPLE | test | eq_ref | PRIMARY | PRIMARY | 4 | test.t2.id | 1 | | +------+-------------+-------+--------+---------------+---------+---------+------------+------+-------------+
Query runtime after improvement: 0.001 sec
See also MDEV-22248 and MDEV-22415. Should be fixed in MariaDB 10.8.
Wrong casting
CREATE TABLE `order` ( id INT UNSIGNED NOT NULL AUTO_INCREMENT , order_id VARCHAR(64) , data VARCHAR(255) NULL DEFAULT 'Blabla' , PRIMARY KEY (id) , UNIQUE INDEX (order_id) ); INSERT INTO `order` SELECT NULL, NULL FROM dual; INSERT INTO `order` SELECT NULL, NULL FROM `order`; -- up to 256k rows UPDATE `order` SET order_id = id; EXPLAIN SELECT * FROM `order` WHERE order_id BETWEEN 238900 AND 238999 ; +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | 1 | SIMPLE | order | ALL | order_id | NULL | NULL | NULL | 2085934 | Using where | +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+
Execution time: 800 milliseconds
Optimized Query:
+------+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------+ | 1 | SIMPLE | order | range | order_id | order_id | 67 | NULL | 999 | Using index condition | +------+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------+
Execution time: 10 milliseconds (f=80!)
See also: MDEV-26729
Non normalized join column
CREATE TABLE t1 ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY , data VARCHAR(64) , ts TIMESTAMP ); INSERT INTO t1 SELECT NULL, 'Some dummy data', NOW(); INSERT INTO t1 SELECT NULL, 'Some dummy data', NOW() FROM t1; -- Up to 128k rows CREATE TABLE t2 ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY , fld VARCHAR(64) , KEY (fld) ); INSERT INTO t2 SELECT NULL, ROUND(RAND()*100000, 0); INSERT INTO t2 SELECT NULL, ROUND(RAND()*100000, 0) FROM t2; -- Up to 512k rows EXPLAIN SELECT * FROM t1 JOIN t2 ON SUBSTR(t2.fld, 1, 8) = t1.id WHERE t1.id BETWEEN 100 and 1000 ; +------+-------------+-------+--------+---------------+---------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+--------+---------------+---------+---------+------+---------+-------------+ | 1 | SIMPLE | t2 | index | NULL | fld | 67 | NULL | 1046834 | Using index | | 1 | SIMPLE | t1 | eq_ref | PRIMARY | PRIMARY | 4 | func | 1 | Using where | +------+-------------+-------+--------+---------------+---------+---------+------+---------+-------------+
Execution time: 450 ms
+------+-------------+-------+-------+---------------+---------+---------+------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+---------+---------+------------+------+-------------+ | 1 | SIMPLE | t1 | range | PRIMARY | PRIMARY | 4 | NULL | 448 | Using where | | 1 | SIMPLE | t2 | ref | vnum | vnum | 5 | test.t1.id | 5 | | +------+-------------+-------+-------+---------------+---------+---------+------------+------+-------------+
Execution time: 23 ms
Optimizer is not considering ICP
CREATE TABLE test ... INSERT INTO test SELECT NULL, 'Some data which are not relevant', NOW(); INSERT INTO test SELECT NULL, 'Some data which are not relevant', NOW() FROM test; -- Repeat this step until you have 1 Mio rows ALTER TABLE test ADD INDEX (ts); EXPLAIN SELECT * FROM test WHERE ts = '2024-03-13 10:35:23'; +------+-------------+-------+------+---------------+------+---------+-------+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+-------+------+-------+ | 1 | SIMPLE | test | ref | ts | ts | 5 | const | 1000 | | +------+-------------+-------+------+---------------+------+---------+-------+------+-------+ EXPLAIN SELECT * FROM test WHERE ts = CAST('2024-03-13 10:35:23' AS DATETIME); +------+-------------+-------+------+---------------+------+---------+-------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+-------+------+-----------------------+ | 1 | SIMPLE | test | ref | ts | ts | 5 | const | 1000 | Using index condition | +------+-------------+-------+------+---------------+------+---------+-------+------+-----------------------+
The second case should be faster but is in simple cases slightly slower. Should have a significant impact if data are located on disk.
OR optimization with Materialized View
CREATE TABLE `t1` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `data1` varchar(128) DEFAULT NULL, `data2` varchar(128) DEFAULT NULL, PRIMARY KEY (`id`) ); INSERT INTO t1 SELECT NULL, 'Some data which are not relevant', NOW(); INSERT INTO t1 SELECT NULL, 'Some data which are not relevant', NOW() FROM t1; ... UPDATE t1 SET data1 = MD5(id); UPDATE t1 SET data2 = MD5(data1); ALTER TABLE t1 ADD INDEX(data1), ADD INDEX (data2); EXPLAIN SELECT * FROM t1 WHERE data1 = 'c4ca4238a0b923820dcc509a6f75849b' OR data2 = '8a6dbf554746d73e62de6e21509b6ee2' ; +------+-------------+-------+-------------+---------------+-------------+---------+------+------+---------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------------+---------------+-------------+---------+------+------+---------------------------------------+ | 1 | SIMPLE | t1 | index_merge | data1,data2 | data1,data2 | 131,131 | NULL | 2 | Using union(data1,data2); Using where | +------+-------------+-------+-------------+---------------+-------------+---------+------+------+---------------------------------------+
Pretty fast, possibly faster if index on (data1, data2)
. Now the real problem:
CREATE table t2 LIKE t1; ALTER TABLE t2 ADD COLUMN t1_id INT UNSIGNED NOT NULL; INSERT INTO t2 SELECT id, data1, data2, id FROM t1; ALTER TABLE t2 ADD INDEX (t1_id); EXPLAIN SELECT t1.id, t1.data1, t2.data2 FROM t1 JOIN t2 ON t1.id = t2.t1_id WHERE t1.data1 = 'c4ca4238a0b923820dcc509a6f75849b' OR t2.data2 = '8a6dbf554746d73e62de6e21509b6ee2' ; +------+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+ | 1 | SIMPLE | t2 | ALL | data2,t1_id | NULL | NULL | NULL | 1042503 | | | 1 | SIMPLE | t1 | eq_ref | PRIMARY,data1 | PRIMARY | 4 | test.t2.t1_id | 1 | Using where | +------+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+
Horribly slow! Run time: 3.0 sec because of full table scan on table t2. Optimizer cannot do index_merge
operation on 2 different tables.
Solution 1:
EXPLAIN SELECT t1.id, t1.data1, t2.data2 FROM t1 JOIN t2 ON t1.id = t2.t1_id WHERE t1.data1 = 'c4ca4238a0b923820dcc509a6f75849b' UNION SELECT t1.id, t1.data1, t2.data2 FROM t1 JOIN t2 ON t1.id = t2.t1_id WHERE t2.data2 = '8a6dbf554746d73e62de6e21509b6ee2' ; +------+--------------+------------+--------+---------------+---------+---------+---------------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------+------------+--------+---------------+---------+---------+---------------+------+--------------------------+ | 1 | PRIMARY | t1 | ref | PRIMARY,data1 | data1 | 131 | const | 1 | Using where; Using index | | 1 | PRIMARY | t2 | ref | t1_id | t1_id | 4 | test.t1.id | 1 | | | 2 | UNION | t2 | ref | data2,t1_id | data2 | 131 | const | 1 | Using index condition | | 2 | UNION | t1 | eq_ref | PRIMARY | PRIMARY | 4 | test.t2.t1_id | 1 | | | NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | | +------+--------------+------------+--------+---------------+---------+---------+---------------+------+--------------------------+
Much faster. Run time: 0.001 sec. If this cannot be done because of application reasons try solution 2:
CREATE TABLE t1t2mv AS SELECT t1.id, t1.data1, t2.data2 FROM t1 JOIN t2 ON t1.id = t2.t1_id; ALTER TABLE t1t2mv ADD PRIMARY KEY (id), ADD INDEX (data1), ADD INDEX (data2); EXPLAIN SELECT id, data1, data2 FROM t1t2mv WHERE data1 = 'c4ca4238a0b923820dcc509a6f75849b' OR data2 = '8a6dbf554746d73e62de6e21509b6ee2' ; +------+-------------+--------+-------------+---------------+-------------+---------+------+------+---------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+--------+-------------+---------------+-------------+---------+------+------+---------------------------------------+ | 1 | SIMPLE | t1t2mv | index_merge | data1,data2 | data1,data2 | 131,131 | NULL | 2 | Using union(data1,data2); Using where | +------+-------------+--------+-------------+---------------+-------------+---------+------+------+---------------------------------------+
Run time: 0.004 sec.
See also: