本文是关于mysql的连接消除技术:外连接消除、连接消除、嵌套连接消除。
连接消除技术包括三种:外连接消除、连接消除、嵌套连接消除 1、外连接消除 外连接消除就是把外连接变为内连接。之所以要消除外连接,是因为查询优化器在处理外连接操作时所消耗的时间多于内连接,因此消除外连接后可以加快查询速度。同时可以减少不必要的I/O开销,加快算法执行速度。 创建测试表并插入数据:CREATE TABLE t_1 (t_1_id INT UNIQUE, t_1_col_1 INT, t_1_col_2 VARCHAR(10));
CREATE TABLE t_2 (t_2_id INT UNIQUE, t_2_col_1 INT, t_2_col_2 VARCHAR(10));
INSERT INTO t_1 VALUES (1, 11, 't_1_1'); INSERT INTO t_1 VALUES (2, 12, NULL);
INSERT INTO t_1 VALUES (3, NULL, 't_1_3'); INSERT INTO t_1 VALUES (4, 14, 't_1_4');
INSERT INTO t_1 VALUES (5, 15, NULL); INSERT INTO t_1 VALUES (7, NULL, NULL);
INSERT INTO t_2 VALUES (1, 11, 't_2_1'); INSERT INTO t_2 VALUES (2, NULL, 't_2_2');
INSERT INTO t_2 VALUES (3, 13, NULL); INSERT INTO t_2 VALUES (4, 14, 't_2_4');
INSERT INTO t_2 VALUES (6, 16, 't_2_6'); INSERT INTO t_2 VALUES (7, NULL, NULL);
root@testdb 02:21:55>select * from t_1;
+--------+-----------+-----------+
| t_1_id | t_1_col_1 | t_1_col_2 |
+--------+-----------+-----------+
| 1 | 11 | t_1_1 |
| 2 | 12 | NULL |
| 3 | NULL | t_1_3 |
| 4 | 14 | t_1_4 |
| 5 | 15 | NULL |
| 7 | NULL | NULL |
+--------+-----------+-----------+
6 rows in set (0.00 sec)
root@testdb 02:22:08>select * from t_2;
+--------+-----------+-----------+
| t_2_id | t_2_col_1 | t_2_col_2 |
+--------+-----------+-----------+
| 1 | 11 | t_2_1 |
| 2 | NULL | t_2_2 |
| 3 | 13 | NULL |
| 4 | 14 | t_2_4 |
| 6 | 16 | t_2_6 |
| 7 | NULL | NULL |
+--------+-----------+-----------+
6 rows in set (0.00 sec)
语句一:使用TRUE作为ON的子句,WHERE子句包括连接条件且WHERE条件可以保证从结果中排除外连接右侧(右表)生成的值为NULL的行。root@testdb 02:48:44>SELECT * FROM t_1 LEFT JOIN t_2 ON true WHERE t_1_id = t_2_id;
+--------+-----------+-----------+--------+-----------+-----------+
| t_1_id | t_1_col_1 | t_1_col_2 | t_2_id | t_2_col_1 | t_2_col_2 |
+--------+-----------+-----------+--------+-----------+-----------+
| 1 | 11 | t_1_1 | 1 | 11 | t_2_1 |
| 2 | 12 | NULL | 2 | NULL | t_2_2 |
| 3 | NULL | t_1_3 | 3 | 13 | NULL |
| 4 | 14 | t_1_4 | 4 | 14 | t_2_4 |
| 7 | NULL | NULL | 7 | NULL | NULL |
+--------+-----------+-----------+--------+-----------+-----------+
5 rows in set (0.00 sec)
root@testdb 02:22:11>explain extended SELECT * FROM t_1 LEFT JOIN t_2 ON true WHERE t_1_id = t_2_id;show warnings;
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------+
| 1 | SIMPLE | t_1 | ALL | t_1_id | NULL | NULL | NULL | 6 | 100.00 | NULL |
| 1 | SIMPLE | t_2 | ref | t_2_id | t_2_id | 5 | testdb.t_1.t_1_id | 1 | 100.00 | NULL |
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select `testdb`.`t_1`.`t_1_id` AS `t_1_id`,`testdb`.`t_1`.`t_1_col_1` AS `t_1_col_1`,`testdb`.`t_1`.`t_1_col_2` AS `t_1_col_2`,`testdb`.`t_2`.`t_2_id` AS `t_2_id`,`testdb`.`t_2`.`t_2_col_1` AS `t_2_col_1`,`testdb`.`t_2`.`t_2_col_2` AS `t_2_col_2` from `testdb`.`t_1` join `testdb`.`t_2` where (`testdb`.`t_2`.`t_2_id` = `testdb`.`t_1`.`t_1_id`) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
从执行计划看到被优化器处理后的语句由left join优化成了join。
语句二:使用ON子句包括连接条件,并不能保证从结果中排除外连接右侧(右表)生成的值为NULL的行。root@testdb 02:53:25>SELECT * FROM t_1 LEFT JOIN t_2 ON t_1_id = t_2_id;
+--------+-----------+-----------+--------+-----------+-----------+
| t_1_id | t_1_col_1 | t_1_col_2 | t_2_id | t_2_col_1 | t_2_col_2 |
+--------+-----------+-----------+--------+-----------+-----------+
| 1 | 11 | t_1_1 | 1 | 11 | t_2_1 |
| 2 | 12 | NULL | 2 | NULL | t_2_2 |
| 3 | NULL | t_1_3 | 3 | 13 | NULL |
| 4 | 14 | t_1_4 | 4 | 14 | t_2_4 |
| 5 | 15 | NULL | NULL | NULL | NULL |
| 7 | NULL | NULL | 7 | NULL | NULL |
+--------+-----------+-----------+--------+-----------+-----------+
6 rows in set (0.01 sec)
root@testdb 03:06:26>explain extended SELECT * FROM t_1 LEFT JOIN t_2 ON t_1_id = t_2_id;show warnings;
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------+
| 1 | SIMPLE | t_1 | ALL | NULL | NULL | NULL | NULL | 6 | 100.00 | NULL |
| 1 | SIMPLE | t_2 | ref | t_2_id | t_2_id | 5 | testdb.t_1.t_1_id | 1 | 100.00 | NULL |
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select `testdb`.`t_1`.`t_1_id` AS `t_1_id`,`testdb`.`t_1`.`t_1_col_1` AS `t_1_col_1`,`testdb`.`t_1`.`t_1_col_2` AS `t_1_col_2`,`testdb`.`t_2`.`t_2_id` AS `t_2_id`,`testdb`.`t_2`.`t_2_col_1` AS `t_2_col_1`,`testdb`.`t_2`.`t_2_col_2` AS `t_2_col_2` from `testdb`.`t_1` left join `testdb`.`t_2` on((`testdb`.`t_1`.`t_1_id` = `testdb`.`t_2`.`t_2_id`)) where 1 |
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
可以看到这个结果集第5行的右侧结果集存在NULL的情况,其执行计划依然是left join,是没办法优化为join的。
语句三:使用ON和WHERE子句包括连接条件,可以保证从结果中排除外连接右侧(右表)生成的值为NULL的行。root@testdb 03:06:42>SELECT * FROM t_1 LEFT JOIN t_2 ON t_1_id = t_2_id WHERE t_1_id = t_2_id;
+--------+-----------+-----------+--------+-----------+-----------+
| t_1_id | t_1_col_1 | t_1_col_2 | t_2_id | t_2_col_1 | t_2_col_2 |
+--------+-----------+-----------+--------+-----------+-----------+
| 1 | 11 | t_1_1 | 1 | 11 | t_2_1 |
| 2 | 12 | NULL | 2 | NULL | t_2_2 |
| 3 | NULL | t_1_3 | 3 | 13 | NULL |
| 4 | 14 | t_1_4 | 4 | 14 | t_2_4 |
| 7 | NULL | NULL | 7 | NULL | NULL |
+--------+-----------+-----------+--------+-----------+-----------+
5 rows in set (0.00 sec)
root@testdb 03:11:24>explain extended SELECT * FROM t_1 LEFT JOIN t_2 ON t_1_id = t_2_id WHERE t_1_id = t_2_id;
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------+
| 1 | SIMPLE | t_1 | ALL | t_1_id | NULL | NULL | NULL | 6 | 100.00 | NULL |
| 1 | SIMPLE | t_2 | ref | t_2_id | t_2_id | 5 | testdb.t_1.t_1_id | 1 | 100.00 | NULL |
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)
root@testdb 03:11:38>show warnings;
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select `testdb`.`t_1`.`t_1_id` AS `t_1_id`,`testdb`.`t_1`.`t_1_col_1` AS `t_1_col_1`,`testdb`.`t_1`.`t_1_col_2` AS `t_1_col_2`,`testdb`.`t_2`.`t_2_id` AS `t_2_id`,`testdb`.`t_2`.`t_2_col_1` AS `t_2_col_1`,`testdb`.`t_2`.`t_2_col_2` AS `t_2_col_2` from `testdb`.`t_1` join `testdb`.`t_2` where (`testdb`.`t_2`.`t_2_id` = `testdb`.`t_1`.`t_1_id`) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
可以看到语句三同语句一是一样的,被优化器处理后的语句由left join优化成了join。
语句四:当外表的索引列出现在WHERE子句中,可以看到依然是left join无法被优化。root@testdb 03:11:38>EXPLAIN EXTENDED SELECT * FROM t_1 LEFT JOIN t_2 ON t_1_id = t_2_id WHERE t_1_id>0;show warnings;
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------------+
| 1 | SIMPLE | t_1 | ALL | t_1_id | NULL | NULL | NULL | 6 | 100.00 | Using where |
| 1 | SIMPLE | t_2 | ref | t_2_id | t_2_id | 5 | testdb.t_1.t_1_id | 1 | 100.00 | NULL |
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
注:左向外连接的结果集包括左表(t_1)所有的行,不仅是连接所匹配的行,还包括左表某行在右表中没有匹配的行用NULL表示。
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select `testdb`.`t_1`.`t_1_id` AS `t_1_id`,`testdb`.`t_1`.`t_1_col_1` AS `t_1_col_1`,`testdb`.`t_1`.`t_1_col_2` AS `t_1_col_2`,`testdb`.`t_2`.`t_2_id` AS `t_2_id`,`testdb`.`t_2`.`t_2_col_1` AS `t_2_col_1`,`testdb`.`t_2`.`t_2_col_2` AS `t_2_col_2` from `testdb`.`t_1` left join `testdb`.`t_2` on((`testdb`.`t_2`.`t_2_id` = `testdb`.`t_1`.`t_1_id`)) where (`testdb`.`t_1`.`t_1_id` > 0) |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
语句五:当内表的索引列出现在WHERE子句中,left join可以被优化join。root@testdb 03:21:22>EXPLAIN EXTENDED SELECT * FROM t_1 LEFT JOIN t_2 ON t_1_id = t_2_id WHERE t_2_id>0;show warnings;
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------------+
| 1 | SIMPLE | t_1 | ALL | t_1_id | NULL | NULL | NULL | 6 | 100.00 | Using where |
| 1 | SIMPLE | t_2 | ref | t_2_id | t_2_id | 5 | testdb.t_1.t_1_id | 1 | 100.00 | NULL |
+----+-------------+-------+------+---------------+--------+---------+-------------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select `testdb`.`t_1`.`t_1_id` AS `t_1_id`,`testdb`.`t_1`.`t_1_col_1` AS `t_1_col_1`,`testdb`.`t_1`.`t_1_col_2` AS `t_1_col_2`,`testdb`.`t_2`.`t_2_id` AS `t_2_id`,`testdb`.`t_2`.`t_2_col_1` AS `t_2_col_1`,`testdb`.`t_2`.`t_2_col_2` AS `t_2_col_2` from `testdb`.`t_1` join `testdb`.`t_2` where ((`testdb`.`t_2`.`t_2_id` = `testdb`.`t_1`.`t_1_id`) and (`testdb`.`t_1`.`t_1_id` > 0)) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
综上实例可以知道外连接优化的本质:语义上是外连接,但WHER条件使得外连接可以蜕化为内连接
2、连接消除 连接消除就是去掉不必要的连接对象,减少连接操作。 准备测试数据:root@testdb 03:44:52>select * from t1 limit 1;
+----+------+
| a1 | b1 |
+----+------+
| 1 | 1 |
+----+------+
1 row in set (0.00 sec)
root@testdb 03:45:05>select * from t2 limit 1;
+----+------+
| a2 | b2 |
+----+------+
| 1 | 1 |
+----+------+
1 row in set (0.00 sec)
root@testdb 03:45:10>select * from t3 limit 1;
+----+------+
| a3 | b3 |
+----+------+
| 1 | 1 |
+----+------+
1 row in set (0.00 sec)
语句1:唯一键/主键作为连接条件,三表内连接可以去掉中间表(mysql优化器不支持,需人工去掉)root@testdb 03:47:24>select t1.*,t3.* from t1 join t2 on (a1=a2) join t3 on (a2=a3) limit 1;
+----+------+----+------+
| a1 | b1 | a3 | b3 |
+----+------+----+------+
| 1 | 1 | 1 | 1 |
+----+------+----+------+
1 row in set (0.00 sec)
root@testdb 04:04:24>explain extended select t1.*,t3.* from t1 join t2 on (a1=a2) join t3 on (a2=a3) limit 1;show warnings;
+----+-------------+-------+--------+---------------+---------+---------+--------------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+--------+---------------+---------+---------+--------------+------+----------+-------------+
| 1 | SIMPLE | t1 | index | PRIMARY | idx_b1 | 21 | NULL | 9977 | 100.00 | Using index |
| 1 | SIMPLE | t2 | eq_ref | PRIMARY | PRIMARY | 4 | testdb.t1.a1 | 1 | 100.00 | Using index |
| 1 | SIMPLE | t3 | eq_ref | PRIMARY | PRIMARY | 4 | testdb.t1.a1 | 1 | 100.00 | NULL |
+----+-------------+-------+--------+---------------+---------+---------+--------------+------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select `testdb`.`t1`.`a1` AS `a1`,`testdb`.`t1`.`b1` AS `b1`,`testdb`.`t3`.`a3` AS `a3`,`testdb`.`t3`.`b3` AS `b3` from `testdb`.`t1` join `testdb`.`t2` join `testdb`.`t3` where ((`testdb`.`t2`.`a2` = `testdb`.`t1`.`a1`) and (`testdb`.`t3`.`a3` = `testdb`.`t1`.`a1`)) limit 1 |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
可以看到这个语句中的t2表仅仅作为连接条件,查询目标列没有t2的列,它属于多余的对象,但是优化器依然去连接它。
我们实际工作中可以人工去掉t2表,改写为:
root@testdb 03:47:40>select t1.*,t3.* from t1 join t3 on (a1=a3) limit 1;
+----+------+----+------+
| a1 | b1 | a3 | b3 |
+----+------+----+------+
| 1 | 1 | 1 | 1 |
+----+------+----+------+
1 row in set (0.00 sec)
语句2:可去除的表,除了作为连接对象外,不出现在任何子句中比如:root@testdb 03:48:00>select max(b1) from t1,t2;
+---------+
| max(b1) |
+---------+
| 9999 |
+---------+
1 row in set (2.74 sec)
可以去掉t2:
root@testdb 03:54:56>select max(b1) from t1;
+---------+
| max(b1) |
+---------+
| 9999 |
+---------+
1 row in set (0.00 sec)
还有select a1 from t1,t2 group by a1可以改为select a1 from t1 group by a1等类似写法
3、嵌套连接消除 连接存在多个层次,用括号标识连接的优先次序。嵌套连接消除,就是消除嵌套的连接层次,把多个层次的连接减少为较少层次的连接,尽量“扁平化”。 创建测试表及数据:CREATE TABLE B (b1 INT, b2 VARCHAR(9));
CREATE TABLE A (a1 INT, a2 VARCHAR(9));
CREATE TABLE C (c1 INT, c2 VARCHAR(9));
INSERT INTO B VALUES(1, 'B1'), (NULL, 'B2'), (31, 'B31'), (32, 'B32'), (NULL, 'B4'),(5, 'B5'), (6, 'B6');
INSERT INTO A VALUES(1, 'A1'), (null, 'A2'), (NULL, 'A31'), (32, 'A32'), (4, 'A4'), (5, 'A5'), (NULL, 'A6');
INSERT INTO C VALUES(1, 'C1'), (NULL, 'C2'), (31, 'C31'), (NULL, 'C32'), (4, 'C4'), (NULL, 'C5'),(6, 'A6');
语句示例:root@testdb 04:17:29>EXPLAIN EXTENDED SELECT * FROM A JOIN (B JOIN C ON B.b1=C.c1) ON A.a1=B.b1
-> WHERE A.a1 > 1;
+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 7 | 100.00 | Using where |
| 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 7 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | C | ALL | NULL | NULL | NULL | NULL | 7 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
3 rows in set, 1 warning (0.00 sec)
root@testdb 04:18:20>show warnings;
+-------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select `testdb`.`a`.`a1` AS `a1`,`testdb`.`a`.`a2` AS `a2`,`testdb`.`b`.`b1` AS `b1`,`testdb`.`b`.`b2` AS `b2`,`testdb`.`c`.`c1` AS `c1`,`testdb`.`c`.`c2` AS `c2` from `testdb`.`a` join `testdb`.`b` join `testdb`.`c` where ((`testdb`.`b`.`b1` = `testdb`.`a`.`a1`) and (`testdb`.`c`.`c1` = `testdb`.`a`.`a1`) and (`testdb`.`a`.`a1` > 1)) |
+-------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
从该SQL语义上看,是希望先执行B和C连接,然后再和A连接。但是执行计划上看到连接顺序依次是先A和B再和C,说明mysql优化器是支持对嵌套连接的消除的。
针对三种连接技术做个简单的总结:
连接消除是去掉被连接的某个对象。
外连接消除是去掉外连接的语义使其变为内连接。
嵌套连接消除是一种连接的语义顺序的变化。