摘要:结合实例探讨mysql查询优化技术中的非SPJ优化
SPJ:Select(查询)、Project(投影)、Join(连接)
非SPJ:SPJ+[Group By、Order By、Distinct、Limit等]
Group By的优化
创建测试数据
CREATE TABLE t_god (a INT, b INT, c INT, d INT, e INT); CREATE INDEX t_god_idx_1 ON t_god (a,b,c); CREATE INDEX t_god_idx_2 ON t_god (d);
|
1、在索引列上执行GROUP BY,支持GROUP BY优化(没有使用“Using file sort”类似的操作进行排序):
root@testdb 10:44:23>EXPLAIN EXTENDED SELECT a FROM t_god GROUP BY a; +----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+ | 1 | SIMPLE | t_god | index | t_god_idx_1 | t_god_idx_1 | 15 | NULL | 1 | 100.00 | Using index | +----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+ 1 row in set, 1 warning (0.00 sec)
|
2、在索引列上执行ORDERBY,MySQL支持ORDERBY优化:
root@testdb 10:44:33>EXPLAIN EXTENDED SELECT a FROM t_god ORDER BY a; +----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+ | 1 | SIMPLE | t_god | index | NULL | t_god_idx_1 | 15 | NULL | 1 | 100.00 | Using index | +----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+ 1 row in set, 1 warning (0.00 sec)
|
3、在索引列上执行ORDERBY、GROUPBY,MySQL支持ORDERBY优化也支持GROUPBY优化:
root@testdb 11:08:54>EXPLAIN EXTENDED SELECT a FROM t_god GROUP BY a ORDER BY a; +----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+ | 1 | SIMPLE | t_god | index | t_god_idx_1 | t_god_idx_1 | 15 | NULL | 1 | 100.00 | Using index | +----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+ 1 row in set, 1 warning (0.00 sec)
|
4、在索引列上执行带有聚集操作的GROUPBY,MySQL支持GROUPBY优化:
root@testdb 11:10:14>EXPLAIN SELECT a, MIN(b) FROM t_god WHERE c>2 GROUP BY a; +----+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+ | 1 | SIMPLE | t_god | index | t_god_idx_1 | t_god_idx_1 | 15 | NULL | 1 | Using where; Using index | +----+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+ 1 row in set (0.00 sec)
|
Order By的优化
创建测试数据
CREATE TABLE t_o1 (a1 INT UNIQUE, b1 INT); CREATE TABLE t_o2 (a2 INT UNIQUE, b2 INT);
|
1、在索引列上进行排序操作, MySQL支持利用索引进行排序优化:
root@testdb 11:13:04>EXPLAIN SELECT a1 FROM t_o1 ORDER BY a1; +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+ | 1 | SIMPLE | t_o1 | index | NULL | a1 | 5 | NULL | 1 | Using index | +----+-------------+-------+-------+---------------+------+---------+------+------+-------------+ 1 row in set (0.00 sec)
|
2、排序下推,MySQL不支持。在非索引列上执行连接,然后排序:
root@testdb 11:16:32>EXPLAIN EXTENDED SELECT * FROM t_o1, t_o2 WHERE b1=b2 ORDER BY b1; +----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+ | 1 | SIMPLE | t_o1 | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using temporary; Using filesort | | 1 | SIMPLE | t_o2 | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where; Using join buffer (Block Nested Loop) | +----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+ 2 rows in set, 1 warning (0.00 sec) 可以看到执行计划中使用了“Using file sort”临时表操作进行排序
|
group by、order by对同一个Btree索引的前缀部分可以用到索引进行优化
Distinct的优化
1、MySQL支持对于DISTINCT消除的优化技术。在主键的a1列上执行DISTINCT是可以去除distinct的,查询执行计划如下:
root@testdb 11:36:32>explain select distinct a1 from t1; +----+-------------+-------+-------+----------------+--------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+----------------+--------+---------+------+------+-------------+ | 1 | SIMPLE | t1 | index | PRIMARY,idx_b1 | idx_b1 | 21 | NULL | 9977 | Using index | +----+-------------+-------+-------+----------------+--------+---------+------+------+-------------+ 1 row in set (0.00 sec)
|
2、MySQL不支持对于DISTINCT推入的优化技术。
a2列是唯一列,又处于反半连接的语义(NOT EXISTS),完全可以把DISTINCT下推到表t_o2中先执行,然后再执行反半连接操作:
root@testdb 11:39:21>EXPLAIN EXTENDED SELECT DISTINCT b1 FROM t_o1 WHERE NOT EXISTS (SELECT 1 FROM t_o2 WHERE b1=a2); +----+--------------------+-------+------+---------------+------+---------+----------------+------+----------+------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------+---------------+------+---------+----------------+------+----------+------------------------------+ | 1 | PRIMARY | t_o1 | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where; Using temporary | | 2 | DEPENDENT SUBQUERY | t_o2 | ref | a2 | a2 | 5 | testdb.t_o1.b1 | 1 | 100.00 | Using index | +----+--------------------+-------+------+---------------+------+---------+----------------+------+----------+------------------------------+ 2 rows in set, 2 warnings (0.00 sec) 通过优化器处理后的执行计划可以看出,distinct关键字仍然存在: root@testdb 11:39:33>show warnings; +-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Note | 1276 | Field or reference 'testdb.t_o1.b1' of SELECT #2 was resolved in SELECT #1 | | Note | 1003 | /* select#1 */ select distinct `testdb`.`t_o1`.`b1` AS `b1` from `testdb`.`t_o1` where (not(exists(/* select#2 */ select 1 from `testdb`.`t_o2` where (`testdb`.`t_o1`.`b1` = `testdb`.`t_o2`.`a2`)))) | +-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 2 rows in set (0.00 sec)
|
LIMIT的优化
1、LIMIT对单表扫描的影响:如果索引扫描可用且花费低于全表扫描,则用索引扫描实现LIMIT(LIMIT取很少量的行,否则优化器更倾向于使用全表扫描)。
2、LIMIT对排序的影响:如果LIMIT和ORDERBY子句协同使用,当取到LIMIT设定个数的有序元组数后,后续的排序操作将不再进行。
3、LIMIT对去重的影响:如果LIMIT和DISTINCT子句协同使用,当取到LIMIT设定个数的唯一的元组数后,后续的去重操作将不再进行。
4、LIMIT受分组的影响:如果LIMIT和GROUPBY子句协同使用,GROUPBY按索引有序计算每个组的总数的过程中,LIMIT操作不必计数直到下一个分组开始计算。
5、LIMIT 0:直接返回空结果集。
6、MySQL支持对不带HAVING子句的LIMIT进行优化。
SET的优化
与集合操作相关的优化:ORDER BY子句去除
union本身就是无序的合并结果集,在子句中加order by操作是无意义的,因此从下面执行计划中可以看出是没有排序操作的:
root@testdb 01:39:50>explain extended (select * from t1 where b1 < 20 order by b1) union (select * from t2 where b2 < 20 order by b2); +----+--------------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+ | 1 | PRIMARY | t1 | range | idx_b1 | idx_b1 | 5 | NULL | 18 | 100.00 | Using where; Using index | | 2 | UNION | t2 | ALL | NULL | NULL | NULL | NULL | 9999 | 100.00 | Using where | | NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary | +----+--------------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+ 3 rows in set, 1 warning (0.01 sec)
|
如果在order by后面加上limit就无法去除order by子句了。从下面执行计划可以看出存在排序操作(Using filesort)。
root@testdb 01:40:53>explain extended (select * from t1 where b1 < 20 order by b1 limit 5) union (select * from t2 where b2 < 20 order by b2 limit 5); +----+--------------+------------+-------+---------------+--------+---------+------+------+----------+-----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------+------------+-------+---------------+--------+---------+------+------+----------+-----------------------------+ | 1 | PRIMARY | t1 | range | idx_b1 | idx_b1 | 5 | NULL | 18 | 100.00 | Using where; Using index | | 2 | UNION | t2 | ALL | NULL | NULL | NULL | NULL | 9999 | 100.00 | Using where; Using filesort | | NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary | +----+--------------+------------+-------+---------------+--------+---------+------+------+----------+-----------------------------+ 3 rows in set, 1 warning (0.01 sec)
|
其他优化:
1、SELECT DISTINCT a FROM t1 LIMIT 1;
2、SELECT DISTICT MAX(a) FROM t1;