MySQL查询优化技术之非SPJ优化


摘要:结合实例探讨mysql查询优化技术中的非SPJ优化


SPJ:Select(查询)、Project(投影)、Join(连接)
非SPJ:SPJ+[Group By、Order By、Distinct、Limit等]

Group By的优化

创建测试数据

CREATE TABLE t_god (a INT, b INT, c INT, d INT, e INT);
CREATE INDEX t_god_idx_1 ON t_god (a,b,c);
CREATE INDEX t_god_idx_2 ON t_god (d);

1、在索引列上执行GROUP BY,支持GROUP BY优化(没有使用“Using file sort”类似的操作进行排序):

root@testdb 10:44:23>EXPLAIN EXTENDED SELECT a FROM t_god GROUP BY a;
+----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+
| 1 | SIMPLE | t_god | index | t_god_idx_1 | t_god_idx_1 | 15 | NULL | 1 | 100.00 | Using index |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

2、在索引列上执行ORDERBY,MySQL支持ORDERBY优化:

root@testdb 10:44:33>EXPLAIN EXTENDED SELECT a FROM t_god ORDER BY a;
+----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+
| 1 | SIMPLE | t_god | index | NULL | t_god_idx_1 | 15 | NULL | 1 | 100.00 | Using index |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

3、在索引列上执行ORDERBY、GROUPBY,MySQL支持ORDERBY优化也支持GROUPBY优化:

root@testdb 11:08:54>EXPLAIN EXTENDED SELECT a FROM t_god GROUP BY a ORDER BY a;
+----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+
| 1 | SIMPLE | t_god | index | t_god_idx_1 | t_god_idx_1 | 15 | NULL | 1 | 100.00 | Using index |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

4、在索引列上执行带有聚集操作的GROUPBY,MySQL支持GROUPBY优化:

root@testdb 11:10:14>EXPLAIN SELECT a, MIN(b) FROM t_god WHERE c>2 GROUP BY a;
+----+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
| 1 | SIMPLE | t_god | index | t_god_idx_1 | t_god_idx_1 | 15 | NULL | 1 | Using where; Using index |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

Order By的优化

创建测试数据

CREATE TABLE t_o1 (a1 INT UNIQUE, b1 INT);
CREATE TABLE t_o2 (a2 INT UNIQUE, b2 INT);

1、在索引列上进行排序操作, MySQL支持利用索引进行排序优化:

root@testdb 11:13:04>EXPLAIN SELECT a1 FROM t_o1 ORDER BY a1;
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | t_o1 | index | NULL | a1 | 5 | NULL | 1 | Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

2、排序下推,MySQL不支持。在非索引列上执行连接,然后排序:

root@testdb 11:16:32>EXPLAIN EXTENDED SELECT * FROM t_o1, t_o2 WHERE b1=b2 ORDER BY b1;
+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | t_o1 | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using temporary; Using filesort |
| 1 | SIMPLE | t_o2 | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
可以看到执行计划中使用了“Using file sort”临时表操作进行排序

group by、order by对同一个Btree索引的前缀部分可以用到索引进行优化

Distinct的优化

1、MySQL支持对于DISTINCT消除的优化技术。在主键的a1列上执行DISTINCT是可以去除distinct的,查询执行计划如下:

root@testdb 11:36:32>explain select distinct a1 from t1;
+----+-------------+-------+-------+----------------+--------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+----------------+--------+---------+------+------+-------------+
| 1 | SIMPLE | t1 | index | PRIMARY,idx_b1 | idx_b1 | 21 | NULL | 9977 | Using index |
+----+-------------+-------+-------+----------------+--------+---------+------+------+-------------+
1 row in set (0.00 sec)

2、MySQL不支持对于DISTINCT推入的优化技术。
a2列是唯一列,又处于反半连接的语义(NOT EXISTS),完全可以把DISTINCT下推到表t_o2中先执行,然后再执行反半连接操作:

root@testdb 11:39:21>EXPLAIN EXTENDED SELECT DISTINCT b1 FROM t_o1 WHERE NOT EXISTS (SELECT 1 FROM t_o2 WHERE b1=a2);
+----+--------------------+-------+------+---------------+------+---------+----------------+------+----------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+------+---------------+------+---------+----------------+------+----------+------------------------------+
| 1 | PRIMARY | t_o1 | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where; Using temporary |
| 2 | DEPENDENT SUBQUERY | t_o2 | ref | a2 | a2 | 5 | testdb.t_o1.b1 | 1 | 100.00 | Using index |
+----+--------------------+-------+------+---------------+------+---------+----------------+------+----------+------------------------------+
2 rows in set, 2 warnings (0.00 sec)
通过优化器处理后的执行计划可以看出,distinct关键字仍然存在:
root@testdb 11:39:33>show warnings;
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1276 | Field or reference 'testdb.t_o1.b1' of SELECT #2 was resolved in SELECT #1 |
| Note | 1003 | /* select#1 */ select distinct `testdb`.`t_o1`.`b1` AS `b1` from `testdb`.`t_o1` where (not(exists(/* select#2 */ select 1 from `testdb`.`t_o2` where (`testdb`.`t_o1`.`b1` = `testdb`.`t_o2`.`a2`)))) |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

LIMIT的优化

1、LIMIT对单表扫描的影响:如果索引扫描可用且花费低于全表扫描,则用索引扫描实现LIMIT(LIMIT取很少量的行,否则优化器更倾向于使用全表扫描)。
2、LIMIT对排序的影响:如果LIMIT和ORDERBY子句协同使用,当取到LIMIT设定个数的有序元组数后,后续的排序操作将不再进行。
3、LIMIT对去重的影响:如果LIMIT和DISTINCT子句协同使用,当取到LIMIT设定个数的唯一的元组数后,后续的去重操作将不再进行。
4、LIMIT受分组的影响:如果LIMIT和GROUPBY子句协同使用,GROUPBY按索引有序计算每个组的总数的过程中,LIMIT操作不必计数直到下一个分组开始计算。
5、LIMIT 0:直接返回空结果集。
6、MySQL支持对不带HAVING子句的LIMIT进行优化。

SET的优化

与集合操作相关的优化:ORDER BY子句去除
union本身就是无序的合并结果集,在子句中加order by操作是无意义的,因此从下面执行计划中可以看出是没有排序操作的:

root@testdb 01:39:50>explain extended (select * from t1 where b1 < 20 order by b1) union (select * from t2 where b2 < 20 order by b2);
+----+--------------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
| 1 | PRIMARY | t1 | range | idx_b1 | idx_b1 | 5 | NULL | 18 | 100.00 | Using where; Using index |
| 2 | UNION | t2 | ALL | NULL | NULL | NULL | NULL | 9999 | 100.00 | Using where |
| NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
3 rows in set, 1 warning (0.01 sec)

如果在order by后面加上limit就无法去除order by子句了。从下面执行计划可以看出存在排序操作(Using filesort)。

root@testdb 01:40:53>explain extended (select * from t1 where b1 < 20 order by b1 limit 5) union (select * from t2 where b2 < 20 order by b2 limit 5);
+----+--------------+------------+-------+---------------+--------+---------+------+------+----------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+------------+-------+---------------+--------+---------+------+------+----------+-----------------------------+
| 1 | PRIMARY | t1 | range | idx_b1 | idx_b1 | 5 | NULL | 18 | 100.00 | Using where; Using index |
| 2 | UNION | t2 | ALL | NULL | NULL | NULL | NULL | 9999 | 100.00 | Using where; Using filesort |
| NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------+------------+-------+---------------+--------+---------+------+------+----------+-----------------------------+
3 rows in set, 1 warning (0.01 sec)

其他优化:

1、SELECT DISTINCT a FROM t1 LIMIT 1;
2、SELECT DISTICT MAX(a) FROM t1;

文章目录
  1. 1. Group By的优化
  2. 2. Order By的优化
  3. 3. Distinct的优化
  4. 4. LIMIT的优化
  5. 5. SET的优化
  6. 6. 其他优化:
|