暂无图片
暂无图片
3
暂无图片
暂无图片
暂无图片

MySQL8.0 optimizer_switch变化

原创 CuiHulong 2023-01-18
1984

Optimizer_switch变量是支持对优化器行为的控制。是一组值标志,每个标志都有一个on或off的值,以指示是否启用或禁用相应的行为。

MySQL8.0里除了熟悉的hash join重大变化之外,其他方面也有优化。

mysql> SHOW VARIABLES LIKE 'OPTIMIZER_SWITCH'\G; *************************** 1. row *************************** Variable_name: optimizer_switch Value: index_merge=on,index_merge_union=on,index_merge_sort_union=on, index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on, mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off, materialization=on,semijoin=on,loosescan=on,firstmatch=on, duplicateweedout=on,subquery_materialization_cost_based=on, use_index_extensions=on,condition_fanout_filter=on,derived_merge=on, ##下面是目前5.7 Vs 8.0差异内容## use_invisible_indexes=off,skip_scan=on,hash_join=on, subquery_to_derived=off,prefer_ordering_index=on, hypergraph_optimizer=off,derived_condition_pushdown=on

use_invisible_indexes

MySQL8.0 开始支持隐藏索引,有时往往因为索引不合理,执行计划太差,需要大动干戈,现在不需要进行破坏性的更改,优化器方面配合是否使用不可见索引来构建查询执行计划.默认是关闭。当然这里主键以外。

#索引age隐藏 mysql> ALTER TABLE members ALTER INDEX idx_age INVISIBLE; Query OK, 0 rows affected (0.01 sec) Records: 0 Duplicates: 0 Warnings: 0 #执行计划全表扫描 mysql> EXPLAIN select *from members where age > 10; +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+ | 1 | SIMPLE | members | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where | +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+ 1 row in set, 1 warning (0.00 sec) #优化器可用隐藏索引打开。 mysql> SET SESSION optimizer_switch='use_invisible_indexes=on'; Query OK, 0 rows affected (0.00 sec) #执行计划使用age 隐藏索引 mysql> EXPLAIN select *from members where age > 10; +----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------+ | 1 | SIMPLE | members | NULL | range | idx_age | idx_age | 5 | NULL | 3 | 100.00 | Using index condition | +----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------+ 1 row in set, 1 warning (0.00 sec)

备注: 在一些业务无法及时改变SQL语句的时,可以通过更改索引隐藏属性,有效控制手段。
特别是生产不确定索引性能影响的时候,可以在线操作,算是应急对策。

prefer_ordering_index

MySQL里对任何有LIMIT子句的ORDER BY或GROUP BY查询使用有序索引,覆盖优化器所做的任何其他选择,只要它确定这会导致更快的执行。因此做出这种判断的算法对数据分布和其他条件做出了某些假设,所以它可能并不总是完全正确,而且在某些情况下,此类查询选择不同的优化可能会提供更好的性能。
8.0.21版本中optimizer_switch变量的prefer_ordering_index设置来禁用这种优化算法。

optimizer_switch参数中的prefer_ordering_index进行控制对于order by limit的优化。prefer_ordering_index默认打开,表示MySQL会优先考虑通过order by列索引进行排序优化。

目前验证发现只在范围查询中,有算法干扰。所以使用范围查询,可要小心了。

#模拟表结构和数据 mysql>CREATE table `members` ( id int unsigned NOT NULL AUTO_INCREMENT , first_name varchar(100) DEFAULT NULL , last_name varchar(100) DEFAULT NULL , age INT DEFAULT '0' , create_time timestamp DEFAULT CURRENT_TIMESTAMP, update_time timestamp DEFAULT current_timestamp() ON UPDATE current_timestamp(), primary KEY (id) , KEY idx_first_last(first_name(3),last_name(3)), KEY idx_age(age) ); mysql>INSERT INTO `members`(id,first_name,last_name,age) VALUES(1,'AAAAA','BBBBB',12),(2,'AAAAA','CCCCC',20),(3,'DDDDD','EEEEE',30); #排序优先开启 mysql>SET optimizer_switch = "prefer_ordering_index=on"; #优先ORDER BY 和 LIMIT 选择主键 mysql> EXPLAIN SELECT * FROM members WHERE age> 15 order by `id` DESC LIMIT 1; +----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------+ | 1 | SIMPLE | members | NULL | index | idx_age | PRIMARY | 4 | NULL | 1 | 66.67 | Using where; Backward index scan | +----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------+ 1 row in set, 1 warning (0.00 sec) #排序优先关闭 mysql> SET optimizer_switch = "prefer_ordering_index=off"; Query OK, 0 rows affected (0.00 sec) #选择合理执行计划 mysql> EXPLAIN SELECT * FROM members WHERE age> 15 order by `id` DESC LIMIT 1; +----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+---------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+---------------------------------------+ | 1 | SIMPLE | members | NULL | range | idx_age | idx_age | 5 | NULL | 2 | 100.00 | Using index condition; Using filesort | +----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+---------------------------------------+ 1 row in set, 1 warning (0.00 sec)

subquery_to_derived

MySQL 8.0.21开始,优化器增加了subquery_to_derived(默认关闭)选项。优化器在许多情况下能够将SELECT、WHERE、JOIN或HAVING子句中的标量子查询转换为派生表上的左&外部连接(在某些情况下是内连接)。
所为的:

  • subquery: 包含在 SELECT 中的子查询(不在 FROM子句中)。
  • derived : 包含在 from 子句中的子查询。MySQL会将结果存放在一个临时表中,
    也称为派生表(derived).

在大多数情况下,启用此优化不会产生任何明显的性能改进(在许多情况下甚至会使查询运行得更慢),视情况而定,所以选择默认值即可。

SHOW WARNINGS信息下,可以看到语句改写情况:

mysql> SET SESSION optimizer_switch='subquery_to_derived=OFF'; Query OK, 0 rows affected (0.00 sec) mysql> EXPLAIN SELECT * FROM t1 WHERE t1.a > (SELECT COUNT(a) FROM t2); +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+ | 1 | PRIMARY | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 33.33 | Using where | | 2 | SUBQUERY | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | NULL | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+ 2 rows in set, 1 warning (0.00 sec) mysql> SHOW WARNINGS; +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------+ | Note | 1003 | /* select#1 */ select `test`.`t1`.`a` AS `a` from `test`.`t1` where (`test`.`t1`.`a` > (/* select#2 */ select count(`test`.`t2`.`a`) from `test`.`t2`)) | +-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)
mysql> SET SESSION optimizer_switch='subquery_to_derived=ON'; Query OK, 0 rows affected (0.00 sec) mysql> EXPLAIN SELECT * FROM t1 WHERE t1.a > (SELECT COUNT(a) FROM t2); +----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+ | 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL | | 1 | PRIMARY | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 33.33 | Using where; Using join buffer (hash join) | | 2 | DERIVED | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | NULL | +----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+ 3 rows in set, 1 warning (0.00 sec) mysql> SHOW WARNINGS;; +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Note | 1003 | /* select#1 */ select `test`.`t1`.`a` AS `a` from `test`.`t1` join ( /* select#2 */ select count(`test`.`t2`.`a`) AS `COUNT(a)` from `test`.`t2`) `derived_1_2` where (`test`.`t1`.`a` > `derived_1_2`.`COUNT(a)`) | +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)

derived_condition_pushdown

Mysql 8.0.22之后的版本支持派生条件回移(Derived Condition Pushdown)优化。该优化可以减少derived派生表处理的行数从而提高查询执行的效率。

比方如下查询:

SELECT * FROM (SELECT i, j FROM t1) AS dt WHERE i > constant # ↓ ↓ ↓ 通过派生条件回移优化后类似如下形式(WHERE条件拿到派生表的里面) SELECT * FROM (SELECT i, j FROM t1 WHERE i > constant)

就是说 将外部WHERE条件下推到派生表中应该会减少需要处理的行数,从而加快查询的执行速度。

MySQL 8.0.29及以后版本中,派生表条件下推优化可以用于UNION查询。
不能下推情况:内部表,包含子查询的条件,公共表表达式,LIMIT子句,变量的赋值 等场景。

skip_scan

MySQL从8.0.13版本开始支持一种新的range scan方式。就是说 组合索引(f1,f2)的时候,查询条件里 只有f2的时候 也可以实现索引扫描的功能.范围扫描比全索引扫描更有效,优化器可以执行多个范围扫描,每个值对应一个f1,使用一种称为Skip Scan的方法,类似于松散索引扫描。

从上述描述可以看到使用skip-scan的方式避免了全索引扫描,从而提升了性能,尤其是在索引前缀列区分度比较低的时候。可以说 避开了索引前缀原则。

mysql> ALTER TABLE t1 ADD PRIMARY KEY(f1, f2); Query OK, 0 rows affected (0.11 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> EXPLAIN SELECT f1, f2 FROM t1 WHERE f2 > 40;; +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+ | 1 | SIMPLE | t1 | NULL | range | PRIMARY | PRIMARY | 8 | NULL | 53 | 100.00 | Using where; Using index for skip scan | +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+ 1 row in set, 1 warning (0.00 sec)

hypergraph_optimizer

MySQL 8.0.23提供了优化器支持hypergraph(超图)的模型。目前实际还不支持,需要在debug下进行验证测试,是不成熟的优化方案。

hypergraph是数学中用的概念。在数据算法中,描述对象之间建立联系而用的。个人理解,就是通过边界算法,把对应的数据关联点,范围等圈出来。

由此多表关联下,从5.7的nest loop 进化到8.0的hash join,在hash join进一步提升效率的的算法。这个算法非常切合B+Tree配合使用。
image.png

mysql> SET SESSION optimizer_switch='hypergraph_optimizer=ON'; ERROR 3999 (42000): The hypergraph optimizer does not yet support 'use in non-debug builds'

总结

优化器行为变更,可能会导致不同版本SQL语句性能体现不一样。当然变化是往好的方面延伸的,但对于复杂的情况,可能也会出现退化效果。所以版本升级的时候,需要特别关注下。

在这里特别感兴趣hypergraph(超图)优化方案。因为MySQL对于多表关联是一个致命的问题,关联表数量越多性能越差。对于hypergraph的算法本质上的理解,可以说结合B+Tree性能提升空间很大。

最后修改时间:2025-01-08 16:16:28
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论