小白学习MySQL - 降序索引的功能和作用

bisal的个人杂货铺 2022-09-06

323

Oracle很久前的版本就支持降序索引的功能，MySQL是从8.0才开始支持。

MySQL 5.7中，我们创建了一张测试表t1，包含两个字段c1和c2，插入一些数据，如下所示，

mysql> create table t1(c1 varchar(1), c2 int);
Query OK, 0 rows affected (0.32 sec)


mysql> insert into t1 values('a',1),('c',3),('b',2),('a',5);
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0


mysql> select * from t1;
+------+------+
| c1   | c2   |
+------+------+
|  a   |    1 |
|  c   |    3 |
|  b   |    2 |
|  a   |    5 |
+------+------+
4 rows in set (0.00 sec)

如果我们创建这样一个复合索引(c1, c2)，其中c1和c2默认升序排列，

mysql> create index idx_t1_01 on t1(c1, c2);
Query OK, 0 rows affected (0.27 sec)
Records: 0  Duplicates: 0  Warnings: 0

执行这条SQL，因为c1和c2在索引中是有序排列的，而且默认升序，可以看到用到了这个索引，不需要回表，

mysql> desc select * from t1 order by c1, c2;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
| 1 | SIMPLE      | t1    | NULL       | index | NULL          | idx_t1_01 | 11      | NULL |    4 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.01 sec)

如果执行这条，即按照c1和c2的降序检索，同样用到了索引，

mysql> desc select * from t1 order by c1 desc, c2 desc;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
| 1 | SIMPLE      | t1    | NULL       | index | NULL          | idx_t1_01 | 11      | NULL |    4 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.01 sec)

但是参考MySQL官方文章《MySQL 8.0 Labs - Descending Indexes in MySQL》的经验，虽然MySQL 5.7的优化器可以降序遍历升序索引，但这是需要成本代价的，正序访问可能会比降序访问的效率高15%左右。

https://dev.mysql.com/blog-archive/mysql-8-0-labs-descending-indexes-in-mysql/

再进一步，如果执行这条SQL，c1按照升序，c2按照降序，此时除了用到索引外，还用到了filesort，原因就是复合索引中c1和c2都是升序排列，但是执行的SQL中c1按照升序，c2按照降序，两者相悖，因此，会多了对c2做降序排列的操作，

mysql> desc select * from t1 order by c1, c2 desc;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra                       |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------------+
|  1 | SIMPLE      | t1    | NULL       | index | NULL          | idx_t1_01 | 11      | NULL |    4 |   100.00 | Using index; Using filesort |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------------+
1 row in set, 1 warning (0.00 sec)

如果是Oracle，我们可能会想到，可以在创建索引的时候，指定字段的顺序，如下所示，c1按照升序，c2按照降序，

mysql> drop index idx_t1_01 on t1;
Query OK, 0 rows affected (0.22 sec)
Records: 0  Duplicates: 0  Warnings: 0


mysql> create index idx_t1_01 on t1(c1, c2 desc);
Query OK, 0 rows affected (0.11 sec)
Records: 0  Duplicates: 0  Warnings: 0

再执行SQL，我们发现，还是按照刚才的执行计划，

mysql> desc select * from t1 order by c1, c2 desc;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra                       |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------------+
|  1 | SIMPLE      | t1    | NULL       | index | NULL          | idx_t1_01 | 11      | NULL |    4 |   100.00 | Using index; Using filesort |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------------+
1 row in set, 1 warning (0.00 sec)

究其原因，MySQL 8.0之前，不支持"降序索引"，这点可以从表定义中看到，虽然我们创建索引的时候，明确指出c2按照降序创建的，但是如下定义语句中，c2没带desc，说明还是默认升序排列，

mysql> show create table t1;
+-------+----------------------------------+
| Table | Create Table                     |
+-------+----------------------------------+
| t1    | CREATE TABLE `t1` (
  `c1` varchar(1) DEFAULT NULL,
  `c2` int(11) DEFAULT NULL,
  KEY `idx_t1_01` (`c1`,`c2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+-------+----------------------------------+
1 row in set (0.00 sec)

如果我们使用MySQL 8.0(InnoDB)，就会看到一些不同的现象，如下所示，我们创建(c1, c2)的复合索引，执行SQL效果和5.7相同，

bisal@mysqldb 18:41:  [test]> create index idx_t1_01 on t1(c1, c2);
Query OK, 0 rows affected (0.05 sec)
Records: 0  Duplicates: 0  Warnings: 0


bisal@mysqldb 18:41:  [test]> desc select * from t1 order by c1, c2;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t1    | NULL       | index | NULL          | idx_t1_01 | 12      | NULL |    3 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

如果按照c1和c2降序进行检索，我们看到，这时显示的是"Backward index scan"，

bisal@mysqldb 18:43:  [test]> desc select * from t1 order by c1 desc, c2 desc;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+----------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra                            |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+----------------------------------+
|  1 | SIMPLE      | t1    | NULL       | index | NULL          | idx_t1_01 | 12      | NULL |    3 |   100.00 | Backward index scan; Using index |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+----------------------------------+
1 row in set, 1 warning (0.00 sec)

我们知道，对于普通的升序索引，从根节点到叶子节点是升序排列的，所有索引节点从左到右也是升序排列的，但是如果要得到升序索引排在后面的数据(例如主键id字段，默认升序，select * from test where id < 1000 order by id desc)，就需要对这个索引逆向查找，就使用了backward index scan，这就是基于双向链表的机制。

同样，为了做对比，如果c1按照升序，c2按照降序，执行的SQL，还是用到filesort，

bisal@mysqldb 18:41:  [test]> desc select * from t1 order by c1, c2 desc;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra                       |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------------+
|  1 | SIMPLE      | t1    | NULL       | index | NULL          | idx_t1_01 | 12      | NULL |    3 |   100.00 | Using index; Using filesort |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------------+
1 row in set, 1 warning (0.00 sec)

重点来了，如果创建一个c1按照升序，c2按照降序的复合索引，

bisal@mysqldb 18:41:  [test]> create index idx_t1_01 on t1(c1, c2 desc);
Query OK, 0 rows affected (0.05 sec)
Records: 0  Duplicates: 0  Warnings: 0

此时执行这条SQL，是可以用到这个索引的，因为从根节点到叶子节点，索引节点从左到右，c1都是按照升序排列，c2都是按照降序排列，和检索条件的顺序是完全一致的，

bisal@mysqldb 18:42:  [test]> desc select * from t1 order by c1, c2 desc;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t1    | NULL       | index | NULL          | idx_t1_01 | 12      | NULL |    3 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

从他的定义中，我们可以看到，索引种c2列是按照降序定义的，这就是和5.7的区别，

bisal@mysqldb 17:46:  [test]> show create table t1\G;
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `c1` varchar(1) DEFAULT NULL,
  `c2` int DEFAULT NULL,
  KEY `idx_t1_01` (`c1`,`c2` DESC)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

由于支持了索引中的降序存储，就可以完全利用索引的有序性，避免额外的排序，达到快速检索数据，提升检索效率的目的。

但是需要注意的是，由于引入了降序索引，MySQL 8.0之前支持的group by子句默认隐式排序的功能，不再支持了，因此对结果集有排序的需求，就需要显式执行order by。如果熟悉Oracle的朋友，可能知道从9i升级到更高版本时，同样会碰到group by默认不排序的功能改变，相应地，很可能需要应用配合改造，具体可参考《Oracle不同版本group by的变化》。

虽然MySQL 8.0开始支持降序索引了，但是，选择什么类型的索引，还是要根据实际的业务场景来决策，例如有(c1, c2 desc)这种需求，适当地创建降序索引是有帮助的，如果绝大多数场景都是(c1, c2)，这就很可能产生负面影响，因此，降序索引只是为了优化提供了一种选择，究竟选择什么，还是要结合实际的场景来综合考量，从MySQL对降序索引的支持，能看出他的设计，其实也是与时俱进的。

小白学习MySQL

《小白学习MySQL - Derived Table》

《小白学习MySQL - Generated Columns功能》

《小白学习MySQL - 增量统计SQL的需求 - 开窗函数的方案》

《小白学习MySQL - 统计的"投机取巧"》

《小白学习MySQL - 增量统计SQL的需求》

《小白学习MySQL - 你碰到过这种无法登陆的场景？》

《小白学习MySQL - 不同版本创建用户的些许区别》

《小白学习MySQL - 随机插入测试数据的工具》

《小白学习MySQL - varchar类型字段为什么经常定义成255？》

《小白学习MySQL - 变通创建索引的案例一则》

《小白学习MySQL - “投机取巧”统计表的记录数》