MySQL中怎样快速找出超长索引

老叶茶馆 2021-02-04

1124

大家好，我是知数堂SQL 优化班老师网名：骑龟的兔子

需求：

想要查找哪些索引太长了，这个SQL在5.7下跑的特别慢，8.0则挺快的，帮看下有啥优化方案没

具体SQL 和执行计划如下：

SELECT c.TABLE_SCHEMA AS DB, 
 c .TABLE_NAME AS TBL,
 c.COLUMN_NAME AS COL,
 c.CHARACTER_OCTET_LENGTH AS COL_LEN_BYTES, 
 s.INDEX_NAME,
  s.SUB_PART * CHARACTER_OCTET_LENGTH/CHARACTER_MAXIMUM_LENGTH AS SUB_PART_LEN
FROM information_schema.COLUMNS c 
INNER JOIN information_schema.STATISTICS s USING(TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME)
INNER JOIN information_schema.TABLES t USING(TABLE_SCHEMA, TABLE_NAME)  
WHERE c.TABLE_SCHEMA not in ('information_schema','performance_schema','mysql','sys', 'test')
AND c.DATA_TYPE IN ("varchar", "char", "text", "blob") 
AND ((CHARACTER_OCTET_LENGTH > 50 and SUB_PART is null) or
SUB_PART * CHARACTER_OCTET_LENGTH/CHARACTER_MAXIMUM_LENGTH > 50)
AND t.TABLE_ROWS > 10000
ORDER BY COL_LEN_BYTES DESC;

执行计划

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_frm_only; Scanned all databases; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: s
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_frm_only; Scanned all databases; Using join buffer (Block Nested Loop)
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_full_table; Scanned all databases; Using join buffer (Block Nested Loop)
3 rows in set, 1 warning (0.01 sec)


select count(*) from information_schema.tables;
+----------+
| count(*) |
+----------+
|    33600 |
+----------+

select count(*) from information_schema.COLUMNS;
+----------+
| count(*) |
+----------+
|   342967 |
+----------+
select count(*) from information_schema.STATISTICS;
+----------+
| count(*) |
+----------+
|   135167 |
+----------+

上面的SQL 运行450+ s 也运行不出来，最后kill掉了。

我们初步分析一下，从执行计划中可以看出三个表都是ALL 所以很慢

那添加索引不就行了吗，因为是系统表，所以不能随便添加！

那该怎么办？想到了AUTOKEY 就是临时索引，那思路就是改写SQL

达到生成临时索引，最终达到优化效果

改写的SQL 如下

SELECT c.TABLE_SCHEMA AS DB, c.TABLE_NAME AS TBL, c.COLUMN_NAME AS COL, c.CHARACTER_OCTET_LENGTH AS COL_LEN_BYTES, s.INDEX_NAME,
s.SUB_PART * c.CHARACTER_OCTET_LENGTH/c.CHARACTER_MAXIMUM_LENGTH AS SUB_PART_LEN
FROM 
( select c.TABLE_SCHEMA, c.TABLE_NAME, c.COLUMN_NAME ,c.CHARACTER_OCTET_LENGTH ,c.CHARACTER_MAXIMUM_LENGTH , c.DATA_TYPE
from 
information_schema.COLUMNS c 
WHERE c.TABLE_SCHEMA not in ('information_schema','performance_schema','mysql','sys', 'test') 
AND c.DATA_TYPE IN ("varchar", "char", "text", "blob") 
limit 10000000000
) c 
INNER JOIN 
(
select s.TABLE_SCHEMA, s.TABLE_NAME, s.COLUMN_NAME ,s.SUB_PART,s.INDEX_NAME
from 
information_schema.STATISTICS s 
WHERE s.TABLE_SCHEMA not in ('information_schema','performance_schema','mysql','sys', 'test')
limit 10000000000 

)s USING(TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME)
INNER JOIN information_schema.TABLES t USING(TABLE_SCHEMA, TABLE_NAME) 
WHERE c.TABLE_SCHEMA not in ('information_schema','performance_schema','mysql','sys', 'test') 
AND c.DATA_TYPE IN ("varchar", "char", "text", "blob") 
AND ((c.CHARACTER_OCTET_LENGTH > 50 and s.SUB_PART is null) or
s.SUB_PART * c.CHARACTER_OCTET_LENGTH/c.CHARACTER_MAXIMUM_LENGTH > 50)
AND t.TABLE_ROWS > 10000
ORDER BY COL_LEN_BYTES DESC;


*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: t
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_full_table; Scanned all databases; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
   partitions: NULL
         type: ref
possible_keys: <auto_key0>
          key: <auto_key0>
      key_len: 388
          ref: information_schema.t.TABLE_SCHEMA,information_schema.t.TABLE_NAME
         rows: 2
     filtered: 50.00
        Extra: Using where
*************************** 3. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived3>
   partitions: NULL
         type: ref
possible_keys: <auto_key0>
          key: <auto_key0>
      key_len: 582
          ref: information_schema.t.TABLE_SCHEMA,information_schema.t.TABLE_NAME,c.COLUMN_NAME
         rows: 2
     filtered: 100.00
        Extra: Using where
*************************** 4. row ***************************
           id: 3
  select_type: DERIVED
        table: s
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_frm_only; Scanned all databases
*************************** 5. row ***************************
           id: 2
  select_type: DERIVED
        table: c
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_frm_only; Scanned all databases
5 rows in set, 1 warning (0.01 sec)

结果来了 2463 rows in set, 417 warnings (23.39 sec)

但是经过几次运行之后有时候是40多秒有时候甚至达到了166s 非常不稳定！

那分析下上面这个SQL的问题在哪里？

问题就是生成的AUTO KEY的量相对来说非常大！因为没有进行任何过滤

那现在的思路就是对生成的AUTOKEY的量进行减少

我们通过相对小的表TABLES 表生成autokey 之后 STATISTICS ,COLUMNS

表分别跟 TABLES 表进行JOIN 然后减少数据量达到减少生成AUOKEY 的量减少达到优化目的，具体的方法如下


select count(1) 
from 
(
select s.TABLE_SCHEMA, s.TABLE_NAME, s.COLUMN_NAME ,s.SUB_PART,s.INDEX_NAME
from 
information_schema.STATISTICS s 
WHERE s.TABLE_SCHEMA not in ('information_schema','performance_schema','mysql','sys', 'test') 
)s straight_join 
(
select t.TABLE_SCHEMA, t.TABLE_NAME
from 
 information_schema.TABLES t  
WHERE t.TABLE_SCHEMA not in ('information_schema','performance_schema','mysql','sys', 'test')  
AND t.TABLE_ROWS > 10000
limit 10000000000
) t on s.TABLE_SCHEMA=t.TABLE_SCHEMA and  s.TABLE_NAME =t.TABLE_NAME

*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: s
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_frm_only; Scanned all databases
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived3>
   partitions: NULL
         type: ref
possible_keys: <auto_key0>
          key: <auto_key0>
      key_len: 388
          ref: information_schema.s.TABLE_SCHEMA,information_schema.s.TABLE_NAME
         rows: 2
     filtered: 100.00
        Extra: Using index
*************************** 3. row ***************************
           id: 3
  select_type: DERIVED
        table: t
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_full_table; Scanned all databases
3 rows in set, 1 warning (0.00 sec)


+----------+
| count(1) |
+----------+
|     7478 |
+----------+
1 row in set, 40 warnings (7.52 sec)


select count(1) 
from 
( select c.TABLE_SCHEMA, c.TABLE_NAME, c.COLUMN_NAME ,c.CHARACTER_OCTET_LENGTH ,c.CHARACTER_MAXIMUM_LENGTH , c.DATA_TYPE
from 
information_schema.COLUMNS c 
WHERE c.TABLE_SCHEMA not in ('information_schema','performance_schema','mysql','sys', 'test') 
AND c.DATA_TYPE IN ("varchar", "char", "text", "blob")  
) c straight_join 
(
select t.TABLE_SCHEMA, t.TABLE_NAME
from 
 information_schema.TABLES t  
WHERE t.TABLE_SCHEMA not in ('information_schema','performance_schema','mysql','sys', 'test')  
AND t.TABLE_ROWS > 10000
limit 10000000000
) t on c.TABLE_SCHEMA=t.TABLE_SCHEMA and  c.TABLE_NAME =t.TABLE_NAME

*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: c
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_frm_only; Scanned all databases
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived3>
   partitions: NULL
         type: ref
possible_keys: <auto_key0>
          key: <auto_key0>
      key_len: 388
          ref: information_schema.c.TABLE_SCHEMA,information_schema.c.TABLE_NAME
         rows: 2
     filtered: 100.00
        Extra: Using index
*************************** 3. row ***************************
           id: 3
  select_type: DERIVED
        table: t
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_full_table; Scanned all databases
3 rows in set, 1 warning (0.00 sec)

+----------+
| count(1) |
+----------+
|     8106 |
+----------+
1 row in set, 417 warnings (8.62 sec)

最终SQL 如下


SELECT c.TABLE_SCHEMA AS DB, c.TABLE_NAME AS TBL, c.COLUMN_NAME AS COL, c.CHARACTER_OCTET_LENGTH AS COL_LEN_BYTES, s.INDEX_NAME,
s.SUB_PART * c.CHARACTER_OCTET_LENGTH/c.CHARACTER_MAXIMUM_LENGTH AS SUB_PART_LEN
from 
( select c.TABLE_SCHEMA, c.TABLE_NAME, c.COLUMN_NAME ,c.CHARACTER_OCTET_LENGTH ,c.CHARACTER_MAXIMUM_LENGTH , c.DATA_TYPE
from 
information_schema.COLUMNS c 
WHERE c.TABLE_SCHEMA not in ('information_schema','performance_schema','mysql','sys', 'test') 
AND c.DATA_TYPE IN ("varchar", "char", "text", "blob")  
) c straight_join 
(
select t.TABLE_SCHEMA, t.TABLE_NAME
from 
 information_schema.TABLES t  
WHERE t.TABLE_SCHEMA not in ('information_schema','performance_schema','mysql','sys', 'test')  
AND t.TABLE_ROWS > 10000
limit 10000000000
) t on c.TABLE_SCHEMA=t.TABLE_SCHEMA and  c.TABLE_NAME =t.TABLE_NAME

straight_join
(
select s.* 
from 
(
select s.TABLE_SCHEMA, s.TABLE_NAME, s.COLUMN_NAME ,s.SUB_PART,s.INDEX_NAME
from 
information_schema.STATISTICS s 
WHERE s.TABLE_SCHEMA not in ('information_schema','performance_schema','mysql','sys', 'test') 
)s straight_join 
(
select t.TABLE_SCHEMA, t.TABLE_NAME
from 
 information_schema.TABLES t  
WHERE t.TABLE_SCHEMA not in ('information_schema','performance_schema','mysql','sys', 'test')  
AND t.TABLE_ROWS > 10000
limit 10000000000
) t on s.TABLE_SCHEMA=t.TABLE_SCHEMA and  s.TABLE_NAME =t.TABLE_NAME
limit 10000000000
) s 
on c.TABLE_SCHEMA=s.TABLE_SCHEMA and  c.TABLE_NAME=s.TABLE_NAME and c.COLUMN_NAME =s.COLUMN_NAME
where 
  ((c.CHARACTER_OCTET_LENGTH > 50 and s.SUB_PART is null) or
s.SUB_PART * c.CHARACTER_OCTET_LENGTH/c.CHARACTER_MAXIMUM_LENGTH > 50)


*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: c
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_frm_only; Scanned all databases
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived3>
   partitions: NULL
         type: ref
possible_keys: <auto_key0>
          key: <auto_key0>
      key_len: 388
          ref: information_schema.c.TABLE_SCHEMA,information_schema.c.TABLE_NAME
         rows: 2
     filtered: 100.00
        Extra: Using index
*************************** 3. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived4>
   partitions: NULL
         type: ref
possible_keys: <auto_key0>
          key: <auto_key0>
      key_len: 582
          ref: information_schema.c.TABLE_SCHEMA,information_schema.c.TABLE_NAME,information_schema.c.COLUMN_NAME
         rows: 2
     filtered: 100.00
        Extra: Using where
*************************** 4. row ***************************
           id: 4
  select_type: DERIVED
        table: s
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_frm_only; Scanned all databases
*************************** 5. row ***************************
           id: 4
  select_type: DERIVED
        table: <derived6>
   partitions: NULL
         type: ref
possible_keys: <auto_key0>
          key: <auto_key0>
      key_len: 388
          ref: information_schema.s.TABLE_SCHEMA,information_schema.s.TABLE_NAME
         rows: 2
     filtered: 100.00
        Extra: Using index
*************************** 6. row ***************************
           id: 6
  select_type: DERIVED
        table: t
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_full_table; Scanned all databases
*************************** 7. row ***************************
           id: 3
  select_type: DERIVED
        table: t
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Using where; Open_full_table; Scanned all databases
7 rows in set, 1 warning (0.00 sec)

看起来稳定了，跑了几次，都没超过15秒

我的新一轮的SQL 优化课即将在春节后开课

我是知数堂SQL 优化班老师~ ^^

如有关于SQL优化方面疑问和一起交流的请加并且 @兔子@知数堂SQL优化

高性能MySQL,SQL优化群有叶金荣，吴炳锡两位大神坐镇：579036588

欢迎加入知数堂大家庭。

我的微信公众号：SQL开发与优化(sqlturning)

扫码直达宝藏课程

mysql

文章转载自老叶茶馆，如果涉嫌侵权，请发送邮件至：contact@modb.pro进行举报，并提供相关证据，一经查实，墨天轮将立刻删除相关内容。

MySQL中怎样快速找出超长索引

评论