Postgresql中order by limit 索引选择错误导致查询很慢

晟数学苑 2021-10-16

2937

点击蓝字阅读更多干货

开发今天发来一条SQL,说执行很慢，出不来结果，如下：

select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id desc limit 10;

表大小，记录数，索引信息如下:

melotpay=# select count(id) from hist_kmoney_record ;
   count   
-----------
 133300926

melotpay=# \dt+ hist_kmoney_record
                                 List of relations
 Schema |        Name        | Type  |  Owner   | Persistence | Size  | Description 
--------+--------------------+-------+----------+-------------+-------+-------------
 public | hist_kmoney_record | table | postgres | permanent   | 22 GB | 

 melotpay=# \d hist_kmoney_record
                      Table "public.hist_kmoney_record"
    Column     |            Type             | Collation | Nullable | Default 
---------------+-----------------------------+-----------+----------+---------
 id            | integer                     |           | not null | 
 user_id       | integer                     |           |          | 
 k_money_value | bigint                      |           |          | 
 channel       | integer                     |           |          | 
 status        | integer                     |           |          | 
 reason        | text                        |           |          | 
 count         | integer                     |           |          | 
 add_time      | timestamp without time zone |           |          | 
 update_time   | timestamp without time zone |           |          | 
 note          | text                        |           |          | 
 uuid          | text                        |           |          | 
Indexes:
    "hist_kmoney_record_pkey" PRIMARY KEY, btree (id)
    "index_hkr_userid_channel" btree (user_id, channel)

看一下执行计划

melotpay=# explain select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id desc limit 10;
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..4747.37 rows=10 width=81)
   ->  Index Scan Backward using hist_kmoney_record_pkey on hist_kmoney_record  (cost=0.57..7030008.05 rows=14810 width=81)
         Filter: (user_id = 133423946)
(3 rows)

执行计划显示走的hist_kmoney_record_pkey索引，也就是id主键的索引，而没有用到user_id的索引，所以过滤user_id的代价很大，导致SQL很慢。

从表信息看到，表还是比较大的，我们知道pg是按代价来评估执行计划的，这里为什么没有用user_id列的索引呢，这里是因为统计信息里没有记录user_id和id的相关性。

查看user_id的线性相关性，如下可见是很离散的，因为该值越接近1，线性越好，就是有可能按主键索引排序后需要的数据可能在很后面，要出结果，对于这样的表那将会很慢。所以这里评估代价生成的执行计划是有问题的。

melotpay=# select correlation  from  pg_stats where tablename ='hist_kmoney_record' and attname='user_id';
 correlation 
-------------
  0.13479742

优化(不用id字段的主键索引)：

加大limit值，会使得走id字段索引的评估代价超过走user_id的代价，那么优化器就会选择走user_id字段的索引，如下

melotpay=# explain  select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id desc limit 1000;
                                               QUERY PLAN                                                
---------------------------------------------------------------------------------------------------------
 Limit  (cost=57185.60..57188.10 rows=1000 width=81)
   ->  Sort  (cost=57185.60..57222.62 rows=14810 width=81)
         Sort Key: id DESC
         ->  Bitmap Heap Scan on hist_kmoney_record  (cost=279.34..56373.58 rows=14810 width=81)
               Recheck Cond: (user_id = 133423946)
               ->  Bitmap Index Scan on index_hkr_userid_channel  (cost=0.00..275.64 rows=14810 width=0)
                     Index Cond: (user_id = 133423946)
(7 rows)

limit 1000不是我们想要的分页数，那么可以选择在id字段加一个常量，从而达到不走id字段索引的目的

melotpay=# explain analyze select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id+0 desc limit 10;
                                                                    QUERY PLAN                                                                     
---------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=56023.31..56023.33 rows=10 width=85) (actual time=0.020..0.021 rows=1 loops=1)
   ->  Sort  (cost=56023.31..56059.85 rows=14619 width=85) (actual time=0.019..0.020 rows=1 loops=1)
         Sort Key: ((id + 0)) DESC
         Sort Method: quicksort  Memory: 25kB
         ->  Bitmap Heap Scan on hist_kmoney_record  (cost=277.86..55707.40 rows=14619 width=85) (actual time=0.014..0.014 rows=1 loops=1)
               Recheck Cond: (user_id = 133423946)
               Heap Blocks: exact=1
               ->  Bitmap Index Scan on index_hkr_userid_channel  (cost=0.00..274.21 rows=14619 width=0) (actual time=0.009..0.009 rows=1 loops=1)
                     Index Cond: (user_id = 133423946)
 Planning Time: 0.136 ms
 Execution Time: 0.049 ms

创建合适的索引,如(user_id,id)组合索引

melotpay=# create index idx_hist_kmoney_record_union on hist_kmoney_record(user_id,id);
CREATE INDEX
melotpay=#  explain analyze select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id desc limit 10;
                                                                               QUERY PLAN                                                                     
           
--------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------
 Limit  (cost=0.57..40.35 rows=10 width=81) (actual time=1.302..1.304 rows=1 loops=1)
   ->  Index Scan Backward using idx_hist_kmoney_record_union on hist_kmoney_record  (cost=0.57..58907.13 rows=14809 width=81) (actual time=1.298..1.300 rows=
1 loops=1)
         Index Cond: (user_id = 133423946)
 Planning Time: 0.090 ms
 Execution Time: 1.330 ms
(5 rows)

注意：以上方法并不具有普适性，需要考量实际情况，调整对应的方法。

分享、在看与点赞

只要你点，我们就是胖友

postgresql

文章转载自晟数学苑，如果涉嫌侵权，请发送邮件至：contact@modb.pro进行举报，并提供相关证据，一经查实，墨天轮将立刻删除相关内容。

Postgresql中order by limit 索引选择错误导致查询很慢

评论