暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Postgresql中order by limit 索引选择错误导致查询很慢

晟数学苑 2021-10-16
2937

点击蓝字 阅读更多干货

  开发今天发来一条SQL,说执行很慢,出不来结果,如下:

    select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id desc limit 10;

      表大小,记录数,索引信息如下:

      melotpay=# select count(id) from hist_kmoney_record ;
      count
      -----------
      133300926

      melotpay=# \dt+ hist_kmoney_record
      List of relations
      Schema | Name | Type | Owner | Persistence | Size | Description
      --------+--------------------+-------+----------+-------------+-------+-------------
      public | hist_kmoney_record | table | postgres | permanent | 22 GB |

      melotpay=# \d hist_kmoney_record
      Table "public.hist_kmoney_record"
      Column | Type | Collation | Nullable | Default
      ---------------+-----------------------------+-----------+----------+---------
      id | integer | | not null |
      user_id | integer | | |
      k_money_value | bigint | | |
      channel | integer | | |
      status | integer | | |
      reason | text | | |
      count | integer | | |
      add_time | timestamp without time zone | | |
      update_time | timestamp without time zone | | |
      note | text | | |
      uuid | text | | |
      Indexes:
      "hist_kmoney_record_pkey" PRIMARY KEY, btree (id)
      "index_hkr_userid_channel" btree (user_id, channel)

      看一下执行计划

        melotpay=# explain select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id desc limit 10;
        QUERY PLAN
        ----------------------------------------------------------------------------------------------------------------------------
        Limit (cost=0.57..4747.37 rows=10 width=81)
        -> Index Scan Backward using hist_kmoney_record_pkey on hist_kmoney_record (cost=0.57..7030008.05 rows=14810 width=81)
        Filter: (user_id = 133423946)
        (3 rows)

          执行计划显示走的hist_kmoney_record_pkey索引,也就是id主键的索引,而没有用到user_id的索引,所以过滤user_id的代价很大,导致SQL很慢。

          从表信息看到,表还是比较大的,我们知道pg是按代价来评估执行计划的,这里为什么没有用user_id列的索引呢,这里是因为统计信息里没有记录user_id和id的相关性。


          查看user_id的线性相关性,如下可见是很离散的,因为该值越接近1,线性越好,就是有可能按主键索引排序后需要的数据可能在很后面,要出结果,对于这样的表那将会很慢。所以这里评估代价生成的执行计划是有问题的。

          melotpay=# select correlation  from  pg_stats where tablename ='hist_kmoney_record' and attname='user_id';
          correlation
          -------------
            0.13479742

            优化(不用id字段的主键索引):

            加大limit值,会使得走id字段索引的评估代价超过走user_id的代价,那么优化器就会选择走user_id字段的索引,如下

            melotpay=# explain  select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id desc limit 1000;
            QUERY PLAN
            ---------------------------------------------------------------------------------------------------------
            Limit (cost=57185.60..57188.10 rows=1000 width=81)
            -> Sort (cost=57185.60..57222.62 rows=14810 width=81)
            Sort Key: id DESC
            -> Bitmap Heap Scan on hist_kmoney_record (cost=279.34..56373.58 rows=14810 width=81)
            Recheck Cond: (user_id = 133423946)
            -> Bitmap Index Scan on index_hkr_userid_channel (cost=0.00..275.64 rows=14810 width=0)
            Index Cond: (user_id = 133423946)
            (7 rows)

              limit 1000不是我们想要的分页数,那么可以选择在id字段加一个常量,从而达到不走id字段索引的目的

              melotpay=# explain analyze select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id+0 desc limit 10;
              QUERY PLAN
              ---------------------------------------------------------------------------------------------------------------------------------------------------
              Limit (cost=56023.31..56023.33 rows=10 width=85) (actual time=0.020..0.021 rows=1 loops=1)
              -> Sort (cost=56023.31..56059.85 rows=14619 width=85) (actual time=0.019..0.020 rows=1 loops=1)
              Sort Key: ((id + 0)) DESC
              Sort Method: quicksort Memory: 25kB
              -> Bitmap Heap Scan on hist_kmoney_record (cost=277.86..55707.40 rows=14619 width=85) (actual time=0.014..0.014 rows=1 loops=1)
              Recheck Cond: (user_id = 133423946)
              Heap Blocks: exact=1
              -> Bitmap Index Scan on index_hkr_userid_channel (cost=0.00..274.21 rows=14619 width=0) (actual time=0.009..0.009 rows=1 loops=1)
              Index Cond: (user_id = 133423946)
              Planning Time: 0.136 ms
               Execution Time: 0.049 ms

                创建合适的索引,如(user_id,id)组合索引

                melotpay=# create index idx_hist_kmoney_record_union on hist_kmoney_record(user_id,id);
                CREATE INDEX
                melotpay=# explain analyze select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946 order by id desc limit 10;
                QUERY PLAN

                --------------------------------------------------------------------------------------------------------------------------------------------------------------
                -----------
                Limit (cost=0.57..40.35 rows=10 width=81) (actual time=1.302..1.304 rows=1 loops=1)
                -> Index Scan Backward using idx_hist_kmoney_record_union on hist_kmoney_record (cost=0.57..58907.13 rows=14809 width=81) (actual time=1.298..1.300 rows=
                1 loops=1)
                Index Cond: (user_id = 133423946)
                Planning Time: 0.090 ms
                Execution Time: 1.330 ms
                (5 rows)

                注意:以上方法并不具有普适性,需要考量实际情况,调整对应的方法。


                分享、在看与点赞

                只要你点,我们就是胖友

                文章转载自 晟数学苑,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

                评论