权限问题导致分区表使用了错误的索引

原创 liuzhilong62 2023-07-01

301

问题现象

业务昨晚对sql进行了更新，之前没有DATE_CREATED字段（是分区键），跑的很快。发版后加了分区字段，本身是为了减少分区数量的访问，但是加了以后UPDATE执行却变慢了。
before：

 update  TABLE_RECORD
                        set IS_DELETED = '1', DATE_UPDATED = LOCALTIMESTAMP(0)
                        WHERE APPL_NO = $1
                        AND IS_DELETED = '0'

after：

 update  TABLE_RECORD
                        set IS_DELETED = '1', DATE_UPDATED = LOCALTIMESTAMP(0)
                        WHERE APPL_NO = $1
                        AND IS_DELETED = '0'
                        AND DATE_CREATED > now() - interval '31' day
                        AND DATE_CREATED < now()

发版前访问时间是毫秒级，发版后访问时间是10s，sql运行频繁，业务不可接受。

问题分析

执行计划看上去是正确的

表结构如下

# \d+ TABLE_RECORD
                                                                     Partitioned table "public.TABLE_RECORD"
             Column             |            Type             | Collation | Nullable |                      Default                      | Storage  | Stats target |       Description        
--------------------------------+-----------------------------+-----------+----------+---------------------------------------------------+----------+--------------+--------------------------
 id_TABLE_RECORD    | character varying(32)       |           | not null | nextval('seq_TABLE_RECORD'::regclass) | extended |              | 
 appl_no                        | character varying(100)      |           |          |                                                   | extended |              | 
 r_appl_no                 | character varying(100)      |           |          |                                                   | extended |              | 
...
 created_by                     | character varying(100)      |           | not null | 'sys'::character varying                          | extended |              | 
 date_created                   | timestamp without time zone |           | not null | now()                                             | plain    |              | 
 updated_by                     | character varying(100)      |           | not null | 'sys'::character varying                          | extended |              | 
 date_updated                   | timestamp without time zone |           | not null | now()                                             | plain    |              | 
Partition key: RANGE (date_created)
Indexes:
    "date_TABLE_RECORD" btree (date_created)
    "idx_dateupdated" btree (date_updated)
    "idx_applnodeleted" btree (appl_no, is_deleted)
    "nk_TABLE_RECORD" btree (appl_no)
Partitions: TABLE_RECORD_202211 FOR VALUES FROM ('2022-11-01 00:00:00') TO ('2022-12-01 00:00:00'),
...
            TABLE_RECORD_202303 FOR VALUES FROM ('2023-03-01 00:00:00') TO ('2023-04-01 00:00:00'),
            TABLE_RECORD_202304 FOR VALUES FROM ('2023-04-01 00:00:00') TO ('2023-05-01 00:00:00'),
            TABLE_RECORD_202305 FOR VALUES FROM ('2023-05-01 00:00:00') TO ('2023-06-01 00:00:00'),
            TABLE_RECORD_202306 FOR VALUES FROM ('2023-06-01 00:00:00') TO ('2023-07-01 00:00:00'),
 ...
            TABLE_RECORD_202512 FOR VALUES FROM ('2025-12-01 00:00:00') TO ('2026-01-01 00:00:00'),
            TABLE_RECORD_other DEFAULT

这个sql会访问近2个月的分区，这2个分区都是有数据的，上面的update只会更新一条数据。
刚开始分析问题非常困惑，因为我们explain的时候，执行计划是没有问题的
explain分区扫描信息：

   ->  Index Scan using TABLE_RECORD_202302_date_created_idx on TABLE_RECORD_202302 TABLE_RECORD_4  (cost=0.44..5.47 rows=1 width=485)
         Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now()))
         Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
   ->  Index Scan using TABLE_RECORD_202303_date_created_idx on TABLE_RECORD_202303 TABLE_RECORD_5  (cost=0.44..5.47 rows=1 width=482)
         Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now()))
         Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
   ->  Index Scan using TABLE_RECORD_202304_date_created_idx on TABLE_RECORD_202304 TABLE_RECORD_6  (cost=0.44..5.47 rows=1 width=481)
         Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now()))
         Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
   ->  Index Scan using idx_applnodeleted_25 on TABLE_RECORD_202305 TABLE_RECORD_7  (cost=0.43..30.49 rows=1 width=483)
         Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
         Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day)))
   ->  Index Scan using idx_applnodeleted_14 on TABLE_RECORD_202306 TABLE_RECORD_8  (cost=0.56..45.11 rows=18 width=485)
         Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
         Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day)))
   ->  Index Scan using idx_applnodeleted_38 on TABLE_RECORD_202307 TABLE_RECORD_9  (cost=0.14..5.17 rows=1 width=3502)
         Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
         Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day)))
   ->  Index Scan using idx_applnodeleted_1 on TABLE_RECORD_202308 TABLE_RECORD_10  (cost=0.14..5.17 rows=1 width=3502)

分区数据分布：

 select count(*),tableoid::regclass from TABLE_RECORD group by 2;
 count |            tableoid             
-------+---------------------------------
 56558 | TABLE_RECORD_202303
  4436 | TABLE_RECORD_202211
  6929 | TABLE_RECORD_202306
   945 | TABLE_RECORD_202305
  1413 | TABLE_RECORD_202304
  5499 | TABLE_RECORD_202212
  1486 | TABLE_RECORD_202301
  4722 | TABLE_RECORD_202302

执行计划看上去对不同的分区访问了不同的索引
date_TABLE_RECORD：分区键上的索引
idx_applnodeleted：appl_no, is_deleted组合索引
实际上sql可以通过DATE_CREATED近31天数据字段裁剪分区。但是如果在用这个字段的索引，那就没有任何过滤性了。而appl_no, is_deleted在分区中过滤性较好，真正的执行计划应该选择 idx_applnodeleted组合索引。
上面的explain执行计划虽然不是真正的执行计划，但是能看到5、6月份的走的索引是没有问题的，走的是appl_no, is_deleted组合索引。
查看真实的执行计划会执行sql，所以把update改成select，查看真实的执行计划

# explain (analyze,buffers,timing,verbose) select count(*) from   TABLE_RECORD                        WHERE APPL_NO = 'LZLMATH20230132302302'                  AND IS_DELETED = '0'          AND DATE_CREATED > now() - interval '31' day                  AND DATE_CREATED < now()        ;
                                                                                               QUERY PLAN                                                                                                
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=266.09..266.10 rows=1 width=8) (actual time=0.565..0.566 rows=1 loops=1)
  Output: count(*)
  Buffers: shared hit=48
  ->  Append  (cost=0.14..265.95 rows=56 width=0) (actual time=0.388..0.558 rows=1 loops=1)
        Buffers: shared hit=48
        Subplans Removed: 37
        ->  Index Scan using idx_applnodeleted_25 on public.TABLE_RECORD_202305 TABLE_RECORD_1  (cost=0.43..30.39 rows=2 width=0) (actual time=0.059..0.059 rows=0 loops=1)
              Index Cond: (((TABLE_RECORD_1.appl_no)::text = 'LZLMATH20230132302302'::text) AND ((TABLE_RECORD_1.is_deleted)::text = '0'::text))
              Filter: ((TABLE_RECORD_1.date_created < now()) AND (TABLE_RECORD_1.date_created > (now() - '31 days'::interval day)))
              Buffers: shared hit=3
        ->  Index Scan using idx_applnodeleted_14 on public.TABLE_RECORD_202306 TABLE_RECORD_2  (cost=0.56..42.52 rows=17 width=0) (actual time=0.328..0.498 rows=1 loops=1)
              Index Cond: (((TABLE_RECORD_2.appl_no)::text = 'LZLMATH20230132302302'::text) AND ((TABLE_RECORD_2.is_deleted)::text = '0'::text))
              Filter: ((TABLE_RECORD_2.date_created < now()) AND (TABLE_RECORD_2.date_created > (now() - '31 days'::interval day)))
              Buffers: shared hit=45
Planning:
  Buffers: shared hit=5867
Planning Time: 17.195 ms
Execution Time: 0.654 ms
(18 rows)

select只访问了5月和6月两个分区，说明分区裁剪没问题；两个分区都是idx_applnodeleted索引，所以索引选择也没问题。
直接执行select语句，执行时间是ms级。

# select count(*) from   TABLE_RECORD                        WHERE APPL_NO = 'LZLMATH20230132302302'                  AND IS_DELETED = '0'          AND DATE_CREATED > now() - interval '31' day                  AND DATE_CREATED < now()        ;
 count 
-------
     1
(1 row)

Time: 4.946 ms

分析到这里，看上去执行计划正常，执行时间正常

业务sql仍然缓慢

但是，postgresql日志中仍然能抓到慢sql，update执行了10s

2023-06-29 11:06:45.077 CST,"lzldbopr","lzldb",116286,"30.88.78.90:51871",649cdebf.1c63e,7,"UPDATE",2023-06-29 09:30:39 CST,759/12440291,4002354803,LOG,00000,"duration: 10287.105 ms " plan:
Query Text: update  TABLE_RECORD
                        set IS_DELETED = '1', DATE_UPDATED = LOCALTIMESTAMP(0)
                        WHERE APPL_NO = $1
                        AND IS_DELETED = '0'
                        AND DATE_CREATED > now() - interval '31' day
                        AND DATE_CREATED < now()
Update on TABLE_RECORD  (cost=0.14..203.79 rows=39 width=2960)
  Update on TABLE_RECORD_202211 TABLE_RECORD_1
...
  ->  Index Scan using TABLE_RECORD_202304_date_created_idx on TABLE_RECORD_202304 TABLE_RECORD_6  (cost=0.44..5.47 rows=1 width=481)
        Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now()))
        Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
  ->  Index Scan using TABLE_RECORD_202305_date_created_idx on TABLE_RECORD_202305 TABLE_RECORD_7  (cost=0.44..5.47 rows=1 width=483)
        Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now()))
        Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
  ->  Index Scan using TABLE_RECORD_202306_date_created_idx on TABLE_RECORD_202306 TABLE_RECORD_8  (cost=0.44..5.47 rows=1 width=485)
        Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now()))
        Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
  ->  Index Scan using idx_applnodeleted_38 on TABLE_RECORD_202307 TABLE_RECORD_9  (cost=0.14..5.17 rows=1 width=3502)
        Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
        Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day)))
...

5、6月分区还是用的date_created分区键上的索引，执行计划只估算了1行，实际上这2个分区实际上都有百万行数据。
分析到这里就非常困惑，优化器本身可以走到更好的索引，explain也看到走到了那个索引，但是业务sql就是没有走正确的索引。

更新统计信息

因为是pgsql执行计划的问题，首先都会想到收集统计信息
问题发生后，收集了分区表父表和子表的统计信息，并且担心会话缓存了执行计划（plan_cache_mode=auto），将统计信息收集时间之前的会话全部kill。
日志仍然显示sql执行10s，说明不是统计信息的问题。
做到这里问题仍然没有解决，似乎已经黔驴技穷了。

问题根因

前面分析执行计划的时候，dba explain的执行计划与应用的执行计划不一致，不过我们都是用postgres超级用户执行的。切换为应用用户再次explain执行，执行计划结果与日志中的执行计划一致！
因为之前出现过原生分区表权限问题，导致执行计划异常的情况。所以立即检查分区表分区的权限
父表权限

# \dp+ TABLE_RECORD
                                                     Access privileges
 Schema |           Name           |       Type        |          Access privileges          | Column privileges | Policies 
--------+--------------------------+-------------------+-------------------------------------+-------------------+----------
 public | TABLE_RECORD | partitioned table | lzldbdata=arwdDxt/lzldbdata    +|                   | 
        |                          |                   | r_lzldbdata_qry=r/lzldbdata    +|                   | 
        |                          |                   | r_lzldbdata_dev_qry=r/lzldbdata+|                   | 
        |                          |                   | r_lzldbdata_dml=arwd/lzldbdata +|                   | 
(1 row)

子表权限

# \dp+ TABLE_RECORD_202505
                                                  Access privileges
 Schema |              Name               | Type  |         Access privileges          | Column privileges | Policies 
--------+---------------------------------+-------+------------------------------------+-------------------+----------
 public | TABLE_RECORD_202505 | table | lzldbdata=arwdDxt/lzldbdata   +|                   |

分区权限少了r_lzldbdata_dml这个角色的,这个角色是授权给业务用户的。
立即执行授权，问题解决

grant select,update,delete,insert  on TABLE_RECORD_202305 to r_lzldbdata_dml;
grant select,update,delete,insert  on TABLE_RECORD_202306 to r_lzldbdata_dml;

再次用opr用户explain，执行计划正确，5、6月走了正常的索引
\c - lzldbopr

   ->  Index Scan using TABLE_RECORD_202303_date_created_idx on TABLE_RECORD_202303 TABLE_RECORD_5  (cost=0.44..5.47 rows=1 width=482)
         Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now()))
         Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
   ->  Index Scan using TABLE_RECORD_202304_date_created_idx on TABLE_RECORD_202304 TABLE_RECORD_6  (cost=0.44..5.47 rows=1 width=481)
         Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now()))
         Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
   ->  Index Scan using idx_applnodeleted_25 on TABLE_RECORD_202305 TABLE_RECORD_7  (cost=0.43..30.39 rows=1 width=483)
         Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
         Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day)))
   ->  Index Scan using idx_applnodeleted_14 on TABLE_RECORD_202306 TABLE_RECORD_8  (cost=0.56..42.57 rows=17 width=485)
         Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
         Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day)))
   ->  Index Scan using idx_applnodeleted_38 on TABLE_RECORD_202307 TABLE_RECORD_9  (cost=0.14..5.17 rows=1 width=3502)
         Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
         Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day)))
   ->  Index Scan using idx_applnodeleted_1 on TABLE_RECORD_202308 TABLE_RECORD_10  (cost=0.14..5.17 rows=1 width=3502)
         Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text))
         Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day)))

pg日志上也未再观察到慢update语句。

测试（没复现）

初始化建表脚本

--切换非超级用户
\c - lzldbdata
--create table
CREATE TABLE PUBLIC.LZLPARTITION
(
    APPL_NO varchar(100)  NULL,
	IS_DELETED varchar(8) NULL, 
    DATE_CREATED timestamp NOT NULL DEFAULT now(),
    DATE_UPDATED timestamp NOT NULL DEFAULT now()
) PARTITION BY RANGE(DATE_CREATED);

--indexes
create index DATE_LZLPARTITION on PUBLIC.LZLPARTITION (DATE_CREATED);
create index NK_LZLPARTITION on PUBLIC.LZLPARTITION (APPL_NO);


--privs
GRANT SELECT ON TABLE public.LZLPARTITION TO r_lzldbdata_qry;
GRANT SELECT,INSERT,UPDATE,DELETE ON TABLE public.LZLPARTITION TO r_lzldbdata_dml;

--partition
create table LZLPARTITION_202301 partition of LZLPARTITION for values from ('2023-01-01 00:00:00') to ('2023-02-01 00:00:00');
create table LZLPARTITION_202302 partition of LZLPARTITION for values from ('2023-02-01 00:00:00') to ('2023-03-01 00:00:00');
create table LZLPARTITION_202303 partition of LZLPARTITION for values from ('2023-03-01 00:00:00') to ('2023-04-01 00:00:00');
create table LZLPARTITION_202304 partition of LZLPARTITION for values from ('2023-04-01 00:00:00') to ('2023-05-01 00:00:00');
create table LZLPARTITION_202305 partition of LZLPARTITION for values from ('2023-05-01 00:00:00') to ('2023-06-01 00:00:00');
create table LZLPARTITION_202306 partition of LZLPARTITION for values from ('2023-06-01 00:00:00') to ('2023-07-01 00:00:00');

生成数据：

insert
      into
      public.LZLPARTITION
select
      n + 10,
      'N',
      to_char(to_date('2023-01-01', 'YYYY-MM-DD') + ('' || n || ' minute')  ::interval, 'YYYY-MM-DD')::"date",
      now()
from
      generate_series(0, 300000) n

数据分布：

select count(*),tableoid::regclass from lzlpartition group by 2;
 count |      tableoid       
-------+---------------------
 44640 | lzlpartition_202301
 40320 | lzlpartition_202302
 44640 | lzlpartition_202303
 43200 | lzlpartition_202304
 44640 | lzlpartition_202305
 43200 | lzlpartition_202306
 39361 | lzlpartition_202307

权限未继承：

# \dp+ lzlpartition
                                               Access privileges
 Schema |     Name     |       Type        |          Access privileges          | Column privileges | Policies 
--------+--------------+-------------------+-------------------------------------+-------------------+----------
 public | lzlpartition | partitioned table | lzldbdata=arwdDxt/lzldbdata    +|                   | 
        |              |                   | r_lzldbdata_qry=r/lzldbdata    +|                   | 
        |              |                   | r_lzldbdata_dml=arwd/lzldbdata  |                   | 


# \dp+ lzlpartition_202306
                                    Access privileges
 Schema |        Name         | Type  | Access privileges | Column privileges | Policies 
--------+---------------------+-------+-------------------+-------------------+----------
 public | lzlpartition_202306 | table |                   |                   |

执行计划（正确）：

explain  select count(*) from   lzlpartition WHERE APPL_NO = '217450' AND IS_DELETED = 'N' AND DATE_CREATED > now() - interval '31' day AND DATE_CREATED < now();
                                                                 QUERY PLAN                                                                 
--------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=36.76..36.77 rows=1 width=8)
   ->  Append  (cost=0.29..36.74 rows=7 width=0)
         Subplans Removed: 5
         ->  Index Scan using lzlpartition_202305_appl_no_idx on lzlpartition_202305 lzlpartition_1  (cost=0.15..5.19 rows=1 width=0)
               Index Cond: ((appl_no)::text = '217450'::text)
               Filter: (((is_deleted)::text = '0'::text) AND (date_created < now()) AND (date_created > (now() - '31 days'::interval day)))
         ->  Index Scan using lzlpartition_202306_appl_no_idx on lzlpartition_202306 lzlpartition_2  (cost=0.15..5.19 rows=1 width=0)
               Index Cond: ((appl_no)::text = '217450'::text)
               Filter: (((is_deleted)::text = '0'::text) AND (date_created < now()) AND (date_created > (now() - '31 days'::interval day)))

权限仍然没有继承，其实测了其他pg版本也是这样的，看上去是一个通用行为。
但是就算这样也没有复现出来，测试结果走了正确的索引，没有像生产环境那样走错误的索引。

问题总结

因为我们收集了统计信息并kill了会话，不应该是缓存执行计划的问题。执行了grant后，分区执行计划立即正确（甚至grant一个分区，对一个分区），所以比较确认是分区权限导致分区执行计划异常的。分析和处理问题的过程总结如下

切换应用用户查看执行计划。直接用超级用户登陆查看执行计划是通用做法，但是查看到的执行计划还是不对的。
分区表子表的权限问题。根因是postgresql分区表子表上的权限与父表不一致，导致执行计划异常，也就是说权限问题影响了pg的执行计划。
该问题不好复现，非常非常偶发。
权限导致执行计划异常这个问题非常隐蔽，不好定位。

想深入讨论2个问题

权限问题不应该影响执行计划。为什么权限会影响执行计划？
子表权限与父表不一致。为什么子表没有全部继承父表权限？
已提交BUG看看官方咋说

postgresql

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者