暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

权限问题导致分区表使用了错误的索引

原创 liuzhilong62 2023-07-01
301

问题现象

业务昨晚对sql进行了更新,之前没有DATE_CREATED字段(是分区键),跑的很快。发版后加了分区字段,本身是为了减少分区数量的访问,但是加了以后UPDATE执行却变慢了。
before:

update TABLE_RECORD set IS_DELETED = '1', DATE_UPDATED = LOCALTIMESTAMP(0) WHERE APPL_NO = $1 AND IS_DELETED = '0'

after:

update TABLE_RECORD set IS_DELETED = '1', DATE_UPDATED = LOCALTIMESTAMP(0) WHERE APPL_NO = $1 AND IS_DELETED = '0' AND DATE_CREATED > now() - interval '31' day AND DATE_CREATED < now()

发版前访问时间是毫秒级,发版后访问时间是10s,sql运行频繁,业务不可接受。

问题分析

执行计划看上去是正确的

表结构如下

# \d+ TABLE_RECORD Partitioned table "public.TABLE_RECORD" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------------------------------+-----------------------------+-----------+----------+---------------------------------------------------+----------+--------------+-------------------------- id_TABLE_RECORD | character varying(32) | | not null | nextval('seq_TABLE_RECORD'::regclass) | extended | | appl_no | character varying(100) | | | | extended | | r_appl_no | character varying(100) | | | | extended | | ... created_by | character varying(100) | | not null | 'sys'::character varying | extended | | date_created | timestamp without time zone | | not null | now() | plain | | updated_by | character varying(100) | | not null | 'sys'::character varying | extended | | date_updated | timestamp without time zone | | not null | now() | plain | | Partition key: RANGE (date_created) Indexes: "date_TABLE_RECORD" btree (date_created) "idx_dateupdated" btree (date_updated) "idx_applnodeleted" btree (appl_no, is_deleted) "nk_TABLE_RECORD" btree (appl_no) Partitions: TABLE_RECORD_202211 FOR VALUES FROM ('2022-11-01 00:00:00') TO ('2022-12-01 00:00:00'), ... TABLE_RECORD_202303 FOR VALUES FROM ('2023-03-01 00:00:00') TO ('2023-04-01 00:00:00'), TABLE_RECORD_202304 FOR VALUES FROM ('2023-04-01 00:00:00') TO ('2023-05-01 00:00:00'), TABLE_RECORD_202305 FOR VALUES FROM ('2023-05-01 00:00:00') TO ('2023-06-01 00:00:00'), TABLE_RECORD_202306 FOR VALUES FROM ('2023-06-01 00:00:00') TO ('2023-07-01 00:00:00'), ... TABLE_RECORD_202512 FOR VALUES FROM ('2025-12-01 00:00:00') TO ('2026-01-01 00:00:00'), TABLE_RECORD_other DEFAULT

这个sql会访问近2个月的分区,这2个分区都是有数据的,上面的update只会更新一条数据。
刚开始分析问题非常困惑,因为我们explain的时候,执行计划是没有问题的
explain分区扫描信息:

-> Index Scan using TABLE_RECORD_202302_date_created_idx on TABLE_RECORD_202302 TABLE_RECORD_4 (cost=0.44..5.47 rows=1 width=485) Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now())) Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) -> Index Scan using TABLE_RECORD_202303_date_created_idx on TABLE_RECORD_202303 TABLE_RECORD_5 (cost=0.44..5.47 rows=1 width=482) Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now())) Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) -> Index Scan using TABLE_RECORD_202304_date_created_idx on TABLE_RECORD_202304 TABLE_RECORD_6 (cost=0.44..5.47 rows=1 width=481) Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now())) Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) -> Index Scan using idx_applnodeleted_25 on TABLE_RECORD_202305 TABLE_RECORD_7 (cost=0.43..30.49 rows=1 width=483) Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day))) -> Index Scan using idx_applnodeleted_14 on TABLE_RECORD_202306 TABLE_RECORD_8 (cost=0.56..45.11 rows=18 width=485) Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day))) -> Index Scan using idx_applnodeleted_38 on TABLE_RECORD_202307 TABLE_RECORD_9 (cost=0.14..5.17 rows=1 width=3502) Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day))) -> Index Scan using idx_applnodeleted_1 on TABLE_RECORD_202308 TABLE_RECORD_10 (cost=0.14..5.17 rows=1 width=3502)

分区数据分布:

select count(*),tableoid::regclass from TABLE_RECORD group by 2; count | tableoid -------+--------------------------------- 56558 | TABLE_RECORD_202303 4436 | TABLE_RECORD_202211 6929 | TABLE_RECORD_202306 945 | TABLE_RECORD_202305 1413 | TABLE_RECORD_202304 5499 | TABLE_RECORD_202212 1486 | TABLE_RECORD_202301 4722 | TABLE_RECORD_202302

执行计划看上去对不同的分区访问了不同的索引
date_TABLE_RECORD:分区键上的索引
idx_applnodeleted:appl_no, is_deleted组合索引
实际上sql可以通过DATE_CREATED近31天数据字段裁剪分区。但是如果在用这个字段的索引,那就没有任何过滤性了。而appl_no, is_deleted在分区中过滤性较好,真正的执行计划应该选择 idx_applnodeleted组合索引。
上面的explain执行计划虽然不是真正的执行计划,但是能看到5、6月份的走的索引是没有问题的,走的是appl_no, is_deleted组合索引。
查看真实的执行计划会执行sql,所以把update改成select,查看真实的执行计划

# explain (analyze,buffers,timing,verbose) select count(*) from TABLE_RECORD WHERE APPL_NO = 'LZLMATH20230132302302' AND IS_DELETED = '0' AND DATE_CREATED > now() - interval '31' day AND DATE_CREATED < now() ; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=266.09..266.10 rows=1 width=8) (actual time=0.565..0.566 rows=1 loops=1) Output: count(*) Buffers: shared hit=48 -> Append (cost=0.14..265.95 rows=56 width=0) (actual time=0.388..0.558 rows=1 loops=1) Buffers: shared hit=48 Subplans Removed: 37 -> Index Scan using idx_applnodeleted_25 on public.TABLE_RECORD_202305 TABLE_RECORD_1 (cost=0.43..30.39 rows=2 width=0) (actual time=0.059..0.059 rows=0 loops=1) Index Cond: (((TABLE_RECORD_1.appl_no)::text = 'LZLMATH20230132302302'::text) AND ((TABLE_RECORD_1.is_deleted)::text = '0'::text)) Filter: ((TABLE_RECORD_1.date_created < now()) AND (TABLE_RECORD_1.date_created > (now() - '31 days'::interval day))) Buffers: shared hit=3 -> Index Scan using idx_applnodeleted_14 on public.TABLE_RECORD_202306 TABLE_RECORD_2 (cost=0.56..42.52 rows=17 width=0) (actual time=0.328..0.498 rows=1 loops=1) Index Cond: (((TABLE_RECORD_2.appl_no)::text = 'LZLMATH20230132302302'::text) AND ((TABLE_RECORD_2.is_deleted)::text = '0'::text)) Filter: ((TABLE_RECORD_2.date_created < now()) AND (TABLE_RECORD_2.date_created > (now() - '31 days'::interval day))) Buffers: shared hit=45 Planning: Buffers: shared hit=5867 Planning Time: 17.195 ms Execution Time: 0.654 ms (18 rows)

select只访问了5月和6月两个分区,说明分区裁剪没问题;两个分区都是idx_applnodeleted索引,所以索引选择也没问题。
直接执行select语句,执行时间是ms级。

# select count(*) from TABLE_RECORD WHERE APPL_NO = 'LZLMATH20230132302302' AND IS_DELETED = '0' AND DATE_CREATED > now() - interval '31' day AND DATE_CREATED < now() ; count ------- 1 (1 row) Time: 4.946 ms

分析到这里,看上去执行计划正常,执行时间正常

业务sql仍然缓慢

但是,postgresql日志中仍然能抓到慢sql,update执行了10s

2023-06-29 11:06:45.077 CST,"lzldbopr","lzldb",116286,"30.88.78.90:51871",649cdebf.1c63e,7,"UPDATE",2023-06-29 09:30:39 CST,759/12440291,4002354803,LOG,00000,"duration: 10287.105 ms " plan: Query Text: update TABLE_RECORD set IS_DELETED = '1', DATE_UPDATED = LOCALTIMESTAMP(0) WHERE APPL_NO = $1 AND IS_DELETED = '0' AND DATE_CREATED > now() - interval '31' day AND DATE_CREATED < now() Update on TABLE_RECORD (cost=0.14..203.79 rows=39 width=2960) Update on TABLE_RECORD_202211 TABLE_RECORD_1 ... -> Index Scan using TABLE_RECORD_202304_date_created_idx on TABLE_RECORD_202304 TABLE_RECORD_6 (cost=0.44..5.47 rows=1 width=481) Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now())) Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) -> Index Scan using TABLE_RECORD_202305_date_created_idx on TABLE_RECORD_202305 TABLE_RECORD_7 (cost=0.44..5.47 rows=1 width=483) Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now())) Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) -> Index Scan using TABLE_RECORD_202306_date_created_idx on TABLE_RECORD_202306 TABLE_RECORD_8 (cost=0.44..5.47 rows=1 width=485) Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now())) Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) -> Index Scan using idx_applnodeleted_38 on TABLE_RECORD_202307 TABLE_RECORD_9 (cost=0.14..5.17 rows=1 width=3502) Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day))) ...

5、6月分区还是用的date_created分区键上的索引,执行计划只估算了1行,实际上这2个分区实际上都有百万行数据。
分析到这里就非常困惑,优化器本身可以走到更好的索引,explain也看到走到了那个索引,但是业务sql就是没有走正确的索引。

更新统计信息

因为是pgsql执行计划的问题,首先都会想到收集统计信息
问题发生后,收集了分区表父表和子表的统计信息,并且担心会话缓存了执行计划(plan_cache_mode=auto),将统计信息收集时间之前的会话全部kill。
日志仍然显示sql执行10s,说明不是统计信息的问题。
做到这里问题仍然没有解决,似乎已经黔驴技穷了。

问题根因

前面分析执行计划的时候,dba explain的执行计划与应用的执行计划不一致,不过我们都是用postgres超级用户执行的。切换为应用用户再次explain执行,执行计划结果与日志中的执行计划一致!
因为之前出现过原生分区表权限问题,导致执行计划异常的情况。所以立即检查分区表分区的权限
父表权限

# \dp+ TABLE_RECORD Access privileges Schema | Name | Type | Access privileges | Column privileges | Policies --------+--------------------------+-------------------+-------------------------------------+-------------------+---------- public | TABLE_RECORD | partitioned table | lzldbdata=arwdDxt/lzldbdata +| | | | | r_lzldbdata_qry=r/lzldbdata +| | | | | r_lzldbdata_dev_qry=r/lzldbdata+| | | | | r_lzldbdata_dml=arwd/lzldbdata +| | (1 row)

子表权限

# \dp+ TABLE_RECORD_202505 Access privileges Schema | Name | Type | Access privileges | Column privileges | Policies --------+---------------------------------+-------+------------------------------------+-------------------+---------- public | TABLE_RECORD_202505 | table | lzldbdata=arwdDxt/lzldbdata +| |

分区权限少了r_lzldbdata_dml这个角色的,这个角色是授权给业务用户的。
立即执行授权,问题解决

grant select,update,delete,insert on TABLE_RECORD_202305 to r_lzldbdata_dml; grant select,update,delete,insert on TABLE_RECORD_202306 to r_lzldbdata_dml;

再次用opr用户explain,执行计划正确,5、6月走了正常的索引
\c - lzldbopr

-> Index Scan using TABLE_RECORD_202303_date_created_idx on TABLE_RECORD_202303 TABLE_RECORD_5 (cost=0.44..5.47 rows=1 width=482) Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now())) Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) -> Index Scan using TABLE_RECORD_202304_date_created_idx on TABLE_RECORD_202304 TABLE_RECORD_6 (cost=0.44..5.47 rows=1 width=481) Index Cond: ((date_created > (now() - '31 days'::interval day)) AND (date_created < now())) Filter: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) -> Index Scan using idx_applnodeleted_25 on TABLE_RECORD_202305 TABLE_RECORD_7 (cost=0.43..30.39 rows=1 width=483) Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day))) -> Index Scan using idx_applnodeleted_14 on TABLE_RECORD_202306 TABLE_RECORD_8 (cost=0.56..42.57 rows=17 width=485) Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day))) -> Index Scan using idx_applnodeleted_38 on TABLE_RECORD_202307 TABLE_RECORD_9 (cost=0.14..5.17 rows=1 width=3502) Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day))) -> Index Scan using idx_applnodeleted_1 on TABLE_RECORD_202308 TABLE_RECORD_10 (cost=0.14..5.17 rows=1 width=3502) Index Cond: (((appl_no)::text = 'LZLMATH20230132302302'::text) AND ((is_deleted)::text = '0'::text)) Filter: ((date_created < now()) AND (date_created > (now() - '31 days'::interval day)))

pg日志上也未再观察到慢update语句。

测试(没复现)

初始化建表脚本

--切换非超级用户 \c - lzldbdata --create table CREATE TABLE PUBLIC.LZLPARTITION ( APPL_NO varchar(100) NULL, IS_DELETED varchar(8) NULL, DATE_CREATED timestamp NOT NULL DEFAULT now(), DATE_UPDATED timestamp NOT NULL DEFAULT now() ) PARTITION BY RANGE(DATE_CREATED); --indexes create index DATE_LZLPARTITION on PUBLIC.LZLPARTITION (DATE_CREATED); create index NK_LZLPARTITION on PUBLIC.LZLPARTITION (APPL_NO); --privs GRANT SELECT ON TABLE public.LZLPARTITION TO r_lzldbdata_qry; GRANT SELECT,INSERT,UPDATE,DELETE ON TABLE public.LZLPARTITION TO r_lzldbdata_dml; --partition create table LZLPARTITION_202301 partition of LZLPARTITION for values from ('2023-01-01 00:00:00') to ('2023-02-01 00:00:00'); create table LZLPARTITION_202302 partition of LZLPARTITION for values from ('2023-02-01 00:00:00') to ('2023-03-01 00:00:00'); create table LZLPARTITION_202303 partition of LZLPARTITION for values from ('2023-03-01 00:00:00') to ('2023-04-01 00:00:00'); create table LZLPARTITION_202304 partition of LZLPARTITION for values from ('2023-04-01 00:00:00') to ('2023-05-01 00:00:00'); create table LZLPARTITION_202305 partition of LZLPARTITION for values from ('2023-05-01 00:00:00') to ('2023-06-01 00:00:00'); create table LZLPARTITION_202306 partition of LZLPARTITION for values from ('2023-06-01 00:00:00') to ('2023-07-01 00:00:00');

生成数据:

insert into public.LZLPARTITION select n + 10, 'N', to_char(to_date('2023-01-01', 'YYYY-MM-DD') + ('' || n || ' minute') ::interval, 'YYYY-MM-DD')::"date", now() from generate_series(0, 300000) n

数据分布:

select count(*),tableoid::regclass from lzlpartition group by 2; count | tableoid -------+--------------------- 44640 | lzlpartition_202301 40320 | lzlpartition_202302 44640 | lzlpartition_202303 43200 | lzlpartition_202304 44640 | lzlpartition_202305 43200 | lzlpartition_202306 39361 | lzlpartition_202307

权限未继承:

# \dp+ lzlpartition Access privileges Schema | Name | Type | Access privileges | Column privileges | Policies --------+--------------+-------------------+-------------------------------------+-------------------+---------- public | lzlpartition | partitioned table | lzldbdata=arwdDxt/lzldbdata +| | | | | r_lzldbdata_qry=r/lzldbdata +| | | | | r_lzldbdata_dml=arwd/lzldbdata | | # \dp+ lzlpartition_202306 Access privileges Schema | Name | Type | Access privileges | Column privileges | Policies --------+---------------------+-------+-------------------+-------------------+---------- public | lzlpartition_202306 | table | | |

执行计划(正确):

explain select count(*) from lzlpartition WHERE APPL_NO = '217450' AND IS_DELETED = 'N' AND DATE_CREATED > now() - interval '31' day AND DATE_CREATED < now(); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=36.76..36.77 rows=1 width=8) -> Append (cost=0.29..36.74 rows=7 width=0) Subplans Removed: 5 -> Index Scan using lzlpartition_202305_appl_no_idx on lzlpartition_202305 lzlpartition_1 (cost=0.15..5.19 rows=1 width=0) Index Cond: ((appl_no)::text = '217450'::text) Filter: (((is_deleted)::text = '0'::text) AND (date_created < now()) AND (date_created > (now() - '31 days'::interval day))) -> Index Scan using lzlpartition_202306_appl_no_idx on lzlpartition_202306 lzlpartition_2 (cost=0.15..5.19 rows=1 width=0) Index Cond: ((appl_no)::text = '217450'::text) Filter: (((is_deleted)::text = '0'::text) AND (date_created < now()) AND (date_created > (now() - '31 days'::interval day)))

权限仍然没有继承,其实测了其他pg版本也是这样的,看上去是一个通用行为。
但是就算这样也没有复现出来,测试结果走了正确的索引,没有像生产环境那样走错误的索引。

问题总结

因为我们收集了统计信息并kill了会话,不应该是缓存执行计划的问题。执行了grant后,分区执行计划立即正确(甚至grant一个分区,对一个分区),所以比较确认是分区权限导致分区执行计划异常的。分析和处理问题的过程总结如下

  1. 切换应用用户查看执行计划。直接用超级用户登陆查看执行计划是通用做法,但是查看到的执行计划还是不对的。
  2. 分区表子表的权限问题。根因是postgresql分区表子表上的权限与父表不一致,导致执行计划异常,也就是说权限问题影响了pg的执行计划。
  3. 该问题不好复现,非常非常偶发。
  4. 权限导致执行计划异常这个问题非常隐蔽,不好定位。

想深入讨论2个问题

  1. 权限问题不应该影响执行计划。为什么权限会影响执行计划?
  2. 子表权限与父表不一致。为什么子表没有全部继承父表权限?
    已提交BUG看看官方咋说
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论