2022-12-09
postgresql为什么没用到分区裁剪
pg是12.7的版本
表结构(每个分区大概60万数据):
\d test_tab
Partitioned table "etl.test_tab"
Column | Type | Collation | Nullable | Default
-----------------------+-----------------------------+-----------+----------+---------
custnum | character varying(100) | | |
custname | character varying(300) | | |
custmanager | character varying(100) | | |
branch | character varying(40) | | |
datamonth | timestamp without time zone | | |
gridbuildingnamemajor | character varying(100) | | |
gridbuildingnamesmall | character varying(100) | | |
gridbuildingname | character varying(100) | | |
industry | character varying(100) | | |
hj_bj | numeric | | |
tongzhizhuangji | numeric | | |
zhongduan_count | numeric | | |
ll_month | numeric | | |
Partition key: RANGE (datamonth)
Indexes:
"idx_test_tab01" btree (branch)
"idx_test_tab02" btree (custmanager)
"idx_test_tab03" btree (gridbuildingnamemajor)
"idx_test_tab04" btree (gridbuildingnamesmall)
"idx_test_tab05" btree (gridbuildingname)
"idx_test_tab06" btree (custnum)
"idx_test_tab07" btree (custname, gridbuildingname)
"idx_test_tab08" btree (datamonth)
Number of partitions: 87 (Use \d+ to list them.)执行查询:
select count(*) from test_tab
where datamonth =(select max(datamonth) from test_tab);
执行计划为(中间类似的分区信息):
explain select count(*) from test_tab
where datamonth =(select max(datamonth) from test_tab);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=1005183.85..1005183.86 rows=1 width=8)
InitPlan 2 (returns $1)
-> Result (cost=24.96..24.97 rows=1 width=8)
InitPlan 1 (returns $0)
-> Limit (cost=24.90..24.96 rows=1 width=8)
-> Append (cost=24.90..1699080.58 rows=31932208 width=8)
-> Index Only Scan Backward using test_tab_p_202303_datamonth_idx on test_tab_p_202303 test_tab_p_202303_1 (cost=0.14..64.66 rows=30 width=8)
Index Cond: (datamonth IS NOT NULL)
-> Index Only Scan Backward using test_tab_p_202302_datamonth_idx on test_tab_p_202302 test_tab_p_202302_1 (cost=0.14..64.66 rows=30 width=8)
Index Cond: (datamonth IS NOT NULL)
(省略类似分区信息............)
-> Index Only Scan Backward using test_tab_p_201601_datamonth_idx on test_tab_p_201601 test_tab_p_201601_1 (cost=0.14..64.66 rows=30 width=8)
Index Cond: (datamonth IS NOT NULL)
-> Gather (cost=1005158.67..1005158.88 rows=2 width=8)
Workers Planned: 2
Params Evaluated: $1
-> Partial Aggregate (cost=1004158.67..1004158.68 rows=1 width=8)
-> Parallel Append (cost=0.42..970897.16 rows=13304601 width=0)
-> Parallel Index Only Scan using test_tab_p_202205_datamonth_idx on test_tab_p_202205 (cost=0.43..28487.90 rows=523838 width=0)
Index Cond: (datamonth = $1)
-> Parallel Index Only Scan using test_tab_p_202210_datamonth_idx on test_tab_p_202210 (cost=0.42..15042.56 rows=276505 width=0)
Index Cond: (datamonth = $1)
(省略类似分区信息............)
-> Parallel Index Only Scan using test_tab_p_202302_datamonth_idx on test_tab_p_202302 (cost=0.14..8.15 rows=1 width=0)
Index Cond: (datamonth = $1)
-> Parallel Index Only Scan using test_tab_p_202303_datamonth_idx on test_tab_p_202303 (cost=0.14..8.15 rows=1 width=0)
Index Cond: (datamonth = $1)
-> Parallel Seq Scan on test_tab_p_201904 (cost=0.00..23450.30 rows=331464 width=0)
Filter: (datamonth = $1)
(省略类似分区信息............)
Filter: (datamonth = $1)
-> Parallel Seq Scan on test_tab_p_202108 (cost=0.00..17988.96 rows=247277 width=0)
Filter: (datamonth = $1)
JIT:
Functions: 304
Options: Inlining true, Optimization true, Expressions true, Deforming true
(362 rows)好像没用到分区裁剪,所有分区都扫描了一遍?
如果用类似查询:select count(*) from test_tab where datamonth =date'2022-10-01';
将后面max部分的查询直接用具体日期代入,可以瞬间出结果。
直接查select max(datamonth) from test_tab也可以瞬间出结果。
两个组合一起就要6秒多。
pg好像无法建全局索引,max部分要全部扫描所有分区索引可以理解,但主表好像是不是也没用到分区的时间裁剪,全部扫描了一遍?
收藏
分享
1条回答
默认
最新
回答交流
提交
问题信息
请登录之后查看
邀请回答
暂无人订阅该标签,敬请期待~~
墨值悬赏

评论
