PostgreSQL 10 新特性学习-并行查询增强

原创柚子身上依 2024-09-13

448

1.并行特性增强

PostgreSQL并行特性版本演进
并行查询是PostgreSQL提供的一种功能，它允许查询在多个CPU核心或处理器上同时执行，从而显著提高查询性能，特别是在处理大型数据集时。并行查询的设计目标是在多核处理器的现代硬件上实现更好的资源利用率，减少查询响应时间。通常并行发生在并行查询聚合操作中。
PostgreSQL 9.6 是第一个支持并行查询的版本。PostgreSQL 10 支持更多的并行特性，支持并行B-Tree 索引扫描，Bitmap 扫描及Merge join，并支持并行PREPARE 语句和EXECUTE 语句。

postgres=# select name,setting from pg_settings where name ~'parallel';
              name               | setting 
---------------------------------+---------
 force_parallel_mode             | off
 max_parallel_workers            | 8
 max_parallel_workers_per_gather | 2
 min_parallel_index_scan_size    | 64
 min_parallel_table_scan_size    | 1024
 parallel_setup_cost             | 1000
 parallel_tuple_cost             | 0.1
(7 rows)

1.1.并行Seq scan

从9.6开始支持

postgres=# CREATE TABLE test ( ID int4, NAME CHARACTER VARYING ( 32 ), create_time timestamp without TIME ZONE DEFAULT clock_TIMESTAMP ( ) );
CREATE TABLE
postgres=# INSERT INTO test(id,name) SELECT n,n||'test' FROM generate_series(1,20000000) n; 
INSERT 0 20000000
postgres=# EXPLAIN ANALYZE SELECT * FROM test WHERE name='test';
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..189923.15 rows=45284 width=94) (actual time=3335.913..3337.399 rows=0 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on test  (cost=0.00..184394.75 rows=18868 width=94) (actual time=3329.592..3329.592 rows=0 loops=3)
         Filter: ((name)::text = 'test'::text)
         Rows Removed by Filter: 6666667
 Planning Time: 355.493 ms
 Execution Time: 3337.442 ms
(8 rows)

1.2.并行B-Tree 索引扫描

承接1.1.使用使用的表、数据

--并行Index Scan 
postgres=# CREATE INDEX index_test on test USING btree (id);
CREATE INDEX

postgres=# EXPLAIN ANALYZE SELECT count(name) FROM test WHERE id <1000000;
                                                                              QUERY PLAN                                                       
                        
-----------------------------------------------------------------------------------------------------------------------------------------------
------------------------
 Finalize Aggregate  (cost=31569.32..31569.33 rows=1 width=8) (actual time=189.852..192.015 rows=1 loops=1)
   ->  Gather  (cost=31569.10..31569.31 rows=2 width=8) (actual time=189.665..192.007 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=30569.10..30569.11 rows=1 width=8) (actual time=134.214..134.215 rows=1 loops=3)
               ->  Parallel Index Scan using index_test on test  (cost=0.44..29526.59 rows=417005 width=12) (actual time=0.187..112.0
64 rows=333333 loops=3)
                     Index Cond: (id < 1000000)
 Planning Time: 0.120 ms
 Execution Time: 192.045 ms
(9 rows)

--并行Index Only Scan
postgres=# EXPLAIN ANALYZE SELECT count(*) from test WHERE id <1000000;
                                                                              QUERY PLAN                                                                               
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=31484.71..31484.72 rows=1 width=8) (actual time=211813.939..211818.297 rows=1 loops=1)
   ->  Gather  (cost=31484.50..31484.71 rows=2 width=8) (actual time=211813.471..211818.290 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=30484.50..30484.51 rows=1 width=8) (actual time=211809.388..211809.389 rows=1 loops=3)
               ->  Parallel Index Only Scan using index_test on test  (cost=0.44..29444.82 rows=415871 width=0) (actual time=1533.892..211111.445 rows=333333 loops=3)
                     Index Cond: (id < 1000000)
                     Heap Fetches: 347055
 Planning time: 0.067 ms
 Execution time: 211818.408 ms
(10 rows)

postgres=# \d test
                               Table "public.test"
   Column    |            Type             | Collation | Nullable |      Default      
-------------+-----------------------------+-----------+----------+-------------------
 id          | integer                     |           |          | 
 name        | character varying(32)       |           |          | 
 create_time | timestamp without time zone |           |          | clock_timestamp()
Indexes:
    "index_test" btree (id)

1.3.并行Bitmap 索引扫描

承接1.2.使用使用的表、数据及索引

postgres=# EXPLAIN ANALYZE SELECT count(*) from test WHERE id<1000000 or id >490000000;
                                                                           QUERY PLAN                                                          
                  
-----------------------------------------------------------------------------------------------------------------------------------------------
------------------
 Finalize Aggregate  (cost=254902.77..254902.78 rows=1 width=8) (actual time=70.467..71.554 rows=1 loops=1)
   ->  Gather  (cost=254902.56..254902.77 rows=2 width=8) (actual time=70.236..71.546 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=253902.56..253902.57 rows=1 width=8) (actual time=65.678..65.681 rows=1 loops=3)
               ->  Parallel Bitmap Heap Scan on test  (cost=18991.39..252860.04 rows=417005 width=0) (actual time=31.622..52.392 rows=3333
33 loops=3)
                     Recheck Cond: ((id < 1000000) OR (id > 490000000))
                     Heap Blocks: exact=2216
                     ->  BitmapOr  (cost=18991.39..18991.39 rows=1000813 width=0) (actual time=34.953..34.954 rows=0 loops=1)
                           ->  Bitmap Index Scan on index_test  (cost=0.00..18486.53 rows=1000813 width=0) (actual time=34.938..34.938 row
s=999999 loops=1)
                                 Index Cond: (id < 1000000)
                           ->  Bitmap Index Scan on index_test  (cost=0.00..4.45 rows=1 width=0) (actual time=0.007..0.008 rows=0 loops=1)
                                 Index Cond: (id > 490000000)
 Planning Time: 0.146 ms
 Execution Time: 71.623 ms
(15 rows)

1.4.并行Merge Join

承接1.2.使用使用的表、数据及索引

postgres=# explain analyze select * from test t1,test t2 where t1.name=t2.name and t1.id<1000000;
                                                                               QUERY PLAN                                                      
                         
-----------------------------------------------------------------------------------------------------------------------------------------------
-------------------------
 Gather  (cost=1676933.12..1830357.23 rows=1000813 width=48) (actual time=33982.911..45257.909 rows=999999 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Merge Join  (cost=1675933.12..1729275.93 rows=417005 width=48) (actual time=35278.750..43458.546 rows=333333 loops=3)
         Merge Cond: ((t2.name)::text = (t1.name)::text)
         ->  Sort  (cost=1520296.22..1541129.55 rows=8333333 width=24) (actual time=31973.077..36182.147 rows=6666667 loops=3)
               Sort Key: t2.name
               Sort Method: external merge  Disk: 229864kB
               Worker 0:  Sort Method: external merge  Disk: 290016kB
               Worker 1:  Sort Method: external merge  Disk: 223960kB
               ->  Parallel Seq Scan on test t2  (cost=0.00..220557.33 rows=8333333 width=24) (actual time=0.037..21390.787 rows=6666667 l
oops=3)
         ->  Materialize  (cost=155636.90..160640.96 rows=1000813 width=24) (actual time=3305.360..3618.343 rows=999997 loops=3)
               ->  Sort  (cost=155636.90..158138.93 rows=1000813 width=24) (actual time=3305.356..3520.754 rows=999997 loops=3)
                     Sort Key: t1.name
                     Sort Method: external merge  Disk: 33296kB
                     Worker 0:  Sort Method: external merge  Disk: 33296kB
                     Worker 1:  Sort Method: external merge  Disk: 33296kB
                     ->  Index Scan using index_test on test t1  (cost=0.44..35364.67 rows=1000813 width=24) (actual time=0.089..440.
728 rows=999999 loops=3)
                           Index Cond: (id < 1000000)
 Planning Time: 0.119 ms
 Execution Time: 45328.660 ms
(21 rows)

1.5.并行PREPARE与EXECUTE

PREPARE 和 EXECUTE 支持并行，在PostgreSQL 9.6 中，该进程未执行并行处理。

postgres=# prepare p as select count(1) from test;
PREPARE
postgres=# explain execute p;
                                        QUERY PLAN                                        
------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=242390.88..242390.89 rows=1 width=8)
   ->  Gather  (cost=242390.67..242390.88 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=241390.67..241390.68 rows=1 width=8)
               ->  Parallel Seq Scan on test  (cost=0.00..220557.33 rows=8333333 width=0)
(5 rows)

1.6.SubPlan

postgres=# explain analyze SELECT count(*) from test where id not in(select id from test where id in(100,1000,10000));
                                                                       QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=252824.92..252824.93 rows=1 width=8) (actual time=4077.534..4103.954 rows=1 loops=1)
   ->  Gather  (cost=252824.71..252824.92 rows=2 width=8) (actual time=4069.620..4103.945 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=251824.71..251824.72 rows=1 width=8) (actual time=4068.743..4068.745 rows=1 loops=3)
               ->  Parallel Seq Scan on test  (cost=17.37..241408.04 rows=4166667 width=0) (actual time=0.140..3132.527 rows=6666666 loops=3)
                     Filter: (NOT (hashed SubPlan 1))
                     Rows Removed by Filter: 1
                     SubPlan 1
                       ->  Index Only Scan using index_test on test test_1  (cost=0.44..17.37 rows=3 width=4) (actual time=0.046..0.075 rows=3 loops=3)
                             Index Cond: (id = ANY ('{100,1000,10000}'::integer[]))
                             Heap Fetches: 3
 Planning time: 0.138 ms
 Execution time: 4104.002 ms
(14 rows)

参考

《PostgreSQL_10_New_Features_en_20170522-1》
《快速掌握PostgreSQL版本新特性》

墨力计划

最后修改时间：2024-12-12 15:42:21

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

文章被以下合辑收录

PostgreSQL 笔记（共21篇）

学习笔记记录