暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

pg_prewarm预热

原创 roman 2023-07-22
310

pg_prewarm可以用于在系统重启时,加载经常访问的表到操作系统的cache或PG的shared buffer,从而减少系统重启对应用的影响。
pg_prewarm插件从PG 9.4版本开始,已经融入了 PostgreSQL发行版中,无需下载编译安装程序。

pg_prewarm创建和说明

如下,只需要创建扩展即可使用:

postgres=# create extension pg_prewarm;
CREATE EXTENSION

一旦创建扩展,则就生成了响应的函数,如下:

postgres=# \dx+ pg_prewarm
           Objects in extension "pg_prewarm"
                  Object description
-------------------------------------------------------
 function autoprewarm_dump_now()
 function autoprewarm_start_worker()
 function pg_prewarm(regclass,text,text,bigint,bigint)
(3 rows)

函数autoprewarm_dump_now():表示在服务器启动期间没有配置自动预热功能时,可以使用此命令启动自动预热工作程序。

函数autoprewarm_start_worker():会马上对autoprewarm.blocks文件进行更新,如果自动预热进程当前没有运行,则希望在下次重启之后运行它。

重点介绍一下,函数pg_prewarm(regclass,text,text,bigint,bigint):包括5个参数,最终执行后会有一个返回值,是prewarm块的数量。5个参数如下:

--第一个参数regclass:是预热的relation。
--第二个参数是要使用预热的方法,有三种可用的预热方法:prefetch会向操作系统发出异步 预取请求(如果支持异步预取),不支持异步预取则抛出一个错误;read会读取要求范围的块。与prefetch 不同,它是同步的并且在所有平台上都被支持,但是可能较慢;buffer会把要求范围的块读入道数据库的缓冲区。
--第三个参数是relation fork被预热。
--第四个参数是预热的第一个块号。
--第五个参数是预热的最后一个块号。


pg_prewarm预热 分为手动调用自动执行

如下演示来解释说明手动调用自动执行。


首先创建表时默认是会直接预热的,如下,预热543个block,但这个表太小了,4328kB远小于pg默认的shared_buffers大小(128M)

postgres=# create table test_prewarm(id int,name varchar(20));
CREATE TABLE
postgres=# insert into test_prewarm select a,'name-'||a from generate_series(1,100000) as a;
INSERT 0 100000
postgres=# select count(*)  from pg_buffercache where relfilenode=pg_relation_filenode('test_prewarm');  --需要先执行create extension pg_buffercache;
 count
-------
   543
(1 row)
postgres=# select pg_size_pretty(pg_relation_size('test_prewarm'));
 pg_size_pretty
----------------
 4328 kB
(1 row)


为了能手动调用pg_prewarm函数,我们可以创建一个大表,然后将上述test_prewarm表 预热块 驱逐出去,如下:

postgres=# CREATE TABLE big(user_id int4,user_name text,ctime timestamp(6) without time zone default clock_timestamp() );
CREATE TABLE
postgres=# INSERT INTO big(user_id,user_name) select n,n||'_data' from generate_series(1,30000000) n;;
postgres=#  select pg_size_pretty(pg_relation_size('big'));
 pg_size_pretty
----------------
 1716 MB
(1 row)
postgres=# select count(*)  from pg_buffercache where relfilenode=pg_relation_filenode('test_prewarm');
 count
-------
     0
(1 row)

好了,test_prewarm表的预热块 现在已经被驱逐,当前为0个block。

此时我们就可以手动调用pg_prewarm函数,如下所示:

postgres=# select pg_prewarm('test_prewarm','buffer');
 pg_prewarm
------------
        541
(1 row)

postgres=# select count(*)  from pg_buffercache where relfilenode=pg_relation_filenode('test_prewarm');
 count
-------
   541
(1 row)


自动执行的预热方式,则需要配置shared_preload_libraries参数,注意 重启后生效。

postgres=# alter system set shared_preload_libraries=pg_prewarm;
ALTER SYSTEM

设置了shared_preload_libraries=pg_prewarm,就意味着pg_prewarm.autoprewarm(是否启动预热功能)被打开了,如下所示:

postgres=#  select name,setting,unit from pg_settings where name like 'pg_prewarm%';
              name               | setting | unit
---------------------------------+---------+------
 pg_prewarm.autoprewarm          | on      |
 pg_prewarm.autoprewarm_interval | 300     | s
(2 rows)


此时创建新表的话,则系统在$PGDATA目录中就会生成一个autoprewarm.blocks文件,

postgres=# drop table test_prewarm;
DROP TABLE
postgres=# create table test_prewarm(id int,name varchar(20));
CREATE TABLE
postgres=# insert into test_prewarm select a,'name-'||a from generate_series(1,100000) as a;
INSERT 0 100000

$ cd $PGDATA $ ll |grep autoprewarm.blocks -rw------- 1 chenruwang chenruwang 3566 Jul 23 18:11 autoprewarm.blocks


此外,pg进程中又多了一个“postgres: autoprewarm master”进程,如下:

$ ps -ef | grep postgres | grep -v grep
chenruw+  10316      1  0 18:18 ?        00:00:00 /home/chenruwang/soft/bin/postgres
chenruw+  10318  10316  0 18:18 ?        00:00:00 postgres: checkpointer
chenruw+  10319  10316  0 18:18 ?        00:00:00 postgres: background writer
chenruw+  10320  10316  0 18:18 ?        00:00:00 postgres: walwriter
chenruw+  10321  10316  0 18:18 ?        00:00:00 postgres: autovacuum launcher
chenruw+  10322  10316  0 18:18 ?        00:00:00 postgres: stats collector
chenruw+  10323  10316  0 18:18 ?        00:00:00 postgres: autoprewarm master
chenruw+  10324  10316  0 18:18 ?        00:00:00 postgres: logical replication launcher


全表扫描测试:

首先,创建一个测试表test_scan,表大小为50M,如下:

postgres=# CREATE TABLE test_scan(user_id int4,user_name text,ctime timestamp(6) without time zone default clock_timestamp() );
CREATE TABLE
postgres=# INSERT INTO test_scan(user_id,user_name) select n,n||'_data' from generate_series(1,1000000) n;;
INSERT 0 1000000
postgres=# select pg_size_pretty(pg_relation_size('test_scan'));
 pg_size_pretty
----------------
 50 MB
(1 row)


然后,预热大表big,将测试表test_scan的预热块全部驱逐,最后查看全表扫描的执行计划:

postgres=# select pg_prewarm('big','buffer');
 pg_prewarm
------------
     219605
(1 row)

postgres=# select count(*)  from pg_buffercache where relfilenode=pg_relation_filenode('test_scan');
 count
-------
     0
(1 row)

postgres=# explain (analyze,buffers) select * from test_scan;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Seq Scan on test_scan  (cost=0.00..16370.00 rows=1000000 width=23) (actual time=0.105..148.402 rows=1000000 loops=1)
   Buffers: shared read=6370
 Planning Time: 0.048 ms
 Execution Time: 225.718 ms
(4 rows)
postgres=# explain (analyze,buffers) select * from test_scan;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Seq Scan on test_scan  (cost=0.00..16370.00 rows=1000000 width=23) (actual time=0.083..141.788 rows=1000000 loops=1)
   Buffers: shared hit=32 read=6338
 Planning Time: 0.049 ms
 Execution Time: 217.181 ms
(4 rows)

postgres=# explain (analyze,buffers) select * from test_scan;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Seq Scan on test_scan  (cost=0.00..16370.00 rows=1000000 width=23) (actual time=0.054..128.720 rows=1000000 loops=1)
   Buffers: shared hit=64 read=6306
 Planning Time: 0.056 ms
 Execution Time: 196.740 ms

--首次运行shared hit 无,第二次shared hit=32,第三次hit=64 ... ; 然后Execution Time分别是 225.718 ms、217.181 ms、196.740 ms。


现在,我们手动调用pg_prewarm函数,对test_scan表进行预热,如下:

postgres=# select pg_prewarm('test_scan','buffer');
 pg_prewarm
------------
       6370
(1 row)

postgres=# explain (analyze,buffers) select * from test_scan;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Seq Scan on test_scan  (cost=0.00..16370.00 rows=1000000 width=23) (actual time=0.013..114.560 rows=1000000 loops=1)
   Buffers: shared hit=6370
 Planning Time: 0.043 ms
 Execution Time: 188.619 ms
(4 rows)

--可以很明显的观察到,shared hit=6370;Execution Time: 188.619 ms;预热后整体效果还是可以的。


重启pg自动预热,减小对业务的影响

最后,我们将数据库重新启动,看看启动后,是否有预热功能,重新启动后会发现日志中打印出一条的信息autoprewarm successfully prewarmed,它表示预热成功。

$ pg_ctl restart
waiting for server to shut down....2023-07-23 18:34:46.821 CST [10316] LOG:  received fast shutdown request
2023-07-23 18:34:46.823 CST [10316] LOG:  aborting any active transactions
2023-07-23 18:34:46.824 CST [10316] LOG:  background worker "logical replication launcher" (PID 10324) exited with exit code 1
2023-07-23 18:34:46.837 CST [10318] LOG:  shutting down
2023-07-23 18:34:46.871 CST [10316] LOG:  database system is shut down
 done
server stopped
waiting for server to start....2023-07-23 18:34:46.934 CST [10532] LOG:  starting PostgreSQL 12.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
2023-07-23 18:34:46.934 CST [10532] LOG:  listening on IPv4 address "0.0.0.0", port 5666
2023-07-23 18:34:46.934 CST [10532] LOG:  listening on IPv6 address "::", port 5666
2023-07-23 18:34:46.936 CST [10532] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5666"
2023-07-23 18:34:46.952 CST [10533] LOG:  database system was shut down at 2023-07-23 18:34:46 CST
2023-07-23 18:34:46.953 CST [10532] LOG:  database system is ready to accept connections
 done
server started
[chenruwang@pgcm1 2023-07-23 18:34:47 data]$ 2023-07-23 18:34:47.107 CST [10539] LOG:  autoprewarm successfully prewarmed 16338 of 16384 previously-loaded blocks


总结

以上是shared_buffers大小为默认128M的情况进行测试,而pg_prewarm预热与机器的内存大小有关系,内存太小、shared_buffers太小都会可能导致预热块被驱逐,影响业务运行。

总体来说,在pg_prewarm预热的帮助下,表的查询执行速度有一定的提升,效果还是可以的。




「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论