暂无图片
暂无图片
3
暂无图片
暂无图片
暂无图片

PostgreSQL17新特性之-新增系统视图

原创 阎书利 2024-05-29
658

PostgreSQL-17-beta1版本目前已经发布了,每个版本随着功能的增加和一些相关优化,会对部分视图进行调整,以及增加新的视图。

postgres<17beta1>(ConnAs[postgres]:PID[21362] 2024-05-28/13:19:58)=# select version();
+------------------------------------------------------------------------------------------------------------+
|                                                  version                                                   |
+------------------------------------------------------------------------------------------------------------+
| PostgreSQL 17beta1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit |
+------------------------------------------------------------------------------------------------------------+
(1 row)

postgres<17beta1>(ConnAs[postgres]:PID[21362] 2024-05-28/13:20:02)=# select * from pg_control_system();
+--------------------+--------------------+---------------------+--------------------------+
| pg_control_version | catalog_version_no |  system_identifier  | pg_control_last_modified |
+--------------------+--------------------+---------------------+--------------------------+
|               1300 |          202405161 | 7373892286583727603 | 2024-05-28 11:57:26+08   |
+--------------------+--------------------+---------------------+--------------------------+
(1 row)

目前的PostgreSQL-17-beta1新版本新增了两个视图,分别为pg_wait_events和pg_stat_checkpointer。pg_stat_checkpointer视图包含检查点信息,pg_wait_events 主要包含等待事件的描述。

postgres<17beta1>(ConnAs[postgres]:PID[21590] 2024-05-28/13:37:39)=# \d pg_stat_checkpointer
                      View "pg_catalog.pg_stat_checkpointer"
+---------------------+--------------------------+-----------+----------+---------+
|       Column        |           Type           | Collation | Nullable | Default |
+---------------------+--------------------------+-----------+----------+---------+
| num_timed           | bigint                   |           |          |         |
| num_requested       | bigint                   |           |          |         |
| restartpoints_timed | bigint                   |           |          |         |
| restartpoints_req   | bigint                   |           |          |         |
| restartpoints_done  | bigint                   |           |          |         |
| write_time          | double precision         |           |          |         |
| sync_time           | double precision         |           |          |         |
| buffers_written     | bigint                   |           |          |         |
| stats_reset         | timestamp with time zone |           |          |         |
+---------------------+--------------------------+-----------+----------+---------+

postgres<17beta1>(ConnAs[postgres]:PID[21590] 2024-05-28/13:37:42)=# \d pg_wait_events
           View "pg_catalog.pg_wait_events"
+-------------+------+-----------+----------+---------+
|   Column    | Type | Collation | Nullable | Default |
+-------------+------+-----------+----------+---------+
| type        | text |           |          |         |
| name        | text |           |          |         |
| description | text |           |          |         |
+-------------+------+-----------+----------+---------+

image.png

一、pg_stat_checkpointer视图

1.pg_stat_checkpointer视图解析

在之前的版本,检查点的统计信息通常在pg_stat_bgwriter里查看,但是PostgreSQL-17-beta1新版本把pg_stat_bgwriter的一些列删除掉了,并用pg_stat_checkpointer中的列做了对应取代。

具体的替代关系如下:
pg_stat_bgwriter.checkpoints_timed -> pg_stat_checkpointer.num_timed
pg_stat_bgwriter.checkpoints_req -> pg_stat_checkpointer.num_requested
pg_stat_bgwriter.checkpoint_write_time -> pg_stat_checkpointer.write_time
pg_stat_bgwriter.checkpoint_sync_time -> pg_stat_checkpointer.sync_time
pg_stat_bgwriter.buffers_checkpoint -> pg_stat_checkpointer.buffers_written

postgres<16.1>(ConnAs[postgres]:PID[4639] 2024-05-29/11:03:33)=# \d pg_stat_bgwriter 
                         View "pg_catalog.pg_stat_bgwriter"
+-----------------------+--------------------------+-----------+----------+---------+
|        Column         |           Type           | Collation | Nullable | Default |
+-----------------------+--------------------------+-----------+----------+---------+
| checkpoints_timed     | bigint                   |           |          |         |
| checkpoints_req       | bigint                   |           |          |         |
| checkpoint_write_time | double precision         |           |          |         |
| checkpoint_sync_time  | double precision         |           |          |         |
| buffers_checkpoint    | bigint                   |           |          |         |
| buffers_clean         | bigint                   |           |          |         |
| maxwritten_clean      | bigint                   |           |          |         |
| buffers_backend       | bigint                   |           |          |         |
| buffers_backend_fsync | bigint                   |           |          |         |
| buffers_alloc         | bigint                   |           |          |         |
| stats_reset           | timestamp with time zone |           |          |         |
+-----------------------+--------------------------+-----------+----------+---------+

postgres<17beta1>(ConnAs[postgres]:PID[21590] 2024-05-28/13:37:47)=# \d pg_stat_bgwriter
                       View "pg_catalog.pg_stat_bgwriter"
+------------------+--------------------------+-----------+----------+---------+
|      Column      |           Type           | Collation | Nullable | Default |
+------------------+--------------------------+-----------+----------+---------+
| buffers_clean    | bigint                   |           |          |         |
| maxwritten_clean | bigint                   |           |          |         |
| buffers_alloc    | bigint                   |           |          |         |
| stats_reset      | timestamp with time zone |           |          |         |
+------------------+--------------------------+-----------+----------+---------+

新的视图 pg_stat_checkpointer最主要需要关注的是restartpoints_timed、restartpoints_req和restartpoints_done三列,最后一列显示实际进行了多少次restartpoint。需要注意的是备库的restartpoint不能比主库上的restartpoint更频繁。

postgres<17beta1>(ConnAs[postgres]:PID[21362] 2024-05-28/13:20:11)=# select * from pg_stat_checkpointer;
+-----------+---------------+---------------------+-------------------+--------------------+------------+-----------+-----------------+-------------------------------+
| num_timed | num_requested | restartpoints_timed | restartpoints_req | restartpoints_done | write_time | sync_time | buffers_written |          stats_reset          |
+-----------+---------------+---------------------+-------------------+--------------------+------------+-----------+-----------------+-------------------------------+
|        18 |             1 |                   0 |                 0 |                  0 |          4 |         2 |              45 | 2024-05-28 11:48:07.339976+08 |
+-----------+---------------+---------------------+-------------------+--------------------+------------+-----------+-----------------+-------------------------------+
(1 row)

postgres<17beta1>(ConnAs[postgres]:PID[21362] 2024-05-28/13:24:14)=# checkpoint;
CHECKPOINT
postgres<17beta1>(ConnAs[postgres]:PID[21362] 2024-05-28/13:24:17)=# select * from pg_stat_checkpointer;
+-----------+---------------+---------------------+-------------------+--------------------+------------+-----------+-----------------+-------------------------------+
| num_timed | num_requested | restartpoints_timed | restartpoints_req | restartpoints_done | write_time | sync_time | buffers_written |          stats_reset          |
+-----------+---------------+---------------------+-------------------+--------------------+------------+-----------+-----------------+-------------------------------+
|        18 |             2 |                   0 |                 0 |                  0 |          6 |         3 |              45 | 2024-05-28 11:48:07.339976+08 |
+-----------+---------------+---------------------+-------------------+--------------------+------------+-----------+-----------------+-------------------------------+
(1 row)

image.png

各列的具体含义如下所示

列类型 描述
num_timed bigint 已执行的计划检查点数量
num_requested bigint 已执行的请求检查点的数量
restartpoints_timed bigint 由于超时或尝试执行失败而计划的重新启动点数
restartpoints_req bigint 请求的重启点数
restartpoints_done bigint 已执行的重启点数
write_time double precision 处理检查点和重启点(将文件写入磁盘)部分所花费的总时间(以毫秒为单位)
sync_time double precision 处理检查点和重启点(文件同步到磁盘)部分所花费的总时间(以毫秒为单位)
buffers_written bigint 检查点和重启点期间写入的缓冲区数
stats_reset timestamp with time zone 这些统计数据上次重置的时间

如果想重置pg_stat_checkpointer视图里记录的统计数据,可以使用pg_stat_reset_shared()函数执行如下操作,

postgres<17beta1>(ConnAs[postgres]:PID[21590] 2024-05-28/15:03:14)=# select * from pg_stat_checkpointer;
+-----------+---------------+---------------------+-------------------+--------------------+------------+-----------+-----------------+-------------------------------+
| num_timed | num_requested | restartpoints_timed | restartpoints_req | restartpoints_done | write_time | sync_time | buffers_written |          stats_reset          |
+-----------+---------------+---------------------+-------------------+--------------------+------------+-----------+-----------------+-------------------------------+
|        37 |             2 |                   0 |                 0 |                  0 |          6 |         3 |              45 | 2024-05-28 11:48:07.339976+08 |
+-----------+---------------+---------------------+-------------------+--------------------+------------+-----------+-----------------+-------------------------------+
(1 row)

postgres<17beta1>(ConnAs[postgres]:PID[21590] 2024-05-28/15:03:22)=# SELECT pg_stat_reset_shared('checkpointer');
+----------------------+
| pg_stat_reset_shared |
+----------------------+
|                      |
+----------------------+
(1 row)

postgres<17beta1>(ConnAs[postgres]:PID[21590] 2024-05-28/15:03:30)=# select * from pg_stat_checkpointer;
+-----------+---------------+---------------------+-------------------+--------------------+------------+-----------+-----------------+-------------------------------+
| num_timed | num_requested | restartpoints_timed | restartpoints_req | restartpoints_done | write_time | sync_time | buffers_written |          stats_reset          |
+-----------+---------------+---------------------+-------------------+--------------------+------------+-----------+-----------------+-------------------------------+
|         0 |             0 |                   0 |                 0 |                  0 |          0 |         0 |               0 | 2024-05-28 15:03:30.770561+08 |
+-----------+---------------+---------------------+-------------------+--------------------+------------+-----------+-----------------+-------------------------------+
(1 row)

2.关于重启点(restartpoint)

restartpoint(重启点):重启点其实就是在recovery恢复期间执行的类似checkpoint的操作,此时它执行的是重启点,并不是一个新的检查点。主要做的工作是:服务器将其所有状态强制刷到磁盘,更新pg_control文件以指示已处理的WAL数据无需再次扫描,然后回收pg_wal目录中的任何旧WAL段文件。

一般备库使用的"checkpoint"就是和主库的checkpoint功能类似的restartpoint,备库的 checkpointer 进程会定期的做 restartpoint 进行刷脏,并记录一些位点,如果 crash 了直接从对应的位点开始向后进行 redo。

主库做了一个 checkpoint(非shutdown时的checkpoint)时,CreateCheckPoint()函数下的XLogInsert()函数会写一条 XLOG_CHECKPOINT_ONLINE类型日志记录。

void
CreateCheckPoint(int flags)
{
...
	/*
	 * Now insert the checkpoint record into XLOG.
	 */
	XLogBeginInsert();
	XLogRegisterData((char *) (&checkPoint), sizeof(checkPoint));
	recptr = XLogInsert(RM_XLOG_ID,
			    shutdown ? XLOG_CHECKPOINT_SHUTDOWN :
			    XLOG_CHECKPOINT_ONLINE);
	XLogFlush(recptr);
...
}

image.png

备机的startup进程在xlog_redo(XLogReaderState *record)函数中,会针对这种类型日志做日志回放,并在 RecoveryRestartPoint()函数中复制主库该 checkpoint 的相关位点信息到共享内存中,例如checkpoint 位点、redo point 位点等,这样检查指针就可以在下次执行重新启动点(restartpoint)时计算出来。

	/*
	 * Copy the checkpoint record to shared memory, so that checkpointer can
	 * work out the next time it wants to perform a restartpoint.
	 */
	SpinLockAcquire(&XLogCtl->info_lck);
	XLogCtl->lastCheckPointRecPtr = record->ReadRecPtr;
	XLogCtl->lastCheckPointEndPtr = record->EndRecPtr;
	XLogCtl->lastCheckPoint = *checkPoint;
	SpinLockRelease(&XLogCtl->info_lck);

image.png

restartpoint(重启点)的官方文档的相关描述如下:https://www.postgresql.org/docs/16/wal-configuration.html
image.png

意思是:在归档恢复或备机模式下,服务器定期执行重启点,这类似于正常操作中的检查点:服务器将其所有状态强制到磁盘,更新pg_control文件以指示已处理的WAL数据无需再次扫描,然后回收pg_wal目录中的任何旧WAL段文件。重新启动点的执行频率不能高于主检查点,因为重新启动点只能在检查点记录上执行。 如果自上次重新启动点以来至少经过了checkpoint_timeout秒,或者WAL大小即将超过max_wal_size,则当达到检查点记录时,将触发重新启动点。然而,由于对何时可以执行重新启动点的限制,在恢复过程中,通常会超过max_wal_size,最多超过一个检查点周期的wal。(无论如何,max_wal_size从来都不是一个硬性限制,所以应该始终留出足够的空间以避免磁盘空间不足。)

二、pg_wait_events视图

在之前的版本里,pg_stat_actvity里的wait_event_type和wait_event的列在官方文档里有解释,描述等待事件的相关工作,想获取详细的等待事件解释需要查官方文档,比较麻烦。
https://www.postgresql.org/docs/devel/monitoring-stats.html#WAIT-EVENT-TABLE

PostgreSQL-17-beta1新版本增加了pg_wait_events视图,把等待事件的描述记录在视图里。

image.png

pg_wait_events包含三列,type,name以及description。分别是等待事件种类,等待事件名和解释。通过和pg_stat_activity关联,可以快速了解当前进程正在等待什么。进程现在在做什么。

postgres<17beta1>(ConnAs[postgres]:PID[21590] 2024-05-28/14:52:09)=# SELECT a.pid, a.state, a.wait_event_type, a.wait_event, w.description
FROM pg_stat_activity a JOIN pg_wait_events w
     ON (a.wait_event_type = w.type AND a.wait_event = w.name)
WHERE a.backend_type = 'checkpointer' ;
+-------+-------+-----------------+------------------+----------------------------------------------+
|  pid  | state | wait_event_type |    wait_event    |                 description                  |
+-------+-------+-----------------+------------------+----------------------------------------------+
| 13893 |       | Activity        | CheckpointerMain | Waiting in main loop of checkpointer process |
+-------+-------+-----------------+------------------+----------------------------------------------+

image.png

目前这pg_wait_events这个视图里包含的等待事件种类和对应的等待事件数量如下图所示,总共有264种,使用下来比查文档方便了许多。

postgres<17beta1>(ConnAs[postgres]:PID[21590] 2024-05-28/14:47:02)=# select type, count(1) from pg_wait_events group by type order by count(1) desc;
+-----------+-------+
|   type    | count |
+-----------+-------+
| LWLock    |    81 |
| IO        |    77 |
| IPC       |    57 |
| Activity  |    16 |
| Lock      |    12 |
| Timeout   |    10 |
| Client    |     9 |
| BufferPin |     1 |
| Extension |     1 |
+-----------+-------+
(9 rows)
最后修改时间:2024-05-30 11:41:07
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论