
Debugging the PostgreSQL checkpointer process

SmallDB 2025-03-17

 

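Overview

The checkpointer process is responsible for flushing the dirty pages of a PostgreSQL database to disk. After attaching GDB to the checkpointer process, we found that executing a CHECKPOINT statement did not trigger the checkpoint code we expected to step through. Starting from that problem, this article briefly describes how the checkpointer process is started and how to debug it, to see whether it behaves as expected.

Experiment often: trying things out is the surest way to check whether your understanding matches the theory, and the moment it suddenly makes sense is worth the effort.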

Startup entry-point files

src/backend/main/main.c
src/backend/bootstrap/bootstrap.c
src/backend/postmaster/postmaster.c

MyAuxProcType

Because the variable is declared with extern, any file that includes the header can use it without defining it again.

src/include/miscadmin.h
typedef enum
{

    NotAnAuxProcess = -1,
    CheckerProcess = 0,
    BootstrapProcess,
    StartupProcess,
    BgWriterProcess,
    ArchiverProcess,
    CheckpointerProcess,
    WalWriterProcess,
    WalReceiverProcess,

    NUM_AUXPROCTYPES            /* Must be last! */
} AuxProcType;

extern AuxProcType MyAuxProcType;

#define AmBootstrapProcess()        (MyAuxProcType == BootstrapProcess)
#define AmStartupProcess()            (MyAuxProcType == StartupProcess)
#define AmBackgroundWriterProcess() (MyAuxProcType == BgWriterProcess)
#define AmArchiverProcess()            (MyAuxProcType == ArchiverProcess)
#define AmCheckpointerProcess()        (MyAuxProcType == CheckpointerProcess)
#define AmWalWriterProcess()        (MyAuxProcType == WalWriterProcess)
#define AmWalReceiverProcess()        (MyAuxProcType == WalReceiverProcess)

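  • Macro name: AmBootstrapProcess(). This is a user-defined macro name.
  • Parameter list: the macro takes no parameters, so the parentheses are empty and nothing is passed when it is used.
  • Replacement text: (MyAuxProcType == BootstrapProcess). This is a conditional expression. MyAuxProcType is the external variable declared above with extern, and its type is the AuxProcType enum. BootstrapProcess is one of the enumeration constants of AuxProcType. The expression tests whether the value of MyAuxProcType equals BootstrapProcess.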
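
To see the pattern in isolation, here is a minimal sketch. It is not PostgreSQL source: the enum is cut down, and every Demo-prefixed name is hypothetical. A plain global stands in for MyAuxProcType, and the helper function exists only so the macros can be exercised.

```c
#include <assert.h>
#include <stdbool.h>

/* Cut-down stand-in for AuxProcType; all Demo* names are hypothetical. */
typedef enum
{
    DemoNotAnAuxProcess = -1,
    DemoCheckerProcess = 0,
    DemoBootstrapProcess,
    DemoCheckpointerProcess,
    DEMO_NUM_AUXPROCTYPES       /* must be last */
} DemoAuxProcType;

/*
 * The single definition of the variable. A header would carry only
 * "extern DemoAuxProcType DemoMyAuxProcType;" so other files can use it.
 */
DemoAuxProcType DemoMyAuxProcType = DemoNotAnAuxProcess;

/* Parameterless function-like macros that expand to a comparison. */
#define DemoAmBootstrapProcess()    (DemoMyAuxProcType == DemoBootstrapProcess)
#define DemoAmCheckpointerProcess() (DemoMyAuxProcType == DemoCheckpointerProcess)

/* Pretend the current process was started as "type", then test the macro. */
bool
demo_is_checkpointer_after_set(DemoAuxProcType type)
{
    assert(type >= DemoNotAnAuxProcess && type < DEMO_NUM_AUXPROCTYPES);
    DemoMyAuxProcType = type;
    return DemoAmCheckpointerProcess();
}
```

Calling demo_is_checkpointer_after_set(DemoCheckpointerProcess) returns true, after which DemoAmBootstrapProcess() evaluates to false, because both macros read the same global.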

Source file
src/backend/postmaster/postmaster.c

Macro definitions

#define StartupDataBase()       StartChildProcess(StartupProcess)
#define StartArchiver()         StartChildProcess(ArchiverProcess)
#define StartBackgroundWriter() StartChildProcess(BgWriterProcess)
#define StartCheckpointer()     StartChildProcess(CheckpointerProcess)
#define StartWalWriter()        StartChildProcess(WalWriterProcess)
#define StartWalReceiver()      StartChildProcess(WalReceiverProcess)

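  • #define: the C and C++ preprocessor directive used to define a macro.
  • StartupDataBase(): the macro name. The parentheses make it a function-like macro (here with an empty parameter list), so they must be written at every use even though no arguments are passed.
  • StartChildProcess(StartupProcess): the replacement text. Wherever StartupDataBase() appears in the code, the preprocessor replaces it with StartChildProcess(StartupProcess).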
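
The expansion can be observed with a small stand-alone sketch. The function demo_start_child and the Demo* macros below are hypothetical stand-ins, not PostgreSQL code. The stub only records which type was requested, so the effect of each macro is visible.

```c
#include <assert.h>

/* Hypothetical two-value stand-in for AuxProcType. */
typedef enum
{
    DemoStartupProcess,
    DemoCheckpointerProcess
} DemoProcType;

static int demo_last_started = -1;  /* type passed to the most recent call */
static int demo_start_count = 0;    /* how many times the stub was called */

/* Stub standing in for StartChildProcess: records the type, forks nothing. */
int
demo_start_child(DemoProcType type)
{
    demo_last_started = (int) type;
    return ++demo_start_count;      /* pretend this is the child's pid */
}

/* Each wrapper macro expands to one call with a fixed argument. */
#define DemoStartupDataBase()   demo_start_child(DemoStartupProcess)
#define DemoStartCheckpointer() demo_start_child(DemoCheckpointerProcess)

/* Accessor so callers can inspect what the last expansion requested. */
int
demo_last_type(void)
{
    return demo_last_started;
}
```

After DemoStartCheckpointer() runs, demo_last_type() returns DemoCheckpointerProcess, which is the same value a direct call to demo_start_child(DemoCheckpointerProcess) would record.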
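
Pseudocode

StartChildProcess takes a value of type AuxProcType, the enum defined above, which identifies the auxiliary process to start.

  1. The core function AuxiliaryProcessMain(ac, av) is the common entry point for the auxiliary processes, that is, the process types listed in AuxProcType.
  2. The code after the fork checks whether the process was started. If the fork failed, it reports the failure through ereport with an errmsg.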
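
StartChildProcess hands the process type to the child as a "-x<number>" argument built with snprintf. The sketch below reproduces that formatting step and pairs it with a parser. The parser is a hypothetical illustration of how such an argument could be decoded; it is not how PostgreSQL itself reads it, and both function names are invented for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format the type as "-x<number>". Returns 0 on success, -1 if truncated. */
int
demo_format_type_arg(char *buf, size_t buflen, int type)
{
    int n;

    assert(buf != NULL && buflen > 0);
    n = snprintf(buf, buflen, "-x%d", type);
    return (n > 0 && (size_t) n < buflen) ? 0 : -1;
}

/*
 * Hypothetical decoder: returns the number after "-x", or -1 when the
 * argument is not "-x" followed only by digits (or is implausibly large).
 */
int
demo_parse_type_arg(const char *arg)
{
    char *end;
    long  value;

    if (strncmp(arg, "-x", 2) != 0 || !isdigit((unsigned char) arg[2]))
        return -1;
    value = strtol(arg + 2, &end, 10);
    if (*end != '\0' || value > 1000)
        return -1;
    return (int) value;
}
```

With the enum shown earlier, CheckpointerProcess has the value 5, so formatting it yields "-x5" and parsing that string returns 5 again.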
static pid_t StartChildProcess(AuxProcType type);


static pid_t
StartChildProcess(AuxProcType type)
{
    pid_t       pid;
    char       *av[10];
    int         ac = 0;
    char        typebuf[32];

    /*
     * Set up command-line arguments for subprocess
     */

    av[ac++] = "postgres";

#ifdef EXEC_BACKEND
    av[ac++] = "--forkboot";
    av[ac++] = NULL;            /* filled in by postmaster_forkexec */
#endif

    snprintf(typebuf, sizeof(typebuf), "-x%d", type);
    av[ac++] = typebuf;

    av[ac] = NULL;
    Assert(ac < lengthof(av));

#ifdef EXEC_BACKEND
    pid = postmaster_forkexec(ac, av);
#else                           /* !EXEC_BACKEND */
    pid = fork_process();

    if (pid == 0)               /* child */
    {
        InitPostmasterChild();

        /* Close the postmaster's sockets */
        ClosePostmasterPorts(false);

        /* Release postmaster's working memory context */
        MemoryContextSwitchTo(TopMemoryContext);
        MemoryContextDelete(PostmasterContext);
        PostmasterContext = NULL;

        AuxiliaryProcessMain(ac, av);   /* does not return */
    }
#endif                          /* EXEC_BACKEND */

    if (pid < 0)
    {
        /* in parent, fork failed */
        int         save_errno = errno;

        errno = save_errno;
        switch (type)
        {
            case StartupProcess:
                ereport(LOG,
                        (errmsg("could not fork startup process: %m")));
                break;
            case ArchiverProcess:
                ereport(LOG,
                        (errmsg("could not fork archiver process: %m")));
                break;
            case BgWriterProcess:
                ereport(LOG,
                        (errmsg("could not fork background writer process: %m")));
                break;
            case CheckpointerProcess:
                ereport(LOG,
                        (errmsg("could not fork checkpointer process: %m")));
                break;
            case WalWriterProcess:
                ereport(LOG,
                        (errmsg("could not fork WAL writer process: %m")));
                break;
            case WalReceiverProcess:
                ereport(LOG,
                        (errmsg("could not fork WAL receiver process: %m")));
                break;
            default:
                ereport(LOG,
                        (errmsg("could not fork process: %m")));
                break;
        }

        /*
         * fork failure is fatal during startup, but there's no need to choke
         * immediately if starting other child types fails.
         */

        if (type == StartupProcess)
            ExitPostmaster(1);
        return 0;
    }

    /*
     * in parent, successful fork
     */

    return pid;
}

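The actual startup

The child then reaches the following switch in AuxiliaryProcessMain (src/backend/bootstrap/bootstrap.c in this version), which calls the main function for its process type.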

    switch (MyAuxProcType)
    {
        case CheckerProcess:
            /* don't set signals, they're useless here */
            CheckerModeMain();
            proc_exit(1);        /* should never return */

        case BootstrapProcess:

            /*
             * There was a brief instant during which mode was Normal; this is
             * okay.  We need to be in bootstrap mode during BootStrapXLOG for
             * the sake of multixact initialization.
             */

            SetProcessingMode(BootstrapProcessing);
            bootstrap_signals();
            BootStrapXLOG();
            BootstrapModeMain();
            proc_exit(1);        /* should never return */

        case StartupProcess:
            StartupProcessMain();
            proc_exit(1);

        case ArchiverProcess:
            PgArchiverMain();
            proc_exit(1);

        case BgWriterProcess:
            BackgroundWriterMain();
            proc_exit(1);

        case CheckpointerProcess:
            CheckpointerMain();
            proc_exit(1);

        case WalWriterProcess:
            InitXLOGAccess();
            WalWriterMain();
            proc_exit(1);

        case WalReceiverProcess:
            WalReceiverMain();
            proc_exit(1);

        default:
            elog(PANIC, "unrecognized process type: %d", (int) MyAuxProcType);
            proc_exit(1);
    }
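
When choosing a function to set a breakpoint on, it helps to map a process type number to the main function that the switch above calls. The lookup table below is a hypothetical sketch built from the enum order and the case labels shown in this article; the names demo_entry_points and demo_entry_point are invented for this example. For the bootstrap and WAL writer cases, which call more than one function, only the last function called is listed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Indexed by the AuxProcType values shown earlier (CheckerProcess = 0). */
static const char *const demo_entry_points[] = {
    "CheckerModeMain",          /* 0: CheckerProcess */
    "BootstrapModeMain",        /* 1: BootstrapProcess */
    "StartupProcessMain",       /* 2: StartupProcess */
    "BackgroundWriterMain",     /* 3: BgWriterProcess */
    "PgArchiverMain",           /* 4: ArchiverProcess */
    "CheckpointerMain",         /* 5: CheckpointerProcess */
    "WalWriterMain",            /* 6: WalWriterProcess */
    "WalReceiverMain"           /* 7: WalReceiverProcess */
};

/* Returns the entry function name for a type, or NULL if out of range. */
const char *
demo_entry_point(int type)
{
    size_t n = sizeof(demo_entry_points) / sizeof(demo_entry_points[0]);

    if (type < 0 || (size_t) type >= n)
        return NULL;
    assert(demo_entry_points[type] != NULL);
    return demo_entry_points[type];
}
```

For example, demo_entry_point(5) returns "CheckpointerMain", the function a checkpointer child enters after the dispatch.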

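How to debug it with gdb

The session below attaches gdb to the postmaster (PID 77263), restarts the server under gdb, and then sets follow-fork-mode child so that gdb follows the next forked child process. The log is reproduced as it was typed, including mistakes: gdb reads "gdb attach 77263" as a program file named attach plus a PID (the usual form is gdb -p 77263), and a file-and-line breakpoint needs a colon, as in b postmaster.c:1770.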
postgres=# \q

postgres@BJ-015908:~$ export PGDATA=/usr/local/pgBuild/
postgres@BJ-015908:~$
postgres@BJ-015908:~$ ps -ef |grep postgres
root       73819   73812  0 09:09 pts/0    00:00:00 /bin/sh -c cd '/home/daihu/postgresql-14.7' && /bin/sh
root       74731 1564072  0 09:10 pts/6    00:00:00 su - postgres
postgres   74733       1  0 09:10 ?        00:00:00 /lib/systemd/systemd --user
postgres   74734   74733  0 09:10 ?        00:00:00 (sd-pam)
postgres   74739   74731  0 09:10 pts/6    00:00:00 -bash
postgres   77263 1564060  0 09:16 ?        00:00:00 /usr/local/pgBuild/bin/postgres -D /usr/local/pgBuild/data
postgres   77271   77263  0 09:16 ?        00:00:00 postgres: checkpointer
postgres   77272   77263  0 09:16 ?        00:00:00 postgres: background writer
postgres   77273   77263  0 09:16 ?        00:00:00 postgres: walwriter
postgres   77274   77263  0 09:16 ?        00:00:00 postgres: autovacuum launcher
postgres   77275   77263  0 09:16 ?        00:00:00 postgres: stats collector
postgres   77276   77263  0 09:16 ?        00:00:00 postgres: logical replication launcher
postgres   77621   74739  0 09:17 pts/6    00:00:00 ps -ef
postgres   77622   74739  0 09:17 pts/6    00:00:00 grep --color=auto postgres

postgres@BJ-015908:~$ gdb attach 77263
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04.2) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
attach: No such file or directory.
Attaching to process 77263
Reading symbols from /usr/local/pgBuild/bin/postgres...
Reading symbols from /lib/x86_64-linux-gnu/libm.so.6...
Reading symbols from /usr/lib/debug/.build-id/7d/8778fca8ea4621b268cc03662855d0cd983439.debug...
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...
Reading symbols from /usr/lib/debug/.build-id/cd/410b710f0f094c6832edd95931006d883af48e.debug...
Reading symbols from /lib64/ld-linux-x86-64.so.2...
Reading symbols from /usr/lib/debug/.build-id/e4/de036b19e4768e7591b596c4be9f9015f2d28a.debug...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007ffb953b759d in __GI___select (nfds=nfds@entry=7, readfds=readfds@entry=0x7fff2f205b10, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fff2f205a60) at ../sysdeps/unix/sysv/linux/select.c:69
69      ../sysdeps/unix/sysv/linux/select.c: No such file or directory.

(gdb) b postmaster.c 1770
Function "postmaster.c 1770" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (postmaster.c 1770) pending.

(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/local/pgBuild/bin/postgres
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
2025-02-19 09:18:02.386 CST [77877] LOG:  starting PostgreSQL 14.7 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2025-02-19 09:18:02.387 CST [77877] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2025-02-19 09:18:02.389 CST [77877] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
[Detaching after fork from child process 77881]
2025-02-19 09:18:02.393 CST [77881] LOG:  database system was interrupted; last known up at 2025-02-19 09:16:43 CST
2025-02-19 09:18:02.568 CST [77881] LOG:  database system was not properly shut down; automatic recovery in progress
2025-02-19 09:18:02.570 CST [77881] LOG:  redo starts at 0/16AE460
2025-02-19 09:18:02.570 CST [77881] LOG:  invalid record length at 0/16AE498: wanted 24, got 0
2025-02-19 09:18:02.570 CST [77881] LOG:  redo done at 0/16AE460 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
[Detaching after fork from child process 77883]
[Detaching after fork from child process 77884]
[Detaching after fork from child process 77885]
[Detaching after fork from child process 77886]
[Detaching after fork from child process 77887]
[Detaching after fork from child process 77888]
2025-02-19 09:18:02.584 CST [77877] LOG:  database system is ready to accept connections
c
c
c
^C
Program received signal SIGINT, Interrupt.
0x00007ffff7dc159d in __GI___select (nfds=nfds@entry=7, readfds=readfds@entry=0x7fffffffde60, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fffffffddb0) at ../sysdeps/unix/sysv/linux/select.c:69
69      ../sysdeps/unix/sysv/linux/select.c: No such file or directory.
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00007ffff7dc159d in __GI___select (nfds=nfds@entry=7, readfds=readfds@entry=0x7fffffffde60, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fffffffddb0) at ../sysdeps/unix/sysv/linux/select.c:69
69      in ../sysdeps/unix/sysv/linux/select.c
(gdb) set follow-fork-mode child
(gdb) b CreateCheckPoint
Breakpoint 2 at 0x5555556e3f9b: file xlog.c, line 9049.
(gdb) c
Continuing.
[Attaching after Thread 0x7ffff7ca3740 (LWP 77877) fork to child process 79376]
[New inferior 2 (process 79376)]
[Detaching after fork from parent process 77877]
[Inferior 1 (process 77877) detached]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
c
^C2025-02-19 09:22:24.699 CST [77877] LOG:  received fast shutdown request
2025-02-19 09:22:24.701 CST [77877] LOG:  aborting any active transactions
2025-02-19 09:22:24.701 CST [77877] LOG:  background worker "logical replication launcher" (PID 77888) exited with exit code 1

Thread 2.1 "postgres" received signal SIGTERM, Terminated.
[Switching to Thread 0x7ffff7ca3740 (LWP 79376)]
0x00007ffff7dcbdea in epoll_wait (epfd=4, events=0x555555e25c48, maxevents=1, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30      ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
(gdb) p max_wal_szie_mb
No symbol "max_wal_szie_mb" in current context.

(gdb) b CreateCheckPoint    (at this point, open a new terminal window and run postgres-# CHECKPOINT;)
Note: breakpoint 2 also set at pc 0x5555556e3f9b.
Breakpoint 3 at 0x18ff9b: CreateCheckPoint. (2 locations)
(gdb) p max_wal_szie_mb
No symbol "max_wal_szie_mb" in current context.
(gdb) p CheckPointTimeout
$1 = 300
(gdb) p max_wal_size_mb
$2 = 1024
(gdb) min_wal_size_mb
Undefined command: "min_wal_size_mb".  Try "help".
(gdb) pmin_wal_size_mb
Undefined command: "pmin_wal_size_mb".  Try "help".
(gdb) p min_wal_size_mb
$3 = 80
(gdb) print
$4 = 80
(gdb)
(gdb) watch
Argument required (expression to compute).
(gdb) quit
A debugging session is active.

        Inferior 2 [process 79376] will be killed.

Quit anyway? (y or n) y
2025-02-19 09:36:57.972 CST [77877] LOG:  server process (PID 79376) was terminated by signal 9: Killed
2025-02-19 09:36:57.972 CST [77877] DETAIL:  Failed process was running: CHECKPOINT
        ;
2025-02-19 09:36:57.972 CST [77877] LOG:  terminating any other active server processes
2025-02-19 09:36:57.972 CST [85963] FATAL:  the database system is shutting down
2025-02-19 09:36:57.976 CST [77877] LOG:  abnormal database system shutdown
2025-02-19 09:36:57.981 CST [77877] LOG:  database system is shut down
postgres@BJ-015908:~$

 


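This article is reprinted from SmallDB. If you believe it infringes your rights, email contact@modb.pro with supporting evidence; once the claim is verified, 墨天轮 will remove the content promptly.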
