PostgreSQL日志归档到S3

原创 chirpyli 2023-12-25

642

PostgreSQL日志归档

WAL日志在数据库中非常重要，数据库崩溃时，可以依靠WAL进行恢复。在高可用中，备节点可以通过流复制得到WAL日志进行回放，同步到与主节点相同的状态。除此之外，还可以利用WAL进行进行备份，并通过PITR恢复到指定状态。那么如何对WAL进行备份呢，PostgreSQL可通过日志归档功能对WAL进行归档备份。这里我们总结一下WAL日志归档方法。关于源码分析，可参考文章PostgreSQL源码分析——日志归档。另外需要注意15版本与之前版本的区别。

归档方法

归档涉及到如下GUC参数。archive_mode = on开启归档。设置archive_command，如果是PG15版本之后的版本，则增加了archive_library功能，可参考文章New WAL Archive Module/Library in PostgreSQL 15。

# - Archiving -
archive_mode = on               # enables archiving; off, on, or always (change requires restart)
#archive_library = ''           # library to use to archive a logfile segment
                                # (empty string indicates archive_command should
                                # be used)
archive_command = 'test ! -f /home/postgres/pgdata/archive/%f && cp %p /home/postgres/pgdata/archive/%f'               # command to use to archive a logfile segment
                                # placeholders: %p = path of file to archive
                                #               %f = file name only
                                # e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
archive_timeout = 1800          # force a logfile segment switch after this
                                # number of seconds; 0 disables

归档到本地目录示例：archive_command = 'test ! -f /home/postgres/pgdata/archive/%f && cp %p /home/postgres/pgdata/archive/%f'，归档一个日志段文件到/home/postgres/pgdata/archive目录下。
也可以归档到远程主机：archive_command = 'sshpass -p asdf scp %p postgres@192.168.109.136:~/pgdata/archive/%f'。通过scp，或者rsync等工具或者其他工具备份到其他主机中。

归档恢复时，设置restore_command。从本地目录恢复的示例：restore_command = 'cp /home/postgres/pgdata/archive/%f %p'，从/home/postgres/pgdata/archive目录拷贝日志到pg_wal目录下。

# - Archive Recovery -

# These are only used in recovery mode.
restore_command = 'cp /home/postgres/pgdata/archive/%f %p'              # command to use to restore an archived logfile segment
                                # placeholders: %p = path of file to restore
                                #               %f = file name only
                                # e.g. 'cp /mnt/server/archivedir/%f %p'
#archive_cleanup_command = ''   # command to execute at every restartpoint  清理归档日志文件，可以用pg_archivecleanup
#recovery_end_command = ''      # command to execute at completion of recovery

也可以从远程主机的归档日志进行恢复，restore_command = 'sshpass -p asdf scp postgres@192.168.109.136:/home/postgres/pgdata/archive/%f %p'。将远程主机的归档日志拷贝到本地pg_wal目录下。
备机启动进行归档恢复时打印的相关日志如下：

2023-12-25 16:45:38.121 CST [postmaster] LOG:  starting PostgreSQL 15.5 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2023-12-25 16:45:38.122 CST [postmaster] LOG:  listening on IPv4 address "0.0.0.0", port 5431
2023-12-25 16:45:38.122 CST [postmaster] LOG:  listening on IPv6 address "::", port 5431
2023-12-25 16:45:38.129 CST [postmaster] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5431"
2023-12-25 16:45:38.135 CST [checkpointer] DEBUG:  checkpointer updated shared memory configuration values
2023-12-25 16:45:38.136 CST [startup] LOG:  database system was shut down in recovery at 2023-12-25 16:43:42 CST
scp: /home/postgres/pgdata/archive/00000002.history: No such file or directory
2023-12-25 16:45:38.470 CST [startup] DEBUG:  could not restore file "00000002.history" from archive: child process exited with exit code 1
2023-12-25 16:45:38.470 CST [startup] LOG:  entering standby mode
2023-12-25 16:45:38.965 CST [startup] LOG:  restored log file "00000001000000000000001B" from archive  从归档中获取WAL日志段文件
2023-12-25 16:45:38.995 CST [startup] DEBUG:  got WAL segment from archive
2023-12-25 16:45:38.995 CST [startup] DEBUG:  checkpoint record is at 0/1B000060
2023-12-25 16:45:38.995 CST [startup] DEBUG:  redo record is at 0/1B000060; shutdown true
2023-12-25 16:45:38.995 CST [startup] DEBUG:  next transaction ID: 242689; next OID: 24668
2023-12-25 16:45:38.995 CST [startup] DEBUG:  next MultiXactId: 1; next MultiXactOffset: 0
2023-12-25 16:45:38.995 CST [startup] DEBUG:  oldest unfrozen transaction ID: 716, in database 1
2023-12-25 16:45:38.995 CST [startup] DEBUG:  oldest MultiXactId: 1, in database 1
2023-12-25 16:45:38.995 CST [startup] DEBUG:  commit timestamp Xid oldest/newest: 0/0
2023-12-25 16:45:38.995 CST [startup] DEBUG:  transaction ID wrap limit is 2147484363, limited by database with OID 1
2023-12-25 16:45:38.996 CST [startup] DEBUG:  MultiXactId wrap limit is 2147483648, limited by database with OID 1
2023-12-25 16:45:38.996 CST [startup] DEBUG:  starting up replication slots
2023-12-25 16:45:38.996 CST [startup] DEBUG:  xmin required by slots: data 0, catalog 0
2023-12-25 16:45:38.996 CST [startup] DEBUG:  starting up replication origin progress state
2023-12-25 16:45:38.996 CST [startup] DEBUG:  reading stats file "pg_stat/pgstat.stat"
2023-12-25 16:45:38.997 CST [startup] DEBUG:  removing permanent stats file "pg_stat/pgstat.stat"
2023-12-25 16:45:38.999 CST [startup] DEBUG:  resetting unlogged relations: cleanup 1 init 0
2023-12-25 16:45:38.999 CST [startup] DEBUG:  initializing for hot standby
2023-12-25 16:45:38.999 CST [startup] DEBUG:  recovery snapshots are now enabled
2023-12-25 16:45:38.999 CST [startup] LOG:  redo starts at 0/1B0000D8
2023-12-25 16:45:39.481 CST [startup] LOG:  restored log file "00000001000000000000001C" from archive
2023-12-25 16:45:39.520 CST [startup] DEBUG:  got WAL segment from archive
2023-12-25 16:45:40.026 CST [startup] LOG:  restored log file "00000001000000000000001D" from archive

归档到S3

对象存储成本较低，可以将WAL日志归档到S3中存储。方法与前面的过程相同，需要写一个工具，源码可参考https://github.com/chirpyli/walarch 。实现日志段文件上传到S3以及从S3中下载。

#归档命令
archive_command = 'walarch put --access_key *** --secret_key *** --bucket jfs1218 --endpoint eos-wuxi-3.cmecloud.cn --region wuxi3 --file %p'
#恢复命令
restore_command = 'walarch get --access_key *** --secret_key *** --bucket jfs1218 --endpoint eos-wuxi-3.cmecloud.cn --file %f --path %p --region wuxi3'

墨力计划 postgresql

最后修改时间：2023-12-25 17:31:17

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

PostgreSQL日志归档到S3