Vastbase: starting the database from recovery fails with FATAL: WAL ends before end of online backup
独行猪
2024-12-13
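Vastbase recovery error

After taking an exclusive-mode backup of Vastbase, I created a recovery.conf file for recovery. Bringing up the database instance in recovery then fails with an error.

Error message: FATAL: WAL ends before end of online backup. HINT: Online backup started with pg_start_backup() must be ended with pg_stop_backup(), and all WAL up to that point must be available at recovery.

Source location of the error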

/*
 * Complain if we did not roll forward far enough to render the backup
 * dump consistent. Note: it is indeed okay to look at the local variable
 * minRecoveryPoint here, even though ControlFile->minRecoveryPoint might
 * be further ahead --- ControlFile->minRecoveryPoint cannot have been
 * advanced beyond the WAL we processed.
 */
if (t_thrd.xlog_cxt.InRecovery &&
    (XLByteLT(EndOfLog, t_thrd.xlog_cxt.minRecoveryPoint) ||
     !XLogRecPtrIsInvalid(t_thrd.shemem_ptr_cxt.ControlFile->backupStartPoint))) {
    if (reachedStopPoint) {
        /* stopped because of stop request */
        ereport(FATAL, (errmsg("requested recovery stop point is before consistent recovery point")));
    }

    /*
     * Ran off end of WAL before reaching end-of-backup WAL record, or
     * minRecoveryPoint. That's usually a bad sign, indicating that you
     * tried to recover from an online backup but never called
     * pg_stop_backup(), or you didn't archive all the WAL up to that
     * point. However, this also happens in crash recovery, if the system
     * crashes while an online backup is in progress. We must not treat
     * that as an error, or the database will refuse to start up.
     */
    if (t_thrd.xlog_cxt.ArchiveRecoveryRequested ||
        t_thrd.shemem_ptr_cxt.ControlFile->backupEndRequired) {
        if (t_thrd.shemem_ptr_cxt.ControlFile->backupEndRequired) {
            ereport(FATAL,
                    (errmsg("WAL ends before end of online backup"),
                     errhint("All WAL generated while online backup was taken must be available at recovery.")));
        } else if (!XLogRecPtrIsInvalid(t_thrd.shemem_ptr_cxt.ControlFile->backupStartPoint)) {
            ereport(FATAL,
                    (errmsg("WAL ends before end of online backup"),
                     errhint("Online backup started with pg_start_backup() must be ended with pg_stop_backup(), and "
                             "all WAL up to that point must be available at recovery.")));
        } else {
            ereport(FATAL, (errmsg("WAL ends before consistent recovery point")));
        }
    }
}

Variable values known from the log

t_thrd.shemem_ptr_cxt.ControlFile->backupEndRequired false

!XLogRecPtrIsInvalid(t_thrd.shemem_ptr_cxt.ControlFile->backupStartPoint) true

t_thrd.xlog_cxt.ArchiveRecoveryRequested true

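Variables whose values I am unsure of. They are taken from the end of the log, but I am not certain they are the final values.

t_thrd.xlog_cxt.minRecoveryPoint: LSN 2/47002370

EndOfLog: 9797894144, LSN 2/48000000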
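
The two positions above are given in different notations, so a small conversion helps when comparing them. The following is a minimal sketch, assuming a 64-bit LSN whose text form is two hexadecimal numbers (high 32 bits, then low 32 bits); `parse_lsn`, `format_lsn`, and `lsn_less_than` are hypothetical helper names, not functions from the Vastbase source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Parse an LSN written as "hi/lo" (two hex numbers) into a 64-bit value.
 * Returns 0 on success, -1 if the text is malformed. */
int parse_lsn(const char *text, uint64_t *out)
{
    unsigned int hi, lo;

    assert(text != NULL && out != NULL);
    if (sscanf(text, "%X/%X", &hi, &lo) != 2)
        return -1;
    *out = ((uint64_t)hi << 32) | lo;
    return 0;
}

/* Format a 64-bit LSN as "hi/lo". buf must hold at least 18 bytes. */
void format_lsn(uint64_t lsn, char *buf, size_t len)
{
    snprintf(buf, len, "%X/%X", (unsigned int)(lsn >> 32), (unsigned int)lsn);
}

/* Same idea as XLByteLT(a, b) on 64-bit positions: true when a is before b. */
int lsn_less_than(uint64_t a, uint64_t b)
{
    return a < b;
}
```

Under this assumption, 2/48000000 parses to 9797894144, which matches the EndOfLog value above, and EndOfLog is not less than 2/47002370, so the XLByteLT part of the condition is false for these values.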

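Questions

1. Why does this error occur? According to the log, the last applied position is greater than minRecoveryPoint.

2. This branch is entered when backupEndRequired is false. Searching the whole code base, the value is only modified in read_backup_label: when the backup method is pg_start_backup it is set to false (the non-exclusive method is streamed). Why is that taken to mean there is no backup_end record?

backupEndRequired: if backupEndRequired is true, we know for certain that we are recovering from a backup and must see a backup-end record before it is safe to start up. If it is false but backupStartPoint is set, a backup_label file was found at startup, but it may be a leftover from a stray pg_start_backup() call that was never followed by pg_stop_backup().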
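
The relationship between the backup method and backupEndRequired described in question 2 can be sketched as follows. This is a simplified illustration, not the kernel's read_backup_label: `backup_end_required_from_label` is a hypothetical helper, and it assumes the label carries a line of the form `BACKUP METHOD: <method>`, as in PostgreSQL-style backup_label files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns true only when the label text declares the
 * "streamed" backup method; any other method, or no method line at all,
 * leaves the flag false. */
bool backup_end_required_from_label(const char *label_text)
{
    const char *key = "BACKUP METHOD: ";
    const char *line;
    char method[20];

    assert(label_text != NULL);
    line = strstr(label_text, key);
    if (line == NULL)
        return false;
    if (sscanf(line + strlen(key), "%19s", method) != 1)
        return false;
    return strcmp(method, "streamed") == 0;
}
```

With the pg_start_backup method the flag stays false because the label by itself does not prove that pg_stop_backup() was ever run. As the quoted source shows, archive recovery still refuses to finish in that case while backupStartPoint remains set.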

3 answers
独行猪
Attachment uploaded: vdata2.txt
吾亦可往

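Although you note that the last applied position (EndOfLog) is greater than minRecoveryPoint, the key point is that the condition XLByteLT(EndOfLog, t_thrd.xlog_cxt.minRecoveryPoint) || !XLogRecPtrIsInvalid(t_thrd.shemem_ptr_cxt.ControlFile->backupStartPoint) was satisfied, so execution entered the error-handling branch. With your values the first operand is false, so it is the second operand, a backupStartPoint that is still set, that makes the condition true.
Specifically, during recovery the database expects to replay all WAL (write-ahead log) generated from the start of the online backup (pg_start_backup()) to its end (which should be marked by pg_stop_backup()), so that the restored data is consistent and complete.
The error means that WAL ran out before the backup-end WAL record or minRecoveryPoint was reached. This usually means either that pg_stop_backup() was not called correctly to mark the end of the online backup, or that not all WAL up to the backup end point was made available for recovery (for example, it was not properly archived). Even though EndOfLog is greater than minRecoveryPoint, the error is still raised whenever the full online backup and recovery procedure has not been satisfied.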
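
The branch logic discussed in this answer can be modelled in a few lines. The sketch below mirrors the structure of the quoted source but is not the kernel code: `RecoveryEndState`, `RecoveryEndResult`, and `check_end_of_recovery` are hypothetical names, and an invalid position is assumed to be represented by 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct RecoveryEndState {
    bool in_recovery;
    uint64_t end_of_log;
    uint64_t min_recovery_point;
    uint64_t backup_start_point;    /* 0 is assumed to mean "invalid" */
    bool reached_stop_point;
    bool archive_recovery_requested;
    bool backup_end_required;
} RecoveryEndState;

typedef enum RecoveryEndResult {
    END_OK,                         /* no FATAL raised */
    END_STOP_POINT_TOO_EARLY,       /* stop point before consistent point */
    END_BACKUP_END_MISSING,         /* backupEndRequired was true */
    END_STOP_BACKUP_MISSING,        /* backupStartPoint still set */
    END_BEFORE_CONSISTENT_POINT     /* minRecoveryPoint not reached */
} RecoveryEndResult;

/* Model of the end-of-recovery consistency check. */
RecoveryEndResult check_end_of_recovery(const RecoveryEndState *s)
{
    assert(s != NULL);
    if (!(s->in_recovery &&
          (s->end_of_log < s->min_recovery_point || s->backup_start_point != 0)))
        return END_OK;
    if (s->reached_stop_point)
        return END_STOP_POINT_TOO_EARLY;
    if (s->archive_recovery_requested || s->backup_end_required) {
        if (s->backup_end_required)
            return END_BACKUP_END_MISSING;
        if (s->backup_start_point != 0)
            return END_STOP_BACKUP_MISSING;
        return END_BEFORE_CONSISTENT_POINT;
    }
    return END_OK;  /* crash recovery during an online backup is tolerated */
}
```

With the values from the question (archive recovery requested, backupEndRequired false, backupStartPoint set, EndOfLog past minRecoveryPoint) the model returns END_STOP_BACKUP_MISSING, the case that carries the pg_stop_backup() hint. The result would be END_OK only if backupStartPoint had been cleared during replay.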

独行猪

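I reproduced the problem. Comparing the WAL in the backup against the WAL in the archive directory with a WAL parsing tool showed the cause: during the backup, the WAL was copied immediately after the stop command was executed, before the BACKUP_END record had been written to the current WAL file.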
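
One way to avoid copying the WAL too early is to work out which segment file must contain the stop position and take that file only once it is complete in the archive directory. The sketch below computes that file name under stated assumptions: 16 MB WAL segments and the PostgreSQL-style 24-hex-digit naming (timeline, high part, segment). Vastbase may be configured differently, and `wal_file_name` and `is_stop_segment` are hypothetical helpers.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WAL_SEG_SIZE UINT64_C(0x1000000)    /* assumed 16 MB segments */

/* Write the name of the segment containing lsn into buf (at least 25 bytes). */
void wal_file_name(uint32_t tli, uint64_t lsn, char *buf, size_t len)
{
    uint64_t segno = lsn / WAL_SEG_SIZE;
    uint64_t segs_per_id = UINT64_C(0x100000000) / WAL_SEG_SIZE;    /* 256 */

    assert(buf != NULL && len >= 25);
    snprintf(buf, len, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
             tli, (uint32_t)(segno / segs_per_id), (uint32_t)(segno % segs_per_id));
}

/* True when file_name is the segment that holds stop_lsn on timeline tli. */
bool is_stop_segment(const char *file_name, uint32_t tli, uint64_t stop_lsn)
{
    char expected[25];

    assert(file_name != NULL);
    wal_file_name(tli, stop_lsn, expected, sizeof expected);
    return strcmp(file_name, expected) == 0;
}
```

As an illustration only, a position of 2/47002370 on timeline 1 maps to 000000010000000200000047 under these assumptions. A backup script could wait until the segment for the stop position reported by pg_stop_backup() appears in the archive directory and copy it from there.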

WAL in the backup (screenshot not preserved)

WAL in the archive directory (screenshot not preserved)
