Vastbase recovery error

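After taking an exclusive-mode backup of Vastbase, I created a recovery.conf file. Starting the database instance from that backup in recovery mode fails with an error.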

Error message: FATAL: WAL ends before end of online backup. HINT: Online backup started with pg_start_backup() must be ended with pg_stop_backup(), and all WAL up to that point must be available at recovery.

Source location of the error

    /*
     * Complain if we did not roll forward far enough to render the backup
     * dump consistent.  Note: it is indeed okay to look at the local variable
     * minRecoveryPoint here, even though ControlFile->minRecoveryPoint might
     * be further ahead --- ControlFile->minRecoveryPoint cannot have been
     * advanced beyond the WAL we processed.
     */
    if (t_thrd.xlog_cxt.InRecovery && (XLByteLT(EndOfLog, t_thrd.xlog_cxt.minRecoveryPoint) ||
                                       !XLogRecPtrIsInvalid(t_thrd.shemem_ptr_cxt.ControlFile->backupStartPoint))) {
        if (reachedStopPoint) {
            /* stopped because of stop request */
            ereport(FATAL, (errmsg("requested recovery stop point is before consistent recovery point")));
        }

        /*
         * Ran off end of WAL before reaching end-of-backup WAL record, or
         * minRecoveryPoint. That's usually a bad sign, indicating that you
         * tried to recover from an online backup but never called
         * pg_stop_backup(), or you didn't archive all the WAL up to that
         * point. However, this also happens in crash recovery, if the system
         * crashes while an online backup is in progress. We must not treat
         * that as an error, or the database will refuse to start up.
         */
        if (t_thrd.xlog_cxt.ArchiveRecoveryRequested || t_thrd.shemem_ptr_cxt.ControlFile->backupEndRequired) {
            if (t_thrd.shemem_ptr_cxt.ControlFile->backupEndRequired) {
                ereport(FATAL,
                        (errmsg("WAL ends before end of online backup"),
                         errhint("All WAL generated while online backup was taken must be available at recovery.")));
            } else if (!XLogRecPtrIsInvalid(t_thrd.shemem_ptr_cxt.ControlFile->backupStartPoint)) {
                ereport(FATAL,
                        (errmsg("WAL ends before end of online backup"),
                         errhint(
                             "Online backup started with pg_start_backup() must be ended with pg_stop_backup(), and "
                             "all WAL up to that point must be available at recovery.")));
            } else {
                ereport(FATAL, (errmsg("WAL ends before consistent recovery point")));
            }
        }
    }

Variables whose values are known from the log

t_thrd.shemem_ptr_cxt.ControlFile->backupEndRequired false

!XLogRecPtrIsInvalid(t_thrd.shemem_ptr_cxt.ControlFile->backupStartPoint) true

t_thrd.xlog_cxt.ArchiveRecoveryRequested true

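Variables whose values are uncertain. They are taken from the last log entries, but I am not sure these are the final values.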

t_thrd.xlog_cxt.minRecoveryPoint value: LSN 2/47002370

EndOfLog value: 9797894144, LSN 2/48000000
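
The two LSN values above can be compared numerically. The following is a minimal sketch, assuming the usual "hi/lo" hexadecimal LSN notation and a 64-bit XLogRecPtr where XLByteLT(a, b) reduces to a plain a < b; parse_lsn and lsn_less_than are hypothetical helper names, not functions from the Vastbase source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse an LSN written as "hi/lo" (two hexadecimal
 * halves) into a 64-bit byte position.
 * Returns 1 on success, 0 if the text is malformed.
 */
int parse_lsn(const char *text, uint64_t *out)
{
    unsigned int hi = 0;
    unsigned int lo = 0;

    assert(text != NULL && out != NULL);
    if (sscanf(text, "%X/%X", &hi, &lo) != 2) {
        return 0;
    }
    *out = ((uint64_t)hi << 32) | lo;
    return 1;
}

/*
 * Assumed equivalent of XLByteLT(a, b) for 64-bit LSNs:
 * true when a lies strictly before b.
 */
int lsn_less_than(uint64_t a, uint64_t b)
{
    return a < b;
}
```

Parsing "2/48000000" gives 9797894144, which matches the EndOfLog value in the log, and "2/47002370" gives 9781126000. EndOfLog is therefore not before minRecoveryPoint, so XLByteLT(EndOfLog, minRecoveryPoint) evaluates to false for these values.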

Questions

1. Why does this error occur? According to the log, the last replayed position is greater than minRecoveryPoint.

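2. This branch is entered when backupEndRequired is false. Across the whole code base, the value is only modified in read_backup_label: when the backup method is pg_start_backup (non-exclusive backups record streamed instead), it is set to false. Why is that taken to mean there is no backup_end record?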

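backupEndRequired
If backupEndRequired is true, we know for sure that we are restoring from a backup, and must see
a backup-end record before we can safely start up. If it is false but backupStartPoint is set, a backup_label file was found at startup,
but it may have been a leftover from a stray pg_start_backup() call,
not accompanied by pg_stop_backup().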
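
The rule described in question 2 can be sketched as follows. This is an illustration only, assuming the backup_label file carries a "BACKUP METHOD: ..." line as in upstream PostgreSQL, from which this code appears to derive; backup_end_required_from_label is a hypothetical name and not the actual read_backup_label implementation in Vastbase.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: given one line of backup_label text, decide whether
 * a backup-end record must be seen before startup is considered safe.
 * Only the "streamed" method makes the backup-end record mandatory; any
 * other method, or a line that is not a BACKUP METHOD line, yields false.
 */
bool backup_end_required_from_label(const char *method_line)
{
    char method[20];

    assert(method_line != NULL);
    if (sscanf(method_line, "BACKUP METHOD: %19s", method) != 1) {
        return false;
    }
    return strcmp(method, "streamed") == 0;
}
```

Under this reading, a false value does not assert that the backup-end record is absent. It only means recovery cannot be sure the backup_label belongs to a completed backup. What signals that no backup-end record has been replayed is backupStartPoint still being set at the end of WAL; in upstream PostgreSQL that field is reset when the backup-end record is replayed, and the same is assumed here.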

3 answers

独行猪

Uploaded attachment: vdata2.txt

Helpful: 2

吾亦可往

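You note that the last replayed position (EndOfLog) is greater than minRecoveryPoint, but that comparison is only one half of the condition. The error branch is entered when XLByteLT(EndOfLog, t_thrd.xlog_cxt.minRecoveryPoint) || !XLogRecPtrIsInvalid(t_thrd.shemem_ptr_cxt.ControlFile->backupStartPoint) is true. With EndOfLog at 2/48000000 and minRecoveryPoint at 2/47002370 the first operand is false, so the branch was entered through the second operand: backupStartPoint is still set.
Specifically, recovery expects to replay all WAL (write-ahead log) generated between the start of the online backup (pg_start_backup()) and its end (pg_stop_backup()), so that the restored data is consistent and complete.
The error means that WAL ran out before recovery reached the backup-end WAL record or minRecoveryPoint. That usually means either pg_stop_backup() was never called correctly to mark the end of the online backup, or not all WAL up to the end of the backup was made available for recovery (for example, it was not archived properly). So even though EndOfLog is greater than minRecoveryPoint, the error is still raised as long as the full backup-and-recovery sequence has not been satisfied.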
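
The branch selection can be traced with a small standalone model of the quoted check. This is a sketch under stated assumptions: RecoveryEndState, Verdict, and end_of_recovery_verdict are hypothetical names, the value 0 stands in for InvalidXLogRecPtr, and the function mirrors only the control flow of the snippet in the question rather than the real startup code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the fields the quoted check reads. */
typedef struct RecoveryEndState {
    bool in_recovery;
    bool reached_stop_point;
    bool archive_recovery_requested;
    bool backup_end_required;
    uint64_t end_of_log;
    uint64_t min_recovery_point;
    uint64_t backup_start_point; /* 0 stands in for InvalidXLogRecPtr */
} RecoveryEndState;

typedef enum Verdict {
    VERDICT_OK,                      /* no FATAL raised by this check */
    VERDICT_STOP_BEFORE_CONSISTENT,  /* "requested recovery stop point is before ..." */
    VERDICT_BACKUP_END_REQUIRED,     /* hint: all WAL generated during backup needed */
    VERDICT_STOP_BACKUP_MISSING,     /* hint: pg_start_backup() must be ended ... */
    VERDICT_BEFORE_CONSISTENT_POINT  /* "WAL ends before consistent recovery point" */
} Verdict;

/* Mirror the if/else ladder of the quoted source and report which arm fires. */
Verdict end_of_recovery_verdict(const RecoveryEndState *s)
{
    assert(s != NULL);

    bool short_of_min = s->end_of_log < s->min_recovery_point;
    bool backup_open = s->backup_start_point != 0;

    if (!(s->in_recovery && (short_of_min || backup_open))) {
        return VERDICT_OK;
    }
    if (s->reached_stop_point) {
        return VERDICT_STOP_BEFORE_CONSISTENT;
    }
    if (s->archive_recovery_requested || s->backup_end_required) {
        if (s->backup_end_required) {
            return VERDICT_BACKUP_END_REQUIRED;
        }
        if (backup_open) {
            return VERDICT_STOP_BACKUP_MISSING;
        }
        return VERDICT_BEFORE_CONSISTENT_POINT;
    }
    /* Crash recovery during an online backup: deliberately not an error. */
    return VERDICT_OK;
}
```

With the values from the question (in recovery, archive recovery requested, backupEndRequired false, EndOfLog 9797894144, minRecoveryPoint 9781126000, and any nonzero backupStartPoint) the function returns VERDICT_STOP_BACKUP_MISSING, which corresponds to the hint that was logged. Setting backup_start_point to 0 with the same LSNs returns VERDICT_OK, which is why the state of backupStartPoint, not the LSN comparison, decides the outcome here.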

Helpful: 2