暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

达梦数据库REDO误删除恢复疑问?

IT小Chen 2024-05-13
1300

说明:

    在没有备份的情况下,经测试非CURRENT REDO LOG FILE丢失可以恢复,CURRENT REDO LOG FILE使用网上的方法无法恢复,初步怀疑新版本对REDO增加了更严格的校验,具体恢复方法还要咨询原厂,测试过程如下。

环境说明:

数据库版本

    SQL> SELECT * FROM V$VERSION;


    LINEID BANNER
    ---------- ---------------------------------
    1 DM Database Server 64 V8
    2 DB Version: 0x7000c
    3 03134284172-20240321-222308-20093
    4 Msg Version: 0
    5 Gsu level(3-4) cnt: 3
    6 Gsu level(5) cnt: 0

    小版本:

      SQL> SELECT BUILD_VERSION FROM V$INSTANCE;


      LINEID BUILD_VERSION
      ---------- ------------------------------------
      1 1-3-140-2024.03.21-222308-20093-ENT

      补丁版本:

        SQL> SELECT ID_CODE;
        LINEID ID_CODE
        ---------- -----------------------------------
        1 --03134284172-20240321-222308-20093

        安装介质下载地址

          https://www.dameng.com/list_103.html

          创建测试数据

            SQL> CREATE TABLE T1(ID INT);
            SQL> INSERT INTO T1 VALUES(1),(2),(3);
            SQL> COMMIT;

            查询信息

            查看REDO信息 

              SQL> SELECT GROUP_ID,FILE_ID,PATH,RLOG_SIZE FROM V$RLOGFILE;


              LINEID GROUP_ID FILE_ID PATH RLOG_SIZE
              ---------- ----------- ----------- ----------------------- --------------------
              1 2 0 dm8/data/CJC/CJC01.log 268435456
              2 2 1 dm8/data/CJC/CJC02.log 268435456
                SQL> SELECT CKPT_LSN,FILE_LSN,FLUSH_LSN,CUR_LSN,DB_MAGIC FROM V$RLOG;


                LINEID CKPT_LSN FILE_LSN FLUSH_LSN CUR_LSN DB_MAGIC
                ---------- -------------------- -------------------- -------------------- -------------------- --------------------
                1 44644 44656 44656 44656 1516351486

                新增数据

                  INSERT INTO T1 VALUES(4),(5);
                  COMMIT;
                    SQL> SELECT CKPT_LSN,FILE_LSN,FLUSH_LSN,CUR_LSN,DB_MAGIC FROM V$RLOG; 


                    LINEID CKPT_LSN FILE_LSN FLUSH_LSN CUR_LSN DB_MAGIC
                    ---------- -------------------- -------------------- -------------------- -------------------- --------------------
                    1 44644 44661 44661 44661 1516351486

                    V$ROG和V$RLOGFILE视图说明:

                      https://eco.dameng.com/document/dm/zh-cn/pm/dm8-admin-manual-appendix#6.%20%E6%97%A5%E5%BF%97%E7%AE%A1%E7%90%86

                      V$RLOG

                      显示日志的总体信息。通过该视图可以了解系统当前日志事务号 LSN 的情况、归档日志情况、检查点的执行情况等。

                      ......

                      V$RLOGFILE

                      显示日志文件的具体信息。包括文件号、完整路径、文件的状态、文件大小等等。

                      查看REDO信息

                        SQL> host ls -lrth dm8/data/CJC/CJC*.log
                        -rw-r--r-- 1 dmdba dinstall 256M May 11 21:45 dm8/data/CJC/CJC02.log
                        -rw-r--r-- 1 dmdba dinstall 256M May 11 21:51 dm8/data/CJC/CJC01.log

                        可以看到,CJC01.log 时间较新,为CURRENT REDO;

                        模拟故障:

                        非current redo 丢失

                          SQL> host mv dm8/data/CJC/CJC02.log dm8/data/CJC/CJC02.log.bak 
                          SQL> shutdown abort;

                          启动

                            [dmdba@cjc-db-01 CJC]$ dmserver dm.ini 

                            前台报错:

                              /dm8/data/CJC/CJC02.log not exist, can not startup

                              后台报错:

                                [dmdba@cjc-db-01 log]$ tail -10f dm_CJC_202405.log
                                2024-05-11 22:00:21.283 [FATAL] database P0000008747 T0000000000000008747 dm8/data/CJC/CJC02.log not exist,can not startup

                                启动数据库,找不到CJC02.og,导致启动失败。

                                用现有的REDO冒充丢失的REDO

                                  [dmdba@cjc-db-01 CJC]$ cp CJC01.log CJC02.log

                                  启动

                                    [dmdba@cjc-db-01 CJC]$ dmserver dm.ini 

                                    前台报错 

                                      rfil grp init log file dm8/data/CJC/CJC02.log error, code = -717

                                      后台报错

                                        2024-05-11 22:02:32.168 [INFO] database P0000008964 T0000000000000008964  rfil[/dm8/data/CJC/CJC01.log] ver: 7007, sta: 0, arch_lsn = 0, arch_seq = 0, clsn = 44660, clsn_fil = 0, clsn_off = 17547264, l_next_seq = 5058, g_next_seq = 5058, free = 17567744, rpkg_crc_stand = 1
                                        2024-05-11 22:02:32.168 [ERROR] database P0000008964 T0000000000000008964 arch file(/dm8/data/CJC/CJC02.log) fil id check fail
                                        2024-05-11 22:02:32.168 [ERROR] database P0000008964 T0000000000000008964 rfil grp init log file /dm8/data/CJC/CJC02.log error, code = -717
                                        2024-05-11 22:02:32.168 [ERROR] database P0000008964 T0000000000000008964 rlog4_init_low->rfil_grp_alloc failed, alloc_only = 0!
                                        2024-05-11 22:02:32.168 [ERROR] database P0000008964 T0000000000000008964 hlog_upgrade failed, rlog4_sys_init return code[-126]

                                        查看报错:fil id check fail,因为两个REDO fil id不能相同,所以报错。

                                          arch file(/dm8/data/CJC/CJC02.log) fil id check fail

                                          使用dmmdbf工具修改 fil id

                                          dmmdbf工具是什么?

                                            https://eco.dameng.com/document/dm/zh-cn/ops/troubleshooting-human

                                            工具介绍

                                            Dmmdf 工具:主要对 DM 相关的文件的属性进行修改。支持的修改的文件类型有:dbf、rlog、original bak、bakset meta、bakset bkp、bakset,根据文件的不同,可修改的属性也有所不同,最主要的是修改 db_magic。

                                            查看帮助信息

                                              [dmdba@cjc-db-01 CJC]$ dmmdf help
                                              version: 03134284172-20240321-222308-20093
                                              Format: ./dmmdf KEYWORD=value


                                              Example: ./dmmdf TYPE=1 FILE=/opt/data/DAMENG/SYSTEM.DBF


                                              Keyword Explanation
                                              --------------------------------------------------------------------------------
                                              TYPE type
                                              1: for dbf
                                              bp_flag=0: Modify the normal instance
                                              bp_flag=1: Modify the BP instance of DPC
                                              2: for rlog
                                              3: for original bak
                                              4: for bakset meta
                                              5: for bakset bkp
                                              6: for bakset
                                              7: for database degrades
                                              scope=1: Only supports from 0x0007000A to 0x00070009, file is dm.ctl path
                                              scope=2: Only supports from 0x0007000B to 0x0007000A, file is dm.ini path
                                              scope=3: Only supports from 0x0007000C to 0x0007000B for single node database, file is dm.ini path
                                              8: for rlog degrades
                                              scope=1 Supports from 0x7006 to 0x7005
                                              scope=2 Supports from 0x7007 to 0x7006
                                              9: for specified database upgrade(Only supports from V8.1.1.88(V7.1.7.88) to V8.1.1.100(V7.1.7.100)
                                              10: for updating db_magic of database, file is dm.ini path, db_magic is which update to
                                              bp_flag=0: Modify the normal instance
                                              bp_flag=1: Modify the BP instance of DPC, also need page_size to be set
                                              FILE file path
                                              PAGE_SIZE for TYPE=10 BP_FLAG=1, the page size of the datafile, default 8192
                                              DCR_INI dmdcr.ini path
                                              SCOPE scope for database degrades when type=7 or 8
                                              BP_FLAG whether to modify the BP instance of DPC when type=1 or type=10, default not(0)
                                              DB_MAGIC target db_magic when type=10
                                              HELP                show this help info
                                              --------------------------------------------------
                                              Example: ./dmmdf TYPE=1 FILE=/opt/data/DAMENG/SYSTEM.dbf
                                              ./dmmdf TYPE=2 FILE=/opt/data/DAMENG/DAMENG01.log
                                              ./dmmdf TYPE=3 FILE=/opt/data/DAMENG/bak/dmdb.bak
                                              ./dmmdf TYPE=4 FILE=/opt/data/DAMENG/bak/bset/nbak.meta
                                              ./dmmdf TYPE=5 FILE=/opt/data/DAMENG/bak/bset/nbak.bak
                                              ./dmmdf TYPE=6 FILE=/opt/data/DAMENG/bak/bset
                                              ./dmmdf TYPE=7 SCOPE=1 FILE=/opt/data/DAMENG/dm.ctl
                                              ./dmmdf TYPE=7 SCOPE=2 FILE=/opt/data/DAMENG/dm.ini
                                              ./dmmdf TYPE=7 SCOPE=3 FILE=/opt/data/DAMENG/dm.ini
                                              ./dmmdf TYPE=8 SCOPE=1 FILE=/opt/data/DAMENG/dm.ini
                                              ./dmmdf TYPE=8 SCOPE=2 FILE=/opt/data/DAMENG/dm.ini
                                              ./dmmdf TYPE=9 FILE=/opt/data/DAMENG/dm.ini
                                              ./dmmdf TYPE=10 FILE=/opt/data/DAMENG/dm.ini DB_MAGIC=123456

                                              修改CJC01.log,将13 fil_id = 0改成13 fil_id = 1

                                                [dmdba@cjc-db-01 CJC]$ dmmdf TYPE=2 FILE=/dm8/data/CJC/CJC01.log
                                                dmmdf V8
                                                **********************************************************
                                                1 sig = DMRLOG
                                                2 ver = 7007
                                                3 chksum = 969006225
                                                4 sta = 0
                                                5 n_magic = 7
                                                6 db_magic = 1516351486
                                                7 len = 268435456
                                                8 free = 17567744
                                                9 clsn = 44660
                                                10 clsn_fil = 0
                                                11 clsn_off = 17547264
                                                12 pemnt_magic = 1510880832
                                                13 fil_id = 0
                                                15 next_seq = 5058
                                                16 g_next_seq = 5058
                                                17 arch_lsn = 0
                                                18 arch_seq = 0
                                                19 dbversion = 0x7000c
                                                20 min_exec_version = V8.1.1.1
                                                21 min_dct_version = 4
                                                22 p_db_magic = 0
                                                23 n_apply_ep = 0
                                                24 apply_info_lsn = 0
                                                pkg_seq_arr: (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
                                                apply_lsn_arr: (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
                                                25 recv_p_db_magic = 0
                                                26 recv_n_apply_ep = 0
                                                recv_pkg_seq_arr: (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
                                                recv_apply_lsn_arr: (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
                                                27 l_term_id = 0
                                                28 term_id = 0
                                                29 c_seqno = 5057
                                                30 c_lsn = 44661
                                                31 rpkg_crc_stand = 1
                                                **********************************************************
                                                You can only reset sta(4) or db_magic (6) or clsn (9) or clsn_fil(10) or clsn_off(11) or pemnt_magic(12) or fil_id(13) or next_seq(15) or g_next_seq(16) or p_db_magic(22) or n_apply_ep(23).
                                                Please input the num which one you want to change, q to quit: 13
                                                Input the new value: 1
                                                **********************************************************

                                                再次启动:

                                                  [dmdba@cjc-db-01 CJC]$ dmserver dm.ini

                                                  启动成功

                                                    ndct_db_load_info finished, code:0.
                                                    nsvr_process_before_open begin.
                                                    nsvr_process_before_open success.
                                                    SYSTEM IS READY.

                                                    登录数据库,检查

                                                      [dmdba@cjc-db-01 CJC]$ disql SYSDBA/CJC123456:5238


                                                      Server[LOCALHOST:5238]:mode is normal, state is open
                                                      login used time : 15.570(ms)
                                                      disql V8


                                                      SQL> SELECT * FROM T1;
                                                      LINEID ID
                                                      ---------- -----------
                                                      1 1
                                                      2 2
                                                      3 3
                                                      4 4
                                                      5          5
                                                      used time: 1.535(ms). Execute id is 501.


                                                      SQL> INSERT INTO T1 VALUES(6);
                                                      affect rows 1
                                                      used time: 1.171(ms). Execute id is 502.


                                                      SQL> COMMIT;
                                                      executed successfully
                                                      used time: 8.523(ms). Execute id is 503.

                                                      current redo 丢失

                                                      方法一:失败

                                                      按照之前的方法,将另一个redo复制为丢失的redo,使用dmmdf工具只修改fil_id仍然无法启动数据库,

                                                      继续修改其他几个值,还是无法启动数据库

                                                        sta(4) 
                                                        clsn (9) 44727
                                                        clsn_fil(10)
                                                        clsn_off(11) 17592320
                                                        fil_id(13) 0
                                                        next_seq(15) or 5068
                                                        g_next_seq(16) or 5068

                                                        方法二:失败

                                                        参考https://eco.dameng.com/解决方案,其他机器上重新创建一个相同名称、相同配置的实例,干净关库后考走redo文件到原库,手动修改db_magic等,无法解决问题。

                                                        报错如下:

                                                          2024-05-11 22:26:31.493 [INFO] database P0000010953 T0000000000000010953  main rfil[/dm8/data/CJC/CJC01.log]'s sta is active
                                                          2024-05-11 22:26:31.493 [ERROR] database P0000010953 T0000000000000010953 redo_rfil_grp_pwr_collect->rlog4_first_rpkg_seqno_get failed, code:-723
                                                          2024-05-11 22:26:31.494 [INFO] database P0000010953 T0000000000000010953 main rfil [/dm8/data/CJC/CJC01.log]'s grp collect 0 valid pwr record, discard 0 invalid pwr record

                                                          初步怀疑新版本对REDO增加了更严格的校验,具体恢复方法还要咨询原厂,有知道解决方法的欢迎留言。

                                                          参考链接:

                                                            https://eco.dameng.com/community/article/a702397bf5cb0aaf0d018e9cb1bee166
                                                            https://eco.dameng.com/community/article/1ffd251e3b2e82b08e0faf0eae9e345c
                                                            http://www.killdb.com/2021/08/27/%e8%be%be%e6%a2%a6%e6%95%b0%e6%8d%ae%e5%ba%93%e5%ad%a6%e4%b9%a0%e7%ac%94%e8%ae%b0%e4%b9%8b-current-redo%e8%af%af%e5%88%a0%e9%99%a4%e6%81%a2%e5%a4%8d/

                                                            ###chenjuchao 20240512###

                                                            欢迎关注我的公众号《IT小Chen》

                                                               

                                                            文章转载自IT小Chen,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

                                                            评论