
Analyzing and Resolving an Atypical Logical Block Corruption: Technical Life Series No. 68 (My Data Center Stories)

中亦安图 service account, 2022-03-24


A customer environment reported ORA-600 [13013], a classic logical corruption error. The error message was:

    ORA-00600: internal error code, arguments: [13013], [5001], [2066092], [142606836], [7], [142606836], [17], [], [], [], [], []


The arguments have the following meaning:

      ORA-600 [13013] [a] [b] [c] [d] [e] [f]
      This format relates to Oracle Server 8.0.3 to 10.1
      Arg [a] Passcount
      Arg [b] Data Object number
      Arg [c] Tablespace Relative DBA of block containing the row to be updated
      Arg [d] Row Slot number
      Arg [e] Relative DBA of block being updated (should be same as [c])
      Arg [f] Code
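The argument list above can be applied to the reported message mechanically. The following is a minimal sketch, not an Oracle tool: the function name and the field names are hypothetical labels chosen for illustration. It also assumes that the quoted 8.0.3 to 10.1 argument layout still applies to this 11.2.0.4 database, as the article does.

```python
import re

# Hypothetical field names, in the order [a]..[f] of the argument list above.
FIELDS = ("passcount", "data_object_id", "rdba_row",
          "slot", "rdba_updated", "code")

def parse_ora600_13013(message):
    """Map the bracketed arguments of an ORA-600 [13013] message to names."""
    args = re.findall(r"\[(\d*)\]", message)
    if not args or args[0] != "13013":
        raise ValueError("not an ORA-600 [13013] message")
    # args[0] is the 13013 code itself; the next six are [a]..[f].
    return dict(zip(FIELDS, (int(a) for a in args[1:7])))

msg = ("ORA-00600: internal error code, arguments: [13013], [5001], "
       "[2066092], [142606836], [7], [142606836], [17], [], [], [], [], []")
info = parse_ora600_13013(msg)
print(info["data_object_id"], info["rdba_row"], info["slot"])
# Per the argument list, [c] and [e] should match:
print(info["rdba_row"] == info["rdba_updated"])
```

For the message in this case, the two RDBA arguments are identical, which is what the argument description says to expect.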



The data object id of the affected object is 2066092.

The RDBA is 142606836 (0x088001F4).

The slot is 7.
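The RDBA can be split into a relative file number and a block number by hand. The sketch below assumes the usual smallfile layout, in which the upper 10 bits of the 32-bit RDBA hold the relative file number and the lower 22 bits hold the block number. The function name is hypothetical. Inside the database, `DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE` and `DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK` perform the same split.

```python
def decode_rdba(rdba):
    """Split a 32-bit relative DBA into (relative file#, block#).

    Assumes a smallfile tablespace: 10 bits of file number,
    22 bits of block number.
    """
    file_no = rdba >> 22         # upper 10 bits
    block_no = rdba & 0x3FFFFF   # lower 22 bits
    return file_no, block_no

rdba = 142606836
print(hex(rdba))          # 0x88001f4
print(decode_rdba(rdba))  # (34, 500)
```

The result, file 34 and block 500, is the block to look at in the subsequent checks.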






Object name:

        Select object_name,object_type,owner from dba_objects where data_object_id=2066092;
        OBJECT_NAME OBJECT_TYPE OWNER
        ------------------------------ ------------------------------ ------------------------------
        SECXXXX TABLE HS_XXXX


With this information, we immediately ran a logical check on table SECXXXX.

          Analyze table HS_XXXX.SECXXXX validate structure cascade online ;
          Table analyzed.
            dbv file=/u01/app/proddat.dbf blocksize=8192
            DBVERIFY: Release 11.2.0.4.0 - Production on Tue Mar 22 12:07:24 2022
            Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
            DBVERIFY - Verification starting : FILE = u01/app/proddat.dbf
            DBVERIFY - Verification complete


            Total Pages Examined : 64000
            Total Pages Processed (Data) : 2578
            Total Pages Failing (Data) : 0
            Total Pages Processed (Index): 0
            Total Pages Failing (Index): 0
            Total Pages Processed (Other): 1469
            Total Pages Processed (Seg) : 0
            Total Pages Failing (Seg) : 0
            Total Pages Empty : 59953
            Total Pages Marked Corrupt : 0
            Total Pages Influx : 0
            Total Pages Encrypted : 0
            Highest block SCN : 3116625269 (3662.3116625269)




Neither the logical check nor the physical check found any problem. Next, we looked at whether the contents stored in this block were damaged:

              BH (0xd503e4f8) file#: 34 rdba: 0x088001f4 (34/500) class: 1 ba: 0xd565c000
              set: 9 pool: 3 bsz: 8192 bsi: 0 sflg: 1 pwc: 0,28
              dbwrid: 0 obj: 2066092 objn: 2066092 tsn: 34 afn: 34 hint: f
              hash: [0x1905fda90,0xbb0270a8] lru: [0x13cf9c478,0xbcf7cf90]
              lru-flags: hot_buffer
              ckptq: [NULL] fileq: [NULL] objq: [0x159083ac8,0x78061328] objaq: [0x159083ad8,0x78061318]
              st: XCURRENT md: NULL fpin: 'kdswh05: kdsgrp' tch: 6
              flags: block_written_once redo_since_read
              LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [1]
              buffer tsn: 34 rdba: 0x088001f4 (34/500)
              scn: 0x0e4e.b9553816 seq: 0x01 flg: 0x06 tail: 0x38160601
              frmt: 0x02 chkval: 0xb7a6 type: 0x06=trans data


                tab 0, row 7, @0x17e0
                tl: 228 fb: --H-FL-- lb: 0x2 cc: 34
                col 0: [ 8] 33 30 31 31 37 34 35 35
                col 1: [ 8] 33 30 31 31 37 34 35 35
                col 2: [ 2] 32 45
                col 3: [12] 32 45 38 30 30 30 30 30 30 36 35 32
                col 4: [ 2] c1 04
                col 5: [ 5] c4 15 17 04 12
                col 6: [ 6] 39 39 30 32 30 34
                col 7: [24]
                36 31 33 30 30 30 30 30 30 32 30 32 32 30 33 31 34 38 30 30 30 30 32 38
                col 8: [ 1] 30
                col 9: [ 1] 20
                col 10: [ 6] c4 0b 5b 32 1c 47
                col 11: [ 6] c4 0b 4c 32 1c 47
                col 12: [ 2] c3 10
                col 13: [ 1] 80
                col 14: [ 1] 80
                col 15: [ 1] 80
                col 16: [ 1] 80
                col 17: [ 1] 80
                col 18: [ 1] 80
                col 19: [ 1] 80
                col 20: [ 6] c4 0b 5b 32 1c 47
                col 21: [ 5] c4 0c 1a 01 33
                col 22: [ 1] 80
                col 23: [ 6] c4 03 13 0a 56 37
                col 24: [ 1] 80
                col 25: [ 1] 80
                col 26: [ 1] 31
                col 27: [27]
                77 6f 77 73 75 7f 3f 7c 77 75 7b 79 74 77 78 75 7e 75 73 77 7f 3e 79 73 77
                7e 77
                col 28: [29]
                30 30 30 30 32 45 39 39 30 32 30 34 30 30 30 30 30 36 31 33 30 30 30 31 33
                32 35 36 33
                col 29: [ 5] c0 54 11 31 1e
                col 30: [ 1] 80
                col 31: [ 1] 80
                col 32: [12] 36 31 33 30 30 30 31 33 32 35 36 33
                col 33: [ 5] c4 15 17 04 10


Nothing wrong is visible in the block dump either.
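To check that the row really holds sensible values, the dumped column bytes can be decoded by hand. The sketch below is illustrative only: the article does not give the column definitions, so treating a column as character data or as NUMBER is an assumption based on the byte patterns, and both helper names are hypothetical. The NUMBER decoder handles only zero and positive values (first byte `0x80` or higher), using the base-100 encoding in which the first byte carries the exponent (offset `0xC1`) and each following byte is a digit plus one.

```python
from decimal import Decimal

def decode_text(dump):
    """Interpret space-separated hex bytes from a dump as ASCII text."""
    return bytes.fromhex(dump).decode("ascii")

def decode_number(dump):
    """Decode zero or a positive Oracle NUMBER from its dumped bytes."""
    raw = bytes.fromhex(dump)
    if raw == b"\x80":
        return Decimal(0)
    if raw[0] < 0x80:
        raise ValueError("negative NUMBER values are not handled here")
    exp = raw[0] - 0xC1                # base-100 exponent
    digits = [b - 1 for b in raw[1:]]  # base-100 digits
    return sum(d * Decimal(100) ** (exp - i) for i, d in enumerate(digits))

print(decode_text("33 30 31 31 37 34 35 35"))  # col 0  -> 30117455
print(decode_text("39 39 30 32 30 34"))        # col 6  -> 990204
print(decode_number("c1 04"))                  # col 4  -> 3
print(decode_number("c4 15 17 04 12"))         # col 5  -> 20220317
print(decode_number("c4 0b 5b 32 1c 47"))      # col 10 -> 10904927.70
print(decode_number("80"))                     # col 13 -> 0
```

The decoded values look like ordinary business data, which agrees with the observation that the block contents themselves are intact.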






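This is not an ordinary logical corruption. I quickly ran through what else could make Oracle mistakenly believe the block is logically corrupt. Block compression? Parallel execution? Add column default XX not null?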




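We started with the simplest candidates.

Block compression? We checked the DDL: not in use.

Parallel execution? We checked the execution plan: not in use.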





So is it Add column default XX not null? Yes.




                  select name column_name
                  from col$, dba_objects
                  where bitand(col$.PROPERTY,1073741824)=1073741824
                  and object_id=obj#
                  and owner = 'HS_XXX'
                  and object_name = 'SECXXXX';


                  COLUMN_NAME
                  ------------------------------
                  FIRST_BUY_DATE
                  PROD_COST_PRICE
                  INVEST_AMOUNT
                  UNPAID_INCOME
                  TRANS_ACCOUNT
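The `bitand(col$.PROPERTY,1073741824)=1073741824` predicate in the query above tests a single bit of the column property value. As a small sketch of that arithmetic: the mask is bit 30, and the comparison is true exactly when that bit is set. The property values passed in below are hypothetical, and reading this bit as the marker for columns added with a default value and NOT NULL follows the article's use of the query rather than any documented definition.

```python
FLAG = 1073741824  # the mask used in the query above

def has_flag(prop, mask=FLAG):
    """Same test as bitand(prop, mask) = mask in the SQL."""
    return (prop & mask) == mask

print(FLAG == (1 << 30))            # True: the mask is bit 30
print(hex(FLAG))                    # 0x40000000
print(has_flag(0x40000000 | 0x20))  # True  (hypothetical property value)
print(has_flag(0x20))               # False (hypothetical property value)
```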



Further analysis will be covered in the "小y-黄远邦-中亦" livestream on March 25 at 8 pm.

