
造成数据块损坏的原因多种多样,可是因为物理原因导致,也可能人为原因或Oracle bug导致。

物理坏块(也可以称为介质坏块)指的是块格式本身是坏的,块内的数据没有任何意义。
逻辑坏块,指的是块内的数据在逻辑是存在问题。(内容)

物理一致性检查利用校验和字段工作,主要侧重于检查硬件故障并不关心内容正确与否。
逻辑一致性检查就是侧重于内容的检查,内容检查要比校验和检查复杂的多。

坏块不能避免,只能尽量减少发生坏块。
>create tablespace crpt_ts datafile '/u01/app/oracle/oradata/orcl/crpt_ts.dbf' size 100M;>create table scott.crpt tablespace crpt_ts as select * from dba_objects where rownum <3000;> select distinct dbms_rowid.rowid_block_number(rowid) b_no ,dbms_rowid.rowid_relative_fno(rowid) f_no from scott.crpt order by 1; B_NO F_NO---------- ---------- 131 7 132 7……………………省略…………………………….. 171 739 rows selected.SYS@ orcl>select a.file_id,a.block_id,a.blocks,b.name from dba_extents a,v$datafile b where a.file_id=b.file# and a.owner='SCOTT' and a.segment_name='CRPT' order by a.block_id; FILE_ID BLOCK_ID BLOCKS NAME---------- ---------- ---------- -------------------------------------------------- 7 128 8 /u01/app/oracle/oradata/orcl/crpt_ts.dbf 7 136 8 /u01/app/oracle/oradata/orcl/crpt_ts.dbf 7 144 8 /u01/app/oracle/oradata/orcl/crpt_ts.dbf 7 152 8 /u01/app/oracle/oradata/orcl/crpt_ts.dbf 7 160 8 /u01/app/oracle/oradata/orcl/crpt_ts.dbf 7 168 8 /u01/app/oracle/oradata/orcl/crpt_ts.dbf6 rows selected.---破坏137,158 数据块的内容seek=n从输出文件开头跳过 n个blocks 个块后再开始复制。conv=notrunc不截短输出文件dd if=/dev/zero of=/u01/app/oracle/oradata/orcl/crpt_ts.dbf bs=8192 conv=notrunc seek=137 count=1dd if=/dev/zero of=/u01/app/oracle/oradata/orcl/crpt_ts.dbf bs=8192 conv=notrunc seek=158 count=1也可以这样:• dd of=/u01/app/oracle/oradata/orcl/crpt_ts.dbf bs=8192 conv=notrunc seek=158 <<EOF • > Corrupt me! > EOF
此时dbv命令已经可以检查出两个坏块了:
[oracle@cuug101 script]$ dbv file=/u01/app/oracle/oradata/orcl/crpt_ts.dbfDBVERIFY: Release 11.2.0.4.0 - Production on Tue May 15 16:25:24 2018Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.…………………省略………………….Total Pages Empty : 12629Total Pages Marked Corrupt : 2Highest block SCN : 12592990 (0.12592990)
启动数据库已经发现无法对有坏块的对象进行全表扫描:
SYS@ orcl>select count(*) from scott.crpt;select count(*) from scott.crpt*ERROR at line 1:ORA-01578: ORACLE data block corrupted (file # 7, block # 137)ORA-01110: data file 7: '/u01/app/oracle/oradata/orcl/crpt_ts.dbf'
通过告警日志看出数据库在file 7block 137中的checksum的值和读取时重新计算的值已经不同,由于两次checksum值不同(即异或结果为非0),说明数据块被修改过,数据块为坏块(corruption)。
ORA-01578: ORACLE data block corrupted (file # 7, block # 137)ORA-01110: data file 7: '/u01/app/oracle/oradata/orcl/crpt_ts.dbf'Hex dump of (file 7, block 158) in trace file /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_259354/orcl_m000_5108_i259354_a.trcCorrupt block relative dba: 0x01c0009e (file 7, block 158)Completely zero block found during validationSYS@ orcl>select * from v$database_block_corruption;FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO---------- ---------- ---------- ------------------ ---------7 137 1 0 ALL ZERO7 158 1 0 ALL ZERO
SELECT OWNER, SEGMENT_NAME, SEGMENT_TYPE, TABLESPACE_NAMEFROM DBA_EXTENTS AWHERE FILE_ID = 7AND 137 BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1/OWNER SEGMENT_NAME SEGMENT_TYPE TABLESPACE_NAME------------------------------ -------------------- ------------------ ------------------------------SCOTT CRPT TABLE CRPT_TS
[oracle@cuug101 ~]$ exp system/oracle file=crpt.dmp tables=scott.crpt[oracle@cuug101 ~]$ expdp system/oracle dumpfile=crpt.dmp tables=scott.crptExport: Release 11.2.0.4.0 - Production on Tue May 15 23:49:52 2018Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.. . exporting table TESTEXP-00056: ORACLE error 1578 encounteredORA-01578: ORACLE data block corrupted (file # 7, block # 137)ORA-01110: data file 4: '/u01/app/oracle/oradata/orcl/users01.dbf'Export terminated successfully with warnings.SYS@ orcl>alter system set events='10231 trace name context forever,level 10';
alter system set events='10231 trace name context off';System altered.[oracle@cuug101 exp]$ expdp system/oracle dumpfile=crpt.dmp tables=scott.crptProcessing object type TABLE_EXPORT/TABLE/TABLE. . exported "SCOTT"."CRPT" 262.7 KB 2842 rowsMaster table "SYSTEM"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded[oracle@cuug101 exp]$ impdp system/oracle dumpfile=crpt.dmp remap_table=scott.crpt:crpt1Processing object type TABLE_EXPORT/TABLE/TABLE_DATA. . imported "SCOTT"."CRPT1" 262.7 KB 2842 rows
SYS@ orcl>select table_name, skip_corrupt from dba_tables where table_name = 'CRPT' AND owner='SCOTT';TABLE_NAME SKIP_COR------------------------------ --------CRPT DISABLED
set serveroutput onbegindbms_repair.admin_tables (table_name => 'REPAIR_TABLE',table_type => dbms_repair.repair_table,action => dbms_repair.create_action,tablespace => 'CRPT_TS');end;/
declarerpr_count int;beginrpr_count := 0;dbms_repair.check_object (schema_name => 'SCOTT',object_name => 'CRPT',repair_table_name => 'REPAIR_TABLE',corrupt_count => rpr_count);dbms_output.put_line('repair count: ' || to_char(rpr_count));end;repair count:2
SYS@ orcl>col object_name for a20SYS@ orcl>col CORRUPT_DESCRIPTION for a50SYS@ orcl>col REPAIR_DESCRIPTION for a40SYS@ orcl>col CORRUPT_DESCRIPTION for a20SYS@ orcl>select object_name, block_id, corrupt_type, marked_corrupt,corrupt_description,repair_description from repair_table;OBJECT_NAME BLOCK_ID CORRUPT_TYPE MARKED_COR CORRUPT_DESCRIPTION REPAIR_DESCRIPTION-------------------- ---------- ------------ ---------- -------------------- ----------------------------------------CRPT 137 6148 TRUE mark block software corruptCRPT 158 6148 TRUE mark block software corrupt
1. 定位坏块:只有将坏块信息写入定义的REPAIR_TABLE后,才能处理坏块。
(skip/noskip) SYS@ orcl> declarefix_count int;beginfix_count := 0;dbms_repair.fix_corrupt_blocks (schema_name => 'SCOTT',object_name => 'CRPT',object_type => dbms_repair.table_object,repair_table_name => 'REPAIR_TABLE',fix_count => fix_count);dbms_output.put_line('fix count: ' || to_char(fix_count));end;SYS@ orcl>/fix count: 0PL/SQL procedure successfully completed.
begindbms_repair.skip_corrupt_blocks (schema_name => 'SCOTT',object_name => 'CRPT',object_type => dbms_repair.table_object,flags => dbms_repair.skip_flag);end;8 /PL/SQL procedure successfully completed.SYS@ orcl>select table_name, skip_corrupt from dba_tables where table_name = 'CRPT' AND owner='SCOTT';TABLE_NAME SKIP_COR------------------------------ --------CRPT ENABLEDselect count(*) from crpt;SYS@ orcl>select count(*) from scott.crpt;COUNT(*)----------2842
backup check logical validate datafile 500;RMAN> backup check logical validate datafile 500;Starting backup at 02-MAR-21using target database control file instead of recovery catalogallocated channel: ORA_DISK_1channel ORA_DISK_1: SID=8190 instance=cxbdzxdb1 device type=DISKchannel ORA_DISK_1: starting full datafile backup setchannel ORA_DISK_1: specifying datafile(s) in backup setinput datafile file number=00500 name=+DG_DATA_SSD_2/CXBDZXDB/DATAFILE/tbs_index.257.1065130555channel ORA_DISK_1: backup set complete, elapsed time: 00:01:36List of Datafiles=================File Status Marked Corrupt Empty Blocks Blocks Examined High SCN---- ------ -------------- ------------ --------------- ----------500 FAILED 0 145285 3932160 11215813018512File Name: +DG_DATA_SSD_2/CXBDZXDB/DATAFILE/tbs_index.257.1065130555Block Type Blocks Failing Blocks Processed---------- -------------- ----------------Data 0 1657837Index 1 2124520Other 0 4518validate found one or more corrupt blocksSee trace file /oracle/app/oracle/diag/rdbms/cxbdzxdb/cxbdzxdb1/trace/cxbdzxdb1_ora_35457316.trc for detailsFinished backup at 02-MAR-21
--执行完成后执行
set line 300select * from V$database_block_corruption;SQL> set line 300SQL> select * from V$database_block_corruption;FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTION_TYPE CON_ID---------- ---------- ---------- ------------------ ------------------ ----------500 1042393 1 1.1216E+13 CORRUPT 0
---查询文件号对应的数据文件
select file_id,FILE_NAME,TABLESPACE_NAME,BYTES/1024/1024,STATUS,AUTOEXTENSIBLE,ONLINE_STATUS from dba_data_files where file_id=856;
set linesize 300col owner for a30col table_name for a30col index_name for a30col partitioned for a30select owner,table_name,index_name,status,partitioned,uniqueness,tablespace_name from dba_indexes where owner='POLICY' and index_name='IDX_PLC_RATION_TOPID' order by 3;
set long 3000select dbms_metadata.get_ddl('INDEX','IDX_PLC_RATION_TOPID','POLICY') a from dual;
drop index policy.IDX_PLC_RATION_TOPID;CREATE INDEX "POLICY"."IDX_PLC_RATION_TOPID" ON "POLICY"."PLC_RATION" ("TOPID") TABLESPACE "TBS_INDEX" parallel 12 ;
Fri Jul 2 12:41:36 2010Hex dump of (file 12, block 2718618) in trace file /u01/app/oracle/admin/bi/udump/bi_ora_31213.trcCorrupt block relative dba: 0x03297b9a (file 12, block 2718618)Fractured block found during backing up datafileData in bad block:type: 6 format: 2 rdba: 0x03297b9alast change scn: 0x0002.482fc15b seq: 0x1 flg: 0x06spare1: 0x0 spare2: 0x0 spare3: 0x0consistency value in tail: 0x77b20601check value in block header: 0x253computed block checksum: 0xb6e9Reread of blocknum=2718618, file=/u01/oradata/BI/estaging_user01.712.714072365. found same corrupt dataReread of blocknum=2718618, file=/u01/oradata/BI/estaging_user01.712.714072365. found same corrupt dataReread of blocknum=2718618, file=/u01/oradata/BI/estaging_user01.712.714072365. found same corrupt dataReread of blocknum=2718618, file=/u01/oradata/BI/estaging_user01.712.714072365. found same corrupt dataReread of blocknum=2718618, file=/u01/oradata/BI/estaging_user01.712.714072365. found same corrupt data
(2)查询数据库,可知含有坏块的对象:
SQL> col SEGMENT_NAME format a20col PARTITION_NAME format a10select owner,segment_name,partition_name from dba_extents where file_id = 12 and 2718618 between block_id and block_id + blocks-1;OWNER SEGMENT_NAME PARTITION_-------------------- -------------------- ----------ESTAGING LOG_RECORD_DETAIL_4 P20100630
(3)但全表扫描却没有任何问题:
SQL> select count(*) from ESTAGING.LOG_RECORD_DETAIL_4 partition (P20100630);COUNT(*)----------449937SQL> select count(*) from ESTAGING.LOG_RECORD_DETAIL_4;COUNT(*)----------42049608
(4)使用dbv检查发现有一个坏块(耗时较长):
$ dbv file=/u01/oradata/BI/estaging_user01.712.714072365 BLOCKSIZE=8192DBVERIFY: Release 10.2.0.4.0 - Production on Fri Jul 2 14:15:49 2010Copyright (c) 1982, 2007, Oracle. All rights reserved.DBVERIFY - Verification starting : FILE = /u01/oradata/BI/estaging_user01.712.714072365Page 2718618 is influx - most likely media corruptCorrupt block relative dba: 0x03297b9a (file 12, block 2718618)Fractured block found during dbv:Data in bad block:type: 6 format: 2 rdba: 0x03297b9alast change scn: 0x0002.482fc15b seq: 0x1 flg: 0x06spare1: 0x0 spare2: 0x0 spare3: 0x0consistency value in tail: 0x77b20601check value in block header: 0x253computed block checksum: 0xb6e9DBVERIFY - Verification completeTotal Pages Examined : 2748160Total Pages Processed (Data) : 2462446Total Pages Failing (Data) : 0Total Pages Processed (Index):235234Total Pages Failing (Index):0Total Pages Processed (Other):24969Total Pages Processed (Seg) : 0Total Pages Failing (Seg) : 0Total Pages Empty : 25510Total Pages Marked Corrupt : 1Total Pages Influx : 1Highest block SCN : 1229607770 (2.1229607770)
(5)使用rman检查含有坏块的数据文件(耗时较长),, 期间观察alert.log会发现同样的提示:
RMAN> backup validate datafile 12;这个时候访问v$database_block_corruption可以看到详细的坏块的信息:SQL> select * from v$database_block_corruption;FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO---------- ---------- ---------- ------------------ ---------12 2718618 1 0 FRACTURED
RMAN> blockrecover datafile 12 block 2718618 from backupset;
RMAN> BLOCKRECOVER CORRUPTION LIST;
SQL> select * from v$database_block_corruption;no rows selected
(9)再使用dbv检查发现没有坏块了(耗时较长):
$ dbv file=/u01/oradata/BI/estaging_user01.712.714072365 BLOCKSIZE=8192DBVERIFY: Release 10.2.0.4.0 - Production on Fri Jul 2 15:38:15 2010Copyright (c) 1982, 2007, Oracle. All rights reserved.DBVERIFY - Verification starting : FILE = /u01/oradata/BI/estaging_user01.712.714072365DBVERIFY - Verification completeTotal Pages Examined : 2749440Total Pages Processed (Data) : 2463763Total Pages Failing (Data) : 0Total Pages Processed (Index):235250Total Pages Failing (Index):0Total Pages Processed (Other):24981Total Pages Processed (Seg) : 0Total Pages Failing (Seg) : 0Total Pages Empty : 25446Total Pages Marked Corrupt : 0Total Pages Influx : 0Highest block SCN : 1230819157 (2.1230819157)
注意如果没有rman的备份,无法执行上面的语句,退一步讲,如果这是你有手工的热备,可以将热备catalog datafilecopy 'xxxxxx';转换成一个rman备份。
本 文 原 创 来 源:IT那活儿微信公众号(上海新炬王翦团队)





