13 oracle数据库坏块-逻辑坏块
逻辑数据坏块的场景
1)oracle bug也可能导致逻辑坏块的产生. 特别是parallel dml. 例如:
Bug 5621677 Logical corruption with PARALLEL update
Bug 6994194 Logical corruption from UPDATE DML
Bug 15980234 ORA-1400 / logical corruption from direct path INSERT ALL(fail with
ORA-1400, in direct path / PDML) 。
2)多数情况下逻辑坏块可能都是软件问题导致,当然数据库异常也可能导致。比如掉电的情况下,
就可能导致块内数据不一致
逻辑数据坏块检测工具
RMAN备份恢忽略soft corruption
soft corruption的块不计入maxcorrupt
media recovery会忽略soft corrupt
RMAN validate命令不会在alert中记录soft corrupt的信息,但是会在v$database_block_corruption中记录
DBV可以检测soft corruption
如果不设置event10231或者类似事件,那么soft corruption的块再次访问时报ORA-1578
逻辑数据坏块的检测
exp\expdp 该方式存在缺陷,是发现逻辑坏块后就会报错,不再继续检测.如果是索引逻辑坏块,那么exp是检测不到的
rman 同检测物理坏块一样,rman备份也具备检测逻辑坏块的功能,而且用的很多.
通常我们在进行数据库不完全恢复时,很多时候可能需要使用隐含参数强制打开数据库,打开之后我们都建议
使用rman进行检查,确认是否存在逻辑坏块等等,以免open后数据库出现异常
bbed 上面有介绍
sql查询 select select count(1) table_name ;
select /*+index_ffs(xxx) */ count(1) from table_name;
analyze命令 analyze index idx_name validate structure;
analyze table idx_name validate structure cascade online;
db_block_checking参数 在10g版本中,该参数默认是false,如果设置为true,那么可以进行逻辑坏块的检测,实际上设置以后, oracle在写入时就会进行check,避免产生逻辑坏块
逻辑坏块的分类
data block 通常我们是最大程度的将segment内的数据抢救出来,然后处理掉坏块.(如果是个别坏块,可以手工修复)
index block 对于index logical 坏块,一般来讲,我们通过rebuild就可以解决(有时或许不行,可以drop然后重建)
segment header/bitmap block ...etc 这种情况下比较复杂
模拟逻辑坏块
在很多情况下逻辑坏块都发生在索引上,所以这里我们以模拟索引逻辑坏块为例进行说明
YHQT@ orcl >create table yhqtest_6 as select owner,object_id,object_name from dba_objects where rownum<300;
Table created.
YHQT@ orcl >create index idx_yhqt6_id on yhqtest_6(object_id);
Index created.
YHQT@ orcl >select owner,object_id from dba_objects where object_name=upper('idx_yhqt6_id');
OWNER OBJECT_ID
------------------------------ ----------
YHQT 88407
YHQT@ orcl >alter session set events 'immediate trace name treedump level 88407';
Session altered.
--查看trace结果
[oracle@DSI ~]$ more /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2354.trc
*** 2019-07-15 16:00:53.250
*** MODULE NAME:(SQL*Plus) 2019-07-15 16:00:53.250
*** ACTION NAME:() 2019-07-15 16:00:53.250 ----- begin tree dump
leaf: 0x28000db 41943259 (0: nrow: 299 rrow: 299)
----- end tree dump
YHQT@ orcl >select dbms_utility.data_block_address_file(TO_NUMBER('28000db','XXXXXXXX')) file_id,
dbms_utility.data_block_address_block(TO_NUMBER('28000db','XXXXXXXX')) block_id from dual; 2 FILE_ID BLOCK_ID
---------- ----------
10 219
--使用bbed查看并修改
BBED> set file 10 block 219
FILE# 10
BLOCK# 219
BBED> map /v
File: /u01/app/oracle/oradata/orcl/yhqt01.dbf (10)
Block: 219 Dba:0x028000db
------------------------------------------------------------ BBED> p kdxle
struct kdxle, 32 bytes @100
struct kdxlexco, 16 bytes @100
ub1 kdxcolev @100 0x00
ub1 kdxcolok @101 0x00
ub1 kdxcoopc @102 0x80
ub1 kdxconco @103 0x02
ub4 kdxcosdc @104 0x00000000
sb2 kdxconro @108 299
sb2 kdxcofbo @110 634
sb2 kdxcofeo @112 4246
sb2 kdxcoavs @114 3612
sb2 kdxlespl @116 0
sb2 kdxlende @118 0
ub4 kdxlenxt @120 0x00000000
ub4 kdxleprv @124 0x00000000
ub1 kdxledsz @128 0x00
ub1 kdxleflg @129 0x00 (NONE) BBED> dump /v offset 108 count 32
File: /u01/app/oracle/oradata/orcl/yhqt01.dbf (10)
Block: 219 Offsets: 108 to 139 Dba:0x028000db
-------------------------------------------------------
2b017a02 96101c0e 00000000 00000000 l +.z.............
YHQT@ orcl >select to_char(299,'xxxxxx') from dual; TO_CHAR
-------
12b
BBED> modify /x 2a offset 108
File: /u01/app/oracle/oradata/orcl/yhqt01.dbf (10)
Block: 219 Offsets: 108 to 139 Dba:0x028000db
------------------------------------------------------------------------
2a017a02 96101c0e 00000000 00000000 00000000 00000000 601f0000 541f481f <32 bytes per line> BBED> sum apply
Check value for File 10, Block 219:
current = 0xf239, required = 0xf239
使用rman进行查看
RMAN> backup validate check logical database;
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
5 FAILED 0 81 64011 10373342
File Name: /home/oracle/backup/test01.tts
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 3 20628
Index 0 13088
Other 0 30203
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
10 FAILED 0 6190 6400 10411019
File Name: /u01/app/oracle/oradata/orcl/yhqt01.dbf
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 2 53
Index 1 2
Other 1 155
用视图查询
YHQT@ orcl >select * from v$database_block_corruption; FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
10 135 1 0 FRACTURED
10 160 1 0 ALL ZERO
5 2199 1 9533291 CORRUPT
5 2207 1 9579846 CORRUPT
5 2231 1 9620379 CORRUPT
10 151 1 10410172 CORRUPT
10 219 1 10411020 CORRUPT
CORRUPTION_CHANGE#值= 10410172,我们可以发现,corruption_change#值是大于0,那么表示是逻辑坏块
--查询坏块所在的对象
set autot off
set lines 150
col segment_name for a15
col owner for a20
SELECT tablespace_name, segment_type, owner, segment_name
FROM dba_extents
WHERE file_id = 10
and 219 between block_id AND block_id + blocks - 1;
YHQT@ orcl >SELECT tablespace_name, segment_type, owner, segment_name
FROM dba_extents
WHERE file_id = 10
and 219 between block_id AND block_id + blocks - 1; 2 3 4 TABLESPACE_NAME SEGMENT_TYPE OWNER SEGMENT_NAME
------------------------------ ------------------ -------------------- ---------------
YHQT INDEX YHQT IDX_YHQT6_ID
使用dbms_repair检测
创建repair table
YHQT@ orcl >conn / as sysdba
YHQT@ orcl >BEGIN
DBMS_REPAIR.ADMIN_TABLES (
TABLE_NAME => 'REPAIR_TABLE',
TABLE_TYPE => dbms_repair.repair_table,
ACTION => dbms_repair.create_action,
TABLESPACE => 'YHQT');
END;
/
检查对象上是否存在坏块
YHQT@ orcl >set serveroutput on
DECLARE num_corrupt INT;
BEGIN
num_corrupt := 0;
DBMS_REPAIR.CHECK_OBJECT (
SCHEMA_NAME => 'YHQT',
OBJECT_NAME => 'YHQTEST_6',
REPAIR_TABLE_NAME => 'REPAIR_TABLE',
corrupt_count => num_corrupt);
DBMS_OUTPUT.PUT_LINE('number corrupt: ' || TO_CHAR (num_corrupt));
END;
/
通过check后,就可以查询repair table了
SYS@ orcl >col CORRUPT_DESCRIPTION for a100
SYS@ orcl >select OBJECT_ID,RELATIVE_FILE_ID,BLOCK_ID,CORRUPT_TYPE,OBJECT_NAME,CORRUPT_DESCRIPTION FROM repair_table;
---修复逻辑块错误--index
SYS@ orcl >conn yhqt/***
Connected.
YHQT@ orcl >alter index IDX_YHQT6_ID rebuild;
Index altered.
YHQT@ orcl >select * from v$database_block_corruption;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
10 135 1 0 FRACTURED
10 160 1 0 ALL ZERO
5 2199 1 9533291 CORRUPT
5 2207 1 9579846 CORRUPT
5 2231 1 9620379 CORRUPT
10 151 1 10410172 CORRUPT
10 219 1 10411020 CORRUPT
7 rows selected. YHQT@ orcl >analyze table yhqtest_6 validate structure cascade online;
analyze table yhqtest_6 validate structure cascade online
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file
YHQT@ orcl >drop index IDX_YHQT6_ID;
Index dropped.
YHQT@ orcl >create index idx_yhqt6_id on yhqtest_6(object_id);
Index created.
YHQT@ orcl >analyze table yhqtest_6 validate structure cascade online;
Table analyzed.
使用expdp备份
[oracle@DSI ~]$ expdp system/**** DIRECTORY=dump_file_dir_test compression=all schemas=YHQT dumpfile=schemas_YHQT_%U.DMP parallel=2 logfile=schemas_YHQT_20190715.log
修复上一篇物理坏块测试中的
2.The block is Fractured/Incomplete - header and footer of the block do not match
分析alert日志
Mon Jul 15 11:12:19 2019
Hex dump of (file 10, block 143) in trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2354.trc
Corrupt block relative dba: 0x0280008f (file 10, block 143)
Fractured block found during multiblock buffer read
Data in bad block:
type: 6 format: 2 rdba: 0x0280008f
last change scn: 0x0000.009e53f5 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x54f50601 ===>>>tail=2 lower bytes of SCN Base+type+seq=53f5+06+01 而改变之后的值是54f50601
check value in block header: 0x46ca
computed block checksum: 0x0
Reading datafile '/u01/app/oracle/oradata/orcl/yhqt01.dbf' for corruption at rdba: 0x0280008f (file 10, block 143)
Reread (file 10, block 143) found same corrupt data (no logical check)
Mon Jul 15 11:12:19 2019
Corrupt Block Found
TSN = 10, TSNAME = YHQT
RFN = 10, BLK = 143, RDBA = 41943183
OBJN = 88385, OBJD = 88385, OBJECT = YHQTEST_2, SUBOBJECT =
SEGMENT OWNER = YHQT, SEGMENT TYPE = Table Segment
Corrupt Block Found
TSN = 10, TSNAME = YHQT
RFN = 10, BLK = 143, RDBA = 41943183
OBJN = 88385, OBJD = 88385, OBJECT = YHQTEST_2, SUBOBJECT =
SEGMENT OWNER = YHQT, SEGMENT TYPE = Table Segment
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2354.trc (incident=67366):
ORA-01578: ORACLE data block corrupted (file # 10, block # 143)
ORA-01110: data file 10: '/u01/app/oracle/oradata/orcl/yhqt01.dbf'
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_67366/orcl_ora_2354_i67366.trc
Mon Jul 15 11:12:21 2019
Sweep [inc][]: completed
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2354.trc (incident=67367):
ORA-01578: ORACLE data block corrupted (file # 10, block # 143)
ORA-01110: data file 10: '/u01/app/oracle/oradata/orcl/yhqt01.dbf'
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_67367/orcl_ora_2354_i67367.trc
Hex dump of (file 10, block 135) in trace file /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_67366/orcl_m000_2539_i67366_a.trc
用bbed修改
BBED> p tailchk
ub4 tailchk @8188 0x54f50601
BBED> modify /x 0106f553 offset 8188
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
File: /u01/app/oracle/oradata/orcl/yhqt01.dbf (10)
Block: 143 Offsets: 8188 to 8191 Dba:0x0280008f
------------------------------------------------------------------------
0106f553
<32 bytes per line>
BBED> sum apply
Check value for File 10, Block 143:
current = 0x41ca, required = 0x41ca
YHQT@ orcl >alter system flush buffer_cache;
YHQT@ orcl >set linesize 999
YHQT@ orcl >select * from yhqtest_2; ID NAME
---------- ---------===查询正常,数据也还在
1 yhq