catalog损坏情况下的数据库恢复实例
一.环境描述
1.运行环境:HP小型机,HPUX操作系统,Oracle 9.2.0.1,OpenView Data Protector备份管理软件,ESL9000带库。
数据库全备脚本:
run {
allocate channel 'dev_0' type 'sbt_tape'
parms 'ENV=(OB2BARTYPE=Oracle8,OB2APPNAME=oratest,OB2BARLIST=test2oratest)';
backup incremental level <incr_level> filesperset 1
format 'test2oratest<oratest_%s:%t:%p>.dbf'
database
include current controlfile;
sql 'alter system archive log current';
}
归档日志备份脚本:
run {
allocate channel 'dev_0' type 'sbt_tape'
parms 'ENV=(OB2BARTYPE=Oracle8,OB2APPNAME=oratest,
OB2BARLIST=test2oratestarchivelog)';
backup incremental level <incr_level> filesperset 1
format 'test2oratestarchivelog<oratest_%s:%t:%p>.dbf'
archivelog all delete input;
}
2.故障描述:Oracle数据库(测试库)运行在归档模式下。正常情况下,备份管理软件调用Oracle RMAN进行全库和归档日志备份。备份文件存储在ESL9000磁带库上。由于磁盘阵列故障,造成目标数据库控制文件和数据文件全部损坏;同时,存放catalog的数据库也被损坏。
由于catalog损坏,RMAN无法通过catalog找到备份集,restore无法成功。由于没有控制文件,数据库仅能启动到nomount状态下。
二.背景知识
在Oracle 816 以后的版本中,Oracle提供了一个包:DBMS_BACKUP_RESTORE。 DBMS_BACKUP_RESTORE 包是由dbmsbkrs.sql 和 prvtbkrs.plb 这两个脚本创建的。 catproc.sql 脚本运行后会调用这两个包。所以每个数据库都有的这个包作为Oracle服务器和操作系统之间IO操作的接口,由RMAN直接调用。我们可以在数据库nomount 情况下调用这些包 ,来达到数据库恢复的目的。
三.恢复步骤
1.查找备份集
备份集可以在Data Protector的internal database中或日志记录中找到。具体内容(根据需要做了剪裁)如下:
SBT-25032 (?) 10/08/04 10:42:54 [Normal] From: OB2BAR@test2 "oratest" Time: 10/08/04 10:42:54
Starting OB2BAR Backup: 05 test2:test2oratest<oratest_1:539001683:1>.dbf // Oracle8
SBT-25032 (?) 10/08/04 10:46:14 [Normal] From: OB2BAR@test2 "oratest" Time: 10/08/04 10:46:14
Completed OB2BAR Backup: 05 test2:test2oratest<oratest_1:539001683:1>.dbf // Oracle8
SBT-25032 (?) 10/08/04 10:46:22 [Normal] From: OB2BAR@test2 "oratest" Time: 10/08/04 10:46:22
Starting OB2BAR Backup: 05 test2:test2oratest<oratest_2:539001979:1>.dbf // Oracle8
SBT-25032 (?) 10/08/04 11:02:45 [Normal] From: OB2BAR@test2 "oratest" Time: 10/08/04 11:02:45
Completed OB2BAR Backup: 05 test2:test2oratest<oratest_2:539001979:1>.dbf // Oracle8
SBT-28813 (?) 10/08/04 11:34:57 [Normal] From: OB2BAR@test2 "oratest" Time: 10/08/04 11:34:57
Starting OB2BAR Backup: 05 test2:test2oratestarchivelog<oratest_3:539004793:1>.dbf // Oracle8
SBT-28813 (?) 10/08/04 11:35:17 [Normal] From: OB2BAR@test2 "oratest" Time: 10/08/04 11:35:17
Completed OB2BAR Backup: 05 test2:test2oratestarchivelog<oratest_3:539004793:1>.dbf // Oracle8
。。。
SBT-28813 (?) 10/08/04 11:41:57 [Normal] From: OB2BAR@test2 "oratest" Time: 10/08/04 11:41:57
Starting OB2BAR Backup: 05 test2:test2oratestarchivelog<oratest_26:539005316:1>.dbf // Oracle8
SBT-28813 (?) 10/08/04 11:42:08 [Normal] From: OB2BAR@test2 "oratest" Time: 10/08/04 11:42:08
Completed OB2BAR Backup: 05 test2:test2oratestarchivelog<oratest_26:539005316:1>.dbf // Oracle8
SBT-28813 (?) 10/08/04 11:42:13 [Normal] From: OB2BAR@test2 "oratest" Time: 10/08/04 11:42:13
Starting OB2BAR Backup: 05 test2:test2oratestarchivelog<oratest_27:539005332:1>.dbf // Oracle8
SBT-28813 (?) 10/08/04 11:42:16 [Normal] From: OB2BAR@test2 "oratest" Time: 10/08/04 11:42:16
Completed OB2BAR Backup: 05 test2:test2oratestarchivelog<oratest_27:539005332:1>.dbf // Oracle8
从上述内容可以看到存储在带库上的全备的备份集文件为:
test2:test2oratest<oratest_1:539001683:1>.dbf
test2:test2oratest<oratest_2:539001979:1>.dbf
Archivelog的备份集文件为:
test2:test2oratestarchivelog<oratest_3:539004793:1>.dbf
。。。
test2:test2oratestarchivelog<oratest_26:539005316:1>.dbf
test2:test2oratestarchivelog<oratest_27:539005332:1>.dbf
2.恢复控制文件
test2:/backup/test$sqlplus /nolog
SQL*Plus: Release 9.2.0.1.0 - Production on Sat Oct 9 14:30:54 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
SQL> connect / as sysdba
Connected.
SQL>startup nomount
SQL>DECLARE
2 devtype varchar2(256);
3 done boolean;
4 BEGIN
5 devtype:=sys.dbms_backup_restore.deviceAllocate(type=>'sbt_tape',
ident=>'T1',params=>'ENV=(OB2BARTYPE=Oracle8,OB2APPNAME=oratest,OB2BARLIST=test2oratest)');
6 sys.dbms_backup_restore.restoreSetDatafile;
7 sys.dbms_backup_restore.restoreControlfileTo(
cfname=>'/backup/test/Control01.ctl');
8 sys.dbms_backup_restore.restoreBackupPiece(done=>done,
handle=>'test2oratest<oratest_2:539001979:1>.dbf', params=>null);
9 sys.dbms_backup_restore.deviceDeallocate;
10 END;
11 /
[Normal] From: OB2BAR@test2 "oratest" Time: 10/09/04 14:23:28
Starting OB2BAR Restore: 05 test2:test2oratest<oratest_2:539001979:1>.dbf // Oracle8
[Normal] From: OB2BAR@test2 "oratest" Time: 10/09/04 14:23:32
Completed OB2BAR Restore: 05 test2:test2oratest<oratest_2:539001979:1>.dbf // Oracle8
SQL>
由于RMAN做全库备份的时候,最后备份控制文件,应此可以从最后一个全备的备份集文件中恢复控制文件。恢复控制文件时,需将数据库启动到nomount状态。恢复的控制文件应与init文件中定义的控制文件的路径、文件名和数量相一致,否则无法把数据库启动到mount状态。
下面介绍程序包的内容:
第五行 分配一个device channel ,"sbt_tape"说明是从磁带上恢复。Params参数与Data Protector中RMAN备份脚本中的参数一致。
第六行 初始化恢复过程,准备进行控制文件或数据文件恢复。
第七行 指出待恢复文件和恢复文件的存储位置。本处说明是恢复控制文件及存放存放控制文件的路径、文件名。
第八行 说明从哪个备份片中恢复
第九行 释放设备通道
3.恢复数据文件
首先需要知道每个备份集中包含哪些数据文件:
SQL>alter database mount;
SQL>select file# from v$backup_datafile where set_stamp=539001683;
FILE#
----------
2
0
2 rows selected.
SQL>
其中,set_stamp是备份集文件名中的时间戳;
0表示控制文件;
2是数据文件号。
文件号对应的文件名可以在v$datafile中查到。
知道备份集中包含的数据文件号和文件名,就可以恢复数据文件了:
SQL>DECLARE
2 devtype varchar2(256);
3 done boolean;
4 BEGIN
5 devtype:=sys.dbms_backup_restore.deviceAllocate(type=>'sbt_tape',
ident=>'T1',params=>'ENV=(OB2BARTYPE=Oracle8,OB2APPNAME=oratest,OB2BARLIST=test2oratest)');
6 sys.dbms_backup_restore.restoreSetDatafile;
7 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>2,
toname=>'/backup/test/undotbsldb01.dbf');
8 sys.dbms_backup_restore.restoreBackupPiece(done=>done,
handle=>'test2:test2oratest<oratest_1:539001683:1>.dbf', params=>null);
9 sys.dbms_backup_restore.deviceDeallocate;
10 END;
11 /
[Normal] From: OB2BAR@test2 "oratest" Time: 10/09/04 14:20:10
Starting OB2BAR Restore: 05test2:test2oratest<oratest_1:539001683:1>.dbf // Oracle8
[Normal] From: OB2BAR@test2 "oratest" Time: 10/09/04 14:20:24
Completed OB2BAR Restore: 05 test2:test2oratest<oratest_2:539001683:1>.dbf // Oracle8
PL/SQL procedure successfully completed.
SQL>
重复上述操作,可以恢复所有数据文件。恢复的数据文件的存储目录可以不是原存储目录,因此,甚至可以在数据库open状态下从备份集中restore数据文件。
4.恢复archive log文件
SQL>DECLARE
2 devtype varchar2(256);
3 done boolean;
4 BEGIN
5 devtype:=sys.dbms_backup_restore.deviceAllocate(type=>'sbt_tape',ident=>'T1',
params=>'ENV=(OB2BARTYPE=Oracle8,OB2APPNAME=oratest,
OB2BARLIST=test2oratestarchivelog)');
6 sys.dbms_backup_restore.restoreSetArchivedLog;
7 sys.dbms_backup_restore.restoreArchivedLogRange;
8 sys.dbms_backup_restore.restoreBackupPiece(done=>done,
handle=>'test2:test2oratestarchivelog<oratest_26:539005316:1>.dbf', params=>null);
9 sys.dbms_backup_restore.deviceDeallocate;
10 END;
11 /
[Normal] From: OB2BAR@test2 "oratest" Time: 10/09/04 14:40:13
Starting OB2BAR Restore: 05 test2:test2oratestarchivelog<oratest_27:539005316:1>.dbf // Oracle8
[Normal] From: OB2BAR@test2 "oratest" Time: 10/09/04 14:40:13
Completed OB2BAR Restore: 05 test2:test2oratestarchivelog<oratest_27:539005316:1>.dbf // Oracle8
SQL>
根据第一步中获得的archive log备份集的文件名,恢复所有archive log.
5. Recover数据库
SQL> recover database until cancel using backup controlfile;
ORA-00279: change 54285 generated at 10/08/2004 10:41:24 needed for thread 1
ORA-00289: suggestion : /backup/test/archive/1_24.dbf
ORA-00280: change 54285 for thread 1 is in sequence #24
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00279: change 55331 generated at 10/08/2004 11:02:48 needed for thread 1
ORA-00289: suggestion : /backup/test/archive/1_25.dbf
ORA-00280: change 55331 for thread 1 is in sequence #25
ORA-00278: log file '/backup/test/archive/1_24.dbf' no longer needed for this recovery
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00279: change 56789 generated at 10/08/2004 11:33:12 needed for thread 1
ORA-00289: suggestion : /backup/test/archive/1_26.dbf
ORA-00280: change 56789 for thread 1 is in sequence #26
ORA-00278: log file '/backup/test/archive/1_25.dbf' no longer needed for this recovery
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00308: cannot open archived log '/backup/test/archive/1_26.dbf'
ORA-27037: unable to obtain file status
HP-UX Error: 2: No such file or directory
Additional information: 3
SQL> alter database open resetlogs;
Database altered.
SQL>
现在数据库恢复已经完成了。由于online redo log也损坏了,数据库仅恢复到最后一个归档的日志文件。
一个小技巧:在上一步恢复archive log时,archive log文件可能非常多。如果不知道最早应该恢复到哪一个archive log文件,可以先执行一次本步操作,系统会提示最先需要的archive log文件,然后从最后的一个archive log备份集往前恢复直到该archive log文件被恢复。示例如下:
SQL> recover database until cancel using backup controlfile;
ORA-00279: change 54285 generated at 10/08/2004 10:41:24 needed for thread 1
ORA-00289: suggestion : /backup/test/archive/1_24.dbf
ORA-00280: change 54285 for thread 1 is in sequence #24
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00308: cannot open archived log '/backup/test/archive/1_24.dbf'
ORA-27037: unable to obtain file status
HP-UX Error: 2: No such file or directory
Additional information: 3
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '/backup/test/oradata/system01.dbf'
SQL>
显然,第一个需要的archive log文件是/backup/test/archive/1_24.dbf
五.小结
1. 只要保存了完整的全库备份和完整的归档日志备份,即使控制文件和恢复目录全部损坏,数据库还是可以恢复的。
2. 根据上面一条可以看出,无论备份时是否采用了恢复目录,恢复目录在数据库恢复时,并不是必须的。
3. RMAN做的备份可以不用RMAN恢复。
4. HP Data Protector的备份脚本在做全库备份时,是一个一个备份数据文件和日志文件并形成单独的备份集。数据库全备期间的日志文件必须保存完整。
参考资料:
DBA工作备忘录之三:rman备份,没用catalog ,但控制文件丢失,怎么解决? Fenng