RAC数据文件误添加到本地文件系统问题处理
一、报错现象
数据库查询数据报错,定位原因是RAC库把datafile加在本地磁盘,另一个节点无法访问。
二、准备工作
2.1 影响情况
由于该操作将影响该数据文件的使用,所以需与业务进行确认。
2.2 备份数据库
为了数据的安全性,做任何变更之前都需要对数据库进行一个全备,以防遇到不可逆转问题,也能保证数据的完整。
RMAN> backup database plus archivelog format '/home/oracle/backup/%U';
2.3 数据库重启操作
为了保证数据库的可用性。在变更操作之前,对数据库进行进行逐台的重启操作。若启动正常,则进行后续变更操作。
针对数据库db逐台重启:
SQL> shutdown immediate SQL> startup
三、开启归档
由于数据库处理未开启归档状态,所以需要先开启归档模式进行后续操作。
归档开启需要对整个数据库进行关闭,然后启动到mount状态,再开启数据库,所以需要停止业务。
如后续归档需要关闭,步骤与开启归档一致,在步骤2.5中,命令更改为:alter database noarchivelog;即可。
3.1 关闭监听
查看数据库存在哪些监听,都停掉。
两个节点同时操作
[oracle@rac1 ~]$ ps -ef |grep tns root 15 2 0 08:30 ? 00:00:00 [netns] grid 44993 1 0 16:52 ? 00:00:00 /u01/app/grid/product/11.2.0/grid/bin/tnslsnr LISTENER_SCAN1 -inherit grid 45009 1 0 16:52 ? 00:00:00 /u01/app/grid/product/11.2.0/grid/bin/tnslsnr LISTENER -inherit oracle 54254 54199 0 19:13 pts/2 00:00:00 grep tns [oracle@rac1 ~]$ su - grid Password: [grid@rac1 ~]$ [grid@rac1 ~]$ lsnrctl stop LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 04-JUN-2019 19:14:21 Copyright (c) 1991, 2013, Oracle. All rights reserved. Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER))) The command completed successfully [grid@rac1 ~]$ lsnrctl stop LISTENER_SCAN1 LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 04-JUN-2019 19:14:27 Copyright (c) 1991, 2013, Oracle. All rights reserved. Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1))) The command completed successfully [grid@rac1 ~]$
3.2 设置检查点
两个节点同时操作
SQL> alter system checkpoint;
System altered.
SQL>
3.3 关闭数据库
两个节点同时操作
SQL> shutdown immediate Database closed. Database dismounted. ORACLE instance shut down. SQL>
3.4 启动数据库mount
两个节点同时操作
SQL> shutdown immediate Database closed. Database dismounted. ORACLE instance shut down. SQL>
3.5 开启归档
两个节点同事操作
SQL> alter database archivelog;
Database altered.
SQL>
3.6 启动数据库
SQL> alter database open;
Database altered.
SQL>
3.7 设置归档路径大小
SQL> archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 106
Next log sequence to archive 107
Current log sequence 107
SQL> show parameter db_recovery_file_dest
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest string +FRA
db_recovery_file_dest_size big integer 1G
SQL>
SQL> alter system set db_recovery_file_dest='xxxx';
SQL> alter system set db_recovery_file_dest_size=2G;
四、恢复步骤
恢复过程中,保留归档不被删除。如有归档备份删除软件,需暂停此类软件。
4.1 数据文件offline
将存在错误路径的文件offline;
offline之后,存在该数据文件上的数据将无法读取,该表空间存在其他正常数据文件中的数据无影响。
alter database datafile 'xxxxxx' offline;
4.2 copy数据文件
将错误路径的数据文件用过rman进行copy到正确的路径。
backup as copy datafile 'xxxxxx' format '+DG_INDEX';
4.3 rename数据文件
alter database rename file 'xxxxx' to 'xxxxx';
4.4 恢复数据文件
这里就需要用到归档日志,所以前面要求保留归档日志。
recover datafile xxx;
4.5 数据文件online
alter database datafile 8 online;
4.6 检查数据文件
查看数据文件路径是否正常,rac各个节点访问情况是否正常。
5、模拟测试
5.1 模拟故障现象
将把环境模拟成故障现象报错的情况。
5.1.1 创建表空间、用户、表
节点一:创建表空间,创建用户,创建表,查询表数据。——正常
SQL> create tablespace lyp datafile '+NEWDATA1' size 20M;
Tablespace created.
SQL> create user lyp identified by lyp default tablespace lyp;
User created.
SQL> grant dba to lyp;
Grant succeeded.
SQL> conn lyp/lyp
Connected.
SQL> create table lyp as select owner,table_name,tablespace_name from dba_tables;
Table created.
SQL>
SQL> select count(*) from lyp;
COUNT(*)
----------
1993
SQL>
节点二:查询表数据。——正常
SQL> conn lyp/lyp
Connected.
SQL> select count(*) from lyp;
COUNT(*)
----------
1993
SQL>
5.1.2 添加错误路径数据文件
SQL> alter tablespace lyp add datafile 'NEWDATA1' size 20M;
Tablespace altered.
SQL> select file_id, file_name from dba_data_files where tablespace_name='LYP';
FILE_ID FILE_NAME
---------- ------------------------------------------------------------
6 +NEWDATA1/rac/datafile/lyp.276.1010080027
7 /u01/app/oracle/product/11.2.0/db_1/dbs/NEWDATA1
SQL>
这里可以看到7号数据文件路径错了。
5.1.3 插入数据
SQL> create table test as select * from lyp;
Table created.
SQL> insert into test select * from test;
1993 rows created.
SQL> insert into test select * from test;
3986 rows created.
SQL> insert into test select * from test;
7972 rows created.
SQL> insert into test select * from test;
15944 rows created.
SQL> insert into test select * from test;
31888 rows created.
SQL> insert into test select * from test;
63776 rows created.
SQL> insert into test select * from test;
127552 rows created.
SQL> insert into test select * from test;
255104 rows created.
SQL> insert into test select * from test;
insert into test select * from test
*
ERROR at line 1:
ORA-01653: unable to extend table LYP.TEST by 128 in tablespace LYP
SQL> commit;
Commit complete.
SQL>
5.1.4 重新报错
节点一:查询正常
SQL> select count(*) from lyp;
COUNT(*)
----------
1993
SQL> select count(*) from test;
COUNT(*)
----------
510208
SQL>
节点二:查询报错。
SQL> select count(*) from lyp;
COUNT(*)
----------
1993
SQL> select count(*) from test;
select count(*) from test *
ERROR at line 1:
ORA-01157: cannot identify/lock data file 7 - see DBWR trace file
ORA-01110: data file 7: '/u01/app/oracle/product/11.2.0/db_1/dbs/NEWDATA1'
SQL>
5.2 恢复故障
5.2.1 数据文件offline
SQL> select file_id,file_name,status from dba_data_files where tablespace_name='LYP';
FILE_ID FILE_NAME STATUS
---------- ------------------------------------------------------------ ---------
6 +NEWDATA1/rac/datafile/lyp.276.1010080027 AVAILABLE
7 /u01/app/oracle/product/11.2.0/db_1/dbs/NEWDATA1 AVAILABLE
SQL> select file#,name,status from v$datafile where file#=7;
FILE# NAME STATUS
---------- ------------------------------------------------------------ -------
7 /u01/app/oracle/product/11.2.0/db_1/dbs/NEWDATA1 ONLINE
SQL> alter database datafile 7 offline;
Database altered.
SQL> select file#,name,status from v$datafile where file#=7;
FILE# NAME STATUS
---------- ------------------------------------------------------------ -------
7 /u01/app/oracle/product/11.2.0/db_1/dbs/NEWDATA1 RECOVER
SQL>
SQL> select count(*) from lyp;
COUNT(*)
----------
1993
SQL> select count(*) from test;
select count(*) from test
*
ERROR at line 1:
ORA-00376: file 7 cannot be read at this time
ORA-01110: data file 7: '/u01/app/oracle/product/11.2.0/db_1/dbs/NEWDATA1'
SQL>
offline之后,存在该数据文件上的数据将无法读取,该表空间存在其他正常数据文件中的数据无影响。
5.2.2 copy数据文件
RMAN> backup as copy datafile '/u01/app/oracle/product/11.2.0/db_1/dbs/NEWDATA1' format '+NEWDATA1';
Starting backup at 04-JUN-19
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=24 instance=rac1 device type=DISK
channel ORA_DISK_1: starting datafile copy
input datafile file number=00007 name=/u01/app/oracle/product/11.2.0/db_1/dbs/NEWDATA1
output file name=+NEWDATA1/rac/datafile/lyp.277.1010081595 tag=TAG20190604T181315 RECID=25 STAMP=1010081595
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
Finished backup at 04-JUN-19
RMAN>
以上标红的就是copy生成出来的数据文件。
5.2.3 rename数据文件
SQL> alter database rename file '/u01/app/oracle/product/11.2.0/db_1/dbs/NEWDATA1' to '+NEWDATA1/rac/datafile/lyp.277.1010081595';
Database altered.
SQL> select file_id,file_name,status from dba_data_files where tablespace_name='LYP';
FILE_ID FILE_NAME STATUS
---------- ------------------------------------------------------------ ---------
6 +NEWDATA1/rac/datafile/lyp.276.1010080027 AVAILABLE
7 +NEWDATA1/rac/datafile/lyp.277.1010081595 AVAILABLE
SQL>
数据文件修改成功
5.2.4 恢复数据文件
SQL> select file#,name,status from v$datafile where file#=7;
FILE# NAME STATUS
---------- ------------------------------------------------------------ -------
7 +NEWDATA1/rac/datafile/lyp.277.1010081595 RECOVER
SQL> recover datafile 7;
Media recovery complete.
SQL> select file#,name,status from v$datafile where file#=7;
FILE# NAME STATUS
---------- ------------------------------------------------------------ -------
7 +NEWDATA1/rac/datafile/lyp.277.1010081595 OFFLINE
SQL>
5.2.5 数据文件online
SQL> alter database datafile 7 online;
Database altered.
SQL> select file#,name,status from v$datafile where file#=7;
FILE# NAME STATUS
---------- ------------------------------------------------------------ -------
7 +NEWDATA1/rac/datafile/lyp.277.1010081595 ONLINE
SQL>
5.2.6 检查数据文件
检查数据文件路径、状态,以及数据是否可查。
一节点:查询正常。
SQL> select file_id,file_name,status from dba_data_files where tablespace_name='LYP';
FILE_ID FILE_NAME STATUS
---------- ------------------------------------------------------------ ---------
6 +NEWDATA1/rac/datafile/lyp.276.1010080027 AVAILABLE
7 +NEWDATA1/rac/datafile/lyp.277.1010081595 AVAILABLE
SQL> select file#,name,status from v$datafile where file#=7;
FILE# NAME STATUS
---------- ------------------------------------------------------------ -------
7 +NEWDATA1/rac/datafile/lyp.277.1010081595 ONLINE
SQL> select count(*) from lyp;
COUNT(*)
----------
1993
SQL> select count(*) from test;
COUNT(*)
----------
510208
SQL>
二节点:查询正常。
SQL> select count(*) from lyp;
COUNT(*)
----------
1993
SQL> select count(*) from test;
COUNT(*)
----------
510208
SQL>
六、回退方案
6.1 回退步骤
按照恢复步骤相反的步骤进行回退。
6.2 异常恢复
对一些非正常因素导致是不可逆转的问题,从而使数据丢失的情况,就需要使用步骤二、中备份的数据进行恢复。