书接上回
深夜一声惊雷.客户炸了群,2套数据库都由原来的RAC切换到了备库(单点),考虑业务的高可用性,因此计划恢复原来的RAC环境。
首先将搭建STANDBY-RAC,然后再做一次Failover切换将STANDBY-RAC切换为主环境。
切换过程:ORACLE-主备备-Failover,此次单独简述STANDBY-RAC的搭建过程。
- 下载搭建文档:Standby-RAC.pdf
搭建简述:主要讲述备份恢复RAC操作
一、设置多路径(两个节点)
1.1 刷新硬件识别存储链路
echo "1" > /sys/class/fc_host/host2/issue_lip
echo "- - -" > /sys/class/scsi_host/host2/scan
echo "1" > /sys/class/fc_host/host3/issue_lip
echo "- - -" > /sys/class/scsi_host/host3/scan
echo "1" > /sys/class/fc_host/host4/issue_lip
echo "- - -" > /sys/class/scsi_host/host4/scan
echo "1" > /sys/class/fc_host/host5/issue_lip
1.2 安装 mapper 软件
yum -y install *mapper*
1.3 设置配置文件:
cat /etc/multipath.conf
defaults {
user_friendly_names yes
}
1.4 查看路径
multipath -l
mpathe (3**********************1) dm-10 3PARdata,VV
size=50G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=0 status=active
|- 2:0:0:2 sdg 8:96 active undef unknown
|- 4:0:0:2 sdu 65:64 active undef unknown
|- 2:0:1:2 sdn 8:208 active undef unknown
`- 4:0:1:2 sdab 65:176 active undef unknown
1.5 设置配置文件
vim /etc/multipath.conf
defaults {
user_friendly_names yes
}
multipaths {
multipath {
no_path_retry fail
wwid 3******************************1
alias ASM-CRS
}
}
1.6 重启multipath 服务
[root@db_rac1 ~]# /etc/init.d/multipathd restart
ok
正在关闭multipathd 端口监控程序: [确定]
正在启动守护进程multipathd: [确定]
[root@db_rac1 ~]# multipath -ll
ASM-CRS (3******************************1) dm-9 3PARdata,VV
size=50G features='0' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
|- 2:0:0:3 sde 8:64 active ready running
|- 4:0:0:3 sdu 65:64 active ready running
|- 2:0:1:3 sdm 8:192 active ready running
`- 4:0:1:3 sdac 65:192 active ready running
- 开机启动:
chkconfig multipathd on - 多路径操作:重建多路径设备列表
执行multipath –F命令可以让multipathd清空之间的设备树,并删除多路径设备。
执行multipath –v3命令可以让multipathd重新扫描系统中的块设备,并建立多路径设备。
二、RAC环境Grid安装(略)
三、安装Oracle软件(略)
四、DBCA建库(*)
- 注:主要是解决部署权限的问题,建议以进行恢复的数据库命名,后续无需删除直接恢复即可。
五、备份
5.1 密码文件传输到各节点
scp initdb.ora oracle@dbrac1:/u01/oracle/11.2.0.4/product/dbs/orapwdbrac1
scp initdb.ora oracle@dbrac2:/u01/oracle/11.2.0.4/product/dbs/orapwdbrac2
5.2 备份参数文件
create pfile='/home/oracle/rman/pfile.ora' from spfile;
5.3 备份数据,脚本<rman.sh>如下
#!/bin/bash
source /home/oracle/.bash_profile
backup_dir=`date +%Y-%m-%d`
b_dir='/home/oracle/rman'
mkdir -p $b_dir/backup_dat/$backup_dir
mkdir -p $b_dir/backup_log/$backup_dir
rman target / log=$b_dir/backup_log/$backup_dir/backup.log <<EOF
Resync catalog;
run{
allocate channel c1 device type disk format='$b_dir/backup_dat/$backup_dir/%U';
allocate channel c2 device type disk format='$b_dir/backup_dat/$backup_dir/%U';
allocate channel c3 device type disk format='$b_dir/backup_dat/$backup_dir/%U';
allocate channel c4 device type disk format='$b_dir/backup_dat/$backup_dir/%U';
allocate channel c5 device type disk format='$b_dir/backup_dat/$backup_dir/%U';
backup as compressed backupset full database;
release channel c1;
release channel c2;
release channel c3;
release channel c4;
release channel c5;
crosscheck backup;
}
EOF
5.4 备份归档
RMAN> backup archivelog all format='/home/oracle/rman/arch_%U%s.dbf';
5.5 备份控制文件
SQL> alter database create standby controlfile as '/home/oracle/rman/control01.ctl';
六、恢复
6.1 配置参数文件
[oracle@db_rac1 rman]$ cp pfile.ora /u01/oracle/11.2.0.3/product/dbs/inittwo_db1.ora
[oracle@db_rac1 rman]$ vim /u01/oracle/11.2.0.3/product/dbs/inittwo_db1.ora
*.audit_file_dest='/u01/oracle/admin/two_db/adump'
*.audit_trail='NONE'
*.compatible='11.2.0.0.0'
*.control_files='+DATA/two_db/control01.ctl','+FRA/two_db/control02.ctl'#Restore Controlfile
*.db_block_size=8192
*.db_create_file_dest='+DATA'
*.db_domain=''
*.db_file_name_convert='/u01/oradata/two_db','+DATA/two_db','/u01/oradata/two_db','+DATA/two_db/datafile'
*.db_name='two_db'
*.db_recovery_file_dest='+FRA'
*.db_recovery_file_dest_size=4558159872
*.db_unique_name='two_db'
*.diagnostic_dest='/u01/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=two_dbXDB)'
*.fal_client='two_db_standby1'
*.fal_server='two_db'
*.log_archive_config='DG_CONFIG=(two_db,two_db_standby1)'
*.log_archive_dest_1='location=+ARCH'
*.log_file_name_convert='/u01/oradata/two_db','+REDO1/two_db','/u01/oradata/two_db','+REDO2/two_db'
*.open_cursors=300
*.pga_aggregate_target=7696547840
*.processes=1500
*.remote_login_passwordfile='exclusive'
*.sessions=1655
*.sga_target=38482739200
*.standby_file_management='AUTO'
two_db1.undo_tablespace='UNDOTBS1'
two_db2.undo_tablespace='UNDOTBS2'
*.cluster_database=false # 这里要先设置为flase,才能进行RMAN 恢复操作。恢复完成在改成true。
*.fal_client='two_db'
*.fal_server='two_db_standby1'
two_db2.instance_number=2
two_db1.instance_number=1
two_db2.thread=2
two_db1.thread=1
remote_listener='scan-two_db:1521'
6.2 创建目录(两个节点)
mkdir -p /u01/oracle/admin/two_db/adump
6.3 启动库到nomount
SQL> startup nomount;
ORACLE instance started.
Total System Global Area 3.8349E+10 bytes
Fixed Size 2236448 bytes
Variable Size 4429189088 bytes
Database Buffers 3.3823E+10 bytes
Redo Buffers 94216192 bytes
6.4 创建spfile.ora文件
- 创建spfile.ora
create spfile='+DATA/two_db/PARAMETERFILE/spfiletwo_db.ora' from pfile='/u01/oracle/11.2.0.3/product/dbs/inittwo_db1.ora';
- 查看spfile.ora
ASMCMD> pwd
+data/two_db/PARAMETERFILE
ASMCMD> ls
spfile.260.1176302113
spfiletwo_db.ora
- 关库
SQL> shutdown immediate;
ORA-01507: database not mounted
ORACLE instance shut down.
- 修改参数文件
[oracle@db_rac1 rman]$ vim /u01/oracle/11.2.0.3/product/dbs/inittwo_db1.ora
spfile='+DATA/two_db/PARAMETERFILE/spfiletwo_db.ora'
- 启动到nomount,启用:spfile
SQL> startup nomount;
ORACLE instance started.
Total System Global Area 3.8349E+10 bytes
Fixed Size 2236448 bytes
Variable Size 3623882720 bytes
Database Buffers 3.4628E+10 bytes
Redo Buffers 94216192 bytes
6.5 恢复控制文件
[oracle@db_rac1 rman]$ rman target /
Recovery Manager: Release 11.2.0.3.0 - Production on Tue Aug 6 14:57:21 2024
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
connected to target database: two_db (not mounted)
RMAN> restore standby controlfile from '/home/oracle/rman/control01.ctl';
Starting restore at 06-AUG-24
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1046 device type=DISK
channel ORA_DISK_1: copied control file copy
output file name=+DATA/two_db/control01.ctl
output file name=+FRA/two_db/control02.ctl
Finished restore at 06-AUG-24
6.6 启动到mount
SQL> alter database mount;
Database altered.
6.7 添加standby logfile:
- 查看主库redo
set line 800
col MEMBER for a100
select * from v$logfile where TYPE='ONLINE';
select * from v$log;
- 主库有redo 6组,此处添加7组standby
alter system set standby_file_management=manual scope=both sid='*';
alter database add standby logfile thread 1 group 5 ('+data') size 50m;
alter database add standby logfile thread 1 group 6 ('+data') size 50m;
alter database add standby logfile thread 1 group 7 ('+data') size 50m;
alter database add standby logfile thread 2 group 8 ('+data') size 50m;
alter database add standby logfile thread 2 group 9 ('+data') size 50m;
alter database add standby logfile thread 2 group 10 ('+data') size 50m;
alter system set standby_file_management=auto scope=both sid='*';
6.8 恢复数据
6.8.1 注册备份信息
[oracle@db_rac1 rman]$ rman target /
Recovery Manager: Release 11.2.0.3.0 - Production on Tue Aug 6 15:19:52 2024
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
connected to target database: two_db (DBID=1104896943, not open)
RMAN> catalog start with '/home/oracle/rman/';
Starting implicit crosscheck backup at 06-AUG-24
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1427 device type=DISK
Crosschecked 101 objects
Finished implicit crosscheck backup at 06-AUG-24
Starting implicit crosscheck copy at 06-AUG-24
using channel ORA_DISK_1
Finished implicit crosscheck copy at 06-AUG-24
searching for all files in the recovery area
cataloging files...
6.8.2 恢复脚本
- 脚本restore.sh
#!/bin/sh
source /home/oracle/.bash_profile
source /etc/profile
rman target / log=/home/oracle/restore.log <<EOF
run {
allocate channel c1 type disk;
allocate channel c2 type disk;
allocate channel c3 type disk;
allocate channel c4 type disk;
allocate channel c5 type disk;
restore database;
recover database ;
release channel c1 ;
release channel c2 ;
release channel c3 ;
release channel c4 ;
release channel c5 ;
}
EOF
- 执行脚本
sh restore.sh > restore.log &
七、Standby Database注册到OCR里,Oracle 用户
7.1 添加standby database 到 ocr,让集群可以管理资源
[oracle@db_rac1 ~]$ cat /u01/oracle/11.2.0.3/product/dbs/inittwo_db1.ora
spfile='+data/two_db/PARAMETERFILE/spfiletwo_db.ora'
- srvctl add database -d db_unique_name
[oracle@db_rac1 ~]$ srvctl add database -d two_db -o $ORACLE_HOME -p +data/two_db/PARAMETERFILE/spfiletwo_db.ora
7.2 添加实例:
- -d db_unique_name
[oracle@db_rac1 ~]$ srvctl add instance -d two_db -i two_db1 -n db_rac1
[oracle@db_rac1 ~]$ srvctl add instance -d two_db -i two_db2 -n db_rac2
7.3 修改集群参数:
SQL> show parameter cluster_database;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cluster_database boolean FALSE
cluster_database_instances integer 1
SQL> alter system set cluster_database=true scope=spfile sid='*';
System altered.
- 否则无法启动rac2
[grid@db_rac1 ~]$ srvctl start database -d two_db
PRCR-1079 : 无法启动资源 ora.two_db.db
CRS-5017: The resource action "ora.two_db.db start" encountered the following error:
ORA-01102: cannot mount database in EXCLUSIVE mode
. For details refer to "(:CLSN00107:)" in "/u01/grid/11.2.0.3/product/log/db_rac2/agent/crsd/oraagent_oracle/oraagent_oracle.log".
CRS-2674: Start of 'ora.two_db.db' on 'db_rac2' failed
CRS-2632: There are no more servers to try to place resource 'ora.two_db.db' on that would satisfy its placement policy
7.4 通过集群重启库
[grid@db_rac1 ~]$ srvctl stop database -d two_db
[grid@db_rac1 ~]$ srvctl start database -d two_db
7.5 验证状态
[grid@db_rac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
ONLINE ONLINE db_rac1
ONLINE ONLINE db_rac2
ora.CRS.dg
ONLINE ONLINE db_rac1
ONLINE ONLINE db_rac2
ora.DATA.dg
ONLINE ONLINE db_rac1
ONLINE ONLINE db_rac2
ora.FRA.dg
ONLINE ONLINE db_rac1
ONLINE ONLINE db_rac2
ora.LISTENER.lsnr
ONLINE ONLINE db_rac1
ONLINE ONLINE db_rac2
ora.REDO1.dg
ONLINE ONLINE db_rac1
ONLINE ONLINE db_rac2
ora.REDO2.dg
ONLINE ONLINE db_rac1
ONLINE ONLINE db_rac2
ora.asm
ONLINE ONLINE db_rac1 Started
ONLINE ONLINE db_rac2 Started
ora.gsd
OFFLINE OFFLINE db_rac1
OFFLINE OFFLINE db_rac2
ora.net1.network
ONLINE ONLINE db_rac1
ONLINE ONLINE db_rac2
ora.ons
ONLINE ONLINE db_rac1
ONLINE ONLINE db_rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE db_rac1
ora.db_rac1.vip
1 ONLINE ONLINE db_rac1
ora.db_rac2.vip
1 ONLINE ONLINE db_rac2
ora.cvu
1 ONLINE ONLINE db_rac1
ora.two_db.db
1 ONLINE ONLINE db_rac1 Open,Readonly
2 ONLINE ONLINE db_rac2 Open,Readonly
ora.oc4j
1 ONLINE ONLINE db_rac1
ora.scan1.vip
1 ONLINE ONLINE db_rac1
八、调试同步
8.1 设置TNS
two_pri =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = 10.10.5.100)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = two_pri)
)
)
8.2 主库配置
alter system set LOG_ARCHIVE_CONFIG='DG_CONFIG=(two_db,two_db_standby1,two_db_standby2)' scope=spfile;
alter system set LOG_ARCHIVE_DEST_3='SERVICE=two_db LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=two_db';
alter system set log_archive_dest_state_3='enable';
8.3 备库操作
- 只读打开库,并启动实时应用
SQL> alter database open read only;
Database altered.
SQL> alter database recover managed standby database using current logfile disconnect from session;
Database altered.
- 查看同步状态
SQL> set line 800
SQL> col NAME for a10
SQL> col VALUE for a20
SQL> select * from v$dataguard_stats where name='apply lag';
NAME VALUE UNIT TIME_COMPUTED DATUM_TIME
---------- -------------------- ------------------------------ ------------------------------ ------------------------------
apply lag +00 00:00:00 day(2) to second(0) interval 08/06/2024 18:10:32 08/06/2024 18:10:31
总结
- 领导越急,自己心态越要稳,否则细节很容易出问题:
– 搭建过程中出现了一些细节问题,数据库版本都没看就一顿安装,最后发现生产为11.2.0.4,安装的为11.2.0.3没办法只能重新来过; - 后续切换也遇到了一些问题:direct connection failure with ASM、Oracle ORA-03113 引发的DBLINK隐患。
- 有效的总结能扎实基础,理顺思路,现在大环境不景气的情况下如逆风行走,更要充实自己。
欢迎赞赏支持或留言指正

最后修改时间:2024-08-27 10:10:43
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




