以上的搭建过程请参考该文档:
Creating a Physical Standby using RMAN Duplicate (RAC or Non-RAC) (Doc ID 1617946.1)
这篇帖子,只针对密码文件这个致命错误,做具体的解答。
1.故障现象
| Primary database | Standby database | os version | database version | |
|---|---|---|---|---|
| host name | zxrdsrv1 zxrdsrv2 |
bxrdsrv1 bxrdsrv2 |
Windows Server 2020 | 19.3.0.0 |
| db_unique_name | ats | atsdg | ||
| instance_name | ats1 ats2 |
ats1 ats2 |
Windows Server 2020 | 19.3.0.0 |
建立 RAC - RAC dg 环境,主备库无法日志同步,查看告警日志发现错误如下:
2024-07-18T16:59:43.486552+08:00
Errors in file E:\APP\ORACLE\diag\rdbms\atsdg\ats1\trace\ats1_ora_14372.trc:
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 3) 系统找不到指定的路径。
ORA-01017: 用户名/口令无效; 登录被拒绝
2024-07-18T16:59:43.495512+08:00
Errors in file E:\APP\ORACLE\diag\rdbms\atsdg\ats1\trace\ats1_ora_14372.trc:
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 3) 系统找不到指定的路径。
2024-07-18T23:59:37.373514+08:00
Errors in file E:\APP\ORACLE\diag\rdbms\atsdg\ats1\trace\ats1_ora_10144.trc:
2.问题分析
根据告警日志分析,主从库的密码文件不一致。
3.解决方法
- 主库复制密码文件到从库
- 重新建立密码文件
4.解决过程
经过多次尝试,对密码文件的格式有严格的要求,根据文档解决了该问题。
For Windows platform, the password file is under %ORACLE_HOME%\database directory with name PWD<SID>.ora.
Use above instance name as example, one should copy password file from primary:
%ORACLE_HOME%\database\<PRIM_PWD_INST_NAME1>.ora
to all standby hosts and rename the file name to:
%ORACLE_HOME%\database\<STBY_PWD_INST_NAME1>.ora for standby <STBY_HOSTNAME1>
%ORACLE_HOME%\database\<STBY_PWD_INST_NAME2>.ora for standby <STBY_HOSTNAME2>
Please replace the commands in solution part accordingly if windows platform is used.
step 1.主库任意节点创建密码文件
PS E:\app\oracle\product\19.3.0\dbhome_1\bin> orapwd file=+DATA password=Admin@123 dbuniquename=ats force=y
PS E:\app\oracle\product\19.3.0\dbhome_1\bin>
在asm磁盘组中,建立密码文件不要指定具体的路径,仅指定磁盘组名称。如指定具体的路径,生成的密码文件路径如下:
+DATA/DB_UNKNOWN/PASSWORD/pwddb_unknown.290.1174694855
强迫症患者误入。
step 2.复制密码文件到备库
ASMCMD [+DATA/ATSDG/PASSWORD] > pwcopy e:\PWDATS.334.1174692303 +DATA/ATSDG/PASSWORD/PWDATSDG
copying e:\PWDATS.334.1174692303 -> +DATA/ATSDG/PASSWORD/PWDATSDG
ASMCMD [+DATA/ATSDG/PASSWORD] > ls
pwdatsdg
该方法就会导致强迫症患者不适。
ASMCMD [+DATA/ATSDG/PASSWORD] > ls -lar
Type Redund Striped Time Sys Name
PASSWORD UNPROT COARSE JUL 18 14:00:00 Y none => pwdatsdg.287.1174658453
PASSWORD UNPROT COARSE JUL 19 00:00:00 N pwdatsdg => +DATA/DB_UNKNOWN/PASSWORD/pwddb_unknown.290.1174694855
ASMCMD [+DATA/ATSDG/PASSWORD] >
个人认为可以使用 file, input_file 这种方式来解决。
step 3.查看主库告警日志
2024-07-19T00:22:29.088790+08:00
Starting background process NSV2
2024-07-19T00:22:29.105716+08:00
NSV2 started with pid=73, OS id=15500
2024-07-19T00:22:47.795333+08:00
Starting background process RSM0
2024-07-19T00:22:47.828188+08:00
RSM0 started with pid=252, OS id=16128
2024-07-19T00:22:52.125248+08:00
RSM0 (PID:16128): Using STANDBY_ARCHIVE_DEST parameter default value as +DATA [krsd.c:18141]
2024-07-19T00:22:52.143168+08:00
ALTER SYSTEM SET log_archive_dest_2='service="atsdg"','ASYNC NOAFFIRM delay=0 optional compression=disable max_failure=0 reopen=300 db_unique_name="atsdg" net_timeout=30','valid_for=(online_logfile,all_roles)' SCOPE=BOTH;
2024-07-19T00:22:52.207883+08:00
ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=BOTH;
2024-07-19T00:22:52.406010+08:00
ALTER SYSTEM SET fal_server='' SCOPE=BOTH;
2024-07-19T00:22:52.533448+08:00
TT04 (PID:3880): SRL selected for T-1.S-23 for LAD:2
step 4.查看备库告警日志
2024-07-19T00:10:36.065695+08:00
rfs (PID:7924): Primary database is in MAXIMUM PERFORMANCE mode
到此处,有没有惊喜,有没有大彩蛋。
step 5. DG_BROKER 添加主从库
DGMGRL> SHOW CONFIGURATION
配置 - ats
保护模式: MaxPerformance
成员:
ats - 主数据库
atsdg - 物理备用数据库
快速启动故障转移: Disabled
配置状态:
SUCCESS (状态已在 49 秒前更新)
DGMGRL>
step 6.启动主从库的其他节点
# 主库节点
C:\Users\Administrator>crsctl stat res -t
--------------------------------------------------------------------------------
Name Target State Server State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
ONLINE ONLINE zxrdsrv1 STABLE
ONLINE ONLINE zxrdsrv2 STABLE
ora.net1.network
ONLINE ONLINE zxrdsrv1 STABLE
ONLINE ONLINE zxrdsrv2 STABLE
ora.ons
ONLINE ONLINE zxrdsrv1 STABLE
ONLINE ONLINE zxrdsrv2 STABLE
ora.proxy_advm
OFFLINE OFFLINE zxrdsrv1 STABLE
OFFLINE OFFLINE zxrdsrv2 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
1 ONLINE ONLINE zxrdsrv1 STABLE
2 ONLINE ONLINE zxrdsrv2 STABLE
3 ONLINE OFFLINE STABLE
ora.DATA.dg(ora.asmgroup)
1 ONLINE ONLINE zxrdsrv1 STABLE
2 ONLINE ONLINE zxrdsrv2 STABLE
3 OFFLINE OFFLINE STABLE
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE zxrdsrv1 STABLE
ora.OCR.dg(ora.asmgroup)
1 ONLINE ONLINE zxrdsrv1 STABLE
2 ONLINE ONLINE zxrdsrv2 STABLE
3 OFFLINE OFFLINE STABLE
ora.asm(ora.asmgroup)
1 ONLINE ONLINE zxrdsrv1 Started,STABLE
2 ONLINE ONLINE zxrdsrv2 Started,STABLE
3 OFFLINE OFFLINE STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
1 ONLINE ONLINE zxrdsrv1 STABLE
2 ONLINE ONLINE zxrdsrv2 STABLE
3 OFFLINE OFFLINE STABLE
ora.ats.db
1 ONLINE ONLINE zxrdsrv1 Open,HOME=E:\app\ora
cle\product\19.3.0\d
bhome_1,STABLE
2 ONLINE ONLINE zxrdsrv2 Open,HOME=E:\app\ora
cle\product\19.3.0\d
bhome_1,STABLE
ora.cvu
1 ONLINE ONLINE zxrdsrv1 STABLE
ora.qosmserver
1 ONLINE ONLINE zxrdsrv1 STABLE
ora.scan1.vip
1 ONLINE ONLINE zxrdsrv1 STABLE
ora.zxrdsrv1.vip
1 ONLINE ONLINE zxrdsrv1 STABLE
ora.zxrdsrv2.vip
1 ONLINE ONLINE zxrdsrv2 STABLE
--------------------------------------------------------------------------------
C:\Users\Administrator>
# 备库节点
C:\Users\Administrator>crsctl stat res -t
--------------------------------------------------------------------------------
Name Target State Server State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
ONLINE ONLINE bzrdsrv1 STABLE
ONLINE ONLINE bzrdsrv2 STABLE
ora.net1.network
ONLINE ONLINE bzrdsrv1 STABLE
ONLINE ONLINE bzrdsrv2 STABLE
ora.ons
ONLINE ONLINE bzrdsrv1 STABLE
ONLINE ONLINE bzrdsrv2 STABLE
ora.proxy_advm
OFFLINE OFFLINE bzrdsrv1 STABLE
OFFLINE OFFLINE bzrdsrv2 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
1 ONLINE ONLINE bzrdsrv1 STABLE
2 ONLINE ONLINE bzrdsrv2 STABLE
3 ONLINE OFFLINE STABLE
ora.DATA.dg(ora.asmgroup)
1 ONLINE ONLINE bzrdsrv1 STABLE
2 ONLINE ONLINE bzrdsrv2 STABLE
3 OFFLINE OFFLINE STABLE
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE bzrdsrv1 STABLE
ora.OCR.dg(ora.asmgroup)
1 ONLINE ONLINE bzrdsrv1 STABLE
2 ONLINE ONLINE bzrdsrv2 STABLE
3 OFFLINE OFFLINE STABLE
ora.asm(ora.asmgroup)
1 ONLINE ONLINE bzrdsrv1 Started,STABLE
2 ONLINE ONLINE bzrdsrv2 Started,STABLE
3 OFFLINE OFFLINE STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
1 ONLINE ONLINE bzrdsrv1 STABLE
2 ONLINE ONLINE bzrdsrv2 STABLE
3 OFFLINE OFFLINE STABLE
ora.atsdg.db
1 ONLINE ONLINE bzrdsrv1 Open,Readonly,HOME=E
:\app\oracle\product
\19.3.0\dbhome_1,STA
BLE
2 ONLINE ONLINE bzrdsrv2 Open,Readonly,HOME=E
:\app\oracle\product
\19.3.0\dbhome_1,STA
BLE
ora.bzrdsrv1.vip
1 ONLINE ONLINE bzrdsrv1 STABLE
ora.bzrdsrv2.vip
1 ONLINE ONLINE bzrdsrv2 STABLE
ora.cvu
1 ONLINE ONLINE bzrdsrv1 STABLE
ora.qosmserver
1 ONLINE ONLINE bzrdsrv1 STABLE
ora.scan1.vip
1 ONLINE ONLINE bzrdsrv1 STABLE
--------------------------------------------------------------------------------
C:\Users\Administrator>
你以为这就完了吗?
step 7.查看主从库告警日志
主库:
RSM0 (PID:16128): Using STANDBY_ARCHIVE_DEST parameter default value as +DATA [krsd.c:18141]
2024-07-19T00:22:52.143168+08:00
ALTER SYSTEM SET log_archive_dest_2='service="atsdg"','ASYNC NOAFFIRM delay=0 optional compression=disable max_failure=0 reopen=300 db_unique_name="atsdg" net_timeout=30','valid_for=(online_logfile,all_roles)' SCOPE=BOTH;
2024-07-19T00:22:52.207883+08:00
ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=BOTH;
2024-07-19T00:22:52.406010+08:00
ALTER SYSTEM SET fal_server='' SCOPE=BOTH;
2024-07-19T00:22:52.533448+08:00
TT04 (PID:3880): SRL selected for T-1.S-23 for LAD:2
2024-07-19T00:28:06.456841+08:00
Reconfiguration started (old inc 2, new inc 4)
List of instances (total 2) :
1 2
New instances (total 1) :
2
My inst 1
Global Resource Directory frozen
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
2024-07-19T00:28:06.636051+08:00
LMS 3: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-07-19T00:28:07.636640+08:00
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-07-19T00:28:08.637230+08:00
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-07-19T00:28:09.637820+08:00
LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-07-19T00:28:09.673661+08:00
Set master node info
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Reconfiguration complete (total time 3.5 secs)
2024-07-19T00:28:28.326441+08:00
Skipping remote setting of Resource Manager plan DEFAULT_MAINTENANCE_PLAN via scheduler window [4d1c] because plan is already set.
2024-07-19T00:29:25.951436+08:00
ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=BOTH;
2024-07-19T00:29:25.979313+08:00
ALTER SYSTEM SET fal_server='' SCOPE=BOTH;
2024-07-19T00:29:32.418926+08:00
Increasing number of high priority LMS from 0 to 4
2024-07-19T00:29:55.965136+08:00
ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=BOTH;
2024-07-19T00:29:55.992018+08:00
ALTER SYSTEM SET fal_server='' SCOPE=BOTH;
2024-07-19T00:30:25.976874+08:00
ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=BOTH;
2024-07-19T00:30:26.002760+08:00
ALTER SYSTEM SET fal_server='' SCOPE=BOTH;
2024-07-19T00:30:56.004575+08:00
ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=BOTH;
2024-07-19T00:30:56.028469+08:00
ALTER SYSTEM SET fal_server='' SCOPE=BOTH;
2024-07-19T00:31:26.034206+08:00
ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=BOTH;
2024-07-19T00:31:26.057106+08:00
ALTER SYSTEM SET fal_server='' SCOPE=BOTH;
主库会自动发现节点2
从库:
PR00 (PID:9188): Media Recovery Waiting for T-2.S-22
2024-07-19T00:28:46.657357+08:00
Reconfiguration started (old inc 2, new inc 4)
List of instances (total 2) :
1 2
New instances (total 1) :
2
My inst 1
Global Resource Directory frozen
2024-07-19T00:28:47.655957+08:00
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
2024-07-19T00:28:47.723659+08:00
LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-07-19T00:28:47.723659+08:00
LMS 3: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-07-19T00:28:48.724250+08:00
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-07-19T00:28:49.724841+08:00
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-07-19T00:28:49.730815+08:00
Set master node info
2024-07-19T00:28:50.740367+08:00
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Reconfiguration complete (total time 4.2 secs)
2024-07-19T00:29:50.903284+08:00
Increasing number of high priority LMS from 0 to 4
自动节点发现
step 8. DG_BROKER 查看主从数据库的信息
DGMGRL> SHOW database ats
数据库 - ats
角色: PRIMARY
预期状态: TRANSPORT-ON
实例:
ats1
ats2
数据库状态:
SUCCESS
DGMGRL> SHOW database atsdg
数据库 - atsdg
角色: PHYSICAL STANDBY
预期状态: APPLY-ON
传输滞后: 0 秒 (已在 1 秒之前计算)
应用滞后: 7 分钟 5 秒 (已在 1 秒之前计算)
平均应用速率: 1.00 MB/秒
实时查询: ON
实例:
ats1
数据库警告:
ORA-16853: 应用滞后已超出指定阈值
数据库状态:
WARNING
DGMGRL>
出现两个告警,分析日志在做解决。
5.参考文档
SYS Password Management with RAC and Data Guard (Doc ID 1267828.1)
6.写在结尾
感谢我的DBA朋友圈,感谢惠星星,感谢杨涛,感谢王波,感谢在这几天帮组过我,或者提供建议的朋友。
客户的需求,就是我们学习的动力!
战斗吧精灵!!!
最后修改时间:2024-07-19 09:52:23
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




