文章介绍
实现PDB级别的TAF(Transparent Application Failover),透明应用程序故障转移。
环境介绍
| 架构 | 版本 | 是否CDB架构 | 原数据库名称 | PDB名称 |
|---|---|---|---|---|
| RAC | 19.12 | YES | newdb | S_PDB |
测试步骤
添加服务并启动
添加服务的帮助如下:

srvctl add service -db newdb -service TAFS -pdb S_PDB -preferred "newdb1" -available "newdb2" -tafpolicy BASIC -notification TRUE -failovertype SESSION -failovermethod BASIC
srvctl start service -db newdb -service TAFS
srvctl config service -db newdb
[oracle@rac01 admin]$ srvctl config service -db newdb
Service name: TAFS
Server pool:
Cardinality: 1
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: true
Global: false
Commit Outcome: false
Failover type: SESSION
Failover method: BASIC
Failover retries:
Failover delay:
Failover restore: NONE
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition:
Pluggable database name: S_PDB
Hub service:
Maximum lag time: ANY
SQL Translation Profile:
Retention: 86400 seconds
Replay Initiation Time: 300 seconds
Drain timeout:
Stop option:
Session State Consistency: DYNAMIC
GSM Flags: 0
Service is enabled
Preferred instances: newdb1
Available instances: newdb2
CSS critical: no
Service uses Java: false
[oracle@rac01 admin]$
查看服务
服务已经启动,运行在实例1上
crsctl stat res -t
ora.newdb.db
1 ONLINE ONLINE rac01 Open,HOME=/u01/app/o
racle/product/19.0.0
/db_1,STABLE
2 ONLINE ONLINE rac02 Open,HOME=/u01/app/o
racle/product/19.0.0
/db_1,STABLE
ora.newdb.tafs.svc
1 ONLINE ONLINE rac01 STABLE
select * from dba_services;
SERVICE_ID 1
NAME TAFS
NAME_HASH 2856793518
NETWORK_NAME TAFS
CREATION_DATE 2022/11/17 10:24:40
CREATION_DATE_HASH 3589686738
FAILOVER_METHOD BASIC
FAILOVER_TYPE SESSION
FAILOVER_RETRIES 0
FAILOVER_DELAY 0
MIN_CARDINALITY
MAX_CARDINALITY
GOAL NONE
DTP N
ENABLED NO
AQ_HA_NOTIFICATIONS YES
CLB_GOAL LONG
EDITION
COMMIT_OUTCOME NO
RETENTION_TIMEOUT 86400
REPLAY_INITIATION_TIMEOUT 300
SESSION_STATE_CONSISTENCY DYNAMIC
GLOBAL_SERVICE NO
PDB S_PDB
SQL_TRANSLATION_PROFILE
MAX_LAG_TIME ANY
GSM_FLAGS 0
PQ_SVC
STOP_OPTION NONE
FAILOVER_RESTORE NONE
DRAIN_TIMEOUT 0
TABLE_FAMILY_ID 0
PLACEMENT_POLICY 0
RESET_STATE NONE
VCSPARE1
NSPARE1 0
计划外连接TAF测试
配置tnsnames.ora如下
TAFS =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = racdb-scan)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = TAFS)
)
)
sqlplus连接后,登录到实例1,sid为413
[oracle@rac01 admin]$ sqlplus system/oracle@tafs
SQL*Plus: Release 19.0.0.0.0 - Production on Thu Nov 17 10:29:47 2022
Version 19.12.0.0.0
Copyright (c) 1982, 2021, Oracle. All rights reserved.
Last Successful login time: Thu Nov 17 2022 10:29:39 +08:00
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.12.0.0.0
SQL>
SQL>
SQL> show parameter name
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cdb_cluster_name string
cell_offloadgroup_name string
db_file_name_convert string
db_name string newdb
db_unique_name string newdb
global_names boolean FALSE
instance_name string newdb1
lock_name_space string
log_file_name_convert string
pdb_file_name_convert string
processor_group_name string
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
service_names string newdb
SQL> select sid from v$mystat where rownum=1;
SID
----------
413
select * from gv$session where sid=413;
INST_ID 1
SADDR 000000008BB1AE20
SID 413
SERIAL# 14848
LOGON_TIME 2022/11/17 10:29:47
LAST_CALL_ET 707
PDML_ENABLED NO
FAILOVER_TYPE SESSION
FAILOVER_METHOD BASIC
FAILED_OVER NO --failed_over为NO
KILL PMON进程,模拟实例故障
[root@rac01 ~]# ps -ef |grep pmon
grid 16911 1 0 09:32 ? 00:00:00 asm_pmon_+ASM1
grid 17559 1 0 09:33 ? 00:00:00 apx_pmon_+APX1
oracle 18399 1 0 09:33 ? 00:00:00 ora_pmon_newdb1
root 63871 16366 0 10:42 pts/0 00:00:00 grep --color=auto pmon
[root@rac01 ~]# kill -9 18399
发现会话自动failover到实例2,报ORA-25408后自动重连
ora.newdb.tafs.svc
1 ONLINE ONLINE rac02 STABLE
SQL> select sid from v$mystat where rownum=1;
select sid from v$mystat where rownum=1
*
ERROR at line 1:
ORA-25408: can not safely replay call
SQL> select sid from v$mystat where rownum=1;
SID
----------
284
SQL> show parameter name
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cdb_cluster_name string
cell_offloadgroup_name string
db_file_name_convert string
db_name string newdb
db_unique_name string newdb
global_names boolean FALSE
instance_name string newdb2
lock_name_space string
log_file_name_convert string
pdb_file_name_convert string
processor_group_name string
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
service_names string newdb
SQL>
SQL> !oerr ora 25408
25408, 00000, "can not safely replay call"
// *Cause: The connection was lost while doing this call. It may not be
// safe to replay it after failover.
// *Action: Check to see if the results of the call have taken place, and then
// replay it if desired.
//
select * from gv$session where sid=284;
INST_ID 2
SADDR 00000000849D8E28
SID 284
SERIAL# 8866
LOGON_TIME 2022/11/17 10:43:23
LAST_CALL_ET 41
PDML_ENABLED NO
FAILOVER_TYPE SESSION
FAILOVER_METHOD BASIC
FAILED_OVER YES --failed_over为YES
计划外SELECT语句执行过程中TAF
发出select,使其一直输出
SQL> show parameter name
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cdb_cluster_name string
cell_offloadgroup_name string
db_file_name_convert string
db_name string newdb
db_unique_name string newdb
global_names boolean FALSE
instance_name string newdb2
lock_name_space string
log_file_name_convert string
pdb_file_name_convert string
processor_group_name string
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
service_names string newdb
SQL>
SQL> select sid from v$mystat where rownum=1;
SID
----------
284
select * from dba_tables,dba_objects,dba_users;
KILL PMON进程,模拟实例故障
[oracle@rac02 ~]$ ps -ef |grep pmon
grid 16185 1 0 09:22 ? 00:00:00 asm_pmon_+ASM2
grid 18795 1 0 09:22 ? 00:00:00 apx_pmon_+APX2
oracle 31337 1 0 09:30 ? 00:00:00 ora_pmon_newdb2
oracle 100766 41591 0 10:52 pts/0 00:00:00 grep --color=auto pmon
[oracle@rac02 ~]$ kill -9 31337
发现会话自动failover到实例1,报ORA-25401后自动重连
INH
---
DEFAULT_COLLATION
--------------------------------------------------------------------------------
IMP ALL PASSWORD_CHANGE_DAT
--- --- -------------------
ERROR:
ORA-25401: can not continue fetches
60 rows selected.
SQL> select sid from v$mystat where rownum=1;
SID
----------
286
SQL> show parameter name
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cdb_cluster_name string
cell_offloadgroup_name string
db_file_name_convert string
db_name string newdb
db_unique_name string newdb
global_names boolean FALSE
instance_name string newdb1
lock_name_space string
log_file_name_convert string
pdb_file_name_convert string
processor_group_name string
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
service_names string newdb
SQL>
SQL> !oerr ora 25401
25401, 00000, "can not continue fetches"
// *Cause: A failure occured since the last fetch on this statement. Failover
// was unable to bring the statement to its original state to allow
// continued fetches.
// *Action: Reexecute the statement and start fetching from the beginning
SQL>
计划内relocate服务,将服务切换到实例1
srvctl relocate service -db newdb -service TAFS -oldinst newdb2 -newinst newdb1
ora.newdb.tafs.svc
2 ONLINE ONLINE rac01 STABLE
计划内重启实例,同时relocate服务
停止实例帮助如下:

使用如下一条命令,停止实例的时候自动failover服务,如不加-failover会报错,因为存在服务,除非-force强制终止实例
srvctl stop instance -d newdb -i newdb1 -failover
ora.newdb.tafs.svc
2 ONLINE ONLINE rac02 STABLE
注意事项
以下每个操作都被视为计划关闭,因此不会发生服务的故障转移。
SQL> alter pluggable database s_pdb close abort;
$ srvctl stop instance -d newdb -i newdb1 -f
SQL> shutdown abort
参考文档
How to make PDB to be able to failover across nodes by using service (Doc ID 2893913.1)
最后修改时间:2023-03-18 23:09:32
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




