暂无图片
暂无图片
1
暂无图片
暂无图片
暂无图片

02.Install Oracle 19C on Linux, use DGBroker to create a backup with one primary and three backups

原创 ByteHouse 2024-05-08
337

1.配置 dg_broker

https://blog.csdn.net/qq_39466006/article/details/132018521
架构:以一主三备为例

IP hostname db_unique_name DG Broker role switchover  role failover role
192.168.10.91 primary primary 主库 从库 从库
192.168.10.92 standby-01 standby-01 从库 主库 从库
192.168.10.93 standby-02 standby-02 从库 从库 从库
192.168.10.94 standby-03 standby-03 从库 从库 从库

使用以下方法可以添加多个备库,具体配置的差异由你来确定

2.安装oracle数据库软件

3.备数据库配置

3.1.拷贝参数文件

scp /u01/app/oracle/product/19.3.0/dbhome_1/dbs/spfileorcl.ora oracle@192.168.10.94:/u01/app/oracle/product/19.3.0/dbhome_1/dbs

3.2.拷贝密码文件

scp /u01/app/oracle/product/19.3.0/dbhome_1/dbs/orapworcl oracle@192.168.10.94:/u01/app/oracle/product/19.3.0/dbhome_1/dbs

3.3.创建相应的文件目录

根据上面修改的参数文件,为备库创建相应的文件目录

mkdir -p /u01/app/oracle/archivelog
mkdir -p /u01/app/oracle/fast_recovery_area/orcl
mkdir -p /u01/app/oracle/admin/orcl/adump
mkdir -p /u01/app/oracle/oradata/orcl

chmod 775 /u01 -R

3.4.修改监听配置文件

# listener.ora Network Configuration File: /u01/app/oracle/product/19.3.0/dbhome_1/network/admin/listener.ora
# Generated by Oracle configuration tools.

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = standby-database)(PORT = 1521))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
    )
  )

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = standby02)
      (SID_NAME = orcl)
      (ORACLE_HOME = /u01/app/oracle/product/19.3.0/dbhome_1)
    )
  (SID_DESC =
      (GLOBAL_DBNAME = standby02_DGMGRL)
      (ORACLE_HOME = /u01/app/oracle/product/19.3.0/dbhome_1)
      (SID_NAME = orcl)
      (ENVS="TNS_ADMIN=/u01/app/oracle/product/19.3.0/dbhome_1/network/admin")
    )

  )

ADR_BASE_LISTENER = /u01/app/oracle

3.5.修改TNS配置文件

# tnsnames.ora Network Configuration File: /u01/app/oracle/product/19.3.0/dbhome_1/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.

LISTENER_ORCL =
  (ADDRESS = (PROTOCOL = TCP)(HOST = standby-database)(PORT = 1521))


STANDBY01 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = standby01)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = standby01)
     (UR = A)
    )
  )
STANDBY02 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = standby02)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = standby02)
     (UR = A)
    )
  )
STANDBY03 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = standby03)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = standby03)
     (UR = A)
    )
  )

PRIMARY =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = primary)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = primary)
     (UR = A)
    )
  )

创建认证配置文件

cat > /u01/app/oracle/product/19.3.0/dbhome_1/network/admin/sqlnet.ora <<EOF
# sqlnet.ora Network Configuration File: /u01/app/oracle/product/19.3.0/dbhome_1/network/admin/sqlnet.ora
# Generated by Oracle configuration tools.

NAMES.DIRECTORY_PATH= (TNSNAMES, EZCONNECT)

ADR_BASE = /u01/app/oracle
EOF

3.6.修改 db_unique_name

alter system set db_unique_name='standby02' scope=spfile;

重新启动数据库到 nomount 状态,然后再启动监听服务

lsnrctl start

3.7.验证监听和TNS配置

主库:

sqlplus sys/admin123@primary as sysdba

sqlplus sys/admin123@standby02 as sysdba

备库:

sqlplus sys/admin123@primary as sysdba

sqlplus sys/admin123@standby02 as sysdba

注意:该步骤一定要在主备库上都能通过才能执行下面步骤

3.8.恢复数据库

# 连接数据库
rman target sys/admin123@primary auxiliary sys/admin123@standby02 nocatalog

# 备份数据
duplicate target database for standby from active database nofilenamecheck;

4.dg_broker 添加备库

step 1.查看 DG_BROKER_CONFIG_FILE文件

dg_broker_config_file1以及dg_broker_config_file2对应路径必须先存在(不存在需要先创建,否则启动dg_broker_start提示ORA-16604:
Data Guard broker configuration file inaccessible)

SQL> show parameter broker
SYS@orcl> show parameter broker NAME TYPE VALUE ------------------------------------ ---------------------- ------------------------------ connection_brokers string ((TYPE=DEDICATED)(BROKERS=1)), ((TYPE=EMON)(BROKERS=1)) dg_broker_config_file1 string /u01/app/oracle/product/19.3.0 /dbhome_1/dbs/dr1orcl.dat dg_broker_config_file2 string /u01/app/oracle/product/19.3.0 /dbhome_1/dbs/dr2orcl.dat dg_broker_start boolean FALSE use_dedicated_broker boolean FALSE SYS@orcl>

可以用默认的路径,也可以自己指定(四个节点均需配置)。如果是RAC环境,把这个文件把到共享存储或者ASM中。

step 2.重置 LOG_ARCHIVE_DEST_n 信息
三个节点均需配置,这步开始,已有主从同步会中断。

ALTER SYSTEM set LOG_ARCHIVE_DEST_2='';
ALTER SYSTEM set LOG_ARCHIVE_DEST_3='';

使用 reset 需要重启。

FQ :

DGMGRL> add database 'standby03' as connect identifier is standby03;
Error: ORA-16698: member has a LOG_ARCHIVE_DEST_n parameter with SERVICE attribute set

Failed.

step 3.开启 dg_broker_start 为TRUE,默认是FALSE(主备库)

SQL> alter system set dg_broker_start=true sid='*';

step 4.增加备库:

DGMGRL> add database 'standby02' as connect identifier is standby02; Database "standby02" added DGMGRL>

查看配置:

DGMGRL> show configuration; Configuration - xiangxun Protection Mode: MaxPerformance Members: primary - Primary database standby01 - Physical standby database standby02 - Physical standby database (disabled) ORA-16905: The member was not enabled yet. Fast-Start Failover: Disabled Configuration Status: SUCCESS (status updated 20 seconds ago) DGMGRL>

step 5.启用当前配置:

DGMGRL> enable Database standby02; Enabled. DGMGRL>

查看配置详细信息:

DGMGRL> show configuration;

Configuration - xiangxun

  Protection Mode: MaxPerformance
  Members:
  primary   - Primary database
    standby01 - Physical standby database 
    standby02 - Physical standby database 
      Error: ORA-16810: multiple errors or warnings detected for the member

Fast-Start Failover:  Disabled

Configuration Status:
ERROR   (status updated 56 seconds ago)

DGMGRL>

查看配置详细信息:

DGMGRL> show configuration verbose; Configuration - xiangxun Protection Mode: MaxPerformance Members: primary - Primary database standby01 - Physical standby database standby02 - Physical standby database Properties: FastStartFailoverThreshold = '30' OperationTimeout = '30' TraceLevel = 'USER' FastStartFailoverLagLimit = '30' CommunicationTimeout = '180' ObserverReconnect = '0' FastStartFailoverAutoReinstate = 'TRUE' FastStartFailoverPmyShutdown = 'TRUE' BystandersFollowRoleChange = 'ALL' ObserverOverride = 'FALSE' ExternalDestination1 = '' ExternalDestination2 = '' PrimaryLostWriteAction = 'CONTINUE' ConfigurationWideServiceName = 'orcl_CFG' Fast-Start Failover: Disabled Configuration Status: SUCCESS DGMGRL>

获取信息中包含任何告警都需要进行处理。

step 6.通过dgmgrl查看主备实例详细

前面使用单引号增加database,后面查询一律要使用。

备库信息:

DGMGRL> show database verbose standby02 Database - standby02 Role: PHYSICAL STANDBY Intended State: APPLY-ON Transport Lag: 0 seconds (computed 1 second ago) Apply Lag: 0 seconds (computed 1 second ago) Average Apply Rate: 3.00 KByte/s Active Apply Rate: 0 Byte/s Maximum Apply Rate: 0 Byte/s Real Time Query: OFF Instance(s): orcl Properties: DGConnectIdentifier = 'standby02' ObserverConnectIdentifier = '' FastStartFailoverTarget = '' PreferredObserverHosts = '' LogShipping = 'ON' RedoRoutes = '' LogXptMode = 'ASYNC' DelayMins = '0' Binding = 'optional' MaxFailure = '0' ReopenSecs = '300' NetTimeout = '30' RedoCompression = 'DISABLE' PreferredApplyInstance = '' ApplyInstanceTimeout = '0' ApplyLagThreshold = '30' TransportLagThreshold = '30' TransportDisconnectedThreshold = '30' ApplyParallel = 'AUTO' ApplyInstances = '0' StandbyFileManagement = '' ArchiveLagTarget = '0' LogArchiveMaxProcesses = '0' LogArchiveMinSucceedDest = '0' DataGuardSyncLatency = '0' LogArchiveTrace = '0' LogArchiveFormat = '' DbFileNameConvert = '' LogFileNameConvert = '' ArchiveLocation = '' AlternateLocation = '' StandbyArchiveLocation = '' StandbyAlternateLocation = '' InconsistentProperties = '(monitor)' InconsistentLogXptProps = '(monitor)' LogXptStatus = '(monitor)' SendQEntries = '(monitor)' RecvQEntries = '(monitor)' HostName = 'standby02' StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=standby02)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=standby02_DGMGRL)(INSTANCE_NAME=orcl)(SERVER=DEDICATED)))' TopWaitEvents = '(monitor)' SidName = '(monitor)' Log file locations: Alert log : /u01/app/oracle/diag/rdbms/standby02/orcl/trace/alert_orcl.log Data Guard Broker log : /u01/app/oracle/diag/rdbms/standby02/orcl/trace/drcorcl.log Database Status: SUCCESS DGMGRL>

当前备库状态是节点1实时应用日志,且备库同步都正常。

5.备库重新启用数据库闪回

验证是否开启闪回

SYS@orcl> select flashback_on from v$database;

FLASHBACK_ON
------------------------------------
NO

SYS@orcl> show parameter db_recovery_file_dest

NAME                     TYPE            VALUE
------------------------------------ ---------------------- ------------------------------
db_recovery_file_dest             string
db_recovery_file_dest_size         big integer        0
SYS@orcl>

数据库已经开启flashback,那么下面步骤可忽略。

如上显示,该数据库未开启flashback,可按下面方法开启。

创建文件夹

[oracle@primary-database oracle]$ mkdir /u01/app/oracle/fast_recovery_area

SYS@orcl> alter system set db_recovery_file_dest_size='5G';

System altered.

SYS@orcl> alter system set db_recovery_file_dest='/u01/app/oracle/fast_recovery_area';

System altered.

SYS@orcl> show parameter db_recovery_file_dest

NAME                     TYPE            VALUE
------------------------------------ ---------------------- ------------------------------
db_recovery_file_dest             string            /u01/app/oracle/fast_recovery_
                                area
db_recovery_file_dest_size         big integer        5G
SYS@orcl> 

开启闪回

SYS@orcl> alter database flashback on;

Database altered.

SYS@orcl> select flashback_on from v$database;

FLASHBACK_ON
------------------------------------
YES

SYS@orcl>

FQ 1:未配置闪回目录

SQL> alter database flashback on;
alter database flashback on
*
ERROR at line 1:
ORA-38706: Cannot turn on FLASHBACK DATABASE logging.
ORA-38709: Recovery Area is not enabled.


SQL> show parameter db_

FQ 2:日志应用未停止

SYS@orcl> alter database flashback on;
alter database flashback on
*
ERROR at line 1:
ORA-01153: an incompatible media recovery is active


SYS@orcl>  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Database altered.

SYS@orcl> 

6.Fast-Start Fail Over

fast-start failover(FSFO)功能是当primary数据库Crash后,它会自动地快速执行切换standby数据库为primary数据库操作。
那么什么时候会触发fast-start failover呢?
当数据库以正常模式(shutdown immediate/normal/transactional)关闭时,系统不会触发fast-start failover。 使用非正常关闭方式 shutdown abort 关闭primary时会触发fast-start failover。

这里引入一个新的进程observer,observer是用来监控primary数据库是否可用的,当主库不可用时,会执行快速切换standby数据库为primary数据库。

只有maximum availability mode or maximum performance mode才能启用fast-start failover模式。在maximum availability模式下面,在切换时可以保证无数据丢失,在maximum performance mode下面,会有数据丢失,丢失多少数据由FastStartFailoverLagLimit这个参数来配置。

step 1.查看配置属性

DGMGRL> show configuration verbose;

Configuration - xiangxun

  Protection Mode: MaxPerformance
  Members:
  primary   - Primary database
    standby01 - Physical standby database 
    standby02 - Physical standby database 
    standby03 - Physical standby database 

  Properties:
    FastStartFailoverThreshold      = '30'
    OperationTimeout                = '30'
    TraceLevel                      = 'USER'
    FastStartFailoverLagLimit       = '30'
    CommunicationTimeout            = '180'
    ObserverReconnect               = '0'
    FastStartFailoverAutoReinstate  = 'TRUE'
    FastStartFailoverPmyShutdown    = 'TRUE'
    BystandersFollowRoleChange      = 'ALL'
    ObserverOverride                = 'FALSE'
    ExternalDestination1            = ''
    ExternalDestination2            = ''
    PrimaryLostWriteAction          = 'CONTINUE'
    ConfigurationWideServiceName    = 'orcl_CFG'

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS

DGMGRL> 

修改数据库状态

DGMGRL> show database standby03

Database - standby03

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 351 seconds ago)
  Apply Lag:          0 seconds (computed 351 seconds ago)
  Average Apply Rate: 17.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    orcl

  Database Warning(s):
    ORA-16857: member disconnected from redo source for longer than specified threshold

Database Status:
WARNING

DGMGRL> 

Intended State 有以下几种:

  • TRANSPORT-ON
  • TRANSPORT-OFF
  • APPLY-ON
  • APPLY-OFF
数据库角色 状态 描述
Primary TRANSPORT-ON 设置重做传输服务,以便在打开主库以进行读/写访问时将重做数据传输到备库或远程同步实例。如果这是Oracle RAC数据库,则所有以读/写模式打开的实例都将运行重做传输服务。首次启用时,这是主库的默认状态。
Primary TRANSPORT-OFF 重做传输服务在主库上停止。

如果这是Oracle RAC数据库,则重做传输服务不在任何实例上运行。
Physical Standby APPLY-ON 备库启用Redo Apply。

如果备库是Oracle RAC数据库,则代理将仅在一个称为Apply实例的备库实例上启动Redo Apply。 如果此实例失败,则代理将自动选择另一个已安装或打开为只读的实例。 然后该新实例将成为Apply实例。

从Oracle Database 12c第2版(12.2.0.1)开始,可以将Redo
Apply设置为在每个活动的运行中的物理备用实例上运行。 如果已经设置数据库在多个实例上运行重做应用,则可以使用DG
Broker属性ApplyInstances来限制Oracle RAC物理备库上重做应用所涉及的实例数量。
Physical Standby APPLY-OFF 备库停用Redo Apply

如果这是Oracle RAC数据库,则在将数据库状态更改为APPLY-ON之前,没有实例运行Apply Services。
Logical Standby APPLY-ON 打开逻辑备库并打开逻辑备库保护后,将在逻辑备库上启动SQL Apply。

如果这是一个Oracle RAC数据库,则SQL Apply正在一个实例(apply实例)上运行。 如果此实例失败,则代理将自动选择另一个打开的实例。 这个新实例将成为apply实例。

首次启用时这是逻辑备库的默认状态。
Logical Standby APPLY-OFF SQL Apply已停止。 逻辑备库保护功能已打开。

如果这是Oracle RAC数据库,则在将状态更改为APPLY-ON之前,不会运行SQL Apply的实例。

主库状态转换

当将主库转换为TRANSPORT-ON状态时,broker将使用配置成员的与重做传输相关的属性以及主库上的RedoRoutes属性来设置重做传输服务。
重做传输服务的设置是通过在主数据库上设置LOG_ARCHIVE_DEST_n和LOG_ARCHIVE_DEST_STATE_n初始化参数,以及在所有数据库(主库或备库)和远程同步实例上设置LOG_ARCHIVE_CONFIG初始化参数来完成的。
当将主库转换为TRANSPORT-OFF状态时,对应的log_archive_dest_state_n将置为RESET状态。
当将主库转换为TRANSPORT-ON状态时,对应的log_archive_dest_state_n将置为ENABLE状态。

# TRANSPORT-OFF
DGMGRL> edit database 'orcl' set state='TRANSPORT-OFF';
2021-03-01T17:42:39.159982+08:00
ALTER SYSTEM SET log_archive_dest_state_2='RESET' SCOPE=BOTH;

# TRANSPORT-ON
DGMGRL> edit database 'orcl' set state='TRANSPORT-ON';
2021-03-01T17:42:58.926310+08:00
ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=BOTH;

物理备库状态转换

在将物理备库转换为APPLY-ON状态时,broker将使用与Redo Apply相关的属性指定的选项启动Redo Apply。 如果备库是Oracle RAC数据库,则代理将在一个称为Apply实例的备用实例上启动Redo Apply。
从Oracle Database 12c第2版(12.2.0.1)开始,可以将Redo Apply设置为在多个活动运行的物理备用实例上运行。 (此功能要求备库具有Oracle Active Data Guard选项的许可证。)如果已经设置数据库在多个实例上运行Redo Apply,则可以使用Data Guard Broker属性ApplyInstances限制数量。
当物理备库转换为APPLY-OFF状态时,broker停止Redo Apply。

# APPLY-OFF
DGMGRL> edit database 'orcladg' set state='APPLY-OFF';
Succeeded.
2021-03-02T09:46:48.776536+08:00
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
2021-03-02T09:46:48.777132+08:00
MRP0: Background Media Recovery cancelled with status 16037
2021-03-02T09:46:48.889176+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcladg/orcladg/trace/orcladg_pr00_2632.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 7993612
2021-03-02T09:46:49.040882+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcladg/orcladg/trace/orcladg_pr00_2632.trc:
ORA-16037: user requested cancel of managed recovery operation
2021-03-02T09:46:49.142033+08:00
MRP0: Background Media Recovery process shutdown (orcladg)
2021-03-02T09:46:49.777718+08:00
Managed Standby Recovery Canceled (orcladg)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL

# APPLY-ON
DGMGRL> edit database 'orcladg' set state='APPLY-ON';
Succeeded.
2021-03-02T09:47:46.674256+08:00
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT  NODELAY
2021-03-02T09:47:46.674819+08:00
Attempt to start background Managed Standby Recovery process (orcladg)
Starting background process MRP0
2021-03-02T09:47:46.691256+08:00
MRP0 started with pid=56, OS id=2899 
2021-03-02T09:47:46.692433+08:00
MRP0: Background Managed Standby Recovery process started (orcladg)
2021-03-02T09:47:51.718042+08:00
 Started logmerger process
2021-03-02T09:47:51.737502+08:00
Managed Standby Recovery starting Real Time Apply
2021-03-02T09:47:51.788006+08:00
Parallel Media Recovery started with 2 slaves
2021-03-02T09:47:52.201813+08:00
Media Recovery Waiting for thread 1 sequence 235 (in transit)
2021-03-02T09:47:52.202522+08:00
Recovery of Online Redo Log: Thread 1 Group 5 Seq 235 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/orcladg/redo05.log
2021-03-02T09:47:52.699581+08:00
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT  NODELAY

设置故障转移目标

step 1.只能为当前主库设置一个目标转移备库,若只有一个备库,此步骤可忽略。

DGMGRL> enable fast_start failover 
Enabled in Potential Data Loss Mode.
DGMGRL> 

启用 Fast-Start Fail Over ,默认第一个备库未优先切换为从库。

DGMGRL> show configuration

Configuration - xiangxun

  Protection Mode: MaxPerformance
  Members:
  primary   - Primary database
    Warning: ORA-16819: fast-start failover observer not started

    standby01 - (*) Physical standby database 
      Warning: ORA-16819: fast-start failover observer not started

    standby02 - Physical standby database 
    standby03 - Physical standby database 
      Warning: ORA-16809: multiple warnings detected for the member

Fast-Start Failover: Enabled in Potential Data Loss Mode

Configuration Status:
WARNING   (status updated 56 seconds ago)

DGMGRL> 

FQ 1:未开启闪回数据库功能

DGMGRL> enable fast_start failover; Warning: ORA-16827: Flashback Database is disabled Enabled in Potential Data Loss Mode.

step 2.修改目标转移备库,需要先禁用FSFO,改完后再启用

DGMGRL> DISABLE FAST_START FAILOVER Disabled. DGMGRL>

修改转移备库为Standby03

DGMGRL> EDIT DATABASE primary SET PROPERTY FASTSTARTFAILOVERTARGET=standby03;
Property "faststartfailovertarget" updated
DGMGRL> EDIT DATABASE standby03 SET PROPERTY FASTSTARTFAILOVERTARGET=primary;
Property "faststartfailovertarget" updated
DGMGRL>

step 3.设置保护模式与LogXptMode
LOGXPTMODE属性在最大可用性模式下应为SYNC,在最大性能模式下应为ASYNC,主备库LOGXPTMODE设置必须相同

DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;
Error: ORA-16627: operation disallowed since no member would remain to support protection mode

Failed.
DGMGRL> 

最大保护和最高可以的前提条件是日志传输模式为SYNC,所以我们要先修改日志传输模式。
因为考虑到以后要进行故障转移,所以需要将所有的数据库的LogXptMode都设置为SYNC。

DGMGRL> EDIT DATABASE standby03 SET PROPERTY LogXptMode='SYNC'; Property "logxptmode" updated DGMGRL> EDIT DATABASE standby02 SET PROPERTY LogXptMode='SYNC'; Property "logxptmode" updated DGMGRL> EDIT DATABASE standby01 SET PROPERTY LogXptMode='SYNC'; Property "logxptmode" updated DGMGRL> EDIT DATABASE primary SET PROPERTY LogXptMode='SYNC'; Property "logxptmode" updated DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability; Succeeded. DGMGRL>

step 4. 设置FSFO阈值
如果主库与 Observer和目标备库失联时间均超过FastStartFailoverThreshold属性配置阈值(单位为秒),将启动FSFO。

FastStartFailoverThreshold表示,Observer和目标备库 在检测到主库不可用后、希望等多少秒才启动FSFO。再此期间,它们会尝试重连主库。

默认值为30秒,最小6秒。

DGMGRL> edit configuration set property faststartfailoverthreshold=300;

step 5. 设置最大可接受延迟时间

FastStartFailoverLagLimit表示,自动故障转移允许的最大数据丢失量(单位为秒),仅在最高性能保护模式时才可使用。

FastStartFailoverLagLimit作为DG可接受的最大延迟时间,备库的apply lag只有在此限制之内,才允许FSFO。

如果无法维持已配置的数据丢失保证,主库上的redo生成将停止。为了避免长时间停顿,Observer或者目标备库可能会在第一次记录到无法发生FSFO之后,允许主库继续生成redo。此时若主库故障,将无法发生FSFO。

默认值为30秒,最小值为10秒。

DGMGRL> edit configuration set property faststartfailoverlaglimit=50;

step 6.设置FastStartFailoverPmyShutdown属性

默认为true,表示在FastStartFailoverThreshold属性指定的秒数过去之后,使用ABORT选项关闭主数据库。

如果不希望由于 主库redo生成停止 或 主库与Observer和目标备库 失联时间超过FastStartFailoverThreshold指定时间而在触发FSFO后关闭主库,需要将其设置为假。

DGMGRL> edit configuration set property FastStartFailoverThreshold=false;

step 7.设置自动恢复数据库属性
如果FastStartFailoverAutoReinstate属性设置为TRUE,在原主库故障修复后,会自动尝试将其恢复为新主库的备库。

DGMGRL> EDIT CONFIGURATION SET PROPERTY FASTSTARTFAILOVERAUTOREINSTATE=TRUE; Property "faststartfailoverautoreinstate" updated

step 9.启用fast_start故障转移

DGMGRL> enable fast_start failover; Enabled in Zero Data Loss Mode. DGMGRL>

step 10.查看FSFO配置

DGMGRL> show fast_start failover Fast-Start Failover: Enabled in Zero Data Loss Mode Protection Mode: MaxAvailability Lag Limit: 0 seconds Threshold: 30 seconds Active Target: standby02 Potential Targets: "standby02" standby02 valid Observer: (none) Shutdown Primary: TRUE Auto-reinstate: TRUE Observer Reconnect: (none) Observer Override: FALSE Configurable Failover Conditions Health Conditions: Corrupted Controlfile YES Corrupted Dictionary YES Inaccessible Logfile NO Stuck Archiver NO Datafile Write Errors YES Oracle Error Conditions: (none) DGMGRL>

配置Observer(可选)

ObserverConnectIdentifier用于指定Observer应如何连接并监视主备库。如果希望Observer使用不同于发送重做数据连接串的连接标识符,请为主库和目标备库设置此属性(可选)。

DGMGRL> start observer [W000 2024-05-08T09:25:45.030+08:00] FSFO target standby is standby01 Observer 'primary' started [W000 2024-05-08T09:25:45.169+08:00] Observer trace level is set to USER [W000 2024-05-08T09:25:45.169+08:00] Try to connect to the primary. [W000 2024-05-08T09:25:45.169+08:00] Try to connect to the primary primary. [W000 2024-05-08T09:25:45.238+08:00] The standby standby01 is ready to be a FSFO target [W000 2024-05-08T09:25:45.238+08:00] Connection to the primary restored! [W000 2024-05-08T09:25:47.238+08:00] Disconnecting from database primary.

通过登录到观察者计算机并运行DGMGRL,最多启动三个观察者。以具有SYSDG或SYSDBA权限的用户身份连接到配置,然后发出Start observer命令。请注意,在发出命令后将不会得到DGMGRL提示。

DGMGRL> CONNECT sysdg@North_Sales.example.com;
Password: password
Connected to "North_Sales"
Connected as SYSDG.

DGMGRL> START OBSERVER observer1 IN BACKGROUND
> FILE IS /net/sales/dat/oracle/broker/fsfo.dat
> LOGFILE IS /net/sales/dat/oracle/broker/observer.log
> CONNECT IDENTIFIER IS North_Sales
Submitted command "START OBSERVER" using connect identifier "North_Sales"

出于安全考虑,Oracle建议使用此命令格式;没有可见的凭据。这种做法可防止系统上的其他用户使用实用程序(例如UNIX ps实用程序)来显示连接凭据。它还防止明文密码在用户的终端上可见。

从脚本启动观察程序时,Oracle建议您使用支持“connect/”的方法,这样数据库连接凭据就不必嵌入脚本中。如果您选择使用客户端Oracle钱包作为安全的外部密码存储,请确保添加主数据库和快速启动故障切换目标备用数据库的凭据。添加每个数据库的凭据时指定的数据库连接字符串必须与ObserverConnectIdentifer或DGConnectIdentifier数据库属性匹配。

当启动多个观察程序时,启用快速启动故障切换后,一个观察程序是主观察程序,其余观察程序是备份观察程序。只有主观察者才能与Data Guard代理协调快速启动故障切换。如果主数据库和目标备用数据库保持连接,但与主观察器的连接丢失,则代理会尝试指定一个备份观察器作为新的主观察者。

测试

需要启动 OBServer

step 1.主库异常关机
step 2.查看 OBServer 日志

DGMGRL> start observer
[W000 2024-05-08T03:35:44.358-04:00] FSFO target standby is standby02
Observer 'standby03' started
[W000 2024-05-08T03:35:44.778-04:00] Observer trace level is set to USER
[W000 2024-05-08T03:35:44.778-04:00] Try to connect to the primary.
[W000 2024-05-08T03:35:44.778-04:00] Try to connect to the primary primary.
[W000 2024-05-08T03:35:44.837-04:00] The standby standby02 is ready to be a FSFO target
[W000 2024-05-08T03:35:45.837-04:00] Connection to the primary restored!
[W000 2024-05-08T03:35:50.846-04:00] Disconnecting from database primary.
[W000 2024-05-08T03:36:53.996-04:00] Primary database cannot be reached.
[W000 2024-05-08T03:36:53.996-04:00] Fast-Start Failover threshold has not exceeded. Retry for the next 30 seconds
[W000 2024-05-08T03:36:54.996-04:00] Try to connect to the primary.
[W000 2024-05-08T03:37:25.004-04:00] Primary database cannot be reached.
[W000 2024-05-08T03:37:25.004-04:00] Fast-Start Failover threshold has expired.
[W000 2024-05-08T03:37:25.004-04:00] Try to connect to the standby.
[W000 2024-05-08T03:37:25.004-04:00] Making a last connection attempt to primary database before proceeding with Fast-Start Failover.
[W000 2024-05-08T03:37:25.004-04:00] Check if the standby is ready for failover.
[S002 2024-05-08T03:37:25.012-04:00] Fast-Start Failover started...

2024-05-08T03:37:25.012-04:00
Initiating Fast-Start Failover to database "standby02"...
[S002 2024-05-08T03:37:25.012-04:00] Initiating Fast-start Failover.
Performing failover NOW, please wait...
ORA-12543: TNS:destination host unreachable

Unable to connect to database using primary
ORA-12170: TNS:Connect timeout occurred

Unable to connect to database using primary
Failover succeeded, new primary is "standby02"
2024-05-08T03:38:22.701-04:00
[S002 2024-05-08T03:38:22.701-04:00] Fast-Start Failover finished...
[W000 2024-05-08T03:38:22.701-04:00] Failover succeeded. Restart pinging.
[W000 2024-05-08T03:38:22.729-04:00] Primary database has changed to standby02.
[W000 2024-05-08T03:38:22.734-04:00] Try to connect to the primary.
[W000 2024-05-08T03:38:22.734-04:00] Try to connect to the primary standby02.
[W000 2024-05-08T03:38:22.876-04:00] The standby primary needs to be reinstated
[W000 2024-05-08T03:38:22.877-04:00] Try to connect to the new standby primary.
[W000 2024-05-08T03:38:23.877-04:00] Connection to the primary restored!
ORA-12543: TNS:destination host unreachable

Unable to connect to database using primary
[W000 2024-05-08T03:38:26.885-04:00] Reinstating database standby01.

2024-05-08T03:38:26.885-04:00
Initiating reinstatement for database "standby01"...
Reinstating database "standby01", please wait...
Error: ORA-16653: failed to reinstate database

Failed.
Reinstatement of database "standby01" failed
2024-05-08T03:38:28.974-04:00
[W000 2024-05-08T03:38:30.972-04:00] Reinstating database standby03.

2024-05-08T03:38:30.972-04:00
Initiating reinstatement for database "standby03"...
Reinstating database "standby03", please wait...
Error: ORA-16653: failed to reinstate database

step 3.查看当前配置

DGMGRL> SHOW CONFIGURATION

Configuration - xiangxun

  Protection Mode: MaxAvailability
  Members:
  primary   - Primary database
    Error: ORA-12543: TNS:destination host unreachable

    standby02 - (*) Physical standby database 
    standby01 - Physical standby database 
    standby03 - Physical standby database 

Fast-Start Failover: Enabled in Zero Data Loss Mode

Configuration Status:
ERROR   (status updated 0 seconds ago)

DGMGRL> 

step 4.查看备库 standby02 的加色

SQL> select open_mode,protection_mode,database_role,switchover_status from v$database;

OPEN_MODE				 PROTECTION_MODE			  DATABASE_ROLE 		   SWITCHOVER_STATUS
---------------------------------------- ---------------------------------------- -------------------------------- ----------------------------------------
READ WRITE				 MAXIMUM AVAILABILITY			  PRIMARY			   NOT ALLOWED

SQL> 

当前 standby02 已经提升为 主库

step 5.当主库故障恢复后,启动数据库和监听

[oracle@primary ~]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Wed May 8 15:46:04 2024
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Connected to an idle instance.

SYS@orcl> startup
ORACLE instance started.

Total System Global Area 2432695144 bytes
Fixed Size		    8899432 bytes
Variable Size		  536870912 bytes
Database Buffers	 1879048192 bytes
Redo Buffers		    7876608 bytes
Database mounted.
ORA-16649: possible failover to another database prevents this database from being opened


SYS@orcl> 

step 6.查看 OBServer 日志

2024-05-08T03:47:23.207-04:00
Initiating reinstatement for database "primary"...
Reinstating database "primary", please wait...
[W000 2024-05-08T03:48:00.285-04:00] Try to connect to the primary standby02.
[W000 2024-05-08T03:48:01.285-04:00] Connection to the primary restored!
[W000 2024-05-08T03:48:03.291-04:00] Reinstating database standby01.

2024-05-08T03:48:03.291-04:00
Initiating reinstatement for database "standby01"...
Reinstating database "standby01", please wait...
[W000 2024-05-08T03:48:13.316-04:00] The standby primary is ready to be a FSFO target
Reinstatement of database "primary" succeeded
2024-05-08T03:49:47.563-04:00
[W000 2024-05-08T03:49:48.516-04:00] Successfully reinstated database primary.
Error: ORA-16653: failed to reinstate database

Failed.
Reinstatement of database "standby01" failed
2024-05-08T03:49:54.481-04:00
[W000 2024-05-08T03:49:55.530-04:00] Reinstating database standby03.

2024-05-08T03:49:55.530-04:00
Initiating reinstatement for database "standby03"...
Reinstating database "standby03", please wait...
Error: ORA-16653: failed to reinstate database

Failed.
Reinstatement of database "standby03" failed
2024-05-08T03:50:02.022-04:00
[W000 2024-05-08T03:50:03.020-04:00] Disconnecting from database standby02.

当主库物理故障恢复后,从新进行切换回主库。

相关视图

SELECT fs_failover_status,
fs_failover_current_target,
fs_failover_threshold,
fs_failover_observer_present,
fs_failover_observer_host
FROM v$database;

FQ

FQ 1:

DGMGRL> SHOW CONFIGURATION

Configuration - xiangxun

  Protection Mode: MaxPerformance
  Members:
  primary   - Primary database
    standby01 - Physical standby database 
    standby02 - Physical standby database 
    standby03 - Physical standby database 
      Warning: ORA-16857: member disconnected from redo source for longer than specified threshold

Fast-Start Failover:  Disabled

Configuration Status:
WARNING   (status updated 54 seconds ago)

DGMGRL> 

Data Guard代理还具有可配置的数据库属性,当传输或应用滞后超过用户定义的值时,这些属性可用于生成警告:

当数据库的应用滞后超过属性指定的值时,ApplyLagThreshold可配置数据库属性会为逻辑或物理备用生成警告状态。属性值以秒为单位表示。如果值为0秒,则在存在应用滞后时不会生成任何警告。作为最佳实践,Oracle建议将ApplyLagThreshold设置为至少15分钟。
当数据库的传输滞后超过属性指定的值时,TransportLagThreshold可配置数据库属性可用于为逻辑、物理或快照备用生成警告状态。属性值以秒为单位表示。如果值为0秒,则在存在传输滞后时不会生成任何警告。作为最佳做法,Oracle建议将TransportLagThreshold设置为至少15分钟。

DGMGRL> edit database standby03 set property ApplyLagThreshold=900;
Property "applylagthreshold" updated
DGMGRL> show database verbose standby03

Database - standby03

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      5 minutes 25 seconds (computed 37 seconds ago)
  Apply Lag:          5 minutes 25 seconds (computed 37 seconds ago)
  Average Apply Rate: 1.00 KByte/s
  Active Apply Rate:  360.00 KByte/s
  Maximum Apply Rate: 360.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    orcl

  Database Warning(s):
    ORA-16855: transport lag has exceeded specified threshold
    ORA-16857: member disconnected from redo source for longer than specified threshold

  Properties:
    DGConnectIdentifier             = 'standby03'
    ObserverConnectIdentifier       = ''
    FastStartFailoverTarget         = 'primary'
    PreferredObserverHosts          = ''
    LogShipping                     = 'ON'
    RedoRoutes                      = ''
    LogXptMode                      = 'ASYNC'
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyLagThreshold               = '900'
    TransportLagThreshold           = '30'
    TransportDisconnectedThreshold  = '30'
    ApplyParallel                   = 'AUTO'
    ApplyInstances                  = '0'
    StandbyFileManagement           = ''
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '0'
    LogArchiveMinSucceedDest        = '0'
    DataGuardSyncLatency            = '0'
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = ''
    DbFileNameConvert               = ''
    LogFileNameConvert              = ''
    ArchiveLocation                 = ''
    AlternateLocation               = ''
    StandbyArchiveLocation          = ''
    StandbyAlternateLocation        = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    LogXptStatus                    = '(monitor)'
    SendQEntries                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    HostName                        = 'standby03'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=standby03)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=standby03_DGMGRL)(INSTANCE_NAME=orcl)(SERVER=DEDICATED)))'
    TopWaitEvents                   = '(monitor)'
    SidName                         = '(monitor)'

  Log file locations:
    Alert log               : /u01/app/oracle/diag/rdbms/standby03/orcl/trace/alert_orcl.log
    Data Guard Broker log   : /u01/app/oracle/diag/rdbms/standby03/orcl/trace/drcorcl.log

Database Status:
WARNING

DGMGRL> 

FQ 2:

DGMGRL> show database verbose standby03

Database - standby03

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      6 minutes 25 seconds (computed 34 seconds ago)
  Apply Lag:          6 minutes 25 seconds (computed 34 seconds ago)
  Average Apply Rate: 1.00 KByte/s
  Active Apply Rate:  360.00 KByte/s
  Maximum Apply Rate: 360.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    orcl

  Database Warning(s):
    ORA-16855: transport lag has exceeded specified threshold
    ORA-16857: member disconnected from redo source for longer than specified threshold

  Properties:
    DGConnectIdentifier             = 'standby03'
    ObserverConnectIdentifier       = ''
    FastStartFailoverTarget         = 'primary'
    PreferredObserverHosts          = ''
    LogShipping                     = 'ON'
    RedoRoutes                      = ''
    LogXptMode                      = 'ASYNC'
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyLagThreshold               = '900'
    TransportLagThreshold           = '30'
    TransportDisconnectedThreshold  = '30'
    ApplyParallel                   = 'AUTO'
    ApplyInstances                  = '0'
    StandbyFileManagement           = ''
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '0'
    LogArchiveMinSucceedDest        = '0'
    DataGuardSyncLatency            = '0'
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = ''
    DbFileNameConvert               = ''
    LogFileNameConvert              = ''
    ArchiveLocation                 = ''
    AlternateLocation               = ''
    StandbyArchiveLocation          = ''
    StandbyAlternateLocation        = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    LogXptStatus                    = '(monitor)'
    SendQEntries                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    HostName                        = 'standby03'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=standby03)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=standby03_DGMGRL)(INSTANCE_NAME=orcl)(SERVER=DEDICATED)))'
    TopWaitEvents                   = '(monitor)'
    SidName                         = '(monitor)'

  Log file locations:
    Alert log               : /u01/app/oracle/diag/rdbms/standby03/orcl/trace/alert_orcl.log
    Data Guard Broker log   : /u01/app/oracle/diag/rdbms/standby03/orcl/trace/drcorcl.log

Database Status:
WARNING

DGMGRL> 

上面输出中ORA-16855的错误是正常的,可以看下要求的delay是多少:

SYS@orcl> show parameter log_archive_dest_4  

NAME				     TYPE		    VALUE
------------------------------------ ---------------------- ------------------------------
log_archive_dest_4		     string		    service="standby03", ASYNC NOA
							    FFIRM delay=0 optional compres
							    sion=disable max_failure=0 reo
							    pen=300 db_unique_name="standb
							    y03" net_timeout=30, valid_for
							    =(online_logfile,all_roles)
SYS@orcl> 

在 DGMGRL 中查看

DGMGRL> show database verbose "standby03" DelayMins
  DelayMins = '0'
DGMGRL> 

可以看到,要求的为 0,而实际的滞后为1分钟:

SYS@orcl> SELECT * FROM v$dataguard_stats WHERE name LIKE '%lag%';

SOURCE_DBID SOURCE_DB_UNIQUE_NAME					     NAME
----------- ---------------------------------------------------------------- ----------------------------------------------------------------
VALUE																 UNIT
-------------------------------------------------------------------------------------------------------------------------------- ------------------------------------------------------------
TIME_COMPUTED						     DATUM_TIME 						      CON_ID
------------------------------------------------------------ ------------------------------------------------------------ ----------
 1696144244 PRIMARY							     transport lag
+00 00:00:00															 day(2) to second(0) interval
05/08/2024 13:51:19					     05/08/2024 13:51:18						   0

 1696144244 PRIMARY							     apply lag
+00 00:00:00															 day(2) to second(0) interval
05/08/2024 13:51:19					     05/08/2024 13:51:18						   0


SYS@orcl> 

修改 DelayMins 为 120 秒

DGMGRL> edit database standby03 set property DelayMins=120;
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

文章被以下合辑收录

评论