在今天的文章中,我将向您介绍Data Guard Broker—监控环境。
安装数据保护环境并启用代理配置后,我们需要检查系统的可操作性。下面是进行这些检查的DGMGRL命令。
[从DGMGRL运行的命令都是通过连接到DGMGRL从主-1运行的。]
1.我们的第一个检查查询是查询整个配置。
DGMGRL> show configuration
Configuration - Broker_Configuration
Protection Mode: MaxPerformance
Databases:
primary - Primary database
standby - Physical standby database
logical - Logical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
从这里;
保护模式,
哪些DB包括在代理环境中,
快速启动故障切换的状态,
我们可以看到配置的一般状态。
2.可以在数据库级别进行常规检查。
DGMGRL> show database primary;
Database - primary
Role: PRIMARY
Intended State: TRANSPORT-ON
Instance(s):
primary1
primary2
Database Status:
SUCCESS
从这里;
我们了解重做运输服务的状态。
我们了解数据库的状态。
DGMGRL> show database standby;
Database - standby
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 2 seconds ago)
Apply Lag: 0 seconds (computed 2 seconds ago)
Apply Rate: 517.00 KByte/s
Real Time Query: ON
Instance(s):
primary1 (apply instance)
primary2
Database Status:
SUCCESS
DGMGRL> show database logical;
Database - logical
Role: LOGICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 1 second ago)
Apply Lag: 4 seconds (computed 1 second ago)
Apply Rate: 862.71 MByte/s
Instance(s):
primary1v
primary2 (apply instance)
Database Status:
SUCCESS
从这里;
数据库的作用,
无论重做应用是否激活,
重做和传输延迟信息,
对于物理备用数据库,实时查询(活动数据保护)是否处于活动状态,
应用服务在哪些实例上运行,
我们学习有关数据库状态的信息。
3.如果数据库或配置状态为错误,我们将创建一个错误条件以查看要检查的参数。
在启用代理配置的同时,我们正在更改SQLPLUS的参数。
[Primary-2] SQL> show parameter log_archive_dest_state_3
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_state_3 string ENABLE
log_archive_dest_state_30 string enable
log_archive_dest_state_31 string enable
[Primary-2] SQL> alter system set log_archive_dest_state_2=DEFER scope=both sid='*';
System altered.
我们正在查询代理配置。
DGMGRL> show configuration
Configuration - Broker_Configuration
Protection Mode: MaxPerformance
Databases:
primary - Primary database
Error: ORA-16764: redo transport service to a standby database is not running
standby - Physical standby database
logical - Logical standby database
Fast-Start Failover: DISABLED
Configuration Status:
ERROR
4.上述误差可能并不总是明显的。下面是我们将在这种情况下查看的参数。
我们在看经纪人的日志。
[oracle@primary2 ~]$ tail -100f /u01/app/oracle/diag/rdbms/primary/primary2/trace/drcprimary2.log
01/16/2017 10:12:02
Redo transport problem detected: redo transport to database 'standby' is unexpectedly offline: DEFERRED
注意:从错误中可以看出,LOG_ARCHIVE_DEST_STATE_2参数被延迟。
我们可以在不查看日志的情况下看到“StatusReport”属性的错误。
DGMGRL> show database primary 'StatusReport';
STATUS REPORT
INSTANCE_NAME SEVERITY ERROR_TEXT
primary1 ERROR ORA-16738: redo transport service for database "standby" is not running
primary2 ERROR ORA-16738: redo transport service for database "standby" is not running
我们还质疑与运输服务相关的其他属性。
DGMGRL> show database primary 'LogXptStatus';
LOG TRANSPORT STATUS
PRIMARY_INSTANCE_NAME STANDBY_DATABASE_NAME STATUS
primary1 standby
primary1 logical
primary2 standby
primary2 logical
DGMGRL> show database primary 'InconsistentProperties';
INCONSISTENT PROPERTIES
INSTANCE_NAME PROPERTY_NAME MEMORY_VALUE SPFILE_VALUE BROKER_VALUE
DGMGRL> Show database primary 'InconsistentLogXptProps';
INCONSISTENT LOG TRANSPORT PROPERTIES
INSTANCE_NAME STANDBY_NAME PROPERTY_NAME MEMORY_VALUE BROKER_VALUE
LogXptStatus:显示主数据库中所有实例捕获的传输错误。
不一致属性:列出代理配置和数据库设置中是否存在不一致的情况。
InconsistentLogXptProps:列出备用数据库中与重做传输相关的参数的代理配置和重做传输设置之间的不一致。
5.假设有许多日志切换操作,直到发现错误为止。在这种情况下,我们可以看到哪些存档文件未按如下方式处理。
我们正在进行日志切换操作以进行测试。
[Primary-2] SQL> alter system switch logfile;
System altered.
[Primary-2] SQL> alter system switch logfile;
System altered.
[Primary-2] SQL> alter system switch logfile;
System altered.
[Primary-2] SQL> alter system switch logfile;
System altered.
我们检查主侧和备用侧的序列号。
[Primary-2] SQL> select max(sequence#),thread# from v$archived_log group by thread#;
MAX(SEQUENCE#) THREAD#
-------------- ----------
572 1
332 2
[Physical-2] SQL> select max(sequence#),thread# from v$archived_log group by thread#;
MAX(SEQUENCE#) THREAD#
-------------- ----------
572 1
329 2
[Logical-2] SQL> SELECT max(sequence#),thread#,applied FROM dba_logstdby_log group by thread#, applied;
MAX(SEQUENCE#) THREAD# APPLIED
-------------- ---------- --------
572 1 YES
332 2 YES
可以看出,有些档案没有进入物理备用端。通过查询“SendQEntries”属性可以查看这些存档。
DGMGRL> show database primary 'SendQEntries';
PRIMARY_SEND_QUEUE
STANDBY_NAME STATUS RESETLOGS_ID THREAD LOG_SEQ TIME_GENERATED TIME_COMPLETED FIRST_CHANGE# NEXT_CHANGE# SIZE (KBs)
standby ARCHIVED 932222810 2 330 01/16/2017 10:55:27 01/16/2017 10:55:35 6074895 6074922 8
standby ARCHIVED 932222810 2 331 01/16/2017 10:55:35 01/16/2017 10:55:37 6074922 6074929 1
standby ARCHIVED 932222810 2 332 01/16/2017 10:55:37 01/16/2017 10:55:40 6074929 6074940 1
CURRENT 932222810 1 573 01/16/2017 10:55:34 6074936 2515
CURRENT 932222810
在这里的状态列中,
当前:显示当前写入的重做日志。
存档:表示在线重做日志的写入过程已完成,创建了本地存档,但未将存档发送到备用端。
NOT_archive:表示在线重做日志的写入过程已完成,但尚未检索到本地存档。
我们启用DEFERRED LOG_ARCHIVE_DEST_STATE_2参数并查询配置的状态。
[Primary-1] SQL> alter system set log_archive_dest_state_2=enable scope=both sid='*';
System altered.
DGMGRL> show configuration
Configuration - Broker_Configuration
Protection Mode: MaxPerformance
Databases:
primary - Primary database
standby - Physical standby database
logical - Logical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
[Primary-2] SQL> select max(sequence#),thread# from v$archived_log group by thread#;
MAX(SEQUENCE#) THREAD#
-------------- ----------
573 1
333 2
[Physical-2] SQL> select max(sequence#),thread# from v$archived_log group by thread#;
MAX(SEQUENCE#) THREAD#
-------------- ----------
573 1
333 2
[Logical-2] SQL> SELECT max(sequence#),thread#,applied FROM dba_logstdby_log group by thread#, applied;
MAX(SEQUENCE#) THREAD# APPLIED
-------------- ---------- --------
573 1 YES
333 2 YES
DGMGRL> show database primary 'SendQEntries';
PRIMARY_SEND_QUEUE
STANDBY_NAME STATUS RESETLOGS_ID THREAD LOG_SEQ TIME_GENERATED TIME_COMPLETED FIRST_CHANGE# NEXT_CHANGE# SIZE (KBs)
CURRENT 932222810 1 574 01/16/2017 11:21:41 6080641 36
CURRENT 932222810 2
6.我们首先设置一个测试环境来了解属性,在那里我们可以看到哪些归档文件不适用于备用端。然后查询相关属性。
在逻辑方面,我们质疑重做应用的状态。
DGMGRL> show database logical;
Database - logical
Role: LOGICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 1 second ago)
Apply Lag: 0 seconds (computed 1 second ago)
Apply Rate: -939900928 Byte/s
Instance(s):
primary1
primary2 (apply instance)
Database Status:
SUCCESS
对于双重控制,在主侧创建一个表,并查看它是否转到备用侧。
[Primary-2] SQL> select table_name from dba_tables where owner='TEST';
TABLE_NAME
------------------------------
EMPLOYEES_YEDEK
LOCATIONS_YEDEK
REGIONS_YEDEK
[Physical-2] SQL> select table_name from dba_tables where owner='TEST';
TABLE_NAME
------------------------------
EMPLOYEES_YEDEK
LOCATIONS_YEDEK
REGIONS_YEDEK
[Logical-2] SQL> select table_name from dba_tables where owner='TEST';
TABLE_NAME
------------------------------
REGIONS_YEDEK
EMPLOYEES_YEDEK
LOCATIONS_YEDEK
[Primary-2] SQL> create table test.departments_yedek as select * from hr.departments;
Table created.
[Primary-2] SQL> select table_name from dba_tables where owner='TEST';
TABLE_NAME
------------------------------
EMPLOYEES_YEDEK
LOCATIONS_YEDEK
REGIONS_YEDEK
DEPARTMENTS_YEDEK
[Physical-2] SQL> select table_name from dba_tables where owner='TEST';
TABLE_NAME
------------------------------
EMPLOYEES_YEDEK
LOCATIONS_YEDEK
REGIONS_YEDEK
DEPARTMENTS_YEDEK
[Logical-2] SQL> select table_name from dba_tables where owner='TEST';
TABLE_NAME
------------------------------
REGIONS_YEDEK
EMPLOYEES_YEDEK
LOCATIONS_YEDEK
DEPARTMENTS_YEDEK
在逻辑方面,我们停止重做应用服务。
DGMGRL> edit database logical set state='APPLY-OFF';
Succeeded.
[Logical-1]—>ALERT LOG
01/16/2017 12:16:53
EDIT DATABASE logical SET STATE = APPLY-OFF
01/16/2017 12:16:55
Command EDIT DATABASE logical SET STATE = APPLY-OFF completed
我们正在执行日志切换过程,以便可以创建归档日志。
[Primary-2] SQL> alter system switch logfile;
System altered.
[Primary-2] SQL> alter system switch logfile;
System altered.
[Primary-2] SQL> alter system switch logfile;
System altered.
[Primary-2] SQL> alter system switch logfile;
System altered.
我们正在查询逻辑数据库的重做应用状态。
DGMGRL> show database logical
Database - logical
Role: LOGICAL STANDBY
Intended State: APPLY-OFF
Transport Lag: 0 seconds (computed 1 second ago)
Apply Lag: 39 seconds (computed 1 second ago)
Apply Rate: (unknown)
Instance(s):
primary1
primary2 (apply instance)
Database Status:
SUCCESS
我们知道哪些档案不适用。
DGMGRL> show database logical 'RecvQEntries';
STANDBY_RECEIVE_QUEUE
STATUS RESETLOGS_ID THREAD LOG_SEQ TIME_GENERATED TIME_COMPLETED FIRST_CHANGE# NEXT_CHANGE# SIZE (KBs)
PARTIALLY_APPLIED 932222810 1 575 01/16/2017 12:08:19 01/16/2017 12:19:01 6089898 6091869 420
PARTIALLY_APPLIED 932222810 2 336 01/16/2017 12:08:22 01/16/2017 12:18:55 6089892 6091799 900
NOT_APPLIED 932222810 2 337 01/16/2017 12:18:55 01/16/2017 12:19:02 6091799 6091851 32
NOT_APPLIED 932222810 2 338 01/16/2017 12:19:02 01/16/2017 12:19:05 6091851 6091863 3
NOT_APPLIED 932222810 2 339
下面是上述DGMGRL查询的SQLPLUS命令。
[Logical-2] SQL> SELECT max(sequence#),thread#,applied FROM dba_logstdby_log group by thread#, applied;
MAX(SEQUENCE#) THREAD# APPLIED
-------------- ---------- --------
575 1 CURRENT
574 1 YES
339 2 NO
335 2 YES
336 2 CURRENT
7.我们了解了导致Log Apply服务在逻辑备用数据库中停止的事务,如下所示。
DGMGRL> show database logical 'LsbyFailedTxnInfo';
LsbyFailedTxnInfo = ''
8.在逻辑备用数据库中,日志应用服务使用的进程和MAX_ SGA信息如下所示。
DGMGRL> show database logical 'LsbyParameters';
LsbyParameters = 'MAX_SGA=30###MAX_SERVERS=14###MAX_EVENTS_RECORDED=10000###PRESERVE_CO
9.我们看到TopWait事件如下。
DGMGRL> show instance primary1 'TopWaitEvents' on database primary;
TOP SYSTEM WAIT EVENTS
Event Wait Time
rdbms ipc message 1503759311
class slave wait 297257530
SQL*Net message from client 148557969
gcs remote message 84543254
DIAG idle wait 84519976
DGMGRL> show instance primary1 'TopWaitEvents' on database standby;
TOP SYSTEM WAIT EVENTS
Event Wait Time
rdbms ipc message 810213291
class slave wait 311501944
parallel recovery slave next change 104565552
gcs remote message 52278608
DIAG idle wait 52247124
DGMGRL> show instance primary1 'TopWaitEvents' on database logical;
TOP SYSTEM WAIT EVENTS
Event Wait Time
rdbms ipc message 1409929720
class slave wait 135264994
SQL*Net message from client 132785675
wait for unread message on broadcast channel 88204824
gcs remote message 88160985
此信息从V$SYSTEM_EVENT视图中获取。
10.通过下面的示例,我们可以看到代理配置文件和SPFILE参数文件不一致的情况。
我们正在查询代理配置。
DGMGRL> show configuration
Configuration - Broker_Configuraiton
Protection Mode: MaxPerformance
Databases:
primary - Primary database
Warning: ORA-16809: multiple warnings detected for the database
standby - Physical standby database
logical - Logical standby database
Fast-Start Failover: DISABLED
Configuration Status:
WARNING
我们质疑是什么导致了错误。
DGMGRL> show database primary 'StatusReport';
STATUS REPORT
INSTANCE_NAME SEVERITY ERROR_TEXT
primary1 WARNING ORA-16714: the value of property LogArchiveTrace is inconsistent with the database setting
primary2 WARNING ORA-16714: the value of property LogArchiveTrace is inconsistent with the database setting
我们将LogArchiveTrace参数的值与数据库中的代理配置文件进行比较。
DGMGRL> show database primary 'InconsistentProperties';
INCONSISTENT PROPERTIES
INSTANCE_NAME PROPERTY_NAME MEMORY_VALUE SPFILE_VALUE BROKER_VALUE
primary1 LogArchiveTrace 6 0 0
primary2 LogArchiveTrace 6
我们还通过检查参数值来检查情况是否如此。
DGMGRL> show instance primary1 'LogArchiveTrace' on database primary;
LogArchiveTrace = '0'
**********
[Primary-1] SQL> show parameter log_archive_trace
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
log_archive_trace integer 6
我们重置了数据库中的参数值。
[Primary-1] SQL> alter system set log_archive_trace=0 scope=both sid='*';
System altered.
我们检查配置是否正常。
[Primary-1] SQL> show parameter log_archive_trace
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
log_archive_trace integer 0
DGMGRL> show configuration
Configuration - Broker_Configuraiton
Protection Mode: MaxPerformance
Databases:
primary - Primary database
standby - Physical standby database
logical - Logical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
DGMGRL> show database primary 'StatusReport';
STATUS REPORT
INSTANCE_NAME SEVERITY ERROR_TEXT
原文标题:Monitoring The Environment
原文作者:Onur ARDAHANLI
原文链接:https://dbtut.com/index.php/2022/06/03/monitoring-the-environment




