暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

监测Oracle环境

原创 eternity 2022-08-15
340

在今天的文章中,我将向您介绍Data Guard Broker—监控环境。

安装数据保护环境并启用代理配置后,我们需要检查系统的可操作性。下面是进行这些检查的DGMGRL命令。

[从DGMGRL运行的命令都是通过连接到DGMGRL从主-1运行的。]

1.我们的第一个检查查询是查询整个配置。

DGMGRL> show configuration
 
Configuration - Broker_Configuration
 
  Protection Mode: MaxPerformance
  Databases:
    primary - Primary database
    standby - Physical standby database
    logical - Logical standby database
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS

从这里;

保护模式,

哪些DB包括在代理环境中,

快速启动故障切换的状态,

我们可以看到配置的一般状态。

2.可以在数据库级别进行常规检查。

DGMGRL> show database primary;
 
Database - primary
 
  Role:            PRIMARY
  Intended State:  TRANSPORT-ON
  Instance(s):
    primary1
    primary2
 
Database Status:
SUCCESS

从这里;

我们了解重做运输服务的状态。

我们了解数据库的状态。

DGMGRL> show database standby;
 
Database - standby
 
  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds (computed 2 seconds ago)
  Apply Lag:       0 seconds (computed 2 seconds ago)
  Apply Rate:      517.00 KByte/s
  Real Time Query: ON
  Instance(s):
    primary1 (apply instance)
    primary2
 
Database Status:
SUCCESS
DGMGRL> show database logical;
 
Database - logical
 
  Role:            LOGICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds (computed 1 second ago)
  Apply Lag:       4 seconds (computed 1 second ago)
  Apply Rate:      862.71 MByte/s
  Instance(s):
    primary1v
    primary2 (apply instance)
 
Database Status:
SUCCESS

从这里;

数据库的作用,

无论重做应用是否激活,

重做和传输延迟信息,

对于物理备用数据库,实时查询(活动数据保护)是否处于活动状态,

应用服务在哪些实例上运行,

我们学习有关数据库状态的信息。

3.如果数据库或配置状态为错误,我们将创建一个错误条件以查看要检查的参数。

在启用代理配置的同时,我们正在更改SQLPLUS的参数。


[Primary-2] SQL> show parameter log_archive_dest_state_3
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_state_3             string      ENABLE
log_archive_dest_state_30            string      enable
log_archive_dest_state_31            string      enable
[Primary-2] SQL> alter system set log_archive_dest_state_2=DEFER scope=both sid='*';
 
System altered.

我们正在查询代理配置。

DGMGRL> show configuration
 
Configuration - Broker_Configuration
 
  Protection Mode: MaxPerformance
  Databases:
    primary - Primary database
      Error: ORA-16764: redo transport service to a standby database is not running
 
    standby - Physical standby database
    logical - Logical standby database
 
Fast-Start Failover: DISABLED
 
Configuration Status:
ERROR

4.上述误差可能并不总是明显的。下面是我们将在这种情况下查看的参数。

我们在看经纪人的日志。

[oracle@primary2 ~]$ tail -100f /u01/app/oracle/diag/rdbms/primary/primary2/trace/drcprimary2.log 
 
01/16/2017 10:12:02
Redo transport problem detected: redo transport to database 'standby' is unexpectedly offline: DEFERRED

注意:从错误中可以看出,LOG_ARCHIVE_DEST_STATE_2参数被延迟。

我们可以在不查看日志的情况下看到“StatusReport”属性的错误。

DGMGRL> show database primary 'StatusReport';
STATUS REPORT
       INSTANCE_NAME   SEVERITY ERROR_TEXT
            primary1      ERROR ORA-16738: redo transport service for database "standby" is not running
            primary2      ERROR ORA-16738: redo transport service for database "standby" is not running

我们还质疑与运输服务相关的其他属性。

DGMGRL> show database primary 'LogXptStatus';
LOG TRANSPORT STATUS
PRIMARY_INSTANCE_NAME STANDBY_DATABASE_NAME               STATUS 
            primary1              standby                      
            primary1              logical                      
            primary2              standby                      
            primary2              logical                      
 
DGMGRL> show database primary 'InconsistentProperties';
INCONSISTENT PROPERTIES
   INSTANCE_NAME        PROPERTY_NAME         MEMORY_VALUE         SPFILE_VALUE         BROKER_VALUE 
   
DGMGRL> Show database primary 'InconsistentLogXptProps';
INCONSISTENT LOG TRANSPORT PROPERTIES
   INSTANCE_NAME         STANDBY_NAME        PROPERTY_NAME         MEMORY_VALUE         BROKER_VALUE

LogXptStatus:显示主数据库中所有实例捕获的传输错误。

不一致属性:列出代理配置和数据库设置中是否存在不一致的情况。

InconsistentLogXptProps:列出备用数据库中与重做传输相关的参数的代理配置和重做传输设置之间的不一致。

5.假设有许多日志切换操作,直到发现错误为止。在这种情况下,我们可以看到哪些存档文件未按如下方式处理。

我们正在进行日志切换操作以进行测试。

[Primary-2] SQL> alter system switch logfile;
 
System altered.
 
[Primary-2] SQL> alter system switch logfile;
 
System altered.
 
[Primary-2] SQL> alter system switch logfile;
 
System altered.
 
[Primary-2] SQL> alter system switch logfile;
 
System altered.

我们检查主侧和备用侧的序列号。

[Primary-2] SQL> select max(sequence#),thread# from v$archived_log group by thread#;
 
MAX(SEQUENCE#)    THREAD#
-------------- ----------
           572          1
           332          2
[Physical-2] SQL> select max(sequence#),thread# from v$archived_log group by thread#;
 
MAX(SEQUENCE#)    THREAD#
-------------- ----------
           572          1
           329          2
[Logical-2] SQL> SELECT max(sequence#),thread#,applied FROM dba_logstdby_log group by thread#, applied;
 
MAX(SEQUENCE#)    THREAD# APPLIED
-------------- ---------- --------
           572          1 YES
           332          2 YES

可以看出,有些档案没有进入物理备用端。通过查询“SendQEntries”属性可以查看这些存档。

DGMGRL> show database primary 'SendQEntries';
PRIMARY_SEND_QUEUE
        STANDBY_NAME       STATUS          RESETLOGS_ID           THREAD              LOG_SEQ       TIME_GENERATED       TIME_COMPLETED    FIRST_CHANGE#     NEXT_CHANGE#       SIZE (KBs) 
             standby                 ARCHIVED        932222810                2                          330                  01/16/2017 10:55:27  01/16/2017 10:55:35          6074895          6074922                8 
             standby                 ARCHIVED        932222810                2                           331                 01/16/2017 10:55:35  01/16/2017 10:55:37          6074922          6074929                1 
             standby                ARCHIVED        932222810                2                           332                  01/16/2017 10:55:37  01/16/2017 10:55:40          6074929          6074940                1 
                                            CURRENT         932222810                1                           573                  01/16/2017 10:55:34                                                  6074936                                         2515 
                                            CURRENT        932222810    

在这里的状态列中,

当前:显示当前写入的重做日志。

存档:表示在线重做日志的写入过程已完成,创建了本地存档,但未将存档发送到备用端。

NOT_archive:表示在线重做日志的写入过程已完成,但尚未检索到本地存档。

我们启用DEFERRED LOG_ARCHIVE_DEST_STATE_2参数并查询配置的状态。

[Primary-1] SQL> alter system set log_archive_dest_state_2=enable scope=both sid='*';
 
System altered.
DGMGRL> show configuration
 
Configuration - Broker_Configuration
 
  Protection Mode: MaxPerformance
  Databases:
    primary - Primary database
    standby - Physical standby database
    logical - Logical standby database
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS
[Primary-2] SQL> select max(sequence#),thread# from v$archived_log group by thread#;
 
MAX(SEQUENCE#)    THREAD#
-------------- ----------
           573          1
           333          2

[Physical-2] SQL> select max(sequence#),thread# from v$archived_log group by thread#;
 
MAX(SEQUENCE#)    THREAD#
-------------- ----------
           573          1
           333          2
[Logical-2] SQL> SELECT max(sequence#),thread#,applied FROM dba_logstdby_log group by thread#, applied;
 
MAX(SEQUENCE#)    THREAD# APPLIED
-------------- ---------- --------
           573          1 YES
           333          2 YES
DGMGRL> show database primary 'SendQEntries';
PRIMARY_SEND_QUEUE
        STANDBY_NAME       STATUS     RESETLOGS_ID           THREAD              LOG_SEQ       TIME_GENERATED       TIME_COMPLETED    FIRST_CHANGE#     NEXT_CHANGE#       SIZE (KBs) 
                          CURRENT        932222810                1                  574  01/16/2017 11:21:41                               6080641                                36 
                          CURRENT        932222810                2     

6.我们首先设置一个测试环境来了解属性,在那里我们可以看到哪些归档文件不适用于备用端。然后查询相关属性。

在逻辑方面,我们质疑重做应用的状态。

DGMGRL> show database logical;
 
Database - logical
 
  Role:            LOGICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds (computed 1 second ago)
  Apply Lag:       0 seconds (computed 1 second ago)
  Apply Rate:      -939900928 Byte/s
  Instance(s):
    primary1
    primary2 (apply instance)
 
Database Status:
SUCCESS

对于双重控制,在主侧创建一个表,并查看它是否转到备用侧。

[Primary-2] SQL> select table_name from dba_tables where owner='TEST';
 
TABLE_NAME
------------------------------
EMPLOYEES_YEDEK
LOCATIONS_YEDEK
REGIONS_YEDEK
[Physical-2] SQL> select table_name from dba_tables where owner='TEST';
 
TABLE_NAME
------------------------------
EMPLOYEES_YEDEK
LOCATIONS_YEDEK
REGIONS_YEDEK
[Logical-2] SQL> select table_name from dba_tables where owner='TEST';
 
TABLE_NAME
------------------------------
REGIONS_YEDEK
EMPLOYEES_YEDEK
LOCATIONS_YEDEK
[Primary-2] SQL> create table test.departments_yedek as select * from hr.departments;
 
Table created.
 
[Primary-2] SQL> select table_name from dba_tables where owner='TEST';
 
TABLE_NAME
------------------------------
EMPLOYEES_YEDEK
LOCATIONS_YEDEK
REGIONS_YEDEK
DEPARTMENTS_YEDEK
[Physical-2] SQL> select table_name from dba_tables where owner='TEST';
 
TABLE_NAME
------------------------------
EMPLOYEES_YEDEK
LOCATIONS_YEDEK
REGIONS_YEDEK
DEPARTMENTS_YEDEK
[Logical-2] SQL> select table_name from dba_tables where owner='TEST';
 
TABLE_NAME
------------------------------
REGIONS_YEDEK
EMPLOYEES_YEDEK
LOCATIONS_YEDEK
DEPARTMENTS_YEDEK

在逻辑方面,我们停止重做应用服务。

DGMGRL> edit database logical set state='APPLY-OFF';
Succeeded.

[Logical-1]—>ALERT LOG


01/16/2017 12:16:53
EDIT DATABASE logical SET STATE = APPLY-OFF
01/16/2017 12:16:55
Command EDIT DATABASE logical SET STATE = APPLY-OFF completed

我们正在执行日志切换过程,以便可以创建归档日志。

[Primary-2] SQL> alter system switch logfile;
 
System altered.
 
[Primary-2] SQL> alter system switch logfile;
 
System altered.
 
[Primary-2] SQL> alter system switch logfile;
 
System altered.
 
[Primary-2] SQL> alter system switch logfile;
 
System altered.

我们正在查询逻辑数据库的重做应用状态。

DGMGRL> show database logical
 
Database - logical
 
  Role:            LOGICAL STANDBY
  Intended State:  APPLY-OFF
  Transport Lag:   0 seconds (computed 1 second ago)
  Apply Lag:       39 seconds (computed 1 second ago)
  Apply Rate:      (unknown)
  Instance(s):
    primary1
    primary2 (apply instance)
 
Database Status:
SUCCESS

我们知道哪些档案不适用。

DGMGRL> show database logical 'RecvQEntries';
STANDBY_RECEIVE_QUEUE
              STATUS     RESETLOGS_ID           THREAD              LOG_SEQ       TIME_GENERATED       TIME_COMPLETED    FIRST_CHANGE#     NEXT_CHANGE#       SIZE (KBs) 
   PARTIALLY_APPLIED        932222810                1                  575  01/16/2017 12:08:19  01/16/2017 12:19:01          6089898          6091869              420 
   PARTIALLY_APPLIED        932222810                2                  336  01/16/2017 12:08:22  01/16/2017 12:18:55          6089892          6091799              900 
         NOT_APPLIED        932222810                2                  337  01/16/2017 12:18:55  01/16/2017 12:19:02          6091799          6091851               32 
         NOT_APPLIED        932222810                2                  338  01/16/2017 12:19:02  01/16/2017 12:19:05          6091851          6091863                3 
         NOT_APPLIED        932222810                2                  339

下面是上述DGMGRL查询的SQLPLUS命令。

[Logical-2] SQL> SELECT max(sequence#),thread#,applied FROM dba_logstdby_log group by thread#, applied;
 
MAX(SEQUENCE#)    THREAD# APPLIED
-------------- ---------- --------
           575          1 CURRENT
           574          1 YES
           339          2 NO
           335          2 YES
           336          2 CURRENT

7.我们了解了导致Log Apply服务在逻辑备用数据库中停止的事务,如下所示。

DGMGRL> show database logical 'LsbyFailedTxnInfo';
   LsbyFailedTxnInfo = ''

8.在逻辑备用数据库中,日志应用服务使用的进程和MAX_ SGA信息如下所示。

DGMGRL> show database logical 'LsbyParameters';
   LsbyParameters = 'MAX_SGA=30###MAX_SERVERS=14###MAX_EVENTS_RECORDED=10000###PRESERVE_CO

9.我们看到TopWait事件如下。

DGMGRL> show instance primary1 'TopWaitEvents' on database primary;
 TOP SYSTEM WAIT EVENTS
                Event                                          Wait Time 
    rdbms ipc message                             1503759311 
     class slave wait                                   297257530 
 SQL*Net message from client            148557969 
   gcs remote message                           84543254 
       DIAG idle wait                                  84519976 
DGMGRL> show instance primary1 'TopWaitEvents' on database standby;
 TOP SYSTEM WAIT EVENTS
                Event            Wait Time 
    rdbms ipc message            810213291 
     class slave wait            311501944 
 parallel recovery slave next change            104565552 
   gcs remote message             52278608 
       DIAG idle wait             52247124 
DGMGRL> show instance primary1 'TopWaitEvents' on database logical;
 TOP SYSTEM WAIT EVENTS
                Event            Wait Time 
    rdbms ipc message           1409929720 
     class slave wait            135264994 
 SQL*Net message from client            132785675 
 wait for unread message on broadcast channel             88204824 
   gcs remote message             88160985 

此信息从V$SYSTEM_EVENT视图中获取。

10.通过下面的示例,我们可以看到代理配置文件和SPFILE参数文件不一致的情况。

我们正在查询代理配置。

DGMGRL> show configuration
 
 Configuration - Broker_Configuraiton
 
   Protection Mode: MaxPerformance
   Databases:
     primary - Primary database
       Warning: ORA-16809: multiple warnings detected for the database
 
     standby - Physical standby database
     logical - Logical standby database
 
 Fast-Start Failover: DISABLED
 
 Configuration Status:
 WARNING

我们质疑是什么导致了错误。

DGMGRL> show database primary 'StatusReport';
 STATUS REPORT
        INSTANCE_NAME   SEVERITY ERROR_TEXT
             primary1    WARNING ORA-16714: the value of property LogArchiveTrace is inconsistent with the database setting
             primary2    WARNING ORA-16714: the value of property LogArchiveTrace is inconsistent with the database setting

我们将LogArchiveTrace参数的值与数据库中的代理配置文件进行比较。

DGMGRL> show database primary 'InconsistentProperties';
 INCONSISTENT PROPERTIES
    INSTANCE_NAME        PROPERTY_NAME         MEMORY_VALUE         SPFILE_VALUE         BROKER_VALUE 
         primary1      LogArchiveTrace                    6                    0                    0 
         primary2      LogArchiveTrace                    6         

我们还通过检查参数值来检查情况是否如此。

DGMGRL> show instance primary1 'LogArchiveTrace' on database primary;
   LogArchiveTrace = '0'
   
 **********
 [Primary-1] SQL> show parameter log_archive_trace
 
 NAME                                 TYPE        VALUE
 ------------------------------------ ----------- ------------------------------
 log_archive_trace                    integer     6

我们重置了数据库中的参数值。

[Primary-1] SQL> alter system set log_archive_trace=0 scope=both sid='*';
 
 System altered.

我们检查配置是否正常。

 [Primary-1] SQL> show parameter log_archive_trace
 
 NAME                                 TYPE        VALUE
 ------------------------------------ ----------- ------------------------------
 log_archive_trace                    integer     0
 
 DGMGRL> show configuration
 
 Configuration - Broker_Configuraiton
 
   Protection Mode: MaxPerformance
   Databases:
     primary - Primary database
     standby - Physical standby database
     logical - Logical standby database
 
 Fast-Start Failover: DISABLED
 
 Configuration Status:
 SUCCESS
 
 DGMGRL> show database primary 'StatusReport';
 STATUS REPORT
        INSTANCE_NAME   SEVERITY ERROR_TEXT

原文标题:Monitoring The Environment
原文作者:Onur ARDAHANLI
原文链接:https://dbtut.com/index.php/2022/06/03/monitoring-the-environment

最后修改时间:2022-08-18 17:10:35
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论