暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

One case:Latch free of oracle 9208 ?

Roger 2019-06-17
930

问题描述

最近某客户9208环境,遭遇latch free问题,而在oracle 9i版本中,latch free 只是一个汇总类型的event。
换句话讲,有多种event的表现其实都是通过latch free来呈现。 客户这里的实际情况查询是主要体现在:

row cache objects和cache buffer chains. 对于cache buffer chains,相对比较简单,通常也就是热块或
SQL效率问题。而 row cache objects相对麻烦一些,客户的环境经过分析发现的row cache中的dc_rollback_segs
等待严重导致,如下:

 LATCH# CACHE# NAME                              TYPE        SUBORDINATE#       GETS
------- ------ --------------------------------- ----------- ------------ ----------
     57      3 dc_rollback_segments              PARENT                   -1.591E+09
    .......
     57      3 dc_rollback_segments              PARENT                    -71015419
     58      3 dc_rollback_segments              PARENT                   -1.591E+09
    .......
     58      3 dc_rollback_segments              PARENT                    -71015419
     59      3 dc_rollback_segments              PARENT                   -1.591E+09
     .......
     59      3 dc_rollback_segments              PARENT                    -71015419
     60      3 dc_rollback_segments              PARENT                   -1.591E+09
     .......
     60      3 dc_rollback_segments              PARENT                    -71015419
     61      3 dc_rollback_segments              PARENT                   -1.591E+09
     .......
     61      3 dc_rollback_segments              PARENT                    -71015419
     62      3 dc_rollback_segments              PARENT                   -1.591E+09
     .......
     62      3 dc_rollback_segments              PARENT                    -71015419
     63      3 dc_rollback_segments              PARENT                   -1.591E+09
    .......
     63      3 dc_rollback_segments              PARENT                    -71015419
     64      3 dc_rollback_segments              PARENT                   -1.591E+09
     .......
     64      3 dc_rollback_segments              PARENT                    -71015419


专家解答

针对这个问题,似乎不太好处理,mos提供了一个通过调整_rollback_segment_count参数来避免该问题,而且还不一定有效。

我们先不管是否有效,那么这个参数有何作用 ? 通过10gR2 环境来研究一下。

++++++Session 1

SQL> select * from v$version where rownum < 3;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - Prod
PL/SQL Release 10.2.0.5.0 - Production

SQL> select count(1) from v$latch_children where name='row cache objects';

  COUNT(1)
----------
        41
SQL> select count(1) from dba_rollback_segs;

  COUNT(1)
----------
        11
SQL> select count(1) from dba_rollback_segs where status='ONLINE';

  COUNT(1)
----------
        11
SQL>  set pages 1000
SQL>  column cache# format 99999
SQL>  column name format a33
SQL>  column latch# format 999999
SQL>  select distinct s.kqrstcln    latch#,
  2                   r.cache#,
  3                   r.parameter    name,
  4                   r.type,
  5                   r.subordinate#,
  6                   r.gets
  7     from v$rowcache r, x$kqrst s
  8   where r.cache# = s.kqrstcid
  9  order by 1,4,5;

 LATCH# CACHE# NAME                              TYPE        SUBORDINATE#       GETS
------- ------ --------------------------------- ----------- ------------ ----------
      1      1 dc_free_extents                   PARENT                            0
      2      4 dc_used_extents                   PARENT                            0
      3      2 dc_segments                       PARENT                        49990
      4      0 dc_tablespaces                    PARENT                       203697
      5      5 dc_tablespace_quotas              PARENT                          176
      6      6 dc_files                          PARENT                            3
      7      7 dc_users                          PARENT                       173742
      7      7 dc_users                          SUBORDINATE            0        254
      7      7 dc_users                          SUBORDINATE            1       2031
      7      7 dc_users                          SUBORDINATE            2          0
      8      3 dc_rollback_segments              PARENT                       466657
      9      8 dc_objects                        PARENT                        46315
      9      8 dc_object_grants                  SUBORDINATE            0       3947
     10     17 dc_global_oids                    PARENT                       104704
     11     12 dc_constraints                    PARENT                           88
     12     11 dc_object_ids                     PARENT                       179941
     13     13 dc_sequences                      PARENT                        10722
     14     10 dc_usernames                      PARENT                         9010
     15     15 dc_database_links                 PARENT                          245
     16     16 dc_histogram_defs                 PARENT                        45492
     16     16 dc_histogram_data                 SUBORDINATE            0      14794
     16     16 dc_histogram_data                 SUBORDINATE            1       2401
     17     33 kqlsubheap_object                 PARENT                            0
     18     19 dc_table_scns                     PARENT                          103
     18     19 dc_partition_scns                 SUBORDINATE            0          0
     19     18 dc_outlines                       PARENT                            0
     20     14 dc_profiles                       PARENT                          197
     21     34 realm cache                       PARENT                            0
     21     34 realm auth                        SUBORDINATE            0          0
     22     35 Command rule cache                PARENT                            0
     23     36 Realm Object cache                PARENT                            0
     23     36 Realm Subordinate Cache           SUBORDINATE            0          0
     24     40 Rule Set Cache                    PARENT                            0
     25     37 event map                         PARENT                            0
     26     38 format                            PARENT                            0
     27     39 audit collector                   PARENT                            0
     28     26 global database name              PARENT                           36
     29     20 rule_info                         PARENT                            0
     30     21 rule_or_piece                     PARENT                            0
     30     21 rule_fast_operators               SUBORDINATE            0          0
     31      9 dc_qmc_cache_entries              PARENT                            0
     32     23 dc_qmc_ldap_cache_entries         PARENT                            0
     33     27 qmtmrcin_cache_entries            PARENT                            0
     34     28 qmtmrctn_cache_entries            PARENT                            0
     35     29 qmtmrcip_cache_entries            PARENT                            0
     36     30 qmtmrctp_cache_entries            PARENT                            0
     37     31 qmtmrciq_cache_entries            PARENT                            0
     38     32 qmtmrctq_cache_entries            PARENT                            0
     39     24 outstanding_alerts                PARENT                          676
     40     22 dc_awr_control                    PARENT                         1065
     41     25 dc_hintsets                       PARENT                            0

51 rows selected.

SQL> oradebug setmypid
Statement processed.
SQL> ALTER SESSION SET EVENTS 'immediate trace name row_cache level 2';

Session altered.

SQL> oradebug tracefile_name
/home/ora10g/admin/roger/udump/roger_ora_18777.trc
SQL> oradebug close_trace
Statement processed.

我们先来看下level 2的dump,看下row cache objects的dc是不是41个,如下:

[root@killdb ~]# cat cat /home/ora10g/admin/roger/udump/roger_ora_18777.trc |grep cid
cat: cat: No such file or directory
ROW CACHE HASH TABLE: cid=0 ht=0x28b9e660 size=32
ROW CACHE HASH TABLE: cid=1 ht=0x28b1d534 size=256
ROW CACHE HASH TABLE: cid=2 ht=0x28b1e5fc size=65536
ROW CACHE HASH TABLE: cid=3 ht=0x284b4074 size=128
ROW CACHE HASH TABLE: cid=4 ht=0x28b1dd98 size=256
ROW CACHE HASH TABLE: cid=5 ht=0x28b9e7c4 size=256
ROW CACHE HASH TABLE: cid=6 ht=0x28b9f028 size=64
ROW CACHE HASH TABLE: cid=7 ht=0x28434010 size=65536
ROW CACHE HASH TABLE: cid=8 ht=0x284b44d8 size=65536
ROW CACHE HASH TABLE: cid=9 ht=0x2873cbf8 size=512
ROW CACHE HASH TABLE: cid=10 ht=0x285b62cc size=256
ROW CACHE HASH TABLE: cid=11 ht=0x28535e04 size=65536
ROW CACHE HASH TABLE: cid=12 ht=0x285355a0 size=256
ROW CACHE HASH TABLE: cid=13 ht=0x285b5e68 size=128
ROW CACHE HASH TABLE: cid=14 ht=0x28638f24 size=16
ROW CACHE HASH TABLE: cid=15 ht=0x285b6b30 size=32
ROW CACHE HASH TABLE: cid=16 ht=0x285b6c94 size=65536
ROW CACHE HASH TABLE: cid=17 ht=0x2853453c size=512
ROW CACHE HASH TABLE: cid=18 ht=0x28637ec0 size=512
ROW CACHE HASH TABLE: cid=19 ht=0x28636e5c size=512
ROW CACHE HASH TABLE: cid=20 ht=0x2863cb30 size=65536
ROW CACHE HASH TABLE: cid=21 ht=0x286bcb94 size=65536
ROW CACHE HASH TABLE: cid=22 ht=0x2874577c size=1
ROW CACHE HASH TABLE: cid=23 ht=0x2873dc5c size=512
ROW CACHE HASH TABLE: cid=24 ht=0x28744f18 size=256
ROW CACHE HASH TABLE: cid=25 ht=0x287457e8 size=512
ROW CACHE HASH TABLE: cid=26 ht=0x2863cac4 size=1
ROW CACHE HASH TABLE: cid=27 ht=0x2873ecc0 size=512
ROW CACHE HASH TABLE: cid=28 ht=0x2873fd24 size=512
ROW CACHE HASH TABLE: cid=29 ht=0x28740d88 size=512
ROW CACHE HASH TABLE: cid=30 ht=0x28741dec size=512
ROW CACHE HASH TABLE: cid=31 ht=0x28742e50 size=512
ROW CACHE HASH TABLE: cid=32 ht=0x28743eb4 size=512
ROW CACHE HASH TABLE: cid=33 ht=0x28636cf8 size=32
ROW CACHE HASH TABLE: cid=34 ht=0x28639008 size=256
ROW CACHE HASH TABLE: cid=35 ht=0x2863986c size=256
ROW CACHE HASH TABLE: cid=36 ht=0x2863a0d0 size=256
ROW CACHE HASH TABLE: cid=37 ht=0x2863b198 size=256
ROW CACHE HASH TABLE: cid=38 ht=0x2863b9fc size=256
ROW CACHE HASH TABLE: cid=39 ht=0x2863c260 size=256
ROW CACHE HASH TABLE: cid=40 ht=0x2863a934 size=256

我们可以看到,确认是41个,跟我们查询v$latch_children是符合的.

+++++Session 2

SQL> alter session set tracefile_identifier='001';

Session altered.

SQL> oradebug setmypid
Statement processed.
SQL> oradebug dump row_cache 8;
Statement processed.
SQL> oradebug close_trace
Statement processed.
SQL> oradebug tracefile_name
/home/ora10g/admin/roger/udump/roger_ora_18777_001.trc

从上面的trace我们可以发现,保护dc_rollback_segments row cache的bucket有30个,如下:

[root@killdb ~]# cat /home/ora10g/admin/roger/udump/roger_ora_18777_001.trc |grep dc_rollback_segments|wc -l
30
[root@killdb ~]#

从该trace我们可以看到buckets的个数以及hash chain的情况,如下:

ROW CACHE HASH TABLE: cid=8 ht=0x284b44d8 size=65536
Buckets with more than 20 objects:
NONE
Hash Chain Size     Number of Buckets
---------------     -----------------
              0                 64900
              1                   636
              2                     0
              3                     0
              4                     0
              5                     0
              6                     0
              7                     0
              8                     0
              9                     0
             10                     0
             11                     0
             12                     0
             13                     0
             14                     0
             15                     0
             16                     0
             17                     0
             18                     0
             19                     0
             20                     0
            >20                     0

我们看到,此时有2种hash chain size,分别是0和1. 其中1 表示当前存放内容的hash chain,保护了636个buckets。 另外
hash chain size 为0的,表示这部分的64900 个bucket是空闲的,也就是说没有其中没有任何内容.

下面我们将参数_rollback_segment_count调整为6,然后再来观察下保护dc_rollback_segments的bucket有多少个。

注意:_rollback_segment_count参数是指数据库中保持online回滚段的个数(非SYSTEM回滚段,sysem回滚段总是online的)

+++++Session 3

SQL> alter system set "_rollback_segment_count"=6;

System altered.

SQL> select count(1) from dba_rollback_segs where status='ONLINE';

  COUNT(1)
----------
        11

SQL> shutdown abort;
ORACLE instance shut down.
SQL> 
SQL> startup
ORACLE instance started.

Total System Global Area  167772160 bytes
Fixed Size                  1272600 bytes
Variable Size             146801896 bytes
Database Buffers           16777216 bytes
Redo Buffers                2920448 bytes
Database mounted.
Database opened.
SQL> 
SQL> select count(1) from dba_rollback_segs where status='ONLINE';

  COUNT(1)
----------
         7


SQL> oradebug setmypid
Statement processed.
SQL> oradebug dump row_cache 2
Statement processed.
SQL> oradebug tracefile_name
/home/ora10g/admin/roger/udump/roger_ora_20510.trc
SQL> 
SQL> alter session set tracefile_identifier='001';

Session altered.

SQL> oradebug setmypid
Statement processed.
SQL> oradebug dump row_cache 8;
Statement processed.
SQL> oradebug close_trace
Statement processed.
SQL> oradebug tracefile_name
/home/ora10g/admin/roger/udump/roger_ora_20510_001.trc

我们来看下调整参数之后保护dc_rollback_segments 的bucket有没有变化:
[root@killdb ~]# cat /home/ora10g/admin/roger/udump/roger_ora_20510_001.trc|grep dc_rollback_segments|wc -l
30

我们这里来看似乎仍然是30,真的吗 ?我们先来看下此时一共有多个bucket:

ROW CACHE HASH TABLE: cid=8 ht=0x284b44d8 size=65536
Buckets with more than 20 objects:
NONE
Hash Chain Size     Number of Buckets
---------------     -----------------
              0                 65316
              1                   220
              2                     0
              3                     0
              4                     0
              5                     0
              6                     0
              7                     0
              8                     0
              9                     0
             10                     0
             11                     0
             12                     0
             13                     0
             14                     0
             15                     0
             16                     0
             17                     0
             18                     0
             19                     0
             20                     0
            >20                     0

当前库中此时使用的bucket有220个,因为我将db重启了,所以这个比之前要少,是正常的。 我们重点是来看
保护dc_rollback_segments的bucket是不是有所变化 ?

打开trace文件,搜索发现确认有所降低,如下:

BUCKET 48:
  row cache parent object: address=0x29ae4dc0 cid=3(dc_rollback_segments)
  hash=35e74caf typ=5 transaction=(nil) flags=00000002
  own=0x29ae4e2c[0x29ae4e2c,0x29ae4e2c] wat=0x29ae4e34[0x29ae4e34,0x29ae4e34] mode=N
  status=VALID/-/-/-/-/-/-/-/-
  data=
  00000001 00000001 00000002 00000009 535f0009 4d535359 00243155 00000000 
  00000000 00000000 00000000 00000000 00000001 00000001 0000041d 000003f2 
  005594c4 00000000 00000000 00000001 
  BUCKET 48 total object count=3

BUCKET 86:
  row cache parent object: address=0x29ae4224 cid=3(dc_rollback_segments)
  hash=75647f55 typ=5 transaction=(nil) flags=00000002
  own=0x29ae4290[0x29ae4290,0x29ae4290] wat=0x29ae4298[0x29ae4298,0x29ae4298] mode=N
  status=VALID/-/-/-/-/-/-/-/-
  data=
  00000008 00000001 00000002 00000079 535f0009 4d535359 00243855 00000000 
  00000000 00000000 00000000 00000000 00000001 00000001 00000473 00000447 
  005595da 00000000 00000000 00000001 
  BUCKET 86 total object count=2

BUCKET 91:
  row cache parent object: address=0x29ae3ed0 cid=3(dc_rollback_segments)
  hash=3664ff5a typ=5 transaction=(nil) flags=00000002
  own=0x29ae3f3c[0x29ae3f3c,0x29ae3f3c] wat=0x29ae3f44[0x29ae3f44,0x29ae3f44] mode=N
  status=VALID/-/-/-/-/-/-/-/-
  data=
  0000000b 00000001 00000002 00000529 42520007 30305f53 00000031 00000000 
  00000000 00000000 00000000 00000000 00000001 00000001 00000002 00000004 
  002e6f68 00000000 00000000 00000000 
  BUCKET 91 total object count=2

我们可以看到,此时虽然count显示是有30个bucket在保护dc_rollback_segments,然而实际上
bucket 48/86/91分别保护了3、2、2个object. 换句话讲,实际上用到的bucket就应该是30-(3+2+2)+3=26.

从上面的实验来看,_rollback_segment_count参数确认影响row cache中保护dc_*对象的bucket数量,
总体原则就是参数越大,bucket就越多,反之亦然.

简单的总结一下:
1. _rollback_segment_count参数是控制非SYSTEM回滚段处于online状态的个数;
2. _rollback_segment_count参数越大,保持online状态的回滚段就越多。9i默认情况下oracle会自己去进行判断,
有自己的算法,随着不停的inactive和active操作,smon去offline和online回滚段,会加剧系统资源的消耗;
3. hash bucket数量越多,那么也就越不容易导致竞争。

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论