暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

library cache: mutex X等待事件, blocker session on cpu

张维照 2019-05-31
1785

问题描述

library cache: mutex X等待事件是当一个会话以 exclusive mode持有library cache mutex时,另一个会话当前无论何时申请此mutex时必须等待其释放。很多时候对 library cache做不同的操作时都需要申请一个mutex, 所以最重要的是确认mutex的位置”location”,  该位置有助于分析该类等待事件的原因。

对应参数:

P1 = “idn” = Unique Mutex Identifier
P2 = Mutex “value” = in order to find the mutex value, read note 1298015.1
P3 = “where” = location in code (internal identifier) where mutex is being waited for

P1 is idn – can be used for finding the cursor related to mutex

For library cache: mutex X

1. P1 represents library cache hash bucket number (if idn <= 131072)

2. P1 represents hash value of the library cache object under protection (if idn > 131072)

mutex的问题排查较为复杂,通常一个现象有多个原因,并且mutex问题又会引起其它级连问题。常见原因:

1. 频繁的hard parse

2. High sql version count,需要检查一个很长的chain of versions

3. 引为某些原因导致的Invalidations and reloads

4. shared pool配置过小

5. cursor_sharing=similar和session_cached_cursors配置不当

6. BUGs


专家解答

类问题时通常需要引集的信息:

• AWR Report
• ASH Report
• Hanganalyze trace
• Systemstate dump (contain call stack level)

系统级可用的相关视图:GV$MUTEX_SLEEP 和GV$MUTEX_SLEEP_HISTORY

分析该事件的相关SQL:

-- which library cache objects are the target of the operation.
select * 
from 
   (select 
      case when (kglhdadr = kglhdpar) 
      then 'Parent' 
      else 'Child '||kglobt09 end cursor,  
      kglhdadr ADDRESS, 
      substr(kglnaobj,1,20) NAME, 
      kglnahsh HASH_VALUE, 
      kglobtyd TYPE,  
      kglobt23 LOCKED_TOTAL, 
      kglobt24 PINNED_TOTAL,
      kglhdexc EXECUTIONS, 
      kglhdnsp NAMESPACE  
   from 
      x$kglob 
     -- where kglobtyd != 'CURSOR' 
   order by 
      kglobt24 desc) 
where 
   rownum <= 10; 

-- to find blocker session 
SELECT count (*), to_number(substr(to_char(rawtohex(p2raw)), 1, 8),'XXXXXXXX') blocking_sid 
FROM v$session 
WHERE event = 'library cache: mutex X' 
group by to_number(substr(to_char(rawtohex(p2raw)),1, 8), 'XXXXXXXX'); 

-- to find mutex from ash 
select event, p1, count(1) 
from v$active_session_history 
where sample_time > (sysdate - 20/1440)
and   event = 'library cache: mutex X' 
group by 
   event, p1 
order by 3; 


-- if idn > 131072, the P1 is the object the mutex is against.For the topmost P1 run:
select KGLNAOBJ, KGLNAOWN, KGLHDNSP, KGLOBTYP 
from x$kglob 
where KGLNAHSH= {value of P1}

最近遇到过一个案例数据库几个session 在执行相同的insert values sql时library cache: mutex X争用严重.

SQL> select child_number,count(*)  from v$sql where sql_id='b92jmb1qngsyd' group by child_number;

CHILD_NUMBER   COUNT(*)
------------ ----------
           0         47
           1         47
           2         47
           3         47
           4         47
           5         46
           6         46
           7         47
           8         48
           9         46
          10         47
          11         47
          12         47
          13         47
          14         47
          15         48
          16         47
          17         47
 ...
          83         47
          84         47
          85         47
          86         47
          87         47
          88         47
          89         47
          90         47
          91         47
          92         47
          93         46
          94         47
          95         47
          96         47
          97         47
          98         47
          99         47

100 rows selected.

sql> @no_shared

SQL_ID        NONSHARED_REASON                COUNT(*)
------------- ----------------------------- ----------
b92jmb1qngsyd BIND_MISMATCH                       4730
b92jmb1qngsyd PURGED_CURSOR                          2
b92jmb1qngsyd HASH_MATCH_FAILED                      1
b92jmb1qngsyd BIND_LENGTH_UPGRADEABLE              952

select /*+rule*/m.position,m.bind_name , m.max_length,count(*) child_cursor_count
     from v$sql s, v$sql_bind_metadata m
     where s.sql_id =  'b92jmb1qngsyd'
     and s.child_address = m.address group by m.position,m.bind_name , m.max_length
     order by 1, 2;

-- 确认存在部分变量varchar长度区间问题

SQL> select mutex_type,location_id,location,sleeps from x$mutex_sleep where location_id=85;

MUTEX_TYPE                       LOCATION_ID LOCATION                                     SLEEPS
-------------------------------- ----------- ---------------------------------------- ----------
Library Cache                             85 kgllkdl1  85                               15001254

当时有做SSD 但级别不够(10), 确认了blocker 是no in wait, 并且是on cpu, 执行的是相同的insert sql, 该sql是应用同步数据使用执行频率较高。

当时也有做errorstack

 ----- Abridged Call Stack Trace -----
146297  ksedsts()+544<-kjzdssdmp()+400<-kjzdpcrshnfy()+512<-kstdmp()+416<-dbkedDefDump()+6032<-ksedmp()+64<-ksdxfdmp()+1360<
-ksdxcb()+3216<-sspuser()+688<-<kernel><-kglrdtin()+1121<-kglrdti()+80<-kksAllocCursorStat()+1152<-kksLoadChild()+14400<-kkslod()+112<-kglobld()+1872
146298  <-kglobpn()+1744<-kglpim()+832<-kglpin()+2496<-kxsGetRuntimeLock()+1840
146299  
----- End of Abridged Call Stack Trace -----

以call stack和event为关键字,在MOS中发现一篇Session Spin on Kglrdtin Holding ‘Library Cache: Mutex X’ (文档 ID 2219897.1)
和当前的版本现象较为相似因当前的平台hpux无相关one off patch,且无升级11.2.0.4计划记划,这里只记录一下该问题。

— another case

SQL>  select count(*) from v$sql where sql_id='4uqp4jczbf1qt';

  COUNT(*)
----------
     21372

SQL> select version_count from v$sqlarea where sql_id='4uqp4jczbf1qt';

VERSION_COUNT
-------------
           74

SQL> @no_shared
Enter value for sqlid: 4uqp4jczbf1qt

SQL_ID        NONSHARED_REASON                COUNT(*)
------------- ----------------------------- ----------
4uqp4jczbf1qt BIND_MISMATCH                      21428
4uqp4jczbf1qt HASH_MATCH_FAILED                    290
4uqp4jczbf1qt BIND_LENGTH_UPGRADEABLE            17489

SQL> @pd obso
Show all parameters and session values from x$ksppi/x$ksppcv...

      INDX I_HEX NAME                             VALUE      DESCRIPTION
---------- ----- -------------------------------- ---------- ------------------------------------------------
      1723   6BB _cursor_obsolete_threshold       100        Number of cursors per parent before obsoletion. 

SQL>  select /*+rule*/ m.position,m.bind_name , m.max_length,count(*) child_cursor_count
  2      from v$sql s, v$sql_bind_metadata m
  3      where s.sql_id =  '4uqp4jczbf1qt'
  4      and m.datatype=1
  5      and s.child_address = m.address group by m.position,m.bind_name , m.max_length
  6      order by 1, 2;    

  POSITION BIND_NAME                      MAX_LENGTH CHILD_CURSOR_COUNT
---------- ------------------------------ ---------- ------------------
         1 V0                                     32              22192
         2 V1                                     32              22192
         7 V6                                     32              22192
         8 V7                                     32              22192
         9 V8                                     32              22192
        10 V9                                     32              22192
        12 V11                                    32              22192
        13 V12                                    32              18682
        13 V12                                   128               3188
        13 V12                                  2000                322
        16 V15                                    32              22192
        17 V16                                    32              21977
        17 V16                                   128                215
        18 V17                                    32              22192
        19 V18                                    32              22192
        20 V19                                    32              22192
        21 V20                                    32              22192
        22 V21                                    32              22192
        30 V29                                    32              22192
        31 V30                                    32              22192
        32 V31                                    32              22192
        33 V32                                    32              22192
        34 V33                                    32              22192
        35 V34                                    32              22192
        36 V35                                    32              22192
        37 V36                                    32              22192
        38 V37                                    32              22192
        39 V38                                    32              20037
        39 V38                                   128               2155
        42 V41                                    32              22192
        43 V42                                    32              22192
        47 V46                                    32              22192

Event 10503  level <buffer length>
EVENT 10503 enables users to specify a character bind buffer length. Depending on the length used, the character binds in the child cursor can all be created using the same bind length, skipping bind graduation and keeping the child chain relatively small. This helps to alleviate a potential cursor-sharing problem related to graduated binds.

SQL> alter system set events='10503 trace name context forever, level 2000';

System altered.

SQL> oradebug setmypid
Statement processed.
SQL> oradebug eventdump system
10503 trace name context forever, level 2000
10949 trace name context forever,level 1
28401 trace name context forever,level 1

SQL> select address,hash_value from v$sqlarea where sql_id='4uqp4jczbf1qt';
ADDRESS          HASH_VALUE
---------------- ----------
C000001F574C4F90 1052182233

SQL> exec dbms_shared_pool.purge('C000001F574C4F90,1052182233','C',65);
PL/SQL procedure successfully completed.

SQL> select address,hash_value,last_load_time from v$sqlarea where sql_id='4uqp4jczbf1qt';
ADDRESS          HASH_VALUE LAST_LOAD_TIME
---------------- ---------- -----------------
C000001F574C4F90 1052182233 20190211 15:00:02

SQL> select count(*) from v$sql where sql_id='4uqp4jczbf1qt';
  COUNT(*)
----------
     22700

SQL> alter system flush shared_pool;
System altered.

SQL>  select count(*) from v$sql where sql_id='4uqp4jczbf1qt';
  COUNT(*)
----------
         0

SQL> select /*+rule*/ m.position,m.bind_name , m.max_length,count(*) child_cursor_count
  2       from v$sql s, v$sql_bind_metadata m
  3       where s.sql_id =  '4uqp4jczbf1qt'
  4       and m.datatype=1
  5       and s.child_address = m.address group by m.position,m.bind_name , m.max_length
  6       order by 1, 2;     

  POSITION BIND_NAME                      MAX_LENGTH CHILD_CURSOR_COUNT
---------- ------------------------------ ---------- ------------------
         1 V0                                   2000                 86
         2 V1                                   2000                 86
         7 V6                                   2000                 86
         8 V7                                   2000                 86
         9 V8                                   2000                 86
        10 V9                                   2000                 86
        12 V11                                  2000                 86
        13 V12                                  2000                 86
        16 V15                                  2000                 86
        17 V16                                  2000                 86
        18 V17                                  2000                 86
        19 V18                                  2000                 86
        20 V19                                  2000                 86
        21 V20                                  2000                 86
        22 V21                                  2000                 86
        30 V29                                  2000                 86
        31 V30                                  2000                 86
        32 V31                                  2000                 86
        33 V32                                  2000                 86
        34 V33                                  2000                 86
        35 V34                                  2000                 86
        36 V35                                  2000                 86
        37 V36                                  2000                 86
        38 V37                                  2000                 86
        39 V38                                  2000                 86
        42 V41                                  2000                 86
        43 V42                                  2000                 86
        47 V46                                  2000                 86


「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论