
1. 主机cpu和内存负载都不高


2. 检查mmon进程运行状态正常

3. 检查snapshot保留时间和快照间隔设置正常,发现DBA_HIST_ACTIVE_SESS_HISTOR有数据,但是没有生成snapshot保留时间和快照间隔:

DBA_HIST_ACTIVE_SESS_HISTOR数据:

4. 检查数据库日志,发现以下错误


节点1 flush AWR相关数据,有个wait event: enq: WF – contention,节点2有个wait event: not in wait,通过SessId,Ser# 可以查询到执行的问题sql和执行计划:
▼▼▼SQL_ID 82h7w3w07k63z, child number 0-------------------------------------insert into wrh$_sql_bind_metadata (snap_id, dbid, sql_id, name,position, dup_position, datatype, datatype_string, character_sid,precision, scale, max_length) SELECT /*+ ordered use_nl(bnd) index(bndsql_id) */ :lah_snap_id, :dbid, bnd.sql_id, name, position,dup_position, datatype, datatype_string, character_sid,precision, scale, max_length FROM x$kewrattrnew new,x$kewrsqlidtab tab, v$sql_bind_capture bnd WHEREnew.str1_kewrattr = tab.sqlid_kewrsie AND tab.sqlid_kewrsie = bnd.sql_id AND tab.childaddr_kewrsie = bnd.child_addressPlan hash value: 4222011306-----------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|-----------------------------------------------------------------------------------| 0 | INSERT STATEMENT | | | | 1 (100)|| 1 | LOAD TABLE CONVENTIONAL | | | | || 2 | NESTED LOOPS | | 1 | 217 | 0 (0)||*3 | HASH JOIN | | 1 | 48 | 0 (0)|| 4 | FIXED TABLE FULL | X$KEWRATTRNEW | 1 | 34 | || 5 | FIXED TABLE FULL | X$KEWRSQLIDTAB | 1 | 14 | ||*6 | FIXED TABLE FIXED INDEX| X$KQLFBC (ind:2) | 1 | 169 | 0 (0)|-----------------------------------------------------------------------------------Predicate Information (identified by operation id):--------------------------------------------------- 3 - access("NEW"."STR1_KEWRATTR"="TAB"."SQLID_KEWRSIE") 6 - filter(("INST_ID"=USERENV('INSTANCE') AND "TAB"."SQLID_KEWRSIE"="KQLFBC_SQLID" AND "TAB"."CHILDADDR_KEWRSIE"="KQLFBC_CADD"))
文档建议收集FIXED TABLE(X$KEWRATTRNEW ,X$KEWRSQLIDTAB)统计信息,收集完统计信息,问题依旧存在,没有解决。
注意:FIXED TABLE(FIXED TABLE),存在过多过期SQL父游标,收集该表统计信息会hang住,详见Bug 22733141 awr snapshots not created, gathering stats on x$kqlfbc hangs。
▼▼▼exec dbms_stats.gather_table_stats('SYS', 'X$KEWRATTRNEW');exec dbms_stats.gather_table_stats('SYS', 'X$KEWRSQLIDTAB');
文档建议的刷新shared_pool,因为是生产环境,没有尝试操作。
最终选择业务停机维护时段,重新启动数据库释放X$KQLFBC表数据,暂时问题解决,开始生成新的awr报告。

填充AWR wrh$_sql_bind_metadata 表来自X$KQLFBC table,如果X$KQLFBC 中的过期的SQL父游标过多,就导致X$KQLFBC查询缓慢,而导致AWR后台进程m00* 进程因超时而挂起,通过跳过收集与该表相关的数据,避免该问题。
▼▼▼alter system set "_awr_disabled_flush_tables" = 'wrh$_sql_bind_metadata';

更多精彩干货分享
点击下方名片关注
IT那活儿





