暂无图片
暂无图片
5
暂无图片
暂无图片
暂无图片

【记一次Oracle归档量爆增及分析思路】

原创 张sir 2022-12-06
2473

前言

         前一段时间巡检发现一套数据库归档量近几日发生了爆增,虽然最终定位并非数据库自身原因,这里也总结下分析步骤和思路。

分析步骤

          1、确认日志切换频率,明显30号和1号日志切换比之前增长一倍。

select to_char(first_time,'YYYY-MON-DD') "Date", to_char(first_time,'DY') day, to_char(sum(decode(to_char(first_time,'HH24'),'00',1,0)),'999') "00", to_char(sum(decode(to_char(first_time,'HH24'),'01',1,0)),'999') "01", to_char(sum(decode(to_char(first_time,'HH24'),'02',1,0)),'999') "02", to_char(sum(decode(to_char(first_time,'HH24'),'03',1,0)),'999') "03", to_char(sum(decode(to_char(first_time,'HH24'),'04',1,0)),'999') "04", to_char(sum(decode(to_char(first_time,'HH24'),'05',1,0)),'999') "05", to_char(sum(decode(to_char(first_time,'HH24'),'06',1,0)),'999') "06", to_char(sum(decode(to_char(first_time,'HH24'),'07',1,0)),'999') "07", to_char(sum(decode(to_char(first_time,'HH24'),'08',1,0)),'999') "08", to_char(sum(decode(to_char(first_time,'HH24'),'09',1,0)),'999') "09", to_char(sum(decode(to_char(first_time,'HH24'),'10',1,0)),'999') "10", to_char(sum(decode(to_char(first_time,'HH24'),'11',1,0)),'999') "11", to_char(sum(decode(to_char(first_time,'HH24'),'12',1,0)),'999') "12", to_char(sum(decode(to_char(first_time,'HH24'),'13',1,0)),'999') "13", to_char(sum(decode(to_char(first_time,'HH24'),'14',1,0)),'999') "14", to_char(sum(decode(to_char(first_time,'HH24'),'15',1,0)),'999') "15", to_char(sum(decode(to_char(first_time,'HH24'),'16',1,0)),'999') "16", to_char(sum(decode(to_char(first_time,'HH24'),'17',1,0)),'999') "17", to_char(sum(decode(to_char(first_time,'HH24'),'18',1,0)),'999') "18", to_char(sum(decode(to_char(first_time,'HH24'),'19',1,0)),'999') "19", to_char(sum(decode(to_char(first_time,'HH24'),'20',1,0)),'999') "20", to_char(sum(decode(to_char(first_time,'HH24'),'21',1,0)),'999') "21", to_char(sum(decode(to_char(first_time,'HH24'),'22',1,0)),'999') "22", to_char(sum(decode(to_char(first_time,'HH24'),'23',1,0)),'999') "23" , count(*) Total from v$log_history group by to_char(first_time,'YYYY-MON-DD'), to_char(first_time,'DY') order by to_date(to_char(first_time,'YYYY-MON-DD'),'YYYY-MON-DD');


Date	       DAY    00   01	02   03   04   05   06	 07   08   09	10   11   12   13   14	 15   16   17	18   19   20   21   22	 23	   TOTAL
-------------- ------ ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----------
2022-NOV-27    SUN	71   33   52   36   53	 51   68   45	56   58   51   56   74	 56   49   63	51   50   78   54   54	 47   33   59	    1298
2022-NOV-28    MON	59   32   54   44   52	 39   65  113  117   96   65  148  110	 79   62   67	97   56   84   47   55	 58   76   56	    1731
2022-NOV-29    TUE	60   32   54   36   60	 42   70   47	60   59   44   56   63	 51   49   51	48   51   83  112  122	138  136  164	    1688
2022-NOV-30    WED     124   83   97  102  121	116  142   84  119  126  100  110  142	 87  116  124	97  137  165  102  117	126   68  137	    2742
2022-DEC-01    THU     133   82   90  106  114	124  150  123	95  131  102  114  130	104  118  181	69    0    0	0    0	  0    0    0	    1966

        2、查看每日的日志量,日志切换频繁不见得日志量就大,还是要确认下日志量的情况。从查询结果看,最近两天单日归档量快到4T了。日常也就2T左右。

  select dest_id,to_char(first_time,'yyyy-mm-dd') TIME,count(*),sum(blocks*block_size)/1024/1024/1024 GB 
  from v$archived_log   =================》即使是rac也无需使用gv$视图。
  where dest_id=1
  and
  first_time>to_date('20221126 00:00','yyyymmdd hh24:mi')
  and
  first_time<to_date('20221201 18:59','yyyymmdd hh24:mi')
  group by dest_id,to_char(first_time,'yyyy-mm-dd')
  order by 1,2;   DEST_ID TIME 	COUNT(*)	 GB
---------- ---------- ---------- ----------
	 1 2022-11-26	    1308 1949.53655
	 1 2022-11-27	    1298  1890.3195
	 1 2022-11-28	    1731 2526.82131
	 1 2022-11-29	    1688 2589.85447
	 1 2022-11-30	    2742 3995.80157
	 1 2022-12-01	    2405  3509.1859

         3、确认下活动会话是否有增长,如果活动会话有增长的话,可能是由于交易量上升导致的归档量变大。根据查询结果,除去两个异常时段有会话数激增,其他时段并没有发现活跃会话有明显的增长。

11-29号:
select to_char(sample_time,'yyyy/mm/dd hh24'),count(*) from dba_hist_active_sess_history where 
 sample_time>to_date('20221129 06:00','yyyymmdd hh24:mi')
and sample_time<to_date('20221129 12:00','yyyymmdd hh24:mi')
group by to_char(sample_time,'yyyy/mm/dd hh24') order by 1;

TO_CHAR(SAMPL	COUNT(*)
------------- ----------
2022/11/29 06	    4869
2022/11/29 07	    4518
2022/11/29 08	    8142
2022/11/29 09	  108017   ============》当天出现数据库异常,导致会话堵塞。
2022/11/29 10	   85392
2022/11/29 11	    3078

11-30号:
select to_char(sample_time,'yyyy/mm/dd hh24'),count(*) from dba_hist_active_sess_history where 
 sample_time>to_date('20221130 06:00','yyyymmdd hh24:mi')
and sample_time<to_date('20221130 12:00','yyyymmdd hh24:mi')
group by to_char(sample_time,'yyyy/mm/dd hh24') order by 1;

TO_CHAR(SAMPL	COUNT(*)
------------- ----------
2022/11/30 06	    4392
2022/11/30 07	   10890   ==================》当天由于归档目录满了,导致会话堵塞。
2022/11/30 08	    4963
2022/11/30 09	    4722
2022/11/30 10	    3549
2022/11/30 11	    4070

12-1
select to_char(sample_time,'yyyy/mm/dd hh24'),count(*) from dba_hist_active_sess_history where 
 sample_time>to_date('20221201 06:00','yyyymmdd hh24:mi')
and sample_time<to_date('20221201 12:00','yyyymmdd hh24:mi')
group by to_char(sample_time,'yyyy/mm/dd hh24') order by 1;

TO_CHAR(SAMPL	COUNT(*)
------------- ----------
2022/12/01 06	    4481
2022/12/01 07	    4146
2022/12/01 08	    5200
2022/12/01 09	    4671
2022/12/01 10	    4283
2022/12/01 11	    3394

         4、参考mos文档How to identify the causes of High Redo Generation (Doc ID 2265722.1),选取日志生成量比较大的06-07点,查看block changed的对象情况。

对比了三天的block changed,发现12月1号和11月30号的TOP4都是一样的,而在11月29号中并没有这几个对象,这几个对象从名字看应该是一个表和三个索引。12月1号早晨6点-7点:SELECT to_char(begin_interval_time,'YY-MM-DD HH24') snap_time, dhso.object_name, sum(db_block_changes_delta) BLOCK_CHANGED FROM dba_hist_seg_stat dhss, dba_hist_seg_stat_obj dhso, dba_hist_snapshot dhs WHERE dhs.snap_id = dhss.snap_id AND dhs.instance_number = dhss.instance_number AND dhss.obj# = dhso.obj# AND dhss.dataobj# = dhso.dataobj# AND begin_interval_time BETWEEN to_date('22-12-01 06:00','YY-MM-DD HH24:MI') 
AND to_date('22-12-01 07:00','YY-MM-DD HH24:MI')  GROUP BY to_char(begin_interval_time,'YY-MM-DD HH24'), dhso.object_name HAVING sum(db_block_changes_delta) > 0 ORDER BY sum(db_block_changes_delta) desc ;

SNAP_TIME   OBJECT_NAME 		   BLOCK_CHANGED
----------- ------------------------------ -------------
22-12-01 06 IX_GTMP_WIND_RECORD_LOG	       106597968
22-12-01 06 IX2_GTMP_WIND_RECORD_LOG		68668736
22-12-01 06 IX1_GTMP_WIND_RECORD_LOG		48441232
22-12-01 06 GTMP_WIND_RECORD_LOG		46247936
22-12-01 06 GZ45_VN_BM_BAL			21052960
22-12-01 06 ** TRANSIENT: 23179648		18150128
22-12-01 06 PK_O32_TTRADESTOCK			16278256
22-12-01 06 PK_GZ45_VN_DAYF_CONFIRM		 7566992
11月30号早晨6点到7点:SELECT to_char(begin_interval_time,'YY-MM-DD HH24') snap_time, dhso.object_name, sum(db_block_changes_delta) BLOCK_CHANGED FROM dba_hist_seg_stat dhss, dba_hist_seg_stat_obj dhso, dba_hist_snapshot dhs WHERE dhs.snap_id = dhss.snap_id AND dhs.instance_number = dhss.instance_number AND dhss.obj# = dhso.obj# AND dhss.dataobj# = dhso.dataobj# AND begin_interval_time BETWEEN to_date('22-11-30 06:00','YY-MM-DD HH24:MI') AND to_date('22-11-30 07:00','YY-MM-DD HH24:MI') GROUP BY to_char(begin_interval_time,'YY-MM-DD HH24'), dhso.object_name HAVING sum(db_block_changes_delta) > 0 ORDER BY sum(db_block_changes_delta) desc ; SNAP_TIME OBJECT_NAME BLOCK_CHANGED ----------- ------------------------------ ------------- 22-11-30 06 IX_GTMP_WIND_RECORD_LOG 59098880 22-11-30 06 IX2_GTMP_WIND_RECORD_LOG 44366832 22-11-30 06 IX1_GTMP_WIND_RECORD_LOG 39393808 22-11-30 06 GTMP_WIND_RECORD_LOG 24130704 22-11-30 06 GZ45_VN_BM_BAL 21051680 22-11-30 06 PK_O32_TTRADESTOCK 16285792 22-11-30 06 GZ45_T_R_FR_ASTSTAT 6459584 22-11-30 06 IX_GZ45_VN_BM_BAL 5837776
11月29号早晨6点-7点:SELECT to_char(begin_interval_time,'YY-MM-DD HH24') snap_time, dhso.object_name, sum(db_block_changes_delta) BLOCK_CHANGED FROM dba_hist_seg_stat dhss, dba_hist_seg_stat_obj dhso, dba_hist_snapshot dhs WHERE dhs.snap_id = dhss.snap_id AND dhs.instance_number = dhss.instance_number AND dhss.obj# = dhso.obj# AND dhss.dataobj# = dhso.dataobj# AND begin_interval_time BETWEEN to_date('22-11-29 06:00','YY-MM-DD HH24:MI') AND to_date('22-11-29 07:00','YY-MM-DD HH24:MI') GROUP BY to_char(begin_interval_time,'YY-MM-DD HH24'), dhso.object_name HAVING sum(db_block_changes_delta) > 0 ORDER BY sum(db_block_changes_delta) desc ; SNAP_TIME OBJECT_NAME BLOCK_CHANGED ----------- ------------------------------ ------------- 22-11-29 06 GZ45_VN_BM_BAL 21660112 22-11-29 06 PK_O32_TTRADESTOCK 16292176 22-11-29 06 IDX_ODS_WIND_CBONDAGENCY 11994048 22-11-29 06 PK_ODS_WIND_CBONDAGENCY 9990224 22-11-29 06 O32_TTRADESTOCK 9007728 22-11-29 06 ODS_WIND_CBONDAGENCY 7276320 22-11-29 06 GZ45_T_R_FR_ASTSTAT 6671232 22-11-29 06 IX_GZ45_VN_BM_BAL 5930496

         5、根据以上对象可以找到执行的相应的SQL语句。

SELECT to_char(begin_interval_time,'YYYY_MM_DD HH24') WHEN, dbms_lob.substr(sql_text,4000,1) SQL, dhss.instance_number INST_ID, dhss.sql_id, executions_delta exec_delta, rows_processed_delta rows_proc_delta 
FROM dba_hist_sqlstat dhss, dba_hist_snapshot dhs, dba_hist_sqltext dhst 
WHERE upper(dhst.sql_text) LIKE '%GTMP_WIND_RECORD_LOG%'
 AND ltrim(upper(dhst.sql_text)) NOT LIKE 'SELECT%' 
AND dhss.snap_id=dhs.snap_id 
AND dhss.instance_number=dhs.instance_number 
AND dhss.sql_id=dhst.sql_id 
AND begin_interval_time 
BETWEEN to_date('22-12-01 06:00','YY-MM-DD HH24:MI')
AND to_date('22-12-01 07:00','YY-MM-DD HH24:MI');

   INST_ID SQL_ID	 EXEC_DELTA ROWS_PROC_DELTA
---------- ------------- ---------- ---------------
2022_12_01 06
INSERT INTO ETL.GTMP_WIND_RECORD_LOG (ID, REC_ID, OPMODE, OPDATE, TABLE_NAME) SE
LECT ID, REC_ID, OPMODE, OPDATE, TABLENAME FROM WINDDF.RECORD_LOG@WINDNEWDB A WH
ERE A.TABLENAME=:B1
	 2 61z1sgnsccx6m	184	   96347952


   INST_ID SQL_ID	 EXEC_DELTA ROWS_PROC_DELTA
---------- ------------- ---------- ---------------
DELETE FROM ETL.GTMP_WIND_RECORD_LOG A WHERE A.ID>:B2 OR A.ID<:B1
	 2 d7k797d3827cg	221	   96610616

        6、其实上面的结论从awr报告中也能的出来,但是awr报告只能显示top5,没办法看到后面的。当我想从awr报告里证明上述结论的时候,我发现这个处于TOP的对象竟然是TEMP表。按我以前的理解,对临时表的操作是不会产生redo的。从度娘上查了下,临时表本身的操作虽然不会产生redo,但是对临时表操作需要产生undo,产生undo自然会产生redo,只是比普通表产生的redo要少很多。


总结

    日常碰到的日志量突增大部分都是交易量上涨、业务逻辑调整等导致的,但是如果你直接去问开发人员,他可能会说啥也没变,这个时候你就会挺懵逼(这是本人亲身经历哈)。如果你拿着你查到的东西去找他,他可能就不会这么硬气了。所以啊,虽然可能不是数据库的问题,但作为万能的DBA,你也的会查啊。

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论