
ORACLE的数据重演是个比较好的功能,比其他压力测试来的真实。
比如说更改某个参数,会带来什么影响呢? 增加内存又会带来什么影响呢? 这些是压力测试软件无法做到的。
那ORACLE重演功能 是抓获线上,生产数据库的真实负载或者是说请求,把它保存在文件中,然后移到测试机器上进行重演,或者叫重放。
接下来我们做做测试看看怎么玩!
目前本仙是两台组件的RAC

#在所有节点新建目录
mkdir backup/ora_cap_rep/
mkdir backup/ora_cap_rep/
sqlpllus as sysdba
#创建捕获目录
CREATE OR REPLACE DIRECTORY db_replay_capture_dir AS '/backup/ora_cap_rep/';
#创建捕获过滤,不想捕获所有的负载 专门捕获节点2 用户 CMS
exec dbms_workload_capture.ADD_FILTER('CMS_CAP_FILTER01','USER', 'CMS');
exec dbms_workload_capture.ADD_FILTER('CMS_CAP_FILTER02','INSTANCE_NUMBER', '2');
#过滤器解释 第一个参数是过滤名字,
第二个参数是过滤条件,
第三个是条件值
INSTANCE_NUMBER;USER;MODULE;ACTION;PROGRAM;SERVICE
开始捕获前生成个快照:
Exec dbms_workload_repository.create_snapshot;
#开始捕获 捕获1个小时 default_action 表示包含和排除,默认包含则过滤器为排除条件,如果是排除,过滤器为包含条件
BEGIN
DBMS_WORKLOAD_CAPTURE.start_capture (
name => 'test_capture_1',
dir => 'DB_REPLAY_CAPTURE_DIR',
default_action => 'EXCLUDE',
duration => 3600
);
END;
/
# duration => null 表示需要手工停止
# 停止捕获工作量
#exec dbms_workload_capture.finish_capture();
#ERROR
ORA-15505: cannot start workload capture because instance 2 encountered errors while accessing directory "/backup/ora_cap_rep/"
说明目录不干净 删除重建
等待1个小时后 从告警文件可以看到
DBMS_WORKLOAD_CAPTURE.START_CAPTURE(): Starting database capture at 04/11/2019 15:45:14
Thu Apr 11 16:45:45 2019
DBMS_WORKLOAD_CAPTURE.FINISH_CAPTURE(): Stopped database capture successfully at 04/11/2019 16:45:45
查看捕获的信息:
#获取capture ID
SELECT DBMS_WORKLOAD_CAPTURE.get_capture_info('DB_REPLAY_CAPTURE_DIR') FROM dual;
SELECT id, name FROM dba_workload_captures;
导出AWR快照
BEGIN
DBMS_WORKLOAD_CAPTURE.export_awr (capture_id => 8);
END;
/
再看目录 多了两个
wcr_ca.log
wcr_ca.dmp
收集报告
Replay报告有几种,本文主要介绍replay report、compare period report两种。
replay reports将replay执行时的性能统计信息列出来,并与capture作简单的比较
compare period report侧重于在两个不同的replay之间,或者replay和capture之间
从数据库参数配置、主机硬件环境、Top SQL耗用资源等方面形成全方位的诊断报告
捕获报告:
文本模式的捕获报告:
SQL> set pagesize 0 long 30000000 longchunksize 1000
SQL> select dbms_workload_capture.report(1,'TEXT') from dual;
HTML模式捕获报告
set serveroutput on
spool home/oracle/capture01.html
set pagesize 20000
set long 200000
DECLARE
v_offset number;
v_length number;
v_nowlength number;
v_char1 varchar2(32767);
v_reprpt CLOB;
BEGIN
v_reprpt := DBMS_WORKLOAD_CAPTURE.report(capture_id=> 1, format=>DBMS_WORKLOAD_CAPTURE.TYPE_HTML);
v_nowlength:=1;
v_length:=dbms_lob.getlength(lob_loc=>v_reprpt);
v_offset:=1;
while ( v_offset < v_length ) loop
v_char1:=dbms_lob.substr(lob_loc=>v_reprpt,offset=>v_offset);
dbms_output.put_line(v_char1);
v_offset:=v_offset+32767;
end loop;
end;
/
spool off
Captured Workload Statistics
'Value' represents the corresponding statistic aggregated across the entire captured database workload.
'% Total' is the percentage of 'Value' over the corresponding system-wide aggregated total.
| Statistic Name | Value | % Total |
|---|---|---|
| DB time (secs) | 2251.60 | 90.13 |
| Average Active Sessions | 0.63 | |
| User calls captured | 638370 | 96.19 |
| User calls captured with Errors | 2 | |
| Session logins | 123 | 28.28 |
| Transactions | 42715 | 98.34 |
Top Events Filtered Out
| Event | Event Class | % Event | Avg Active Sessions |
|---|---|---|---|
| CPU + Wait for CPU | CPU | 84.43 | 0.79 |
| db file sequential read | User I/O | 4.02 | 0.04 |
| log file sync | Commit | 2.48 | 0.02 |
重放部分:
把目录ora_cap_rep复制到测试库中/u01/ora_expdp/下
假如测试库需要还原的话,最好启动归档模式,并建立还原点,一遍重放后再还原回去.
CONN sys/password@test AS SYSDBA
create or replace directory replay_dir as '/u01/ora_expdp/ora_cap_rep/';
初始化重演
1 使用PROCESS_CAPTURE过程来准备capture logs 执行完毕后,
会生成
wcr_process.wmd,
wcr_login.pp,
wcr_seq_data.extb,
wcr_scn_order.extb ,
wcr_conn_data.extb等文件。
begin
dbms_workload_replay.process_capture(capture_dir => 'REPLAY_DIR');
dbms_workload_replay.initialize_replay('cms_replay_1', 'REPLAY_DIR');
dbms_workload_replay.PREPARE_REPLAY();
end;
/
PL/SQL 过程已成功完成。
2 使用wrc 工具效验
wrc mode=calibrate replaydir=/u01/ora_expdp/ora_cap_rep
Workload Replay Client: Release 11.2.0.4.0 - Production on 星期五 4月 26 17:11:37 2019
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Report for Workload in: u01/ora_expdp/ora_cap_rep
-----------------------
Recommendation:
Consider using at least 2 clients divided among 1 CPU(s)
You will need at least 105 MB of memory per client process.
If your machine(s) cannot match that number, consider using more clients.
Workload Characteristics:
- max concurrency: 55 sessions
- total number of sessions: 157
Assumptions:
- 1 client process per 50 concurrent sessions
- 4 client process per CPU
- 256 KB of memory cache per concurrent session
- think time scale = 100
- connect time scale = 100
- synchronization = TRUE
效验结果会显示完成replay需要replayclents和hosts的数量。
本列是要求至少2个客户机,开两个SHELL窗口执行下面命令
wrc system/235619@JXREPORT mode=replay
replaydir=/u01/ora_expdp/ora_cap_rep
3 开始重演 ,回到SQLPLUS窗口
BEGIN
DBMS_WORKLOAD_REPLAY.start_replay;
END;
这个时候回头看两个客户机信息
[oracle@jxreport ora_cap_rep]wrc system/235619@JXREPORT mode=replay replaydir=/u01/ora_expdp/ora_cap_rep
Workload Replay Client: Release 11.2.0.4.0 - Production on 星期五 4月 26 17:56:09 2019
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Wait for the replay to start (17:56:09)
Replay client 1 started (17:56:32)
[oracle@jxreport ~]wrc system/235619@JXREPORT mode=replay replaydir=/u01/ora_expdp/ora_cap_rep
Workload Replay Client: Release 11.2.0.4.0 - Production on 星期五 4月 26 17:56:12 2019
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Wait for the replay to start (17:56:12)
Replay client 2 started (17:56:32)
等待重演完成。。。。。
大约1个小时后
Replay client 1 finished (18:58:54)
Replay client 2 finished (18:59:55)
4 重演结束
1 查看信息
SELECT id, name FROM dba_workload_replays;
SELECT DBMS_WORKLOAD_CAPTURE.get_capture_info('REPLAY_DIR') FROM dual;
收集报告
文本模式重放报告:
SQL> set pagesize 0 long 30000000 longchunksize 1000
select dbms_workload_replay.report(2,'TEXT') from dual;
HTML模式重放报告
set serveroutput on
spool home/oracle/v1028r1.html
set pagesize 20000
set long 200000
declare
v_offset number;
v_length number;
v_nowlength number;
v_char1 varchar2(32767);
v_reprpt clob;
begin
v_reprpt:=dbms_workload_replay.report(replay_id=>2,format=>dbms_workload_replay.TYPE_HTML);
v_nowlength:=1;
v_length:=dbms_lob.getlength(lob_loc=>v_reprpt);
v_offset:=1;
while ( v_offset < v_length ) loop
v_char1:=dbms_lob.substr(lob_loc=>v_reprpt,offset=>v_offset);
dbms_output.put_line(v_char1);
v_offset:=v_offset+32767;
end loop;
end;
/
spool off
Replay Information
| Information | Replay | Capture |
|---|---|---|
| Name | jydb_replay_1 | test_capture_1 |
| Status | COMPLETED | COMPLETED |
| Database Name | JXREPORT | JXNETORC |
| Database Version | 11.2.0.4.0 | 11.2.0.4.0 |
| Start Time | 26-04-19 09:56:32 | 11-04-19 09:22:55 |
| End Time | 26-04-19 10:59:10 | 11-04-19 10:22:55 |
| Duration | 1 hour 2 minutes 38 seconds | 1 hour 0 seconds |
| Directory Object | REPLAY_DIR | REPLAY_DIR |
| Directory Path | /u01/ora_expdp/ora_cap_rep/ | /u01/ora_expdp/ora_cap_rep/ |
| AWR DB Id | 626829309 | |
| AWR Begin Snap Id | 379 | |
| AWR End Snap Id | 380 | |
| Replay Schedule Name |
Replay Options
| Option Name | Value |
|---|---|
| Synchronization | OBJECT_ID |
| Connect Time | 100% |
| Think Time | 100% |
| Think Time Auto Correct | TRUE |
| Number of WRC Clients | 2 (2 Completed, 0 Running ) |
Replay Statistics
| Statistic | Replay | Capture |
|---|---|---|
| DB Time | 5748.055 seconds | 2251.599 seconds |
| Average Active Sessions | 1.53 | .63 |
| User calls | 638370 | 638370 |
Replay Divergence Summary
| Divergence Type | Count | % Total |
|---|---|---|
| Session Failures During Replay | 0 | 0.00 |
| Errors No Longer Seen During Replay | 0 | 0.00 |
| New Errors Seen During Replay | 152 | 0.02 |
| Errors Mutated During Replay | 0 | 0.00 |
| DMLs with Different Number of Rows Modified | 5586 | 0.88 |
| SELECTs with Different Number of Rows Fetched | 27533 | 4.31 |
Workload Profile
Top Events
(-) Hide
| Event | Event Class | % Activity |
|---|---|---|
| db file sequential read | User I/O | 56.32 |
| CPU + Wait for CPU | CPU | 30.02 |
| log file sync | Commit | 3.20 |
| direct path read | User I/O | 2.36 |
| db file parallel read | User I/O | 1.01 |
Top SQL with Top Events
(-) Hide
| SQL ID | Planhash | Executions | % Activity | Event Drilldown | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3ybba3qwu9rnt | 2231708280 | 3 | 20.40 |
| ||||||||
| 4yv1buwd6bhn0 | 399073790 | 1 | 12.14 |
| ||||||||
| 309tnhhdkcmg2 | 167974480 | 9 | 4.55 |
| ||||||||
| fk8wnd398yyqa | 966341815 | 25 | 4.22 |
| ||||||||
| 3yp849brgb9md | 1810308884 | 18 | 3.04 |
|
重演报告简单的对比了生产库和测试库信息
报告下面是失败的语句之类的,测试库的数据和生产的数据本身不同步造成的。不过从简单对比来看DB TIME时间增加了1倍;从等待事件来看花费在CPU身上挺多的。说明测试库的CPU不那么给力!
导入捕获的AWR
这点没有明白,为何把线上AWR导入到测试呢?或许做捕获和重放的对比?
SQL> var v_randbid number;
SQL> exec :v_randbid:=DBMS_WORKLOAD_CAPTURE.IMPORT_AWR(capture_id=>1,staging_schema=>'SYSTEM',force_cleanup=>FALSE);
PL/SQL 过程已成功完成。
SQL> print :v_randbid;
V_RANDBID
----------
686404447
看到了些信息
select id,name,status,start_time,end_time,dir_path,awr_dbid,awr_begin_snap,awr_end_snap from dba_workload_captures;
看到了快照
select snap_id,dbid,begin_interval_time,end_interval_time from dba_hist_snapshot where dbid=686404447
生成比较报告:
set serveroutput on
spool home/oracle/v1028r1r2.html
declare
v_rlt clob;
v_replay_id1 number:=1;
v_replay_id2 number:=2;
v_snum number:=1;
v_length number;
v_char varchar2(32767);
begin
dbms_workload_replay.compare_period_report(replay_id1=>v_replay_id1,replay_id2=>v_replay_id2,format=>'HTML',result=>v_rlt);
v_length:=dbms_lob.GETLENGTH(v_rlt);
while ( v_snum < v_length ) loop
v_char:=dbms_lob.substr(lob_loc=>v_rlt,amount=>32767,offset=>v_snum);
v_snum:=v_snum+32767;
dbms_output.put_line(v_char);
end loop;
end;
/
spool off
这个比较报告没有生成出数据出来,
如果要进行更细化的针对每条sql语句的分析,可以使用dbms_workload_replay.compare_sqlset_report函数,
其实质是利用SPA对于两次replay期间生成的STS进行分析,
函数dbms_workload_replay.compare_sqlset_report返回值是task_name,在dba_advisor_tasks中能够查到
set serveroutput on
spool home/oracle/v1028r1r2_spa.html
declare
v_rlt clob;
v_replay_id1 number:=26;
v_replay_id2 number:=24;
v_snum number:=1;
v_length number;
v_char varchar2(32767);
v_ret varchar2(32767);
begin
v_ret:=dbms_workload_replay.compare_sqlset_report(replay_id1=>v_replay_id1,replay_id2=>v_replay_id2,format=>'HTML',result=>v_rlt);
v_length:=dbms_lob.GETLENGTH(v_rlt);
while ( v_snum < v_length ) loop
v_char:=dbms_lob.substr(lob_loc=>v_rlt,amount=>32767,offset=>v_snum);
v_snum:=v_snum+32767;
dbms_output.put_line(v_char);
end loop;
dbms_output.put_line(v_ret);
end;
/
spool off
后期命令:
-- 暂停重放
exec dbms_workload_replay.pause_replay();
-- 继续重放
exec dbms_workload_replay.resume_replay();
-- 终止重放
exec dbms_workload_replay.cancel_replay();
-- 删除捕获信息
exec dbms_workload_capture.delete_capture_info(&id);
-删除重放信息
exec dbms_workload_replay.delete_replay_info(&id);
输入上面查询dba_workload_replays的id值进行删除。
--通过内部函数读取:
SQL> select DBMS_WORKLOAD_CAPTURE.get_capture_info('DB_REPLAY_CAPTURE_DIR') FROM dual;
DBMS_WORKLOAD_CAPTURE.GET_CAPTURE_INFO('DB_REPLAY_CAPTURE_DIR')
---------------------------------------------------------------
11
--读取dba_workload_captures视图读取:
SQL> SELECT id, name FROM dba_workload_captures;
ID NAME
---------- ------------------------------
11 test_capture_1
如果是物理备库
转换物理备库为快照备库
alter database convert to snapshot standby;
shutdown immediate
startup
恢复数据卫士为物理备库
shutdown immeidate
SQL> startup mount
SQL> alter database convert to physical standby;
如果是普通数据库
1 开启闪回数据库特性
2 建立还原点
3 回滚到还原点
1 开启闪回数据库特性
SQL> archive log list; --确认已开启归档
Database log mode Archive Mode
Automatic archival Enabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 79
Next log sequence to archive 81
Current log sequence 81
SQL> select flashback_on from v$database; --确定开启flashbak数据库
FLASHBACK_ON
------------------
YES
2 建立还原点
SQL> create restore point gold;
Restore point created
3 回滚到还原点
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 313159680 bytes
Fixed Size 2227944 bytes
Variable Size 201326872 bytes
Database Buffers 104857600 bytes
Redo Buffers 4747264 bytes
Database mounted.
SQL> flashback database to restore point gold;
Flashback complete.
SQL> alter database open resetlogs;
Database altered






