背景
ogg为集成模式,且没有配置检查点表,在俩边都意外丢失了trail时,无法使用ogg进程的EXTSEQNO,EXTRBA作为恢复点,同时因为没有检查点表,无法找到复制进程的SCN,则此场景使用时间点来恢复trail文件
模拟故障
停止目标库的复制进程
stop LSX3REP1
数据库不断插入数据
源库:
SQL> begin
for i in 2…9999
loop
insert into lsx.f values(2’a’);
commit;
end loop;
end;
/
PL/SQL procedure successfully completed.
SQL> select count(*) from lsx.f;
COUNT(*)
466670
目标库:
SQL> select count(*) from lsx.f;
COUNT(*)
436676
删除目标库复制进程当前的trail文件
目标库:
[oracle@oradb9 dirdatlsx3ext1]$ mv rt000000002 /home/oracle/
[oracle@oradb9 dirdatlsx3ext1]$ ls
rt000000000 rt000000001
启动复制进程报错
ggsci
GGSCI > start LSX3REP1
GGSCI > view report LSX3REP1
2021-12-21 19:18:15 ERROR OGG-02171 Error reading LCR from data source. Status 500, data source type TrailDataSource.
Source Context :
SourceModule : [gglib.ggtrail.TrailDataSource]
SourceID : [ggtrail/TrailDataSource.cpp]
SourceMethod : [readNextTrailRecord]
SourceLine : [2107]
ThreadBacktrace : [14] elements
: [/u01/ogg/libgglog.so(CMessageContext::AddThreadContext())]
: [/u01/ogg/libgglog.so(CMessageFactory::CreateMessage(CSourceContext*, unsigned int, …))]
: [/u01/ogg/libgglog.so(_MSG_String(CSourceContext*, int, char const*, CMessageFactory::MessageDisposition))]
: [/u01/ogg/replicat(ggs::gglib::ggtrail::TrailDataSource::readNextTrailRecord(ggs::gglib::gglcr::CommonLCR**, long*, int&, int&
, bool, bool))]
: [/u01/ogg/replicat(ggs::er::ReplicatTrailDataSource::readLCR(ggs::gglib::gglcr::CommonLCR**, long&, bool&))]
: [/u01/ogg/replicat(ggs::er::ReplicatContext::processReplicatLoop())]
: [/u01/ogg/replicat(ggs::er::ReplicatContext::run())]
: [/u01/ogg/replicat()]
: [/u01/ogg/replicat(ggs::gglib::MultiThreading::MainThread::ExecMain())]
: [/u01/ogg/replicat(ggs::gglib::MultiThreading::Thread::RunThread(ggs::gglib::MultiThreading::Thread::ThreadArgs*))]
: [/u01/ogg/replicat(ggs::gglib::MultiThreading::MainThread::Run(int, char**))]
: [/u01/ogg/replicat(main)]
: [/lib64/libc.so.6(__libc_start_main)]
: [/u01/ogg/replicat()]
2021-12-21 19:18:15 ERROR OGG-01091 Unable to open file “/u01/ogg/dirdatlsx3ext1/rt000000002” (error 2, No such file or directory).
故障处理
查看ogg进程
目标端复制进程:
GGSCI (oradb9.example.com) 73> info lsx3rep1 detail
REPLICAT LSX3REP1 Last Started 2021-12-21 19:11 Status ABENDED
INTEGRATED
Checkpoint Lag 00:00:00 (updated 00:00:01 ago)
Log Read Checkpoint File /u01/ogg/dirdatlsx3ext1/rt000000002
2021-12-21 19:12:37.210509 RBA 1495160
INTEGRATED Replicat
No DBLOGIN Provided, constructed inbound server name is OGG$LSX3REP1
Current Log BSN value: (requires database login)
Low Watermark CSN value: (requires database login)
High Watermark CSN value: (requires database login)
Extract Source Begin End
/u01/ogg/dirdatlsx3ext1/rt000000002 2021-12-21 19:06 2021-12-21 19:12
/u01/ogg/dirdatlsx3ext1/rt000000001 2021-12-21 19:06 2021-12-21 19:06
/u01/ogg/dirdatlsx3ext1/rt000000001 2021-12-21 19:00 2021-12-21 19:06
/u01/ogg/dirdatlsx3ext1/rt000000001 2021-12-21 18:18 2021-12-21 19:00
/u01/ogg/dirdatlsx3ext1/rt000000000 * Initialized * 2021-12-21 18:18
/u01/ogg/dirdatlsx3ext1/rt000000000 * Initialized * First Record
./dirdatlsx3ext1/rt000000000 * Initialized * First Record
Current directory /u01/ogg/dirdatlsx3ext1
Report file /u01/ogg/dirrpt/LSX3REP1.rpt
Parameter file /u01/ogg/dirprm/lsx3rep1.prm
Checkpoint file /u01/ogg/dirchk/LSX3REP1.cpr
Checkpoint table ggs.checkpointtable
Process file
Error log /u01/ogg/ggserr.log
源端抽取进程及投递进程:
info lsx3*
EXTRACT LSX3DP1 Last Started 2021-12-21 17:37 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:05 ago)
Process ID 40206
Log Read Checkpoint File /u01/ogg1/dirdatlsx3ext1/lt000000064
2021-12-21 17:45:03.000000 RBA 307551
EXTRACT LSX3EXT1 Last Started 2021-12-21 17:37 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:08 ago)
Process ID 40182
Log Read Checkpoint Oracle Integrated Redo Logs
2021-12-21 17:45:02
SCN 0.35435162 (35435162)
GGSCI (oradb) 132> info lsx3ext1 detail
EXTRACT LSX3EXT1 Last Started 2021-12-21 17:37 Status RUNNING
Checkpoint Lag 00:00:04 (updated 00:00:04 ago)
Process ID 40182
Log Read Checkpoint Oracle Integrated Redo Logs
2021-12-21 17:40:27
SCN 0.35399384 (35399384)
Target Extract Trails:
Trail Name Seqno RBA Max MB Trail Type
./dirdatlsx3ext1/lt 60 497256 1 EXTTRAIL
Integrated Extract outbound server first scn: Unavailable.Need DBLOGIN.
Integrated Extract outbound server filtering start scn: Unavailable.Need DBLOGIN.
Extract Source Begin End
Not Available 2021-12-21 16:31 2021-12-21 17:40
Not Available 2021-12-21 16:31 2021-12-21 17:37
Not Available 2021-12-21 16:31 2021-12-21 17:37
Not Available 2021-12-21 16:31 2021-12-21 17:37
Not Available * Initialized * 2021-12-21 17:25
Not Available * Initialized * 2021-12-21 17:25
Not Available * Initialized * 2021-12-21 17:25
Not Available * Initialized * 2021-12-21 17:21
Not Available * Initialized * 2021-12-21 17:21
Not Available * Initialized * 2021-12-21 17:21
Not Available * Initialized * 2021-12-21 16:31
Not Available * Initialized * 2021-12-21 16:31
Not Available * Initialized * 2021-12-21 16:31
Current directory /home/oracle/oggcode
Report file /u01/ogg1/dirrpt/LSX3EXT1.rpt
Parameter file /u01/ogg1/dirprm/lsx3ext1.prm
Checkpoint file /u01/ogg1/dirchk/LSX3EXT1.cpe
Process file
Error log /u01/ogg1/ggserr.log
恢复抽取进程
将复制进程ABENDED的时间点作为抽取进程恢复时间点,让抽取进程从丢失trail的时间重新抽取数据
ggsci
GGSCI > stop LSX3EXT1
GGSCI (oradb) 3> dblogin USERID goldengate@lsx_ogg, PASSWORD xxx,ENCRYPTKEY default
Successfully logged into database.
GGSCI (oradb as goldengate@lsx) 4> alter LSX3EXT1,tranlog,begin 2021-12-21 17:38:37
EXTRACT altered.
GGSCI (oradb as goldengate@lsx) 5> alter lsx3ext1 etrollover
2021-12-21 17:49:02 INFO OGG-01520 Rollover performed. For each affected output trail of Version 10 or higher format, after starting the source extract, issue ALTER EXTSEQNO for that trail’s reader (either pump EXTRACT or REPLICAT) to move the reader’s scan to the new trail file; it will not happen automatically.
EXTRACT altered.
恢复投递进程
投递进程重新从新生成的trail文件开始投递
ggsci
GGSCI > stop LSX3DP1
GGSCI (oradb as goldengate@lsx) 14> alter LSX3DP1,extseqno 65
EXTRACT altered.
GGSCI (oradb as goldengate@lsx) 15> alter LSX3DP1 etrollover
2021-12-21 17:52:43 INFO OGG-01520 Rollover performed. For each affected output trail of Version 10 or higher format, after starting the source extract, issue ALTER EXTSEQNO for that trail’s reader (either pump EXTRACT or REPLICAT) to move the reader’s scan to the new trail file; it will not happen automatically.
EXTRACT altered.
恢复复制进程
GGSCI (oradb9.example.com) 16> alter lsx3rep1,extseqno 3,extrba 0
2021-12-21 19:30:37 INFO OGG-06594 Replicat LSX3REP1 has been altered. Even the start up position might be updated, duplicate suppression remains active in next startup. To override duplicate suppression, start LSX3REP1 with NOFILTERDUPTRANSACTIONS option.
REPLICAT (Integrated) altered.
GGSCI (oradb9.example.com) 17> start lsx3rep1
Sending START request to MANAGER …
REPLICAT LSX3REP1 starting
查看ogg进程状态
复制进程已重新恢复正常
GGSCI (oradb as goldengate@lsx) 50> info lsx3*
EXTRACT LSX3DP1 Last Started 2021-12-21 17:55 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:07 ago)
Process ID 40299
Log Read Checkpoint File /u01/ogg1/dirdatlsx3ext1/lt000000069
2021-12-21 17:45:03.000000 RBA 83627
EXTRACT LSX3EXT1 Last Started 2021-12-21 17:50 Status RUNNING
Checkpoint Lag 00:00:04 (updated 00:00:07 ago)
Process ID 40248
Log Read Checkpoint Oracle Integrated Redo Logs
2021-12-21 17:57:19
SCN 0.35450013 (35450013)
GGSCI (oradb9.example.com) 32> info lsx3rep1
REPLICAT LSX3REP1 Last Started 2021-12-21 19:30 Status RUNNING
INTEGRATED
Checkpoint Lag 00:00:00 (updated 00:00:12 ago)
Process ID 75514
Log Read Checkpoint File /u01/ogg/dirdatlsx3ext1/rt000000004
2021-12-21 19:19:03.208843 RBA 4074729
查看俩边数据:
源库:
SQL> select count(*) from lsx.f;
COUNT(*)
466670
目标库:
SQL> select count(*) from lsx.f;
COUNT(*)
466670
至此数据已同步一致。
相关学习资料
How to recover OGG process if checkpoint file is accidentally removed? (Doc ID 2199555.1)
Resync / Recover Data ( Tables , Trails ) Trails Are Missing, Damaged, Incompatible, Corrupt or Suspect for Extract or Replicat (Doc ID 1614665.1)




