The Relationship Between Log File Sync and Processes

Original post by eygle, 2007-07-22

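Today I read a blog post by 玉面飞龙 that mentions a small enhancement to Oracle's LGWR mechanism.
The post cites several links.
The first is a heated exchange between Jonathan Lewis and Donald K. Burleson, in which one reply is worth reading: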

For now I'll add that the time for a log file sync (LFS) wait starts as soon as the foreground process is done posting LGWR (it then waits on LFS in the post/wait interface which is an IPC semaphore wait on most platforms). Time is ticking (T0) for the process I'll call "LFS_X".
LGWR's state at the time of posting is important. If he is already in the middle of servicing a flush he could be in the middle of one or more I/Os. Remember, LGWR will limit large flush operations to multiple 128KB writes (async). So LFS_X is likely now paying for the prior group of processes being serviced. Once LGWR is done servicing the prior group's I/Os, it goes into a loop of postings of those LFS waiters and yes, LFS_X is still waiting and yes, the work at hand for LGWR has nothing directly to do with getting LFS_X out of his wait. The act of posting the previous set of LFS waiters makes them runnable (they were in a post/wait just like LFS_X). Once LGWR posts all the waiters from his current batch he checks to see if there is more work to do before going to sleep. In our case there is (LFS_X needs servicing). LGWR flushes the buffer that has LFS_X's redo pieces and LGWR then posts LFS_X (a semaphore operation on most platforms). LFS_X is now runnable. The first thing LFS_X does when it commences running is check the microsecond-resolution time of day (T1).
The delta from T0 above to T1 is the amount of time LFS_X waited in LFS. Any processor shortage suffered by LGWR between T0 and T1 affects the duration of poor LFS_X's LFS wait time. That includes any time LGWR goes in and out of the kernel for posting. That includes any natural exhausting of LGWR's scheduler time quantum (10ms). That includes any time LGWR's processor is interrupted to handle a hardware interrupt. And, finally, that includes time that LFS_X himself has to wait after being posted for the CPU (upon which it has cache affinity) to pick him up. And that decision is based upon his state (runnable), mode (user mode) and priority (user mode 100+ depending on age) and nice value.
Lots of people at lots of systems houses (HP, IBM, Sun, SGI, Sequent, Pyramid, DEC, DG, etc.) have spent significant time optimizing the average time between T0 and T1. This work produced the slick Post/Wait drivers that some platforms still support.
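
To make the T0/T1 accounting in the reply above concrete, here is a minimal Python sketch that imitates the post/wait handshake with two semaphores. It is only a toy model, not Oracle's implementation: the function name `simulate_log_file_sync`, both parameters, and the use of `time.sleep` to stand in for redo I/O are my own assumptions for illustration.

```python
import threading
import time


def simulate_log_file_sync(prior_flush_seconds=0.05, own_flush_seconds=0.01):
    """Return how long a hypothetical foreground process "LFS_X" waits.

    Toy model only: sleeps stand in for redo I/O, semaphores for post/wait.
    """
    post_lgwr = threading.Semaphore(0)  # foreground -> LGWR
    post_fg = threading.Semaphore(0)    # LGWR -> foreground

    def lgwr():
        # LGWR is still busy flushing redo for an earlier group of committers.
        time.sleep(prior_flush_seconds)
        # Only now does it notice that there is more work (the post from LFS_X).
        post_lgwr.acquire()
        # Flush the buffer that holds LFS_X's redo.
        time.sleep(own_flush_seconds)
        # Post LFS_X, which makes it runnable again.
        post_fg.release()

    writer = threading.Thread(target=lgwr)
    writer.start()

    post_lgwr.release()        # LFS_X posts LGWR
    t0 = time.perf_counter()   # T0: the wait clock starts once the post is done
    post_fg.acquire()          # the "log file sync" wait itself
    t1 = time.perf_counter()   # T1: first thing LFS_X does when it runs again
    writer.join()
    return t1 - t0


if __name__ == "__main__":
    waited = simulate_log_file_sync()
    print(f"LFS_X waited {waited * 1000:.1f} ms")
```

The measured wait covers the earlier group's flush as well as LFS_X's own, which is the point of the reply: LFS_X pays for work that has nothing to do with its own redo, plus whatever scheduling delay occurs before it gets back on a CPU.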

Jonathan Lewis's analysis:

The connection between processes and log file sync is something that Anjo Kolk told me about some
time ago. I forget exact versions, but I think in 8i the log writer checks the entire x$ array at
the end of a log write to see which sessions have had their log sync request honoured. But in 9i
the array is only checked up to its high-water mark. I think we decided that only current
processes were checked in the 10g implementation.
So there really could be a problem for a very large setting of processes - and in 9i I think there
still is a problem if you intermittently use a lot of processes.

Tom's analysis and summary:

8i - could be a problem if processes was set much higher than needed as lgwr would iterate through
the entire array
9i - could be a problem IF you used lots of processes once (raised the HWM of the array once) - but
then again, you needed that many processes at some point so lowering processes would lead to other
serious issues at some point.
10g - probably not an issue anymore.
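
The three behaviours recalled above can be compared with a small Python sketch. It simply encodes what Jonathan and Tom describe (and they themselves hedge with "I think"), so treat it as a restatement of their recollection rather than verified Oracle internals. The function `slots_scanned` and its parameter names are hypothetical.

```python
def slots_scanned(version, processes, high_water_mark, active):
    """Slots LGWR would examine after one log write, per the quoted recollections.

    processes       -- size of the process array (the PROCESSES setting)
    high_water_mark -- highest slot ever used since instance startup
    active          -- number of processes that currently exist
    """
    if version == "8i":
        return processes          # the entire array is checked
    if version == "9i":
        return high_water_mark    # checked only up to the high-water mark
    if version == "10g":
        return active             # only current processes are checked
    raise ValueError(f"unknown version: {version}")


if __name__ == "__main__":
    # Hypothetical instance: PROCESSES=5000, a past peak of 800, 120 active now.
    for version in ("8i", "9i", "10g"):
        print(version, slots_scanned(version, 5000, 800, 120))
```

With these made-up numbers the per-write scan drops from 5000 slots to 800 to 120, which is why an oversized PROCESSES setting would matter most in 8i and hardly at all in 10g, if the recollections are right.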

Steve Adams on Oracle7 and Oracle8:

If a process is waiting for LGWR to write a particular log buffer block to disk, it waits in a log file sync wait. The normal cause of log file sync waits is transaction termination; however, DBWn also suffers these waits when writing recently modified blocks.
When a process wakes up from a log file sync wait, it must check whether the log buffer block containing the redo of interest has yet been written to disk. If not, it must continue to wait. The SGA variable that shows whether a particular log buffer block has yet been written to disk is the index into the log file representing the base disk block for the log buffer. This variable is of course protected by the redo allocation latch, and so the redo allocation latch must be taken to check it.
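
The wake-up-and-recheck behaviour Steve Adams describes is the classic condition-variable loop. Below is a minimal Python sketch of that pattern. The class `LogBuffer`, its methods, and the `flushed_up_to` counter are hypothetical names of mine, and a plain `threading.Lock` stands in for the redo allocation latch.

```python
import threading


class LogBuffer:
    """Toy model of a waiter rechecking flush progress after every wake-up."""

    def __init__(self):
        # Stand-in for the redo allocation latch that protects the variable.
        self._latch = threading.Lock()
        self._written = threading.Condition(self._latch)
        self.flushed_up_to = 0  # highest log block assumed to be on disk

    def lgwr_flush(self, up_to_block):
        """LGWR side: record progress under the latch and wake all waiters."""
        with self._written:
            self.flushed_up_to = max(self.flushed_up_to, up_to_block)
            self._written.notify_all()

    def log_file_sync(self, my_block, timeout=5.0):
        """Waiter side: return True once my_block is flushed, False on timeout."""
        with self._written:
            # A wake-up alone proves nothing; recheck under the latch each time.
            while self.flushed_up_to < my_block:
                if not self._written.wait(timeout):
                    return False
            return True


if __name__ == "__main__":
    buf = LogBuffer()
    waiter = threading.Thread(target=lambda: print(buf.log_file_sync(10)))
    waiter.start()
    buf.lgwr_flush(5)    # not far enough: the waiter keeps waiting
    buf.lgwr_flush(12)   # covers block 10: the waiter can return
    waiter.join()
```

The first flush wakes the waiter without satisfying it, so it goes back to sleep, which mirrors the "if not, it must continue to wait" step in the quote.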

Metalink's explanation of Log File Sync:

When a user session (foreground process) COMMITs (or rolls back), the session's redo information needs to be flushed to the redo logfile. The user session will post the LGWR to write all redo required from the log buffer to the redo log file. When the LGWR has finished it will post the user session. The user session waits on this wait event while waiting for LGWR to post it back to confirm all redo changes are safely on disk.

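These sources leave one gap:
once LGWR has finished writing the redo, how does it post the user processes? If the posting step has to scan the x$ array to determine which processes posted LGWR, then all of the conclusions above hold; otherwise Jonathan's description cannot be relied on. If anyone has more precise information, please let me know. Thanks :)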
-The End-
