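Resolving NFS High-Availability Configuration Problems with OGG
When OGG runs on NFS, creating an Extract process can fail with an error saying that a file cannot be locked. The Oracle Support notes below describe the available fixes; the best solution is still to use ACFS.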

OGG File Locking Issue with XAG on Node Switch Over when using NFS Mounted File System (Doc ID 1673231.1)

APPLIES TO: Oracle GoldenGate - Version 12.1.2.0.0 and later

PROBLEM: Provide additional information on OGG file locking issues with XAG on node switchover when using an NFS mounted file system.

SYMPTOMS: On a node switchover via XAG after a node failure, when using an NFS mounted file system, OGG processes do not restart and report the following errors:

    ERROR OGG-00446 Unable to lock file "<ogg_home>/dirchk/<extract_name>.cpe" (error 11, Resource temporarily unavailable)
    ERROR OGG-00446 Unable to lock file "<ogg_home>/dirchk/<pump_name>.cpe" (error 11, Resource temporarily unavailable)
    ERROR OGG-00446 Unable to lock file "<ogg_home>/dirchk/<replicat_name>.cpe" (error 11, Resource temporarily unavailable)

CAUSE: The NFS server does not release locks that were established before the node failed. The OGG processes then fail during startup because the files are still locked and these locks have no timeout. This is inherent behavior of NFS prior to NFS v4.

SOLUTION: There are three available solutions, depending on the availability and implementation of NFS on the NFS server and NFS client machines.

1. If NFS v4 is available on both the NFS server and the clients, mount the file system using NFS v4 and use its built-in lease-based file locking. The syntax is as follows:

    mount -t nfs4 oggnfs-svr:/home/oracle/ogg /mnt/oggnfs-svr/oracle/ogg -o rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,noac,timeo=600

2. If NFS v4 is not a viable solution, you can disable file locking within NFS; however, this is a less secure solution. The "nolock" option poses a risk of data corruption, since processes on any other NFS client machine can access and modify the files. The syntax is as follows:

    mount -t nfs oggnfs-svr:/home/oracle/ogg /mnt/oggnfs-svr/oracle/ogg -o nolock,rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,noac,nfsvers=3,timeo=600

3. Upgrade XAG to version 3.1 and set the NFS_UNLOCK option to 1. This enhancement to XAG cleans up the locks that NFS leaves on OGG checkpoint and trail files. The syntax is as follows:

    $ agctl modify goldengate gg1 --environment_vars "NFS_UNLOCK=1"
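
To see which of the cases above a given mount falls into, its file system type and option string can be classified with a small shell function. This is a minimal sketch and not part of the Oracle note: `nfs_lock_mode` is a hypothetical helper name, the labels `lease`, `nolock`, and `nlm` are made up for illustration, and the function only inspects the option text as it would appear in a mount command or /etc/fstab. It does not query the NFS server.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle or OGG tool): classify how an NFS mount
# handles file locks, based only on the fs type and the option string.
#   $1 = file system type as passed to "mount -t" (nfs or nfs4)
#   $2 = comma-separated mount options
# Prints one of:
#   lease  - NFS v4, lease-based locking (solution 1)
#   nolock - NFS locking disabled on the client (solution 2)
#   nlm    - NFS v3 with normal locking, the case that can leave stale locks
nfs_lock_mode() {
    fstype=$1
    opts=$2
    if [ "$fstype" = "nfs4" ]; then
        echo "lease"
        return 0
    fi
    # Wrap the options in commas so every option is delimited on both sides.
    case ",$opts," in
        *,vers=4*|*,nfsvers=4*) echo "lease" ;;
        *,nolock,*)             echo "nolock" ;;
        *)                      echo "nlm" ;;
    esac
}
```

For example, `nfs_lock_mode nfs nolock,rw,hard,nfsvers=3` prints `nolock`, while an NFS v3 mount without `nolock` is reported as `nlm`, which is the configuration where solution 3 (NFS_UNLOCK in XAG) or a move to NFS v4 becomes relevant.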

Oracle GoldenGate Best Practice: NFS Mount options for use with GoldenGate (Doc ID 1232303.1)

APPLIES TO: Oracle GoldenGate - Version 11.1.1.0.0 and later

The purpose of this bulletin is to document the file system mount options to use when configuring GoldenGate to run with an NFS mounted file system. NFS mounts should not be used for any Oracle GoldenGate process unless IO buffering is OFF. The danger arises when one process registers the end of a trail file or transaction log and moves on to the next in sequence, and only afterwards is data in the NFS IO buffer flushed to disk. The net result is skipped data, which cannot be compensated for with the GoldenGate parameter EOFDELAY. When using an NFS mounted file system for Oracle GoldenGate files, file system caching (buffered IO) must be disabled on both the NFS client and the server.

This document is relevant to all environments using Oracle GoldenGate. The important factor when configuring Oracle GoldenGate processes to run on an NFS mounted file system is to make sure that buffered IO (data and attribute caching) is always set to OFF on both the NFS client and the server.

NOTES:
1. Oracle does not support running OGG binaries on shared storage. OGG binaries should be installed on local storage.
2. NFS v4 is supported.

Mount Options for Oracle GoldenGate Datafiles
* Even when data caching or buffered IO is set to OFF on the NFS client system, the setting does not always take effect for specialized file systems such as Veritas File System (VxFS), or for NAS devices/servers that support additional caching features such as FlexCache on NetApp, unless you explicitly disable the function on the server side. For VxFS, please review the note Golden Gate Config Parameters For HP Journal File Systems "direct" Option Is Performing Very Slow (Doc ID 1607386.1), as not all the options above are supported (for example, the option hard). You will need to set MINCACHE to UNBUFFERED or DIRECT. For NetApp, the FlexCache system must not be used at all with Oracle GoldenGate processes. For the Sun Solaris operating system, if an Extract hang is experienced, consider adding the "timeo=600, llock" mount options on top of the ones required for Oracle GoldenGate datafiles.
** This option is in addition to the regular local file system mount options used to mount the local disk that the NFS client will use for Oracle GoldenGate datafiles. This setting forces the IO behavior of the file system to be synchronous ("sync"). Asynchronous IO behavior on the file system is not recommended for Oracle GoldenGate datafiles and must be turned off at all times. Note: For ZFS, turning on noac, sync and actimeo=0 on the client side is sufficient.
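
The options named in the note above (noac, sync, actimeo=0) can be checked against an option string with a short shell function. This is a rough sketch under stated assumptions: `missing_nocache_opts` is a hypothetical helper name, and the default list of wanted options is simply the three options the note mentions for ZFS clients. Which options are actually required on a given platform should be taken from the Oracle note, not from this helper.

```shell
#!/bin/sh
# Hypothetical helper: print the wanted options that are absent from a
# comma-separated mount option string (prints an empty line if none are).
#   $1 = mount options, e.g. "rw,hard,noac,actimeo=0,sync"
#   $2 = optional space-separated list of wanted options
#        (default is an assumption: the three options named for ZFS clients)
missing_nocache_opts() {
    opts=$1
    wanted=${2:-"noac actimeo=0 sync"}
    missing=""
    for want in $wanted; do
        # Commas on both sides ensure whole-option matches only,
        # so "async" is never mistaken for "sync".
        case ",$opts," in
            *",$want,"*) ;;
            *) missing="${missing:+$missing }$want" ;;
        esac
    done
    echo "$missing"
}
```

For example, `missing_nocache_opts rw,hard,noac` prints `actimeo=0 sync`, and passing a second argument such as `"noac actimeo=0"` restricts the check to those two options.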
For a Global File System (GFS2) file system, the "localflocks" mount option may also be needed. To check the mount settings, run cat /etc/fstab.
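
When /etc/fstab is long, the NFS entries can be filtered out instead of reading the whole file. The following is a minimal sketch: `list_nfs_mounts` is a hypothetical helper name, and it assumes the standard whitespace-separated fstab layout (device, mount point, type, options, ...).

```shell
#!/bin/sh
# Hypothetical helper: print "<mount point> <options>" for every NFS entry
# in an fstab-format file (default: /etc/fstab). Comment lines are skipped.
list_nfs_mounts() {
    fstab=${1:-/etc/fstab}
    awk '$1 !~ /^#/ && ($3 == "nfs" || $3 == "nfs4") { print $2, $4 }' "$fstab"
}
```

Calling `list_nfs_mounts` with no argument reads /etc/fstab. Note that /etc/fstab only shows the configured options; the options actually in effect for a mounted file system are reported by the `mount` command.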
When trail files are NFS mounted on a server where any GoldenGate process is going to read them (including Extract pump, Replicat, and Distribution Service), the process that writes to these files (Extract, Receiver Service, or Collector) must be running on the same server. For example, suppose remote trail files are sent to a target cluster with two nodes (NODE_A and NODE_B). If you are going to run a Replicat process on NODE_B, then the process that writes the trail file used by that Replicat must also be running on NODE_B. If using an Extract pump, RMTHOST must connect to NODE_B, the same node on which the Replicat reading that trail is running. If using a Distribution Service to send the trail files, it must connect to the Receiver Service running on the same node as the Replicat process that will read the trail files. It is not supported to store trail files on NFS mounted devices when the process writing the trail file runs on a different node or server than the process reading it.
Additional Testing Programs
Two small demo programs written in Perl are attached to this KM note to help customers identify any cache-related issues on the NFS mounted file system.
Best Practice - Oracle GoldenGate for Linux, UNIX and Windows




