某客户一套库不定时出现ORA-4030错误,分析OSW数据,在问题时间点并没有内存、SWAP不足的问题。
报错信息
INCIDENT_ID PROBLEM_KEY CREATE_TIME
-------------------- --------------- ----------------------------------------
.........
97941 ORA 4030 2021-01-11 12:10:03.877000 +08:00
97949 ORA 4030 2021-01-11 12:10:06.947000 +08:00
97950 ORA 4030 2021-01-11 12:10:10.384000 +08:00
97912 ORA 4030 2021-01-11 12:14:59.815000 +08:00
97425 ORA 4030 2021-01-12 12:45:24.105000 +08:00
97426 ORA 4030 2021-01-12 12:45:27.921000 +08:00
97427 ORA 4030 2021-01-12 12:45:30.866000 +08:00
97428 ORA 4030 2021-01-12 12:45:33.915000 +08:00
97444 ORA 4030 2021-01-12 12:45:37.286000 +08:00
98035 ORA 4030 2021-01-12 14:40:29.488000 +08:00
166 rows fetched
adrci> show incident -mode detail -p "incident_id=98038"
ADR Home = /oracle/app/oracle/diag/rdbms/test/test2:
*************************************************************************
**********************************************************
INCIDENT INFO RECORD 1
**********************************************************
INCIDENT_ID 98038
STATUS ready
CREATE_TIME 2021-01-12 15:15:38.005000 +08:00
PROBLEM_ID 1
CLOSE_TIME <NULL>
FLOOD_CONTROLLED none
ERROR_FACILITY ORA
ERROR_NUMBER 4030
ERROR_ARG1 82456
ERROR_ARG2 pga heap
ERROR_ARG3 control file i/o buffer
ERROR_ARG4 <NULL>
ERROR_ARG5 <NULL>
ERROR_ARG6 <NULL>
ERROR_ARG7 <NULL>
ERROR_ARG8 <NULL>
ERROR_ARG9 <NULL>
ERROR_ARG10 <NULL>
ERROR_ARG11 <NULL>
ERROR_ARG12 <NULL>
SIGNALLING_COMPONENT KGH
SIGNALLING_SUBCOMPONENT <NULL>
SUSPECT_COMPONENT <NULL>
SUSPECT_SUBCOMPONENT <NULL>
ECID <NULL>
IMPACTS 0
PROBLEM_KEY ORA 4030
FIRST_INCIDENT 66409
FIRSTINC_TIME 2020-12-23 22:15:28.413000 +08:00
LAST_INCIDENT 98039
LASTINC_TIME 2021-01-12 15:35:27.667000 +08:00
IMPACT1 34668547
IMPACT2 34668546
IMPACT3 0
IMPACT4 0
KEY_NAME ProcId
KEY_VALUE 201.3
KEY_NAME SID
KEY_VALUE 1709.2419
KEY_NAME PQ
KEY_VALUE (33554432, 1610435725)
KEY_NAME Client ProcId
KEY_VALUE oracle@test02 (TNS V1-V3).29229254_1
OWNER_ID 1
INCIDENT_FILE /oracle/app/oracle/diag/rdbms/test/test2/trace/test2_ora_29229254.trc
OWNER_ID 1
INCIDENT_FILE /oracle/app/oracle/diag/rdbms/test/test2/incident/incdir_98038/test2_ora_29229254_i98038.trc
1 rows fetched
报错的进程使用了111 MB的PGA,继续申请内存时失败。
4030 trace可以看到一些报错时的诊断信息,如PGA使用情况,Top 10 processes by pga,内存,swap使用情况及报错的客户端,SQL等信息
========= Dump for incident 97444 (ORA 4030) ========
----- Beginning of Customized Incident Dump(s) -----
=======================================
TOP 10 MEMORY USES FOR THIS PROCESS
---------------------------------------
67% 75 MB, 1511 chunks: "free memory "
session heap ds=1109a66e0 dsprt=11011f2c0
24% 27 MB, 18996 chunks: "qmxtgCrBufClob "
koh dur heap d ds=110c83120 dsprt=1109a66e0
5% 5561 KB, 44 chunks: "XVM Storage "
XVM subheap of ds=110995880 dsprt=1108ec580
1% 604 KB, 28 chunks: "permanent memory "
pga heap ds=110004e00 dsprt=0
0% 508 KB, 6338 chunks: "kolraloc-1 "
kolr heap ds i ds=1109b7ec8 dsprt=1109a66e0
0% 404 KB, 6332 chunks: "kolrarfc:lobloc_kolrhte "
kolr heap ds i ds=1109b7ec8 dsprt=1109a66e0
0% 376 KB, 1 chunk : "kfk_kfkio_freeq "
KFK_IO_SUBHEAP ds=1108ecb20 dsprt=110004e00
0% 248 KB, 735 chunks: "free memory "
koh dur heap d ds=110c83120 dsprt=1109a66e0
0% 171 KB, 35 chunks: "free memory "
lpxHeap subhea ds=1108ec4d0 dsprt=110004e00
0% 154 KB, 53 chunks: "kkecseps:kkestRCHistgrm "
TCHK^9d12ad4 ds=11098ff48 dsprt=11011e308
=======================================
PRIVATE MEMORY SUMMARY FOR THIS PROCESS
---------------------------------------
******************************************************
PRIVATE HEAP SUMMARY DUMP
111 MB total: <<<<<进程使用PGA 111MB
111 MB commented, 605 KB permanent
47 KB free (0 KB in empty extents),
103 MB, 1 heap: "session heap "
------------------------------------------------------
Summary of subheaps at depth 1
110 MB total:
35 MB commented, 109 KB permanent
75 MB free (30 MB in empty extents),
45 MB, 1 heap: "kolr heap ds i " 44 MB free held
28 MB, 3 heaps: "koh dur heap d " 1056 KB free held
=========================================
REAL-FREE ALLOCATOR DUMP FOR THIS PROCESS
-----------------------------------------
Dump of Real-Free Memory Allocator Heap [0x11096acf0]
mag=0xfefe0001 flg=0x5000003 fds=0x0 blksz=65536
blkdstbl=0x11096ad00, iniblk=4096 maxblk=524288 numsegs=15
In-use num=1676 siz=116260864, Freeable num=0 siz=0, Free num=0 siz=0 <<<
==========================================
INSTANCE-WIDE PRIVATE MEMORY USAGE SUMMARY
------------------------------------------
Dumping Work Area Table (level=1)
=====================================
Global SGA Info
---------------
global target: 8000 MB
auto target: 5971 MB
max pga: 1600 MB
pga limit: 3200 MB
pga limit known: 0
pga limit errors: 0
pga inuse: 1365 MB
pga alloc: 1696 MB
pga freeable: 119 MB
pga freed: 622187 MB
pga to free: 0
broker request: 0
pga auto: 0 MB
pga manual: 0 MB
pga alloc (max): 5531 MB
pga auto (max): 1439 MB
pga manual (max): 1 MB
# workareas : 3
# workareas(max): 48
-------------------------
Top 10 processes:
-------------------------
(percentage is of 1697 MB total allocated memory)
7% pid 201: 111 MB used of 112 MB allocated <= CURRENT PROC <<<当前进程使用最高,111MB
4% pid 204: 56 MB used of 63 MB allocated (5696 KB freeable)
4% pid 13: 60 MB used of 62 MB allocated
4% pid 14: 59 MB used of 62 MB allocated
3% pid 202: 53 MB used of 59 MB allocated (6016 KB freeable)
3% pid 200: 52 MB used of 58 MB allocated (5824 KB freeable)
3% pid 40: 52 MB used of 56 MB allocated (832 KB freeable)
3% pid 37: 41 MB used of 55 MB allocated
3% pid 12: 50 MB used of 52 MB allocated
3% pid 173: 42 MB used of 49 MB allocated (5888 KB freeable)
------------------------------------
Begin session detail for pid 201
sid: 1709 ser: 2419 audsid: 21942855 user: 91/USER06
flags: (0x8100041) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40009) -/-/INC
pid: 201 O/S info: user: grid, term: UNKNOWN, ospid: 29229254
image: oracle@test02
client details:
O/S info: user: bea1, term: unknown, ospid: 1234
machine: app-3 program: JDBC Thin Client
application name: JDBC Thin Client, hash value=2546894660
current SQL:
update xxxxxx t set t.validatestate='2'..........
================
SWAP INFORMATION
----------------
swap info: free_mem = 22096.49M rsv = 192.00M
alloc = 112.52M avail = 49152.00M swap_free = 49039.48M
----- End of Customized Incident Dump(s) -----
检查vmstat数据在问题时间点,内存使用无波动,可用内存充足
略
可用内存充足,总体PGA使用不算很高,什么限制了单进程的PGA的使用量?
a) min(5%pga_aggregate_target,50%_pga_max_size, _smm_max_size)
b) os ulimit
data == Identifies the soft limit for the largest process data segment for a user’s process.
检查资源限制未发现问题
$ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 4194304
memory(kbytes) unlimited
coredump(blocks) unlimited
nofiles(descriptors) unlimited
threads(per process) unlimited
processes(per user) unlimited
怀疑OS参数有限制,改过OS参数,监听没重启,通过监听连接的进程仍然受到限制
通过AIX dbx分析通过监听连接的进程是否存在资源限制
dbx 为AIX平台上的调试工具
dbx会中断程序执行,为不影响业务进程,创建一个sqlplus连接进行验证。
创建连接
sqlplus test/passwd@ip:1521/test
获取spid
SQL> select spid,a.program,b.machine,b.module from v$process a,v$session b where a.addr=b.paddr and b.username='ATS006' and b.module like '%SQL*Plus%' order by b.logon_time;
SPID PROGRAM
------------------------ ------------------------------------------------
MACHINE
----------------------------------------------------------------
MODULE
----------------------------------------------------------------
46071844 oracle@test02
test02
SQL*Plus
dbx 查看 rlimit 发现RLIMIT_DATA 内存限制
$dbx -a 46071844
Waiting to attach to process 46071844 ...
Successfully attached to oracle.
warning: Directory containing oracle could not be determined.
Apply 'use' command to initialize source path.
Type 'help' for help.
reading symbolic information ...
stopped in read at 0x90000000003c260 ($t1)
0x90000000003c260 (read+0x260) e8410028 ld r2,0x28(r1)
(dbx) proc rlimit
rlimit name: rlimit_cur rlimit_max (units)
RLIMIT_CPU: (unlimited) (unlimited) sec
RLIMIT_FSIZE: (unlimited) (unlimited) bytes
RLIMIT_DATA: 134217728 (unlimited) bytes <<<<<134,217,728
RLIMIT_STACK: 33554432 4294967296 bytes
RLIMIT_CORE: (unlimited) (unlimited) bytes
RLIMIT_RSS: 33554432 (unlimited) bytes
RLIMIT_AS: (unlimited) (unlimited) bytes
RLIMIT_NOFILE: 100000 (unlimited) descriptors
RLIMIT_THREADS: (unlimited) (unlimited) per process
RLIMIT_NPROC: (unlimited) (unlimited) per user
(dbx)
查看不通过监听的连接rlimit情况,RLIMIT_DATA unlimited
sqlplus test/passwd
SQL> !
zjgldb02:/tmp>$dbx -a 21954668
Waiting to attach to process 21954668 ...
Successfully attached to oracle.
warning: Directory containing oracle could not be determined.
Apply 'use' command to initialize source path.
Type 'help' for help.
reading symbolic information ...
stopped in read at 0x90000000003c260 ($t1)
0x90000000003c260 (read+0x260) e8410028 ld r2,0x28(r1)
(dbx) proc rlimit
rlimit name: rlimit_cur rlimit_max (units)
RLIMIT_CPU: (unlimited) (unlimited) sec
RLIMIT_FSIZE: (unlimited) (unlimited) bytes
RLIMIT_DATA: (unlimited) (unlimited) bytes <<<<RLIMIT_DATA unlimited
RLIMIT_STACK: 33554432 4294967296 bytes
RLIMIT_CORE: (unlimited) (unlimited) bytes
RLIMIT_RSS: (unlimited) (unlimited) bytes
RLIMIT_AS: (unlimited) (unlimited) bytes
RLIMIT_NOFILE: 65534 (unlimited) descriptors
RLIMIT_THREADS: (unlimited) (unlimited) per process
RLIMIT_NPROC: (unlimited) (unlimited) per user
(dbx)




