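An oversized vm.nr_hugepages setting causes abnormal memory allocation and system OOM
For business testing, a production server was cloned in the vSphere environment to serve as a test machine. After the clone's memory was reduced from 196 GB to 32 GB, the system ran into OOM.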

Expand memory to 192 GB, then adjust the processes, SGA, and PGA settings.
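The default value of the pga_aggregate_limit initialization parameter is the largest of the following:
- 2 GB
- twice the value of the pga_aggregate_target initialization parameter
- 3 MB multiplied by the value of the processes initialization parameter
Setting pga_aggregate_limit to a value outside its allowed range fails with:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-00093: pga_aggregate_limit must be between 2048M and 100000G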
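The default rule above can be sketched as a small shell function. This is only an illustration of the arithmetic: the function name pga_limit_default_mb and the sample inputs (pga_aggregate_target of 4096 MB, processes of 1500) are hypothetical and do not come from this system.

```bash
#!/usr/bin/env bash
# Sketch: default pga_aggregate_limit in MB, following the rule quoted above.
# Usage: pga_limit_default_mb PGA_AGGREGATE_TARGET_MB PROCESSES
# (hypothetical helper; both arguments are illustrative inputs)
pga_limit_default_mb() {
    local target_mb=$1 processes=$2
    local limit_mb=2048                    # floor: 2 GB
    local by_target=$(( target_mb * 2 ))   # twice pga_aggregate_target
    local by_procs=$(( processes * 3 ))    # 3 MB per process
    if (( by_target > limit_mb )); then limit_mb=$by_target; fi
    if (( by_procs > limit_mb )); then limit_mb=$by_procs; fi
    echo "$limit_mb"
}

# Assumed sample values: target 4096 MB, processes 1500
pga_limit_default_mb 4096 1500   # prints 8192
```

With these sample values, twice the target (8192 MB) wins over both the 2 GB floor and 3 MB * 1500 = 4500 MB.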
SQL> create pfile='/home/oracle/20241106orcl_initorcl.ora' from spfile;
SQL> shutdown immediate;
vim /home/oracle/20241106orcl_initorcl.ora
[oracle@nccdb dbs]$ cd /u01/app/oracle/product/19.0.0/db_1/dbs
[oracle@nccdb dbs]$ mv spfileorcl.ora spfileorcl.ora_20241106
[oracle@nccdb dbs]$ cp /home/oracle/20241106orcl_initorcl.ora initorcl.ora
SQL> startup mount;
SQL> create spfile from pfile;
SQL> shutdown immediate;
SQL> shutdown ;
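Check the kernel.shmall and kernel.shmmax parameters on the database server.
kernel.shmall is the total amount of shared memory that may be used system-wide, counted in pages; kernel.shmmax is the maximum size of a single shared memory segment, in bytes. Both can be set to 90% of physical memory. For 32 GB of memory and a 4 KB page size (reported by getconf PAGESIZE), shmall = 32*1024*1024*1024*0.9/4096 = 7549747 (7549747.2 rounded down), and shmmax = shmall*4096 = 30923763712.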
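The calculation above can be reproduced with integer arithmetic in bash. This is a minimal sketch: the function name shm_limits is hypothetical, the 90% ratio is the one used in this note, and the page size falls back to getconf PAGESIZE when it is not passed explicitly.

```bash
#!/usr/bin/env bash
# Sketch: derive kernel.shmall (pages) and kernel.shmmax (bytes)
# as 90% of physical memory.
# Usage: shm_limits MEM_GIB [PAGE_SIZE_BYTES]   (hypothetical helper)
shm_limits() {
    local mem_gib=$1
    local page_size=${2:-$(getconf PAGESIZE)}
    local mem_bytes=$(( mem_gib * 1024 * 1024 * 1024 ))
    # 90% of memory, rounded down to whole pages
    local shmall=$(( mem_bytes * 9 / 10 / page_size ))
    # shmmax is the same amount expressed in bytes
    local shmmax=$(( shmall * page_size ))
    echo "kernel.shmall = $shmall"
    echo "kernel.shmmax = $shmmax"
}

# 32 GB of memory with 4096-byte pages, as in the text
shm_limits 32 4096
```

For 32 GB and 4096-byte pages this prints kernel.shmall = 7549747 and kernel.shmmax = 30923763712, the values used in /etc/sysctl.conf.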
Edit /etc/sysctl.conf:
[root@nccdb ~]# vim /etc/sysctl.conf
kernel.shmall = 7549747
kernel.shmmax = 30923763712
Shut down the database and reboot the server, then check memory usage: about 90 GB is reported as used.
[root@nccdb ~]# cat /proc/meminfo
MemTotal: 197911976 kB
MemFree: 949740 kB
MemAvailable: 94719232 kB
Buffers: 0 kB
Cached: 91207616 kB
SwapCached: 2532 kB
Active: 50552000 kB
Inactive: 46356612 kB
Active(anon): 5083720 kB
Inactive(anon): 841644 kB
Active(file): 45468280 kB
Inactive(file): 45514968 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 16777212 kB
SwapFree: 16706812 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 5883448 kB
Mapped: 102040 kB
Shmem: 40044 kB
Slab: 4162276 kB
SReclaimable: 3788948 kB
SUnreclaim: 373328 kB
KernelStack: 16160 kB
PageTables: 182808 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 68545232 kB
Committed_AS: 7901648 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 574148 kB
VmallocChunk: 34359064572 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 46082
HugePages_Free: 1
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 247616 kB
DirectMap2M: 29112320 kB
DirectMap1G: 174063616 kB
[root@nccdb ~]#
HugePages usage is abnormal: 46082 * 2048 kB = 94375936 kB, about 90 GB:
HugePages_Total: 46082
HugePages_Free: 1
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
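The HugePages footprint can also be computed directly from meminfo-style input. The sketch below is an illustration only: the function name hugepages_reserved_kb is hypothetical, and the sample input reuses the two values shown above. On a live system the function could read /proc/meminfo from standard input instead.

```bash
#!/usr/bin/env bash
# Sketch: memory reserved for HugePages, in kB, from meminfo-style text on stdin.
# (hypothetical helper; live usage: hugepages_reserved_kb < /proc/meminfo)
hugepages_reserved_kb() {
    awk '/^HugePages_Total:/ { pages = $2 }
         /^Hugepagesize:/    { size_kb = $2 }
         END { printf "%d\n", pages * size_kb }'
}

# Sample input taken from the values shown above
sample='HugePages_Total:   46082
Hugepagesize:       2048 kB'

kb=$(printf '%s\n' "$sample" | hugepages_reserved_kb)
echo "$kb kB, about $(( kb / 1024 / 1024 )) GiB"
```

For the sample values this reports 94375936 kB, about 90 GiB, which matches the used figure seen after the reboot.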
Adjust the HugePages parameter vm.nr_hugepages:
[root@nccdb ~]# vim /etc/sysctl.conf
# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
kernel.shmall = 7549747
kernel.shmmax = 30923763712
kernel.sem = 250 32000 100 128
kernel.shmmni = 4096
kernel.randomize_va_space = 0
fs.file-max = 6815744
fs.aio-max-nr = 3145728
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
vm.nr_hugepages = 8196
[root@nccdb ~]#
Verify:
[root@nccdb ~]# sysctl -p
kernel.shmall = 7549747
kernel.shmmax = 30923763712
kernel.sem = 250 32000 100 128
kernel.shmmni = 4096
kernel.randomize_va_space = 0
fs.file-max = 6815744
fs.aio-max-nr = 3145728
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
vm.nr_hugepages = 8196
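[root@nccdb ~]# free -h
Memory usage is now about 17 GB, which is normal.
(The screenshot was taken after the memory was reduced.)
Conclusion: vm.nr_hugepages was set too high. The 46082 pages of 2048 kB each (about 90 GB) far exceed the 32 GB left on the clone, so memory allocation became abnormal and the system ran into OOM.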
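A simple pre-check can flag this kind of mismatch before a resized clone is booted into service. The sketch below is an assumption-laden illustration: the function name check_hugepages and the 60% default threshold are hypothetical choices, not values from this note, and 33554432 kB is the nominal size of 32 GB rather than a measured MemTotal.

```bash
#!/usr/bin/env bash
# Sketch: warn when the HugePages reservation exceeds a share of physical memory.
# Usage: check_hugepages NR_HUGEPAGES HUGEPAGE_KB MEMTOTAL_KB [MAX_PERCENT]
# (hypothetical helper; the 60% default threshold is an assumption)
check_hugepages() {
    local pages=$1 page_kb=$2 mem_kb=$3 max_pct=${4:-60}
    local reserved_kb=$(( pages * page_kb ))
    if (( reserved_kb * 100 > mem_kb * max_pct )); then
        echo "WARN: HugePages reserve ${reserved_kb} kB exceeds ${max_pct}% of ${mem_kb} kB"
        return 1
    fi
    echo "OK: HugePages reserve ${reserved_kb} kB is within ${max_pct}% of ${mem_kb} kB"
}

# Production value on a nominal 32 GB clone (33554432 kB is an assumed MemTotal)
check_hugepages 46082 2048 33554432 || true
# Corrected value against the MemTotal shown in /proc/meminfo above
check_hugepages 8196 2048 197911976
```

The first call warns because the reservation alone is larger than the clone's memory; the second passes with the corrected vm.nr_hugepages = 8196.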




