[20260423] Revisiting the use_large_pages parameter.txt
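--//Servers keep getting more memory, so it is best to configure hugepages at the first deployment rather than change it later, which only adds operational work.
--//use_large_pages now defaults to TRUE, which is easy to misread: it does not mean everything is placed on HugePages. HugePages are preferred, and if there are not enough,
--//small pages make up the difference and the Oracle instance still starts. The instance then runs its memory in mixed mode.
--//At some version I cannot pin down, the value AUTO_ONLY was added, so I test it here too. My tests from a few days ago were also too messy, so this post reorganizes them.
--//In passing: Oracle 21c does not support "borrowing" memory by way of the vm.nr_overcommit_hugepages kernel parameter.
--//The tests show that only ONLY and AUTO_ONLY put everything on hugepages.
--//I also found that 21c runs the oradism command at startup no matter how use_large_pages is set; only when vm.nr_hugepages is too small does
--//an ora_dism_<sid>-related process stay behind in the background.
--//I also ran into a problem with pgrep along the way.
--//Those last two issues will be covered in separate blog posts.
1. Environment: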
SYS@book> @ ver2
==============================
PORT_STRING : x86_64/Linux 2.4.xx
VERSION : 21.0.0.0.0
BANNER : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
BANNER_FULL : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.3.0.0.0
BANNER_LEGACY : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
CON_ID : 0
PL/SQL procedure successfully completed.
SYS@book> @ hidez ^sga_target|^sga_max_size
NUM N_HEX CON_ID NAME DESCRIPTION DEFAULT_VALUE SESSION_VALUE SYSTEM_VALUE ISSES ISSYS_MOD
---- ----- ------ ------------ ------------------- ------------- ------------- ------------ ----- ---------
173 AD 0 sga_max_size max total SGA size TRUE 1107296256 1107296256 FALSE FALSE
1797 705 0 sga_target Target size of SGA FALSE 1107296256 1107296256 FALSE IMMEDIATE
--//1107296256/1024/1024 = 1056 (MB)
--//In 2 MB hugepages that is 1056/2 = 528; in practice my test needs 530 hugepages.
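--//The arithmetic above can be sketched as a small shell function. The name hugepages_for_bytes is made up for this note, and the 2048 kB default
--//is the Hugepagesize reported on this test machine. It only gives the lower bound from the SGA size; the instance in this test needed 2 more.

```shell
#!/bin/bash
# Round a size in bytes up to a whole number of hugepages.
# hugepage_kb defaults to 2048, the Hugepagesize of this test system (an assumption elsewhere).
hugepages_for_bytes() {
    local bytes=$1 hugepage_kb=${2:-2048}
    local hugepage_bytes=$(( hugepage_kb * 1024 ))
    echo $(( (bytes + hugepage_bytes - 1) / hugepage_bytes ))
}

hugepages_for_bytes 1107296256   # sga_max_size from the listing above; prints 528
```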
SYS@book> @ pvalid use_large
Display valid values for multioption parameters matching "use_large"...
PAR# PARAMETER ORD VALUE DEFAULT
------ ---------------- --- ------------ -------
180 use_large_pages 1 TRUE DEFAULT
use_large_pages 2 AUTO
use_large_pages 3 ONLY
use_large_pages 4 FALSE
use_large_pages 5 AUTO_ONLY
2. Setting up the test environment:
# grep ^vm /etc/sysctl.d/98-oracle.conf
vm.nr_hugepages = 530
vm.nr_overcommit_hugepages = 50
# sysctl -p /etc/sysctl.d/98-oracle.conf
fs.file-max = 6815744
kernel.sem = 250 32000 100 128
kernel.shmmni = 4096
kernel.shmall = 1073741824
kernel.shmmax = 4398046511104
kernel.panic_on_oops = 1
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
net.ipv4.conf.all.rp_filter = 2
net.ipv4.conf.default.rp_filter = 2
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
vm.nr_hugepages = 530
vm.nr_overcommit_hugepages = 50
$ cat /u01/app/oracle/dbs/initbook.ora_org
SPFILE='/u01/app/oracle/dbs/spfilebook.ora'
use_large_pages=aaaaa
$ cat oradism.sh
#! /bin/bash
for i in TRUE AUTO ONLY FALSE AUTO_ONLY
do
sed "s/aaaaa/$i/" /u01/app/oracle/dbs/initbook.ora_org >| /u01/app/oracle/dbs/initbook.ora
echo ============================
tail -1 /u01/app/oracle/dbs/initbook.ora
#strace -ff -o aaa /u01/app/oracle/product/21.0.0/dbhome_1/bin/sqlplus -s -l sys/bookbook as sysdba <<<"startup nomount pfile=/u01/app/oracle/dbs/initbook.ora"
#sudo sysctl -w vm.nr_hugepages=520
sqlplus -s -l / as sysdba <<<"startup nomount pfile=/u01/app/oracle/dbs/initbook.ora" > /dev/null
grep -i hugepage /proc/meminfo
ipcs -m
ipcs -mu
pgrep -l dism
sqlplus -s -l / as sysdba <<<"shutdown immediate" > /dev/null
echo ============================
done
3. Test:
$ . oradism.sh | tee test1.txt
============================
use_large_pages=TRUE
AnonHugePages: 20480 kB
HugePages_Total: 530
HugePages_Free: 12
HugePages_Rsvd: 11
HugePages_Surp: 0
Hugepagesize: 2048 kB
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 1048576 oracle 600 10485760 32
0x00000000 1081345 oracle 600 1090519040 32
0x00000000 1114114 oracle 600 8388608 32
0xafa94c20 1146883 oracle 600 16384 32
--//Look at the last row: that shared memory segment is on small pages (4K), using 4 of them.
--//The number of hugepages actually in use is HugePages_Total - HugePages_Free + HugePages_Rsvd = 530-12+11 = 529.
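--//A minimal sketch of the same calculation with awk. The function name hugepages_committed is hypothetical; it reads meminfo-style text
--//on stdin, so the sample below simply replays the three values from this run.

```shell
#!/bin/bash
# Print HugePages_Total - HugePages_Free + HugePages_Rsvd from meminfo-style text on stdin.
hugepages_committed() {
    awk '/^HugePages_Total:/ {t = $2}
         /^HugePages_Free:/  {f = $2}
         /^HugePages_Rsvd:/  {r = $2}
         END {print t - f + r}'
}

# Values copied from the use_large_pages=TRUE run above; prints 529.
hugepages_committed <<'EOF'
HugePages_Total:     530
HugePages_Free:       12
HugePages_Rsvd:       11
EOF
```

--//On a live system the same function can be fed with: hugepages_committed < /proc/meminfo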
------ Shared Memory Status --------
segments allocated 4
pages allocated 270852
pages resident 265217
pages swapped 0
Swap performance: 0 attempts 0 successes
============================
use_large_pages=AUTO
AnonHugePages: 10240 kB
HugePages_Total: 530
HugePages_Free: 12
HugePages_Rsvd: 11
HugePages_Surp: 0
Hugepagesize: 2048 kB
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 1179648 oracle 600 10485760 32
0x00000000 1212417 oracle 600 1090519040 32
0x00000000 1245186 oracle 600 8388608 32
0xafa94c20 1277955 oracle 600 16384 32
--//Same as use_large_pages=TRUE: one shared memory segment is on small pages (4K), using 4 of them.
------ Shared Memory Status --------
segments allocated 4
pages allocated 270852
pages resident 265217
pages swapped 0
Swap performance: 0 attempts 0 successes
============================
use_large_pages=ONLY
AnonHugePages: 20480 kB
HugePages_Total: 530
HugePages_Free: 11
HugePages_Rsvd: 11
HugePages_Surp: 0
Hugepagesize: 2048 kB
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 1310720 oracle 600 10485760 32
0x00000000 1343489 oracle 600 1090519040 32
0x00000000 1376258 oracle 600 8388608 32
0xafa94c20 1409027 oracle 600 2097152 32
--//Look at the last row: that shared memory segment is on hugepages (2M), using 1 of them.
--//The number of hugepages actually in use is HugePages_Total - HugePages_Free + HugePages_Rsvd = 530-11+11 = 530.
------ Shared Memory Status --------
segments allocated 4
pages allocated 271360
pages resident 265728
pages swapped 0
Swap performance: 0 attempts 0 successes
============================
use_large_pages=FALSE
AnonHugePages: 10240 kB
HugePages_Total: 530
HugePages_Free: 530
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
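--//With use_large_pages=FALSE no hugepages are used at all. Note the consequence: the 530*2 = 1060 MB reserved for hugepages is wasted.
--//Everything sits on 4K pages, so roughly twice the memory is tied up (the idle hugepage pool plus the SGA on 4K pages). This must be avoided on production systems.
--//In fact, use_large_pages=FALSE should never be used under any circumstances!!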
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 1441792 oracle 600 9687040 66
0x00000000 1474561 oracle 600 1090519040 33
0x00000000 1507330 oracle 600 7090176 33
0xafa94c20 1540099 oracle 600 16384 33
------ Shared Memory Status --------
segments allocated 4
pages allocated 270340
pages resident 141170
pages swapped 0
Swap performance: 0 attempts 0 successes
============================
use_large_pages=AUTO_ONLY
AnonHugePages: 10240 kB
HugePages_Total: 530
HugePages_Free: 11
HugePages_Rsvd: 11
HugePages_Surp: 0
Hugepagesize: 2048 kB
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 1572864 oracle 600 10485760 32
0x00000000 1605633 oracle 600 1090519040 32
0x00000000 1638402 oracle 600 8388608 32
0xafa94c20 1671171 oracle 600 2097152 32
------ Shared Memory Status --------
segments allocated 4
pages allocated 271360
pages resident 265728
pages swapped 0
Swap performance: 0 attempts 0 successes
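--//Looking carefully at the end, pgrep -l dism printed nothing, so no ora_dism_<SID> process is left in the background.
--//In this test vm.nr_hugepages = 530, which satisfies the requirement.
--//What happens when vm.nr_hugepages < 529?
4. Further testing:
--//Modify the test script to add sudo sysctl -w vm.nr_hugepages=520.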
$ cat oradism.sh
#! /bin/bash
for i in TRUE AUTO ONLY FALSE AUTO_ONLY
do
sed "s/aaaaa/$i/" /u01/app/oracle/dbs/initbook.ora_org >| /u01/app/oracle/dbs/initbook.ora
echo ============================
tail -1 /u01/app/oracle/dbs/initbook.ora
#strace -ff -o aaa /u01/app/oracle/product/21.0.0/dbhome_1/bin/sqlplus -s -l sys/bookbook as sysdba <<<"startup nomount pfile=/u01/app/oracle/dbs/initbook.ora"
sudo sysctl -w vm.nr_hugepages=520
sqlplus -s -l / as sysdba <<<"startup nomount pfile=/u01/app/oracle/dbs/initbook.ora" > /dev/null
grep -i hugepage /proc/meminfo
ipcs -m
ipcs -mu
pgrep -l dism
sqlplus -s -l / as sysdba <<<"shutdown immediate" > /dev/null
done
--//For sudo, add the following to /etc/sudoers:
# grep oracle /etc/sudoers
oracle ALL=(ALL) ALL
$ . oradism.sh | tee test3.txt
--//Note: the first sudo call prompts for the oracle user's password; later calls do not.
============================
use_large_pages=TRUE
vm.nr_hugepages = 520
AnonHugePages: 16384 kB
HugePages_Total: 520
HugePages_Free: 10
HugePages_Rsvd: 7
HugePages_Surp: 0
Hugepagesize: 2048 kB
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 2359296 oracle 600 10485760 32
0x00000000 2392065 oracle 600 1073741824 32
0x00000000 2424834 oracle 600 16777216 32
0x00000000 2457603 oracle 600 7090176 32
0xafa94c20 2490372 oracle 600 16384 32
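--//There are 5 shared memory segments.
--//10485760/2/1024/1024 = 5
--//1073741824/2/1024/1024 = 512
--//16777216/2/1024/1024 = 8 (if it were on hugepages; on 4K pages it is 16777216/4/1024 = 4096)
--//7090176/4/1024 = 1731
--//16384/4/1024 = 4
--//The number of hugepages actually in use is HugePages_Total - HugePages_Free + HugePages_Rsvd = 520-10+7 = 517.
--//That matches 5+512 = 517, so only the first 2 shared memory segments are on hugepages; the others are on 4K pages.
--//In other words, with use_large_pages=TRUE hugepages are preferred, and when vm.nr_hugepages is too small 4K pages fill the gap: mixed mode.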
------ Shared Memory Status --------
segments allocated 5
pages allocated 270535
pages resident 261298
pages swapped 0
Swap performance: 0 attempts 0 successes
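--//As a cross-check, the byte sizes from ipcs -m can be summed in 4K units and compared with "pages allocated" from ipcs -mu. In this run the two agree,
--//which suggests ipcs -mu counts 4K pages even for hugepage-backed segments; that is an inference from these numbers, not a documented statement.
--//The function name pages_4k is made up for this sketch.

```shell
#!/bin/bash
# Sum the given segment sizes (bytes) expressed in 4K pages.
pages_4k() {
    local total=0 bytes
    for bytes in "$@"; do
        total=$(( total + bytes / 4096 ))
    done
    echo "$total"
}

# The five segments of the use_large_pages=TRUE run with vm.nr_hugepages=520; prints 270535.
pages_4k 10485760 1073741824 16777216 7090176 16384
```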
============================
use_large_pages=AUTO
vm.nr_hugepages = 520
AnonHugePages: 10240 kB
HugePages_Total: 529
HugePages_Free: 11
HugePages_Rsvd: 11
HugePages_Surp: 0
Hugepagesize: 2048 kB
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 2523136 oracle 600 10485760 32
0x00000000 2555905 oracle 600 1090519040 32
0x00000000 2588674 oracle 600 8388608 32
0xafa94c20 2621443 oracle 600 16384 32
--//HugePages_Total is 529 although vm.nr_hugepages was set to 520: the oradism command changed the kernel parameter to meet the requirement.
------ Shared Memory Status --------
segments allocated 4
pages allocated 270852
pages resident 265217
pages swapped 0
Swap performance: 0 attempts 0 successes
7237 oradism
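--//An oradism background process was started.
--//$ ll $(which oradism)
--//-rwsr-x---. 1 root oinstall 1867552 2021-07-28 02:50:06 /u01/app/oracle/product/21.0.0/dbhome_1/bin/oradism
--//Note that oradism is owned by root with permissions -rwsr-x--- (the s is the setuid bit), so when the oracle user runs it, it gains root privileges while executing.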
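--//The setuid bit can also be checked from a script with the standard test operator -u. A minimal sketch: has_setuid is a made-up helper name,
--//and it checks only the bit, not the owner, so root ownership still has to be confirmed with ls -l as above.

```shell
#!/bin/bash
# Succeed if the file exists and has its setuid bit set.
has_setuid() {
    [ -u "$1" ]
}

# Path taken from the listing above; adjust for another ORACLE_HOME.
f=/u01/app/oracle/product/21.0.0/dbhome_1/bin/oradism
if has_setuid "$f"; then
    echo "setuid bit is set on $f"
else
    echo "setuid bit is not set on $f (or the file does not exist)"
fi
```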
============================
use_large_pages=ONLY
vm.nr_hugepages = 520
AnonHugePages: 10240 kB
HugePages_Total: 520
HugePages_Free: 520
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
------ Shared Memory Status --------
segments allocated 0
pages allocated 0
pages resident 0
pages swapped 0
Swap performance: 0 attempts 0 successes
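--//With use_large_pages=ONLY the database cannot start at all: vm.nr_hugepages = 520 cannot satisfy the 530 hugepages needed for startup.
--//This is completely different from 11g. I had set vm.nr_overcommit_hugepages = 50 up front; under 11g, memory can be borrowed through vm.nr_overcommit_hugepages.
--//21c does not support this at all, and I wasted time here.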
============================
use_large_pages=FALSE
vm.nr_hugepages = 520
AnonHugePages: 10240 kB
HugePages_Total: 520
HugePages_Free: 520
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 2719744 oracle 600 9687040 66
0x00000000 2752513 oracle 600 1090519040 33
0x00000000 2785282 oracle 600 7090176 33
0xafa94c20 2818051 oracle 600 16384 33
------ Shared Memory Status --------
segments allocated 4
pages allocated 270340
pages resident 146413
pages swapped 0
Swap performance: 0 attempts 0 successes
--//With use_large_pages=FALSE hugepages are not used; everything is on 4K small pages.
============================
use_large_pages=AUTO_ONLY
vm.nr_hugepages = 520
AnonHugePages: 10240 kB
HugePages_Total: 530
HugePages_Free: 11
HugePages_Rsvd: 11
HugePages_Surp: 0
Hugepagesize: 2048 kB
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 2850816 oracle 600 10485760 32
0x00000000 2883585 oracle 600 1090519040 32
0x00000000 2916354 oracle 600 8388608 32
0xafa94c20 2949123 oracle 600 2097152 32
--//There are 4 shared memory segments.
--//10485760/2/1024/1024 = 5
--//1090519040/2/1024/1024 = 520
--//8388608/2/1024/1024 = 4
--//2097152/2/1024/1024 = 1
--//5+520+4+1 = 530
--//The number of hugepages actually in use is HugePages_Total - HugePages_Free + HugePages_Rsvd = 530-11+11 = 530.
------ Shared Memory Status --------
segments allocated 4
pages allocated 271360
pages resident 265728
pages swapped 0
Swap performance: 0 attempts 0 successes
7513 oradism
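--//use_large_pages=AUTO_ONLY is effectively AUTO and ONLY combined: vm.nr_hugepages = 520 is not enough, but oradism is started to change the kernel parameter and meet
--//the startup requirement. All 4 shared memory segments are on hugepages.
5. Summary:
--//1. USE_LARGE_PAGES = ONLY (strict): vm.nr_hugepages must be at least the number of hugepages required, and every shared memory segment uses hugepages.
--//2. USE_LARGE_PAGES = AUTO: if hugepages are insufficient, DISM is started to raise the kernel's vm.nr_hugepages to meet the startup requirement. One peculiarity: in that
--//case one shared memory segment is on 4K pages. If hugepages are sufficient, no dism process stays in the background, but one shared memory segment is still on 4K
--//pages.
--//3. USE_LARGE_PAGES = AUTO_ONLY: combines ONLY and AUTO. If hugepages are insufficient, DISM is started to raise the kernel's vm.nr_hugepages to meet the startup requirement.
--//If hugepages are sufficient, no dism process stays in the background. Either way, every shared memory segment uses hugepages.
--//4. For Oracle 21c, setting vm.nr_overcommit_hugepages has no effect: whatever USE_LARGE_PAGES is set to, Oracle does not "borrow" memory.
--//5. It seems that whatever USE_LARGE_PAGES is set to, oradism is invoked at instance startup, and more than once. My tests here do not show this; a separate blog post will cover it.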
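--//The observations from sections 3 and 4 can be condensed into a lookup. This is only a sketch of what was seen on this 21.3 test box with 2 MB hugepages;
--//the function name observed_outcome and the outcome strings are my own wording, not Oracle messages, and other versions may behave differently.

```shell
#!/bin/bash
# Arguments: use_large_pages value, configured vm.nr_hugepages, hugepages the instance needs.
# Prints the behaviour observed in this post for that combination.
observed_outcome() {
    local mode=$1 configured=$2 needed=$3 enough=no
    [ "$configured" -ge "$needed" ] && enough=yes
    case "$mode:$enough" in
        FALSE:*)       echo "4K pages only, hugepage pool unused" ;;
        ONLY:yes)      echo "all segments on hugepages" ;;
        ONLY:no)       echo "instance does not start" ;;
        AUTO_ONLY:yes) echo "all segments on hugepages" ;;
        AUTO_ONLY:no)  echo "oradism raises vm.nr_hugepages, all segments on hugepages" ;;
        AUTO:yes)      echo "hugepages, except one 16K segment on 4K pages" ;;
        AUTO:no)       echo "oradism raises vm.nr_hugepages, one 16K segment on 4K pages" ;;
        TRUE:yes)      echo "hugepages, except one 16K segment on 4K pages" ;;
        TRUE:no)       echo "mixed mode: hugepages first, 4K pages for the rest" ;;
        *)             echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

observed_outcome ONLY 520 530   # prints: instance does not start
```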
--//As a supplement, here is the oradebug ipc dump with USE_LARGE_PAGES = AUTO_ONLY.
SYS@book> oradebug setmypid
Statement processed.
SYS@book> oradebug ipc
IPC information written to the trace file
*** 2026-04-20T16:25:46.010352+08:00 (CDB$ROOT(1))
Processing Oradebug command 'ipc'
Dump of unix-generic skgm context
areaflags 00001fb7
realmflags 0003ffff
mapsize 00001000
protectsize 00001000
lcmsize 00001000
seglen 00001000
largestsize 0000040000000000
smallestsize 0000000001000000
stacklimit 0x7ffe457f4ab2
stackdir -1
mode 600
magic acc01ade
Dump of unix-generic realm handle `/u01/app/oracle/product/21.0.0/dbhome_1book', flags = 00000500
key 2947107872 actual_key 2947107872 num_areas 4 num_subareas 4
primary shmid: 2228227 primary sanum 3 version 3
deferred alloc: FALSE (0) def_post_create: FALSE (0) exp_memlock: 1060M
Area #0 `Fixed Size' containing Subareas 2-2
Total size 000000000093c768 Minimum Subarea size 00000000
Area Subarea Shmid Segment Addr Stable Addr Actual Addr
0 2 2129920 0x00000060000000 0x00000060000000 0x00000060000000
Subarea size Segment size Req_Protect Cur_protect
000000000093d000 0000000000a00000 default readwrite
Area #1 `Variable Size' containing Subareas 0-0
Total size 0000000041000000 Minimum Subarea size 01000000
Area Subarea Shmid Segment Addr Stable Addr Actual Addr
1 0 2162689 0x00000061000000 0x00000061000000 0x00000061000000
Subarea size Segment size Req_Protect Cur_protect
0000000041000000 0000000041000000 default readwrite
Area #2 `Redo Buffers' containing Subareas 1-1
Total size 00000000006c3000 Minimum Subarea size 00001000
Area Subarea Shmid Segment Addr Stable Addr Actual Addr
2 1 2195458 0x000000a2000000 0x000000a2000000 0x000000a2000000
Subarea size Segment size Req_Protect Cur_protect
00000000006c3000 0000000000800000 default readwrite
Area #3 `skgm overhead' containing Subareas 3-3
Total size 0000000000004000 Minimum Subarea size 00000000
Area Subarea Shmid Segment Addr Stable Addr Actual Addr
3 3 2228227 0x000000a3000000 0x000000a3000000 0x000000a3000000
Subarea size Segment size Req_Protect Cur_protect
0000000000004000 0000000000200000 default readwrite
$ ./lookup.awk skgm
skgm : operating system dependent kernel generic memory (os dependent)
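--//The first area is 'Fixed Size', the fixed SGA; the second is 'Variable Size', which contains the shared pool and the buffer cache; the third is 'Redo Buffers'; the fourth is
--//'skgm overhead', the index of this instance's shared memory structures.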
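--//The "Segment size" values in the dump are hexadecimal. Converting them to decimal shows they line up with the bytes column of ipcs -m in the AUTO_ONLY runs.
--//A minimal sketch; hex_to_dec is a made-up helper name.

```shell
#!/bin/bash
# Convert a hexadecimal string (without 0x prefix) to decimal.
hex_to_dec() {
    printf '%d\n' "0x$1"
}

# Segment sizes of areas 0-3 from the oradebug ipc dump above.
# Prints 10485760, 1090519040, 8388608 and 2097152, one per line.
for seg in 0000000000a00000 0000000041000000 0000000000800000 0000000000200000; do
    hex_to_dec "$seg"
done
```

--//Together that is 5+520+4+1 = 530 hugepages of 2 MB, i.e. 1060 MB, the same figure the dump shows as exp_memlock.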
