前提知识文章
如何锁定MYSQL内存在物理内存里?
探索MYSQL开启大页内存
MYSQL locked_in_memory
Linux 64 页表,进程内存,大页
Linux 关闭透明大页
第一步 配置
1 设置好内存大小 避免内存溢出
[mysqld]#Global Memon Set###innodb_buffer_pool_size = 2048Minnodb_log_buffer_size = 16Mkey_buffer_size = 8M###innodb buffer subordinateinnodb_sort_buffer_size = 4M #插入排序 1Minnodb_ddl_buffer_size = 4Minnodb_buffer_pool_dump_at_shutdown = ONinnodb_buffer_pool_load_at_startup = ONinnodb_adaptive_hash_index = OFFlarge-pages #开启大页 (2048+16+8)/2=1036 Pagesmemlock #4K PAGES LOCK MEMORYmax_connections=500#### Thread Memon Set 8+8+8+8+16+16+8+1=73M+1=74MB PGA##PGA CACHE 74*32=2368+8(32*256K)=2376MBjoin_buffer_size=8388608 #8Msort_buffer_size=8388608 #8Mread_buffer_size=8388608 #8Mread_rnd_buffer_size=8388608 #8Mnet_buffer_length = 16384 #16K max 1Mtmp_table_size =16777216 #16Mbinlog_cache_size =16777216 #16Mbulk_insert_buffer_size=8388608 #8Mthread_cache_size = 32thread_stack = 256Ksync_binlog = 1innodb_flush_log_at_trx_commit = 1innodb_flush_method=o_direct##PS内存 不好统计#performance_schema_max_sql_text_length=1024 #SQL 最大长度#performance_schema_max_digest_length=1024 #SQL数字化后长度#performance_schema_events_statements_history_long_size=10000 #全库SQL最近1万条 这里的SQL_TEXT,DIGEST_TEXT属于静态分配 LENGTH*SIZE#performance_schema_events_statements_history_size=100 #每个线程最近100条 注意线程数量#performance_schema_digests_size=100000 #数字化后保存10万条performance_schema=ONperformance_schema_events_statements_history_size=100performance_schema_events_statements_history_long_size=10000performance_schema_digests_size=100000performance_schema_max_sql_text_length=1024performance_schema_max_digest_length=1024
#配置大页参数 my.cnf
[mysqld]large-pages
2 配置LINUX内核参数
设置内核参数kernel.shmmax和kernel.shmall
shmmax是最大的共享内存段的大小,单位是字节,默认32M
shmall是共享内存的总大小,单位是页
vim etc/sysctl.conf
#-- kernel.shmmni这个内核参数用于设置系统范围内共享内存段的最大数量。该参数的默认值是 4096.#-- kernel.shmall 控制共享内存页数#-- kernel.shmmax 单个共享内存段的最大尺寸,设置为物理内存的 50%#-- Increase total amount of shared memory. The value#-- is the number of pages. At 4KB/page, 4194304 = 16GB.#--echo 4194304 > proc/sys/kernel/shmall#--对于mysql的使用,最好是shmmax与shmall接近#--7168MBkernel.shmmax = 7516192768kernel.shmall = 1835008kernel.shmmni = 4096vm.hugetlb_shm_group =27 #--MYSQL 所在的用户组IDvm.nr_hugepages=2290 ##--持久化大页内存数量
3.修改ulimit
使用ulimit -l或设置/etc/security/limits.conf
vim etc/security/limits.conf@mysql soft memlock unlimited@mysql hard memlock unlimited
4 通过 id mysql 获取mysql所在的group id
[root@localhost ~]# id mysqluid=27(mysql) gid=27(mysql) groups=27(mysql)echo 27 > proc/sys/vm/hugetlb_shm_group[root@localhost ~]# sysctl -w vm.hugetlb_shm_group=27vm.hugetlb_shm_group = 27
5 动态设置分配大页内存:
echo 1040 > proc/sys/vm/nr_hugepages
6 测试动态启动大页
(大页内存不会被其它进程使用,规划化太多会导致LINUX系统本身不够用)
[root@localhost admin_mapper]# free -mtotal used free shared buff/cache availableMem: 7821 158 5985 33 1676 6962Swap: 8063 174 7889[root@localhost admin_mapper]# echo 1040 > proc/sys/vm/nr_hugepages[root@localhost admin_mapper]# free -mtotal used free shared buff/cache availableMem: 7821 2238 3905 33 1676 4883Swap: 8063 174 7889# 1040*2+158=2238
第二 实验
1 空初始
[root@localhost admin_mapper]# echo 0 > proc/sys/vm/nr_hugepages[root@localhost admin_mapper]# free -mtotal used free shared buff/cache availableMem: 7821 157 6018 33 1645 6964Swap: 8063 174 7889[root@localhost admin_mapper]# cat proc/meminfo |grep -E "Huge|Mlocked|Page"Mlocked: 0 kBAnonPages: 40388 kBPageTables: 3228 kBAnonHugePages: 0 kBHugePages_Total: 0HugePages_Free: 0HugePages_Rsvd: 0HugePages_Surp: 0Hugepagesize: 2048 kB
2 启动默认设置的MYSQL
[root@localhost admin_mapper]# echo 0 > proc/sys/vm/nr_hugepages[root@localhost admin_mapper]# free -mtotal used free shared buff/cache availableMem: 7821 1295 4848 33 1677 5825Swap: 8063 174 7889[root@localhost admin_mapper]# cat proc/meminfo |grep -E "Huge|Mlocked|Page"Mlocked: 0 kBAnonPages: 1205588 kBPageTables: 6132 kBAnonHugePages: 0 kBHugePages_Total: 0HugePages_Free: 0HugePages_Rsvd: 0HugePages_Surp: 0Hugepagesize: 2048 kB[root@localhost ~]# cat proc/$(pidof mysqld)/oom_score62
# 1这里使用普通模式 页表(PageTables)占用6132-3228=2904KB,
# 2物理内存占用1295-157=1138MB 这里的物理内存是启动分配的,随着负载增加而动态增长
# 3 OOM得分为62
3 启动锁物理内存的MYSQL
[root@localhost ~]# free -mtotal used free shared buff/cache availableMem: 7821 3535 2831 33 1454 3523Swap: 8063 174 7889[root@localhost ~]# cat proc/meminfoMemTotal: 8008964 kBMemFree: 2899516 kBMemAvailable: 3608024 kBMlocked: 3501712 kBPageTables: 10524 kBHugePages_Total: 0HugePages_Free: 0HugePages_Rsvd: 0HugePages_Surp: 0Hugepagesize: 2048 kBDirectMap4k: 90076 kBDirectMap2M: 3055616 kBDirectMap1G: 7340032 kB[root@localhost ~]# cat proc/$(pidof mysqld)/oom_score210
## 这里 物理内存使用 3535-157=3221 这么说来使用MEMLOCK参数,会全部分配INNODB BUFFER
## 页表达到 10524KB 10524-3228=7296 =7MB
# 3 OOM得分为210
4 启动大页和锁的MYSQL
[root@localhost admin_mapper]# free -mtotal used free shared buff/cache availableMem: 7821 2238 3905 33 1676 4883Swap: 8063 174 7889[root@localhost ~]# service mysqld startRedirecting to bin/systemctl start mysqld.service
mysql> SELECT * FROM sys.memory_global_total;+-----------------+| total_allocated |+-----------------+| 2.90 GiB |+-----------------+1 row in set (0.00 sec)
mysql> SELECT event_name,current_alloc FROM sys.memory_global_by_current_bytes limit 16;+------------------------------------------------------------------------------+---------------+| event_name | current_alloc |+------------------------------------------------------------------------------+---------------+| memory/innodb/buf_buf_pool | 2.04 GiB || memory/performance_schema/events_statements_summary_by_digest | 396.73 MiB || memory/performance_schema/events_statements_summary_by_digest.digest_text | 97.66 MiB || memory/innodb/hash0hash | 57.43 MiB || memory/performance_schema/events_statements_history | 35.55 MiB || memory/performance_schema/events_statements_history.sql_text | 25.00 MiB || memory/performance_schema/events_statements_history.digest_text | 25.00 MiB || memory/innodb/ut0link_buf | 24.00 MiB || memory/performance_schema/events_statements_history_long | 13.89 MiB || memory/performance_schema/events_errors_summary_by_thread_by_error | 12.52 MiB || memory/performance_schema/events_statements_summary_by_thread_by_event_name | 10.20 MiB || memory/performance_schema/events_statements_history_long.digest_text | 9.77 MiB || memory/performance_schema/events_statements_history_long.sql_text | 9.77 MiB || memory/performance_schema/memory_summary_by_thread_by_event_name | 9.32 MiB || memory/performance_schema/table_handles | 9.06 MiB || memory/mysys/KEY_CACHE | 8.00 MiB |+------------------------------------------------------------------------------+---------------+16 rows in set (0.00 sec)
mysql> select sys.format_bytes (sum(current_alloc)) from sys.x$memory_global_by_current_bytes where event_name like 'memory/performance_schema%';+---------------------------------------+| sys.format_bytes (sum(current_alloc)) |+---------------------------------------+| 750.22 MiB |+---------------------------------------+1 row in set, 1 warning (0.00 sec)
[root@localhost admin_mapper]# free -mtotal used free shared buff/cache availableMem: 7821 3793 2350 33 1677 3262Swap: 8063 174 7889[root@localhost admin_mapper]# cat proc/meminfo |grep -E "Huge|Mlocked|Page"Mlocked: 1641060 kBAnonPages: 1632036 kBPageTables: 6856 kBAnonHugePages: 0 kBHugePages_Total: 1040HugePages_Free: 50HugePages_Rsvd: 0HugePages_Surp: 0Hugepagesize: 2048 kB[root@localhost ~]# cat proc/$(pidof mysqld)/oom_score90
#启动大页MYSQL后 非大页内存使用1555MB 3793-2238=1555MB
#大页消耗大部分,只剩50个页 约100MB 使用 1980MB
#Mlocked: 1641060 kB 1602.59MB
#页表消耗 6856-3228=3628KB=3.54MB
# OOM 得分降低到90
5 加大大页内数量
mysql> show variables like '%heap%';+---------------------+----------+| Variable_name | Value |+---------------------+----------+| max_heap_table_size | 16777216 |+---------------------+----------+1 row in set (0.00 sec)mysql> select sys.format_bytes (sum(current_alloc)) from sys.x$memory_global_by_current_bytes where event_name like 'memory/performance_schema%';+---------------------------------------+| sys.format_bytes (sum(current_alloc)) |+---------------------------------------+| 750.22 MiB |+---------------------------------------+1 row in set, 1 warning (0.00 sec)[mysqld]#Global Memon Set###innodb_buffer_pool_size = 2048Minnodb_log_buffer_size = 16Mkey_buffer_size = 8Mmax_heap_table_size =16M#2838/2=1419[root@localhost ~]# free -mtotal used free shared buff/cache availableMem: 7821 158 6203 33 1459 6966Swap: 8063 174 7889[root@localhost admin_mapper]# echo 1419 > proc/sys/vm/nr_hugepages[root@localhost ~]# free -mtotal used free shared buff/cache availableMem: 7821 2997 3364 33 1459 4127Swap: 8063 174 7889[root@localhost ~]# cat proc/meminfo | grep HugeAnonHugePages: 0 kBHugePages_Total: 1419HugePages_Free: 1419HugePages_Rsvd: 0HugePages_Surp: 0Hugepagesize: 2048 kB[root@localhost ~]# service mysqld startRedirecting to bin/systemctl start mysqld.service[root@localhost ~]# free -mtotal used free shared buff/cache availableMem: 7821 4277 2083 33 1459 2781Swap: 8063 174 7889[root@localhost ~]# cat proc/meminfo | grep HugeAnonHugePages: 0 kBHugePages_Total: 1419HugePages_Free: 363HugePages_Rsvd: 0HugePages_Surp: 0Hugepagesize: 2048 kB[root@localhost ~]# cat proc/meminfo | grep MlockMlocked: 1360332 kB[root@localhost ~]# cat proc/$(pidof mysqld)/oom_score82top -p 3267PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND3267 root 20 0 5532860 1.3g 67248 S 0.7 17.3 0:10.85 mysqld
mysql> select event_name,current_alloc from sys.memory_global_by_current_bytes limit 20;+------------------------------------------------------------------------------+---------------+| event_name | current_alloc |+------------------------------------------------------------------------------+---------------+| memory/innodb/buf_buf_pool | 2.06 GiB || memory/performance_schema/events_statements_summary_by_digest | 396.73 MiB || memory/performance_schema/events_statements_summary_by_digest.digest_text | 97.66 MiB || memory/innodb/hash0hash | 57.43 MiB || memory/innodb/ut0link_buf | 24.00 MiB || memory/performance_schema/events_statements_summary_by_thread_by_event_name | 10.20 MiB |mysql> select sys.format_bytes (sum(current_alloc)) from sys.x$memory_global_by_current_bytes where event_name like 'memory/performance_schema%';+---------------------------------------+| sys.format_bytes (sum(current_alloc)) |+---------------------------------------+| 606.74 MiB |+---------------------------------------+1 row in set, 1 warning (0.08 sec)
#分析除了PS内存降低了150M 主要是HISTORY_SIZE 回归1000行的结果
#物理内存4277-2997=1280MB
#top 物理内存1.3G显示那些无法放进共享大页内存的 和 线程级别的内存.
#大页内存有363页空闲等于726MB没有被使用,使用了2112 MB .
#MEMLOCK 锁定物理内存1328MB,多锁了48MB内存.
对比第四个实验
1 物理内存小页使用 1555MB 这里是1280MB 1555-1280=275 物理内存少了275MB
2 MEMLOCK 1602.59-1328 =274.59 锁定物理内存页少了275MB
3 大页内存 (1419-363)-(1040-50)=66*2=132MB
4 我这里使用的是750PS内存,第四步的时候. 第五步PS内存是606, 750-606=144; 275-144=131MB
这样就怼上了,物理内存少的那么多,就被加入到了大页内存中132MB
留下的疑问
1 这132MB是谁呢?
大页内存使用(1419-363)*2=2112MB -2048(INNODBBUF)=64MB
这又对不上了,扣掉INNODBBUF 才64MB
再扣掉40MB(LOG_BUF,KEY_BUF,HEAP) 就剩24MB
这24MB又是谁啊? 再说大页内存相对加入了132MB,
2 系统内存与MYSQL内存差异
从系统上MYSQL累积使用内存 大页内存+锁定内存=(1419-363)*2+(1280)=3392 大于MYSQL看到的内存 2.78 且相差 -545.28MB
mysql> select sys.format_bytes (sum(current_alloc))from sys.x$memory_global_by_current_bytes ;+---------------------------------------+| sys.format_bytes (sum(current_alloc)) |+---------------------------------------+| 2.78 GiB |+---------------------------------------+1 row in set, 1 warning (0.00 sec)
3 PS 统计内存不准
INNODB BUF 看样子确实分配了
mysql> show engine innodb status;----------------------BUFFER POOL AND MEMORY----------------------Total large memory allocated 0Dictionary memory allocated 446200Buffer pool size 131056Buffer pool size, bytes 2147221504Free buffers 129263Database pages 1784Old database pages 0Modified db pages 0Pending reads 0Pending writes: LRU 0, flush list 0, single page 0Pages made young 0, not young 00.00 youngs/s, 0.00 non-youngs/sPages read 1642, created 142, written 1690.00 reads/s, 0.00 creates/s, 0.00 writes/sBuffer pool hit rate 1000 1000, young-making rate 0 1000 not 0 1000Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/sLRU len: 1784, unzip_LRU len: 0I/O sum[0]:cur[0], unzip sum[0]:cur[0]
mysql> select event_name,current_allocfrom sys.x$memory_global_by_current_byteswhere event_name like 'memory/innodb%';+-------------------------------------------+---------------+| event_name | current_alloc |+-------------------------------------------+---------------+| memory/innodb/buf_buf_pool | 2213220352 |
# 这里PS占用2110MB 明显统计INNODB BUFSIZE 不准确
4 INNODB BUF 是否完全分配
第4次实验大页内存使用1980MB,明显低于INNODBBUF SIZE 2048MB
SYSBENCH OLTP 压测
10万笔数据 4个表,4个线程
传统页
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)Running the test with following options:Number of threads: 4Initializing random number generator from current timeInitializing worker threads...Threads started!SQL statistics:queries performed:read: 263158write: 75188other: 37594total: 375940transactions: 18797 (313.13 per sec.)queries: 375940 (6262.59 per sec.)ignored errors: 0 (0.00 per sec.)reconnects: 0 (0.00 per sec.)General statistics:total time: 60.0279stotal number of events: 18797Latency (ms):min: 4.05avg: 12.77max: 333.2895th percentile: 36.24sum: 239973.87Threads fairness:events (avg/stddev): 4699.2500/21.13execution time (avg/stddev): 59.9935/0.01
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)Running the test with following options:Number of threads: 4Initializing random number generator from current timeInitializing worker threads...Threads started!SQL statistics:queries performed:read: 272524write: 77864other: 38932total: 389320transactions: 19466 (324.32 per sec.)queries: 389320 (6486.44 per sec.)ignored errors: 0 (0.00 per sec.)reconnects: 0 (0.00 per sec.)General statistics:total time: 60.0189stotal number of events: 19466Latency (ms):min: 4.19avg: 12.33max: 335.0195th percentile: 34.33sum: 239989.53Threads fairness:events (avg/stddev): 4866.5000/31.61execution time (avg/stddev): 59.9974/0.00
大页+锁
[root@localhost ~]# sh sysbench_oltp_4thread_test_run.sh
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)Running the test with following options:Number of threads: 4Initializing random number generator from current timeInitializing worker threads...Threads started!SQL statistics:queries performed:read: 264656write: 75616other: 37808total: 378080transactions: 18904 (315.01 per sec.)queries: 378080 (6300.11 per sec.)ignored errors: 0 (0.00 per sec.)reconnects: 0 (0.00 per sec.)General statistics:total time: 60.0101stotal number of events: 18904Latency (ms):min: 4.25avg: 12.69max: 357.7195th percentile: 38.94sum: 239956.15Threads fairness:events (avg/stddev): 4726.0000/20.58execution time (avg/stddev): 59.9890/0.00
[root@localhost ~]# sh sysbench_oltp_4thread_test_run.sh
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)Running the test with following options:Number of threads: 4Initializing random number generator from current timeInitializing worker threads...Threads started!SQL statistics:queries performed:read: 286048write: 81728other: 40864total: 408640transactions: 20432 (340.46 per sec.)queries: 408640 (6809.12 per sec.)ignored errors: 0 (0.00 per sec.)reconnects: 0 (0.00 per sec.)General statistics:total time: 60.0116stotal number of events: 20432Latency (ms):min: 4.21avg: 11.74max: 267.9195th percentile: 30.81sum: 239955.77Threads fairness:events (avg/stddev): 5108.0000/26.05execution time (avg/stddev): 59.9889/0.00transactions: 19466 (324.32 per sec.)queries: 389320 (6486.44 per sec.)95th percentile: 34.33transactions: 20432 (340.46 per sec.) 5%queries: 408640 (6809.12 per sec.) 5%95th percentile: 30.81 10.2%
总结下
1开启大页内存有效降低了OOM得分,减少被KILL风险.
2 有效降低了LINUX内核页表大小,提高性能.
3 锁定在物理内存不被SWAP,提高性能.
4 比较难精确计算出大页内存被谁使用了
5 存在系统内存与MYSQL内存统计差异大
6 PS内存统计与INNNODB STATUS统计差异大
7 大页内存8GB 4线程下提升5%的性能





