Who else is using my memory - File System Cache analysis
See also our previous articles:
- Do not underestimate performance impacts of swapping on NUMA database systems
- MariaDB and MySQL swap analysis
When we analyse MariaDB database servers we also check the available memory (RAM and Swap):
# free --kilo --wide
              total        used        free      shared     buffers       cache   available
Mem:       16106252     4329952      703356      199008      307872    10765072    11042748
Swap:      31250428      528684    30721744
The values for buffers and especially for cache can sometimes be quite big. In this case they use about 10 GiB. So let us have a look at what these things called buffers and cache, which are using our valuable RAM, actually are. When we check the man pages of free we find:
# man free
...
       buffers
              Memory used by kernel buffers (Buffers in /proc/meminfo)

       cache  Memory used by the page cache and slabs (Cached and Slab in
              /proc/meminfo)

       buff/cache
              Sum of buffers and cache
So let us check the more fine-grained information in /proc/meminfo, which is an interface to the kernel data structures:
# cat /proc/meminfo | grep -e ^Cached -e Slab -e Buffers
Buffers:          307872 kB
Cached:         10155156 kB
Slab:             609916 kB
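These values match what free reported above: Buffers corresponds directly to the buffers column, and Cached plus Slab adds up exactly to the cache column:

# echo $(( 10155156 + 609916 ))
10765072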
Same values! Then let us have a look at the man pages of proc to see what they say about these values:
# man proc
...
       Buffers     Relatively temporary storage for raw disk blocks that
                   shouldn't get tremendously large (20MB or so).

       Cached      In-memory cache for files read from the disk (the page
                   cache). Doesn't include SwapCached.

       Slab        In-kernel data structures cache.
So it looks like we have a raw I/O cache (called Buffer Cache) and a File System I/O cache (called Page Cache). So how does this work? What is a raw I/O? And is a File System I/O cached once (Cached) or twice (Cached and Buffers)?
When we dig a bit deeper we find that prior to Linux kernel 2.4 the two caches were distinct, which was a waste of memory (RAM). Today this is no longer the case [1], [2], [3], so the man pages seem to be a bit out of date, or at least not very precise.
Analysing the Linux Page Cache
A very good source when it comes to Linux performance tuning and measuring is Brendan Gregg's website. To measure the Linux Page Cache Hit Ratio he provides a tool called cachestat, which is part of the perf-tools collection on GitHub.
With cachestat we get per-second statistics of the Buffer Cache and the Page Cache (without Slabs): Cache Hits, Cache Misses, Dirty Buffer Entries in the Cache and the Cache Hit Ratio:
# sudo cachestat
Counting cache functions... Output every 1 seconds.
    HITS   MISSES  DIRTIES    RATIO   BUFFERS_MB   CACHE_MB
    1419        8        0    99.4%          338       9406
    1368        0        0   100.0%          338       9406
    1391        0        0   100.0%          338       9406
    8558        0       29   100.0%          338       9406
   31870        0      163   100.0%          338       9406
    1374        0       24   100.0%          338       9406
    1388        0        0   100.0%          338       9406
    1370        0        0   100.0%          338       9406
    1388        0        0   100.0%          338       9406
Brendan Gregg also mentions a tool called pcstat (on GitHub) by Al Tobey which gathers Page Cache statistics for files. Unfortunately I had some problems building it on my Ubuntu 16.04 with Go version 1.6, so I built it on an Ubuntu 18.04 (Go 1.10) and copied it over to the Ubuntu 16.04 machine:
# export GOPATH=/tmp/
# cd $GOPATH
# go get golang.org/x/sys/unix
# go get github.com/tobert/pcstat/pcstat
# bin/pcstat $GOPATH/bin/pcstat
Then I tried pcstat out against a MariaDB 10.4 instance. In the output we can see how big the files are in bytes, how many 4 KiB pages this corresponds to, how many of these 4 KiB pages are cached and the percentage of pages cached:
# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 14642          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 13631488       | 3328       | 0         | 000.000 |
+------------------------------------------------------+----------------+------------+-----------+---------+
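The Pages column is simply the file size divided by the 4 KiB page size, as a quick check with the test.ibd numbers shows:

# echo $(( 13631488 / 4096 ))
3328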
When we run pcstat over time with the famous watch command we can even see how the Page Cache is heating up:
# watch -d -n 1 'pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test* ; free -w'
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 14642          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 2416      | 012.418 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 3165      | 004.829 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 5890      | 008.987 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 1         | 100.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 13631488       | 3328       | 1164      | 034.976 |
+------------------------------------------------------+----------------+------------+-----------+---------+

              total        used        free      shared     buffers       cache   available
Mem:       16106252     4329952      703356      199008      307872    10765072    11042748
Swap:      31250428      528684    30721744
Another tool discussed on Brendan Gregg's website is vmtouch - the Virtual Memory Toucher (on GitHub, Documentation). With vmtouch we can see, for example, how much of the directory /home/mysql/database/mariadb-104/data (the datadir) is currently in the cache:
# vmtouch -f /home/mysql/database/mariadb-104/data
           Files: 503
     Directories: 9
  Resident Pages: 29356/231060  114M/902M  12.7%
         Elapsed: 0.009668 seconds
Or, more fine-grained, how much of the InnoDB system files is currently in memory:
# vmtouch -f -v /home/mysql/database/mariadb-104/data/ib*
/home/mysql/database/mariadb-104/data/ib_buffer_pool
[                                                            ] 0/4
/home/mysql/database/mariadb-104/data/ibdata1
[oOooooo    ooooo                                            ] 2416/19456
/home/mysql/database/mariadb-104/data/ib_logfile0
[o                                                        oOO] 3165/65536
/home/mysql/database/mariadb-104/data/ib_logfile1
[OOOOOOOOOOOOOOOOOOOOOo                                      ] 23192/65536
/home/mysql/database/mariadb-104/data/ibtmp1
[                                                            ] 0/3072

           Files: 5
     Directories: 0
  Resident Pages: 28773/153604  112M/600M  18.7%
         Elapsed: 0.005499 seconds
A further question to answer is: Can I see all the files cached in the Page Cache? It seems like this is not easily possible:
There is no efficient search mechanism for doing the reverse - getting a file name belonging to a data block would require reading all inodes and indirect blocks on the file system. If you need to know about every single file's blocks stored in the page cache, you would need to supply a list of all files on your file system(s) to fincore. But that again is likely to spoil the measurement as a large amount of data would be read traversing the directories and getting all inodes and indirect blocks - putting them into the page cache and evicting the very page cache data you were trying to examine. [5]
In this article we can also read about the Linux File Tools (linux-ftools) by Google. They seem to be a bit more complicated to get working, so I let them be.
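Note: The fincore utility mentioned in the quote above ships with recent versions of util-linux (2.30 and newer), so on a current distribution a per-file check does not even require building extra tools (path as in the examples above):

# fincore /home/mysql/database/mariadb-104/data/ibdata1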
How is the Page Cache related to MariaDB?
After all this technical O/S discussion, how is the Linux Page Cache related to your MariaDB database? Your MariaDB database caches Data and Indexes as well: for the InnoDB Storage Engine this is the InnoDB Buffer Pool, and for the Aria Storage Engine it is the Aria Page Cache Buffer. So if your MariaDB database caches pages and your Linux O/S caches pages, the probability is high that they cache the same data twice and thus waste valuable RAM! Fortunately InnoDB can be configured so that it does NOT cache InnoDB files in the Page Cache. This is controlled with the InnoDB server variable innodb_flush_method.
When we look at InnoDB files which were opened in the "normal" way (default: innodb_flush_method = fsync) we get the following information about how the files were opened (man 2 open and [6]):
# lsof +fg ./ib*
COMMAND  PID USER   FD   TYPE FILE-FLAG     DEVICE  SIZE/OFF    NODE NAME
mysqld  2098 mysql   7uW  REG RW,LG,0x80000    8,1  79691776 9175185 ./ibdata1
mysqld  2098 mysql  11uW  REG RW,LG,0x80000    8,1 268435456 9175186 ./ib_logfile0
mysqld  2098 mysql  12uW  REG RW,LG,0x80000    8,1 268435456 9175187 ./ib_logfile1
mysqld  2098 mysql  13uW  REG RW,LG,0x80000    8,1  12582912 9175280 ./ibtmp1
The interesting column here is the FILE-FLAG column, which man lsof explains as follows:
# man lsof
...
       FILE-FLAG  when g or G has been specified to +f, this field contains
                  the contents of the f_flag[s] member of the kernel file
                  structure and the kernel's per-process open file flags (if
                  available); `G' causes them to be displayed in hexadecimal;
                  `g', as short-hand names; two lists may be displayed with
                  entries separated by commas, the lists separated by a
                  semicolon (`;'); the first list may contain short-hand
                  names for f_flag[s] values from the following table:

                       DIR       direct
                       LG        large file
                       RW        read and write access
The output is not yet completely clear, so we additionally want to see the open file flags in hexadecimal notation:
# lsof +fG ./ib*
COMMAND  PID USER   FD   TYPE FILE-FLAG   DEVICE  SIZE/OFF    NODE NAME
mysqld  2098 mysql   7uW  REG 0x88002;0x0    8,1  79691776 9175185 ./ibdata1
mysqld  2098 mysql  11uW  REG 0x88002;0x0    8,1 268435456 9175186 ./ib_logfile0
mysqld  2098 mysql  12uW  REG 0x88002;0x0    8,1 268435456 9175187 ./ib_logfile1
mysqld  2098 mysql  13uW  REG 0x88002;0x0    8,1  12582912 9175280 ./ibtmp1
The Linux Kernel open file flags can be found here: fcntl.h. I have extracted the most relevant open file flags for our examination:
#define O_RDWR      00000002   (oct, 0x00002)
#define O_DIRECT    00040000   (oct, 0x04000)   /* direct disk access hint */
#define O_LARGEFILE 00100000   (oct, 0x08000)
#define O_CLOEXEC   02000000   (oct, 0x80000)   /* set close_on_exec */
So we can see that these 4 InnoDB files were opened with O_RDWR (RW), O_LARGEFILE (LG) and O_CLOEXEC (not available (yet?) in the lsof translation output).
Now let us start the MariaDB database with the server variable set to innodb_flush_method = O_DIRECT and check how the files were opened:
# lsof +fg ./ib*
COMMAND  PID USER   FD   TYPE FILE-FLAG         DEVICE  SIZE/OFF    NODE NAME
mysqld  2098 mysql   7uW  REG RW,DIR,LG,0x80000    8,1  79691776 9175185 ./ibdata1
mysqld  2098 mysql  11uW  REG RW,LG,0x80000        8,1 268435456 9175186 ./ib_logfile0
mysqld  2098 mysql  12uW  REG RW,LG,0x80000        8,1 268435456 9175187 ./ib_logfile1
mysqld  2098 mysql  13uW  REG RW,DIR,LG,0x80000    8,1  12582912 9175280 ./ibtmp1

# lsof +fG ./ib*
COMMAND  PID USER   FD   TYPE FILE-FLAG   DEVICE  SIZE/OFF    NODE NAME
mysqld  2098 mysql   7uW  REG 0x8c002;0x0    8,1  79691776 9175185 ./ibdata1
mysqld  2098 mysql  11uW  REG 0x88002;0x0    8,1 268435456 9175186 ./ib_logfile0
mysqld  2098 mysql  12uW  REG 0x88002;0x0    8,1 268435456 9175187 ./ib_logfile1
mysqld  2098 mysql  13uW  REG 0x8c002;0x0    8,1  12582912 9175280 ./ibtmp1
We can see a new flag DIR or 0x04000, which means the files were opened with O_DIRECT - but only the InnoDB Temporary Table Tablespace and the InnoDB System Tablespace, not the two InnoDB Transaction Logs. Translation of hex to oct: 0x8c002 = 02140002.
But what does O_DIRECT mean? Looking at the open(2) man pages we can find:
# man 2 open
...
       O_DIRECT (since Linux 2.4.10)
              Try to minimize cache effects of the I/O to and from this file.
              In general this will degrade performance, but it is useful in
              special situations, such as when applications do their own
              caching. File I/O is done directly to/from user-space buffers.
              The O_DIRECT flag on its own makes an effort to transfer data
              synchronously, but does not give the guarantees of the O_SYNC
              flag that data and necessary metadata are transferred. To
              guarantee synchronous I/O, O_SYNC must be used in addition to
              O_DIRECT.
So O_DIRECT is exactly what we want in this case: bypassing the File System Page Cache so that the database blocks are not cached twice!
To verify the impact we run pcstat again:
# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 16020          | 4          | 4         | 100.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 140       | 000.720 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 36844     | 056.219 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 65536     | 100.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 1         | 100.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 67108864       | 16384      | 13400     | 081.787 |
+------------------------------------------------------+----------------+------------+-----------+---------+
But... parts of the InnoDB Tablespace files are still cached! Checking the total amount of Buffers and Cache shows the same:
# free
              total        used        free      shared  buff/cache   available
Mem:       16106252     4401788      368200      456716    11336264    10691792
Swap:      31250428     1348440    29901988
So restarting the MariaDB database does not purge the Page Cache! Note: This is important to know because bypassing the Page Cache avoids wasting valuable RAM, but it also makes a database restart much more costly, because the Page Cache no longer helps to warm up the InnoDB Buffer Pool!
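On the database side this warm-up cost can be mitigated with the InnoDB Buffer Pool dump/load feature - the ib_buffer_pool file seen in the listings above is its dump file. The relevant server variables (enabled by default in recent MariaDB versions) can be checked as follows:

# mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_dump_at_shutdown'"
# mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_load_at_startup'"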
Then let us clear the Linux Page Cache and check the result:
# echo 1 > /proc/sys/vm/drop_caches
# free -w
              total        used        free      shared     buffers       cache   available
Mem:       16106252     4395892    10539864      441708         696     1169800    10882984
Swap:      31250428     1348428    29902000
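For completeness: the kernel supports three drop_caches variants (see Documentation/sysctl/vm.txt); running sync first writes dirty pages back so that more of the cache can actually be dropped:

# sync; echo 1 > /proc/sys/vm/drop_caches   # drop the Page Cache only
# sync; echo 2 > /proc/sys/vm/drop_caches   # drop dentries and inodes (Slabs)
# sync; echo 3 > /proc/sys/vm/drop_caches   # drop both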
Checking with pcstat now shows that all InnoDB pages have been wiped out of the Page Cache:
# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 16020          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 67108864       | 16384      | 0         | 000.000 |
+------------------------------------------------------+----------------+------------+-----------+---------+
And after running traffic on the test table for a while we can see that the InnoDB Transaction Log files are cached again in the Page Cache, but NOT the InnoDB Tablespace files:
# pcstat /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
+------------------------------------------------------+----------------+------------+-----------+---------+
| Name                                                 | Size (bytes)   | Pages      | Cached    | Percent |
|------------------------------------------------------+----------------+------------+-----------+---------|
| /home/mysql/database/mariadb-104/data/ib_buffer_pool | 16020          | 4          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibdata1        | 79691776       | 19456      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ib_logfile0    | 268435456      | 65536      | 3012      | 004.596 |
| /home/mysql/database/mariadb-104/data/ib_logfile1    | 268435456      | 65536      | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/ibtmp1         | 12582912       | 3072       | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.frm  | 1097           | 1          | 0         | 000.000 |
| /home/mysql/database/mariadb-104/data/test/test.ibd  | 71303168       | 17408      | 0         | 000.000 |
+------------------------------------------------------+----------------+------------+-----------+---------+
Also with vmtouch we can see the difference:
# ./vmtouch -f -v /home/mysql/database/mariadb-104/data/ib* /home/mysql/database/mariadb-104/data/test/test*
/home/mysql/database/mariadb-104/data/ib_buffer_pool
[                                                            ] 0/4
/home/mysql/database/mariadb-104/data/ibdata1
[                                                            ] 0/19456
/home/mysql/database/mariadb-104/data/ib_logfile0
[o                                                    oOOOo  ] 4252/65536
/home/mysql/database/mariadb-104/data/ib_logfile1
[                                                            ] 0/65536
/home/mysql/database/mariadb-104/data/ibtmp1
[                                                            ] 0/3072
/home/mysql/database/mariadb-104/data/test/test.frm
[                                                            ] 0/1
/home/mysql/database/mariadb-104/data/test/test.ibd
[                                                            ] 0/17408

           Files: 7
     Directories: 0
  Resident Pages: 4252/171013  16M/668M  2.49%
         Elapsed: 0.003264 seconds
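By the way, vmtouch cannot only measure the Page Cache: it can also deliberately read ("touch") files into it or evict them from it, which is handy for experiments like the ones above:

# vmtouch -t /home/mysql/database/mariadb-104/data/test/test.ibd   # touch: read all pages into the Page Cache
# vmtouch -e /home/mysql/database/mariadb-104/data/test/test.ibd   # evict: remove all pages from the Page Cache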
And cachestat also shows the effect of the flushed Buffer Cache and Page Cache:
# ./cachestat
Counting cache functions... Output every 1 seconds.
    HITS   MISSES  DIRTIES    RATIO   BUFFERS_MB   CACHE_MB
  677882       19      740   100.0%           67       1087
  679213       10      700   100.0%           67       1087
  677236        0      732   100.0%           67       1087
  685673       11      932   100.0%           67       1088
  677933        5      703   100.0%           67       1088
Caution: Depending on your underlying I/O system it can nevertheless make sense to run your MariaDB database with innodb_flush_method = fsync in certain cases! See also the PostgreSQL behaviour.
Note: This information could also be interesting for PostgreSQL DBAs, because they buffer redundantly with their shared_buffers (why plural? It is just one!?!) and the O/S Page Cache as well!
What is Slab?
Besides the Buffer Cache and the Page Cache itself, /proc/meminfo lists a third item: Slabs. So what are Slabs? Slab is a specific memory management (allocation) mechanism used for frequently allocated objects in the Linux kernel (buffer heads, inodes, dentries, etc.) [7-15]. So it contains something like all the other Linux kernel buffers and kernel caches.
Which other Linux kernel buffers and caches exist can be found with the following command:
# sudo cat /proc/slabinfo
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
nf_conntrack_1         14183  15275    320   25    2 : tunables    0    0    0 : slabdata    611    611      0
ext4_groupinfo_4k       8575   8596    144   28    1 : tunables    0    0    0 : slabdata    307    307      0
i915_gem_vma             523    950    320   25    2 : tunables    0    0    0 : slabdata     38     38      0
UDPv6                    120    120   1088   30    8 : tunables    0    0    0 : slabdata      4      4      0
tw_sock_TCPv6           2668   2668    280   29    2 : tunables    0    0    0 : slabdata     92     92      0
request_sock_TCPv6        24     72    328   24    2 : tunables    0    0    0 : slabdata      3      3      0
TCPv6                     68    105   2112   15    8 : tunables    0    0    0 : slabdata      7      7      0
cfq_queue                391    442    232   17    1 : tunables    0    0    0 : slabdata     26     26      0
mqueue_inode_cache        72     72    896   18    4 : tunables    0    0    0 : slabdata      4      4      0
fuse_request              20     40    400   20    2 : tunables    0    0    0 : slabdata      2      2      0
fuse_inode                 1     21    768   21    4 : tunables    0    0    0 : slabdata      1      1      0
fat_cache                102    408     40  102    1 : tunables    0    0    0 : slabdata      4      4      0
hugetlbfs_inode_cache     28     84    584   28    4 : tunables    0    0    0 : slabdata      3      3      0
squashfs_inode_cache      25     50    640   25    4 : tunables    0    0    0 : slabdata      2      2      0
jbd2_journal_handle      340    340     48   85    1 : tunables    0    0    0 : slabdata      4      4      0
jbd2_journal_head       2040   2040    120   34    1 : tunables    0    0    0 : slabdata     60     60      0
jbd2_revoke_table_s      260    512     16  256    1 : tunables    0    0    0 : slabdata      2      2      0
jbd2_revoke_record_s    1152   1408     32  128    1 : tunables    0    0    0 : slabdata     11     11      0
ext4_inode_cache      208751 210840   1072   30    8 : tunables    0    0    0 : slabdata   7028   7028      0
ext4_free_data           320    448     64   64    1 : tunables    0    0    0 : slabdata      7      7      0
ext4_allocation_cont     128    128    128   32    1 : tunables    0    0    0 : slabdata      4      4      0
ext4_io_end              392    560     72   56    1 : tunables    0    0    0 : slabdata     10     10      0
ext4_extent_status     64412  77928     40  102    1 : tunables    0    0    0 : slabdata    764    764      0
dquot                    144    160    256   16    1 : tunables    0    0    0 : slabdata     10     10      0
mbcache                  226    292     56   73    1 : tunables    0    0    0 : slabdata      4      4      0
dio                      273    350    640   25    4 : tunables    0    0    0 : slabdata     14     14      0
pid_namespace             42     42   2224   14    8 : tunables    0    0    0 : slabdata      3      3      0
ip4-frags                 32     64    248   16    1 : tunables    0    0    0 : slabdata      4      4      0
RAW                      396    396    896   18    4 : tunables    0    0    0 : slabdata     22     22      0
UDP                       68     68    960   17    4 : tunables    0    0    0 : slabdata      4      4      0
tw_sock_TCP            10750  11136    280   29    2 : tunables    0    0    0 : slabdata    384    384      0
request_sock_TCP          96     96    328   24    2 : tunables    0    0    0 : slabdata      4      4      0
TCP                      119    136   1920   17    8 : tunables    0    0    0 : slabdata      8      8      0
blkdev_queue              27     48   1336   24    8 : tunables    0    0    0 : slabdata      2      2      0
blkdev_requests          394    506    368   22    2 : tunables    0    0    0 : slabdata     23     23      0
blkdev_ioc               516    546    104   39    1 : tunables    0    0    0 : slabdata     14     14      0
user_namespace           104    104    304   26    2 : tunables    0    0    0 : slabdata      4      4      0
dmaengine-unmap-256       15     15   2112   15    8 : tunables    0    0    0 : slabdata      1      1      0
sock_inode_cache        1707   1950    640   25    4 : tunables    0    0    0 : slabdata     78     78      0
file_lock_cache          665    665    208   19    1 : tunables    0    0    0 : slabdata     35     35      0
net_namespace             40     40   7296    4    8 : tunables    0    0    0 : slabdata     10     10      0
shmem_inode_cache       3315   3432    656   24    4 : tunables    0    0    0 : slabdata    143    143      0
taskstats                 96     96    328   24    2 : tunables    0    0    0 : slabdata      4      4      0
proc_inode_cache        6895   7072    624   26    4 : tunables    0    0    0 : slabdata    272    272      0
sigqueue                 100    100    160   25    1 : tunables    0    0    0 : slabdata      4      4      0
bdev_cache                29     76    832   19    4 : tunables    0    0    0 : slabdata      4      4      0
kernfs_node_cache      43625  44982    120   34    1 : tunables    0    0    0 : slabdata   1323   1323      0
mnt_cache                518    546    384   21    2 : tunables    0    0    0 : slabdata     26     26      0
inode_cache            17519  17668    568   28    4 : tunables    0    0    0 : slabdata    631    631      0
dentry                424185 439992    192   21    1 : tunables    0    0    0 : slabdata  20952  20952      0
buffer_head          1112865 1112865   104   39    1 : tunables    0    0    0 : slabdata  28535  28535      0
vm_area_struct         53945  55300    200   20    1 : tunables    0    0    0 : slabdata   2765   2765      0
files_cache              260    299    704   23    4 : tunables    0    0    0 : slabdata     13     13      0
signal_cache             509    630   1088   30    8 : tunables    0    0    0 : slabdata     21     21      0
sighand_cache            346    405   2112   15    8 : tunables    0    0    0 : slabdata     27     27      0
task_struct             1189   1269   3584    9    8 : tunables    0    0    0 : slabdata    141    141      0
Acpi-Operand            5703   5824     72   56    1 : tunables    0    0    0 : slabdata    104    104      0
Acpi-Parse              1314   1314     56   73    1 : tunables    0    0    0 : slabdata     18     18      0
Acpi-State               204    204     80   51    1 : tunables    0    0    0 : slabdata      4      4      0
Acpi-Namespace          4077   4182     40  102    1 : tunables    0    0    0 : slabdata     41     41      0
anon_vma               19831  21522     80   51    1 : tunables    0    0    0 : slabdata    422    422      0
numa_policy              170    170     24  170    1 : tunables    0    0    0 : slabdata      1      1      0
radix_tree_node       321937 327740    584   28    4 : tunables    0    0    0 : slabdata  11705  11705      0
trace_event_file        3985   4002     88   46    1 : tunables    0    0    0 : slabdata     87     87      0
ftrace_event_field     86541  88570     48   85    1 : tunables    0    0    0 : slabdata   1042   1042      0
idr_layer_cache          533    555   2096   15    8 : tunables    0    0    0 : slabdata     37     37      0
kmalloc-8192            1246   1246   8192    4    8 : tunables    0    0    0 : slabdata    502    502      0
kmalloc-4096             658    720   4096    8    8 : tunables    0    0    0 : slabdata     90     90      0
kmalloc-2048            1955   2144   2048   16    8 : tunables    0    0    0 : slabdata    134    134      0
kmalloc-1024           44217  44384   1024   16    4 : tunables    0    0    0 : slabdata   2774   2774      0
kmalloc-512             3037   3808    512   16    2 : tunables    0    0    0 : slabdata    238    238      0
kmalloc-256            17465  20384    256   16    1 : tunables    0    0    0 : slabdata   1274   1274      0
kmalloc-192            27708  28665    192   21    1 : tunables    0    0    0 : slabdata   1365   1365      0
kmalloc-128           140581 143744    128   32    1 : tunables    0    0    0 : slabdata   4492   4492      0
kmalloc-96            168044 168378     96   42    1 : tunables    0    0    0 : slabdata   4009   4009      0
kmalloc-64            117533 123264     64   64    1 : tunables    0    0    0 : slabdata   1926   1926      0
kmalloc-32             80425  90368     32  128    1 : tunables    0    0    0 : slabdata    706    706      0
kmalloc-16              9513  11264     16  256    1 : tunables    0    0    0 : slabdata     44     44      0
kmalloc-8               6616   7168      8  512    1 : tunables    0    0    0 : slabdata     14     14      0
kmem_cache_node          320    320     64   64    1 : tunables    0    0    0 : slabdata      5      5      0
kmem_cache               208    208    256   16    1 : tunables    0    0    0 : slabdata     13     13      0
If you want to see the most frequently used (hottest) Slabs you can watch them top-like with slabtop. If you press c the Slabs are sorted by CACHE_SIZE:
# sudo slabtop
 Active / Total Objects (% used)    : 2249113 / 2280136 (98.6%)
 Active / Total Slabs (% used)      : 70256 / 70256 (100.0%)
 Active / Total Caches (% used)     : 86 / 121 (71.1%)
 Active / Total Size (% used)       : 597547.86K / 605445.30K (98.7%)
 Minimum / Average / Maximum Object : 0.01K / 0.26K / 18.56K

   OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
 294308  289889  98%    0.57K  10511       28    168176K radix_tree_node
 105030  104435  99%    1.05K   3501       30    112032K ext4_inode_cache
 745446  745446 100%    0.10K  19114       39     76456K buffer_head
  59984   59909  99%    1.00K   3749       16     59984K ecryptfs_inode_cache
  47520   47157  99%    1.00K   2970       16     47520K kmalloc-1024
 215166  214987  99%    0.19K  10246       21     40984K dentry
 139744  138452  99%    0.12K   4367       32     17468K kmalloc-128
 179508  179011  99%    0.09K   4274       42     17096K kmalloc-96
  47140   45768  97%    0.20K   2357       20      9428K vm_area_struct
  14700   14700 100%    0.55K    525       28      8400K inode_cache
...
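The same sorting can also be requested non-interactively (long options as in procps-ng slabtop):

# sudo slabtop --sort=c --once | head -15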
Literature
- [1] What is the major difference between the buffer cache and the page cache? Why were they separate entities in older kernels? Why were they merged later on? (2012-10-21)
- [2] What is the difference between Buffers and Cached columns in /proc/meminfo output? (2012-10-19)
- [3] The future of the page cache (2017-01-25)
- [4] Linux Page Cache Basics
- [5] Is it possible to list the files that are cached?
- [6] RFC: Clarifying Direct I/O Semantics
- [7] Slab allocation
- [8] Memory management
- [9] Overview of Linux Memory Management Concepts: Slabs
- [10] Memory management - Buddy allocator, Slab allocator
- [11] The SLUB allocator
- [12] Slab Allocator
- [13] What are the different caches reported by slabtop?
- [14] How /proc/slabinfo is not quite telling you what it looks like
- [15] Short users guide for SLUB