Sparse files
What is a sparse file?
"A sparse file is a file where space has been allocated but not actually filled with data. This space is not written to the file system. Instead, brief information about these empty regions is stored, which takes up much less disk space. These regions are only written to disk at their actual size when data is written to them. The file system transparently converts reads from empty sections into blocks filled with zero bytes at runtime." [1]
In other words: such files occupy less disk space than their size suggests.
This is often seen with databases: for example, MySQL Cluster creates its REDO log files as sparse files, and so are some Oracle tablespace files.
But first let us create such a sparse file:
# dd if=/dev/zero of=sparsefile count=0 obs=1 seek=100G
# ls -lah sparsefile
-rw-r--r-- 1 oli users 100G 2007-10-24 11:18 sparsefile
# df -h .
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda9   5.0G  3.5G  1.2G   75%   /home
Funny: how can I have a 100 Gbyte file on a 5 Gbyte device? This already hints at the problem...
But first let us see how to find the real size of the file, so that we can tell whether a file will cause trouble or not:
# du -ks sparsefile
0 sparsefile
In reality this file occupies only 0 kbyte on disk.
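The same comparison can be scripted for a single file. The following is only a minimal sketch, assuming GNU coreutils `stat` on Linux; the function name `file_sizes` is made up for this illustration.

```shell
# Print the apparent size and the allocated size of a file, both in bytes.
# Assumption: GNU stat (-c with the %s, %b and %B format sequences).
file_sizes() {
    apparent=$(stat -c %s "$1")   # size as shown by ls -l
    blocks=$(stat -c %b "$1")     # number of blocks allocated on disk
    blksize=$(stat -c %B "$1")    # size in bytes of each block counted by %b
    allocated=$((blocks * blksize))
    echo "$apparent $allocated"
}
```

Calling `file_sizes sparsefile` prints both numbers on one line. If the second number is much smaller than the first, the file most likely contains holes.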
Or an example from MySQL Cluster:
# ll -h D9/DBLQH/S?.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 18:02 D9/DBLQH/S0.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S1.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S2.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S3.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S4.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S5.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S6.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S7.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S8.FragLog
-rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S9.FragLog
# ll -hs D9/DBLQH/S?.FragLog
612K -rw-r--r-- 1 mysql dba 16M 2008-01-16 18:02 D9/DBLQH/S0.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S1.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S2.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S3.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:43 D9/DBLQH/S4.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S5.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S6.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S7.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S8.FragLog
548K -rw-r--r-- 1 mysql dba 16M 2008-01-16 13:44 D9/DBLQH/S9.FragLog
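To look for such files in a whole directory tree instead of listing them one by one, something like the following might help. It is a rough sketch, assuming GNU `find` (whose `-printf` knows the sparseness ratio `%S`) and an `awk`; the function names `sparse_filter` and `list_sparse_files` are invented for this example. A ratio below 1.0 usually, but not always, points to a sparse file: file systems with compression or inline data can produce the same picture.

```shell
# Keep only the lines whose first field (sparseness ratio) is below 1.0.
# Input:  ratio<TAB>apparent bytes<TAB>512-byte blocks<TAB>path
# Output: apparent bytes<TAB>512-byte blocks<TAB>path
# Assumption: paths contain no tabs or newlines.
sparse_filter() {
    LC_ALL=C awk -F'\t' -v OFS='\t' '$1 < 1.0 { print $2, $3, $4 }'
}

# List all regular files below a directory that look sparse.
list_sparse_files() {
    LC_ALL=C find "$1" -type f -printf '%S\t%s\t%b\t%p\n' | sparse_filter
}
```

For the MySQL Cluster example above, `list_sparse_files D9` would be the call to try.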
Why are sparse files dangerous?
In production environments we want our systems to behave predictably, and therefore we monitor them. With sparse files this becomes a bit trickier: there is free disk space, there is used disk space, and there is disk space that will possibly be used in the near or far future. When the holes of a sparse file are filled, a file system can run full although the monitoring showed plenty of free space.
What can we do about it?
Right now: not much, until the software vendor provides a way to avoid this.
- Calculate the expected disk space (quantity structure).
- Monitor your system properly.
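Both points can be combined into a simple check: add up the space that the holes would claim if they were filled, and compare it with the space still available. The following is only a sketch under several assumptions: GNU `find` and GNU `df` (with `--output`), and the function names `unallocated_bytes`, `available_bytes` and `check_overcommit` are hypothetical. The result is an estimate, not an exact figure, because compression or inline data also make files look smaller on disk.

```shell
# Sum of (apparent size - allocated size) over all regular files below a
# directory, counting only files where this difference is positive.
unallocated_bytes() {
    find "$1" -type f -printf '%s\t%b\n' |
        awk -F'\t' '{ gap = $1 - $2 * 512; if (gap > 0) sum += gap }
                    END { printf "%.0f\n", sum }'
}

# Bytes currently available on the file system that holds the directory.
available_bytes() {
    df --output=avail -B1 "$1" | tail -n 1 | tr -d ' '
}

# Warn (and return 1) if filling all holes would need more space than is free.
check_overcommit() {
    gap=$(unallocated_bytes "$1")
    avail=$(available_bytes "$1")
    if [ "$gap" -gt "$avail" ]; then
        echo "WARNING: $1 holes=$gap bytes, available=$avail bytes"
        return 1
    fi
    echo "OK: $1 holes=$gap bytes, available=$avail bytes"
}
```

A call such as `check_overcommit D9` could be run from cron or wrapped into a monitoring plugin. The return code is chosen so that a scheduler can react to the warning case.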
Literature
- Shinguz's blog
Comments
Managing sparse files on NTFS