
Storing BLOBs in the database

We sometimes have discussions with our customers about whether or not to store LOBs (Large Objects) in the database. To avoid repeating the arguments again and again, I have summarized them in the following lines.

The following items are more or less valid for all large data types (BLOB, TEXT and theoretically also for JSON and GIS columns) stored in a MySQL or MariaDB (or any other relational) database.

The idea of a relational, table-based data store is to store structured data (numbers, dates and short character strings) and to provide quick write and read access to it.

And yes, you can also store other things like videos, huge texts (PDF, emails) or similar in an RDBMS, but these systems are not primarily designed for such a job and are thus not optimal for the task. Software vendors implement such features not mainly because they make sense but because users want them and the vendors want to attract users (or their managers) with such features (USP, Unique Selling Proposition). This is also one of my mantras: use the right tool for the right task:

right_tool_for_the_right_task.jpg

The main topics to discuss in relation to LOBs are: operations, performance, economic reasons and technical limitations.

Disadvantages of storing LOBs in the database

  • The database will grow fast. Operations will become more costly and complicated.
  • Backup and restore will become more costly and complicated for the admin because of the increased size caused by LOBs.
  • Backup and restore will take longer for the same reason.
  • Database and table management functions (OPTIMIZE, ALTER, etc.) will take longer on big LOB tables.
  • Smaller databases need less RAM/disk space and are thus cheaper.
  • Smaller databases fit better into your RAM and are thus potentially faster (RAM vs disk access).
  • RDBMS are a relatively slow technology (compared to others). Reading LOBs from the database is significantly slower than reading LOBs from a filer for example.
  • LOBs stored in the database will pollute your database cache (InnoDB Buffer Pool) and thus possibly slow down other queries (this does not necessarily happen with more sophisticated RDBMS).
  • In MySQL/MariaDB the LOB size is in reality limited to 1 Gbyte (max_allowed_packet); the theoretical limit is 4 Gbyte.
  • Expensive, fast database store (RAID-10, SSD) is wasted for something which can be stored better on a cheap slow file store (RAID-5, HDD).
  • It is programmatically often more complicated to get LOBs from a database than from a filer (depends on your libraries).
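
The alternative these points argue for is to keep the LOB on a file store and keep only a reference to it in the database. Here is a minimal sketch of that idea. It is an illustration under assumptions, not a recommended implementation: the table `document`, its columns and the helper names `store_lob` and `load_lob` are made up for this example, and Python's built-in `sqlite3` module stands in for a MySQL/MariaDB connection so that the snippet runs without a server.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical schema: the database holds only metadata about the LOB.
SCHEMA = """
CREATE TABLE document (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE,
    lob_path   TEXT NOT NULL,
    lob_size   INTEGER NOT NULL,
    lob_sha256 TEXT NOT NULL
)
"""


def store_lob(conn, store_dir, name, payload):
    """Write the payload to the file store and record its reference in the database."""
    digest = hashlib.sha256(payload).hexdigest()
    path = Path(store_dir) / digest       # file name derived from the content hash
    path.write_bytes(payload)             # the LOB itself never enters the database
    with conn:                            # commit on success, roll back on error
        conn.execute(
            "INSERT INTO document (name, lob_path, lob_size, lob_sha256) "
            "VALUES (?, ?, ?, ?)",
            (name, str(path), len(payload), digest),
        )
    return path


def load_lob(conn, name):
    """Follow the reference stored in the database and verify the file content."""
    row = conn.execute(
        "SELECT lob_path, lob_sha256 FROM document WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    lob_path, expected = row
    payload = Path(lob_path).read_bytes()  # FileNotFoundError means a dangling link
    if hashlib.sha256(payload).hexdigest() != expected:
        raise ValueError("file content does not match the stored checksum")
    return payload


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    with tempfile.TemporaryDirectory() as store_dir:
        store_lob(conn, store_dir, "invoice.pdf", b"%PDF-1.4 dummy content")
        print(len(load_lob(conn, "invoice.pdf")), "bytes read from the file store")
    conn.close()
```

Note that the file write and the database insert are two separate steps here. If one of them fails, or if somebody removes the file later, the row and the file no longer match. This is exactly the trade-off against the advantages listed in the next section.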

Advantages of storing LOBs in the database

  • Atomicity between data and LOB is guaranteed by transactions (is it really in MySQL/MariaDB?).
  • There are no dangling links (reference from data to LOB) between data and LOB.
  • Data and LOB are from the same point in time and can be included in the same backup.
  • Data and LOB can be transferred simultaneously to other machines, by database replication or dump/restore.
  • Applications can use the same mechanism to get the data and the LOB. Remote access needs no separate file transfer for the LOB.
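
The atomicity argument can be illustrated with a small sketch: the data row and its LOB are written in one transaction, so either both are stored or neither is. Again this is only an illustration under assumptions. The tables `document` and `document_lob` and the helper names are invented for the example, and Python's built-in `sqlite3` module stands in for MySQL/MariaDB. On MySQL/MariaDB this behaviour requires a transactional storage engine such as InnoDB.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a data table and a separate LOB table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE document (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE document_lob ("
        " document_id INTEGER PRIMARY KEY REFERENCES document(id),"
        " content BLOB NOT NULL)"
    )
    conn.commit()
    return conn


def insert_document(conn, name, payload):
    """Insert the data row and its LOB in a single transaction."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        cur = conn.execute("INSERT INTO document (name) VALUES (?)", (name,))
        conn.execute(
            "INSERT INTO document_lob (document_id, content) VALUES (?, ?)",
            (cur.lastrowid, payload),
        )


def counts(conn):
    """Return (number of data rows, number of LOB rows)."""
    docs = conn.execute("SELECT COUNT(*) FROM document").fetchone()[0]
    lobs = conn.execute("SELECT COUNT(*) FROM document_lob").fetchone()[0]
    return docs, lobs


if __name__ == "__main__":
    conn = make_db()
    try:
        # A missing payload violates NOT NULL, so the second INSERT fails
        # and the first INSERT is rolled back together with it.
        insert_document(conn, "broken.pdf", None)
    except sqlite3.IntegrityError:
        pass
    print(counts(conn))
    insert_document(conn, "invoice.pdf", b"%PDF-1.4 dummy content")
    print(counts(conn))
    conn.close()
```

Because the LOB lives in the same database as the data row, there is no window in which a row points to a LOB that does not exist.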

Conclusion

So basically you have to weigh the advantages against the disadvantages of storing LOBs in the database and decide which arguments are more important in your case.

If you have some more good arguments pro or contra storing LOBs in the database please let me know.

Literature

Check also various articles on Google.
