You are here
Create a single-node MariaDB ColumnStore test installation
Table of Contents
- Hardware requirements
- O/S (Linux) settings
- Installation
- Configuration
- Cross Engine Join User
- Start ColumnStore
- Create the Cross Engine Join User in the database
- MariaDB ColumnStore schemas
- Server System Variables and Server Status Variables
- MariaDB ColumnStore on the file system
- Creating our first table
- Querying and changing data in MariaDB ColumnStore
- MariaDB ColumnStore using sequences instead of
AUTO_INCREMENT
- MariaDB ColumnStore Query Execution Plan
For a long time I wanted to investigate a bit more into the MariaDB ColumnStore architecture. Now I took the time to start with.
MariaDB ColumnStore is an interesting technology if you want to do reports (OLAP, data warehouse (DWH), BI) over only a few columns on a huge amount of rows and if you have only or mostly batch writes. These data are ideally spread (sharded) over several machines and the results where calculated on those different machines an aggregated. Because the data are stored in columns the locality of your data should be better and thus a more efficient caching (in-memory) should result in faster response times for you reporting queries.
MariaDB ColumnStore can be deployed as a single-node or a multi-node set-up. The first one is considered for testing, the later one for production purposes.
For setting up a testing system I have chosen a single-node set-up on a VirtualBox VM first.
MariaDB ColumnStore hardware requirements
There are different informations available about the minimal recommended hardware resources. The range lasts from 8 to 32 physical cores with 16 to 64 Gibyte of RAM, HDD disks are fine because the system is optimized for block streaming (sequential read and write). At least an 1 Gbit network is recommended. [ 1 ]
For our first test we rely on 1 virtual Machine with 4 Gibyte of RAM (possibly with 2 Gibyte of RAM it still would work?), 10 Gibyte of HDD disk space and 1 vCPU.
O/S (Linux) settings
Check Linux kernel settings:
$ sysctl -a | grep -e '^vm.swap' -e '^vm.vfs' -e '^net.core..mem_max' -e '^net.ipv4.tcp_.mem' -e '^net.ipv4.tcp_no_metrics' -e '^net.core.netdev_max' net.core.netdev_max_backlog = 1000 net.core.rmem_max = 212992 net.core.wmem_max = 212992 net.ipv4.tcp_no_metrics_save = 0 net.ipv4.tcp_rmem = 4096 131072 6291456 net.ipv4.tcp_wmem = 4096 16384 4194304 vm.swappiness = 60 vm.vfs_cache_pressure = 100
Check network speed:
$ ethtool enp0s9 | grep Speed Speed: 1000Mb/s
Persist the Linux kernel settings:
$ cat >/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf <<_EOF # # /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf # # minimize swapping vm.swappiness = 10 # optimize Linux to cache directories and inodes vm.vfs_cache_pressure = 10 # Increase the TCP max buffer size net.core.rmem_max = 16777216 net.core.wmem_max = 16777216 # Increase the TCP buffer limits # min, default, and max number of bytes to use net.ipv4.tcp_rmem = 4096 87380 16777216 net.ipv4.tcp_wmem = 4096 65536 16777216 # don't cache ssthresh from previous connection net.ipv4.tcp_no_metrics_save = 1 # for 1 GigE, increase this to 2500 # for 10 GigE, increase this to 30000 net.core.netdev_max_backlog = 2500 _EOF
and make them active:
$ sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
Disable AppArmor (for installation only?):
$ systemctl status apparmor $ systemctl stop apparmor $ systemctl disable apparmor $ aa-status
Configure Character Enconding (why exactly?):
$ localedef -i en_US -f UTF-8 en_US.UTF-8
Installation of MariaDB ColumnStore
The first problem I had was finding a good an reliable information about where to download the software. Finally I ended up here: MariaDB ColumnStore download. But MariaDB ColumnStore is the wrong tab. You have to choose the MariaDB Community Server tab.
If you prefer, the software can also be downloaded from the command line:
$ wget https://dlm.mariadb.com/1623874/MariaDB/mariadb-10.5.10/repo/ubuntu/mariadb-10.5.10-ubuntu-bionic-amd64-debs.tar
and there are a lot of packages contained in this tar-ball:
$ tar xf mariadb-10.5.10-ubuntu-bionic-amd64-debs.tar -rw-rw-r-- 1 oli oli 9059256 May 7 04:34 galera-4_26.4.8-bionic_amd64.deb -rw-rw-r-- 1 oli oli 5955264 May 7 04:34 galera-arbitrator-4_26.4.8-bionic_amd64.deb -rw-rw-r-- 1 oli oli 1698 May 7 04:35 InRelease -rw-rw-r-- 1 oli oli 149208 May 7 04:34 libmariadb3_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 3276 May 7 04:34 libmariadb3-compat_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 3248 May 7 04:34 libmariadbclient18_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 5130428 May 7 04:34 libmariadbd19_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 6226020 May 7 04:34 libmariadbd-dev_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 159556 May 7 04:34 libmariadb-dev_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 3452 May 7 04:34 libmariadb-dev-compat_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 3260 May 7 04:34 libmysqlclient18_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 6483096 May 7 04:34 mariadb-backup_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 4672 May 7 04:35 MariaDB-C74CD1D8-public.asc -rw-rw-r-- 1 oli oli 1604684 May 7 04:34 mariadb-client-10.5_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 3088 May 7 04:34 mariadb-client_10.5.10+maria~bionic_all.deb -rw-rw-r-- 1 oli oli 784824 May 7 04:34 mariadb-client-core-10.5_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 4428 May 7 04:34 mariadb-common_10.5.10+maria~bionic_all.deb -rw-rw-r-- 1 oli oli 5811572 May 7 04:34 mariadb-plugin-columnstore_10.5.10-5.5.2+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 558288 May 7 04:34 mariadb-plugin-connect_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 5216 May 7 04:34 mariadb-plugin-cracklib-password-check_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 6172 May 7 04:34 mariadb-plugin-gssapi-client_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 7592 May 7 04:34 mariadb-plugin-gssapi-server_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 1187228 May 7 04:34 mariadb-plugin-mroonga_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 63820 May 7 04:34 mariadb-plugin-oqgraph_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 3122524 May 7 04:34 mariadb-plugin-rocksdb_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 845428 May 7 04:34 mariadb-plugin-s3_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 354276 May 7 04:34 mariadb-plugin-spider_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 4562128 May 7 04:34 mariadb-server-10.5_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 3216 May 7 04:34 mariadb-server_10.5.10+maria~bionic_all.deb -rw-rw-r-- 1 oli oli 6932836 May 7 04:34 mariadb-server-core-10.5_10.5.10+maria~bionic_amd64.deb -rw-rw-r-- 1 oli oli 3260 May 7 04:34 mysql-common_10.5.10+maria~bionic_all.deb -rw-rw-r-- 1 oli oli 42897 May 7 04:34 Packages -rw-rw-r-- 1 oli oli 2278 May 7 04:35 README -rw-rw-r-- 1 oli oli 816 May 7 04:34 Release -rw-rw-r-- 1 oli oli 833 May 7 04:35 Release.gpg -rwxrwxr-x 1 oli oli 1010 May 7 04:35 setup_repository*
Installation of MariaDB ColumnStore was straight forward:
$ apt-get install gnupg2 $ ./setup_repository $ apt-get update $ apt-get install mariadb-server mariadb-plugin-columnstore libjemalloc1 mariadb-backup mariadb-client
Configuration of MariaDB ColumnStore
The minimum recommended MariaDB configuration for ColumnStore looks like this:
$ cat >/etc/mysql/mariadb.conf.d/zz-fromdual.cnf <<_EOF # # /etc/mysql/mariadb.conf.d/zz-fromdual.cnf # [mariadb] log_error = mariadbd-error.log character_set_server = utf8 collation_server = utf8_general_ci columnstore_use_import_for_batchinsert = ALWAYS _EOF
In the MariaDB ColumnStore documentation you can find all other Server System Variables and Options (columnstore*
).
Cross Engine Join User
For Cross Engine Joins a special user is required. These credentials are stored in the file /etc/columnstore/Columnstore.xml
and can be queried as follows:
$ mcsGetConfig -v -a | grep Cross CrossEngineSupport.Host = 127.0.0.1 CrossEngineSupport.Port = 3306 CrossEngineSupport.User = root CrossEngineSupport.Password = CrossEngineSupport.TLSCA = CrossEngineSupport.TLSClientCert = CrossEngineSupport.TLSClientKey =
and changed as follows:
$ mcsSetConfig CrossEngineSupport Host 127.0.0.1 $ mcsSetConfig CrossEngineSupport Port 3306 $ mcsSetConfig CrossEngineSupport User cross_engine $ mcsSetConfig CrossEngineSupport Password secret
Caution: Querying the Cross Engine Join User password can be done by an non-privileged O/S user. Thus I consider this as a security relevant bug. This is known already since 2018 and should be fixed in version 5.6.1 :-( (MCOL-1175, MCOL-4714).
Start the ColumnStore Processes
Starting MariaDB ColumnStore after the configuration changes are quite intuitive:
$ systemctl restart mariadb $ systemctl restart mariadb-columnstore
After these commands the mariadbd
and some ColumnStore processes are started:
$ ps aux | grep -e mysql -e VSZ USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND mysql 1380 0.0 2.4 1274896 99020 ? Ssl 11:20 0:00 /usr/sbin/mariadbd mysql 1499 0.0 0.4 267392 16904 ? Sl 11:21 0:00 /usr/bin/workernode DBRM_Worker1 mysql 1507 0.0 0.2 530448 11648 ? Sl 11:21 0:00 /usr/bin/controllernode mysql 1522 0.0 0.2 2391288 10560 ? Sl 11:21 0:00 /usr/bin/PrimProc mysql 1545 0.0 0.3 300192 14116 ? Sl 11:21 0:00 /usr/bin/WriteEngineServer mysql 1560 0.0 0.2 212452 10188 ? Sl 11:21 0:00 /usr/bin/ExeMgr mysql 1571 0.0 0.4 342364 18704 ? Sl 11:21 0:00 /usr/bin/DMLProc mysql 1592 0.0 0.3 193640 12444 ? Sl 11:21 0:00 /usr/bin/DDLProc
It looks like each of those processes works already multi-threaded:
$ ps -eLf | grep -e mysql -e PID UID PID PPID LWP C NLWP STIME TTY TIME CMD mysql 1380 1 1380 0 10 11:20 ? 00:00:00 /usr/sbin/mariadbd mysql 1380 1 1381 0 10 11:20 ? 00:00:00 /usr/sbin/mariadbd mysql 1380 1 1382 0 10 11:20 ? 00:00:00 /usr/sbin/mariadbd mysql 1380 1 1383 0 10 11:20 ? 00:00:00 /usr/sbin/mariadbd mysql 1380 1 1384 0 10 11:20 ? 00:00:00 /usr/sbin/mariadbd mysql 1380 1 1385 0 10 11:20 ? 00:00:00 /usr/sbin/mariadbd mysql 1380 1 1389 0 10 11:20 ? 00:00:00 /usr/sbin/mariadbd mysql 1380 1 1391 0 10 11:20 ? 00:00:00 /usr/sbin/mariadbd mysql 1380 1 1392 0 10 11:20 ? 00:00:00 /usr/sbin/mariadbd mysql 1380 1 1407 0 10 11:20 ? 00:00:00 /usr/sbin/mariadbd mysql 1499 1 1499 0 5 11:21 ? 00:00:00 /usr/bin/workernode DBRM_Worker1 mysql 1499 1 1501 0 5 11:21 ? 00:00:00 /usr/bin/workernode DBRM_Worker1 mysql 1499 1 1502 0 5 11:21 ? 00:00:00 /usr/bin/workernode DBRM_Worker1 mysql 1499 1 1503 0 5 11:21 ? 00:00:00 /usr/bin/workernode DBRM_Worker1 mysql 1499 1 1504 0 5 11:21 ? 00:00:00 /usr/bin/workernode DBRM_Worker1 mysql 1507 1 1507 0 6 11:21 ? 00:00:00 /usr/bin/controllernode mysql 1507 1 1567 0 6 11:21 ? 00:00:00 /usr/bin/controllernode mysql 1507 1 1573 0 6 11:21 ? 00:00:00 /usr/bin/controllernode mysql 1507 1 1580 0 6 11:21 ? 00:00:00 /usr/bin/controllernode mysql 1507 1 1581 0 6 11:21 ? 00:00:00 /usr/bin/controllernode mysql 1507 1 1596 0 6 11:21 ? 00:00:00 /usr/bin/controllernode mysql 1522 1 1522 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1523 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1524 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1525 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1526 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1527 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1528 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1529 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1530 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1531 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1532 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1533 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1534 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1535 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1561 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1563 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1586 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1522 1 1593 0 18 11:21 ? 00:00:00 /usr/bin/PrimProc mysql 1545 1 1545 0 2 11:21 ? 00:00:00 /usr/bin/WriteEngineServer mysql 1545 1 1546 0 2 11:21 ? 00:00:00 /usr/bin/WriteEngineServer mysql 1560 1 1560 0 6 11:21 ? 00:00:00 /usr/bin/ExeMgr mysql 1560 1 1562 0 6 11:21 ? 00:00:00 /usr/bin/ExeMgr mysql 1560 1 1564 0 6 11:21 ? 00:00:00 /usr/bin/ExeMgr mysql 1560 1 1565 0 6 11:21 ? 00:00:00 /usr/bin/ExeMgr mysql 1560 1 1572 0 6 11:21 ? 00:00:00 /usr/bin/ExeMgr mysql 1560 1 1577 0 6 11:21 ? 00:00:00 /usr/bin/ExeMgr mysql 1571 1 1571 0 3 11:21 ? 00:00:00 /usr/bin/DMLProc mysql 1571 1 1587 0 3 11:21 ? 00:00:00 /usr/bin/DMLProc mysql 1571 1 1588 0 3 11:21 ? 00:00:00 /usr/bin/DMLProc mysql 1592 1 1592 0 3 11:21 ? 00:00:00 /usr/bin/DDLProc mysql 1592 1 1594 0 3 11:21 ? 00:00:00 /usr/bin/DDLProc mysql 1592 1 1595 0 3 11:21 ? 00:00:00 /usr/bin/DDLProc
In the database you can run this SELECT
query to check which MariaDB ColumnStore version is running:
SQL> SELECT plugin_name, plugin_status, plugin_type_version, plugin_library_version, plugin_license, plugin_maturity, plugin_auth_version FROM information_schema.plugins WHERE plugin_library LIKE 'ha_column%'; +---------------------+---------------+---------------------+------------------------+----------------+-----------------+---------------------+ | plugin_name | plugin_status | plugin_type_version | plugin_library_version | plugin_license | plugin_maturity | plugin_auth_version | +---------------------+---------------+---------------------+------------------------+----------------+-----------------+---------------------+ | Columnstore | ACTIVE | 100510.0 | 1.14 | GPL | Stable | 5.5.2 | | COLUMNSTORE_COLUMNS | ACTIVE | 100510.0 | 1.14 | GPL | Stable | 5.5.2 | | COLUMNSTORE_TABLES | ACTIVE | 100510.0 | 1.14 | GPL | Stable | 5.5.2 | | COLUMNSTORE_FILES | ACTIVE | 100510.0 | 1.14 | GPL | Stable | 5.5.2 | | COLUMNSTORE_EXTENTS | ACTIVE | 100510.0 | 1.14 | GPL | Stable | 5.5.2 | +---------------------+---------------+---------------------+------------------------+----------------+-----------------+---------------------+ 5 rows in set (0.002 sec)
or if the MariaDB ColumnStore is enabled at all with:
SQL> SHOW ENGINES; +--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+ | Engine | Support | Comment | Transactions | XA | Savepoints | +--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+ | Columnstore | YES | ColumnStore storage engine | YES | NO | NO | | MRG_MyISAM | YES | Collection of identical MyISAM tables | NO | NO | NO | | MEMORY | YES | Hash based, stored in memory, useful for temporary tables | NO | NO | NO | | Aria | YES | Crash-safe tables with MyISAM heritage. Used for internal temporary tables and privilege tables | NO | NO | NO | | MyISAM | YES | Non-transactional engine with good performance and small data footprint | NO | NO | NO | | SEQUENCE | YES | Generated tables filled with sequential values | YES | NO | YES | | InnoDB | DEFAULT | Supports transactions, row-level locking, foreign keys and encryption for tables | YES | YES | YES | | PERFORMANCE_SCHEMA | YES | Performance Schema | NO | NO | NO | | CSV | YES | Stores tables as CSV files | NO | NO | NO | +--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
Create the Cross Engine Join User in the database
The Cross Engine Join User specified above must also be created inside the database:
SQL> CREATE USER 'cross_engine'@'127.0.0.1' IDENTIFIED BY 'secret'; SQL> CREATE USER 'cross_engine'@'localhost' IDENTIFIED BY 'secret'; SQL> GRANT SELECT ON *.* TO 'cross_engine'@'127.0.0.1'; SQL> GRANT SELECT ON *.* TO 'cross_engine'@'localhost';
This is basically all you have to do for a running single-node MariaDB ColumnStore installation.
MariaDB ColumnStore schemas
MariaDB ColunStore creates 3 new schemas in the database:
SQL> SHOW SCHEMAS; +---------------------+ | Database | +---------------------+ | calpontsys | | columnstore_info | | infinidb_querystats | | information_schema | | mysql | | performance_schema | +---------------------+
We will investigate later what they are used for.
Server System Variables and Server Status Variables
A detailed description of the Server System Variables (columnstore_*
) can be found in the MariaDB documentation (old source: infinidb_*
).
SQL> SHOW GLOBAL STATUS LIKE 'columnstore%'; +-------------------------+--------+ | Variable_name | Value | +-------------------------+--------+ | Columnstore_commit_hash | source | | Columnstore_version | 5.5.2 | +-------------------------+--------+ 2 rows in set (0.001 sec)
SQL> SHOW GLOBAL VARIABLES LIKE 'columnstore%'; +-------------------------------------------------+--------+ | Variable_name | Value | +-------------------------------------------------+--------+ | columnstore_cache_flush_threshold | 500000 | | columnstore_cache_inserts | OFF | | columnstore_compression_type | SNAPPY | | columnstore_decimal_scale | 8 | | columnstore_derived_handler | ON | | columnstore_diskjoin_bucketsize | 100 | | columnstore_diskjoin_largesidelimit | 0 | | columnstore_diskjoin_smallsidelimit | 0 | | columnstore_double_for_decimal_math | OFF | | columnstore_group_by_handler | ON | | columnstore_import_for_batchinsert_delimiter | 7 | | columnstore_import_for_batchinsert_enclosed_by | 17 | | columnstore_local_query | 0 | | columnstore_orderby_threads | 16 | | columnstore_ordered_only | OFF | | columnstore_replication_slave | OFF | | columnstore_select_handler | ON | | columnstore_select_handler_in_stored_procedures | ON | | columnstore_string_scan_threshold | 10 | | columnstore_stringtable_threshold | 20 | | columnstore_um_mem_limit | 0 | | columnstore_use_decimal_scale | OFF | | columnstore_use_import_for_batchinsert | ALWAYS | | columnstore_varbin_always_hex | OFF | +-------------------------------------------------+--------+ 24 rows in set (0.003 sec)
MariaDB ColumnStore also writes to the MariaDB error log:
$ cat mariadbd-error.log 210511 11:20:47 Columnstore: Started; Version: 5.5.2-2
The MariaDB ColumnStore configuration is stored in /etc/columnstore
. Those files can be read by everybody on the system by default. So it possibly makes sense to prohibit access:
$ chown mysql: /etc/columnstore/* $ chmod o-rw /etc/columnstore/*
We have not seen yet an negative impact after this "hardening":
$ ll total 100 drwxr-xr-t 2 mysql mysql 4096 May 11 11:17 ./ drwxr-xr-x 78 root root 4096 May 11 10:12 ../ -rw-r----- 1 mysql mysql 19169 May 11 11:17 Columnstore.xml -rw-r----- 1 mysql mysql 19145 May 11 10:12 Columnstore.xml-20210511 -rw-rw---- 1 mysql mysql 19153 May 11 11:17 Columnstore.xml.columnstoreSave -rw-r----- 1 mysql mysql 15372 May 5 23:00 ErrorMessage.txt -rw-r----- 1 mysql mysql 5619 May 5 23:00 MessageFile.txt -rw-r----- 1 mysql mysql 7456 May 5 23:03 storagemanager.cnf
MariaDB ColumnStore on the file system
It is also interesting to know where on the filesystem the MariaDB ColumnStore files are located. By default they are located under /var/lib/columnstore
(Note: Can they be located somewhere else and how?).
$ tree * data1 ├── 000.dir │ └── 000.dir │ ├── 003.dir │ │ ├── 233.dir │ │ │ └── 000.dir │ │ │ └── FILE000.cdf ... │ │ └── 255.dir │ │ └── 000.dir │ │ └── FILE000.cdf │ ├── 004.dir │ │ ├── 000.dir │ │ │ └── 000.dir │ │ │ └── FILE000.cdf ... │ │ └── 018.dir │ │ └── 000.dir │ │ └── FILE000.cdf │ ├── 007.dir │ │ ├── 209.dir │ │ │ └── 000.dir │ │ │ └── FILE000.cdf │ │ └── 212.dir │ │ └── 000.dir │ │ └── FILE000.cdf │ ├── 008.dir │ │ ├── 013.dir │ │ │ └── 000.dir │ │ │ └── FILE000.cdf ... │ │ └── 028.dir │ │ └── 000.dir │ │ └── FILE000.cdf │ ├── 011.dir │ ├── 012.dir │ ├── 013.dir │ ├── 014.dir │ ├── 015.dir │ ├── 016.dir │ ├── 017.dir │ ├── 018.dir │ │ ├── 164.dir │ │ │ └── 000.dir │ │ │ └── FILE000.cdf ... │ │ └── 255.dir │ │ └── 000.dir │ │ └── FILE000.cdf │ └── 019.dir │ ├── 000.dir │ │ └── 000.dir │ │ └── FILE000.cdf ... ... │ └── 221.dir │ └── 000.dir │ └── FILE000.cdf ├── bulkRollback ├── dbroot1-lock ├── systemFiles │ └── dbrm │ ├── BRM_savesA_em │ ├── BRM_savesA_vbbm │ ├── BRM_savesA_vss │ ├── BRM_savesB_em │ ├── BRM_savesB_vbbm │ ├── BRM_savesB_vss │ ├── BRM_saves_current │ ├── BRM_saves_em │ ├── BRM_saves_journal │ ├── BRM_saves_vbbm │ ├── BRM_saves_vss │ ├── oidbitmap │ ├── SMTxnID │ └── tablelocks └── versionbuffer.cdf local └── module storagemanager └── storagemanager-lock 688 directories, 353 files
Creating our first MariaDB ColumnStore table
For the very first tests we use our well known test
table. But for MariaDB ColumnStore tables you should NOT use the test
schema: ColumnStore tables should not be created in the
(Why not test?).mysql
, information_schema
, calpontsys
or test
databases.
In other documents about MariaDB ColumnStore we can clearly see, that the one who was writing the article was using the test
schema [ 2 ]. An other restriction is: ColumnStore stores all object names in lower case.
So probably CamelCase
table names will not work as expected or they will be converted to lower case.
SQL> SET SESSION default_storage_engine = ColumnStore; SQL> CREATE TABLE test ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY , data VARCHAR(128) DEFAULT NULL , ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP ); ERROR 1069 (42000): Too many keys specified; max 0 keys allowed
Oppps! What is this? After some searching I found that MariaDB ColumnStore does not know indexes: There is no need for indexing.
See also MCOL-1080. As such indexes typically used to optimize query access for row based systems do not make sense since selectivity is low for such queries.
After removing the Primary Key the next try:
SQL> CREATE TABLE test ( id INT UNSIGNED NOT NULL AUTO_INCREMENT , data VARCHAR(128) DEFAULT NULL , ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP ); ERROR 1075 (42000): Incorrect table definition; there can be only one auto column and it must be defined as a key
Also AUTO_INCREMENT
seems not to be supported:
SQL> CREATE TABLE test ( id INT UNSIGNED NOT NULL , data VARCHAR(128) DEFAULT NULL , ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP ); ERROR 1178 (42000): The storage engine for the table doesn't support The syntax or the data type(s) is not supported by Columnstore. Please check the Columnstore syntax guide for supported syntax or data types.
Hmmm. What is wrong now? Checking the MariaDB ColumnStore datatypes does not show any obvious problem. Also the the page ColumnStore Create Table was not helpful. Relying on intuition I tried this:
SQL> CREATE TABLE test ( id INT UNSIGNED NOT NULL , data VARCHAR(128) DEFAULT NULL , ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ); ERROR 1815 (HY000): Internal error: The default value is out of range for the specified data type.
Becoming stressed a bit now this:
SQL> CREATE TABLE test ( id INT UNSIGNED NOT NULL , data VARCHAR(128) DEFAULT NULL , ts TIMESTAMP NOT NULL ); ERROR 1815 (HY000): Internal error: Config::Config: error accessing config file /etc/columnstore/Columnstore.xml
OK. This makes sense. We were a bit to restrictive in the first run above when we were hardening the MariaDB ColumnStore configuration file.
After fixing this we tried again and it finally works...
Query OK, 0 rows affected (1.755 sec)
For comparison only: This CREATE TABLE
DDL statement with InnoDB takes about 20 ms.
Migrating MariaDB (InnoDB, MyISAM, Aria) tables is not so straight forward as it could be. So the MariaDB ColumnStore seems to be still a bit picky! Interesting is, that if we look at the CREATE TABLE
statement it is shown in a non usable form:
SQL> SHOW CREATE TABLE test\G *************************** 1. row *************************** Table: test Create Table: CREATE TABLE `test` ( `id` int(10) unsigned NOT NULL, `data` varchar(128) DEFAULT NULL, `ts` timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp() ) ENGINE=Columnstore DEFAULT CHARSET=utf8 1 row in set (0.001 sec)
See also: MCOL-4716, MCOL-1039 and MCOL-4039. This problem seems to be known since a while yet...
This also means that if you want to use mysqldump
for migrating the table structure it will not work for restore without editing:
$ mysqldump --user=root --no-data --skip-lock-tables test > /tmp/test_structure_dump.sql
Querying and changing data in MariaDB ColumnStore
Note: This is NOT the preferred way how you should add a massive amount of data into MariaDB ColumnStore!
SQL> INSERT INTO test VALUES (1, 'some data', NULL); Query OK, 1 row affected (0.224 sec) SQL> INSERT INTO test VALUES (2, 'some data', NULL); Query OK, 1 row affected (0.137 sec) SQL> INSERT INTO test VALUES (3, 'some data', NULL); Query OK, 1 row affected (0.145 sec) SQL> SELECT * FROM test; +----+-----------+---------------------+ | id | data | ts | +----+-----------+---------------------+ | 1 | some data | 2021-05-11 14:01:31 | | 2 | some data | 2021-05-11 14:01:35 | | 3 | some data | 2021-05-11 14:01:38 | +----+-----------+---------------------+ 3 rows in set (0.104 sec) SQL> DELETE FROM test WHERE id = 2; Query OK, 1 row affected (0.194 sec) SQL> SELECT * FROM test; +----+-----------+---------------------+ | id | data | ts | +----+-----------+---------------------+ | 1 | some data | 2021-05-11 14:01:31 | | 3 | some data | 2021-05-11 14:01:38 | +----+-----------+---------------------+ 2 rows in set (0.022 sec) SQL> UPDATE test SET data = 'new data' WHERE id = 3; Query OK, 1 row affected (0.222 sec) Rows matched: 1 Changed: 1 Warnings: 0 SQL> SELECT * FROM test; +----+-----------+---------------------+ | id | data | ts | +----+-----------+---------------------+ | 1 | some data | 2021-05-11 14:01:31 | | 3 | new data | 2021-05-11 14:03:11 | +----+-----------+---------------------+ 2 rows in set (0.018 sec)
So basic functionality (INSERT
, UPDATE
, DELETE
and SELECT
) works. But as we can see, response times for these tiny data sets and single row DML commands are fare above as we are used with other MariaDB Storage Engines (InnoDB, Aria, MyISAM), which is expected... But at least it works. So this is not the best use case for MariaDB ColumnStore. And SELECT * FROM <table>
is probably one of the worst things you can do to a ColumnStore in general.
MariaDB ColumnStore using sequences instead of AUTO_INCREMENT
Because AUTO_INCREMENT
is not supported in MariaDB ColumnStore we try to use a SEQUENCE
instead:
SQL> CREATE SEQUENCE test_s START WITH 5 INCREMENT BY 1; SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP(); Query OK, 1 row affected (1.403 sec) Records: 1 Duplicates: 0 Warnings: 0 SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test; Query OK, 4 rows affected (1.210 sec) Records: 4 Duplicates: 0 Warnings: 0
Querying the data shows that sequences work but not as expected. They are evaluated once per query and not once per row:
SQL> SELECT * FROM test; +----+---------------+---------------------+ | id | data | ts | +----+---------------+---------------------+ | 1 | Some data | 2021-05-19 16:58:09 | | 2 | Some data | 2021-05-19 16:58:14 | | 3 | Some data | 2021-05-19 16:58:18 | | 5 | Some new data | 2021-05-19 16:59:00 | | 6 | Some new data | 2021-05-19 16:59:20 | | 6 | Some new data | 2021-05-19 16:59:20 | | 6 | Some new data | 2021-05-19 16:59:20 | | 6 | Some new data | 2021-05-19 16:59:20 | +----+---------------+---------------------+ 8 rows in set (0.028 sec)
And if we do this for many rows, we can suddenly see that this becomes much faster than with other MariaDB Storage Engines:
SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test; Query OK, 131072 rows affected (1.221 sec) Records: 131072 Duplicates: 0 Warnings: 0 SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test; Query OK, 262144 rows affected (2.236 sec) Records: 262144 Duplicates: 0 Warnings: 0 SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test; Query OK, 524288 rows affected (2.266 sec) Records: 524288 Duplicates: 0 Warnings: 0 SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test; Query OK, 1048576 rows affected (4.316 sec) Records: 1048576 Duplicates: 0 Warnings: 0 SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test; Query OK, 2097152 rows affected (8.367 sec) Records: 2097152 Duplicates: 0 Warnings: 0 SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test; Query OK, 4194304 rows affected (15.472 sec) Records: 4194304 Duplicates: 0 Warnings: 0
To compare with InnoDB: About above 1 Mio rows seems to be the break even for MariaDB ColumnStore:
SQL> INSERT INTO test_innodb SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test_innodb; Query OK, 1048576 rows affected (3.499 sec) Records: 1048576 Duplicates: 0 Warnings: 0 SQL> INSERT INTO test_innodb SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test_innodb; Query OK, 2097152 rows affected (10.100 sec) Records: 2097152 Duplicates: 0 Warnings: 0 SQL> INSERT INTO test_innodb SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test_innodb; Query OK, 4194304 rows affected (21.766 sec) Records: 4194304 Duplicates: 0 Warnings: 0
MariaDB ColumnStore Query Execution Plan
If we want to see how the queries are executed the usual EXPLAIN
command does not help much:
SQL> EXPLAIN SELECT * FROM test; +------+---------------+-------+------+---------------+------+---------+------+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+---------------+-------+------+---------------+------+---------+------+------+-------+ | 1 | PUSHED SELECT | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | +------+---------------+-------+------+---------------+------+---------+------+------+-------+
Instead we should do it in MariaDB ColumnStore like this:
SQL> SELECT calSetTrace(1); +----------------+ | calSetTrace(1) | +----------------+ | 0 | +----------------+ SQL> SELECT * FROM test; ... 2048 rows in set, 1 warning (0.090 sec) SQL> SELECT calGetTrace()\G *************************** 1. row *************************** calGetTrace(): Desc Mode Table TableOID ReferencedColumns PIO LIO PBE Elapsed Rows BPS PM test 5086 (data,id,ts) 69 44 0 0.066 2048 TNS UM - - - - - - 0.000 2048 1 row in set (0.000 sec)
If we run the query a second time we can see that the physical I/O (PIO) has been reduced, so everything comes out of memory:
SQL> SELECT * FROM test; ... 2048 rows in set, 1 warning (0.021 sec) SQL> SELECT calGetTrace()\G *************************** 1. row *************************** calGetTrace(): Desc Mode Table TableOID ReferencedColumns PIO LIO PBE Elapsed Rows BPS PM test 5086 (data,id,ts) 0 44 0 0.000 2048 TNS UM - - - - - - 0.000 2048 1 row in set (0.000 sec)
See also: Query execution and Viewing the ColumnStore query plan.
- Shinguz's blog
- Log in or register to post comments