You are here
MySQL Cluster overview
This is a chaotic collection of my MySQL Cluster experience...
Content
- config.ini template
- my.cnf template
- General Rules and/or experience
- MySQL Cluster restore
- Skript for converting tables to NDB (alter_engine.pl)
- MySQL Cluster memory sizing
config.ini template
A generic MySQL Cluster configuration file (config.ini) to start with. It is pretty much what the MySQL Cluster experts recommend right now:
# # config.ini # # This configuration file is fore MySQL Clusters 6.2 and above... # ---------------------------------------------------------------------- [TCP DEFAULT] # Default is too small! SendBufferMemory = 2M ReceiveBufferMemory = 2M # When this is configured together with section above ndb_mgmd will # return with erro -1 (255). This is a bug and should be fixed earlier # or later! # When you move this section to the bottom it should work. # You need one TCP section for EACH cluster node pair! # (for example: 4 nodes = 6 sections) # [TCP] # # NodeId1: 10 # NodeId2: 11 # Hostname1: 10.40.219.161 # Hostname2: 10.40.219.162 # ---------------------------------------------------------------------- [NDB_MGMD DEFAULT] DataDir = /var/lib/mysql/cluster # ---------------------------------------------------------------------- [NDB_MGMD] Id = 1 HostName = 192.168.1.33 #[NDB_MGMD] # # Id = 2 # HostName = 192.168.1.34 # ---------------------------------------------------------------------- [NDBD DEFAULT] # needs root privileges? LockPagesInMainMemory = 1 MemReportFrequency = 600 # Everything else is not supported NoOfReplicas = 2 # Should be the same on all data node for eas of maintenance DataDir = /var/lib/mysql/cluster # needs root privileges? ODirect = 1 # Memory sizing either from ndb_size.pl, calculation or ALL DUMP 1000 DataMemory = 128M # IndexMemory = DataMemory / (5 - 10) IndexMemory = 16M # REDO log sizing # 256M is the new recommendation FragmentLogFileSize = 256M # 6 x DataMemory / (4 x FragmentLogFileSize) # >= 6.4 # 6 x DataMemory x NO_OF_LQH_THREADS / (4 * FragmentLogFileSize) NoOfFragmentLogFiles = 3 # should be around 16 - 64 MB default is too small (128 MB in extreme cases) RedoBuffer = 32M # No calculation available SharedGlobalMemory = 64M # 1 second data loss in max. TimeBetweenGlobalCheckpoints = 1000 # leave as default to start with # this is too big under low load (better 6-10) TimeBetweenLocalCheckpoints = 20 # On a good I/O system you can set this value higher Diskcheckpointspeed = 10M # Backup and LCP related parameters # This means full speed Diskcheckpointspeedinrestart = 100M BackupMaxWriteSize = 1M BackupLogBufferSize = 4M BackupDataBufferSize = 16M BackupMemory = 20M # # of expected concurrent trx / # data nodes # This memory is pre-allocated, thus keep it low! # leave as default to start with MaxNoOfConcurrentTransactions = 256 # This memory is pre-allocated, thus keep it low! The default (32k) is much too high! # Every INSERT/UPDATE/DELETE (Record = Operation) uses 1 kbyte!!! # MaxNoOfConcurrentOperations > Operations/s (10k - 100k) / number data nodes # MaxNoOfConcurrentOperations > MaxNoOfConcurrentTransactions x Operations/Transaction / 4 # Leave as default to start with. Use LIMIT <n> to limit number of operation records # If you want to be on the safe side set it to 100k - 250k MaxNoOfConcurrentOperations = 2048 # Defaults of 64 can be fine. # Each index consumes approx. 15 kbyte per node! MaxNoOfUniqueHashIndexes = 160 # Each object consumes approx. 10 kbyte per node! # Use ndb_size.pl or calculate the number of OI indexes # Also PK and UK create 1 OI each. Thus this values is at least as big # as MaxNoOfTables. Usually 2 - 4 times bigger. # The default of 128 can be too small soon. MaxNoOfOrderedIndexes = 256 # Not that each OI, UI and BLOB counts towards one table object # Usually safe # Each table consumes approx. 20 kbyte per node! MaxNoOfTables = 128 # Each attribute consumes around 200 bytes of storage per node # a) 6 x MaxNoOfTables # b) NoOfTables x AvgAttributsPerTable MaxNoOfAttributes = 1024 # Default of 768 is often fine. # Minmal value = #NoOfTables + #OI + 3 x #UI MaxNoOfTriggers = 768 # Default too big. MaxNoOfLocalScans = 64 # TimeBetweenEpochs = 100 # CGE >= 6.3.7 # CompressedLCP = 1 # CompressedBackup = 1 # GCE >= 6.3.4 # RealTimeScheduler = 1 # SchedulerSpinTimer = 400 # SchedulerExecutionTimer = 100 # ---------------------------------------------------------------------- [NDBD] Id = 10 HostName = 192.168.1.37 # CGE >= 6.3.7 # LockExecuteThreadToCPU = 1 # LockMaintThreadsToCPU = 0 # on same CPU as eth interupts occur [NDBD] Id = 11 HostName = 192.168.1.36 # CGE >= 6.3.7 # LockExecuteThreadToCPU = 1 # LockMaintThreadsToCPU = 0 # on same CPU as eth interupts occur # ---------------------------------------------------------------------- [MYSQLD DEFAULT] [MYSQLD] Id = 20 HostName = 192.168.1.35 # Keep some spare for later # Prevents cluster restart [MYSQLD] [MYSQLD] [MYSQLD] [MYSQLD] [MYSQLD] [MYSQLD]
my.cnf template
A generic MySQL configuration file (my.cnf) to start with MySQL Cluster.
[mysqld] # NDB Cluster related parameter ndbcluster = 1 # Use IP no DNS name ndb_connectstring = "host1;host2" ndb_cluster_connection_pool = 5 # Recommended default setting changes: ndb_use_exact_count = 0 # this is default # Some of these parameter are performance relevant but have some # drawbacks # multi_range_count = 256 # 0, 1, 10, 10 is usually too verbouse # ndb_extra_logging = 1 # ndb_cache_check_time = 0 # Increase if you are doing many inserts ndb_autoincrement_prefetch_sz = 1024 engine_condition_pushdown = 1 ndb_force_send = 1 ndb_index_stat_enable = 0 # ndb_index_stat_cache_entries = 32 # Statistics cost more than they give # ndb_index_stat_enable = 0 # ndb_index_stat_update_freq = 20 # Leave as default # ndb_report_thresh_binlog_epoch_slip = 3 # ndb_report_thresh_binlog_mem_usage = 10 # ndb_use_copying_alter_table = 0 # ndb_use_transactions = 1
Genral rules and/or experience
MySQL Cluster v5.0
- Avoid ALTER TABLE in MySQL Cluster.
It needs 2 x the RAM of a table (memory is reclaimed only after a node restart).
DoALTER TABLE <table> ENGINE=MyISAM;
instead, then the changes and then alter back to the cluster. DROP/TRUNCATE
reclaims pages.UPDATE/INSERT/DELETE
causes memory de-fragmentation (same symptoms like a memory leak) ⇒ requires a node restart from time to time.- For production use is still and only v5.0 (exception CGE)!
MySQL Cluster v5.1
- On disk table ⇒ is stable now.
- Parallel node recovery is not yet implemented in v5.1 ⇒ maybe 5.2/6.0?
- Partitioning does not work from the SQL interface (mysqld) but from the NDB-API.
- No more memory de-fragmentation is done (see above).
- Cluster to MySQL replication is still buggy! But should work.
- Cluster to Cluster replication is stable.
- In Cluster replication:
binlog_cache_size = TimeBetweenGlobalCheckpoints/1000 (default 2000) (default 32 kbyte) * total traffic (Mbyte/s) (for example 5 Mbyte/s) = 2000 ms / 1000 ms/s * 5 Mbyte/s ≌ 10 Mybte
pdflush(kswapd)
flushes every 5 seconds pages to disk ⇒ Causes troubles with cluster.
The kernel parametersvm.page-cluster
andvm.lower-zone-protection
(later one does not exist on my maybe non-NUMA system) can help:
/# cat /proc/sys/vm/lower_zone_protection # cat /proc/meminfo # sysctl vm | egrep 'page-cluster|lower-zone-protection'
- For MySQL Cluster on AMD Hardware set NUMA to OFF in BIOS! ⇒ Preferred hardware: Intel!
- File system: Use ext3 or XFS.
Literature
tags: