Attribute promotion and demotion in the MariaDB Galera Cluster

In MariaDB master/slave replication there is a feature called attribute promotion/demotion.

Simply put, it defines how the slave behaves, or should behave, when master and slave have different column definitions, a different number of columns or even a different column order.
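
A minimal sketch with a hypothetical table t1 to make this concrete: under row-based replication the same column can be defined narrower on the master than on the slave (promotion, non-lossy) or wider (demotion, potentially lossy):

master> CREATE TABLE t1 (c1 SMALLINT);

-- Promotion (non-lossy): the slave column is wider than on the master
slave> CREATE TABLE t1 (c1 INT);

-- Demotion (lossy): the slave column is narrower than on the master
slave> CREATE TABLE t1 (c1 TINYINT);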

Use case of the customer

This week we discussed with a customer how he could perform a rolling schema upgrade (RSU) in a Galera Cluster.

Previous schema changes had always caused problems, which led to a total failure of the cluster lasting several hours.

The customer says that columns are never deleted and new columns are only ever added at the end of a table.

He also says that it is NOT possible to ensure that there are no write connections during the rolling schema upgrade.

The PHP ORM framework Doctrine is used.

What does the MariaDB documentation say about this?

Studying the MariaDB documentation did not yield a conclusive answer as to whether a rolling schema upgrade during live Galera Cluster operation WITH concurrent changes (DML statements) on the schema to be altered (AND the tables to be altered) is supported, and thus whether it should work or not.

Source: Rolling Schema Upgrade (RSU)

On replicating with different table structures there is only general information about replication, but nothing specific to Galera (whether it works or not):

"Tables on the replica and the primary do not need to have the same definition in order for replication to take place. There can be differing numbers of columns, or differing data definitions and, in certain cases, replication can still proceed."

Source: Replication When the Primary and Replica Have Different Table Definitions
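
A minimal sketch of the customer's pattern, with a hypothetical table t1: the replica has an extra column appended at the end, which row-based events from the primary leave to be filled with its default value:

primary> CREATE TABLE t1 (id INT UNSIGNED PRIMARY KEY, data VARCHAR(64));

-- Extra trailing column on the replica, filled with its DEFAULT on apply
replica> CREATE TABLE t1 (id INT UNSIGNED PRIMARY KEY, data VARCHAR(64),
             c1 VARCHAR(64) NOT NULL DEFAULT 'foo');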

For the attribute promotion/demotion feature there is a special MariaDB Server configuration parameter that controls the behaviour: slave_type_conversions.

If you read between the lines here, this could also work for Galera Cluster:

"Determines the type conversion mode on the replica when using row-based replication, including replications in MariaDB Galera cluster."

Source: slave_type_conversions
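
The variable is dynamic, so it can be changed at runtime. A minimal sketch of the possible modes (when left empty, the default, no type conversion between master and slave is permitted at all):

-- Allow both widening (non-lossy) and narrowing (lossy) conversions
SQL> SET GLOBAL slave_type_conversions = 'ALL_NON_LOSSY,ALL_LOSSY';

-- Verify the setting
SQL> SHOW GLOBAL VARIABLES LIKE 'slave_type_conversions';

-- The default (empty): no attribute promotion/demotion is permitted
SQL> SET GLOBAL slave_type_conversions = '';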

Test planning

For such questions I always make a rough risk assessment: frequently used and widely deployed features carry a rather low risk of problems; rarely used or new features carry a high one! Unfortunately, this assessment is based less on tangible figures and more on experience...

I am not aware of a single MariaDB user who performs rolling schema upgrades during live operation, let alone with a concurrent write load on the schema and tables currently being changed (attribute promotion/demotion).
So: use of rolling schema upgrade x frequency of use of attribute promotion/demotion is very rare, and therefore the risk is very high!

As the situation was not clear and the documentation did not provide any definitive information, tests were carried out.

To avoid running into possibly already fixed MariaDB bugs, the latest MariaDB 11.8.5 LTS version was used.

The following special database configuration parameter was used:

slave_type_conversions = 'ALL_NON_LOSSY,ALL_LOSSY'

Our usual test table looks as follows:

CREATE TABLE `test` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `data` varchar(128) DEFAULT NULL,
  `ts` timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
  PRIMARY KEY (`id`)
);

And test data was generated as follows:

INSERT INTO test VALUES (NULL, 'Some data to fill table up', NOW());
... -- 9 x
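
As an aside, the nine repetitions can also be generated in a single statement, e.g. with MariaDB's built-in SEQUENCE engine (a sketch; the seq_1_to_9 table is provided automatically by the Sequence plugin):

SQL> INSERT INTO test (data, ts)
     SELECT 'Some data to fill table up', NOW()
       FROM seq_1_to_9;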

Commands for monitoring the tests:

SQL> SHOW GLOBAL VARIABLES LIKE 'slave_type_conversions';
SQL> SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
SQL> SHOW CREATE TABLE test\G
SQL> CHECKSUM TABLE test;
SQL> SELECT * FROM test;

Testing

Test 1: Attribute promotion with DML remote (on node C)

This is the simplest case: a column is added and a default value is set. The schema change (DDL) is executed on node A, while the data changes (DML) are made on another node (node C):

nodeA> SET SESSION wsrep_OSU_method = 'RSU';
nodeA> ALTER TABLE test ADD COLUMN c1 VARCHAR(64) NOT NULL DEFAULT 'foo';
nodeA> SET SESSION wsrep_OSU_method = 'TOI';

MariaDB error log file:

2025-11-28  9:39:11 0 [Note] WSREP: Member 0.0 (Node A) desyncs itself from group
2025-11-28  9:39:11 0 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 36)
2025-11-28  9:39:11 13 [Note] WSREP: pause
2025-11-28  9:39:11 13 [Note] WSREP: Provider paused at 22db9ea1-cb7b-11f0-b26e-6fbce72757f9:36 (43)
2025-11-28  9:39:11 13 [Note] WSREP: Provider paused at: 36
2025-11-28  9:39:11 13 [Note] WSREP: resume
2025-11-28  9:39:11 13 [Note] WSREP: resuming provider at 43
2025-11-28  9:39:11 13 [Note] WSREP: Provider resumed.
2025-11-28  9:39:11 0 [Note] WSREP: Member 0.0 (Node A) resyncs itself to group.
2025-11-28  9:39:11 0 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 36)
2025-11-28  9:39:11 0 [Note] WSREP: Processing event queue:... -nan% (0/0 events) complete.
2025-11-28  9:39:11 0 [Note] WSREP: Member 0.0 (Node A) synced with group.
2025-11-28  9:39:11 0 [Note] WSREP: Processing event queue:... 100.0% (1/1 events) complete.
2025-11-28  9:39:11 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 36)
2025-11-28  9:39:11 7 [Note] WSREP: Server Node A synced with group

Then the DML commands on the remote node:

nodeC> INSERT INTO test VALUES (NULL, 'Some data to fill table up', NOW());
nodeC> UPDATE test SET data = 'Some data changed' WHERE id = 10;
nodeC> DELETE FROM test WHERE id = 19;

The ALTER TABLE command was then also executed on nodes B and C.

All 3 operations worked perfectly. The cluster is still fully functional. Subsequently:

SQL> ALTER TABLE test DROP COLUMN c1;

to return the system to its initial state for further tests.
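
Consistency can then be verified by comparing the table checksums, which must be identical on all three nodes:

nodeA> CHECKSUM TABLE test;
nodeB> CHECKSUM TABLE test;
nodeC> CHECKSUM TABLE test;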

Test 2: Attribute promotion with DML remote (on node B)

If you do the same experiment on node B, the cluster will blow up in your face!

nodeB> INSERT INTO test VALUES (NULL, 'Some data to fill table up', NOW());

nodeC> SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
+---------------------------+--------------+
| Variable_name             | Value        |
+---------------------------+--------------+
| wsrep_local_state_comment | Inconsistent |
+---------------------------+--------------+

MariaDB error log file:

[Note] WSREP: Member 1(Node A) initiates vote on 22db9ea1-cb7b-11f0-b26e-6fbce72757f9:45,dbc9a6ea898b6a29:  Got error 171 "The event was corrupt, leading to illegal data being read" from storage engine InnoDB, Error_code: 1030;
[Note] WSREP: Votes over 22db9ea1-cb7b-11f0-b26e-6fbce72757f9:45:
   dbc9a6ea898b6a29:   1/3
Waiting for more votes.
[Note] WSREP: Member 2(Node C) initiates vote on 22db9ea1-cb7b-11f0-b26e-6fbce72757f9:45,dbc9a6ea898b6a29:  Got error 171 "The event was corrupt, leading to illegal data being read" from storage engine InnoDB, Error_code: 1030;
[Note] WSREP: Votes over 22db9ea1-cb7b-11f0-b26e-6fbce72757f9:45:
   dbc9a6ea898b6a29:   2/3
Winner: dbc9a6ea898b6a29
[Note] WSREP: Got vote request for seqno 22db9ea1-cb7b-11f0-b26e-6fbce72757f9:45
[Note] WSREP: Recovering vote result from history: 22db9ea1-cb7b-11f0-b26e-6fbce72757f9:45,dbc9a6ea898b6a29
[ERROR] WSREP: Vote 0 (success) on 22db9ea1-cb7b-11f0-b26e-6fbce72757f9:45 is inconsistent with group. Leaving cluster.
[Note] WSREP: Closing send monitor...
[Note] WSREP: Closed send monitor.
[Note] WSREP: gcomm: terminating thread
[Note] WSREP: gcomm: joining thread
[Note] WSREP: gcomm: closing backend
[Note] WSREP: view(view_id(NON_PRIM,1456a640-aca0,15) memb {
        1456a640-aca0,0
} joined {
} left {
} partitioned {
        97230fdb-b973,0
        a2e10f2c-8929,0
})
[Note] WSREP: PC protocol downgrade 1 -> 0
[Note] WSREP: view((empty))
[Note] WSREP: gcomm: closed
[Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
[Note] WSREP: Flow-control interval: [16, 16]
[Note] WSREP: Received NON-PRIMARY.
[Note] WSREP: Shifting SYNCED -> OPEN (TO: 45)
[Note] WSREP: New SELF-LEAVE.
[Note] WSREP: Flow-control interval: [0, 0]
[Note] WSREP: Received SELF-LEAVE. Closing connection.
[Note] WSREP: Shifting OPEN -> CLOSED (TO: 45)
[Note] WSREP: RECV thread exiting 0: Success
[Note] WSREP: recv_thread() joined.
[Note] WSREP: Closing send queue.
[Note] WSREP: Closing receive queue.
[Note] WSREP: ================================================
View:
  id: 22db9ea1-cb7b-11f0-b26e-6fbce72757f9:45
  status: non-primary
  protocol_version: 4
  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
  final: no
  own_index: 0
  members(1):
        0: 1456a640-cc36-11f0-aca0-e388a5f80ba9, Node B
=================================================
[Note] WSREP: Non-primary view
[Note] WSREP: Server status change synced -> connected
[Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
[Note] WSREP: ================================================
View:
  id: 22db9ea1-cb7b-11f0-b26e-6fbce72757f9:45
  status: non-primary
  protocol_version: 4
  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
  final: yes
  own_index: -1
  members(0):
=================================================
[Note] WSREP: Non-primary view
[Note] WSREP: Server status change connected -> disconnected
[Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
[Note] WSREP: Applier thread exiting ret: 6 thd: 2
[Note] WSREP: Applier thread exiting ret: 6 thd: 9
[Warning] Aborted connection 2 to db: 'unconnected' user: 'unauthenticated' host: '' (This connection closed normally without authentication)
[Note] WSREP: Applier thread exiting ret: 6 thd: 5
[Warning] Aborted connection 5 to db: 'unconnected' user: 'unauthenticated' host: '' (This connection closed normally without authentication)
[Warning] Aborted connection 9 to db: 'unconnected' user: 'unauthenticated' host: '' (This connection closed normally without authentication)
[Note] WSREP: Service thread queue flushed.
[Note] WSREP: ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: 6
[Note] WSREP: Applier thread exiting ret: 0 thd: 6
[Warning] Aborted connection 6 to db: 'unconnected' user: 'unauthenticated' host: '' (This connection closed normally without authentication)

We have no idea why nodes B and C behave differently. But this makes the whole rolling schema upgrade (RSU) process completely arbitrary and unplannable.

We tested the whole thing in different variants: sometimes the node becomes inconsistent, sometimes it does not.

The inconsistent node B was synchronised back into the cluster via a forced SST (e.g. by removing its grastate.dat file and restarting the node).

Test 3: Attribute promotion with DML remote (on node C)

This case is somewhat trickier, as no default value is specified.

nodeA> SET SESSION wsrep_OSU_method = 'RSU';
nodeA> ALTER TABLE test ADD COLUMN c1 VARCHAR(64) NOT NULL;   -- DEFAULT ''
nodeA> SET SESSION wsrep_OSU_method = 'TOI';

nodeC> INSERT INTO test VALUES (NULL, 'Some data to fill table up', NOW());
nodeC> UPDATE test SET data = 'Some data changed' WHERE id = 13;
nodeC> DELETE FROM test WHERE id = 22;

The ALTER TABLE command was then also executed on nodes B and C.

All 3 operations worked perfectly. The cluster is still fully functional.

Test 4: Attribute promotion with DML remote (on node B)

The same test, but the ALTER TABLE ... ADD COLUMN command is executed on node A and the DML command on node B.

Nodes A and C become "Inconsistent" and node B is still "Synced".

Test 5: Attribute promotion with DML locally (on node A)

An analogous test, but the DML command is executed locally on the same node as the DDL command.

nodeA> SET SESSION wsrep_OSU_method = 'RSU';
nodeA> ALTER TABLE test ADD COLUMN c1 VARCHAR(64) NOT NULL;   -- DEFAULT ''
nodeA> SET SESSION wsrep_OSU_method = 'TOI';

nodeA> INSERT INTO test VALUES (NULL, 'Some data to fill table up', NOW());
ERROR 1136 (21S01): Column count doesn't match value count at row 1

nodeA> INSERT INTO test (id, data, ts) VALUES (NULL, 'Some data to fill table up', NOW());
ERROR 1364 (HY000): Field 'c1' doesn't have a default value

nodeA> INSERT INTO test (id, data, ts, c1) VALUES (NULL, 'Some data to fill table up', NOW(), '');

nodeA> SELECT * FROM test;
ERROR 1047 (08S01): WSREP has not yet prepared node for application use

nodeA> SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
+---------------------------+--------------+
| Variable_name             | Value        |
+---------------------------+--------------+
| wsrep_local_state_comment | Inconsistent |
+---------------------------+--------------+

The cluster node became inconsistent. Interestingly enough, during further testing the whole thing suddenly worked! So: completely unpredictable...

Test 6: DDL on 2 nodes

A new question: what happens when the DDL command has already been executed on 2 of the 3 nodes and DML commands then occur?

nodeA> SET SESSION wsrep_OSU_method = 'RSU';
nodeA> ALTER TABLE test ADD COLUMN c1 VARCHAR(64) NOT NULL DEFAULT 'foo';
nodeA> SET SESSION wsrep_OSU_method = 'TOI';

nodeB> SET SESSION wsrep_OSU_method = 'RSU';
nodeB> ALTER TABLE test ADD COLUMN c1 VARCHAR(64) NOT NULL DEFAULT 'foo';
nodeB> SET SESSION wsrep_OSU_method = 'TOI';

nodeC> INSERT INTO test (id, data, ts) VALUES (NULL, 'Some data to fill table up', NOW());

This works, but:

nodeB> INSERT INTO test (id, data, ts) VALUES (NULL, 'Some data to fill table up', NOW());

nodeB> SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
+---------------------------+--------------+
| Variable_name             | Value        |
+---------------------------+--------------+
| wsrep_local_state_comment | Inconsistent |
+---------------------------+--------------+

Test 7: UPDATE and DELETE commands from the same node

nodeA> SET SESSION wsrep_OSU_method = 'RSU';
nodeA> ALTER TABLE test ADD COLUMN c1 VARCHAR(64) NOT NULL DEFAULT 'foo';
nodeA> SET SESSION wsrep_OSU_method = 'TOI';

nodeA> UPDATE test SET data = 'Some data changed' WHERE id = 16;

nodeA> SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
+---------------------------+--------------+
| Variable_name             | Value        |
+---------------------------+--------------+
| wsrep_local_state_comment | Inconsistent |
+---------------------------+--------------+

nodeA> DELETE FROM test WHERE id = 25;
nodeA> SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+

...
[Warning] WSREP: Ignoring error 'Can't find record in 'test'' on Delete_rows_v1 event. Error_code: 1032
[Warning] Slave SQL: Could not execute Delete_rows_v1 event on table test.test; Can't find record in 'test', Error_code: 1032; Can't find record in 'test', Error_code: 1032; ... (same error repeated many times, message truncated)
[ERROR] Slave SQL: Could not read field 'id' of table 'test.test', Internal MariaDB error code: 1610
[ERROR] mariadbd: Can't find record in 'test'
...

Process list from node C:

nodeC> SHOW PROCESSLIST;
+----+-------------+-----------+------+---------+------+-------------------------+----------------------------------------------------------------------------------------+----------+
| Id | User        | Host      | db   | Command | Time | State                   | Info                                                                                   | Progress |
+----+-------------+-----------+------+---------+------+-------------------------+----------------------------------------------------------------------------------------+----------+
|  2 | system user |           | NULL | Sleep   | 1670 | After apply log event   | NULL                                                                                   |    0.000 |
|  1 | system user |           | NULL | Sleep   | 4609 | wsrep aborter idle      | NULL                                                                                   |    0.000 |
|  7 | system user |           | NULL | Sleep   | 4608 |                         | NULL                                                                                   |    0.000 |
|  8 | system user |           | test | Sleep   | 1441 | Executing               | DELETE FROM test WHERE id = 25?5jR                                                     |    0.000 |
| 10 | system user |           | NULL | Sleep   | 2002 | wsrep applied write set | INSERT INTO test (id, data, ts) VALUES (NULL, 'Some data to fill table up', NOW()) 9O? |    0.000 |
| 28 | root        | localhost | NULL | Query   |    0 | starting                | show processlist                                                                       |    0.000 |
+----+-------------+-----------+------+---------+------+-------------------------+----------------------------------------------------------------------------------------+----------+

Node C is still synchronised:

nodeC> SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+

Shutting down the node resulted in the following error message:

nodeC> SHUTDOWN;
ERROR 1047 (08S01): WSREP has not yet prepared node for application use

The entire cluster was then no longer usable and had to be restarted from scratch (bootstrap, e.g. with galera_new_cluster on the first node).

Summary

We have opened a bug report with MariaDB on this topic: Attribute Promotion/Demotion in Galera Cluster.

Further tests were not carried out for the time being, as this feature is too unstable in the tested version and DROP COLUMN is not a use case for our customer.

Conclusion: MariaDB Galera Cluster does not handle this situation properly in the version tested, nor does the cluster prevent this case. Cluster nodes are marked as inconsistent. Our current recommendation: do NOT perform a rolling schema upgrade (RSU) with concurrent DML commands (INSERT, UPDATE, DELETE) on the tables to be changed, at least with the version analysed!
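
If concurrent DML cannot be ruled out, the safer alternative is the default total order isolation (TOI) method: the ALTER TABLE is replicated and executed on all nodes at the same position in the replication stream, so no node ever applies row events against a diverging table definition, at the price of a cluster-wide blocking DDL:

nodeA> SET SESSION wsrep_OSU_method = 'TOI';   -- the default setting
nodeA> ALTER TABLE test ADD COLUMN c1 VARCHAR(64) NOT NULL DEFAULT 'foo';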
