You are here
Online DDL vs pt-online-schema-change
One of the most expensive database operations is performing Data Definition Language (DDL, e.g. CREATE, DROP, ALTER, etc.) statements, specially, the ALTER statements because MySQL blocks the entire table for both reads and writes while modifying the table.
For the huge tables, this might take hours to get the table changed which affects the application, so that, a good planning is required for such operations in order to avoid doing these changes during the peak times. For those people who have 24/7 services or limited maintenance window, DDL on huge tables is a really nightmare.
Percona developed a very good tool called pt-online-schema-change (version 2.2.6 at the time of writing this article) to perform such operations online without blocking/affecting the application and read/write operations to the table being changed is available.
Also MySQL made some enhancements for DDL statements and introduced the Online DDL feature in MySQL 5.6.
In this article, I will talk about an overview of both ways (Online DDL & pt-online-schema-change) alongside with an example and which one of them should be used in different scenarios.
pt-online-schema-change
Overview
This tool is developed by Percona to alter tables without locking them during the ALTER operation.
Simply, this tool creates a new empty table like the original table with the needed structure change, copy the data from the original table in small chunks to the new table, drop the original table and then rename the new table to the original name. During the copy process all new changes to the original table are being applied to the new one because a trigger is created on the original table which ensure that all new changes will be applied on the new table.
For more information about pt-online-schema-change tool, check out the manual documentation.
Example
Altering a table called "test.test1" by adding an index (name_idx) on column "name":
[root@gcservera ~]# pt-online-schema-change --execute --alter "add index name_idx (name)" D=test,t=test1,h=localhost Operation, tries, wait: copy_rows, 10, 0.25 create_triggers, 10, 1 drop_triggers, 10, 1 swap_tables, 10, 1 update_foreign_keys, 10, 1 Altering `test`.`test1`... Creating new table... Created new table test._test1_new OK. Altering new table... Altered `test`.`_test1_new` OK. 2014-02-09T15:33:27 Creating triggers... 2014-02-09T15:33:27 Created triggers OK. 2014-02-09T15:33:27 Copying approximately 1 rows... 2014-02-09T15:33:27 Copied rows OK. 2014-02-09T15:33:27 Swapping tables... 2014-02-09T15:33:27 Swapped original and new tables OK. 2014-02-09T15:33:27 Dropping old table... 2014-02-09T15:33:27 Dropped old table `test`.`_test1_old` OK. 2014-02-09T15:33:27 Dropping triggers... 2014-02-09T15:33:27 Dropped triggers OK. Successfully altered `test`.`test1`.
Note:
The output is perfectly describing all steps that the tool is doing in the background.
Limitations of pt-online-schema-change
- A PRIMARY KEY or a unique index should be defined for the table before using this tool because it is required for the DELETE trigger.
- Not supported if the table has already triggers defined.
- The tool become complicate a little if the table has a foreign key constraint and an additional option
--alter-foreign-keys-method
should be used. - Also because of the foreign keys, the object names might be changed (indexes names , .. etc).
- In Galera Cluster environment, altering MyISAM tables is not supported and the system variable "wsrep_OSU_method" must be set to "TOI" (total order isolation).
Online DDL
Overview
In MySQL 5.5 and 5.1 with the InnoDB plugin, a new feature known as Fast Index Creation was introduced to avoid copying the tables data - when adding or removing secondary indexes - using the optimized CREATE INDEX and DROP INDEX statements.
In MySQL 5.6, the Online DDL method was introduced to allow more changes to be made on the table while accessing and writing to the table being changed is available.
The Online DDL syntax is exactly the same like the normal alter statement after specifying two parameters:
ALGORITHM:- INPLACE: the table change will be made in-place without rebuilding the entire table (in most cases, no copying data to temporary table is required).
- COPY: copying data to a temporary table, rebuilding the table and reconstructing the secondary indexes will be made (equivalent to the traditional method).
- NONE: Read and write operations are allowed during the altering process.
- SHARED: Only read operations are allowed during the altering operations (DML is not allowed).
- EXCLUSIVE: The entire table will be locked for both reading and writing (neither select nor DML are allowed).
The Online DDL is perfectly explained in the online manual documentation, you can check it out here for more information.
Example
Altering a table called "test.test2" by adding an index (name_idx) on column "name":
mysql> alter table test2 -> add index name_idx (name),algorithm=inplace, lock=none; Query OK, 0 rows affected (0.03 sec) Records: 0 Duplicates: 0 Warnings: 0
Limitations of Online DDL
- Works only with InnoDB (syntax wise it could be used with other storage engines "like MyISAM" but only "algorithm=copy" is allowed which is equivalent to the traditional method).
- Regardless of the locking used (none,shared or exclusive) a brief period at the beginning and at the end of the process is requiring an exclusive lock on the table.
foreign_key_checks
should be disabled when adding/dropping foreign keys to avoid table copying behavior.- Still some alter operations require table copying or table locking in order to make the change (the old behavior). For more details on which table change require table-copying or table locking, check out this manual page.
- LOCK=NONE is not allowed in the alter table statement if there are ON...CASCADE or ON...SET NULL constraints on the table.
- While the Online DDL will be replicated on the slaves the same like the master (if LOCK=NONE no table-locking will take place on the slaves during the alter execution) but the replication itself will be blocked as the replay process executes in a single thread on the replicas which will cause slave lagging problem.
Comparison results
The following is a comparison results between Online DDL and pt-online-schema-change for some alter operations applied on a table contains 1,078,880 rows:
Online DDL | pt-online-schema-change | |||||
---|---|---|---|---|---|---|
Change Operation | Row(s) affected | Is table locked? | Time (sec) | Row(s) affected | Is table locked? | Time (sec) |
Add Index | 0 | No | 3.76 | All rows | No | 38.12 |
Drop Index | 0 | No | 0.34 | All rows | No | 36.04 |
Add Column | 0 | No | 27.61 | All rows | No | 37.21 |
Rename Column | 0 | No | 0.06 | All rows | No | 34.16 |
Rename Column + change its data type | All rows | Yes | 30.21 | All rows | No | 34.23 |
Drop Column | 0 | No | 22.41 | All rows | No | 31.57 |
Change table ENGINE | All rows | Yes | 25.30 | All rows | No | 35.54 |
Which method should be used?
Now the question is, which method should we use to perform alter table statements?
While pt-online-schema-change allows read and write operations to the table being altered, it still copies the tables data to a temporary table in the background which adds overhead on the MySQL server. So basically, we should use pt-online-schema-change if the Online DDL will not work efficiently. In other words, if the Online DDL will require copying data to a temporary table (algorithm=copy) and the table will be blocked for long time (lock=exclusive) or when altering huge tables in a replication environment then we should use pt-online-schema-change tool.
- abdel-mawla's blog
- Log in or register to post comments
Comments
Replication
Replication in Online DDL
Hi Shlomi
Can you please explain more on how did you produce your results?
Because Online DDL are DDL statements anyway and will be written to the binary log as statements (even if RBR is being used) which means that it will be executed on the slave the same like it was on the master.
By the way, I've tested it in a replication environment and the table being changed on the slave was not blocked during the statement execution and it took approximately the same or even less time than it was on the master.
Also I double checked the online documentation and didn't find any hints about slave blocking with online DDL Limitations of Online DDL.
With online DDL an ALTER
Advantage of pt-online-schema-change
Hi Przemek,
I do agree with you that the replication will be blocked until the slaves finish executing the alter statement, but the table being changed on the slaves themselves wont be blocked during the alter statement the same like the master.
I agree also that this could be considered as an advantage of pt-online-schema-change over Online DDL.
I'll add that to the blog.
Thanks Przemek for the hint ...