【该功能在集群6.0.1版本中引入】
集群支持MySQL标准XA协议,包括XA START,XA END,XA PREPARE,XA COMMIT,XA COMMIT ONE PHASE,XA ROLLBACK,XA RECOVER基础语句。
集群配置要求
集群sqlnode节点配置要求
| 配置参数 | 可选值 | 配置说明 |
|---|---|---|
| greatdb_enable_dtm | ON | 外部XA实现依托于dtm-server保证事务性,因此必须开启dtm |
| greatdb_enable_xa | ON | 集群必须首先支持内部两阶段事务,才能支持全局外部两阶段事务 |
| xa_detach_on_prepare | OFF | 基于性能及使用习惯,以及实现等因素,仅在参数关闭场景下支持集群外部XA事务 |
| xa_report_not_in_recover | ON/OFF | 建议参数值为ON 该参数主要影响执行XA语句的返回值 ON:mysql默认行为,如果一个xid被某个connection持有,返回标准XA协议的ER_XAER_NOTA错误 OFF:如果一个xid被某个connection持有,返回集群新增的ER_XAER_NOT_IN_RECOVER,不是标准的XA协议,外部XA分布式事务框架并不能识别该错误 【该参数为集群新引入参数, 引入版本6.0.1】 |
| xa_remote_mode | ON/OFF | 该参数影响集群执行XA语句的行为,建议参数值为ON,可以在必要的场景下,设置为OFF ON:执行XA RECOVER语句时,会获取所有sqlnode节点的完成prepared状态的xid;执行XA COMMIT/ROLLBACK语句时,如果xid不在当前sqlnode节点,会继续在集群其它sqlnode节点提交/回滚事务 OFF:执行XA RECOVER语句时,只返回当前sqlnode节点的prepared状态的xid;执行XA COMMIT/ROLLBACK语句时,如果xid不在当前sqlnode节点,则直接返回ER_XAER_NOTA错误 【该参数为集群新引入参数, 引入版本6.0.1】 |
集群datanode节点配置要求
| 配置参数 | 可选值 | 配置说明 |
|---|---|---|
| xa_detach_on_prepare | OFF | 基于集群实现逻辑,必须为OFF |
| xa_report_not_in_recover | OFF | 基于集群实现逻辑,必须为OFF |
使用说明
集群层面外部XA使用上,尽量保持和标准MySQL一致的使用行为。
支持的XA语句
xa commit one phase支持,示例
mysql> XA START '1'; Query OK, 0 rows affected (0.00 sec) mysql> SELECT * FROM t1; +----|------+ | c1 | c2 | +----|------+ | 1 | 1 | +----|------+ 1 row in set (0.01 sec) mysql> UPDATE t1 SET c2=c2+1; Query OK, 1 row affected (0.01 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> XA END '1'; Query OK, 0 rows affected (0.00 sec) mysql> XA COMMIT '1' ONE PHASE; Query OK, 0 rows affected (0.01 sec) mysql> SELECT * FROM t1; +----|------+ | c1 | c2 | +----|------+ | 1 | 2 | +----|------+ 1 row in set (0.00 sec)xa prepare + xa commit 支持,示例
mysql> XA START '2'; Query OK, 0 rows affected (0.00 sec) mysql> SELECT * FROM t1; +----|------+ | c1 | c2 | +----|------+ | 1 | 2 | +----|------+ 1 row in set (0.00 sec) mysql> UPDATE t1 SET c2=c2+1; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> XA END '2'; Query OK, 0 rows affected (0.00 sec) mysql> XA PREPARE '2'; Query OK, 0 rows affected (0.01 sec) mysql> XA COMMIT '2'; Query OK, 0 rows affected (0.01 sec) mysql> SELECT * FROM t1; +----|------+ | c1 | c2 | +----|------+ | 1 | 3 | +----|------+ 1 row in set (0.00 sec)xa prepare + xa rollback 支持,示例
mysql> XA START '3'; Query OK, 0 rows affected (0.00 sec) mysql> SELECT * FROM t1; +----|------+ | c1 | c2 | +----|------+ | 1 | 3 | +----|------+ 1 row in set (0.00 sec) mysql> UPDATE t1 SET c2=c2+1; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> XA END '3'; Query OK, 0 rows affected (0.00 sec) mysql> XA PREPARE '3'; Query OK, 0 rows affected (0.01 sec) mysql> XA ROLLBACK '3'; Query OK, 0 rows affected (0.01 sec) mysql> SELECT * FROM t1; +----|------+ | c1 | c2 | +----|------+ | 1 | 3 | +----|------+ 1 row in set (0.01 sec)直接xa rollback支持,示例
mysql> XA START '4'; Query OK, 0 rows affected (0.00 sec) mysql> SELECT * FROM t1; +----|------+ | c1 | c2 | +----|------+ | 1 | 3 | +----|------+ 1 row in set (0.01 sec) mysql> XA END '4'; Query OK, 0 rows affected (0.00 sec) mysql> XA ROLLBACK '4'; Query OK, 0 rows affected (0.01 sec) mysql> SELECT * FROM t1; +----|------+ | c1 | c2 | +----|------+ | 1 | 3 | +----|------+ 1 row in set (0.01 sec)xa recover支持,示例
mysql> XA START '5'; Query OK, 0 rows affected (0.00 sec) mysql> SELECT * FROM t1; +----|------+ | c1 | c2 | +----|------+ | 1 | 3 | +----|------+ 1 row in set (0.00 sec) mysql> XA END '5'; Query OK, 0 rows affected (0.00 sec) mysql> XA PREPARE '5'; Query OK, 0 rows affected (0.00 sec) mysql> XA RECOVER; +----------|--------------|--------------|------+ | formatID | gtrid_length | bqual_length | data | +----------|--------------|--------------|------+ | 1 | 1 | 0 | 5 | +----------|--------------|--------------|------+ 1 row in set (0.00 sec) mysql> XA COMMIT '5'; Query OK, 0 rows affected (0.02 sec) mysql> XA RECOVER; Empty set (0.00 sec)
用户主动关闭连接
对于用户主动关闭连接的场景,集群实现和MySQL原生的xa有些不一致。在执行XA PREPARE语句成功后,关闭连接,原生MySQL可以使用任意其它的连接,执行XA RECOVER查询xid状态,执行XA COMMIT/ROLLBACK进行事务的提交\回滚操作。
如下示例是MySQL的xa行为
--conn1
mysql> XA START 'm1';
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT * FROM n1;
+----|------+
| c1 | c2 |
+----|------+
| 1 | 1 |
+----|------+
1 row in set (0.00 sec)
mysql> UPDATE n1 SET c2=c2+1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> XA END '1';
ERROR 1397 (XAE04): XAER_NOTA: Unknown XID
mysql> XA END 'm1';
Query OK, 0 rows affected (0.00 sec)
mysql> XA PREPARE 'm1';
Query OK, 0 rows affected (0.00 sec)
mysql> quit
Bye
--conn2
mysql> XA RECOVER;
+----------|--------------|--------------|------+
| formatID | gtrid_length | bqual_length | data |
+----------|--------------|--------------|------+
| 1 | 2 | 0 | m1 |
+----------|--------------|--------------|------+
1 row in set (0.00 sec)
mysql> XA COMMIT 'm1';
Query OK, 0 rows affected (0.01 sec)
mysql> SELECT * FROM n1;
+----|------+
| c1 | c2 |
+----|------+
| 1 | 2 |
+----|------+
1 row in set (0.00 sec)
在集群实现中,如果一个sqlnode节点释放了连接,该sqlnode将同时释放这个xid;同时,由dtm-server的后台线程去接管,并在dtm-server节点重做该xid。用户可以在dtm-server节点通过XA RECOVER/COMMIT/ROLLBACK查看、提交、回滚该xid,或者在开启xa_remote_mode模式下,在任意online的sqlnode节点执行XA RECOVER/COMMIT/ROLLBACK。
如下是集群中关闭连接的示例场景:
--conn1 非dtm-server节点
mysql> XA START '6';
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT * FROM t1;
+----|------+
| c1 | c2 |
+----|------+
| 1 | 3 |
+----|------+
1 row in set (0.00 sec)
mysql> UPDATE t1 SET c2=c2+1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> XA END '6';
Query OK, 0 rows affected (0.00 sec)
mysql> XA PREPARE '6';
Query OK, 0 rows affected (0.00 sec)
mysql> quit
Bye
--conn2 非dtm-server节点
mysql> set xa_remote_mode=OFF;
Query OK, 0 rows affected (0.00 sec)
mysql> XA RECOVER;
Empty set (0.00 sec)
mysql> set xa_remote_mode=ON;
Query OK, 0 rows affected (0.00 sec)
mysql> XA RECOVER;
+----------|--------------|--------------|------+
| formatID | gtrid_length | bqual_length | data |
+----------|--------------|--------------|------+
| 1 | 1 | 0 | 6 |
+----------|--------------|--------------|------+
1 row in set (0.10 sec)
--conn3 dtm-server节点
mysql> set xa_remote_mode=OFF;
Query OK, 0 rows affected (0.00 sec)
mysql> XA RECOVER;
+----------|--------------|--------------|------+
| formatID | gtrid_length | bqual_length | data |
+----------|--------------|--------------|------+
| 1 | 1 | 0 | 6 |
+----------|--------------|--------------|------+
1 row in set (0.00 sec)
mysql> set xa_remote_mode=ON;
Query OK, 0 rows affected (0.00 sec)
mysql> XA RECOVER;
+----------|--------------|--------------|------+
| formatID | gtrid_length | bqual_length | data |
+----------|--------------|--------------|------+
| 1 | 1 | 0 | 6 |
+----------|--------------|--------------|------+
1 row in set (0.10 sec)
--conn4 非dtm-server节点
mysql> set xa_remote_mode=OFF;
Query OK, 0 rows affected (0.00 sec)
mysql> XA COMMIT '6';
ERROR 1397 (XAE04): XAER_NOTA: Unknown XID
mysql> set xa_remote_mode=ON;
Query OK, 0 rows affected (0.00 sec)
mysql> XA COMMIT '6';
Query OK, 0 rows affected (0.02 sec)
mysql> SELECT * FROM t1;
+----|------+
| c1 | c2 |
+----|------+
| 1 | 4 |
+----|------+
1 row in set (0.00 sec)
事务执行异常
这里主要考虑事务在执行增删改查等操作过程中,出现错误并导致隐式回滚等场景的问题。
首先,可以对比下MySQL原生xa在事务隐式回滚时的行为。如下示例,在执行XA COMMIT后,会报错XA_RBDEADLOCK,并清理xid状态,此时,session-1连接可以继续执行下一个事务。
| session-1 | session-2 |
|---|---|
| truncate table n1; insert into n1 values (1,1),(2,2); | |
| XA START 'm2'; UPDATE n1 SET c2=c2+1 WHERE c1=1; | |
| BEGIN; UPDATE n1 SET c2=c2+10 WHERE c1=2; | |
| UPDATE n1 SET c2=c2+10 WHERE c1=1; | |
| UPDATE n1 SET c2=c2+1 WHERE c1=2; ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction | |
| XA COMMIT 'm2'; ERROR 1614 (XA102): XA_RBDEADLOCK: Transaction branch was rolled back: deadlock was detected |
集群在开启隐式回滚参数后,保持和MySQL原生xa一致的行为。如下示例:
mysql> show variables like '%greatdb%rollback%';
+-------------------------------------|-------+
| Variable_name | Value |
+-------------------------------------|-------+
| greatdb_implicit_rollback_error_trx | ON |
+-------------------------------------|-------+
1 row in set (0.00 sec)
mysql> truncate table t1;
Query OK, 0 rows affected (0.11 sec)
mysql> insert into t1 values (1,1),(2,2);
Query OK, 2 rows affected (0.02 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> XA START '7';
Query OK, 0 rows affected (0.00 sec)
mysql> UPDATE t1 SET c2=c2+1 WHERE c1=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> INSERT INTO t1 VALUES (2,2);
ERROR 1062 (23000): Duplicate entry '2' for key 't1.PRIMARY'
mysql> XA COMMIT '7';
ERROR 1402 (XA100): XA_RBROLLBACK: Transaction branch was rolled back
当然,在关闭隐式提交参数的场景下,保持和普通事务一致的行为,事务无法执行新的操作,并且事务必须回滚。如下示例:
mysql> set global greatdb_implicit_rollback_error_trx=off;
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like '%greatdb%rollback%';
+-------------------------------------|-------+
| Variable_name | Value |
+-------------------------------------|-------+
| greatdb_implicit_rollback_error_trx | OFF |
+-------------------------------------|-------+
1 row in set (0.01 sec)
mysql> XA START '8';
Query OK, 0 rows affected (0.00 sec)
mysql> UPDATE t1 SET c2=c2+1 WHERE c1=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> INSERT INTO t1 VALUES (2,2);
ERROR 1062 (23000): Duplicate entry '2' for key 't1.PRIMARY'
mysql> INSERT INTO t1 VALUES (3,3);
ERROR 8539 (HY000): There have error occur in transaction, should rollback
mysql> XA COMMIT '8';
ERROR 1399 (XAE07): XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
mysql> XA END '8';
Query OK, 0 rows affected (0.00 sec)
mysql> XA COMMIT '8';
ERROR 1399 (XAE07): XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
mysql> XA PREPARE '8';
ERROR 1402 (XA100): XA_RBROLLBACK: Transaction branch was rolled back
mysql> SELECT * FROM t1;
+----|------+
| c1 | c2 |
+----|------+
| 1 | 1 |
| 2 | 2 |
+----|------+
2 rows in set (0.00 sec)
sqlnode节点异常
在存在外部XA事务的场景下,如果sqlnode节点异常,退出了group_replication,此时,sqlnode节点会强制kill掉所有xa事务的连接。如此操作的目的是,释放sqlnode同datanode之间的连接,dtm-server节点可以依赖后台线程接管xid,方便用户及时处理故障。如下示例:
--conn1
mysql> xa start '11';
Query OK, 0 rows affected (0.00 sec)
mysql> update t1 set c2=c2+1 where c1=2;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> xa end '11';
Query OK, 0 rows affected (0.00 sec)
mysql> xa prepare '11';
Query OK, 0 rows affected (0.00 sec)
--conn2
stop group_replication;
--conn1
mysql> select * from t1;
ERROR 2013 (HY000): Lost connection to MySQL server during query
当然,这里需要注意的一个问题是,在数据库sqlnode节点本身未出现shutdown故障时,会报错Lost connection。
datanode节点异常
- 对于XA COMMIT ONE PHASE,保持和普通dtm事务一致的行为。
- 如果在执行XA PREPARE,XA COMMIT,XA ROLLBACK 过程中,出现了后端节点异常,如果不能确定事务最终执行成功或者失败,一般报错ER_XAER_RMERR,这是XA协议的标准错误码。
- 对于XA COMMIT,会先往系统表
mysql.greatdb_dtm_external_xa_commit_flag中写入commit-flag标记,只要该标记写入成功,数据会最终提交 - 对于XA COMMIT,只要完成dtm-server提交,不论是否完成后端数据节点的提交,都直接返回事务提交成功
- 对于XA ROLLBACK,只要后端任意一个数据节点完成事务回滚,事务将最终回滚;但如果后端datanode节点异常,当前连接会一直等待直到所有的datanode均完成回滚后才返回用户回滚完成
- 对于XA RECOVER操作,如果后端datanode节点异常,导致无法查询,可能会报错ER_XAER_RMERR。如果用户关闭xa-remote-mode参数,只获取当前sqlnode节点的xid信息,则不会感知后端datanode节点异常。




