
PolarDB-X 是由阿里巴巴自主研发的云原生分布式数据库,融合了分布式 SQL 引擎 GalaxySQL 和分布式存储引擎 GalaxyEngine,其中 GalaxyEngine 是新一代面向分布式场景的 MySQL 发行版本,作为官方 MySQL 版本的一个分支,除了吸收和跟进官方版本的持续改进以外,尤其在分布式场景下,实现了 Lizard 分布式事务和全局一致性解决方案、 Galaxy X-Protocol 交互协议 pipeline request、 X-Engine 存储引擎、 Galaxy X-Paxos Cluster 保证数据零丢失并持续可用,以及共享的 RDS MySQL 内核企业级功能等,GalaxyEngine 遵循 GPLv2 License,希望通过开源,回馈社区,共建 MySQL 生态繁荣。
GalaxyEngine 开源的技术细节:
1. Lizard 分布式事务和全局一致性解决方案
1.1 问题背景
1.2 Lizard SCN 单机事务系统
MySQL InnoDB 事务系统
1.从 Active Trx 拷贝一个 Read View;
2.检索行记录,根据 TID 到 Read View 中进行二分查找,判断可见性。

Lizard SCN 事务系统

Delayed Record Cleanout
Flashback Query
SELECT ... FROM tablename AS OF [SCN | TIMESTAMP] expr;
mysql> SELECT * FROM tab AS OF TIMESTAMP '2020-12-17 16:40:40';+----+---------+---------------------+| id | version | gmt_modify |+----+---------+---------------------+| 1 | 1 | 2020-12-17 16:40:38 || 2 | 1 | 2020-12-17 16:40:39 |+----+---------+---------------------+mysql> SELECT * FROM tab AS OF TIMESTAMP '2020-12-17 16:40:55';+----+---------+---------------------+| id | version | gmt_modify |+----+---------+---------------------+| 1 | 2 | 2020-12-17 16:40:54 || 2 | 2 | 2020-12-17 16:40:54 |+----+---------+---------------------+
Lizard SCN 性能表现

1.3 Lizard GCN 分布式事务系统
两阶段提交
SET SESSION innodb_commit_seq = [GCN];XA START xid......XA COMMIT xid;
XA START xid;......XA COMMIT xid $GCN;

全局一致性
SET SESSION innodb_snapshot_seq = [GCN]
SELECT ... FROM tablename AS OF GCN expr;
TSO 解决方案
CREATE SEQUENCE [IF NOT EXISTS] schema.sequence_name [START WITH <constant>] [MINVALUE <constant>] [MAXVALUE <constant>] [INCREMENT BY <constant>] [CACHE <constant> | NOCACHE] [CYCLE | NOCYCLE] ;
1. SELECT [nextval | currval] FROM seq; 2. SELECT [nextval(seq) | currval(seq)]; 3. SELECT [seq.currval | seq.nextval] FROM dual;
CREATE SEQUENCE [IF NOT EXISTS] schema.sequence_name [CACHE <constant> | NOCACHE] TIMESTEAMP;
XA 事务完整性

2. Galaxy X-protocol 交互协议
消息格式
消息处理模型

3. X-Engine 存储引擎
X-Engine 整体架构
1. 利用LSM-tree先天优势,提升了系统写性能天花板。 2. 优化LSM-tree的compaction操作,以降低对系统性能的冲击,使得系统性能表现趋于平稳。
3. 利用持久化数据层只读特点,发挥压缩优势降低成本。 4. 利用天然分层结构,结合硬件能力使用冷热分层结构,降低综合成本。
5. 利用精细化访问机制和缓存技术,弥补LSM-tree读性能短板。

X-Engine 使用方法
建表语法
CREATE TABLE t1 ( c1 int PRIMARY KEY, c2 int) ENGINE = xengine;
X-Engine 性能水准

4. RDS MySQL 企业级功能
4.1 Purge Large Table Asynchronously

mysql> select * from information_schema.innodb_purge_files;+--------+---------------------+--------------------+---------------+-------------------------+--------------+| log_id | start_time | original_path | original_size | temporary_path | current_size |+--------+---------------------+--------------------+---------------+-------------------------+--------------+| 0 | 2021-05-14 14:40:01 | ./file_purge/t.ibd | 146800640 | ./#FP_210514 14:40:01_9 | 79691776 |+--------+---------------------+--------------------+---------------+-------------------------+--------------+1 row in set (0.20 sec)
4.2 Recycle Bin
Recycle Bin 机制介绍
回收机制:
如果是与表无关的对象,根据操作语句决定是否保留,不做回收。 如果是表的附属对象,可能会修改表数据的,做删除处理,例如 Trigger 和 Foreign key。但 Column statistics 不做清理,随表进入回收站。
清理机制:
权限机制:
命名机制:
Recycle Bin 相关参数

Recycle Bin 接口设计
-- Purge Tabledbms_recycle.purge_table('<TABLE>');-- Restore tabledbms_recycle.restore_table('<RECYCLE_TABLE>','<DEST_DB>','<DEST_TABLE>');-- Show tablesmysql> call dbms_recycle.show_tables();+-----------------+---------------+---------------+--------------+---------------------+---------------------+| SCHEMA | TABLE | ORIGIN_SCHEMA | ORIGIN_TABLE | RECYCLED_TIME | PURGE_TIME |+-----------------+---------------+---------------+--------------+---------------------+---------------------+| __recycle_bin__ | __innodb_1063 | product_db | t1 | 2019-08-08 11:01:46 | 2019-08-15 11:01:46 || __recycle_bin__ | __innodb_1064 | product_db | t2 | 2019-08-08 11:01:46 | 2019-08-15 11:01:46 || __recycle_bin__ | __innodb_1065 | product_db | parent | 2019-08-08 11:01:46 | 2019-08-15 11:01:46 || __recycle_bin__ | __innodb_1066 | product_db | child | 2019-08-08 11:01:46 | 2019-08-15 11:01:46 |+-----------------+---------------+---------------+--------------+---------------------+---------------------+4 rows in set (0.00 sec)
4.3 Returning
DBMS_TRANS.returning(<Field_list>,<Statement>);
mysql> call dbms_trans.returning("*", "insert into t(id) values(NULL),(NULL)");+----+------+---------------------+| id | col1 | col2 |+----+------+---------------------+| 1 | 1 | 2019-09-03 10:39:05 || 2 | 1 | 2019-09-03 10:39:05 |+----+------+---------------------+2 rows in set (0.01 sec)mysql> call dbms_trans.returning("id, col1, col2", "update t set col1 = 2 where id >2");+----+------+---------------------+| id | col1 | col2 |+----+------+---------------------+| 3 | 2 | 2019-09-03 10:41:06 || 4 | 2 | 2019-09-03 10:41:06 |+----+------+---------------------+2 rows in set (0.01 sec)mysql> call dbms_trans.returning("id, col1, col2", "delete from t where id < 3");+----+------+---------------------+| id | col1 | col2 |+----+------+---------------------+| 1 | 1 | 2019-09-03 10:40:55 || 2 | 1 | 2019-09-03 10:40:55 |+----+------+---------------------+2 rows in set (0.00 sec)
4.4 Statement Concurrency Control
功能设计
SQL command: SQL命令类型,例如SELECT、UPDATE、INSERT、DELETE等。 Object: SQL命令操作的对象,例如TABLE、VIEW等。
keywords: SQL命令的关键字。
接口设计
-- Add Ruledbms_ccl.add_ccl_rule('<Type>','<Schema_name>','<Table_name>',<Concurrency_count>,'<Keywords>');-- Delete Ruledbms_ccl.del_ccl_rule(<Id>);-- Show Ruledbms_ccl.show_ccl_rule();-- Flush Ruledbms_ccl.flush_ccl_rule();
4.5 Statement Outline
功能设计
Optimizer Hint 根据作用域和 hint 对象,分为 Global level hint、Table/Index level hint、Join order hint等。 Index Hint 根据 Index Hint 的类型和范围进行分类。
outline表中。
接口设计
-- Add optimizer Outlinedbms_outln.add_optimizer_outline('<Schema_name>','<Digest>','<query_block>','<hint>','<query>');-- Add Index Outlinedbms_outln.add_index_outline('<Schema_name>','<Digest>',<Position>,'<Type>','<Hint>','<Scope>','<Query>');-- Delete Outline dbms_outln.del_outline(<Id>);-- Show Outlinemysql> call dbms_outln.show_outline();+------+------------+------------------------------------------------------------------+-----------+-------+------+-------------------------------------------------------+------+----------+-------------------------------------------------------------------------------------+| ID | SCHEMA | DIGEST | TYPE | SCOPE | POS | HINT | HIT | OVERFLOW | DIGEST_TEXT |+------+------------+------------------------------------------------------------------+-----------+-------+------+-------------------------------------------------------+------+----------+-------------------------------------------------------------------------------------+| 33 | outline_db | 36bebc61fce7e32b93926aec3fdd790dad5d895107e2d8d3848d1c60b74bcde6 | OPTIMIZER | | 1 | /*+ SET_VAR(foreign_key_checks=OFF) */ | 1 | 0 | SELECT * FROM `t1` WHERE `id` = ? || 32 | outline_db | 36bebc61fce7e32b93926aec3fdd790dad5d895107e2d8d3848d1c60b74bcde6 | OPTIMIZER | | 1 | /*+ MAX_EXECUTION_TIME(1000) */ | 2 | 0 | SELECT * FROM `t1` WHERE `id` = ? || 34 | outline_db | d4dcef634a4a664518e5fb8a21c6ce9b79fccb44b773e86431eb67840975b649 | OPTIMIZER | | 1 | /*+ BNL(t1,t2) */ | 1 | 0 | SELECT `t1` . `id` , `t2` . `id` FROM `t1` , `t2` || 35 | outline_db | 5a726a609b6fbfb76bb8f9d2a24af913a2b9d07f015f2ee1f6f2d12dfad72e6f | OPTIMIZER | | 2 | /*+ QB_NAME(subq1) */ | 2 | 0 | SELECT * FROM `t1` WHERE `t1` . `col1` IN ( SELECT `col1` FROM `t2` ) || 36 | outline_db | 5a726a609b6fbfb76bb8f9d2a24af913a2b9d07f015f2ee1f6f2d12dfad72e6f | OPTIMIZER | | 1 | /*+ SEMIJOIN(@subq1 MATERIALIZATION, DUPSWEEDOUT) */ | 2 | 0 | SELECT * FROM `t1` WHERE `t1` . `col1` IN ( SELECT `col1` FROM `t2` ) || 30 | outline_db | b4369611be7ab2d27c85897632576a04bc08f50b928a1d735b62d0a140628c4c | USE INDEX | | 1 | ind_1 | 3 | 0 | SELECT * FROM `t1` WHERE `t1` . `col1` = ? AND `t1` . `col2` = ? || 31 | outline_db | 33c71541754093f78a1f2108795cfb45f8b15ec5d6bff76884f4461fb7f33419 | USE INDEX | | 2 | ind_2 | 1 | 0 | SELECT * FROM `t1` , `t2` WHERE `t1` . `col1` = `t2` . `col1` AND `t2` . `col2` = ? |+------+------------+------------------------------------------------------------------+-----------+-------+------+-------------------------------------------------------+------+----------+-------------------------------------------------------------------------------------+7 rows in set (0.00 sec)
4.6 Performance Insight
Performance Insight 介绍
Object statisticsObject statistics查询表和索引的统计信息,包括如下两个表:
TABLE_STATISTICS:记录读取和修改的行。 INDEX_STATISTICS:记录索引的读取行。
Performance pointPerformance point 提供实例的详细性能信息,方便您更快更准确地量化SQL的开销。Performance point包括如下三个维度:
CPU:包括执行任务的总时间(Elapsed time)、CPU执行任务的时间(CPU time)等。 LOCK:包括服务器MDL锁时间、存储事务锁时间、互斥冲突(仅调试模式)、读写锁冲突等。 IO:数据文件读写时间、日志文件写入时间、逻辑读取、物理读取、物理异步读取等。
Object statistics 使用方法
确认参数 OPT_TABLESTAT 和 OPT_INDEXSTAT 的值为 ON。示例如下:
mysql> show variables like "opt_%_stat"; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | opt_indexstat | ON | | opt_tablestat | ON | +---------------+-------+
mysql> select * from TABLE_STATISTICS limit 10; +--------------+--------------+-----------+--------------+------------------------+---------------+--------------+--------------+ | TABLE_SCHEMA | TABLE_NAME | ROWS_READ | ROWS_CHANGED | ROWS_CHANGED_X_INDEXES | ROWS_INSERTED | ROWS_DELETED | ROWS_UPDATED | +--------------+--------------+-----------+--------------+------------------------+---------------+--------------+--------------+ | mysql | db | 2 | 0 | 0 | 0 | 0 | 0 | | mysql | engine_cost | 2 | 0 | 0 | 0 | 0 | 0 | | mysql | proxies_priv | 1 | 0 | 0 | 0 | 0 | 0 | | mysql | server_cost | 6 | 0 | 0 | 0 | 0 | 0 | | mysql | tables_priv | 2 | 0 | 0 | 0 | 0 | 0 | | mysql | user | 7 | 0 | 0 | 0 | 0 | 0 | | test | sbtest1 | 1686 | 142 | 184 | 112 | 12 | 18 | | test | sbtest10 | 1806 | 125 | 150 | 105 | 5 | 15 | | test | sbtest100 | 1623 | 141 | 182 | 110 | 10 | 21 | | test | sbtest11 | 1254 | 136 | 172 | 110 | 10 | 16 | +--------------+--------------+-----------+--------------+------------------------+---------------+--------------+--------------+ mysql> select * from INDEX_STATISTICS limit 10; +--------------+--------------+------------+-----------+ | TABLE_SCHEMA | TABLE_NAME | INDEX_NAME | ROWS_READ | +--------------+--------------+------------+-----------+ | mysql | db | PRIMARY | 2 | | mysql | engine_cost | PRIMARY | 2 | | mysql | proxies_priv | PRIMARY | 1 | | mysql | server_cost | PRIMARY | 6 | | mysql | tables_priv | PRIMARY | 2 | | mysql | user | PRIMARY | 7 | | test | sbtest1 | PRIMARY | 2500 | | test | sbtest10 | PRIMARY | 3007 | | test | sbtest100 | PRIMARY | 2642 | | test | sbtest11 | PRIMARY | 2091 | +--------------+--------------+------------+-----------+
Performance point 使用方法
mysql> show variables like "%performance_point%"; +---------------------------------------+-------+ | Variable_name | Value | +---------------------------------------+-------+ | performance_point_dbug_enabled | OFF | | performance_point_enabled | ON | | performance_point_iostat_interval | 2 | | performance_point_iostat_volume_size | 10000 | | performance_point_lock_rwlock_enabled | ON | +---------------------------------------+-------+
mysql> select * from events_statements_summary_by_digest_supplement limit 10; +--------------------+----------------------------------+-------------------------------------------+--------------+ | SCHEMA_NAME | DIGEST | DIGEST_TEXT | ELAPSED_TIME | ...... +--------------------+----------------------------------+-------------------------------------------+--------------+ | NULL | 6b787dd1f9c6f6c5033120760a1a82de | SELECT @@`version_comment` LIMIT ? | 932 | | NULL | 2fb4341654df6995113d998c52e5abc9 | SHOW SCHEMAS | 2363 | | NULL | 8a93e76a7846384621567fb4daa1bf95 | SHOW VARIABLES LIKE ? | 17933 | | NULL | dd148234ac7a20cb5aee7720fb44b7ea | SELECT SCHEMA ( ) | 1006 | | information_schema | 2fb4341654df6995113d998c52e5abc9 | SHOW SCHEMAS | 2156 | | information_schema | 74af182f3a2bd265678d3dadb53e08da | SHOW TABLES | 3161 | | information_schema | d3a66515192fcb100aaef6f8b6e45603 | SELECT * FROM `TABLE_STATISTICS` LIMIT ? | 2081 | | information_schema | b3726b7c4c4db4b309de2dbc45ff52af | SELECT * FROM `INDEX_STATISTICS` LIMIT ? | 2384 | | information_schema | dd148234ac7a20cb5aee7720fb44b7ea | SELECT SCHEMA ( ) | 129 | | test | 2fb4341654df6995113d998c52e5abc9 | SHOW SCHEMAS | 342 | +--------------------+----------------------------------+-------------------------------------------+--------------+
推荐阅读


点击“阅读原文”查看PolarDB-X更多信息
文章转载自阿里云数据库,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




