tidb本身是一个分布式数据库,省去了分库分表的麻烦,对于mysql分库分表后数据库,如果不是分表,分库key维度的查询,需要跨多个表很麻烦,tidb是选择之一。如何把mysql数据同步到tidb?dm是tidb官方提供的存量、增量数据同步工具。本文将介绍如何在mac上实现服务搭建。
dm是经典的主从架构,dm-master负责元数据的管理和接受用户的dmctl发来的请求,dm-master自己实现了高可用主备切换,dm-worker 类似于canal伪装成mysql的slave,消费mysql的binlog,处理发送给tidb。一个mysql 只能对应一个dm worker,如果worker数量多于mysql实例的数量,worker将处于空闲状态。详细可以参考tidb的文档:https://docs.pingcap.com/zh/tidb/stable/dm-overview

首先安装并启动tidb
curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | shsource Users/xiazemin/.zshrctiup playground
CLUSTER START SUCCESSFULLY, Enjoy it ^-^To connect TiDB: mysql --comments --host 127.0.0.1 --port 4000 -u root -p (no password)To view the dashboard: http://127.0.0.1:2379/dashboardPD client endpoints: [127.0.0.1:2379]To view the Prometheus: http://127.0.0.1:9090To view the Grafana: http://127.0.0.1:3000
按照官方的文档安装dm,默认是安装在linux上的,mac我测试的过程中遇到了如下问题:
tiup install dm dmctlError: component dmctl doesn't support platform darwin/amd64
没有办法只能下载源码编译
git clone https://github.com/pingcap/dmmake buildls ./cmddm-ctl dm-master dm-portal dm-syncer dm-worker
尝试按照官方方式通过模板部署
tiup dm template > topology.yaml
修改配置文件后遇到了ssh连不上的问题
tiup dm deploy dm-mac-master v6.3.0 topology.yaml -u roottiup is checking updates for component dm ...Starting component `dm`: /Users/xiazemin/.tiup/components/dm/v1.11.0/tiup-dm deploy dm-mac-master v6.3.0 topology.yaml -u root+ Detect CPU Arch Name- Detecting node 127.0.0.1 Arch info ... ErrorError: failed to fetch cpu-arch or kernel-name: executor.ssh.execute_failed: Failed to execute command over SSH for 'root@127.0.0.1:22' {ssh_stderr: , ssh_stdout: , ssh_command: export LANG=C; PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin uname -m}, cause: dial tcp 127.0.0.1:22: connect: connection refusedVerbose debug logs has been written to /Users/xiazemin/.tiup/logs/tiup-dm-debug-2022-10-29-15-19-53.log.
因为本机sshd服务没有起来,起sshd服务
% sudo launchctl load -w System/Library/LaunchDaemons/ssh.plist% sudo launchctl list | grep ssh- 0 com.openssh.sshd
尝试仍然失败。于是选择手动部署的方式。也是参考官方的文档
https://github.com/pingcap/docs-dm/blob/master/zh/quick-start-with-dm.md
启动master和worker
nohup dm/bin/dm-master --master-addr='127.0.0.1:8261' --log-file=/tmp/dm-master.log --name="master1" >> /tmp/dm-master.log 2>&1 &nohup dm/bin/dm-worker --worker-addr='127.0.0.1:8262' --log-file=/tmp/dm-worker.log --join='127.0.0.1:8261' --name="worker1" >> /tmp/dm-worker.log 2>&1 &
检查下是否启动成功
dm/bin/dmctl --master-addr=127.0.0.1:8261 list-member{"result": true,"msg": "","members": [{"leader": {"msg": "","name": "master1","addr": "127.0.0.1:8261"}},{"master": {"msg": "","masters": [{"name": "master1","memberID": "11007177379717700053","alive": true,"peerURLs": ["http://127.0.0.1:8291"],"clientURLs": ["http://127.0.0.1:8261"]}]}},{"worker": {"msg": "","workers": [{"name": "worker1","addr": "127.0.0.1:8262","stage": "free","source": ""}]}}]}
定义mysql数据源mysql-source-conf.yaml
# MySQL Configuration.source-id: "mysql-replica-01"from:host: "127.0.0.1"user: "root"password: ""port: 3306
定义同步任务testdm-task.yaml
---name: testdmtask-mode: alltarget-database:host: "127.0.0.1"port: 4000user: "root"password: "" # 如果密码不为空,则推荐使用经过 dmctl 加密的密文mysql-instances:- source-id: "mysql-replica-01"block-allow-list: "ba-rule1" # 黑白名单配置名称,如果 DM 版本早于 v2.0.0-beta.2 则使用 black-white-listfilter-rules: ["filter-rule-1"] # 过滤数据源特定操作的规则,可以配置多个过滤规则route-rules: ["route-rule-1"] # 数据源表迁移到目标 TiDB 表的路由规则,可以定义多个规则block-allow-list:ba-rule1:do-dbs: ["testdm"]do-tables:- db-name: "testdm"tbl-name: "t1"- db-name: "testdm"tbl-name: "t2"filters: # 定义过滤数据源特定操作的规则,可以定义多个规则filter-rule-1: # 规则名称schema-pattern: "test*" # 匹配数据源的库名,支持通配符 "*" 和 "?"table-pattern: "t*" # 匹配数据源的表名,支持通配符 "*" 和 "?"events: ["truncate table"] # 匹配上 schema-pattern 和 table-pattern 的库或者表的操作类型action: Ignore # 迁移(Doroutes:route-rule-1:schema-pattern: "testdm"table-pattern: "t*"target-schema: "test"target-table: "t"# route-rule-2:# schema-pattern: "testdm"# target-schema: "testdm"
其实这个配置文件和canal的配置文件很像,包括数据源的连接方式,实例列表,拦截、过滤、路由规则。我们可以把我们的两个分表通过路由规则合并到tidb中的一个表t中
schema-pattern: "testdm"table-pattern: "t*"target-schema: "test"target-table: "t"
写完配置文件,准备启动mysql
% brew services start mysql==> Successfully started `mysql` (label: homebrew.mxcl.mysql)xiazemin@xiazemindeMacBook-Pro ~ % mysql -urootWelcome to the MySQL monitor. Commands end with ; or \g.Your MySQL connection id is 8Server version: 8.0.28 HomebrewCopyright (c) 2000, 2022, Oracle and/or its affiliates.Oracle is a registered trademark of Oracle Corporation and/or itsaffiliates. Other names may be trademarks of their respectiveowners.Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.mysql>
创建源数据表
drop database if exists `testdm`;create database `testdm` DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;use `testdm`;create table t1 (id bigint, uid int, name varchar(80), info varchar(100), primary key (`id`), unique key(`uid`)) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;create table t2 (id bigint, uid int, name varchar(80), info varchar(100), primary key (`id`), unique key(`uid`)) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;insert into t1 (id, uid, name) values (1, 10001, 'Gabriel García Márquez'), (2, 10002, 'Cien años de soledad');insert into t2 (id, uid, name) values (3, 20001, 'José Arcadio Buendía'), (4, 20002, 'Úrsula Iguarán'), (5, 20003, 'José Arcadio');
创建dm的数据源
% ./dm/bin/dmctl --master-addr=127.0.0.1:8261 operate-source create mysql-source-conf.yaml{"result": true,"msg": "","sources": [{"result": true,"msg": "","source": "mysql-replica-01","worker": "worker1"}]}
建数据迁移任务
% ./dm/bin/dmctl --master-addr 127.0.0.1:8261 start-task testdm-task.yaml{"result": true,"msg": "","sources": [{"result": true,"msg": "","source": "mysql-replica-01","worker": "worker1"}]}
查看迁移任务状态
./dm/bin/dmctl --master-addr 127.0.0.1:8261 query-status{"result": true,"msg": "","tasks": [{"taskName": "testdm","taskStatus": "Error - Some error occurred in subtask. Please run `query-status testdm` to get more details.","sources": ["mysql-replica-01"]}]}
% ./dm/bin/dmctl --master-addr 127.0.0.1:8261 query-status testdm{"result": true,"msg": "","sources": [{"result": true,"msg": "","sourceStatus": {"source": "mysql-replica-01","worker": "worker1","result": null,"relayStatus": null},"subTaskStatus": [{"name": "testdm","stage": "Paused","unit": "Load","result": {"isCanceled": false,"errors": [{"ErrCode": 10006,"ErrClass": "database","ErrScope": "downstream","ErrLevel": "high","Message": "run db schema failed - dbfile ./dumped_data.testdm/testdm-schema-create.sql: execute statement failed: CREATE DATABASE `testdm` *!40100 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci */ *!80016 DEFAULT ENCRYPTION='N' */;","RawCause": "Error 1273: Unsupported collation when new collation is enabled: 'utf8mb4_0900_ai_ci'","Workaround": ""}],"detail": null},"unresolvedDDLLockID": "","load": {"finishedBytes": "0","totalBytes": "351","progress": "0.00 %","metaBinlog": "(mysql-bin.000010, 1893)","metaBinlogGTID": ""}}]}]}
发现报错了,原因是tidb默认支持的是mysql5.7,然而我本机的mysql是8.0版本,默认的字符排序方式是
utf8mb4_0900_ai_ci
tidb不支持,所以需要把库和表上的默认字符排序方式改下
CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
重启任务
% ./dm/bin/dmctl --master-addr 127.0.0.1:8261 stop-task testdm-task.yaml{"op": "Stop","result": true,"msg": "","sources": [{"result": true,"msg": "","source": "mysql-replica-01","worker": "worker1"}]}
% ./dm/bin/dmctl --master-addr 127.0.0.1:8261 start-task testdm-task.yaml{"result": true,"msg": "","sources": [{"result": true,"msg": "","source": "mysql-replica-01","worker": "worker1"}]}
中间还是遇到了一个错误
{"result": true,"msg": "","sources": [{"result": false,"msg": "[code=38032:class=dm-master:scope=internal:level=high], Message: some error occurs in dm-worker: ErrCode:10006 ErrClass:\"database\" ErrScope:\"not-set\" ErrLevel:\"high\" Message:\"startLocation: [position: (, 0), gtid-set: ], endLocation: [position: (mysql-bin.000001, 2610), gtid-set: ]: execute statement failed: INSERT INTO `test`.`t` (`id`,`uid`,`name`,`info`) VALUES (?,?,?,?) ON DUPLICATE KEY UPDATE `id`=VALUES(`id`),`uid`=VALUES(`uid`),`name`=VALUES(`name`),`info`=VALUES(`info`)\" RawCause:\"Error 1146: Table 'test.t' doesn't exist\" , Workaround: Please execute `query-status` to check status.","source": "mysql-replica-01","worker": "worker1"}]}
这是因为你改变了表名和库名,tidb在识别ddl的时候没有结合配置规则做对应的改变,导致在tidb上创建目标表失败,如果库名和表名一样就没有这个问题。
连接tidb
% mysql --comments --host 127.0.0.1 --port 4000 -u root -pEnter password:Welcome to the MySQL monitor. Commands end with ; or \g.Your MySQL connection id is 407Server version: 5.7.25-TiDB-v6.3.0 TiDB Server (Apache License 2.0) Community Edition, MySQL 5.7 compatibleCopyright (c) 2000, 2022, Oracle and/or its affiliates.Oracle is a registered trademark of Oracle Corporation and/or itsaffiliates. Other names may be trademarks of their respectiveowners.Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.mysql>
drop database if exists `test`;create database `test` DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;use test;create table t (id bigint, uid int, name varchar(80), info varchar(100), primary key (`id`), unique key(`uid`)) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
然后再重启下我们的任务,就成功了。检查下发明目标表里有我们两个表的数据
select * from t;+----+--------+--------------------------+------+| id | uid | name | info |+----+--------+--------------------------+------+| 1 | 10001 | Gabriel García Márquez | NULL || 2 | 10002 | Cien años de soledad | NULL || 3 | 20001 | José Arcadio Buendía | NULL || 4 | 20002 | Úrsula Iguarán | NULL || 5 | 20003 | José Arcadio | NULL || 14 | 200012 | José Arcadio Buendía | NULL |+----+--------+--------------------------+------+6 rows in set (0.00 sec)
监控的话可以看下tidb自带的监控,里面有binlog延迟的监控
http://127.0.0.1:3000/d/RDdDTFvZz/test-cluster-binlog?orgId=1&refresh=10s






