暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

oceanbase 实践练习四:迁移 MySQL 数据到 OceanBase 集群

原创 null 2023-09-02
508

1.mysqldump 

#使用 mysqldump 将 mysql的表结构和数据同步到 OceanBase 的MySQL 租户中

  • 源库MySQL

#1.创建测试数据[root@te ~]# mysql -uroot -p111111 --socket=/data/mysql_mha/mysql.sock                 #我这里是percona的数据库
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.32-35-47-log Percona XtraDB Cluster binary (GPL) 5.7.32-rel35-47, Revision 2055835, wsrep_31.47
Copyright (c) 2009-2020 Percona LLC and/or its affiliates
Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

root@db 12:00:  [(none)]>show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
4 rows in set (0.00 sec)

root@db 12:01:  [(none)]>create database test;    #创建一个测试库
Query OK, 1 row affected (0.00 sec)

root@db 12:02:  [(none)]>use test;
Database changed
root@db 12:02:  [test]>create table t1 (id int primary key ,name varchar(20));    #创建一个表
Query OK, 0 rows affected (0.02 sec)

root@db 12:04:  [test]>insert into t1 values(1,'张三'),(2,'李四'),(3,'皮卡丘');   #插入几条数据
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

root@db 12:04:  [test]>select * from t1;
+----+-----------+
| id | name      |
+----+-----------+
|  1 | 张三      |
|  2 | 李四      |
|  3 | 皮卡丘    |
+----+-----------+
3 rows in set (0.00 sec)
#2.使用mysqldump导出表
 mysqldump -uroot -p111111 --socket=/data/mysql_mha/mysql.sock --set-gtid-purged=OFF test t1  > /tmp/test_t1.sql    #导出test库的t1表

  • 目标库oceanbase

#1.拷贝数据因为我这是两台机器,所以拷贝数据到ob的机器的/tmp目录
[root@obcp ~]# ll /tmp/test_t1.sql 
-rw-r--r-- 1 admin admin 2903 9月   2 12:14 /tmp/test_t1.sql

#2.登陆oceanbase指定租户、指定数据库额,我这把库关了要重新启动(记得要在oceanbase目录启动[admin@obcp oceanbase]$ cd ~ [admin@obcp oceanbase]$ ./bin/observer ./bin/observer [admin@obcp ~]$ ps -ef|grep obs #启动成功了 admin 20824 1 99 12:18 ? 00:02:31 ./bin/observer
[admin@obcp ~]$ obclient -h127.0.0.1 -uroot@obmysql -P2881 -A -c my_ob -p Enter password: Welcome to the OceanBase. Commands end with ; or \g. Your OceanBase connection id is 3221487619 Server version: OceanBase_CE 4.0.0.0 (r100000272022110114-6af7f9ae79cd0ecbafd4b1b88e2886ccdba0c3be) (Built Nov 1 2022 14:57:18) Copyright (c) 2000, 2018, OceanBase and/or its affiliates. All rights reserved. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. root@db 12:22: [my_ob]>
#导入sql语句root@db 12:22: [my_ob]>source /tmp/test_t1.sql Query OK, 0 rows affected (0.004 sec) ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your OceanBase version for the right syntax to use near '--' at line 1 1 row in set (0.002 sec) Query OK, 0 rows affected (0.001 sec) Query OK, 0 rows affected (0.001 sec) Query OK, 0 rows affected (0.001 sec) Query OK, 0 rows affected (0.001 sec) Query OK, 0 rows affected (0.001 sec) ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your OceanBase version for the right syntax to use near '--' at line 1 Query OK, 0 rows affected (0.000 sec) ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your OceanBase version for the right syntax to use near '--' at line 1 Query OK, 0 rows affected, 1 warning (0.047 sec) Query OK, 0 rows affected (0.001 sec) ....##中间是有些报错的,其实是没有影响的(如果出现其他的内容需要确认),检查数据文件后发现是一些注释,ob没有识别##类似如下这种-- Host: localhost Database: test -- ------------------------------------------------------ -- Server version 5.7.32-35-47-log -- -- Table structure for table `t1` -- #3.检查表数据root@db 12:23: [my_ob]>select * from t1; +----+-----------+ | id | name | +----+-----------+ | 1 | 张三 | | 2 | 李四 | | 3 | 皮卡丘 | +----+-----------+ 3 rows in set (0.010 sec)ok,表数据都是存在的,下面对比下两张表的建表语句
#4.建表语句

第一张图片是mysql的建表语句,第二张图片是oceanbase的建表语句;此时可以看出,oceanabse已经将其转化为自己语法的表了

综上所看,使用mysqldump导出导入数据成功!


2.DataX

datax下载地址:https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202308/datax.tar.gz

#总计1.5G大小,是github上的地址

下面是一个开源的web界面工具,官方视频里面的,喜欢的可以试试。


2.1.上传DataX

[admin@obcp ~]$ cd ~; tar -xvf  datax.tar.gz
[admin@obcp ~]$ ll
总用量 1612164
drwxrwxr-x  9 admin admin         90 9月   2 13:57 datax
-rw-r--r--  1 admin admin 1650855042 9月   2 13:56 datax.tar.gz
drwxr-xr-x  8 admin admin         97 9月   1 15:45 obproxy-4.0.0
drwxr-xr-x 10 admin admin        100 9月   1 12:44 oceanbase

这块就不写了,自行安排即可

2.2.构造数据

[root@te ~]# mysql -uroot -p111111 -h192.168.61.144 -A -c test
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
root@db 14:21:  [test]>create table t2(            #创建迁移表
    -> id  int primary key,
    -> name varchar(20),
    -> sex varchar(2),
    -> age int,
    -> birth_date date,
    -> id_code varchar(18)) ;
Query OK, 0 rows affected (0.01 sec)
root@db 14:25:  [test]>insert into t2 values (1,'张三','男',18,'2000-01-01',131011200001010286);
Query OK, 1 row affected (0.00 sec)

root@db 14:25:  [test]>insert into t2 values (2,'张三丰','男',99,'1924-09-09',131011192409090287);
Query OK, 1 row affected (0.01 sec)

root@db 14:27:  [test]>insert into t2 values (3,'张三疯了','男',88,'1934-06-24',131011193409090288);
Query OK, 1 row affected (0.00 sec)

root@db 14:27:  [test]>insert into t2 values (4,'张三疯了吧','男',55,'1964-06-24',131011196409090289);
Query OK, 1 row affected (0.02 sec)

root@db 14:27:  [test]>select * from t2;
+----+-----------------+------+------+------------+--------------------+
| id | name            | sex  | age  | birth_date | id_code            |
+----+-----------------+------+------+------------+--------------------+
|  1 | 张三            | 男   |   18 | 2000-01-01 | 131011200001010286 |
|  2 | 张三丰          | 男   |   99 | 1924-09-09 | 131011192409090287 |
|  3 | 张三疯了        | 男   |   88 | 1934-06-24 | 131011193409090288 |
|  4 | 张三疯了吧      | 男   |   55 | 1964-06-24 | 131011196409090289 |
+----+-----------------+------+------+------------+--------------------+
4 rows in set (0.00 sec)


2.3.修改配置文件

[admin@obcp ~]$ cd datax/job/
[admin@obcp job]$ ll
总用量 4
-rw-r--r-- 1 admin admin 1583 8月  21 17:07 job.json     #只有一个job
[admin@obcp job]$ cp job.json  mysql_oc.json             #复制一份
[admin@obcp job]$ vim mysql_oc.json                      #接下来重头戏,修改配置文件
 
{
    "job":{
        "setting":{
            "speed":{
                "channel":4    ####4个并行度
            },
            "errorLimit":{     ####控制脏数据到多少条,或者说百分比到多少,同步就停止了(脏数据:不符合约束的数据)
                "record":0,
                "percentage":0.02      
            }
        },
        "content":[
            {
                "reader":{
                    "name":"mysqlreader",   ####reader插件名字
                    "parameter":{
                        "username":"root",  ####源库用户、密码
                        "password":"111111",
                        "column":[          ####源库要同步的列,若源库目标库表结构一样,列都一致,则直接写*即可
                            "*"
                        ],
                        "connection":[
                            {
                                "table":[   ####要同步的表
                                    "t2"
                                ],
                                "jdbcUrl":[ ####jdbc连接
                                    "jdbc:mysql://192.168.61.144:3306/test?useUnicode=true&characterEncoding=utf8"
                                ]
                            }
                        ]
                    }
                },
                "writer":{
                    "name":"oceanbasev10writer",  ####writer插件名字
                    "parameter":{
                        "obWriteMode":"insert",   ####数据导入的方式,有三种insert、replace、update(参考MySQL语法)
                        "column":[         ####同上
                            "*"
                        ],
                        "preSql":[
                            "truncate table t2"     ##预先执行的sql,可不写
                        ],
                        "connection":[
                            {
                                "jdbcUrl":"||_dsc_ob10_dsc_||obdemo:obmysql||_dsc_ob10_dsc_||jdbc:oceanbase://127.0.0.1:2883/my_ob?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&reweiteBatchedStatements=true",
                                "table":[             ####jdbc+导入的表
                                    "t2"
                                ]
                            }
                        ],
                        "username":"root",
                        "password":"rootPWD123",
                        "writerThreadCount":10,  ####10个并发
                        "batchSize":1000,      ####1000行提交一次
                        "memstoreThreshold":"0.9"
                    }
                }
            }
        ]
    }
}
上述配置文件中的注释在使用时要去掉、


2.4.jdk及java_home

算了,还是写上吧,方便后期再看
jdk下载地址:https://www.oracle.com/java/technologies/downloads/#java8
官方建议是1.8以上
觉得慢的可以换这个地址:https://repo.huaweicloud.com/java/jdk/ [admin@obcp ~]# tar -zxvf jdk-8u181-linux-x64.tar.gz [admin@obcp ~]# ll 总用量 181296 drwxr-xr-x 7 10 143 245 7月 7 2018 jdk1.8.0_181 -rw-r--r-- 1 root root 185646832 9月 2 15:54 jdk-8u181-linux-x64.tar.gz drwxr-xr-x 2 root root 257 9月 1 10:48 rpm [root@obcp ~]# vim /etc/profile export JAVA_HOME=/home/admin/jdk1.8.0_181 export PATH=$JAVA_HOME/bin:$PATH export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar [root@obcp ~]# source /etc/profile [root@obcp ~]# java -version java version "1.8.0_181" Java(TM) SE Runtime Environment (build 1.8.0_181-b13) Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode) ##这样就可以了


2.5.迁移源表结构到oceanbase

mysqldump -uroot -p111111 -h192.168.61.144 --set-gtid-purged=OFF test t2 -d  > /tmp/test_obj.sql    #导出test库的t2表结构
传输到oceanbase服务器执行
root@db 16:13:  [my_ob]>source /tmp/test_obj.sql
Query OK, 0 rows affected (0.006 sec)

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your OceanBase version for the right syntax to use near '--' at line 1
Query OK, 0 rows affected (0.002 sec)
root@db 16:15:  [my_ob]>show create table t2;
+-------+-------------------------------------------------------------------------------------------------
| Table | Create Table                                                                                                                                                                                                                                                                                                                                                                                                            |
+-------+--------------------------------------------------------------------------------------------------
  `name` varchar(20) DEFAULT NULL,
  `sex` varchar(2) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  `birth_date` date DEFAULT NULL,
  `id_code` varchar(18) DEFAULT NULL,
  PRIMARY KEY (`id`)
) DEFAULT CHARSET = utf8mb4 ROW_FORMAT = DYNAMIC COMPRESSION = 'zstd_1.3.8' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------


2.6.启动datax

[admin@obcp datax]$ cd bin/
[admin@obcp bin]$ ll
总用量 40
-rw-r--r-- 1 admin admin  9339 8月  21 17:07 datax.py
-rw-r--r-- 1 admin admin  6906 8月  21 17:07 dxprof.py
-rw-r--r-- 1 admin admin 17024 8月  21 17:07 perftrace.py
[admin@obcp bin]$ chmod +x *    #加执行权限
###下面要注意,装jdk的路径要保证admin用户可以有权访问,否则会报错

[admin@obcp datax]$ bin/datax.py job/mysql_oc.json 

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
经DataX智能分析,该任务最可能的错误原因是:
com.alibaba.datax.common.exception.DataXException: Code:[DBUtilErrorCode-03], Description:[您缺失了必须填写的参数值.].  - 您提供配置文件有误,[jdbcUrl]是必填参数,不允许为空或者留白 .

######报错了#####
#仔细检查了下jdbcUrl,,写成全小写了
################

改完再次运行,又报错了,oceanbase的机器没有mysql环境,重新配置了个mysql
(用2881登陆obdemo集群报错了,改到2883端口了)
#终于不报错了
2023-09-02 17:27:14.952 [0-0-0-writer] INFO  ConcurrentTableWriterTask - start 2 insert task.
2023-09-02 17:27:14.980 [0-0-0-writer] INFO  ConcurrentTableWriterTask - start 3 insert task.
2023-09-02 17:27:15.027 [0-0-0-writer] INFO  ConcurrentTableWriterTask - start 4 insert task.
2023-09-02 17:27:15.104 [0-0-0-writer] INFO  ConcurrentTableWriterTask - start 5 insert task.
2023-09-02 17:27:15.177 [0-0-0-writer] INFO  ConcurrentTableWriterTask - start 6 insert task.
2023-09-02 17:27:15.212 [0-0-0-writer] INFO  ConcurrentTableWriterTask - start 7 insert task.
2023-09-02 17:27:15.235 [0-0-0-writer] INFO  ConcurrentTableWriterTask - start 8 insert task.
2023-09-02 17:27:15.268 [0-0-0-writer] INFO  ConcurrentTableWriterTask - start 9 insert task.
2023-09-02 17:27:15.317 [0-0-0-writer] INFO  ConcurrentTableWriterTask - start 10 insert task.
2023-09-02 17:27:15.441 [0-0-0-writer] INFO  DBUtil - this is ob1_0 jdbc url.
2023-09-02 17:27:15.442 [0-0-0-writer] INFO  DBUtil - this is ob1_0 jdbc url. user=obdemo:sys:root :url=jdbc:oceanbase://127.0.0.1:2883/my_ob?useLocalSessionState=true&allowBatch=true&allowMultiQuerie
s=true&reweiteBatchedStatements=true&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true
2023-09-02 17:27:15.506 [0-0-0-writer] INFO  ColumnMetaCache - fetch columnMeta of table t2 success
2023-09-02 17:27:15.599 [0-0-0-writer] INFO  CommonRdbmsWriter$Task - isMemstoreFull=false
2023-09-02 17:27:15.599 [0-0-0-writer] INFO  ConcurrentTableWriterTask - ConcurrentTableWriter has put all task in queue, queueSize = 0,  total = 1, finished = 1
2023-09-02 17:27:15.679 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[920]ms
2023-09-02 17:27:15.680 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-09-02 17:27:24.722 [job-0] INFO  StandAloneJobContainerCommunicator - Total 4 records, 134 bytes | Speed 13B/s, 0 records/s | Error 0 records, 0 bytes |  
All Task WaitWriterTime 0.001s |  All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-09-02 17:27:24.722 [job-0] INFO  AbstractScheduler - Scheduler accomplished all tasks.
2023-09-02 17:27:24.794 [job-0] INFO  JobContainer - DataX Writer.Job [oceanbasev10writer] do post work.
2023-09-02 17:27:24.795 [job-0] INFO  JobContainer - DataX Reader.Job [mysqlreader] do post work.
2023-09-02 17:27:24.795 [job-0] INFO  JobContainer - DataX jobId [0] completed successfully.
2023-09-02 17:27:24.796 [job-0] INFO  HookInvoker - No hook invoked, because base dir not exists or is a file: /home/admin/datax/hook
2023-09-02 17:27:24.799 [job-0] INFO  JobContainer - 
	 [total cpu info] => 
		averageCpu                     | maxDeltaCpu                    | minDeltaCpu                    
		-1.00%                         | -1.00%                         | -1.00%
                        

	 [total gc info] => 
		 NAME                 | totalGCCount       | maxDeltaGCCount    | minDeltaGCCount    | totalGCTime        | maxDeltaGCTime     | minDeltaGCTime     
		 PS MarkSweep         | 1                  | 1                  | 1                  | 0.263s             | 0.263s             | 0.263s             
		 PS Scavenge          | 1                  | 1                  | 1                  | 0.296s             | 0.296s             | 0.296s             

2023-09-02 17:27:24.799 [job-0] INFO  JobContainer - PerfTrace not enable!
2023-09-02 17:27:24.799 [job-0] INFO  StandAloneJobContainerCommunicator - Total 4 records, 134 bytes | Speed 13B/s, 0 records/s | Error 0 records, 0 bytes | 
 All Task WaitWriterTime 0.001s |  All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-09-02 17:27:24.801 [job-0] INFO  JobContainer - 
任务启动时刻                    : 2023-09-02 17:27:10
任务结束时刻                    : 2023-09-02 17:27:24
任务总计耗时                    :                 14s
任务平均流量                    :               13B/s
记录写入速度                    :              0rec/s
读出记录总数                    :                   4
读写失败总数                    :                   0

ok,看样子导入成功了,部分报错无影响哈

2.7.检查数据

root@db 17:26:  [(none)]>select * from my_ob.t2;
+----+-----------------+------+------+------------+--------------------+
| id | name            | sex  | age  | birth_date | id_code            |
+----+-----------------+------+------+------------+--------------------+
|  1 | 张三            | 男   |   18 | 2000-01-01 | 131011200001010286 |
|  2 | 张三丰          | 男   |   99 | 1924-09-09 | 131011192409090287 |
|  3 | 张三疯了        | 男   |   88 | 1934-06-24 | 131011193409090288 |
|  4 | 张三疯了吧      | 男   |   55 | 1964-06-24 | 131011196409090289 |
+----+-----------------+------+------+------------+--------------------+
4 rows in set (0.044 sec)

ok,使用datax进行简单的离线数据迁移完成了,后续的canal 和 导出csv等等后边再继续搞了,第一次操作这个,还是很多预想不到的问题,虽然不是大问题,但是一卡一卡的也难受呀,哈哈,搞一次出来就好了

电脑内存还是太小了,跑多了会卡,,先搞到这吧!

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论