暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Oceanbase学习之一迁移mysql数据到oceanbase

forever 2023-12-21
241

一、数据库环境

#mysql环境

root@192.168.150.162 20:28:  [(none)]> select version();

±----------+

| version() |

±----------+

| 8.0.26    |

±----------+

1 row in set (0.00 sec)

root@192.168.150.162 20:28:  [(none)]> show variables like ‘%char%’;

±-------------------------±----------------------------------+

| Variable_name            | Value                             |

±-------------------------±----------------------------------+

| character_set_client     | utf8mb4                           |

| character_set_connection | utf8mb4                           |

| character_set_database   | utf8mb4                           |

| character_set_filesystem | binary                            |

| character_set_results    | utf8mb4                           |

| character_set_server     | utf8mb4                           |

| character_set_system     | utf8mb3                           |

| character_sets_dir       | /usr/local/mysql8/share/charsets/ |

±-------------------------±----------------------------------+

8 rows in set (0.00 sec)

#当前mysql环境下的表

#oceanbase环境

obclient [test]> select version();

±-----------------------------+

| version()                    |

±-----------------------------+

| 5.7.25-OceanBase_CE-v4.2.1.2 |

±-----------------------------+

1 row in set (0.002 sec)

obclient [test]> show variables like ‘%chara%’;

±-------------------------±--------+

| Variable_name            | Value   |

±-------------------------±--------+

| character_set_client     | utf8mb4 |

| character_set_connection | utf8mb4 |

| character_set_database   | utf8mb4 |

| character_set_filesystem | binary  |

| character_set_results    | utf8mb4 |

| character_set_server     | utf8mb4 |

| character_set_system     | utf8mb4 |

±-------------------------±--------+

7 rows in set (0.005 sec)

确认mysql与oceanbase的字符集一样

二、mysqldump迁移数据到OceanBase

通过MySQL下的mysqldump将数据导出为SQL文本格式,将数据备份文件传输到OceanBase数据库主机后,通过source命令导入到OceanBase数据库。

#当前mysql下的表

MySQL [(none)]> use test;

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

Database changed

MySQL [test]> show tables;

±------------------+

| Tables_in_test    |

±------------------+

| cluster_test      |

| cluster_test1     |

| cluster_test2     |

| cluster_test3     |

| cluster_test4     |

| t1                |

| t2                |

| t8                |

| t_smallint        |

| test_clustered    |

| test_nonclustered |

±------------------+

11 rows in set (0.00 sec)

# 通过mysqldump导出数据

mysqldump -h  192.168.150.162 -uroot -P4000 -p --database test  > test_oceanbase.sql

#传输脚本到oceanbase服务器

scp test_oceanbase.sql 192.168.150.116:/home/admin

#oceanbase导入

obclient [test]> source test_oceanbase.sql

obclient [test]> show tables;

±------------------+

| Tables_in_test    |

±------------------+

| cluster_test      |

| cluster_test1     |

| cluster_test2     |

| cluster_test3     |

| cluster_test4     |

| t1                |

| t2                |

| t8                |

| t_smallint        |

| test_clustered    |

| test_nonclustered |

±------------------+

11 rows in set (0.004 sec)

#抽查表和数据已经导入

obclient [test]> select * from big

-> ;

±---------------------±--------------------+

| id                   | id1                 |

±---------------------±--------------------+

| 18446744073709551615 | 9223372036854775807 |

±---------------------±--------------------+

1 row in set (0.003 sec)

三、通过datax从MySQL离线导入数据到OceanBase

#datax部署安装

datax 下载地址:https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202210/datax.tar.gz

1、直接服务器上下载datax

wget https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202210/datax.tar.gz

2、解压datax

[admin@localhost ~]$ tar zxvf datax.tar.gz

3、安装java

yum install java

4、测试datax是否安装成功

[admin@localhost bin]$ python datax.py …/job/job.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !

Copyright © 2010-2017, Alibaba Group. All Rights Reserved.

2023-12-21 23:22:50.245 [main] INFO  MessageSource - JVM TimeZone: GMT+08:00, Locale: zh_CN

2023-12-21 23:22:50.248 [main] INFO  MessageSource - use Locale: zh_CN timeZone: sun.util.calendar.ZoneInfo[id=“GMT+08:00”,offset=28800000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]

2023-12-21 23:22:50.307 [main] INFO  VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl

2023-12-21 23:22:50.312 [main] INFO  Engine - the machine info  =>

osInfo: Red Hat, Inc. 1.8 25.392-b08

jvmInfo:        Linux amd64 3.10.0-1160.el7.x86_64

cpu num:        8

totalPhysicalMemory:    -0.00G

freePhysicalMemory:     -0.00G

maxFileDescriptorCount: -1

currentOpenFileDescriptorCount: -1

GC Names        [PS MarkSweep, PS Scavenge]

MEMORY_NAME                    | allocation_size                | init_size

PS Eden Space                  | 256.00MB                       | 256.00MB

Code Cache                     | 240.00MB                       | 2.44MB

Compressed Class Space         | 1,024.00MB                     | 0.00MB

PS Survivor Space              | 42.50MB                        | 42.50MB

PS Old Gen                     | 683.00MB                       | 683.00MB

Metaspace                      | -0.00MB                        | 0.00MB

2023-12-21 23:22:50.329 [main] INFO  Engine -

{

“content”:[

{

“reader”:{

“name”:“streamreader”,

“parameter”:{

“column”:[

{

“type”:“string”,

“value”:“DataX”

},

{

“type”:“long”,

“value”:19890604

},

{

“type”:“date”,

“value”:“1989-06-04 00:00:00”

},

{

“type”:“bool”,

“value”:true

},

{

“type”:“bytes”,

“value”:“test”

}

],

“sliceRecordCount”:100000

}

},

“writer”:{

“name”:“streamwriter”,

“parameter”:{

“encoding”:“UTF-8”,

“print”:false

}

}

}

],

“setting”:{

“errorLimit”:{

“percentage”:0.02,

“record”:0

},

“speed”:{

“channel”:1

}

}

}

2023-12-21 23:22:50.348 [main] WARN  Engine - prioriy set to 0, because NumberFormatException, the value is: null

2023-12-21 23:22:50.361 [main] INFO  PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0

2023-12-21 23:22:50.361 [main] INFO  JobContainer - DataX jobContainer starts job.

2023-12-21 23:22:50.365 [main] INFO  JobContainer - Set jobId = 0

2023-12-21 23:22:50.402 [job-0] INFO  JobContainer - jobContainer starts to do prepare …

2023-12-21 23:22:50.402 [job-0] INFO  JobContainer - DataX Reader.Job [streamreader] do prepare work .

2023-12-21 23:22:50.402 [job-0] INFO  JobContainer - DataX Writer.Job [streamwriter] do prepare work .

2023-12-21 23:22:50.402 [job-0] INFO  JobContainer - jobContainer starts to do split …

2023-12-21 23:22:50.403 [job-0] INFO  JobContainer - Job set Channel-Number to 1 channels.

2023-12-21 23:22:50.403 [job-0] INFO  JobContainer - DataX Reader.Job [streamreader] splits to [1] tasks.

2023-12-21 23:22:50.403 [job-0] INFO  JobContainer - DataX Writer.Job [streamwriter] splits to [1] tasks.

2023-12-21 23:22:50.421 [job-0] INFO  JobContainer - jobContainer starts to do schedule …

2023-12-21 23:22:50.426 [job-0] INFO  JobContainer - Scheduler starts [1] taskGroups.

2023-12-21 23:22:50.429 [job-0] INFO  JobContainer - Running by standalone Mode.

2023-12-21 23:22:50.438 [taskGroup-0] INFO  TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.

2023-12-21 23:22:50.442 [taskGroup-0] INFO  Channel - Channel set byte_speed_limit to -1, No bps activated.

2023-12-21 23:22:50.442 [taskGroup-0] INFO  Channel - Channel set record_speed_limit to -1, No tps activated.

2023-12-21 23:22:50.468 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started

2023-12-21 23:22:50.789 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[335]ms

2023-12-21 23:22:50.790 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] completed it’s tasks.

2023-12-21 23:23:00.452 [job-0] INFO  StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.046s |  All Task WaitReaderTime 0.056s | Percentage 100.00%

2023-12-21 23:23:00.453 [job-0] INFO  AbstractScheduler - Scheduler accomplished all tasks.

2023-12-21 23:23:00.453 [job-0] INFO  JobContainer - DataX Writer.Job [streamwriter] do post work.

2023-12-21 23:23:00.453 [job-0] INFO  JobContainer - DataX Reader.Job [streamreader] do post work.

2023-12-21 23:23:00.453 [job-0] INFO  JobContainer - DataX jobId [0] completed successfully.

2023-12-21 23:23:00.454 [job-0] INFO  HookInvoker - No hook invoked, because base dir not exists or is a file: /home/admin/datax/hook

2023-12-21 23:23:00.455 [job-0] INFO  JobContainer -

[total cpu info] =>

averageCpu                     | maxDeltaCpu                    | minDeltaCpu

-1.00%                         | -1.00%                         | -1.00%

[total gc info] =>

NAME                 | totalGCCount       | maxDeltaGCCount    | minDeltaGCCount    | totalGCTime        | maxDeltaGCTime     | minDeltaGCTime

PS MarkSweep         | 0                  | 0                  | 0                  | 0.000s             | 0.000s             | 0.000s

PS Scavenge          | 0                  | 0                  | 0                  | 0.000s             | 0.000s             | 0.000s

2023-12-21 23:23:00.455 [job-0] INFO  JobContainer - PerfTrace not enable!

2023-12-21 23:23:00.456 [job-0] INFO  StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.046s |  All Task WaitReaderTime 0.056s | Percentage 100.00%

2023-12-21 23:23:00.457 [job-0] INFO  JobContainer -

任务启动时刻                    : 2023-12-21 23:22:50

任务结束时刻                    : 2023-12-21 23:23:00

任务总计耗时                    :                 10s

任务平均流量                    :          253.91KB/s

记录写入速度                    :          10000rec/s

读出记录总数                    :              100000

读写失败总数                    :                   0

5创建datax-job的json

{

"job": {

"entry": {

"jvm": "-Xms1024m -Xmx1024m"

},

"setting": {

"speed": {

"channel": 4

},

"errorLimit": {

"record": 0,

"percentage": 0.1

}

},

"content": [{

"reader": {

"name": “mysqlreader”,

"parameter": {

"username": “root”,

"password": “oracle123”,

"column": [

"*"

],

"connection": [{

"table": [

"Tab_A"

],

"jdbcUrl": [“jdbc:mysql://192.168.150.162:4000/test?useUnicode=true&characterEncoding=utf8&useSSL=false”]

}]

}

},

"writer": {

"name": “oceanbasev10writer”,

"parameter": {

"obWriteMode": “insert”,

"column": [

"*"

],

"preSql": [

"truncate table Tab_A"

],

"connection": [{

"jdbcUrl": “||_dsc_ob10_dsc_||obdemo:obmysql||_dsc_ob10_dsc_||jdbc:oceanbase://192.168.150.116:2883/test?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true”,

"table": [

"Tab_A"

]

}],

"username": “root”,

"password": “oracle123”,

"writerThreadCount": 10,

"batchSize": 1000,

"memstoreThreshold": "0.9"

}

}

}]

}

}

6、执行离线数据同步

源端数据:

MySQL [test]> select * from Tab_A;

±—±-----±-----±-----±-----±-----±-------+

| id | bid  | cid  | name | type | num  | amt    |

±—±-----±-----±-----±-----±-----±-------+

|  1 |    1 |    1 | A01  | 01   |  111 | 111.00 |

|  2 |    2 |    2 | A01  | 01   |  112 | 111.00 |

|  3 |    3 |    3 | A02  | 02   |  113 | 111.00 |

|  4 |    4 |    4 | A02  | 02   |  112 | 111.00 |

|  5 |    5 |    5 | A01  | 01   |  111 | 111.00 |

|  6 |    6 |    6 | A02  | 02   |  113 | 111.00 |

|  7 |    5 |    7 | A01  | 01   |  111 |  88.00 |

|  8 |    6 |    8 | A02  | 02   |  113 |  88.00 |

±—±-----±-----±-----±-----±-----±-------+

8 rows in set (0.26 sec)

目标数据:

obclient [test]> select * from Tab_A;

Empty set (0.133 sec)

执行同步:

python ./datax.py …/job/mysql2ob.json

2023-12-22 00:42:13.745 [job-0] INFO  JobContainer - PerfTrace not enable!

2023-12-22 00:42:13.745 [job-0] INFO  StandAloneJobContainerCommunicator - Total 8 records, 134 bytes | Speed 13B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.020s |  All Task WaitReaderTime 0.000s | Percentage 100.00%

2023-12-22 00:42:13.747 [job-0] INFO  JobContainer -

任务启动时刻                    : 2023-12-22 00:41:54

任务结束时刻                    : 2023-12-22 00:42:13

任务总计耗时                    :                 19s

任务平均流量                    :               13B/s

记录写入速度                    :              0rec/s

读出记录总数                    :                   8

读写失败总数                    :                   0

7、检查数据:

obclient [test]>  select * from Tab_A;

±—±-----±-----±-----±-----±-----±-------+

| id | bid  | cid  | name | type | num  | amt    |

±—±-----±-----±-----±-----±-----±-------+

|  1 |    1 |    1 | A01  | 01   |  111 | 111.00 |

|  2 |    2 |    2 | A01  | 01   |  112 | 111.00 |

|  3 |    3 |    3 | A02  | 02   |  113 | 111.00 |

|  4 |    4 |    4 | A02  | 02   |  112 | 111.00 |

|  5 |    5 |    5 | A01  | 01   |  111 | 111.00 |

|  6 |    6 |    6 | A02  | 02   |  113 | 111.00 |

|  7 |    5 |    7 | A01  | 01   |  111 |  88.00 |

|  8 |    6 |    8 | A02  | 02   |  113 |  88.00 |

±—±-----±-----±-----±-----±-----±-------+

8 rows in set (0.002 sec)

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论