概述
备份介质准备
OB 的备份恢复支持 NFS 目录和 OSS 目录。线下测试环境只有 NFS 技术可用。
(可选)挂载 NAS 盘
这步通过虚拟机软件提供的 NAS 存储用来做 NFS 的盘。NAS 通常支持 NFS 协议和 iSCSI 协议读写。由于我这里的 NAS 不知道为什么不支持 NFS 协议,所以我改用 iSCSI 协议连接 NAS 盘,并通过服务器 65 部署一个 NFS Server。iSCISI 协议不支持并发读写。
安装 iSCSI 相关软件。
sudo yum -y install iscsi-initiator-utilssudo systemctl start iscsidsudo systemctl status iscsid[root@server065 obproxy-master]# systemctl status iscsid? iscsid.service - Open-iSCSILoaded: loaded (/usr/lib/systemd/system/iscsid.service; disabled; vendor preset: disabled)Active: active (running) since Mon 2023-06-12 14:21:54 CST; 4s agoDocs: man:iscsid(8)man:iscsiuio(8)man:iscsiadm(8)Main PID: 28743 (iscsid)Status: "Ready to process requests"Tasks: 1Memory: 1.6MCGroup: /system.slice/iscsid.service└─28743 /sbin/iscsid -fJun 12 14:21:54 server065 systemd[1]: Starting Open-iSCSI...Jun 12 14:21:54 server065 systemd[1]: Started Open-iSCSI.
(可选)设置 CHAP 认证。
如果 NAS 盘设置了 CHAP 认证,那么客户端需要配置用户名和密码。
vi etc/iscsi/iscsid.confnode.session.auth.authmethod = CHAPnode.session.auth.username = usernamenode.session.auth.password = password
发现 iSCSI 卷。
iscsiadm -m discovery -t st -p 10.0.0.205:3260[root@server065 obproxy-master]# iscsiadm -m discovery -t st -p 10.0.0.205:326010.0.0.205:3260,1 iqn.2000-01.com.synology:DS918.default-target.392e48dbfa5[fe80::211:32ff:fec5:b34f]:3260,1 iqn.2000-01.com.synology:DS918.default-target.392e48dbfa5
挂载 iSCSI 卷。
iscsiadm -m node -T iqn.2000-01.com.synology:DS918.default-target.392e48dbfa5 -p 10.0.0.205:3260 -l[root@server065 obproxy-master]# iscsiadm -m node -T iqn.2000-01.com.synology:DS918.default-target.392e48dbfa5 -p 10.0.0.205 -lLogging in to [iface: default, target: iqn.2000-01.com.synology:DS918.default-target.392e48dbfa5, portal: 10.0.0.205,3260] (multiple)Login to [iface: default, target: iqn.2000-01.com.synology:DS918.default-target.392e48dbfa5, portal: 10.0.0.205,3260] successful.
注意: 由于iSCSI 协议限制,请勿将一个 iSCSI 卷挂载到多个 Linux 客户端上。
查看 iSCSI 卷。
[root@server065 obproxy-master]# ls -lrth dev/sd* |tail -n 1brw-rw----. 1 root disk 8, 32 Jun 12 14:26 dev/sdc[root@server065 obproxy-master]# lsblk dev/sdcNAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTsdc 8:32 0 200G 0 disk
后面的方法就跟本地盘一样了,先格式化,然后挂载文件系统。
mkfs.ext4 dev/sdcmkdir -p backupmount -t noatime,nodealloc,nodiratime,default dev/sdc backup
(可选)卸载 iSCSI 卷。
iscsiadm -m node -T iqn.2000-01.com.synology:DS918.default-target.392e48dbfa5 -p 10.0.0.205:3260 -u[root@server061 ~]# iscsiadm -m node -T iqn.2000-01.com.synology:DS918.default-target.392e48dbfa5 -p 10.0.0.205:3260 -uLogging out of session [sid: 1, target: iqn.2000-01.com.synology:DS918.default-target.392e48dbfa5, portal: 10.0.0.205,3260]Logout of [sid: 1, target: iqn.2000-01.com.synology:DS918.default-target.392e48dbfa5, portal: 10.0.0.205,3260] successful.
部署 NFS Server
NFS 服务器的 CPU 和内存不要太小,服务器要稳定。否则 NFS 服务器自身的性能问题可能会影响挂载该 NFS 目录的节点的稳定运行。
安装 NFS 软件
sudo yum -y install nfs-utils rpcbind
设置 NFS Exports 目录
sudo vim etc/exports/backup 10.0.0.65/16(rw,sync,all_squash)
设置 NFS Exports 目录的所有者为 nfsnobody 。
sudo chown nfsnobody:nfsnobody -R backup
配置 NFS 参数。
这一步照做。
sudo vim etc/sysconfig/nfsRPCNFSDARGS="-N 2 -N 3 -U"RPCNFSDCOUNT=16NFSD_V4_GRACE=90NFSD_V4_LEASE=90
重启 NFS 服务。
sudo systemctl restart nfs-configsudo systemctl restart nfs-server
设置 Slot Table
vim + /etc/sysctl.confsunrpc.tcp_max_slot_table_entries=128
报错处理:
[root@server065 ~]# sysctl -psysctl: cannot stat proc/sys/sunrpc/tcp_max_slot_table_entries: No such file or directory[root@server065 ~]# modprobe sunrpc[root@server065 ~]# lsmod|grep rpcsunrpc 366617 0[root@server065 ~]# sysctl -psunrpc.tcp_max_slot_table_entries = 128
部署 NFS 客户端
OB 集群所有节点都要求部署 NFS 客户端。
为了减轻工作量,这里使用 ansible 操作。
vim + etc/ansible/hosts[dbservers]10.0.0.6110.0.0.6210.0.0.63
后面使用 ansible 批量操作 OB 集群节点。
安装 NFS 软件
ansible dbservers -m shell -a "sudo yum -y install nfs-utils "
设置 Slot Table
这里使用 ansible 批量操作 OB 集群节点。
ansible dbservers -m shell -a "modprobe sunrpc"ansible dbservers -m shell -a "sysctl -w sunrpc.tcp_max_slot_table_entries=128"ansible dbservers -m shell -a "sudo sysctl -a 2>/dev/null |grep sunrpc.tcp_max_slot_table_entries "ansible dbservers -m shell -a "echo 'sunrpc.tcp_max_slot_table_entries=128'
>> etc/sysctl.conf "
查看 NFS 目录
确认一下各个 OB 节点能发现 NFS 目录。
[root@server065 ~]# ansible dbservers -m shell -a "showmount -e 10.0.0.65"10.0.0.62 | CHANGED | rc=0 >>Export list for 10.0.0.65:/backup 10.0.0.65/1610.0.0.63 | CHANGED | rc=0 >>Export list for 10.0.0.65:/backup 10.0.0.65/1610.0.0.61 | CHANGED | rc=0 >>Export list for 10.0.0.65:/backup 10.0.0.65/16[root@server065 ~]#
挂载 NFS 目录
各个 OB 节点挂载 NFS 目录到本地文件系统目录。注意挂载选项要求。
ansible dbservers -m shell -a "mkdir -p backup"ansible dbservers -m shell -a "sudo mount -tnfs4 -o rw,nfsvers=4.1,sync,lookupcache=positive,hard,timeo=600,wsize=1048576,rsize=1048576,namlen=255 10.0.0.65:/backup backup"ansible dbservers -m shell -a "mount |grep backup"
测试一下 NFS 目录读写。
NFS 协议支持多点写入。
date > /tmp/tmpfileansible dbservers -m copy -a "src=/tmp/tmpfile dest=/backup/"ansible dbservers -m shell -a "ls -lrth backup/ && cat backup/tmpfile"

多点读写测试通过。
测试 NFS 性能
fio -filename=/backup/fio-file.test -direct=1 -rw=randwrite -bs=2048K -size=100G -runtime=300 -group_reporting -name=mytest -ioengine=libaio -numjobs=1 -iodepth=64 -iodepth_batch=8 -iodepth_low=8 -iodepth_batch_complete=8
[root@server061 ~]# fio -filename=/backup/fio-file.test -direct=1 -rw=randwrite -bs=2048K -size=100G -runtime=300 -group_reporting -name=mytest -ioengine=libaio -numjobs=1 -iodepth=64 -iodepth_batch=8 -iodepth_low=8 -iodepth_batch_complete=8mytest: (g=0): rw=randwrite, bs=(R) 2048KiB-2048KiB, (W) 2048KiB-2048KiB, (T) 2048KiB-2048KiB, ioengine=libaio, iodepth=64fio-3.13Starting 1 processmytest: Laying out IO file (1 file 102400MiB)Jobs: 1 (f=1): [w(1)][100.0%][w=140MiB/s][w=70 IOPS][eta 00m:00s]mytest: (groupid=0, jobs=1): err= 0: pid=25490: Mon Jun 12 15:53:33 2023write: IOPS=71, BW=143MiB/s (150MB/s)(42.0GiB/300168msec)slat (usec): min=1273, max=84113, avg=4363.27, stdev=4352.55clat (msec): min=114, max=1445, avg=537.07, stdev=231.09lat (msec): min=117, max=1447, avg=541.43, stdev=231.36clat percentiles (msec):| 1.00th=[ 150], 5.00th=[ 180], 10.00th=[ 220], 20.00th=[ 326],| 30.00th=[ 384], 40.00th=[ 468], 50.00th=[ 535], 60.00th=[ 600],| 70.00th=[ 659], 80.00th=[ 726], 90.00th=[ 835], 95.00th=[ 927],| 99.00th=[ 1133], 99.50th=[ 1200], 99.90th=[ 1401], 99.95th=[ 1435],| 99.99th=[ 1452]bw ( KiB/s): min=32702, max=229376, per=99.68%, avg=146303.09, stdev=37605.19, samples=600iops : min= 15, max= 112, avg=71.10, stdev=18.42, samples=600lat (msec) : 250=12.37%, 500=32.01%, 750=38.43%, 1000=14.33%, 2000=2.87%cpu : usr=1.32%, sys=2.35%, ctx=8653, majf=0, minf=131171IO depths : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.1%, 16=28.6%, 32=57.1%, >=64=14.3%submit : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%complete : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%issued rwts: total=0,21512,0,0 short=0,0,0,0 dropped=0,0,0,0latency : target=0, window=0, percentile=100.00%, depth=64Run status group 0 (all jobs):WRITE: bw=143MiB/s (150MB/s), 143MiB/s-143MiB/s (150MB/s-150MB/s), io=42.
0GiB (45.1GB), run=300168-300168msec
tsar --cpu --traffic --load --io -I sdc -s user,sys,wait,util,bytin,bytout,rs,ws,rsecs,wsecs,rawait,wawait,rarqsz,warqsz,svctm,load1,load5,load15 -l -i 3

[root@server065 ~]# ethtool ens192 |grep -i speed
Speed: 1000Mb/s
OB 集群手动备份
配置 OB 备份
配置 OB 备份目录。
ALTER SYSTEM SET backup_dest='file:///backup/obbackup';show parameters like 'backup_dest';(root@10.0.0.61:2883) [oceanbase]> ALTER SYSTEM SET backup_dest='file:///backup/obbackup';Query OK, 0 rows affected (0.09 sec)(root@10.0.0.61:2883) [oceanbase]> show parameters like 'backup_dest';+-------+----------+-----------+----------+-------------+-----------+-------------------------+-------------+----------+---------+---------+-------------------+| zone | svr_type | svr_ip | svr_port | name | data_type | value | info | section | scope | source | edit_level |+-------+----------+-----------+----------+-------------+-----------+-------------------------+-------------+----------+---------+---------+-------------------+| ZONE3 | observer | 10.0.0.63 | 2882 | backup_dest | NULL | file:///backup/obbackup | backup dest | OBSERVER | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE || ZONE2 | observer | 10.0.0.62 | 2882 | backup_dest | NULL | file:///backup/obbackup | backup dest | OBSERVER | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE || zone1 | observer | 10.0.0.61 | 2882 | backup_dest | NULL | file:///backup/obbackup | backup dest | OBSERVER | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |+-------+----------+-----------+----------+-------------+-----------+-------------------------+-------------+----------+---------+---------+-------------------+3 rows in set (0.05 sec)
配置 OB 备份参数。
ALTER SYSTEM SET backup_dest_option='log_archive_checkpoint_interval=2m&recovery_window=7d&auto_delete_obsolete_backup=true&log_archive_piece_switch_interval=1d&backup_copies=0';show parameters like 'backup_dest_option';
发起日志备份
配置备份模式。
Optional
模式表示以用户业务优先。在该模式下,当备份(日志归档)来不及的情况下,日志可能来不及备份就回收了,可能会发生备份断流。Mandatory
模式表示以备份优先。在该模式下如果备份跟不上用户数据的写入,可能会导致用户无法写入。
ALTER SYSTEM SET backup_log_archive_option = 'optional compression= enable';
设置后可以动态修改压缩算法。
ALTER SYSTEM SET backup_log_archive_option='optional compression= zstd_1.3.8';
ALTER SYSTEM ARCHIVELOG;
SELECT INCARNATION , LOG_ARCHIVE_ROUND , TENANT_ID , STATUS , START_PIECE_ID , BACKUP_PIECE_ID , MIN_FIRST_TIMEFROM CDB_OB_BACKUP_ARCHIVELOG;
SELECT INCARNATION , TENANT_ID ,ROUND_ID ,BACKUP_PIECE_ID ,COPY_ID ,CREATE_DATE ,START_TS ,CHECKPOINT_TS ,STATUS ,FILE_STATUS ,START_PIECE_IDFROM CDB_OB_BACKUP_PIECE_FILES;
发起数据备份
(可选)设置备份密码。
SET ENCRYPTION ON IDENTIFIED BY 'aaAA11__' ONLY;
发起全量备份。
ALTER SYSTEM BACKUP DATABASE;
SELECT INCARNATION , BS_KEY , BACKUP_TYPE , TENANT_ID , PARTITION_COUNT , MACRO_BLOCK_COUNT ,FINISH_PARTITION_COUNT ,FINISH_MACRO_BLOCK_COUNT , INPUT_BYTES ,OUTPUT_BYTES , START_TIME , COMPLETION_TIME ,statusFROM oceanbase.CDB_OB_BACKUP_PROGRESS;
SELECT BS_KEY, INCARNATION , TENANT_ID , COPY_ID , BACKUP_TYPE , ENCRYPTION_MODE ,STATUS ,FILE_STATUS ,START_TIME , COMPLETION_TIME , ELAPSED_SECONDES , COMPRESSED , OUTPUT_BYTES , COMPRESSION_RATIO , OUTPUT_RATE_BYTES_DISPLAYFROM oceanbase.CDB_OB_BACKUP_SET_FILESORDER BY BS_KEY DESC, TENANT_ID ;
发起增量备份。
ALTER SYSTEM BACKUP INCREMENTAL DATABASE;
校验备份数据
校验本集群所有备份集和对应的日志。
ALTER SYSTEM VALIDATE DATABASE;(root@10.0.0.61:2883) [oceanbase]> ALTER SYSTEM VALIDATE DATABASE;Query OK, 0 rows affected (0.04 sec)
SELECT job_id, tenant_id, TENANT_NAME, INCARNATION , BACKUP_SET_ID , PROGRESS_PERCENT , STATUS ,VALIDATE_TYPE ,BACKUP_DEST ,START_TIME ,END_TIME ,`RESULT` , COMMENTFROM CDB_OB_BACKUP_VALIDATION_JOB;SELECT job_id, tenant_id, TENANT_NAME, INCARNATION , BACKUP_SET_ID , PROGRESS_PERCENT , STATUS ,VALIDATE_TYPE ,BACKUP_DEST ,START_TIME ,END_TIME ,`RESULT` , COMMENTFROM CDB_OB_BACKUP_VALIDATION_JOB_HISTORY;
SELECT job_id, task_id, TENANT_ID , INCARNATION , BACKUP_SET_ID ,VALIDATE_TYPE ,STATUS ,BACKUP_DEST , START_TIME ,END_TIME ,TOTAL_PG_COUNT ,FINISH_PG_COUNT , TOTAL_PARTITION_COUNT ,FINISH_PARTITION_COUNT ,RESULT, COMMENTFROM CDB_OB_TENANT_BACKUP_VALIDATION_TASKORDER BY START_TIME DESC ;SELECT job_id, task_id, TENANT_ID , INCARNATION , BACKUP_SET_ID ,VALIDATE_TYPE ,STATUS ,BACKUP_DEST , START_TIME ,END_TIME ,TOTAL_PG_COUNT ,FINISH_PG_COUNT , TOTAL_PARTITION_COUNT ,FINISH_PARTITION_COUNT ,RESULT, COMMENTFROM CDB_OB_BACKUP_VALIDATION_TASK_HISTORYORDER BY START_TIME DESC ;
ALTER SYSTEM VALIDATE BACKUPSET 4 ;ALTER SYSTEM VALIDATE BACKUPPIECE 1;
OB 集群手动恢复
构造用于验证备份和恢复的业务表。
create table t_backup(id bigint not null auto_increment primary key, c1 varchar(100) , c2 timestamp default current_timestamp );insert into t_backup(c1) values('test backup and restore');insert into t_backup(c1) values('test backup and restore 2. ');(root@10.0.0.61:2883) [test]> select * from t_backup;+----+-----------------------------+---------------------+| id | c1 | c2 |+----+-----------------------------+---------------------+| 1 | test backup and restore | 2023-06-12 17:42:01 || 2 | test backup and restore 2. | 2023-06-12 17:42:26 |+----+-----------------------------+---------------------+2 rows in set (0.02 sec)
准备恢复的租户资源
select a.zone,concat(a.svr_ip,':',a.svr_port) observer, cpu_total, (cpu_total-cpu_assigned) cpu_free, round(mem_total/1024/1024/1024) mem_total_gb, round((mem_total-mem_assigned)/1024/1024/1024,2) mem_free_gb, round(disk_total/1024/1024/1024) disk_total_gb, round((disk_total-disk_assigned)/1024/1024/1024) disk_free_gbfrom __all_virtual_server_stat a join __all_server b on (a.svr_ip=b.svr_ip and a.svr_port=b.svr_port)order by a.zone, a.svr_ip;
CREATE RESOURCE UNIT b_4c2g max_cpu 4, max_memory '2g', max_iops 10240, max_disk_size 53687091200, max_session_num 64, MIN_CPU=4, MIN_MEMORY='2g', MIN_IOPS=10240;
zone1已经没有可分配资源了。考虑在
zone3上分配一个单副本的租户。
CREATE RESOURCE POOL restore_pool unit = 'b_4c2g', unit_num = 1, zone_list = ('ZONE3');
开始恢复
(可选)设置加密信息。
SET @kms_encrypt_info = 'aaAA11__';
设置恢复参数
ALTER SYSTEM SET restore_concurrency = 4;ALTER SYSTEM SET _restore_idle_time = '30s';SET DECRYPTION IDENTIFIED BY 'aaAA11__';
执行全量恢复
ALTER SYSTEM RESTORE restored_obmysql FROM obmysql at 'file:///backup/obbackup' until '2023-06-12 17:42:15' with 'backup_cluster_name=obdemo&backup_cluster_id=1680747946&pool_list=restore_pool';(root@10.0.0.61:2883) [oceanbase]> ALTER SYSTEM RESTORE restored_obmysql FROM obmysql at 'file:///backup/obbackup' until '2023-06-12 17:42:15' with 'backup_cluster_name=obdemo&backup_cluster_id=1680747946&pool_list=restore_pool';Query OK, 0 rows affected (0.14 sec)
查看恢复任务进度和详情
SELECT JOB_ID, TENANT_ID , TENANT_NAME, BACKUP_TENANT_ID ,BACKUP_TENANT_NAME ,BACKUP_CLUSTER_ID ,BACKUP_CLUSTER_NAME ,STATUS , START_TIME ,COMPLETION_TIME , PARTITION_COUNT , MACRO_BLOCK_COUNT , FINISH_PARTITION_COUNT , FINISH_MACRO_BLOCK_COUNT ,RESTORE_START_TIMESTAMP , RESTORE_FINISH_TIMESTAMP , BACKUP_SET_LIST , BACKUP_PIECE_LIST ,infoFROM oceanbase.CDB_OB_RESTORE_PROGRESS;
SELECT JOB_ID, TENANT_ID , TENANT_NAME, BACKUP_TENANT_ID ,BACKUP_TENANT_NAME ,BACKUP_CLUSTER_ID ,BACKUP_CLUSTER_NAME ,STATUS , START_TIME ,COMPLETION_TIME , PARTITION_COUNT , MACRO_BLOCK_COUNT , FINISH_PARTITION_COUNT , FINISH_MACRO_BLOCK_COUNT ,RESTORE_START_TIMESTAMP , RESTORE_FINISH_TIMESTAMP , BACKUP_SET_LIST , BACKUP_PIECE_LIST ,infoFROM oceanbase.CDB_OB_RESTORE_HISTORY;
验证恢复结果
[root@server065 ~]# mysql -h10.0.0.61 -uroot@restored_obmysql#obdemo -P2883 -paaAA11__ -c -A testmysql: [Warning] Using a password on the command line interface can be insecure.Welcome to the MySQL monitor. Commands end with ; or \g.Your MySQL connection id is 4706Server version: 5.6.25 OceanBase 3.2.4.1 (r101000052023010822-346aa35c32e99d1b82d713f75f0072c45bdf7aab) (Built Jan 8 2023 22:52:43)Copyright (c) 2000, 2021, Oracle and/or its affiliates.Oracle is a registered trademark of Oracle Corporation and/or itsaffiliates. Other names may be trademarks of their respectiveowners.Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.(root@10.0.0.61:2883) [test]> select * from t_backup;+----+-------------------------+---------------------+| id | c1 | c2 |+----+-------------------------+---------------------+| 1 | test backup and restore | 2023-06-12 17:42:01 |+----+-------------------------+---------------------+1 row in set (0.54 sec)(root@10.0.0.61:2883) [test]>
(root@10.0.0.61:2883) [test]> SELECT unit_id, unit_config_name, resource_pool_name, ZONE, tenant_id,tenant_name,svr_ip,max_cpu,min_cpu,round(max_memory/1024/1024/1024) max_mem_gb-> FROM oceanbase.gv$unit ;+---------+------------------+--------------------+-------+-----------+------------------+-----------+---------+---------+------------+| unit_id | unit_config_name | resource_pool_name | ZONE | tenant_id | tenant_name | svr_ip | max_cpu | min_cpu | max_mem_gb |+---------+------------------+--------------------+-------+-----------+------------------+-----------+---------+---------+------------+| 1010 | b_4c2g | restore_pool | ZONE3 | 1005 | restored_obmysql | 10.0.0.63 | 4 | 4 | 2 |+---------+------------------+--------------------+-------+-----------+------------------+-----------+---------+---------+------------+1 row in set (0.08 sec)
(root@10.0.0.61:2883) [oceanbase]> drop tenant obmysql;Query OK, 0 rows affected (0.07 sec)(root@10.0.0.61:2883) [oceanbase]> alter tenant restored_obmysql rename global_name to obmysql;Query OK, 0 rows affected (0.04 sec)(root@10.0.0.61:2883) [oceanbase]> select tenant_id, tenant_name , locality from __all_tenant;+-----------+-----------------------------------------+---------------------------------------------+| tenant_id | tenant_name | locality |+-----------+-----------------------------------------+---------------------------------------------+| 1 | sys | FULL{1}@zone1 || 1001 | oboracle | FULL{1}@zone1 || 1002 | __recycle_$_1680747946_1686204048874496 | FULL{1}@ZONE2, FULL{1}@ZONE3, FULL{1}@zone1 || 1005 | obmysql | FULL{1}@ZONE3 |+-----------+-----------------------------------------+---------------------------------------------+4 rows in set (0.01 sec)




