校验原表和分表数据
因为只关注主库比较,要做响应改造
表里面增加shard分片字段,并保留每次的原表记录,所以将shard加到主键里面,这样就不会删除原表的上次记录
注意统计信息要最新,不然可能会出现原表和分表的chunk不一致的情况
为了原表和分表有一样的chunk数,要增加chunk参数,这样可以比较crc的sum值
因为修改了结果表结构,文本输出的结果会不一致,无需关注,主要关注结果表记录
根据结果表进行分表比较
CREATE TABLE percona.`checksums` (
`db` CHAR(64) NOT NULL,
`tbl` CHAR(64) NOT NULL,
`chunk` INT(11) NOT NULL,
`chunk_time` FLOAT DEFAULT NULL,
`chunk_index` VARCHAR(200) DEFAULT NULL,
`lower_boundary` TEXT,
`upper_boundary` TEXT,
`this_crc` CHAR(40) NOT NULL,
`this_cnt` INT(11) NOT NULL,
`master_crc` CHAR(40) DEFAULT NULL,
`master_cnt` INT(11) DEFAULT NULL,
`ts` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`shard` VARCHAR(10) NOT NULL DEFAULT '-1',
PRIMARY KEY (`db`,`tbl`,`chunk`),
KEY `ts_db_tbl` (`ts`,`db`,`tbl`)
) ENGINE=INNODB DEFAULT CHARSET=utf8
---数据初始化分片
#!/bin/bash
MYSQLBASE="/mysql/base/bin"
HOST="192.168.56.201"
USER="sfwew"
PASSWORD="sfwew"
DATABASE="db"
for tab in {0..0}; do
SOURCE_TABLE="table_${tab}"
for i in {7..7}; do
SHARD_TABLE="table_${i}${tab}"
echo "Checking $SHARD_TABLE..."
pt-archiver \
--source h=$HOST,u=$USER,p=$PASSWORD,D=$DATABASE,t=$SOURCE_TABLE \
--dest h=$HOST,u=$USER,p=$PASSWORD,D=$DATABASE,t=$SHARD_TABLE \
--charset=UTF8mb4 --where "SUBSTRING(MERCHANT_NO, -2) = ${i}${tab}" \
--progress 100000 \
--limit 5000 \
--txn-size 5000 \
--no-delete \
--statistics
done
done
数据checksum值计算,要注意,主键id必须递增,否则可能会出现chunk不一致的情况,无法比较数据
增量比较的时候,添加相关条件,只比较需要比较的数据即可
[root@wgq-64-80 scripts]# cat checksum.sh
#!/bin/bash
MYSQLBASE="/mysql/base/bin"
HOST="192.168.56.201"
USER="sfwew"
PASSWORD="sfwew"
DATABASE="db"
for tab in {0..0}; do
SOURCE_TABLE="table_${tab}"
for i in {7..7}; do
SHARD_TABLE="table_${i}${tab}"
echo "Checking $SHARD_TABLE..."
pt-table-checksum \
--host=$HOST \
--user=$USER \
--password=$PASSWORD \
--databases=$DATABASE \
--tables=$SOURCE_TABLE,$SHARD_TABLE \
--no-check-binlog-format \
--no-check-replication-filters \
--replicate=percona.checksums \
--noempty-replicate-table \
--chunk-size=10000 \
--set-vars 'autocommit=on' \
--where="SUBSTRING(MERCHANT_NO, -2) = ${i}${tab}"
$MYSQLBASE/mysql -h$HOST -u$USER -p$PASSWORD -e "UPDATE percona.checksums SET SHARD='${i}${tab}' WHERE shard ='-1';commit;"
done
done
少量数据不一致同步脚本
大批量数据不要用这个方式,会锁表,影响产线,数据量比较大可用考虑用pt-archiver加where条件进行同步
[root@wgq-64-80 scripts]# cat sync.sh
#!/bin/bash
MYSQLBASE="/mysql/base/bin"
HOST="192.168.56.201"
USER="sfwew"
PASSWORD="sfwew"
DATABASE="db"
for tab in {0..0}; do
SOURCE_TABLE="table_${tab}"
for i in {7..7}; do
SHARD_TABLE="table_${i}${tab}"
echo "Checking $SHARD_TABLE..."
pt-table-sync --execute --chunk-size=10000\
h=$HOST,D=$DATABASE,t=$SOURCE_TABLE \
h=$HOST,D=$DATABASE,t=$SHARD_TABLE \
--where="SUBSTRING(MERCHANT_NO, -2) = ${i}${tab} " \
--set-vars 'autocommit=on' \
-u$USER \
-p$PASSWORD
done
done
数据比较结果checksum值是否一致,注意比较时候,中间不能插入数据,否则分批错乱,无法一致
[root@wgq-64-80 scripts]# cat result.sh
#!/bin/bash
MYSQLBASE="/mysql/base/bin"
HOST="192.168.56.201"
USER="sfwew"
PASSWORD="sfwew"
$MYSQLBASE/mysql -h$HOST -u$USER -p$PASSWORD -e "
SELECT * FROM (
SELECT
t1.db AS source_db,
t1.tbl AS source_table,
t2.tbl AS shard_table,
t2.shard,
CASE WHEN t1.this_crc = t2.this_crc THEN 'Consistent' ELSE CONCAT('DIFF: ' , t1.this_crc , ' vs ' , t2.this_crc) END AS STATUS
FROM percona.checksums t1
JOIN percona.checksums t2
ON t1.chunk=t2.chunk
AND t1.shard=t2.shard
WHERE t1.tbl=CONCAT(SUBSTR(t2.tbl,1,LENGTH(t2.tbl)-2) ,SUBSTR(t2.tbl,-1))
AND t1.tbl <>t2.tbl
AND t1.this_cnt<>0
AND t2.this_cnt<>0 ) a
WHERE STATUS<>'Consistent';
SELECT *FROM (
SELECT * FROM percona.checksums WHERE (db,tbl,shard,this_crc) IN(
SELECT source_db,shard_table,tab_shard, this_crc FROM (
SELECT
t1.db AS source_db,
t1.tbl AS source_table,
t2.tbl AS shard_table,
t2.shard AS tab_shard,
t2.this_crc,
CASE WHEN t1.this_crc = t2.this_crc THEN 'Consistent' ELSE CONCAT('DIFF: ' , t1.this_crc , ' vs ' , t2.this_crc) END AS STATUS
FROM percona.checksums t1
JOIN percona.checksums t2
ON t1.chunk=t2.chunk
AND t1.shard=t2.shard
AND t1.db=t2.db
WHERE t1.tbl=CONCAT(SUBSTR(t2.tbl,1,LENGTH(t2.tbl)-2) ,SUBSTR(t2.tbl,-1))
AND t1.tbl <>t2.tbl
AND t1.this_cnt<>0
AND t2.this_cnt<>0 ) a
WHERE STATUS<>'Consistent')
UNION ALL
SELECT * FROM percona.checksums WHERE (db,tbl,shard,this_crc) IN(
SELECT source_db,source_table,tab_shard,this_crc FROM (
SELECT
t1.db AS source_db,
t1.tbl AS source_table,
t2.tbl AS shard_table,
t2.shard AS tab_shard, t1.this_crc,
CASE WHEN t1.this_crc = t2.this_crc THEN 'Consistent' ELSE CONCAT('DIFF: ' , t1.this_crc , ' vs ' , t2.this_crc) END AS STATUS
FROM percona.checksums t1
JOIN percona.checksums t2
ON t1.chunk=t2.chunk
AND t1.shard=t2.shard
AND t1.db=t2.db
WHERE t1.tbl=CONCAT(SUBSTR(t2.tbl,1,LENGTH(t2.tbl)-2) ,SUBSTR(t2.tbl,-1))
AND t1.tbl <>t2.tbl
AND t1.this_cnt<>0
AND t2.this_cnt<>0 ) a
WHERE STATUS<>'Consistent') AND this_crc<>'0'
) a
ORDER BY shard,tbl,chunk;
"
对比查询结果,chunk过小,可能会出现分批混乱的情况,增加分批即可
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----------+----------------------+-----------------------+-------+----------------------------+
| source_db | source_table | shard_table | shard | STATUS |
+-----------+----------------------+-----------------------+-------+----------------------------+
| testdb | test_table_0 | test_table_40 | 40 | DIFF: b05d599 vs edeada18 |
| testdb | test_table_1 | test_table_31 | 31 | DIFF: 26ecfb29 vs 51f4a6b8 |
| testdb | test_table_2 | test_table_22 | 22 | DIFF: f963ddde vs ab260909 |
| testdb | test_table_3 | test_table_33 | 33 | DIFF: f3cab39a vs 7b041fa |
+-----------+----------------------+-----------------------+-------+----------------------------+
+--------+-----------------------+-------+------------+-------------+---------------------+---------------------+----------+----------+------------+------------+---------------------+-------+
| db | tbl | chunk | chunk_time | chunk_index | lower_boundary | upper_boundary | this_crc | this_cnt | master_crc | master_cnt | ts | shard |
+--------+-----------------------+-------+------------+-------------+---------------------+---------------------+----------+----------+------------+------------+---------------------+-------+
| testdb | test_table_2 | 1 | 0.010595 | PRIMARY | 698827608351571968 | 973439053951664129 | f963ddde | 50 | 7487a23a | 27 | 2025-03-19 14:24:40 | 22 |
| testdb | test_table_2 | 2 | 0.002195 | PRIMARY | 973443832459296769 | 1088829439192752129 | 5245d4d7 | 17 | 0 | 0 | 2025-03-19 14:24:40 | 22 |
| testdb | test_table_22 | 1 | 0.00264 | NULL | NULL | NULL | ab260909 | 67 | ab260909 | 67 | 2025-03-19 14:24:39 | 22 |
| testdb | test_table_1 | 1 | 0.004085 | PRIMARY | 973340459944902656 | 1068478014973698048 | 26ecfb29 | 50 | d6199ce9 | 41 | 2025-03-19 14:24:36 | 31 |
| testdb | test_table_1 | 2 | 0.0019 | PRIMARY | 1068481365052383233 | 1088830290875539457 | 77185d91 | 9 | 0 | 0 | 2025-03-19 14:24:36 | 31 |
| testdb | test_table_31 | 1 | 0.002552 | NULL | NULL | NULL | 51f4a6b8 | 59 | 51f4a6b8 | 59 | 2025-03-19 14:24:36 | 31 |
| testdb | test_table_3 | 1 | 0.069563 | PRIMARY | 601115906705719104 | 1088838792654249985 | f3cab39a | 50 | 4ee90aeb | 29 | 2025-03-19 14:24:49 | 33 |
| testdb | test_table_3 | 2 | 0.006317 | PRIMARY | 1095715336995627008 | 1105903474758017024 | f47af260 | 7 | 0 | 0 | 2025-03-19 14:24:49 | 33 |
| testdb | test_table_33 | 1 | 0.024447 | NULL | NULL | NULL | 7b041fa | 57 | 7b041fa | 57 | 2025-03-19 14:24:49 | 33 |
| testdb | test_table_0 | 1 | 0.010594 | PRIMARY | 706499573845590016 | 1075733281432035328 | b05d599 | 50 | b05d599 | 50 | 2025-03-19 14:24:33 | 40 |
| testdb | test_table_0 | 2 | 0.002016 | PRIMARY | 1078805534004633601 | 1088839445841600513 | e6ef0f81 | 4 | e6ef0f81 | 4 | 2025-03-19 14:24:33 | 40 |
| testdb | test_table_40 | 1 | 0.030995 | NULL | NULL | NULL | edeada18 | 54 | edeada18 | 54 | 2025-03-19 14:24:33 | 40 |
+--------+-----------------------+-------+------------+-------------+---------------------+---------------------+----------+----------+------------+------------+---------------------+-------+
结果如下所示:
这里的一致不一致不代表任何意义,这里是主库。
"checksum_table.sh" 26L, 744C written
[10:49:52 root@wgq-160-17 scripts]# ./checksum_table.sh
Checking test_table_00...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:08 0 0 24 0 3 0 0.291 testdb.test_table_0
03-19T10:50:08 0 0 24 0 1 0 0.101 testdb.test_table_00
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_10...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:09 0 1 32 8 3 0 0.195 testdb.test_table_0
03-19T10:50:09 0 0 32 0 1 0 0.141 testdb.test_table_10
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_20...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:10 0 2 120 20 4 0 0.380 testdb.test_table_0
03-19T10:50:10 0 0 120 0 1 0 0.054 testdb.test_table_20
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_30...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:11 0 2 41 20 3 0 0.365 testdb.test_table_0
03-19T10:50:11 0 0 41 0 1 0 0.053 testdb.test_table_30
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_40...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:12 0 2 54 20 3 0 0.378 testdb.test_table_0
03-19T10:50:12 0 0 54 0 1 0 0.050 testdb.test_table_40
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_50...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:13 0 4 1605 100 19 0 0.564 testdb.test_table_0
03-19T10:50:13 0 0 1605 0 19 0 0.249 testdb.test_table_50
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_60...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:14 0 4 72 100 3 0 0.197 testdb.test_table_0
03-19T10:50:14 0 0 72 0 1 0 0.177 testdb.test_table_60
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_70...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:14 0 4 83 100 3 0 0.265 testdb.test_table_0
03-19T10:50:15 0 0 83 0 1 0 0.101 testdb.test_table_70
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_80...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:15 0 4 45 100 3 0 0.189 testdb.test_table_0
03-19T10:50:15 0 0 45 0 1 0 0.139 testdb.test_table_80
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_90...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:16 0 9 657 100 9 0 0.449 testdb.test_table_0
03-19T10:50:16 0 0 657 0 9 0 0.149 testdb.test_table_90
mysql: [Warning] Using a password on the command line interface can be insecure.
[10:50:16 root@wgq-160-17 scripts]# cat checksum_table.sh




