暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

mysql分表后用pt-table-checksum检查数据一致性方法

qtiger 2025-03-19
227

校验原表和分表数据
因为只关注主库比较,要做响应改造
表里面增加shard分片字段,并保留每次的原表记录,所以将shard加到主键里面,这样就不会删除原表的上次记录
注意统计信息要最新,不然可能会出现原表和分表的chunk不一致的情况
为了原表和分表有一样的chunk数,要增加chunk参数,这样可以比较crc的sum值
因为修改了结果表结构,文本输出的结果会不一致,无需关注,主要关注结果表记录
根据结果表进行分表比较
CREATE TABLE percona.`checksums` (
`db` CHAR(64) NOT NULL,
`tbl` CHAR(64) NOT NULL,
`chunk` INT(11) NOT NULL,
`chunk_time` FLOAT DEFAULT NULL,
`chunk_index` VARCHAR(200) DEFAULT NULL,
`lower_boundary` TEXT,
`upper_boundary` TEXT,
`this_crc` CHAR(40) NOT NULL,
`this_cnt` INT(11) NOT NULL,
`master_crc` CHAR(40) DEFAULT NULL,
`master_cnt` INT(11) DEFAULT NULL,
`ts` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`shard` VARCHAR(10) NOT NULL DEFAULT '-1',
PRIMARY KEY (`db`,`tbl`,`chunk`),
KEY `ts_db_tbl` (`ts`,`db`,`tbl`)
) ENGINE=INNODB DEFAULT CHARSET=utf8

---数据初始化分片

#!/bin/bash

MYSQLBASE="/mysql/base/bin"

HOST="192.168.56.201"

USER="sfwew"

PASSWORD="sfwew"

DATABASE="db"

for tab in {0..0}; do

SOURCE_TABLE="table_${tab}"

for i in {7..7}; do

SHARD_TABLE="table_${i}${tab}"

echo "Checking $SHARD_TABLE..."

pt-archiver \

--source h=$HOST,u=$USER,p=$PASSWORD,D=$DATABASE,t=$SOURCE_TABLE \

--dest h=$HOST,u=$USER,p=$PASSWORD,D=$DATABASE,t=$SHARD_TABLE \

--charset=UTF8mb4 --where "SUBSTRING(MERCHANT_NO, -2) = ${i}${tab}" \

--progress 100000 \

--limit 5000 \

--txn-size 5000 \

--no-delete \

--statistics

done

done

数据checksum值计算,要注意,主键id必须递增,否则可能会出现chunk不一致的情况,无法比较数据

增量比较的时候,添加相关条件,只比较需要比较的数据即可

[root@wgq-64-80 scripts]# cat checksum.sh

#!/bin/bash

MYSQLBASE="/mysql/base/bin"

HOST="192.168.56.201"

USER="sfwew"

PASSWORD="sfwew"

DATABASE="db"

for tab in {0..0}; do

SOURCE_TABLE="table_${tab}"

for i in {7..7}; do

SHARD_TABLE="table_${i}${tab}"

echo "Checking $SHARD_TABLE..."

pt-table-checksum \

--host=$HOST \

--user=$USER \

--password=$PASSWORD \

--databases=$DATABASE \

--tables=$SOURCE_TABLE,$SHARD_TABLE \

--no-check-binlog-format \

--no-check-replication-filters \

--replicate=percona.checksums \

--noempty-replicate-table \

--chunk-size=10000 \

--set-vars 'autocommit=on' \

--where="SUBSTRING(MERCHANT_NO, -2) = ${i}${tab}"

$MYSQLBASE/mysql -h$HOST -u$USER -p$PASSWORD -e "UPDATE percona.checksums SET SHARD='${i}${tab}' WHERE shard ='-1';commit;"

done

done

少量数据不一致同步脚本

大批量数据不要用这个方式,会锁表,影响产线,数据量比较大可用考虑用pt-archiver加where条件进行同步

[root@wgq-64-80 scripts]# cat sync.sh

#!/bin/bash

MYSQLBASE="/mysql/base/bin"

HOST="192.168.56.201"

USER="sfwew"

PASSWORD="sfwew"

DATABASE="db"

for tab in {0..0}; do

SOURCE_TABLE="table_${tab}"

for i in {7..7}; do

SHARD_TABLE="table_${i}${tab}"

echo "Checking $SHARD_TABLE..."

pt-table-sync --execute --chunk-size=10000\

h=$HOST,D=$DATABASE,t=$SOURCE_TABLE \

h=$HOST,D=$DATABASE,t=$SHARD_TABLE \

--where="SUBSTRING(MERCHANT_NO, -2) = ${i}${tab} " \

--set-vars 'autocommit=on' \

-u$USER \

-p$PASSWORD

done

done

数据比较结果checksum值是否一致,注意比较时候,中间不能插入数据,否则分批错乱,无法一致

[root@wgq-64-80 scripts]# cat result.sh

#!/bin/bash

MYSQLBASE="/mysql/base/bin"

HOST="192.168.56.201"

USER="sfwew"

PASSWORD="sfwew"

$MYSQLBASE/mysql -h$HOST -u$USER -p$PASSWORD -e "

SELECT * FROM (

SELECT

t1.db AS source_db,

t1.tbl AS source_table,

t2.tbl AS shard_table,

t2.shard,

CASE WHEN t1.this_crc = t2.this_crc THEN 'Consistent' ELSE CONCAT('DIFF: ' , t1.this_crc , ' vs ' , t2.this_crc) END AS STATUS

FROM percona.checksums t1

JOIN percona.checksums t2

ON t1.chunk=t2.chunk

AND t1.shard=t2.shard

WHERE t1.tbl=CONCAT(SUBSTR(t2.tbl,1,LENGTH(t2.tbl)-2) ,SUBSTR(t2.tbl,-1))

AND t1.tbl <>t2.tbl

AND t1.this_cnt<>0

AND t2.this_cnt<>0 ) a

WHERE STATUS<>'Consistent';

SELECT *FROM (

SELECT * FROM percona.checksums WHERE (db,tbl,shard,this_crc) IN(

SELECT source_db,shard_table,tab_shard, this_crc FROM (

SELECT

t1.db AS source_db,

t1.tbl AS source_table,

t2.tbl AS shard_table,

t2.shard AS tab_shard,

t2.this_crc,

CASE WHEN t1.this_crc = t2.this_crc THEN 'Consistent' ELSE CONCAT('DIFF: ' , t1.this_crc , ' vs ' , t2.this_crc) END AS STATUS

FROM percona.checksums t1

JOIN percona.checksums t2

ON t1.chunk=t2.chunk

AND t1.shard=t2.shard

AND t1.db=t2.db

WHERE t1.tbl=CONCAT(SUBSTR(t2.tbl,1,LENGTH(t2.tbl)-2) ,SUBSTR(t2.tbl,-1))

AND t1.tbl <>t2.tbl

AND t1.this_cnt<>0

AND t2.this_cnt<>0 ) a

WHERE STATUS<>'Consistent')

UNION ALL

SELECT * FROM percona.checksums WHERE (db,tbl,shard,this_crc) IN(

SELECT source_db,source_table,tab_shard,this_crc FROM (

SELECT

t1.db AS source_db,

t1.tbl AS source_table,

t2.tbl AS shard_table,

t2.shard AS tab_shard, t1.this_crc,

CASE WHEN t1.this_crc = t2.this_crc THEN 'Consistent' ELSE CONCAT('DIFF: ' , t1.this_crc , ' vs ' , t2.this_crc) END AS STATUS

FROM percona.checksums t1

JOIN percona.checksums t2

ON t1.chunk=t2.chunk

AND t1.shard=t2.shard

AND t1.db=t2.db

WHERE t1.tbl=CONCAT(SUBSTR(t2.tbl,1,LENGTH(t2.tbl)-2) ,SUBSTR(t2.tbl,-1))

AND t1.tbl <>t2.tbl

AND t1.this_cnt<>0

AND t2.this_cnt<>0 ) a

WHERE STATUS<>'Consistent') AND this_crc<>'0'

) a

ORDER BY shard,tbl,chunk;

"


对比查询结果,chunk过小,可能会出现分批混乱的情况,增加分批即可
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----------+----------------------+-----------------------+-------+----------------------------+
| source_db | source_table | shard_table | shard | STATUS |
+-----------+----------------------+-----------------------+-------+----------------------------+
| testdb | test_table_0 | test_table_40 | 40 | DIFF: b05d599 vs edeada18 |
| testdb | test_table_1 | test_table_31 | 31 | DIFF: 26ecfb29 vs 51f4a6b8 |
| testdb | test_table_2 | test_table_22 | 22 | DIFF: f963ddde vs ab260909 |
| testdb | test_table_3 | test_table_33 | 33 | DIFF: f3cab39a vs 7b041fa |
+-----------+----------------------+-----------------------+-------+----------------------------+
+--------+-----------------------+-------+------------+-------------+---------------------+---------------------+----------+----------+------------+------------+---------------------+-------+
| db | tbl | chunk | chunk_time | chunk_index | lower_boundary | upper_boundary | this_crc | this_cnt | master_crc | master_cnt | ts | shard |
+--------+-----------------------+-------+------------+-------------+---------------------+---------------------+----------+----------+------------+------------+---------------------+-------+
| testdb | test_table_2 | 1 | 0.010595 | PRIMARY | 698827608351571968 | 973439053951664129 | f963ddde | 50 | 7487a23a | 27 | 2025-03-19 14:24:40 | 22 |
| testdb | test_table_2 | 2 | 0.002195 | PRIMARY | 973443832459296769 | 1088829439192752129 | 5245d4d7 | 17 | 0 | 0 | 2025-03-19 14:24:40 | 22 |
| testdb | test_table_22 | 1 | 0.00264 | NULL | NULL | NULL | ab260909 | 67 | ab260909 | 67 | 2025-03-19 14:24:39 | 22 |
| testdb | test_table_1 | 1 | 0.004085 | PRIMARY | 973340459944902656 | 1068478014973698048 | 26ecfb29 | 50 | d6199ce9 | 41 | 2025-03-19 14:24:36 | 31 |
| testdb | test_table_1 | 2 | 0.0019 | PRIMARY | 1068481365052383233 | 1088830290875539457 | 77185d91 | 9 | 0 | 0 | 2025-03-19 14:24:36 | 31 |
| testdb | test_table_31 | 1 | 0.002552 | NULL | NULL | NULL | 51f4a6b8 | 59 | 51f4a6b8 | 59 | 2025-03-19 14:24:36 | 31 |
| testdb | test_table_3 | 1 | 0.069563 | PRIMARY | 601115906705719104 | 1088838792654249985 | f3cab39a | 50 | 4ee90aeb | 29 | 2025-03-19 14:24:49 | 33 |
| testdb | test_table_3 | 2 | 0.006317 | PRIMARY | 1095715336995627008 | 1105903474758017024 | f47af260 | 7 | 0 | 0 | 2025-03-19 14:24:49 | 33 |
| testdb | test_table_33 | 1 | 0.024447 | NULL | NULL | NULL | 7b041fa | 57 | 7b041fa | 57 | 2025-03-19 14:24:49 | 33 |
| testdb | test_table_0 | 1 | 0.010594 | PRIMARY | 706499573845590016 | 1075733281432035328 | b05d599 | 50 | b05d599 | 50 | 2025-03-19 14:24:33 | 40 |
| testdb | test_table_0 | 2 | 0.002016 | PRIMARY | 1078805534004633601 | 1088839445841600513 | e6ef0f81 | 4 | e6ef0f81 | 4 | 2025-03-19 14:24:33 | 40 |
| testdb | test_table_40 | 1 | 0.030995 | NULL | NULL | NULL | edeada18 | 54 | edeada18 | 54 | 2025-03-19 14:24:33 | 40 |
+--------+-----------------------+-------+------------+-------------+---------------------+---------------------+----------+----------+------------+------------+---------------------+-------+

结果如下所示:
这里的一致不一致不代表任何意义,这里是主库。
"checksum_table.sh" 26L, 744C written
[10:49:52 root@wgq-160-17 scripts]# ./checksum_table.sh
Checking test_table_00...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:08 0 0 24 0 3 0 0.291 testdb.test_table_0
03-19T10:50:08 0 0 24 0 1 0 0.101 testdb.test_table_00
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_10...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:09 0 1 32 8 3 0 0.195 testdb.test_table_0
03-19T10:50:09 0 0 32 0 1 0 0.141 testdb.test_table_10
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_20...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:10 0 2 120 20 4 0 0.380 testdb.test_table_0
03-19T10:50:10 0 0 120 0 1 0 0.054 testdb.test_table_20
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_30...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:11 0 2 41 20 3 0 0.365 testdb.test_table_0
03-19T10:50:11 0 0 41 0 1 0 0.053 testdb.test_table_30
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_40...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:12 0 2 54 20 3 0 0.378 testdb.test_table_0
03-19T10:50:12 0 0 54 0 1 0 0.050 testdb.test_table_40
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_50...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:13 0 4 1605 100 19 0 0.564 testdb.test_table_0
03-19T10:50:13 0 0 1605 0 19 0 0.249 testdb.test_table_50
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_60...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:14 0 4 72 100 3 0 0.197 testdb.test_table_0
03-19T10:50:14 0 0 72 0 1 0 0.177 testdb.test_table_60
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_70...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:14 0 4 83 100 3 0 0.265 testdb.test_table_0
03-19T10:50:15 0 0 83 0 1 0 0.101 testdb.test_table_70
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_80...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:15 0 4 45 100 3 0 0.189 testdb.test_table_0
03-19T10:50:15 0 0 45 0 1 0 0.139 testdb.test_table_80
mysql: [Warning] Using a password on the command line interface can be insecure.
Checking test_table_90...
Checking if all tables can be checksummed ...
Starting checksum ...
TS ERRORS DIFFS ROWS DIFF_ROWS CHUNKS SKIPPED TIME TABLE
03-19T10:50:16 0 9 657 100 9 0 0.449 testdb.test_table_0
03-19T10:50:16 0 0 657 0 9 0 0.149 testdb.test_table_90
mysql: [Warning] Using a password on the command line interface can be insecure.
[10:50:16 root@wgq-160-17 scripts]# cat checksum_table.sh

最后修改时间:2025-03-26 17:56:16
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论