暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

技术解读 | Klustron表重分布用法示例

点击上方蓝字关注我们





本文目标
通过命令行工具和XPanel演示Klustron表重分布的功能,指导客户如何将普通表转换为分区表。模拟了一个在线应用的场景下进行表重分布的案例,测试了表重分布对在线应用的影响以及验证了数据的完整性。KunlunBase(也叫做Klustron)

关键词XPanel、表重分布、Online DDL、使用示例、repartition

本文约2300文字,大概阅读时长5分钟。

注意:

如无特别说明,文中的版本号可以使用任何已发布版本的版本号代替。所有已发布版本详见:http://doc.klustron.com/zh/Release_notes.html

在现实的业务场景中,常常会出现随着业务需求和表的数据量变化,原先创建的表的方式可能不再适合了。例如本文例子的sales_order表,在创建的初期由于业务量不大,创建的是非分区表,但是随着公司业务的发展,表中记录的数量越来越多,因此需要将其转换成分区表,方便后期的数据归档和加快应用的访问速度。

具体环境信息如下:

节点类型
IP
端口
计算节点
192.168.40.152
47002
Shard1主节点
192.168.40.152
57003
Shard2主节点
192.168.26.153
57005
XPanel
192.168.40.151
18080


1

环境准备



通过PG客户端登录计算节点,创建用户和数据库。
    psql -h 192.168.40.152 -p 47001 postgres
    create user kunlun_test with password 'kunlun';
    create database test_db with owner kunlun_test encoding utf8 template template0;
    \q
    psql -h 192.168.40.152 -p 47001 -U kunlun_test test_db

    创建sales_order表,并通过存储过程装载数据
      create table sales_order
      (
      order_number INT NOT NULL AUTO_INCREMENT,
      customer_number INT NOT NULL,
      product_code INT NOT NULL,
      order_date DATETIME NOT NULL,
      entry_date DATETIME NOT NULL,
      order_amount DECIMAL(18,2) NOT NULL,
      primary key(order_number,order_date)
      ) ;


      create or replace procedure generate_order_data()
      AS $$
      DECLARE
      v_customer_number integer;
      v_product_code integer;
      v_order_date date;
      v_amount integer;
      start_date date := to_date('2021-01-01','yyyy-mm-dd');
      i integer :=1;
      BEGIN
      while i<=10000 loop
      v_customer_number := FLOOR(1+RANDOM()*6);
      v_product_code := FLOOR(1+RANDOM()*500);
      v_order_date := to_date('2021-01-01','yyyy-mm-dd') + CAST(FLOOR(RANDOM()*365) AS INT);
      v_amount := FLOOR(1000+RANDOM()*9000);
      INSERT INTO sales_order VALUES (i,v_customer_number,v_product_code,v_order_date,v_order_date,v_amount);
      commit;
      i := i+1;
      end loop;
      END; $$
      LANGUAGE plpgsql;
      set statement_timeout=0;
      call generate_order_data();
      test_db=> select count(*) from sales_order;
      count
      -------
      10000
      (1 row)



      创建目标表sales_order_new, 根据订单时间的Range分区表。
        create table sales_order_new
        (
        order_number INT NOT NULL AUTO_INCREMENT,
        customer_number INT NOT NULL,
        product_code INT NOT NULL,
        order_date DATETIME NOT NULL,
        entry_date DATETIME NOT NULL,
        order_amount DECIMAL(18,2) NOT NULL,
        primary key(order_number,order_date)
        ) partition by range(order_date);
        create table sales_order_p0 partition of sales_order_new
        for values from ('2021-01-01') to ('2021-05-01');
        create table sales_order_p1 partition of sales_order_new
        for values from ('2021-05-01') to ('2021-09-01');
        create table sales_order_p2 partition of sales_order_new
        for values from ('2021-09-01'to ('2022-01-01');


        2

        准备一个Python程序kinsert.py,模拟一直运行在sales_order表上的插入应用。



          import psycopg2.extras
          import time


          conn = psycopg2.connect(database='test_db',user='kunlun_test',
          password='kunlun',host='192.168.40.152',port='47001')




          cursor = conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor)


          # insert sales_order table
          insert_sql = '''INSERT INTO sales_order VALUES (%s,
          FLOOR(1+RANDOM()*6), FLOOR(1+RANDOM()*3),
          to_date('2021-01-01','yyyy-mm-dd') + CAST(FLOOR(RANDOM()*365) AS INT),
          to_date('2021-01-01','yyyy-mm-dd') + CAST(FLOOR(RANDOM()*365) AS INT),
          FLOOR(1000+RANDOM()*9000));
          ;'''


          #print("Press Enter to continue.")
          input('Press any key and Enter to continue ~!')
          i = 10001
          while (i < 20001) :
          cursor.execute(insert_sql,[i])
          conn.commit()
          print("order_number:",i," is inserted.")
          i = i+1


          cursor.close()
          conn.close()

          应用模拟从order_number 10001开始插入,程序运行后会等待输入才能执行真正的插入逻辑。在XPanel界面开始重新分布任务后,立刻恢复程序的运行,
            [klbase@server-0 ~]$ python kinsert.py
            Press any key and Enter to continue ~!


            3

            XPanel界面进行表重分布的操作



            在“集群列表信息”页面,点击“设置”


            在左边栏位点击“表重分布”


            选择“目标表集群”,“源表”,“目标表”以及删除源表的策略。如果策略选择“自动”,则会默认保留源表7天;如果选择“手动”,则用户自行删除源表。

            点击“提交”。与此同时,切换到运行Python程序kinsert.py的终端,输入Enter继续插入记录。
              [klbase@server-0 ~]$ python kinsert.py
              Press any key and Enter to continue ~!
              order_number: 10001 is inserted.
              order_number: 10002 is inserted.
              order_number: 10003 is inserted.
              order_number: 10004 is inserted.
              order_number: 10005 is inserted.
              order_number: 10006 is inserted.
              order_number: 10007 is inserted.
              order_number: 10008 is inserted.
              order_number: 10009 is inserted.
              order_number: 10010 is inserted.
              order_number: 10011 is inserted.
              order_number: 10012 is inserted.
              order_number: 10013 is inserted.
              order_number: 10014 is inserted.
              order_number: 10015 is inserted.
              order_number: 10016  is inserted.

              稍等片刻会出现插入错误,此时源表sales_order被重命名了。
                order_number: 10281 is inserted.
                order_number: 10282 is inserted.
                order_number: 10283 is inserted.
                order_number: 10284 is inserted.
                order_number: 10285 is inserted.
                order_number: 10286 is inserted.
                order_number: 10287 is inserted.
                order_number: 10288 is inserted.
                order_number: 10289 is inserted.
                order_number: 10290 is inserted.
                order_number: 10291 is inserted.
                order_number: 10292 is inserted.
                order_number: 10293 is inserted.
                order_number: 10294 is inserted.
                order_number: 10295 is inserted.
                order_number: 10296 is inserted.
                order_number: 10297 is inserted.
                Traceback (most recent call last):
                File "kinsert.py", line 22, in <module>
                cursor.execute(insert_sql,[i])
                File "/usr/local/lib64/python3.6/site-packages/psycopg2/extras.py", line 236, in execute
                return super().execute(query, vars)
                psycopg2.errors.UndefinedTable: relation "sales_order" does not exist
                LINE 1: INSERT INTO sales_order VALUES (10298,
                此时在线应用插入的order_number为10297,当表重分布执行的时侯sales_order会被重命名,所以应用此时插入失败了。

                 

                4

                检查目标表和源表的情况



                回到PG客户端检查目标表和源表的情况。发现源表已经变成了分区表,目标表根据“删除源表”策略修改为__sales_order$$tb_repartition_13。

                命名规则为:__[源表名]$$tb_repartition_[任务号]。等到7天之后,系统会自动删除表__sales_order$$tb_repartition_13。

                查询sales_order中的数据,数据量正好和应用错误之前插入点是吻合的,在开始进行表重新分布的之后,在线应用仍然插入了297行记录。
                  test_db=> select count(*) from sales_order;
                  count
                  -------
                  10297
                  (1 row)









                  现在,我们发起一轮新的投票,希望就以下问题请教大家,收集大家的反馈。为了感谢您的参与,所有参与问卷调查并留下邮箱联系方式的,由于我们的问卷系统不支持在线抽奖,我们将在后续会统一随机抽取 15 名有效填写者作为幸运参与者,送精美纪念礼品一份,感谢大家参加。
                  👆扫码填写

                  END

                  同时欢迎大家扫码👇添加小助手(备注:加入KunlunBase技术交流群)欢迎大家在交流群共同探讨更多问题及主题。

                   点击👆上方,关注获取源代码及技术信息~



                  推荐阅读


                  关于我司将 KunlunBase 产品名称改为 Klustron 的通知

                  技术解读 | KunlunBase多层级并行查询处理技术

                  技术解读 | KunlunBase全局死锁检测技术

                  技术解读 | MySQL分布式事务处理的问题和kunlun-storage的解决方案

                  技术解读 | 深入了解词法分析:从原理到实践

                  技术解读 | Klustron 数据备份及全局一致性恢复

                  技术解读|Klustron Mirror表功能介绍和使用示例

                  MySQL Bug 的处理及修复

                  MySQL内核研发系列之二 - MySQL测试框架MTR简介

                  文章转载自KunlunBase 昆仑数据库,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

                  评论