暂无图片
暂无图片
3
暂无图片
暂无图片
暂无图片

GBase 8a 集群实战演练

原创 陆户习习门 2022-03-22
28153

WechatIMG16.jpeg

活动背景

GBase 8a 集群实战演练活动
具体实战演练内容详见下文。

一、安装

演练环境

数据库版本:GBase 8a MPP Cluster V95

服务器:10.206.16.106(主安装节点)、10.206.16.108

SSH工具:FinalShell

客户端工具:GBase Data Studio

安装步骤

环境准备

106、108执行如下命令:

$ #root 用户登录 $ #创建gbase用户(密码:gbase) $ useradd gbase $ passwd gbase $ cd /opt/ $ mkdir -p /opt/gbase $ #授权 $ chown gbase:gbase /opt/gbase/

集群安装

解压软件包

106:

$ tar xjf GBase8a_MPP_Cluster-License-9.5.2.39-redhat7.3-x86_64.tar.bz2 $ cd gcinstall/ $ cp SetSysEnv.py /opt/

执行环境配置脚本

108:

$ #复制环境配置脚本 $ scp root@10.206.16.106:/opt/gcinstall/SetSysEnv.py /opt/

106、108:

$ #运行SetSysEnv.py配置安装环境 $ python SetSysEnv.py --dbaUser=gbase --installPrefix=/opt/gbase --cgroup

数据库安装

106:

$ #DBA用户在主安装节点执行 $ su - gbase $ cd /opt/gcinstall/ $ vi demp.options

修改demo.options如下(两个节点对称部署):

installPrefix= /opt/gbase
coordinateHost = 10.206.16.106,10.206.16.108
coordinateHostNodeID = 106,108
dataHost = 10.206.16.106,10.206.16.108
#existCoordinateHost =
#existDataHost =
dbaUser = gbase
dbaGroup = gbase
dbaPwd = ‘gbase’
rootPwd = ‘Gbase_0312!@’
#rootPwdFile = rootPwd.json

$ ./gcinstall.py --silent=demo.options # 安装完成后显示如下日志 10.206.16.108 install cluster on host 10.206.16.108 successfully. 10.206.16.106 install cluster on host 10.206.16.106 successfully. Starting all gcluster nodes... start service failed on host 10.206.16.108. start service failed on host 10.206.16.106. adding new datanodes to gcware... InstallCluster Successfully. $ exit $ su - gbase $ gcadmin CLUSTER STATE: ACTIVE ================================================================ | GBASE COORDINATOR CLUSTER INFORMATION | ================================================================ | NodeName | IpAddress | gcware | gcluster | DataState | ---------------------------------------------------------------- | coordinator1 | 10.206.16.106 | OPEN | CLOSE | 0 | ---------------------------------------------------------------- | coordinator2 | 10.206.16.108 | OPEN | CLOSE | 0 | ---------------------------------------------------------------- ============================================================== | GBASE CLUSTER FREE DATA NODE INFORMATION | ============================================================== | NodeName | IpAddress | gnode | syncserver | DataState | -------------------------------------------------------------- | FreeNode1 | 10.206.16.106 | CLOSE | OPEN | 0 | -------------------------------------------------------------- | FreeNode2 | 10.206.16.108 | CLOSE | OPEN | 0 | -------------------------------------------------------------- 0 virtual cluster 2 coordinator node 2 free data node

导入License

106:

$ #查看license状态 $ ./chkLicense -n 10.206.16.106,10.206.16.108 -u gbase -p gbase ====================================================================== 10.206.16.108 is_exist:no ====================================================================== 10.206.16.106 is_exist:no $ #导入license文件 $ ./License -n 10.206.16.106,10.206.16.108 -f /opt/20220315-02.lic -u gbase -p gbase ====================================================================== Successful node nums: 2 ====================================================================== $ #再次查看license状态,已经配置 $ ./chkLicense -n 10.206.16.106,10.206.16.108 -u gbase -p gbase ====================================================================== 10.206.16.108 is_exist:yes version:trial expire_time:20220615 is_valid:yes ====================================================================== 10.206.16.106 is_exist:yes version:trial expire_time:20220615 is_valid:yes

106,108:

$ #重启集群服务 $ gcluster_services all stop $ gcluster_services all start $ gcadmin CLUSTER STATE: ACTIVE ================================================================ | GBASE COORDINATOR CLUSTER INFORMATION | ================================================================ | NodeName | IpAddress | gcware | gcluster | DataState | ---------------------------------------------------------------- | coordinator1 | 10.206.16.106 | OPEN | OPEN | 0 | ---------------------------------------------------------------- | coordinator2 | 10.206.16.108 | OPEN | OPEN | 0 | ---------------------------------------------------------------- ============================================================== | GBASE CLUSTER FREE DATA NODE INFORMATION | ============================================================== | NodeName | IpAddress | gnode | syncserver | DataState | -------------------------------------------------------------- | FreeNode1 | 10.206.16.106 | OPEN | OPEN | 0 | -------------------------------------------------------------- | FreeNode2 | 10.206.16.108 | OPEN | OPEN | 0 | -------------------------------------------------------------- 0 virtual cluster 2 coordinator node 2 free data node

生成 distribution

#一个分片,一个备份,负载均衡模式 $ gcadmin distribution gcChangeInfo.xml p 1 d 1 pattern 1 gcadmin generate distribution ... NOTE: node [10.206.16.106] is coordinator node, it shall be data node too NOTE: node [10.206.16.108] is coordinator node, it shall be data node too gcadmin generate distribution successful $ gcadmin showdistribution node Distribution ID: 1 | State: new | Total segment num: 2 ============================================================================================ | nodes | 10.206.16.106 | 10.206.16.108 | -------------------------------------------------------------------------------------------- | primary | 1 | 2 | | segments | | | -------------------------------------------------------------------------------------------- |duplicate | 2 | 1 | |segments 1| | | ============================================================================================

数据库初始化

$ gccli -u root -p Enter password: GBase client 9.5.2.39.126761. Copyright (c) 2004-2022, GBase. All Rights Reserved. gbase> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | performance_schema | | gbase | | gctmpdb | +--------------------+ 4 rows in set (Elapsed: 00:00:00.01) gbase> initnodedatamap; Query OK, 0 rows affected (Elapsed: 00:00:00.31)

二、数据准备

建库

gbase> create database GBase8a_cuifengyi; Query OK, 1 row affected (Elapsed: 00:00:00.01) gbase> use GBase8a_cuifengyi; Query OK, 0 rows affected (Elapsed: 00:00:00.01)

建表

创建员工、部门、薪水三张表结构

gbase> CREATE TABLE Employee ( -> id INT COMMENT '主键', -> NAME VARCHAR ( 50 ) COMMENT '员工姓名', -> departmentId INT COMMENT 'Department表中ID的外键', -> workingAge TINYINT COMMENT '工龄', -> PRIMARY KEY ( id ) -> ) REPLICATED; Query OK, 0 rows affected (Elapsed: 00:00:00.26) gbase> CREATE TABLE Department ( -> id INT COMMENT '主键', -> Ename VARCHAR ( 100 ) COMMENT '部门英文名称', -> Cname VARCHAR ( 50 ) COMMENT '部门中文名称', -> PRIMARY KEY ( id ) -> ) REPLICATED; Query OK, 0 rows affected (Elapsed: 00:00:00.05) gbase> CREATE TABLE Salary ( -> employeeID INT COMMENT 'Employee.id', -> salary INT COMMENT '工资', -> yearmonth VARCHAR ( 6 ) COMMENT '年月(YYYYMM)' -> ) DISTRIBUTED BY ( 'employeeID' ); Query OK, 0 rows affected (Elapsed: 00:00:00.07) gbase> show tables; +-----------------------------+ | Tables_in_gbase8a_cuifengyi | +-----------------------------+ | department | | employee | | salary | +-----------------------------+ 3 rows in set (Elapsed: 00:00:00.00)

文件入库

文件上传

将《南大通用2022数据库大赛造数.xls 》拆分为employee.csv、salary.csv、department.csv三个文件,使用finalshell工具,通过拖拽的方式,将文件上传至服务器。

image20220317174957103.png

loadfile

employ.csv 的“name”字段值包含 ‘,’(英文逗号),与csv默认分隔符冲突,故先在csv文件中做处理,将’,’(英文逗号)替换为"@",入库后再更新回初始数据。

gbase> LOAD DATA INFILE 'file://10.206.16.106/sampledata/employee.csv' INTO TABLE gbase8a_cuifengyi.employee data_format 3 fields terminated by ',' table_fields 'id,name,departmentId,workingAge' trace 1 trace_path '/home/gbase/' ignore 1 lines; Query OK, 740 rows affected (Elapsed: 00:00:00.06) Task 29 finished, Loaded 740 records, Skipped 0 records gbase> gbase> gbase> LOAD DATA INFILE 'file://10.206.16.106/sampledata/department.csv' INTO TABLE gbase8a_cuifengyi.department data_format 3 fields terminated by ',' table_fields 'id,Ename,Cname' trace 1 trace_path '/home/gbase/' ignore 1 lines; Query OK, 39 rows affected (Elapsed: 00:00:00.07) Task 30 finished, Loaded 39 records, Skipped 0 records gbase> gbase> LOAD DATA INFILE 'file://10.206.16.106/sampledata/salary.csv' INTO TABLE gbase8a_cuifengyi.salary data_format 3 fields terminated by ',' table_fields 'employeeID,salary,yearmonth' trace 1 trace_path '/home/gbase/' ignore 1 lines; Query OK, 4440 rows affected (Elapsed: 00:00:00.11) Task 31 finished, Loaded 4440 records, Skipped 0 records #更新回初始数据 gbase> update gbase8a_cuifengyi.employee set name = replace(name,'@',','); Query OK, 740 rows affected (Elapsed: 00:00:00.04) Rows matched: 740 Changed: 740 Warnings: 0

三、SQL编码题

编码题在GBase Data Studio完成。

T1

将Department.Ename分解为两列,一列是部门简称EnameAbbr,一列是部门全称EnameDetail,分隔符是冒号。分隔后的名称不能有空格。请将结果存入新表department_new中。

create table department_new as select replace(if(instr(ename,':',1)=0,ename,substr(ename,0,instr(ename,':',1)-1)),' ','') as EnameAbbr, replace(substr(ename,instr(ename,':',1)+1,length(ename)-instr(ename,':',1)),' ','') as EnameDetail from department; select * from department_new;

image20220318110001102.png

T2

列出各部门员工工龄段。分别统计各部门员工工龄(1~5年)、工龄(6~10年)、工龄(大于10年)的数量。

-- 1.列出各部门员工工龄段 select t2.cname, t1.name, t1.workingAge, case when t1.workingAge < 6 then '1~5年' when t1.workingAge < 11 then '6~10年' else '10年以上' end as 工龄段 from employee t1 left join department t2 on t1.departmentId = t2.id order by t2.cname,t1.workingAge;

image20220318111050025.png

-- 2.分别统计各部门员工工龄 select t2.cname, sum(case when t1.workingAge > 0 and t1.workingAge < 6 then 1 else 0 end) as "工龄1~5年", sum(case when t1.workingAge > 5 and t1.workingAge < 11 then 1 else 0 end) as "工龄6~10年", sum(case when t1.workingAge > 10 then 1 else 0 end) as "工龄大于10年" from employee t1 left join department t2 on t1.departmentId = t2.id group by t2.cname order by t2.cname;

image20220318111919904.png

T3

列出2021年06月每个部门收入排名前三的员工。结果集为部门名、员工姓名、工资。注意:某部门工资前三名员工数量可能大于或小于3;特例:没有第三高的工资。

select cname, name, salary from ( select t3.cname, t2.name, t1.salary, dense_rank() over(partition by cname order by salary desc) as rk from salary t1 left join employee t2 on t1.employeeID = t2.id left join department t3 on t2.departmentid = t3.id where t1.yearmonth = '202106' )t4 where rk < 4 order by cname,rk;

image20220318141025834.png

T4

统计每月各部门内员工收入在本部门的占比。结果集为:年月、部门、员工姓名、收入在本部门的占比。

select t1.yearmonth, t3.cname, t2.name, round(t1.salary/sum(salary) over(partition by cname,yearmonth)*100,2)||'%' as 部门占比 from salary t1 left join employee t2 on t1.employeeID = t2.id left join department t3 on t2.departmentid = t3.id;

image20220318141957065.png

T5

统计每个部门总收入每月的环比增长率( (本月-上月)*100%/上月 )。注意202101没有上月的比对数据,所以环比为空。

select cname, yearmonth, cur_salary as 本月工资总额, round((cur_salary-bef_salary)*100/bef_salary,2)||'%' as '环比增长率' from ( select t4.*, lag(cur_salary) over(partition by cname order by yearmonth) as bef_salary from( select t3.cname, t1.yearmonth, sum(salary) cur_salary from salary t1 left join employee t2 on t1.employeeID = t2.id left join department t3 on t2.departmentid = t3.id group by cname,yearmonth )t4 )t5;

image20220318143821087.png

活动感受

很荣幸有这次免费的学习机会,先学习了GDCA课程,后参加本次GBase8a集群实战演练活动,课程中知识点讲解非常细致,再结合实践,对GBase8aMPP整体有了更加深入的认识。感谢GBase给予这样的学习平台,以及老师们的耐心指导。收获颇丰。

最后修改时间:2022-03-23 11:39:27
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论