1 User/Password Authentication Connection
1.1 Configure Transwarp TDH
1.1.1 Modify the parameter (restart required)
Set hadoop_security.authentication.oauth2.enable to false.
1.1.2 Create the data-exchange directory and set a space quota
hdfs dfs -mkdir /tmp/gbase8up
hdfs dfs -chmod 777 /tmp/gbase8up
hdfs dfsadmin -setSpaceQuota 10G /tmp/gbase8up
1.1.3 Provide the authentication user and password
Provide the user and password required for the connection, and grant it privileges to create databases, create tables, and insert, delete, update, and select.
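As a hedged illustration, the grants might be issued through beeline roughly as follows (the database name gbase8up and the Hive SQL-standard grant syntax are assumptions; verify against the authorization mode of the Inceptor actually deployed):

```shell
# Hypothetical grant for the connection user ndty01 (assumption:
# Hive SQL-standard authorization syntax; adapt to the real Inceptor setup).
GRANT_SQL='grant all on database gbase8up to user ndty01'
# On the cluster this would be submitted via beeline, e.g.:
#   beeline -u "jdbc:hive2://tdh01:10000" -e "$GRANT_SQL"
echo "$GRANT_SQL"
```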
1.2 Operating system configuration
The Transwarp client must be installed on every management node of the GBase UP cluster. Note: the client differs depending on whether Kerberos is enabled or not; download it from the Transwarp TDH management page.
① Install and configure the Java environment
② Configure the yum repository
③ Unpack the TDH client package and run the one-click install script as the root user
source init.sh
④ Add source ..../init.sh to the gbase user's environment variables (.bash_profile).
⑤ Verify that the beeline client can connect to Transwarp TDH
su - gbase
beeline
!connect jdbc:hive2://tdh01:10000
⑥ Verify the hdfs commands
hdfs dfs -ls /tmp
hdfs dfs -touch /tmp/gbase8up/1.txt
1.3 Configure GBase UP
1.3.1 Modify the configuration file gbase_8a_gcluster.cnf (cluster restart required)
[hive]
# Modify the following settings to match the actual environment.
hive_hdfs_user_name=ndty01 ### authentication user
# HA zookeeper connect string.
hive_zookeeper_host=tdh01:2181,tdh02:2181,tdh03:2181 ## zookeeper hosts
# HA namenode
hive_hdfs_cluster_name=nameservice1 ## Hadoop cluster name
# The NameNode's IP.
hive_hdfs_server_name=tdh03 ## NameNode IP; comma-separated if more than one (if errors occur, list only one)
# The port of the HDFS server.
hive_hdfs_server_port=8020 ## HDFS port
# The HBase Master's IP.
hive_hbase_server_name=n35
# Which table stores Blob data.
hive_hbase_table_name=HbaseStream
# The local directory that stores the BlobCache.
hive_blob_cache_path=/opt/gcluster/userdata/blob_cache
# Which HDFS directory stores Blob data.
hive_hdfs_blob_path=/blobstorage
# For data exchange, the minimum HDFS space must be >1G. Default is 10G; the value is in bytes.
hive_hdfs_tmpspace_limit=10G
# For data exchange, the minimum Hive space must be 1G-100G. Default is 10G; the value is in bytes.
hive_space_limit=10G
# For data exchange, the minimum gnode space must be >1G. Default is 10G; the value is in bytes.
hive_gcluster_space_limit=10G
# The maximum number of connections, range 1-64, default 5.
hive_max_connections=5
# The number of connections kept in the pool, range 0-64, default 3.
hive_max_connections_in_pool=2
# The number of retries when getting a connection fails, range 1-10, default 5.
hive_max_try_connect_times=5
# The Column Family name of the table that stores Blob data.
hive_hbase_col_family_name=file
# The port of the HBase server.
hive_hbase_server_port=9090
# The maximum number of connections to HBase, range 1-64, default 5.
hive_hbase_max_connections=5
# The maximum length of a single blob stored in HBase, range 2M-16M, default 16M; the value is in bytes.
# If a blob exceeds 16M, it is stored in HDFS instead.
hive_blob_max_size_in_hbase=16M
# The maximum length of a single blob stored in HDFS, range 1G-1T, default 16G; the value is in bytes.
hive_blob_max_size_in_hdfs=16G
# The maximum space the BlobCache may use on local disk, range 1G-16G, default 8G; the value is in bytes.
hive_blob_cache_space_limit=8G
# range: 0 or 1
# 0: send blob data stored in HDFS directly to the client.
# 1: first get the URI from the 8a MPP, then fetch the blob data by URI, and finally send it to the client.
hive_blob_send_from_uri=1
1.3.2 Add the engine instance
install plugin hive soname 'ha_hive.so';
select create_engine_instance('hive','inst1','param://hive?hostlist=tdh01?port=10000?thriftversion=6?thrifttimeout=60?authmode=1?authtype=PLAIN?mechanism=PLAIN?user=ndty01?password=ndty01');
1.3.3 Set the RPC protocol
set global gbase_hdfs_protocol=RPC;
-- set global gbase_hdfs_namenodes=tdh02:8020,tdh03:8020;
set global gbase_hdfs_port=8020;
2 Kerberos Authentication Connection
2.1 Configure Transwarp TDH
2.1.1 Modify the Transwarp parameter (restart required)
Set hadoop_security.authentication.oauth2.enable to false.
2.1.2 Create the data-exchange directory and set a space quota
hdfs dfs -mkdir /tmp/gbase8up
hdfs dfs -chmod 777 /tmp/gbase8up
hdfs dfsadmin -setSpaceQuota 10G /tmp/gbase8up
2.1.3 Provide the authentication user and keytab file
Provide the user and keytab file required for the connection, and grant create database/table and insert/delete/update/select privileges.
2.2 Configure the operating system
2.2.1 Install the Transwarp client
The Transwarp client must be installed on every management node of the GBase UP cluster. Note: the client differs depending on whether Kerberos is enabled or not; download it from the Transwarp TDH management page.
① Install and configure the Java environment
② Configure the yum repository
③ Unpack the TDH client package and run the one-click install script as the root user
source init.sh
④ Initialize the keytab, put the command directly into the gbase user's environment variables (.bash_profile), then copy the keytab file to the same directory on every node
kinit -kt /opt/gbase/<keytab file> username
⑤ Add source ..../init.sh to the gbase user's environment variables (.bash_profile).
⑥ As the gbase user, verify that the beeline client can connect to Transwarp TDH, and check the create database/table privileges
su - gbase
beeline
!connect jdbc:hive2://tdh01:10000/default;principal=hive/tdh01@TDH
⑦ Verify hdfs read and write commands
hdfs dfs -ls /tmp
hdfs dfs -put 1.txt /tmp/gbase8up/
2.2.2 Configure gbase environment variables
① Create the directory
mkdir -p /home/gbase/tmp
② Add the environment variables
Add the following to /home/gbase/.bash_profile:
export KRB5CCNAME=/home/gbase/tmp/krb5cache
export KRB5CONF=/etc/krb5.conf
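The directory creation and the two exports above can be applied idempotently with a small script; a minimal sketch (the function name setup_krb_env is ours, paths as in this document):

```shell
# Sketch: create the credential-cache directory and append the two
# Kerberos exports to the gbase user's .bash_profile, but only once.
setup_krb_env() {
  home="$1"                          # e.g. /home/gbase
  profile="$home/.bash_profile"
  mkdir -p "$home/tmp"               # will hold the krb5cache file
  # Append the exports only if they are not already present.
  # The exported paths deliberately mirror the document, so they stay
  # /home/gbase/... even if the function is pointed elsewhere for testing.
  if ! grep -q 'KRB5CCNAME' "$profile" 2>/dev/null; then
    cat >> "$profile" <<'EOF'
export KRB5CCNAME=/home/gbase/tmp/krb5cache
export KRB5CONF=/etc/krb5.conf
EOF
  fi
}
# On the cluster: setup_krb_env /home/gbase
```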
2.2.3 Keytab file handling
① Run klist -kt /opt/hdfs.headless.keytab to obtain the principal name
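A small helper can pull the principal name out of that output; a sketch assuming the standard klist -kt layout (three header lines, principal in the last column):

```shell
# Sketch: extract the unique principal name(s) from `klist -kt` output.
# Assumption: the first three lines are headers and the principal is the
# last whitespace-separated field of each entry line.
extract_principals() {
  awk 'NR > 3 && NF > 0 { print $NF }' | sort -u
}
# On the cluster: klist -kt /opt/hdfs.headless.keytab | extract_principals
```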

2.3 Configure GBase UP
2.3.1 Modify the configuration file gbase_8a_gbase.cnf (cluster restart required)
[gbased]
........
gbase_hdfs_auth_mode=kerberos
gbase_hdfs_protocol=RPC
gbase_hdfs_keytab=/opt/ndty01.keytab
#gbase_hdfs_namenodes=tdh02:8020,tdh03:8020
_gbase_hdfs_rpcconfig='dfs.block.access.token.enable=true'
gbase_hdfs_port=8020
gbase_hdfs_principal=ndty01@TDH
2.3.2 Add the engine instance
install plugin hive soname 'ha_hive.so';
select create_engine_instance('hive','inst1','param://hive?hostlist=tdh01?port=10000?thriftversion=6?thrifttimeout=60?authmode=1?authtype=KERBEROS?mechanism=GSSAPI?principal=ndty01@TDH?keytabfile=/opt/ndty01.keytab?serverid=tdh01?protocol=hive?renewtime=7');
2.3.3 Modify the configuration file gbase_8a_gcluster.cnf (cluster restart required)
[gbased]
........
gbase_hdfs_auth_mode=kerberos
gbase_hdfs_protocol=RPC
gbase_hdfs_keytab=/opt/ndty01.keytab
#gbase_hdfs_namenodes=tdh02:8020,tdh03:8020
_gbase_hdfs_rpcconfig='dfs.block.access.token.enable=true'
gbase_hdfs_port=8020
gbase_hdfs_principal=ndty01@TDH
.......
[hive]
# Modify the following settings to match the actual environment.
hive_hdfs_user_name=ndty01 ### authentication user
# HA zookeeper connect string.
hive_zookeeper_host=tdh01:2181,tdh02:2181,tdh03:2181 ## zookeeper hosts
# HA namenode
hive_hdfs_cluster_name=nameservice1 ## Hadoop cluster name
# The NameNode's IP.
hive_hdfs_server_name=tdh03 ## NameNode IP; comma-separated if more than one (if errors occur, list only one)
# The port of the HDFS server.
hive_hdfs_server_port=8020 ## HDFS port
# The HBase Master's IP.
hive_hbase_server_name=n35
# Which table stores Blob data.
hive_hbase_table_name=HbaseStream
# The local directory that stores the BlobCache.
hive_blob_cache_path=/opt/gcluster/userdata/blob_cache
# Which HDFS directory stores Blob data.
hive_hdfs_blob_path=/blobstorage
# For data exchange, the minimum HDFS space must be >1G. Default is 10G; the value is in bytes.
hive_hdfs_tmpspace_limit=10G
# For data exchange, the minimum Hive space must be 1G-100G. Default is 10G; the value is in bytes.
hive_space_limit=10G
# For data exchange, the minimum gnode space must be >1G. Default is 10G; the value is in bytes.
hive_gcluster_space_limit=10G
# The maximum number of connections, range 1-64, default 5.
hive_max_connections=5
# The number of connections kept in the pool, range 0-64, default 3.
hive_max_connections_in_pool=2
# The number of retries when getting a connection fails, range 1-10, default 5.
hive_max_try_connect_times=5
# The Column Family name of the table that stores Blob data.
hive_hbase_col_family_name=file
# The port of the HBase server.
hive_hbase_server_port=9090
# The maximum number of connections to HBase, range 1-64, default 5.
hive_hbase_max_connections=5
# The maximum length of a single blob stored in HBase, range 2M-16M, default 16M; the value is in bytes.
# If a blob exceeds 16M, it is stored in HDFS instead.
hive_blob_max_size_in_hbase=16M
# The maximum length of a single blob stored in HDFS, range 1G-1T, default 16G; the value is in bytes.
hive_blob_max_size_in_hdfs=16G
# The maximum space the BlobCache may use on local disk, range 1G-16G, default 8G; the value is in bytes.
hive_blob_cache_space_limit=8G
# range: 0 or 1
# 0: send blob data stored in HDFS directly to the client.
# 1: first get the URI from the 8a MPP, then fetch the blob data by URI, and finally send it to the client.
hive_blob_send_from_uri=1
3 Notes
1. GBase UP does not support LDAP-authenticated connections; only no-authentication, user/password, and Kerberos connections are supported.
2. A table created by GBase UP directly in Transwarp (a text table) cannot be inserted into. First create an ORC table through beeline, then run create table if not exists through the gccli command line.
beeline
> create table if not exists t_hive(a int) clustered by (a) into 4 buckets stored as orc tblproperties ("transactional"="true");
gccli
> create table if not exists t_hive(a int)engine=hive.inst1;
3. During metadata synchronization, every column must be declared NOT NULL; gccli syntax does not support DEFAULT, and a column without NOT NULL is automatically treated as DEFAULT NULL.
4. gccli cannot create Hive tables in ORC/holodesk/parquet/CSV format; the syntax is rejected.
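As a hedged illustration of notes 2 and 3 together, the DDL pair might look like this (the table and column names t_demo, a, b are made up; the beeline side follows the ORC example above, the gccli side adds NOT NULL to every column):

```shell
# Illustrative only: DDL pair for a Hive-engine table.
# beeline side: transactional ORC table, as note 2 requires.
DDL_BEELINE='create table if not exists t_demo(a int, b varchar(20)) clustered by (a) into 4 buckets stored as orc tblproperties ("transactional"="true")'
# gccli side: every column carries NOT NULL (note 3); a column without it
# would be treated as DEFAULT NULL, and DEFAULT itself is rejected.
DDL_GCCLI='create table if not exists t_demo(a int not null, b varchar(20) not null) engine=hive.inst1'
# On the cluster:
#   beeline -u "jdbc:hive2://tdh01:10000" -e "$DDL_BEELINE"
#   then run $DDL_GCCLI from the gccli prompt.
echo "$DDL_GCCLI"
```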




