Setting Up CDH 6.3.3

爱学习de小馋猫 2020-03-29
It's been a while since my last update.


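I recently joined a big data platform project, so I have been studying the related material. This post covers setting up a CDH environment. For privacy reasons the real IP addresses and related details have been changed. Even if you only want a test setup, I recommend using at least three machines.


Below is a rough summary of the setup process and the pitfalls I ran into, for reference.


PS: I have actually built this several times. The first time was a fully manual install, repeating the installation on every node; with only three nodes that was manageable. Then someone asked what I would do with 100 nodes, since building it that way would be exhausting. So I rebuilt it with 6.3.2 following the official guide. Then I was told production runs 6.3.3, so, fine, I did it once more using a company's setup document as a reference.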


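Overview: CDH is essentially a distribution of Hadoop, and it is widely used commercially.

The Cloudera distribution (Cloudera's Distribution Including Apache Hadoop, "CDH" for short) provides a web-based user interface and supports most Hadoop components, including HDFS, MapReduce, Hive, Pig, HBase, ZooKeeper, and Sqoop. It makes a big data platform easier to install and use.

Official documentation: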

https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/installation.html


Prepare three machines:

192.168.247.142 ORACLE-TEST2
192.168.247.143 ORACLE-TEST3
192.168.247.144 ORACLE-TEST4


Operating system: Red Hat 7.4


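Installation packages (the official site currently offers only CM 6.3.1 and the 6.3.2 parcel; an older version can be installed, and CM and the parcel should preferably be the same version):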


jdk-8u231-linux-x64.tar.gz

mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar

cm6.3.3-redhat7.tar.gz

CDH-6.3.3-1.cdh6.3.3.p0.1796617-el7.parcel

manifest.json


1. Disable the firewall and SELinux

systemctl stop firewalld.service

systemctl disable firewalld.service

echo "SELINUX=disabled" > /etc/sysconfig/selinux


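2. Configure NTP

NTP synchronization uses an NTP server inside the cluster: the management node acts as the NTP server, and the other nodes synchronize their clocks with it.

First, install the NTP package on all nodes: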

yum -y install ntp

Start the NTP service, check its status, and enable it at boot:

service ntpd restart

systemctl enable ntpd.service
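
The post does not show the contents of /etc/ntp.conf. The following is a minimal sketch of what the two roles might look like, assuming the management node is 192.168.247.142 and that it falls back to its own local clock. The helper name render_ntp_conf, the subnet, and the choice of local-clock fallback are assumptions, not taken from the original setup. The function only prints to stdout so the result can be reviewed before it is installed.

```shell
#!/bin/sh
# Print a minimal /etc/ntp.conf for either role. render_ntp_conf is a
# hypothetical helper; review the output before installing it.
render_ntp_conf() {
    role=$1        # "server" for the management node, "client" otherwise
    ntp_server=$2  # address of the management node
    subnet=$3      # cluster subnet allowed to query the server
    netmask=$4
    echo "driftfile /var/lib/ntp/drift"
    if [ "$role" = "server" ]; then
        # Fall back to the local clock so the cluster stays consistent
        # even without an upstream time source (assumption).
        echo "server 127.127.1.0"
        echo "fudge 127.127.1.0 stratum 10"
        echo "restrict $subnet mask $netmask nomodify notrap"
    else
        # Clients take their time from the management node only.
        echo "server $ntp_server iburst"
    fi
}

render_ntp_conf server 192.168.247.142 192.168.247.0 255.255.255.0
render_ntp_conf client 192.168.247.142 192.168.247.0 255.255.255.0
```

After restarting ntpd with the new file, running ntpq -p on a client should list the management node as a peer.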


3. Configure passwordless SSH (mutual trust)

ssh-keygen -N ""
ssh-copy-id 192.168.247.143
ssh-copy-id 192.168.247.144

Each of the other machines must also set up trust with the other two.
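
Since every node has to repeat the same commands for its two peers, the list can be generated rather than typed. This is only a sketch: peers_for is a hypothetical helper, and it prints the commands instead of running them so that they can be checked first.

```shell
#!/bin/sh
# Print the ssh-copy-id commands one node needs to run so that it can
# log in to every other node. peers_for is a hypothetical helper.
NODES="192.168.247.142 192.168.247.143 192.168.247.144"

peers_for() {
    self=$1
    for node in $NODES; do
        # Skip the node we are running on.
        if [ "$node" != "$self" ]; then
            echo "ssh-copy-id $node"
        fi
    done
}

peers_for 192.168.247.142
```

On each node, run ssh-keygen -N "" once and then execute the printed commands.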

If the root user cannot log in remotely, that needs to be configured as well.

Run the following on the other machines:

[root@ORACLE-TEST3 ~]# who

admin pts/0 2020-03-24 13:42 (10.141.24.13)

vi /etc/securetty

Add the following:

pts/1
pts/2
pts/3


4. Install MySQL on 192.168.247.142 and 192.168.247.143

Installing from the RPM packages is recommended. The installation may require the Perl JSON package:

yum install -y perl-JSON


Create the databases and users, and grant privileges:

create database metastore default character set utf8;
CREATE USER 'hive'@'%' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%';

create database rman default character set utf8;
create user 'rman'@'%' identified by 'rman';
grant all privileges on rman.* to 'rman'@'%';

create database sentry default character set utf8;
create user 'sentry'@'%' identified by 'sentry';
grant all privileges on sentry.* to 'sentry'@'%';

create database nav default character set utf8;
create user 'nav'@'%' identified by 'nav';
grant all privileges on nav.* to 'nav'@'%';

create database navms default character set utf8;
create user 'navms'@'%' identified by 'navms';
grant all privileges on navms.* to 'navms'@'%';

create database cm default character set utf8;
create user 'cm'@'%' identified by 'cm';
grant all privileges on cm.* to 'cm'@'%';

create database scm default character set utf8;
create user 'scm'@'%' identified by 'scm';
grant all privileges on scm.* to 'scm'@'%';

create database oozie default character set utf8;
create user 'oozie'@'%' identified by 'oozie';
grant all privileges on oozie.* to 'oozie'@'%';

create database hue default character set utf8;
create user 'hue'@'%' identified by 'hue';
grant all privileges on hue.* to 'hue'@'%';

FLUSH PRIVILEGES;
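
The statements above repeat the same three lines per service, so they can also be generated. The sketch below assumes, as the statements do, that each password equals the user name, which is only suitable for a test cluster. gen_service_sql is a hypothetical helper. It also emits a user and a grant for scm, on the assumption that the scm user passed to scm_prepare_database.sh in step 8 needs full privileges on the scm database.

```shell
#!/bin/sh
# Emit CREATE DATABASE / CREATE USER / GRANT statements for one service.
# gen_service_sql is a hypothetical helper; the password defaults to the
# user name, which is only fit for a test cluster.
gen_service_sql() {
    db=$1
    user=${2:-$1}
    pass=${3:-$user}
    echo "create database $db default character set utf8;"
    echo "create user '$user'@'%' identified by '$pass';"
    echo "grant all privileges on $db.* to '$user'@'%';"
}

# The Hive metastore database uses a differently named user.
gen_service_sql metastore hive
for name in rman sentry nav navms cm scm oozie hue; do
    gen_service_sql "$name"
done
echo "FLUSH PRIVILEGES;"
```

The output can be saved to a file and reviewed, then fed to the mysql client as root.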


4. Configure the MySQL Connector

Install the JDK and set up its environment variables, then install the connector:

yum install mysql-connector-java
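
The JDK step is not spelled out in the post. As a rough sketch, assuming the jdk-8u231-linux-x64.tar.gz tarball from the package list is unpacked under /usr/java (an assumed location), the environment variables could be set through a profile script. render_java_profile is a hypothetical helper that only prints the lines.

```shell
#!/bin/sh
# Print the profile lines that point JAVA_HOME at an unpacked JDK.
# render_java_profile and the /usr/java location are assumptions.
render_java_profile() {
    java_home=$1
    echo "export JAVA_HOME=$java_home"
    # Single quotes keep $JAVA_HOME and $PATH unexpanded in the output.
    echo 'export PATH=$JAVA_HOME/bin:$PATH'
}

# Typical use (not run here), as root:
#   mkdir -p /usr/java
#   tar -xzf jdk-8u231-linux-x64.tar.gz -C /usr/java
#   render_java_profile /usr/java/jdk1.8.0_231 > /etc/profile.d/java.sh
render_java_profile /usr/java/jdk1.8.0_231
```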


5. Install httpd

yum install httpd

systemctl start httpd

systemctl enable httpd.service

6. Put the extracted contents of cm6.3.3-redhat7.tar.gz under the first path below, and the parcel and manifest.json under the second

/var/www/html

/var/www/html/cdh6/parcels/6.3.3/

cd /var/www/html/cdh6/parcels/6.3.3/

Compute the hash:

sha1sum CDH-6.3.3-1.cdh6.3.3.p0.1796617-el7.parcel

204dc478dbadd93413ff5dc2cd17ae9969ca4b5c  CDH-6.3.3-1.cdh6.3.3.p0.1796617-el7.parcel

This is the expected result. If the hash is not 204dc478dbadd93413ff5dc2cd17ae9969ca4b5c, the file may be corrupted, which will break the later installation steps.

vi CDH-6.3.3-1.cdh6.3.3.p0.1796617-el7.parcel.sha

Write the hash into it:

204dc478dbadd93413ff5dc2cd17ae9969ca4b5c

Note: the hash differs for each version. The file CDH-6.3.3-1.cdh6.3.3.p0.1796617-el7.parcel.sha contains the hash of CDH-6.3.3-1.cdh6.3.3.p0.1796617-el7.parcel, and that hash must match the one listed for the same parcel in manifest.json.
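
The two checks described here can be combined into one step. The sketch below writes the .sha file from the computed SHA-1 and then looks for the same value in manifest.json. write_parcel_sha is a hypothetical helper, and the check is deliberately loose: it only confirms that the hash string appears somewhere in the manifest, not that it belongs to this particular parcel entry.

```shell
#!/bin/sh
# Write <parcel>.sha from the parcel's SHA-1 and confirm that the same
# value appears in manifest.json. write_parcel_sha is a hypothetical helper.
write_parcel_sha() {
    parcel=$1
    manifest=$2
    # sha1sum prints "<hash>  <file>"; keep only the hash.
    hash=$(sha1sum "$parcel" | awk '{print $1}')
    echo "$hash" > "$parcel.sha"
    if grep -q "$hash" "$manifest"; then
        echo "OK: $hash found in $manifest"
    else
        echo "MISMATCH: $hash not found in $manifest" >&2
        return 1
    fi
}

# Example (not run here):
#   cd /var/www/html/cdh6/parcels/6.3.3/
#   write_parcel_sha CDH-6.3.3-1.cdh6.3.3.p0.1796617-el7.parcel manifest.json
```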


7. Configure the yum repository

cd /etc/yum.repos.d/
vi cdh.repo

Write the following into it:

[cdh]
name=cdh
baseurl=http://192.168.247.142/cm6.3.3
enabled=1
gpgcheck=0
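
Because the same repo file is needed on every node, it may be easier to generate it than to edit it by hand. This is a sketch only: render_cm_repo is a hypothetical helper, and it uses enabled=1, which is the key name yum reads.

```shell
#!/bin/sh
# Print a yum .repo definition for the local Cloudera Manager mirror.
# render_cm_repo is a hypothetical helper.
render_cm_repo() {
    repo_host=$1   # host running httpd
    repo_path=$2   # directory under /var/www/html
    cat <<EOF
[cdh]
name=cdh
baseurl=http://$repo_host/$repo_path
enabled=1
gpgcheck=0
EOF
}

# Example (not run here), as root on each node:
#   render_cm_repo 192.168.247.142 cm6.3.3 > /etc/yum.repos.d/cdh.repo
#   yum clean all && yum repolist
render_cm_repo 192.168.247.142 cm6.3.3
```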


8. Install the services

yum install cloudera-manager-daemons cloudera-manager-server

Initialize the database:

/opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h 192.168.247.142 --scm-host 192.168.247.142 scm scm scm

Start the service:

service cloudera-scm-server start

View the log:

vi /var/log/cloudera-scm-server/cloudera-scm-server.log

Check the running status:

systemctl status cloudera-scm-server.service -l
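
The server can take a while to start listening after the service is started, so it can help to poll the web console port (7180, used in the next step) instead of refreshing the browser. The following is a sketch under the assumption that bash and the coreutils timeout command are available; wait_for_port is a hypothetical helper, not part of Cloudera Manager.

```shell
#!/bin/sh
# Poll a TCP port until it accepts connections or the attempts run out.
# wait_for_port is a hypothetical helper, not part of Cloudera Manager.
wait_for_port() {
    host=$1
    port=$2
    attempts=${3:-60}
    delay=${4:-5}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # bash's /dev/tcp pseudo-device opens a TCP connection; timeout
        # guards against hosts that silently drop packets.
        if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            echo "port $port on $host is open"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "port $port on $host did not open after $attempts attempts" >&2
    return 1
}

# Example (not run here): wait_for_port 192.168.247.142 7180 60 5
```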


9. Open http://192.168.247.142:7180 in a browser

Follow the prompts to complete the installation.

The default username and password are both admin.


Errors you may encounter:

Error 1:

JAVA_HOME=/usr/lib/jvm/jre-openjdk
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/lib/jvm/jre-openjdk/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
[ main] DbCommandExecutor ERROR Unable to create/drop a table.
java.sql.SQLException: Statement violates GTID consistency: CREATE TABLE ... SELECT.


Solution: turn off GTID mode. Edit the MySQL configuration file, set the two options below, and restart MySQL:

vi /etc/my.cnf

gtid_mode=OFF

enforce_gtid_consistency=OFF


Error 2:

JAVA_HOME=/usr/lib/jvm/jre-openjdk
Verifying that we can write to /etc/cloudera-scm-server
[ main] DbProvisioner ERROR Exception when creating/dropping database with user 'scm' and jdbc url 'jdbc:mysql://192.168.247.141/?useUnicode=true&characterEncoding=UTF-8'
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Access denied for user 'scm'@'192.168.247.141' to database 'scm'


Solution:

drop user scm@'%';

drop user scm@'192.168.247.141';

update mysql.user set host = '%' where user = 'root';

flush privileges;

Then run the initialization as the root user:

/opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h 192.168.247.141 -uroot -proot --scm-host 192.168.247.141 scm scm scm


Error 3:

ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest

Fix:

This happens when manifest.json has not been created or when the hash value is wrong.


Error 4:

org.apache.hadoop.hive.metastore.HiveMetaException: Failed to retrieve schema tables from Hive Metastore DB,Not supported

The cause is a missing JDBC driver. Download one: mysql-connector-java-5.1.25.zip

Extract it under /tmp, then copy the jar:

cp /tmp/mysql-connector-java-5.1.25/mysql-connector-java-5.1.25-bin.jar /usr/share/java

cp /usr/share/java/mysql-connector-java-5.1.25-bin.jar /opt/cloudera/parcels/CDH/lib/hive/lib


Other issues:

1. After the setup was finished, the /tmp directory on 143 kept filling up with many files like this one:

mgmt_mgmt-NAVIGATOR-c77e3c9ed97bae1760d495254cb7a0c1_pid2864.hprof

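Attempted fix in the web console: set "Java Heap Size of Navigator Metadata Server in Bytes" to 1-2 GB.

This did not resolve the problem, so I wrote a scheduled job that cleans the files up periodically.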
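
The cleanup job itself is not shown in the post. A sketch of what it might look like follows, assuming GNU find and that heap dumps older than an hour are safe to delete. clean_hprof, the one-hour threshold, and the script path in the crontab comment are all assumptions.

```shell
#!/bin/sh
# Delete heap dumps older than a given age from a directory.
# clean_hprof is a hypothetical helper; the age threshold is an assumption.
clean_hprof() {
    dir=$1
    max_age_min=${2:-60}
    # -maxdepth 1 keeps the search out of subdirectories;
    # -mmin +N matches files last modified more than N minutes ago.
    find "$dir" -maxdepth 1 -type f -name '*.hprof' \
        -mmin +"$max_age_min" -print -delete
}

# Example crontab entry (not installed here), assuming a script that wraps
# this function is saved at the hypothetical path /usr/local/bin/clean_hprof.sh:
#   0 * * * * /usr/local/bin/clean_hprof.sh /tmp 60
```

Deleting the dumps only treats the symptom; the .hprof files indicate that the Navigator process is still running out of heap.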


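2. Service Monitor runs out of memory

Change the following parameters in the web console, using the suggested values shown there, typically 1 GB (you can find them through the search box):

Java Heap Size of Host Monitor in Bytes

Java Heap Size of Service Monitor in Bytes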


The end.