暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Hive 实战

大数据架构之道 2016-08-15
384

从这个镜像下载在国内来说速度还不错

下载:http://mirror.bit.edu.cn/apache/hive/stable-2/ 

apache-hive-2.1.0-bin.tar.gz

上传并解压:

$ tar zxvf apache-hive-2.1.0-bin.tar.gz  -C home/hadoop/hadoop-2.7.2/

$ cd home/hadoop/hadoop-2.7.2/apache-hive-2.1.0-bin


修改环境变量

$ vi ~/.bash_profile

-------------------

添加如下信息:

# hive

export HIVE_HOME=/home/hadoop/hadoop-2.7.2/apache-hive-2.1.0-bin

export PATH=$PATH:$HIVE_HOME/bin


使之生效:

$ source ~/.bash_profile



配置环境信息:

$ cd conf/

$ cp hive-default.xml.template hive-site.xml

$ cp hive-env.sh.template hive-env.sh

$ chmod 777 hive-site.xml  hive-env.sh


安装方式一:

使用Derby数据库安装,元数据将存储在Derby数据库上,缺点是只能一个实例登陆.

$ vi hive-env.sh

修改如下:

----------------------------------------

HADOOP_HOME=/home/hadoop/hadoop-2.7.2

export HIVE_CONF_DIR=/home/hadoop/hadoop-2.7.2/apache-hive-2.1.0-bin/conf

----------------------------------------


$ vi hive-site.xml

修改如下:

----------------------------------------

<?xml version="1.0" encoding="UTF-8" standalone="no"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?> 

<configuration> 

  <property>

    <name>javax.jdo.option.ConnectionURL</name>

    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>

    <description>

      JDBC connect string for a JDBC metastore.

      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.

      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.

    </description>

  </property>

   <property>

    <name>javax.jdo.option.ConnectionDriverName</name>

    <value>org.apache.derby.jdbc.EmbeddedDriver</value>

    <description>Driver class name for a JDBC metastore</description>

  </property>

   <property>

    <name>javax.jdo.option.ConnectionUserName</name>

    <value>APP</value>

    <description>Username to use against metastore database</description>

  </property>

  <property>

    <name>javax.jdo.option.ConnectionPassword</name>

    <value>mine</value>

    <description>password to use against metastore database</description>

  </property>

</configuration>

----------------------------------------

初始化数据库

$ schematool -initSchema -dbType derby

检查初始化结果

$ schematool -info -dbType derby 


启动hive

$ hive


建表测试:

hive> create table test(key string);

OK

Time taken: 1.789 seconds

hive> show tables;

OK

test

Time taken: 0.138 seconds, Fetched: 1 row(s)



经验:

$ hive

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.7.2/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]


Logging initialized using configuration in jar:file:/home/hadoop/hadoop-2.7.2/apache-hive-2.1.0-bin/lib/hive-common-2.1.0.jar!/hive-log4j2.properties Async: true

Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, dont forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql))

at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:578)

at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:518)

at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705)

解决方案:

提示没有初始化数据库,使用下面命令初始化

schematool -initSchema -dbType derby

继续报错:

$ schematool -initSchema -dbType derby

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.7.2/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true

Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver

Metastore connection User: APP

Starting metastore schema initialization to 2.1.0

Initialization script hive-schema-2.1.0.derby.sql

Error: FUNCTION 'NUCLEUS_ASCII' already exists. (state=X0Y68,code=30000)

org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!

Underlying cause: java.io.IOException : Schema script failed, errorcode 2

Use --verbose for detailed stacktrace.

*** schemaTool failed ***

提示已经存在FUNCTION,那么找到文件hive-schema-2.1.0.derby.sql,注释这个方法即可

文件位置:

/home/hadoop/hadoop-2.7.2/apache-hive-2.1.0-bin/scripts/metastore/upgrade/derby/hive-schema-2.1.0.derby.sql



安装方式二:

使用mysql数据库安装,元数据将存储在mysql数据库上,生产部署一般是以 mysql 或 MariaDB.

参考:mysql在linux安装步骤

在mysql中创建用户hive/hive,数据库:hive

$ mysql -u root -p

Enter password: <<root>>


--创建hive用户,密码:hive

mysql> insert into mysql.user(Host,User,password) values('localhost','hive',password('hive'));

mysql> flush privileges;


--创建数据库

mysql> create database hive_db;


--授权

mysql> grant all on hive_db.* to hive@localhost identified by 'hive';

mysql> flush privileges;


--登陆验证

$ mysql -u hive -p

Enter password: <<hive>>

mysql> use hive_db;

Database changed

mysql> show tables;

Empty set (0.00 sec)




$ vi hive-env.sh

修改如下:

----------------------------------------

HADOOP_HOME=/home/hadoop/hadoop-2.7.2

export HIVE_CONF_DIR=/home/hadoop/hadoop-2.7.2/apache-hive-2.1.0-bin/conf

----------------------------------------

$ vi hive-site.xml

修改如下:

----------------------------------------

<?xml version="1.0" encoding="UTF-8" standalone="no"?>

<?xml-stylesheet

type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>

<name>javax.jdo.option.ConnectionURL</name>

<value>jdbc:mysql://localhost:3306/hive_db</value>

<description>

JDBC connect string for a JDBC metastore.

To use SSL

to encrypt/authenticate the connection, provide database-specific SSL flag in

the connection URL.

For example,

jdbc:postgresql://myhost/db?ssl=true for postgres database.

</description>

</property>

<property>

<name>javax.jdo.option.ConnectionDriverName</name>

<value>com.mysql.jdbc.Driver</value>

<description>Driver class name for a JDBC

metastore</description>

</property>

<property>

<name>javax.jdo.option.ConnectionUserName</name>

<value>hive</value>

<description>Username to use against metastore database</description>

</property>

<property>

<name>javax.jdo.option.ConnectionPassword</name>

<value>hive</value>

<description>password to use against metastore

database</description>

</property>

<property>

<name>hive.hwi.listen.port</name>

<value>9999</value>

<description>This is

the port the Hive Web Interface will listen on </description>

</property>

<property>

<name>datanucleus.autoCreateSchema</name>

<value>false</value>

</property>

<property>

<name>datanucleus.fixedDatastore</name>

<value>true

</value>

</property>

<property>

<name>hive.metastore.local</name>

<value>true</value>

<description>controls whether to connect to remove

metastore server or open a new metastore server in Hive Client JVM

</description>

</property>

</configuration>


----------------------------------------


下载JDBC驱动jar包:

http://mirrors.sohu.com/mysql/Connector-J/mysql-connector-java-5.1.39.tar.gz  

# cp home/mysql/mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar home/hadoop/hadoop-2.7.2/apache-hive-2.1.0-bin/lib

# chown hadoop:hadoop home/hadoop/hadoop-2.7.2/apache-hive-2.1.0-bin/lib/mysql-connector-java-5.1.39-bin.jar


初始化数据库

$ schematool -initSchema -dbType mysql

检查初始化结果

$ schematool -info -dbType mysql 


启动hive

$ hive


此时,可以登陆mysql查看,应该有几十个表已经创建

$ mysql -u hive -p

Enter password: <<hive>>

mysql> use hive_db;

Database changed

mysql> show tables;

...

57 rows in set (0.00 sec)


建表测试:

hive> create table test(key string);

OK

Time taken: 1.789 seconds

hive> show tables;

OK

test

Time taken: 0.138 seconds, Fetched: 1 row(s)


此时,在HDFS文件系统也可以看到 test 表存储的文件

$ hdfs dfs -ls /user/hive/warehouse

Found 1 items

drwxr-xr-x   - hadoop supergroup          0 2016-08-07 15:54 /user/hive/warehouse/test

也可以在mysql对应的元数据表中看到:

mysql> select tbl_name,owner from TBLS;

+----------+--------+

| tbl_name | owner  |

+----------+--------+

| test     | hadoop |

+----------+--------+

1 row in set (0.00 sec)



经验:

$ schematool -initSchema -dbType mysql

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.7.2/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Metastore connection URL: jdbc:mysql://localhost:3306/hive

Metastore Connection Driver : com.mysql.jdbc.Driver

Metastore connection User: hive

org.apache.hadoop.hive.metastore.HiveMetaException: Failed to load driver

Underlying cause: java.lang.ClassNotFoundException : com.mysql.jdbc.Driver

Use --verbose for detailed stacktrace.

*** schemaTool failed ***

原因是缺少mysql的jdbc.jar包,将 mysql-connector-java-5.1.39-bin.jar 文件复制到 hive 的  lib 目录即可

下载jar包:http://mirrors.sohu.com/mysql/Connector-J/mysql-connector-java-5.1.39.tar.gz  

# cp /home/mysql/mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar /home/hadoop/hadoop-2.7.2/apache-hive-2.1.0-bin/lib

# chown hadoop:hadoop /home/hadoop/hadoop-2.7.2/apache-hive-2.1.0-bin/lib/mysql-connector-java-5.1.39-bin.jar



经验:

$ schematool -initSchema -dbType mysql

ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.

/home/hadoop/hadoop-2.7.2/hbase-1.2.2/bin/hbase: line 412: /usr/java/jdk1.7.0_79/bin/java: No such file or directory

/home/hadoop/hadoop-2.7.2/hbase-1.2.2/bin/hbase: line 412: exec: /usr/java/jdk1.7.0_79/bin/java: cannot execute: No such file or directory

Error creating temp dir in java.io.tmpdir /tmp due to Permission denied

原因是

1.检查tmp的所有者有问题:drwxr-xr-x.   2 mysql root  4096 Aug 14 08:48 tmp,使用 chown -R root:root /tmp ;chmod -R 777 /tmp;修改

2.检查没有启动hadoop,使用start-all.sh启动即可


文章转载自大数据架构之道,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论