Purpose
Document the steps for installing Impyla on CDH5.
Cluster environment
CDH5.16.2
anaconda3
python3.7
Component overview
Impyla: a Python client for HiveServer2 implementations (e.g. Impala, Hive) for distributed query engines.
Impyla dependencies
six
bitarray
thriftpy
thrift_sasl
sasl
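To see which of these packages are already present in the interpreter you plan to use, a small check along the following lines may help. This is a minimal sketch: the helper name `missing_modules` is hypothetical, and the import names (in particular `bitarray` for the bit-array dependency) are assumptions about how the installed packages expose themselves.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names assumed for the dependencies listed above.
    deps = ["six", "bitarray", "thriftpy", "thrift_sasl", "sasl"]
    print("missing:", missing_modules(deps))
```

Run it with the same interpreter that pip installs into (here, the anaconda3 Python), otherwise the result may describe a different environment.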
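Install dependencies
Before installing thrift_sasl, run the following command first; otherwise the installation fails with an error about a missing sasl.h file.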
yum install gcc-c++ python-devel.x86_64 cyrus-sasl-devel.x86_64
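Whether the header is now available can be checked before moving on to pip. The sketch below assumes that cyrus-sasl-devel places the header at `sasl/sasl.h` under a standard include directory; the helper `find_header` and the default search directories are illustrative and not part of any library.

```python
import os


def find_header(relative_path, include_dirs=("/usr/include", "/usr/local/include")):
    """Return the first existing path to `relative_path` under `include_dirs`, or None."""
    for base in include_dirs:
        candidate = os.path.join(base, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Assumed location of the header installed by cyrus-sasl-devel.
    found = find_header("sasl/sasl.h")
    print(found or "sasl.h not found; install cyrus-sasl-devel first")
```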
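Install the remaining dependencies
pip install bitarray  # PyPI name of the bit_array dependency
pip install thriftpy
pip install six
# Pin thrift_sasl to 0.2.1; otherwise connecting to Hive fails
pip install thrift_sasl==0.2.1
pip install sasl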
Install Impyla
The latest impyla release does not support Python 3.7, so the version must be pinned to 0.15a1.
/usr/local/anaconda3/bin/pip install impyla==0.15a1
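After the install, it can be worth confirming that the pinned version is the one the interpreter actually sees. The following is a minimal sketch; `installed_version` is a hypothetical helper. It uses `importlib.metadata` where available (Python 3.8 and later) and falls back to `pkg_resources` from setuptools on Python 3.7.

```python
def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        import pkg_resources  # fallback for Python 3.7 (ships with setuptools)
        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints None if impyla is not installed for this interpreter.
    print("impyla:", installed_version("impyla"))
```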

Testing Impyla
Impala
You need the IP address and port that Impala exposes for JDBC connections.
from impala.dbapi import connect

# Connect to Impala (replace host and port with your own values)
conn = connect(host='192.168.xx.xx', port=25004)
print(conn)

cursor = conn.cursor()

# List all databases
cursor.execute('show databases')
results = cursor.fetchall()
print(results)

# Fetch up to 10 distinct ids from a test table
cursor.execute('SELECT distinct id FROM ods.test limit 10')
series_code = cursor.fetchall()
print(series_code)
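The Impala test script leaves the connection and cursor open when it finishes. One way to make sure both are closed, even if a query raises, is to wrap them with `contextlib.closing`. This is a sketch rather than the original author's method: `run_query` is a hypothetical helper, and it takes the connect function as an argument so that it works with any DB-API style `connect`, including `impala.dbapi.connect`.

```python
from contextlib import closing


def run_query(connect_fn, sql, **connect_kwargs):
    """Open a connection, run one statement, return all rows, and close both handles."""
    with closing(connect_fn(**connect_kwargs)) as conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute(sql)
            return cursor.fetchall()


# Example usage with impyla (requires a reachable Impala endpoint):
# from impala.dbapi import connect
# print(run_query(connect, "show databases", host="192.168.xx.xx", port=25004))
```

Passing `connect_fn` in, instead of importing it inside the helper, also makes the function easy to exercise with a stub when no cluster is available.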

Hive
from impala.dbapi import connect

# Connect to HiveServer2 using the PLAIN authentication mechanism
conn = connect(host="192.168.xx.xx", port=25005, database="ods", auth_mechanism="PLAIN")
print(conn)

cursor = conn.cursor()

# List all databases and show the result-set metadata
cursor.execute("show databases")
print(cursor.description)
results = cursor.fetchall()
print(results)

# Fetch the distinct series codes from a test table
cursor.execute("select distinct series_code from ods.test")
print(cursor)
series_code = cursor.fetchall()
print(series_code)
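The Hive script prints `cursor.description` and the fetched rows separately. Under the Python DB-API convention, each entry of `description` is a sequence whose first item is the column name, so the two can be combined into dictionaries keyed by column. The helper `rows_to_dicts` below is hypothetical, and the sample description and rows are made up for illustration.

```python
def rows_to_dicts(description, rows):
    """Pair each row with the column names taken from a DB-API cursor description."""
    columns = [col[0] for col in description]
    return [dict(zip(columns, row)) for row in rows]


if __name__ == "__main__":
    # Made-up metadata and rows, shaped like what a DB-API cursor returns.
    description = [("series_code", "STRING", None, None, None, None, None)]
    rows = [("A01",), ("B02",)]
    print(rows_to_dicts(description, rows))
```

With a live cursor the call would be `rows_to_dicts(cursor.description, cursor.fetchall())`, made after `cursor.execute(...)`.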

Article reposted from Eights做数据.




