
Steps to Set Up a Fully Distributed Hadoop Cluster

大数据架构之道 2016-07-12

OS version (cat /proc/version):

Linux version 2.6.18-371.el5 

jdk1.7

Servers: redhat1, redhat2, redhat3

Corresponding IPs: 192.168.19.251, 192.168.19.252, 192.168.19.253

Hostnames: h1, h2, h3


Steps to change the IP address on redhat1

# vi /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0 

BOOTPROTO=static 

IPADDR=192.168.19.251 

NETMASK=255.255.255.0 

GATEWAY=192.168.19.1 

ONBOOT=yes



Restart the network service so the configuration takes effect

# /etc/init.d/network restart

or

# service network restart


Check that the IP was set successfully

# ifconfig

eth0  ...

      inet addr:192.168.19.251 ...

      ...

Check the JDK version; it must be 1.6 or later

[root@localhost ~]# java -version

java version "1.7.0_45"

Java(TM) SE Runtime Environment (build 1.7.0_45-b18)

Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)


Locate JAVA_HOME; it is needed in later configuration steps

JAVA_HOME=/usr/java/jdk1.7.0_45
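If the JDK location is not obvious, one way to find it is to resolve the java binary on the PATH (a rough sketch; the exact path depends on your installation):

readlink -f $(which java)
# e.g. /usr/java/jdk1.7.0_45/jre/bin/java  ->  JAVA_HOME is /usr/java/jdk1.7.0_45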


Create the default Hadoop root directory and set its permissions

[root@h1 ~]# mkdir hadoop

[root@h1 ~]# chmod 777 hadoop


Create the user and group (hadoop) and their home directory

[root@localhost hadoop_tools]# groupadd hadoop

[root@localhost hadoop_tools]# useradd -g hadoop hadoop

[root@localhost hadoop_tools]# passwd hadoop
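To confirm the account and group were created (a quick check; the numeric IDs will vary):

id hadoop
# expected output similar to: uid=500(hadoop) gid=500(hadoop) groups=500(hadoop)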


Change the hostnames, mainly so that when the SSH keys are generated it is clear which key belongs to which server

[hadoop@localhost ~]$ su  - root

Password: 

Change 1:

[root@localhost ~]# vi /etc/sysconfig/network

NETWORKING=yes

NETWORKING_IPV6=yes

# change the hostname; hosts are named in order of their IPs

HOSTNAME=h1


:wq to save and exit
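The HOSTNAME setting in /etc/sysconfig/network takes effect at the next boot; to apply the new name immediately without rebooting, it can also be set at runtime (a sketch, run as root; use h2/h3 on the other servers):

hostname h1     # set the running hostname
hostname        # verify; should now print h1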


Change 2: the hostname here must match the one set in Change 1, otherwise errors will occur. Note that every server must be modified in the same way.

Otherwise, SSH connections may fail with errors such as: ssh: NODE_166: Temporary failure in name resolution

[root@hadoop_251 ~]# vi /etc/hosts

127.0.0.1         localhost.localdomain  localhost

192.168.19.251  h1

192.168.19.252  h2

192.168.19.253  h3

:wq to save and exit
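After /etc/hosts has been edited on every node, a quick way to confirm that the aliases resolve (run on each server):

ping -c 1 h1
ping -c 1 h2
ping -c 1 h3
# each should resolve to the 192.168.19.x address configured above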




Disable the firewall, because the cluster nodes need to transfer data to each other; this must be done on every node

1. First check the firewall status:

[root@h1 ~]# service iptables status


2. There are two methods; use the first (a combined example follows this list):

Method 1: permanent, survives a reboot

Enable: chkconfig iptables on

Disable: chkconfig iptables off

Method 2: takes effect immediately but reverts after a reboot

Enable: service iptables start

Disable: service iptables stop
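In practice the two are often combined: stop the firewall now and keep it disabled across reboots (a sketch, run as root on every node):

service iptables stop       # takes effect immediately
chkconfig iptables off      # persists across reboots
service iptables status     # confirm the firewall is stopped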

 

Reboot the machine


Log in as the hadoop user

[root@localhost ~]# su - hadoop

[hadoop@localhost ~]$ 


Configure SSH trust so the nodes can log in to each other without passwords

Distribute the public keys among the nodes:

A→B,C;B→A,C;C→A,B


Run the following command to generate a key pair; just press Enter at every prompt

[hadoop@localhost ~]$ ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 

Created directory '/home/hadoop/.ssh'.

Enter passphrase (empty for no passphrase): 

Enter same passphrase again: 

Your identification has been saved in /home/hadoop/.ssh/id_rsa.

Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.

The key fingerprint is:

be:77:a5:e4:98:f7:10:0f:83:d2:f6:b0:de:ef:98:dc hadoop@hadoop_251



Check that the keys were generated successfully; if so, there will be a .ssh directory with permissions 700

[hadoop@h1 ~]$ ls -ltra ~/.ssh/id_rsa*

-rw-r--r-- 1 hadoop hadoop  391 Jun 28 18:44 /home/hadoop/.ssh/id_rsa.pub

-rw------- 1 hadoop hadoop 1675 Jun 28 18:44 /home/hadoop/.ssh/id_rsa

Note: the file ~/.ssh/authorized_keys must have permissions 600, the ~/.ssh/ directory 700, and the user's home directory 700; otherwise the trust will not work.
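If the trust does not work, the permissions described above can be set explicitly (a minimal sketch for the hadoop user; run on each node and adjust the paths if the home directory differs):

chmod 700 /home/hadoop
chmod 700 /home/hadoop/.ssh
chmod 600 /home/hadoop/.ssh/authorized_keys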


 


Set up passwordless login to 192.168.19.252

Method 1 (detailed but tedious, suitable for beginners):

---------------

On servers 252 and 253, run the following command to pull 251's public key over:

[hadoop@hadoop_252 .ssh]$ scp hadoop@192.168.19.251:/home/hadoop/.ssh/id_rsa.pub /home/hadoop/.ssh/id_rsa.pub.251

The authenticity of host '192.168.19.251 (192.168.19.251)' can't be established.

RSA key fingerprint is 83:79:cf:20:b0:7a:ed:ca:c8:a9:e7:b7:35:64:ba:10.

# type yes here

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added '192.168.19.251' (RSA) to the list of known hosts.

hadoop@192.168.19.251's password: 

id_rsa.pub                                                                                         100%  399     0.4KB/s   00:00    

[hadoop@hadoop_252 .ssh]$ ls -ltr

total 32

-rw-r--r-- 1 hadoop hadoop  399 Jun 27 20:01 id_rsa.pub

-rw------- 1 hadoop hadoop 1675 Jun 27 20:01 id_rsa

-rw-r--r-- 1 hadoop hadoop  396 Jun 27 20:16 known_hosts

-rw------- 1 hadoop hadoop  399 Jun 27 20:17 id_rsa.pub.251

Append 251's public key to 252's authorized_keys:

cat id_rsa.pub.251 >> authorized_keys

---------------
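One more detail for method 1: if authorized_keys was created just now by the cat command, set its permissions as required above, for example:

chmod 600 ~/.ssh/authorized_keys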



Method 2 (simple and quick, suitable for experienced users):

---------------

On 251, run the following command to install 251's public key on 252 (if the hadoop user on 252 has no .ssh/authorized_keys file, it will be created automatically):

[hadoop@hadoop_251 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@192.168.19.252  <<the IP can be replaced with h2>>


hadoop@192.168.19.252's password: <<enter the password>>

Now try logging into the machine, with "ssh 'hadoop@192.168.19.252'", and check in:


  .ssh/authorized_keys


to make sure we haven't added extra keys that you weren't expecting.

---------------


After the keys are in place, the test returns the remote user name:

[hadoop@hadoop_251 ~]$ ssh hadoop@192.168.19.252 whoami

hadoop


Append the local machine's own public key to authorized_keys as well

[hadoop@h1 .ssh]$ cat id_rsa.pub >> authorized_keys


Check the resulting file; it should look like the following, with a public key for every alias in the hosts file

[hadoop@h1 .ssh]$ cat authorized_keys 

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAz0J+TfQKdoRvBvhuJ1PgohWxcZ66GPMCowqJtKaU3a09YMBYd6ZN9r6kdmSop5LBnc+PYb1uRH/IM1PQ6P6WUIa472xVeMb/dknChq2rkgbBiLNHekrzqq02QUmWJuNplIAZ+HIFSWyAfH3fNt4prD6qX615AbZnRYpqCXyoRptV9eRz9ulyVBNFZisy2A3OXY/kmmhsNvYe232RE3bSJgS7qfMshwjvCR7KeEXKamJWSoKTc/ziAsAe2jGf6THeBp6sdkM6Y2YBIlD/Rr6/WeIwD/9YqGYaGksQLYWqSnSWINohRjxjcRN96VuJ2lvbCTZZIGOb88ytAabqOIeaXQ== hadoop@h3

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAxzPb96FrLiWs9cXE8AcNuvPrW9KeYP1feoQJCh/OF6UuoJyNzAkuy72Nv05ris5zMOvu1zOBFd6ppc4He0yBy/jZBAVnNSk4RAB6utagMZjTwyR5BxCMLUS4wXIZShfpfgh40ovgKPHpavSP8EMH8Hb3n5Z7lcGzQRpki/swzN4zHgOWZEXKagIBMhwK+wxbpFbdYqU0hJaAOtnfDoy2w4hovDygC6Kt2B6DyT2X4X+i/n9vDy4tjrd6MdtZ7gyT9CJ0GzXARlgcKy5wyhim9XftdfUzB+OmeYKQFiUsdiUm74nAKL9j9EHL+k+O70zSAUE4FhrJP4oFwhNbenc7ZQ== hadoop@h2

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAphwmhurmdz9Ri6K5c7mBapnvmy99+NGWPmhdWANARXA24epKcCPEEuRUz/go1G9ABwzLDJOIQSXmVVXfFyGRMiv8obYvzlMLrWUjjDABe3KmdHCHlvFxw0oG1J37ycUQiv6AHL/J86H7Sfsl9fSI5kkzsSEriajqherJqSY2MByU+W6H7BaME5zyMMvyTP8bMtM+w1z10mURHaTdiN42N5D9PCToUvnfz2NhrRwcPo+uL6vJlhtPqvwwW1i7+MTfj1cTAD8CtV6JYkiSthMIMxciRl0LyDCuMVO1/JYB8RcbddgPznGaq0Ri8yhiN74G0hE/Blr92/SK3xsp14byWQ== hadoop@h1


===============

The following steps are performed on the other two servers

User: hadoop, IP: 192.168.19.253

ssh-keygen -t rsa

ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@192.168.19.251

ssh hadoop@192.168.19.251 whoami

ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@192.168.19.252

ssh hadoop@192.168.19.252 whoami

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys


User: hadoop, IP: 192.168.19.252

ssh-keygen -t rsa

ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@192.168.19.251

ssh hadoop@192.168.19.251 whoami

ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@192.168.19.253

ssh hadoop@192.168.19.253 whoami

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

===============



Run the following on every node to verify:

[hadoop@h1 .ssh]$ ssh hadoop@h1 whoami ; ssh hadoop@h2 whoami ; ssh hadoop@h3 whoami

hadoop

hadoop

hadoop


Lessons learned:

While configuring passwordless SSH trust, the problems encountered were mostly caused by incorrect directory and file permissions, which invalidate the trust.

Also, if a server's public key already exists on another server, appending a duplicate copy does no harm; but if the key pair has been regenerated, the new public key must be distributed again, otherwise the trust breaks.



Unpack and configure Hadoop

Upload the archive to /home/hadoop

[hadoop@localhost ~]$ ls -ltr

total 205172

-rw-rw-r-- 1 hadoop hadoop 209879040 Jun 27 19:31 hadoop-2.7.2.tar.gz

Unpack the archive:

[hadoop@localhost ~]$ tar -zxvf hadoop-2.7.2.tar.gz     

After unpacking:

[hadoop@localhost ~]$ ls -ltr

total 207300

drwxr-xr-x 9 hadoop hadoop      4096 Jan 26 08:20 hadoop-2.7.2

-rw-rw-r-- 1 hadoop hadoop 212046774 Jun 27 19:31 hadoop-2.7.2.tar.gz


Create the data storage directory

[hadoop@hadoop_251 hadoop-2.7.2]$ mkdir -p data

[hadoop@hadoop_251 hadoop-2.7.2]$ cd data

[hadoop@hadoop_251 data]$ pwd

/home/hadoop/hadoop-2.7.2/data

This path is needed later in hdfs-site.xml


Create the temporary directory

[hadoop@h3 hadoop-2.7.2]$ mkdir tmp

[hadoop@h3 hadoop-2.7.2]$ cd tmp

[hadoop@h3 tmp]$ pwd

/home/hadoop/hadoop-2.7.2/tmp
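Both directories are referenced by the configuration files edited below. As a shortcut, they can also be created in a single step from the hadoop user's home directory (a minimal sketch):

mkdir -p ~/hadoop-2.7.2/data ~/hadoop-2.7.2/tmp
ls -ld ~/hadoop-2.7.2/data ~/hadoop-2.7.2/tmp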


Enter the Hadoop configuration directory: /home/hadoop/hadoop-2.7.2/etc/hadoop

[hadoop@hadoop_251 hadoop]$ pwd

/home/hadoop/hadoop-2.7.2/etc/hadoop

[hadoop@hadoop_251 hadoop]$ ls -ltr



The Hadoop configuration files:

1.hadoop-env.sh : Environment variables that are used in the scripts to run Hadoop 

2.core-site.xml : Configuration settings for Hadoop Core, such as I/O settings that are common to HDFS and MapReduce

3.hdfs-site.xml : Configuration settings for HDFS daemons: the namenode, the secondary namenode, and the datanodes

4.mapred-site.xml : Configuration settings for MapReduce daemons: the jobtracker, and the tasktrackers

5.masters  : A list of machines (one per line) that each run a secondary namenode

6.slaves   : A list of machines (one per line) that each run a datanode and a tasktracker

7.hadoop-metrics.properties : Properties for controlling how metrics are published in Hadoop 

8.log4j.properties  : Properties for system logfiles, the namenode audit log, and the task log for the tasktracker child process  

Two important script files to edit:

hadoop-env.sh

---------------

Find the export JAVA_HOME line and point it at the JDK path found earlier:

export JAVA_HOME=/usr/java/jdk1.7.0_45 

---------------


yarn-env.sh

---------------

Find the export JAVA_HOME line and point it at the JDK path found earlier:

export JAVA_HOME=/usr/java/jdk1.7.0_45 

---------------
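An alternative to editing the two files by hand is to append the export line at the end of each file; the last assignment wins when the scripts are sourced (a sketch, assuming the JDK path found earlier):

cd /home/hadoop/hadoop-2.7.2/etc/hadoop
echo 'export JAVA_HOME=/usr/java/jdk1.7.0_45' >> hadoop-env.sh
echo 'export JAVA_HOME=/usr/java/jdk1.7.0_45' >> yarn-env.sh
grep -n 'JAVA_HOME=/usr' hadoop-env.sh yarn-env.sh   # confirm both files now set it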



Before editing the configuration files, check that the ports to be used are free:

Empty output means the port is not in use

[hadoop@h1 ~]$ netstat -an|grep 9000

[hadoop@h1 ~]$ netstat -an|grep 9001



Three important configuration files to edit:

core-site.xml

hdfs-site.xml

yarn-site.xml


core-site.xml is configured as follows:

Note: the namenode and datanodes use the same core-site.xml; do not change the host, it must point to the namenode (h1) on every node!

-------------

<configuration>

<property>

<name>fs.default.name</name>

<value>hdfs://h1:9000</value>

</property>

<property>

<name>hadoop.tmp.dir</name>

<value>/home/hadoop/hadoop-2.7.2/tmp</value>

</property>

</configuration>

-------------


hdfs-site.xml is configured as follows:

<configuration>

<property> 

<name>dfs.data.dir</name>

<value>/home/hadoop/hadoop-2.7.2/data</value>

</property>

<property> 

<name>dfs.replication</name>

<value>2</value>

</property>

</configuration> 



yarn-site.xml is configured as follows:

Note: the namenode and datanode configurations are different!

namenode configuration:

------------- 

<configuration>


<property>

<name>yarn.resourcemanager.address</name>

<value>h1:9001</value>

</property>


<property>

<name>yarn.resourcemanager.hostname</name>

<value>h1</value>

</property>


<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>


<property>

<name>yarn.nodemanager.resource.memory-mb</name>

<value>10240</value>

</property>


<property>

<name>yarn.scheduler.minimum-allocation-mb</name>

<value>1024</value>

</property>


<property>

<name>yarn.nodemanager.vmem-pmem-ratio</name>

<value>2.1</value>

</property>


</configuration>

------------- 

datanode configuration:

-------------

<configuration> 

<property> 

<name>yarn.resourcemanager.hostname</name>

<value>h1</value>

</property>


<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

<description>Auxiliary services of NodeManager</description>

</property>


<property>

<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>


<property> 

<name>yarn.nodemanager.resource.memory-mb</name>  

<value>10240</value>  

</property>  


<property>  

<name>yarn.scheduler.minimum-allocation-mb</name>  

<value>1024</value>  

</property>  


<property>  

<name>yarn.nodemanager.vmem-pmem-ratio</name>  

<value>2.1</value> 

</property>

</configuration>

-------------



The following steps are not required for version 2.7:

In Hadoop 2.0 and later, MapReduce runs on YARN and there is an additional configuration file called yarn-site.xml

------------------------

This file needs to be renamed first:

[hadoop@h1 hadoop]$ mv mapred-site.xml.template mapred-site.xml

mapred-site.xml is configured as follows:

-------------

<configuration>

<property>

<name>mapred.job.tracker</name>

<value>h1:9001</value>

</property>

</configuration>

-------------

------------------------


For a fully distributed setup, two more configuration files need to be edited:

masters

slaves


masters is configured as follows:

[hadoop@h1 hadoop]$ vi masters

h1


slaves is configured as follows:

[hadoop@h1 hadoop]$ vi slaves 

h2

h3


Distribute the Hadoop package to the other two nodes, then adjust their configuration:

[hadoop@h1 ~]$ scp -r ./hadoop-2.7.2 h2:~

[hadoop@h1 ~]$ scp -r ./hadoop-2.7.2 h3:~
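To confirm that the copy landed on both nodes (a quick check using the SSH trust configured earlier):

ssh h2 'ls -d ~/hadoop-2.7.2'
ssh h3 'ls -d ~/hadoop-2.7.2'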


On h2 and h3, use the datanode version of yarn-site.xml shown above; core-site.xml and mapred-site.xml must keep pointing to h1 (the namenode/resourcemanager) and should not be changed.


Check that the configuration files are correct on every node


Add Hadoop to the command search path (PATH)

[hadoop@h1 ~]$ vi .bash_profile 

PATH=$PATH:$HOME/bin

HADOOP_HOME=/home/hadoop/hadoop-2.7.2

HADOOP_INSTALL=/home/hadoop/hadoop-2.7.2

PATH=$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin

export HADOOP_HOME HADOOP_INSTALL PATH

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"


Apply the change:

[hadoop@h1 ~]$ source ./.bash_profile
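A quick check that the PATH change took effect (the hadoop executable should now resolve inside the installation directory):

which hadoop        # should print /home/hadoop/hadoop-2.7.2/bin/hadoop
echo $HADOOP_HOME   # should print /home/hadoop/hadoop-2.7.2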



The following commands are run on the namenode (h1)

1. Format the distributed file system to create the structures that hold the metadata:

[hadoop@h1 hadoop-2.7.2]$  hadoop namenode -format

DEPRECATED: Use of this script to execute hdfs command is deprecated.

Instead use the hdfs command for it.


16/06/28 19:28:54 INFO namenode.NameNode: STARTUP_MSG: 

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG:   host = h1/192.168.19.251

STARTUP_MSG:   args = [-format]

STARTUP_MSG:   version = 2.7.2

STARTUP_MSG:   classpath = /home/hadoop/hadoop-2.7.2/etc/hadoop:/home/hadoop/hadoop-2.7.2/share/hadoop/common/lib/hadoop-annotations-2.7.2.jar:/home....

16/06/28 19:32:33 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted. <<this message indicates that the format succeeded>>

....

************************************************************/


 


2. Start the cluster daemons

[hadoop@h1 hadoop-2.7.2]$ /home/hadoop/hadoop-2.7.2/sbin/start-all.sh

This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh

16/06/28 20:14:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Starting namenodes on [h1]  <<note the warning>>

h1: namenode running as process 6461. Stop it first.

h2: starting datanode, logging to hadoop/hadoop/hadoop-hadoop-datanode-h2.out

h3: starting datanode, logging to hadoop/hadoop/hadoop-hadoop-datanode-h3.out

Starting secondary namenodes [0.0.0.0]

0.0.0.0: secondarynamenode running as process 6665. Stop it first.

16/06/28 20:14:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

starting yarn daemons <<note the warning>>

resourcemanager running as process 5424. Stop it first.

h3: nodemanager running as process 4915. Stop it first.

h2: nodemanager running as process 5144. Stop it first.


Verify on the namenode:

Run the following command to list the Java processes:

[hadoop@h1 logs]$ jps

17432 NameNode

17793 ResourceManager

18057 Jps

17644 SecondaryNameNode


Or use the following command for more detail:

[hadoop@h1 logs]$ ps -ef|grep java

hadoop   17432     1  2 02:48 ?        00:00:07 /usr/java/jdk1.7.0_45/bin/java -Dproc_namenode -Xmx1000m -Djava.net.preferI...

hadoop   17644     1  2 02:48 ?        00:00:05 /usr/java/jdk1.7.0_45/bin/java -Dproc_secondarynamenode -Xmx1000m -Djava.ne...

hadoop   17793     1  4 02:48 pts/3    00:00:10 /usr/java/jdk1.7.0_45/bin/java -Dproc_resourcemanager -Xmx1000m -Dhadoop.lo...

hadoop   18084 16250  0 02:52 pts/1    00:00:00 grep java



Check the other two nodes:

[hadoop@h2 hadoop]$ jps

11104 Jps

10966 NodeManager

10845 DataNode 


[hadoop@h3 logs]$ jps

10589 Jps

10452 NodeManager

10330 DataNode


Check the ports on the namenode:

[hadoop@h1 hadoop]$ netstat -an|grep 9000

tcp        0      0 192.168.19.251:9000         0.0.0.0:*                   LISTEN      

tcp        0      0 192.168.19.251:9000         192.168.19.251:49774        ESTABLISHED 

tcp        0      0 192.168.19.251:49773        192.168.19.251:9000         TIME_WAIT   

tcp        0      0 192.168.19.251:49774        192.168.19.251:9000         ESTABLISHED 

[hadoop@h1 hadoop]$ netstat -an|grep 9001

tcp        0      0 ::ffff:192.168.19.251:9001  :::*                        LISTEN  


Verify via the web UIs:

http://192.168.19.251:50070 (HDFS web UI)

http://192.168.19.251:8088 (YARN web UI)




3. Stop the cluster daemons

[hadoop@h1 hadoop-2.7.2]$ /home/hadoop/hadoop-2.7.2/sbin/stop-all.sh




Verify by checking the Hadoop version

[hadoop@h1 ~]$ hadoop version

Hadoop 2.7.2

Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r b165c4fe8a74265c792ce23f546c64604acf0e41

Compiled by jenkins on 2016-01-26T00:08Z

Compiled with protoc 2.5.0

From source with checksum d0fda26633fa762bff87ec759ebe689c

This command was run using /home/hadoop/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2.jar


If Hadoop was not added to the search path, you must cd into the installation directory and run commands with bin/hadoop

 

 

On the client (Windows) machine, add the following entries to the hosts file (C:\Windows\System32\drivers\etc):

192.168.19.251 h1

192.168.19.252 h2

192.168.19.253 h3




Testing the Hadoop cluster

Perform the following on the namenode

Create an input directory and two test files in it:

[hadoop@h1 ~]$ mkdir input

[hadoop@h1 ~]$ cd input

[hadoop@h1 input]$ echo "hello world" > test1.txt

[hadoop@h1 input]$ echo "hello hadoop" > test2.txt


Upload the files to the Hadoop cluster; if the /in directory does not exist yet, create it first (see the example below)
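For example, to create the target directory before uploading:

hdfs dfs -mkdir /in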

[hadoop@h1 dfs]$ hdfs dfs -put /home/hadoop/input/* /in

 



View the uploaded files in HDFS

[hadoop@h1 ~]$ hdfs dfs -ls /in/*

16/06/29 07:15:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

-rw-r--r--   2 hadoop supergroup         12 2016-06-29 07:15 /in/test1.txt

-rw-r--r--   2 hadoop supergroup         13 2016-06-29 07:15 /in/test2.txt


 


Run the wordcount example program

[hadoop@h1 hadoop]$ hadoop jar /home/hadoop/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /in /out1

16/06/30 04:35:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

16/06/30 04:35:10 INFO client.RMProxy: Connecting to ResourceManager at h1/192.168.19.251:9001

16/06/30 04:35:12 INFO input.FileInputFormat: Total input paths to process : 2

16/06/30 04:35:12 INFO mapreduce.JobSubmitter: number of splits:2

16/06/30 04:35:12 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1467232461373_0001

16/06/30 04:35:12 INFO impl.YarnClientImpl: Submitted application application_1467232461373_0001

16/06/30 04:35:12 INFO mapreduce.Job: The url to track the job: http://h1:8088/proxy/application_1467232461373_0001/

16/06/30 04:35:12 INFO mapreduce.Job: Running job: job_1467232461373_0001

16/06/30 04:35:24 INFO mapreduce.Job: Job job_1467232461373_0001 running in uber mode : false

16/06/30 04:35:24 INFO mapreduce.Job:  map 0% reduce 0%

16/06/30 04:35:37 INFO mapreduce.Job:  map 100% reduce 0%

16/06/30 04:35:45 INFO mapreduce.Job:  map 100% reduce 100%

16/06/30 04:35:46 INFO mapreduce.Job: Job job_1467232461373_0001 completed successfully

...


The results:

[hadoop@h1 hadoop]$ hdfs dfs -ls /out1

16/06/30 04:37:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Found 2 items

-rw-r--r--   2 hadoop supergroup          0 2016-06-30 04:35 /out1/_SUCCESS

-rw-r--r--   2 hadoop supergroup         25 2016-06-30 04:35 /out1/part-r-00000

[hadoop@h1 hadoop]$ hdfs dfs -cat /out1/par*

16/06/30 04:39:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

hadoop 1

hello 2

world 1


Note:

HDFS has no notion of a current working directory, so there is no cd command for changing directories.
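Since there is no working directory, HDFS paths are either absolute or interpreted relative to the user's HDFS home directory (/user/hadoop for the hadoop user). For example:

hdfs dfs -ls /in     # absolute path
hdfs dfs -ls /       # list the HDFS root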

