Commit 7862ac2

start hadoop failed

kiwenlau committed Jun 7, 2016
1 parent 516e48f commit 7862ac2
Showing 2 changed files with 12 additions and 291 deletions.
270 changes: 0 additions & 270 deletions README.md
@@ -222,273 +222,3 @@ Hello 2

- check steps d~f of section 3: test serf and dnsmasq, start Hadoop, and run wordcount
- please test the serf and dnsmasq services before starting Hadoop


Quickly Build a Multi-Node Hadoop Cluster with Docker
-----

You can jump straight to Part III to build a 3-node Hadoop cluster on your local machine.

```
I. Project Introduction
II. Image Introduction
III. Steps to Build a 3-Node Hadoop Cluster
IV. Steps to Build an N-Node Hadoop Cluster
```


## I. Project Introduction

The goal of this project is to run a Hadoop cluster in [Docker](https://www.docker.com/) containers, so that Hadoop developers can quickly and conveniently build a multi-node Hadoop cluster on their local machine.

This project is based on [alvinhenrick/hadoop-mutinode](https://github.com/alvinhenrick/hadoop-mutinode), but with substantial optimization and refactoring. Compare the two tables below:

```
Image name                   Build time   Layers   Size
alvinhenrick/serf 258.213s 21 239.4MB
alvinhenrick/hadoop-base 2236.055s 58 4.328GB
alvinhenrick/hadoop-dn 51.959s 74 4.331GB
alvinhenrick/hadoop-nn-dn 49.548s 84 4.331GB
```

```
Image name                   Build time   Layers   Size
kiwenlau/serf-dnsmasq 509.46s 8 206.6 MB
kiwenlau/hadoop-base 400.29s 7 775.4 MB
kiwenlau/hadoop-master 5.41s 9 775.4 MB
kiwenlau/hadoop-slave 2.41s 8 775.4 MB
```

##### Note: insufficient disk space or memory, and especially a kernel version that is too old, will cause the cluster to fail :(

## II. Image Introduction

This project provides 4 Docker images: **serf-dnsmasq**, **hadoop-base**, **hadoop-master**, **hadoop-slave**.

##### serf-dnsmasq image
Based on the ubuntu:15.04 image. Installs [serf](https://www.serfdom.io/) and [dnsmasq](http://www.thekelleys.org.uk/dnsmasq/doc.html), which provide DNS service for the Hadoop cluster.

##### hadoop-base image
Based on the serf-dnsmasq image. Installs openjdk, openssh-server, vim, and Hadoop 2.3.0.

##### hadoop-master image
Based on the hadoop-base image. Configures the Hadoop master node.

##### hadoop-slave image
Based on the hadoop-base image. Configures a Hadoop slave node.

The following figure shows the Docker image hierarchy of the project:

![alt text](https://github.com/kiwenlau/hadoop-cluster-docker/raw/master/image%20architecture.jpg "Image Architecture")
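The same hierarchy can be read off the `FROM` lines of the four Dockerfiles. This is a sketch; the exact file paths and base-image tags in the repository are assumptions:

```
# serf-dnsmasq/Dockerfile
FROM ubuntu:15.04

# hadoop-base/Dockerfile
FROM kiwenlau/serf-dnsmasq:0.1.0

# hadoop-master/Dockerfile
FROM kiwenlau/hadoop-base:0.1.0

# hadoop-slave/Dockerfile
FROM kiwenlau/hadoop-base:0.1.0
```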


## III. Steps to Build a 3-Node Hadoop Cluster


##### 1. Pull the images

```sh
sudo docker pull index.alauda.cn/kiwenlau/hadoop-master:0.1.0
sudo docker pull index.alauda.cn/kiwenlau/hadoop-slave:0.1.0
sudo docker pull index.alauda.cn/kiwenlau/hadoop-base:0.1.0
sudo docker pull index.alauda.cn/kiwenlau/serf-dnsmasq:0.1.0
```

*List the downloaded images*

```sh
sudo docker images
```

*Output*

```
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
index.alauda.cn/kiwenlau/hadoop-slave 0.1.0 d63869855c03 17 hours ago 777.4 MB
index.alauda.cn/kiwenlau/hadoop-master 0.1.0 7c9d32ede450 17 hours ago 777.4 MB
index.alauda.cn/kiwenlau/hadoop-base 0.1.0 5571bd5de58e 17 hours ago 777.4 MB
index.alauda.cn/kiwenlau/serf-dnsmasq 0.1.0 09ed89c24ee8 17 hours ago 206.7 MB
```

##### 2. Re-tag the images

```sh
sudo docker tag d63869855c03 kiwenlau/hadoop-slave:0.1.0
sudo docker tag 7c9d32ede450 kiwenlau/hadoop-master:0.1.0
sudo docker tag 5571bd5de58e kiwenlau/hadoop-base:0.1.0
sudo docker tag 09ed89c24ee8 kiwenlau/serf-dnsmasq:0.1.0
```
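The four `docker tag` commands can also be generated from the image names themselves. The sketch below prints the commands instead of executing them, so you can review before running; the `local_name` helper is hypothetical, not part of the repository:

```shell
# Strip the mirror registry prefix to recover the local image name.
local_name() {
    echo "${1#index.alauda.cn/}"
}

# Print one `docker tag` command per mirror image for review.
for image in index.alauda.cn/kiwenlau/hadoop-slave:0.1.0 \
             index.alauda.cn/kiwenlau/hadoop-master:0.1.0 \
             index.alauda.cn/kiwenlau/hadoop-base:0.1.0 \
             index.alauda.cn/kiwenlau/serf-dnsmasq:0.1.0; do
    echo "sudo docker tag $image $(local_name "$image")"
done
```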

*List the images after re-tagging*

```sh
sudo docker images
```

*Output*

```
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
index.alauda.cn/kiwenlau/hadoop-slave 0.1.0 d63869855c03 17 hours ago 777.4 MB
kiwenlau/hadoop-slave 0.1.0 d63869855c03 17 hours ago 777.4 MB
index.alauda.cn/kiwenlau/hadoop-master 0.1.0 7c9d32ede450 17 hours ago 777.4 MB
kiwenlau/hadoop-master 0.1.0 7c9d32ede450 17 hours ago 777.4 MB
kiwenlau/hadoop-base 0.1.0 5571bd5de58e 17 hours ago 777.4 MB
index.alauda.cn/kiwenlau/hadoop-base 0.1.0 5571bd5de58e 17 hours ago 777.4 MB
kiwenlau/serf-dnsmasq 0.1.0 09ed89c24ee8 17 hours ago 206.7 MB
index.alauda.cn/kiwenlau/serf-dnsmasq 0.1.0 09ed89c24ee8 17 hours ago 206.7 MB
```

- The re-tagging is needed because by default the images are pushed to Docker Hub, so the image names in the Dockerfiles and shell scripts have no alauda prefix. Sorry for this... Re-tagging is quick, though.
- If you pull my images directly from Docker Hub, no re-tagging is needed.

##### 3. Download the source code

```sh
git clone https://github.com/kiwenlau/hadoop-cluster-docker
```

- In case GitHub is blocked, the code is also mirrored in an OSChina git repository:

```sh
git clone http://git.oschina.net/kiwenlau/hadoop-cluster-docker
```


##### 4. Start the containers

```sh
cd hadoop-cluster-docker
./start-container.sh
```

*Output*

```
start master container...
start slave1 container...
start slave2 container...
root@master:~#
```

- Three containers are started in total: 1 master and 2 slaves
- After the containers start, you are placed in the home directory (/root) of the root user in the master container

*List the files in the root user's home directory on master*

```sh
ls
```

*Output*

```
hdfs run-wordcount.sh serf_log start-hadoop.sh start-ssh-serf.sh
```

- start-hadoop.sh is the shell script that starts Hadoop
- run-wordcount.sh runs the wordcount example, which you can use to verify that the images work correctly


##### 5. Test the serf and dnsmasq services

*List the Hadoop cluster members*

```sh
serf members
```

*Output*

```
master.kiwenlau.com 172.17.0.65:7946 alive
slave1.kiwenlau.com 172.17.0.66:7946 alive
slave2.kiwenlau.com 172.17.0.67:7946 alive
```

- If some nodes are missing from the output, wait a moment and run the `serf members` command again; the serf agent needs time to discover all the nodes.
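The wait can also be scripted. The `count_alive` helper below is hypothetical (not part of the image); it simply counts the nodes that `serf members` reports as alive, assuming the output format shown above:

```shell
# Count the lines of `serf members` output that end in "alive".
count_alive() {
    grep -c 'alive$'
}

# Polling sketch (commented out because it needs a running cluster):
# until [ "$(serf members | count_alive)" -ge 3 ]; do sleep 2; done
```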

*Test ssh*

```sh
ssh slave2.kiwenlau.com
```

*Output*

```
Warning: Permanently added 'slave2.kiwenlau.com,172.17.0.67' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 15.04 (GNU/Linux 3.13.0-53-generic x86_64)
* Documentation: https://help.ubuntu.com/
The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.
root@slave2:~#
```

*Exit slave2*

```sh
exit
```

*Output*
```
logout
Connection to slave2.kiwenlau.com closed.
```

- If ssh fails, wait a moment and try again; the dnsmasq DNS server needs time to start.
- Once the tests pass, you can start the Hadoop cluster! You can also skip the tests: just wait patiently for about a minute after starting the containers.

##### 6. Start Hadoop

```sh
./start-hadoop.sh
```

##### 7. Run wordcount

```sh
./run-wordcount.sh
```

*Output*

```
input file1.txt:
Hello Hadoop
input file2.txt:
Hello Docker
wordcount output:
Docker 1
Hadoop 1
Hello 2
```
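The expected counts can be double-checked with a plain-shell wordcount over the same two input lines. This is a sketch of what the MapReduce job computes, not of run-wordcount.sh itself:

```shell
# Split words onto lines, count duplicates, and print "word count" pairs.
printf 'Hello Hadoop\nHello Docker\n' \
    | tr ' ' '\n' \
    | sort \
    | uniq -c \
    | awk '{print $2, $1}'
# Output:
#   Docker 1
#   Hadoop 1
#   Hello 2
```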

## IV. Steps to Build an N-Node Hadoop Cluster

##### 1. Preparation
- Refer to steps 1~3 of Part III: pull the images, re-tag them, and download the source code

##### 2. Rebuild the hadoop-master image
```sh
./resize-cluster.sh 5
```
- Don't worry, this takes only about a minute
- You can pass any positive integer to the resize-cluster.sh script: 1, 2, 3, 4, 5, 6...
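If you want to guard against bad arguments, a small validation helper can run before the script. `is_positive_int` is a hypothetical helper for illustration, not part of resize-cluster.sh:

```shell
# Return success only if the argument is a positive decimal integer.
is_positive_int() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;  # empty, or contains a non-digit
        *) [ "$1" -gt 0 ] ;;      # all digits: reject 0
    esac
}

# Usage sketch: validate before resizing.
# is_positive_int "$1" && ./resize-cluster.sh "$1"
```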

##### 3. Start the containers
```sh
./start-container.sh 5
```
- You can pass any positive integer to the start-container.sh script: 1, 2, 3, 4, 5, 6...
- This parameter should match the one passed to resize-cluster.sh in the previous step :)
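On the parameter's meaning: in start-container.sh the slave loop runs `while [ $i -lt $N ]` starting from i=1, so the argument N is the total number of containers (1 master plus N-1 slaves). A tiny helper, hypothetical and for illustration only, makes this explicit:

```shell
# Describe what ./start-container.sh N will start.
node_count() {
    N=$1
    echo "1 master + $(( N - 1 )) slaves = $N containers"
}

node_count 5   # 1 master + 4 slaves = 5 containers
```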

##### 4. Testing
- Refer to steps 5~7 of Part III: test the serf and dnsmasq services, start Hadoop, and run wordcount
- Note: when the number of nodes increases, be sure to run the tests before starting Hadoop, because serf may not yet have discovered all the nodes and the dnsmasq DNS server may not be fully configured.
33 changes: 12 additions & 21 deletions start-container.sh
@@ -1,33 +1,24 @@
 #!/bin/bash

-# run N slave containers
-N=$1
+# run N slave containers, the default value is 3
+N=${1:-3}

-# the default node number is 3
-if [ $# = 0 ]
-then
-    N=3
-fi
-
-
-# delete old master container and start new master container
-sudo docker rm -f master &> /dev/null
-echo "start master container..."
-sudo docker run -d -t --dns 127.0.0.1 -P --name master -h master.kiwenlau.com -w /root kiwenlau/hadoop-master:0.1.0 &> /dev/null
+# start hadoop master container
+sudo docker rm -f hadoop-master > /dev/null
+echo "start hadoop-master container..."
+sudo docker run -d -t -P --name hadoop-master -h master.kiwenlau.com -w /root --net=hadoop kiwenlau/hadoop-master:1.0.0 &> /dev/null

 # get the IP address of master container
-FIRST_IP=$(sudo docker inspect --format="{{.NetworkSettings.IPAddress}}" master)
+FIRST_IP=$(sudo docker inspect --format="{{.NetworkSettings.IPAddress}}" hadoop-master)

-# delete old slave containers and start new slave containers
+# start hadoop slave container
 i=1
 while [ $i -lt $N ]
 do
-    sudo docker rm -f slave$i &> /dev/null
-    echo "start slave$i container..."
-    sudo docker run -d -t --dns 127.0.0.1 -P --name slave$i -h slave$i.kiwenlau.com -e JOIN_IP=$FIRST_IP kiwenlau/hadoop-slave:0.1.0 &> /dev/null
+    sudo docker rm -f hadoop-slave$i > /dev/null
+    echo "start hadoop-slave$i container..."
+    sudo docker run -d -t -P --name hadoop-slave$i -h slave$i.kiwenlau.com --net=hadoop kiwenlau/hadoop-slave:1.0.0 &> /dev/null
     i=$(( $i + 1 ))
 done


 # create a new Bash session in the master container
-sudo docker exec -it master bash
+# sudo docker exec -it hadoop-master bash