forked from big-data-europe/docker-hadoop
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
wipe README file, need to be rewritten
- Loading branch information
1 parent
b155c89
commit 5ef4c39
Showing
1 changed file
with
0 additions
and
43 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,44 +1 @@ | ||
# Hadoop Docker | ||
|
||
This repository provides Hadoop in Docker containers. You can either run Hadoop in a single node or create a cluster. | ||
|
||
The deployed Hadoop uses data replication "2". To change it edit the hdfs-site.xml file. | ||
|
||
All data are stored in /hdfs-data, so to store data in a host directory run the container using "-v /path/to/host:/hdfs-data". | ||
By default the container formats the namenode directory only if not exists (hdfs namenode -format -nonInteractive). | ||
If you want to mount an external directory that already contains a namenode directory and format it you have to first delete it manually. | ||
|
||
## Single node mode | ||
|
||
To deploy a single Hadoop node run | ||
|
||
docker run -h namenode bde2020/hadoop-base | ||
|
||
To store data in a host directory run the container as as | ||
|
||
docker run -h namenode -v /path/to/host:/hdfs-data bde2020/hadoop-base | ||
|
||
## Cluster mode | ||
|
||
The namenode runs in a seperate container than the datanodes. | ||
|
||
To start the namenode run | ||
|
||
docker run --name namenode -h namenode bde2020/hadoop-namenode | ||
|
||
To add a datanode to the cluster run | ||
|
||
docker run --link namenode:namenode bde2020/hadoop-datanode | ||
|
||
Use the same command to add more datanodes to the cluster | ||
|
||
More info is comming soon on how to deploy a Hadoop cluster using docker network and docker swarm | ||
|
||
# access the namenode | ||
|
||
The namenode listens on | ||
|
||
hdfs://namenode:8020 | ||
|
||
To access the namenode from another container link it using "--link namenode:namenode" and then use the afformentioned URL. | ||
More info on how to access it using docker network coming soon. |