Intro

This is a collection of repos and scripts for easily booting a set of services as Docker containers. Documentation is light for now while things stabilize. The best way to understand what's going on is to read the Makefiles and Dockerfiles in each directory.

For a quick start, run make all from the repository root and watch it start Hadoop and Spark containers for you.
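
A minimal sketch of the quick start, assuming Docker and make are installed and you are in the repository root. Only make all comes from this README; the run and quick_start helpers and the DRY_RUN variable are hypothetical conveniences so that the commands can be previewed before anything is started.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build and start the containers, then list what is running.
quick_start() {
    run make all
    run docker ps --format '{{.Names}}: {{.Status}}'
}

quick_start
```

Set DRY_RUN=0 to actually run the commands. The docker ps step is only a convenient way to confirm that the containers came up; it is not something the Makefile does for you.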

A number of other containers also work, such as Accumulo, Tachyon, and ZooKeeper, but they haven't been incorporated into the benchmarks yet.

TODO & Caveats

The first thing to know about the internals is that Hadoop configurations are shared via Docker volumes. This limits you to a single host for now, unless you propagate a valid Hadoop configuration to the containers on other hosts yourself. We still need a nice way of pushing Hadoop configurations between hosts.
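
To illustrate the volume-sharing idea, here is a rough sketch of how several containers can mount one configuration volume. The volume name hadoop-conf, the mount path /etc/hadoop/conf, and the example/hadoop and example/spark image names are assumptions for illustration; the real names are in the Makefiles. The script only prints the docker commands it would run.

```shell
#!/bin/sh
# Assumed names, not taken from the repo's Makefiles.
CONF_VOLUME="${CONF_VOLUME:-hadoop-conf}"
CONF_PATH="${CONF_PATH:-/etc/hadoop/conf}"

# Print the docker command that would start a container with the shared
# configuration volume mounted at CONF_PATH.
conf_run_cmd() {
    name="$1"
    image="$2"
    echo "docker run -d --name $name -v $CONF_VOLUME:$CONF_PATH $image"
}

# One named volume, mounted into every container that needs the config.
echo "docker volume create $CONF_VOLUME"
conf_run_cmd namenode example/hadoop
conf_run_cmd spark-master example/spark
```

A named volume created with Docker's default local driver lives on a single Docker host, which is why this approach does not extend to multiple hosts without copying the configuration across.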

Useful Links for development

Reducing spark build times: https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools

Potential improvements

Use docker orchestration to provide a cluster interface - https://blog.docker.com/2016/06/docker-1-12-built-in-orchestration/

craiig/docker-bigdata-cluster
