Local Data Science & Engineering Stack

This repo was created in conjunction with a blog post at kengbailey.com detailing my local data engineering setup.

This repo is intended to set up a production-ready data and machine learning stack. The stack is composed of:

MinIO: S3-compatible object storage
PostgreSQL: structured SQL database
MongoDB: semi-structured data store
DBeaver/CloudBeaver: UI for browsing and querying databases
Airflow: workflow orchestrator
MLflow: experiment tracking
Homer: homepage of the stack
JupyterHub: exploration in Jupyter notebooks
VSCode Server: online IDE
RStudio Server: online IDE
Grafana: monitoring and visualization of data and model drift
Superset: data visualization and BI
Label Studio: data labeling tool

How to install Docker?

Install Docker Desktop (mac, windows, linux)

Install Docker commandline (linux)

How to install Docker Compose?

Install Docker Compose
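
Once both are installed, a quick check can confirm that the tools this README relies on are available. The following is a minimal sketch, not part of the repo: `have_cmd` is a hypothetical helper name, and the script only reports what it finds without changing anything.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named program is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# The README uses `docker` and `make` directly.
for tool in docker make; do
    if have_cmd "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done

# `docker compose version` succeeds only when the Compose plugin is installed.
if have_cmd docker && docker compose version >/dev/null 2>&1; then
    echo "docker compose: found"
else
    echo "docker compose: missing"
fi
```

If any line reports "missing", install that tool before running the commands in the next sections.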

Commands to start and stop all services?

Set up all containers:

make start-all

Open the Homepage

make run

Stop all services

make stop-all

Commands to start and stop a single service?

Run and view logs

docker compose -f postgres-compose.yaml up

Run in detached mode

docker compose -f postgres-compose.yaml up -d

Stop

docker compose -f postgres-compose.yaml down
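
The same three commands apply to the other services by swapping the compose file. Below is a minimal sketch, assuming each service follows the `<service>-compose.yaml` file naming used for PostgreSQL above. `compose_cmd` is a hypothetical helper, not something the repo provides, and it only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical helper: print the docker compose command for one service,
# assuming its file is named <service>-compose.yaml.
compose_cmd() {
    service="$1"
    shift
    printf 'docker compose -f %s-compose.yaml %s\n' "$service" "$*"
}

# Print the detached-start command for a few services.
for service in postgres minio mlflow; do
    compose_cmd "$service" up -d
done
```

To execute a printed command rather than just display it, pipe the output to `sh`, for example `compose_cmd postgres down | sh`.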

Important notes

To connect Grafana and CloudBeaver to the PostgreSQL server, use these settings:

host=postgres
username=(see .env file)
password=(see .env file)
database=postgres

If you run DBeaver on Windows while the whole stack is set up inside WSL, use:

host=localhost
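
Some clients take a single connection URI instead of separate fields. The sketch below assembles one from the settings above under several assumptions: the `.env` variable names `POSTGRES_USER` and `POSTGRES_PASSWORD` are guesses (check your own `.env` file), port 5432 is the PostgreSQL default and may differ in your compose file, and the password contains no characters that need URL-encoding. Both `env_value` and `pg_uri` are hypothetical helpers.

```shell
#!/bin/sh
# Hypothetical helper: read one KEY=value entry from an env file.
# The last match wins; quoting and comments are not handled.
env_value() {
    key="$1"
    file="$2"
    sed -n "s/^${key}=//p" "$file" | tail -n 1
}

# Hypothetical helper: build a PostgreSQL URI for the "postgres" database.
# POSTGRES_USER and POSTGRES_PASSWORD are assumed variable names.
pg_uri() {
    file="$1"
    host="${2:-postgres}"   # "postgres" inside the stack, "localhost" from Windows
    user="$(env_value POSTGRES_USER "$file")"
    password="$(env_value POSTGRES_PASSWORD "$file")"
    printf 'postgresql://%s:%s@%s:5432/postgres\n' "$user" "$password" "$host"
}

# Print the URI only when a .env file is present in the current directory.
if [ -f .env ]; then
    pg_uri .env
fi
```

Call `pg_uri .env localhost` to get the variant for DBeaver running on Windows against a stack inside WSL.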
