This project is a template for a data science project. Out of the box, you get Airflow, Postgres, R packages, load-balanced APIs behind NGINX, and Shiny, all knitted together with Docker.
Build the images:

```shell
docker build -t productor_api --file ./DockerfileApi .
docker build -t productor_rpy --file ./DockerfileRpy .
docker build -t productor_app --file ./DockerfileApp .
```
Then start the database, update the environment, and bring up the stack:

```shell
# Start Postgres and run the one-shot database initialization
docker-compose up -d --build productor_postgres
docker-compose up -d --build productor_initdb

# Refresh the project environment settings
Rscript update_env.R

# Bring up the full stack, detached or in the foreground
docker-compose up -d
docker-compose up

# Tear everything down
docker-compose down

# Start up while removing containers for services no longer defined
docker-compose up --remove-orphans
```
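The contents of `update_env.R` are not shown in this README. As a purely hypothetical sketch of the kind of work such a helper does, here is a Python example that writes a docker-compose style `.env` file using the variable names read later in this README; the default values are placeholders, not the project's real settings:

```python
import os

def write_env_file(path=".env"):
    """Write a docker-compose style .env file.

    Hypothetical sketch: update_env.R's contents aren't shown in this
    README. The variable names below are the ones the upsert R script
    reads; the default values are placeholders.
    """
    defaults = {
        "POSTGRES_HOST": "productor_postgres",
        "POSTGRES_PORT": "5432",
        "PRODUCTOR_POSTGRES_USER": "productor",
        "PRODUCTOR_POSTGRES_PASSWORD": "change-me",
        "PRODUCTOR_POSTGRES_DB": "productor",
    }
    with open(path, "w") as f:
        for key, default in defaults.items():
            # Prefer a value already set in the calling environment.
            f.write(f"{key}={os.environ.get(key, default)}\n")
    return path
```

docker-compose reads a `.env` file in the project root automatically, which is why regenerating it before `docker-compose up` matters.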
When the project is built, go to localhost:8080 and turn on the `upsert_tidyverse_data` DAG. This executes the DAG (1 below), which ultimately runs (3), the R script that grabs data from the dlstats package and upserts it into Postgres.
See the following files. Airflow doesn't have a native R operator, so you need to wrap the `Rscript` call in a bash script: (1) is the DAG, which uses a `BashOperator` to run (2), a bash script that wraps (3), an R script.
- DAG location: `./airflow/dags/productor_basic.py`
- Bash script: `./airflow/scripts/R/upsert_tidyverse_data`
- R code: `./airflow/scripts/R/upsert_tidyverse_data.R`
```python
import airflow
from airflow.operators.bash_operator import BashOperator
from airflow.models import DAG

args = {
    'owner': 'Freddy Drennan',
    'start_date': airflow.utils.dates.days_ago(2),
    'email': ['[email protected]'],
    'retries': 2,
    'email_on_failure': True,
    'email_on_retry': True
}

dag = DAG(dag_id='upsert_tidyverse_data',
          default_args=args,
          schedule_interval='@daily',
          concurrency=1,
          max_active_runs=1,
          catchup=False)

task_1 = BashOperator(
    task_id='upsert_tidyverse_data',
    bash_command='. /home/scripts/R/upsert_tidyverse_data',
    dag=dag
)
```
```bash
#!/bin/bash
cd /home/scripts/R/r_files
/usr/bin/Rscript /home/scripts/R/upsert_tidyverse_data.R
```
```r
library(DBI)        # dbDisconnect() comes from DBI
library(productor)

(function() {
  con <-
    postgres_connector(
      POSTGRES_HOST = Sys.getenv('POSTGRES_HOST'),
      POSTGRES_PORT = Sys.getenv('POSTGRES_PORT'),
      POSTGRES_USER = Sys.getenv('PRODUCTOR_POSTGRES_USER'),
      POSTGRES_PASSWORD = Sys.getenv('PRODUCTOR_POSTGRES_PASSWORD'),
      POSTGRES_DB = Sys.getenv('PRODUCTOR_POSTGRES_DB')
    )
  on.exit(expr = {
    message('Disconnecting')
    dbDisconnect(conn = con)
  })
  upsert_tidyverse_data(con)
})()
```
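`upsert_tidyverse_data()` is defined in the productor package, so its body isn't shown here. As a sketch of the upsert pattern the name implies (insert new rows, update existing rows on a key conflict), here is a minimal Python example against SQLite; the table and column names are assumptions, but the `INSERT ... ON CONFLICT` clause is the same one Postgres uses:

```python
import sqlite3

def upsert_downloads(con, rows):
    """Upsert (package, date, downloads) rows.

    Illustrative only: the real upsert_tidyverse_data() lives in the
    productor R package, and the table/column names here are assumptions.
    SQLite and Postgres share the INSERT ... ON CONFLICT syntax.
    """
    con.executemany(
        """
        INSERT INTO package_downloads (package, date, downloads)
        VALUES (?, ?, ?)
        ON CONFLICT (package, date)
        DO UPDATE SET downloads = excluded.downloads
        """,
        rows,
    )
    con.commit()

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE package_downloads ("
    "package TEXT, date TEXT, downloads INTEGER, "
    "PRIMARY KEY (package, date))"
)
upsert_downloads(con, [("dplyr", "2020-01-01", 100)])
upsert_downloads(con, [("dplyr", "2020-01-01", 150),    # updates the existing row
                       ("ggplot2", "2020-01-01", 80)])  # inserts a new row
```

The conflict target matches the table's primary key, so re-running the daily DAG is idempotent: existing (package, date) rows are refreshed rather than duplicated.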
With the stack running, the load-balanced API is available at http://localhost/api/package_downloads.
To publish an image to Docker Hub, tag it for the remote repository and push:

```shell
docker tag local-image:tagname new-repo:tagname
docker push new-repo:tagname
```

For example:

```shell
docker tag productor_rpy:latest fdrennan/productor_rpy:0.1.0
docker push fdrennan/productor_rpy:0.1.0
```
To stop and remove all Docker containers and images, see https://blog.baudson.de/blog/stop-and-remove-all-docker-containers-and-images. For example, to stop every container:

```shell
docker container stop $(docker container ls -aq)
```

Published images live at https://hub.docker.com/u/fdrennan.
Run the beta compose file with:

```shell
docker-compose -f docker-compose-beta.yaml up
```
Build, tag, and push each image:

```shell
docker build -t productor_app_basis --file ./DockerfileApp .
docker tag productor_app_basis:latest fdrennan/productor_app:latest
docker push fdrennan/productor_app:latest

docker build -t productor_api_basis --file ./DockerfileApi .
docker tag productor_api_basis:latest fdrennan/productor_api:latest
docker push fdrennan/productor_api:latest

docker build -t productor_rpy_basis --file ./DockerfileRpy .
docker tag productor_rpy_basis:latest fdrennan/productor_rpy:latest
docker push fdrennan/productor_rpy:latest
```
To run from the published images with the scratch compose file:

```shell
docker-compose -f docker-compose-scratch.yaml pull
docker-compose -f docker-compose-scratch.yaml up -d --build productor_postgres
docker-compose -f docker-compose-scratch.yaml up -d --build productor_initdb
docker-compose -f docker-compose-scratch.yaml restart
docker-compose -f docker-compose-scratch.yaml up
docker-compose -f docker-compose-scratch.yaml down
```
And with the dev compose file:

```shell
docker-compose -f docker-compose-dev.yaml pull
docker-compose -f docker-compose-dev.yaml up -d --build productor_postgres
docker-compose -f docker-compose-dev.yaml up -d --build productor_initdb
docker-compose -f docker-compose-dev.yaml restart
```