Setup to run Airflow in AWS ECS containers
- Docker
- AWS IAM User for the infrastructure deployment, with admin permissions
- awscli, which you can install by running
pip install awscli
- terraform
- your IAM User credentials, set up inside ~/.aws/credentials
- these env variables, set in your .zshrc or .bashrc, or in the terminal session that you are going to use
export AWS_ACCOUNT=your_account_id
# the default region; it must also be set in infrastructure/config.tf
export AWS_DEFAULT_REGION=eu-east-1
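Every later step relies on these two variables, so a quick pre-flight check can catch a missing export early. The following is a minimal sketch and not part of the repository; `require_env` is a hypothetical helper name.

```shell
# Report every variable name passed as an argument that is unset or empty.
# Returns 0 when all of them are set, 1 otherwise.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_env AWS_ACCOUNT AWS_DEFAULT_REGION || exit 1` at the top of a script stops it before any AWS call is made.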
- Generate a Fernet Key:
pip install cryptography
export AIRFLOW_FERNET_KEY=$(python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())")
More about that here
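As a sanity check on the exported value: a Fernet key is 32 random bytes encoded as URL-safe base64, which always comes out as 44 characters ending in a single `=`. The sketch below checks only that shape; it does not prove the key is valid for any existing encrypted data. `looks_like_fernet_key` is a hypothetical helper name.

```shell
# Succeeds when the argument has the shape of a Fernet key:
# 43 URL-safe base64 characters followed by one '=' padding character.
looks_like_fernet_key() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9_-]{43}=$'
}
```

For example, `looks_like_fernet_key "$AIRFLOW_FERNET_KEY" || echo "AIRFLOW_FERNET_KEY looks malformed" >&2`.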
- Start Airflow locally by running:
docker-compose up --build
If everything runs correctly, you can reach Airflow by navigating to localhost:8080. The current setup is based on Celery Workers. You can monitor how many workers are currently active using Flower, by visiting localhost:5555.
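The containers can take a little while to start, so the two URLs may not respond immediately. A minimal sketch of a polling helper follows; `retry` is a hypothetical name, and probing the URLs with curl is an assumption about your local tooling rather than something the repository provides.

```shell
# Run a command repeatedly until it succeeds or the attempt budget runs out.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Stop as soon as the command exits with status 0.
    "$@" && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `retry 30 2 curl -fsS -o /dev/null http://localhost:8080` waits up to about a minute for the webserver, and the same call against `http://localhost:5555` does so for Flower.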
To run Airflow in AWS we will use ECS (Elastic Container Service).
Run the following commands:
make infra-init
make infra-plan
make infra-apply
or alternatively
cd infrastructure
terraform get
terraform init -upgrade
terraform apply
By default the infrastructure is deployed in eu-east-1.
When the infrastructure is provisioned (the RDS metadata DB will take a while), check that the ECR repository has been created, then run:
bash scripts/push_to_ecr.sh airflow-dev
By default, the repository created by Terraform is named airflow-dev. Until an image has been pushed with this command, the ECS services will fail to fetch the latest image from ECR.
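One way to check that the repository exists before pushing is to ask the AWS CLI directly. The sketch below is an illustration, not the content of `scripts/push_to_ecr.sh`; `ecr_repo_exists` and `ecr_repo_uri` are hypothetical helper names, and the URI is built from the `AWS_ACCOUNT` and `AWS_DEFAULT_REGION` variables exported earlier.

```shell
# Succeeds when the named ECR repository can be described with the
# current credentials and region.
ecr_repo_exists() {
  aws ecr describe-repositories --repository-names "$1" >/dev/null 2>&1
}

# Print the registry URI that images for the given repository are pushed to.
ecr_repo_uri() {
  echo "${AWS_ACCOUNT}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com/$1"
}
```

For example, `ecr_repo_exists airflow-dev && bash scripts/push_to_ecr.sh airflow-dev`. Note that the check also fails on credential or region problems, not only when the repository is missing.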
To deploy an updated version of Airflow you need to push a new container image to ECR. You can do that by running:
./scripts/deploy.sh airflow-dev
The deployment script will take care of:
- pushing a new image to your ECR repository
- re-deploying the ECS services with the updated image
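The internals of `scripts/deploy.sh` are not shown here. As a rough sketch of what the re-deploy step could look like, assuming the task definition points at a mutable image tag, one option is to force a new deployment of each service through the AWS CLI. `redeploy_service` is a hypothetical helper, and the cluster and service names in the usage line are placeholders.

```shell
# Ask ECS to start fresh tasks for one service, so that the new tasks
# pull the image again from ECR.
redeploy_service() {
  cluster=$1
  service=$2
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --force-new-deployment >/dev/null
}
```

For example, `redeploy_service my-cluster my-webserver-service`, repeated for each ECS service that runs the Airflow image.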
TODO:
- Create Private Subnets
- Move ECS containers to Private Subnets
- Use ECS private Links for Private Subnets
- Improve ECS Task and Service Role