Chart: Add warning about missing knownHosts (apache#15950)
Also document knownHosts in production guide, and refactor private
gitsync dags docs to use `extraSecrets` instead of a separate secret.
jedcunningham authored May 19, 2021
1 parent 858f93c commit 6fd6712
Showing 4 changed files with 67 additions and 31 deletions.
14 changes: 14 additions & 0 deletions chart/templates/NOTES.txt
@@ -65,3 +65,17 @@ You can get Fernet Key value by running the following:
echo Fernet Key: $(kubectl get secret --namespace {{ .Release.Namespace }} {{ .Release.Name }}-fernet-key -o jsonpath="{.data.fernet-key}" | base64 --decode)

{{- end }}

{{- if and .Values.dags.gitSync.enabled .Values.dags.gitSync.sshKeySecret (not .Values.dags.gitSync.knownHosts)}}

#####################################################
# WARNING: You should set dags.gitSync.knownHosts #
#####################################################

You are using ssh authentication for your gitsync repo, however you currently have SSH known_hosts verification disabled,
making you susceptible to man-in-the-middle attacks!

Information on how to set knownHosts can be found here:
https://airflow.apache.org/docs/helm-chart/latest/production-guide.html#knownhosts

{{- end }}
51 changes: 20 additions & 31 deletions docs/helm-chart/manage-dags-files.rst
@@ -65,15 +65,15 @@ Finally, update the Airflow pods with that image:
.. code-block:: bash
helm upgrade airflow . \
helm upgrade --install airflow . \
--set images.airflow.repository=my-company/airflow \
--set images.airflow.tag=8a0da78
If you are deploying an image with a constant tag, you need to make sure that the image is pulled every time.
.. code-block:: bash
helm upgrade airflow . \
helm upgrade --install airflow . \
--set images.airflow.repository=my-company/airflow \
--set images.airflow.tag=8a0da78 \
--set images.airflow.pullPolicy=Always
@@ -90,7 +90,7 @@ for details.
.. code-block:: bash
helm upgrade airflow . \
helm upgrade --install airflow . \
--set dags.persistence.enabled=true \
--set dags.gitSync.enabled=true
# you can also override the other persistence or gitSync values
@@ -99,7 +99,7 @@ for details.
.. code-block:: bash
helm upgrade airflow . \
helm upgrade --install airflow . \
--set dags.persistence.enabled=true \
--set dags.gitSync.enabled=true \
# you can also override the other persistence or gitSync values (drop the trailing backslash above if nothing follows)
@@ -116,7 +116,7 @@ seconds. If you are using the ``KubernetesExecutor``, Git-sync will run as an in
.. code-block:: bash
helm upgrade airflow . \
helm upgrade --install airflow . \
--set dags.persistence.enabled=false \
--set dags.gitSync.enabled=true
# you can also override the other gitSync values
@@ -133,7 +133,7 @@ In this approach, Airflow will read the DAGs from a PVC which has ``ReadOnlyMany
.. code-block:: bash
helm upgrade airflow . \
helm upgrade --install airflow . \
--set dags.persistence.enabled=true \
--set dags.persistence.existingClaim=my-volume-claim \
--set dags.gitSync.enabled=false
@@ -148,37 +148,18 @@ Then create your ssh keys:
ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
and add the public key to your private repo (under ``Settings > Deploy keys``).
Add the public key to your private repo (under ``Settings > Deploy keys``).
Now, you have to create a Kubernetes Secret object with which the Git-Sync sidecar will authenticate when
fetching or syncing your DAGs from your private Github repo.
You have to convert the private ssh key to a base64. You can convert the private ssh key file like so:
You have to convert the private ssh key to a base64 string. You can convert the private ssh key file like so:
.. code-block:: bash
base64 <my-private-ssh-key> -w 0 > temp.txt
Then copy the string from the ``temp.txt`` file and add it to a yaml file to create your secret object.
For example, ``my-ssh-secret.yaml`` should look like this:
.. code-block:: yaml
Then copy the string from the ``temp.txt`` file. You'll add it to your ``override-values.yaml`` next.
apiVersion: v1
kind: Secret
metadata:
name: airflow-ssh-secret
data:
gitSshKey: '<base64-converted-ssh-private-key>'
And from a terminal then run:
.. code-block:: bash
kubectl create -f my-ssh-secret.yaml --namespace <your-airflow-namespace>
You can easily create a yaml file to override values of interest in the ``values.yaml`` file. In this example, I will
create a yaml file called ``override-values.yaml`` to override values in the ``values.yaml`` file.
In this example, you will create a yaml file called ``override-values.yaml`` to override values in the
``values.yaml`` file, instead of using ``--set``:
.. code-block:: yaml
@@ -189,13 +170,21 @@ create a yaml file called ``override-values.yaml`` to override values in the ``v
        branch: <branch-name>
        subPath: ""
        sshKeySecret: airflow-ssh-secret
    extraSecrets:
      airflow-ssh-secret:
        data: |
          gitSshKey: '<base64-converted-ssh-private-key>'
Don't forget to copy in your private key base64 string.
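The encode-and-paste step above can also be scripted. The following is only a sketch, not part of the chart: ``render_ssh_secret`` is a hypothetical helper name, and it assumes the chart expects ``data`` as a block string under the secret name (adjust the layout if your chart version differs). It strips newlines with ``tr`` rather than relying on GNU's ``-w 0`` flag.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the chart): print an extraSecrets values
# fragment containing the base64-encoded private key.
# Usage: render_ssh_secret <path-to-private-key> [secret-name]
render_ssh_secret() {
  local key_file="$1"
  local secret_name="${2:-airflow-ssh-secret}"
  local encoded
  # Removing newlines yields a single-line string whether or not the local
  # base64 implementation supports GNU's "-w 0" flag.
  encoded="$(base64 < "$key_file" | tr -d '\n')"
  printf 'extraSecrets:\n'
  printf '  %s:\n' "$secret_name"
  printf '    data: |\n'
  printf "      gitSshKey: '%s'\n" "$encoded"
}
```

For example, ``render_ssh_secret ~/.ssh/id_rsa`` prints a fragment you can paste into ``override-values.yaml``. Keep the resulting file out of version control, since it contains your private key.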
Finally, from the context of your Airflow Helm chart directory, you can install Airflow:
.. code-block:: bash
helm install airflow --namespace <your-airflow-namespace> . -f override-values.yaml
helm upgrade --install airflow . -f override-values.yaml
If you have done everything correctly, Git-Sync will pick up the changes you make to the DAGs
in your private Github repo.
You should take this a step further and set ``dags.gitSync.knownHosts`` so you are not susceptible to man-in-the-middle
attacks. This process is documented in the :ref:`production guide <production-guide:knownhosts>`.
32 changes: 32 additions & 0 deletions docs/helm-chart/production-guide.rst
@@ -72,6 +72,38 @@ DAG Files

See :doc:`manage-dags-files`.

.. _production-guide:knownhosts:

knownHosts
^^^^^^^^^^

If you are using ``dags.gitSync.sshKeySecret``, you should also set ``dags.gitSync.knownHosts``. Here we will show the process
for GitHub, but the same can be done for any provider:

Grab GitHub's public key:

.. code-block:: bash
ssh-keyscan -t rsa github.com > github_public_key
Next, print the fingerprint for the public key:

.. code-block:: bash
ssh-keygen -lf github_public_key
Compare that output with `GitHub's SSH key fingerprints <https://docs.github.com/en/github/authenticating-to-github/githubs-ssh-key-fingerprints>`_.

They match, right? Good. Now, add the public key to your values. It'll look something like this:

.. code-block:: yaml

    dags:
      gitSync:
        knownHosts: |
          github.com ssh-rsa AAAA...FAaQ==
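
Once the fingerprint has been checked, the last step can be scripted. This is a minimal sketch, not part of the chart: ``render_known_hosts`` is a hypothetical helper name, and it assumes its input is a file in ``ssh-keyscan`` output format, such as ``github_public_key`` above. It does not verify anything itself, so only run it on a key you have already compared by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the chart): turn an already-verified
# ssh-keyscan output file into a dags.gitSync.knownHosts values fragment.
# Usage: render_known_hosts <path-to-keyscan-output>
render_known_hosts() {
  local key_file="$1"
  printf 'dags:\n  gitSync:\n    knownHosts: |\n'
  # Drop blank lines and "#" comment lines, then indent every remaining
  # line so it sits inside the YAML block scalar.
  grep -v -e '^#' -e '^[[:space:]]*$' "$key_file" | sed 's/^/      /'
}
```

One way to use it is ``render_known_hosts github_public_key > known-hosts-values.yaml`` and then pass that file to Helm with an additional ``-f known-hosts-values.yaml``. The file name here is only an illustration.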
Accessing the Airflow UI
------------------------

1 change: 1 addition & 0 deletions docs/spelling_wordlist.txt
@@ -946,6 +946,7 @@ keytab
killMode
kinit
kms
knownHosts
krb
kube
kubeclient