improve k8s deployment instructions (GoogleCloudPlatform#914)
xingao267 authored Sep 8, 2020
1 parent 12c8968 commit 2634cd2
Showing 3 changed files with 113 additions and 93 deletions.
6 changes: 3 additions & 3 deletions deployment.hcl
@@ -538,7 +538,7 @@ resource "google_firebase_project" "firebase" {
project = module.project.project_id
}
# Step 5: uncomment and re-run the engine once all previous steps have been completed.
# Step 5.1: uncomment and re-run the engine once all previous steps have been completed.
# resource "google_firestore_index" "activities_index" {
# project = module.project.project_id
# collection = "Activities"
@@ -577,7 +577,7 @@ template "project_data" {
host_project_id = "{{$prefix}}-{{$env}}-networks"
}
}
# Step 5: uncomment and re-run the engine once all previous steps have been completed.
# Step 5.2: uncomment and re-run the engine once all previous steps have been completed.
/* terraform_addons = {
raw_config = <<EOF
data "google_secret_manager_secret_version" "my_studies_db_default_password" {
@@ -588,7 +588,7 @@ data "google_secret_manager_secret_version" "my_studies_db_default_password" {
EOF
} */
resources = {
# Step 5: uncomment and re-run the engine once all previous steps have been completed.
# Step 5.3: uncomment and re-run the engine once all previous steps have been completed.
# cloud_sql_instances = [{
# name = "mystudies"
# type = "mysql"
4 changes: 2 additions & 2 deletions deployment.md
@@ -253,8 +253,8 @@ regenerating the Terraform configs several times.

### Step 5: Deploy additional Firebase resources and Data resources through CICD

1. In $ENGINE_CONFIG, uncomment the blocks that are marked as *Step 5* and
regenerate the Terraform configs:
1. In $ENGINE_CONFIG, uncomment the blocks that are marked as *Step 5.1*, *Step
5.2* and *Step 5.3*. Then regenerate the Terraform configs:

```bash
./tfengine --config_path=$ENGINE_CONFIG --output_path=$GIT_ROOT/terraform
196 changes: 108 additions & 88 deletions kubernetes/README.md
@@ -6,62 +6,66 @@ This directory contains some Kubernetes resources common to all the apps.

All files below are relative to the root of the repo.

* kubernetes/
* cert.yaml
* A Kubernetes ManagedCertificate for using
* kubernetes/
* cert.yaml
* A Kubernetes ManagedCertificate for using
[Google-managed SSL certificates](https://cloud.google.com/kubernetes-engine/docs/how-to/managed-certs).
* ingress.yaml
* A Kubernetes Ingress for routing HTTP calls to services in the
* ingress.yaml
* A Kubernetes Ingress for routing HTTP calls to services in the
cluster.
* pod_security_policy.yaml
* A restrictive Pod Security Policy that applies to the cluster apps.
* pod_security_policy-istio.yaml
* A looser Pod Security Policy that only applies to Istio containers
* pod_security_policy.yaml
* A restrictive Pod Security Policy that applies to the cluster apps.
* pod_security_policy-istio.yaml
* A looser Pod Security Policy that only applies to Istio containers
in the cluster.
* kubeapply.sh
* A helper script that applies all resources to the cluster. Not
* kubeapply.sh
* A helper script that applies all resources to the cluster. Not
required; the manual steps are described below.
* auth-server-ws/
* tf-deployment.yaml
* A Kubernetes Deployment, deploying the app along with its secrets.
* This is forked from deployment.yaml with modifications for the
Terraform setup.
* tf-service.yaml
* A Kubernetes Service, exposing the app to communicate with other
apps and the Ingress.
* This is forked from service.yaml with modifications for the
Terraform setup.
* response-server-ws/
* <same as auth-server-ws>
* WCP/
* <same as auth-server-ws>
* WCP-WS/
* <same as auth-server-ws>
* user-registration-server-ws/
* <same as auth-server-ws>
* auth-server-ws/
* tf-deployment.yaml
* A Kubernetes Deployment, deploying the app along with its secrets.
* This is forked from deployment.yaml with modifications for the Terraform
setup.
* tf-service.yaml
* A Kubernetes Service, exposing the app to communicate with other apps
and the Ingress.
* This is forked from service.yaml with modifications for the Terraform
setup.
* response-server-module/response-server-service/
* same as auth-server-ws
* WCP/
* same as auth-server-ws
* WCP-WS/
* same as auth-server-ws
* user-registration-server-ws/
* same as auth-server-ws

## Setup

### Prerequisites

Install the following dependencies and add them to your PATH:

* [gcloud](https://cloud.google.com/sdk/gcloud)
* [gsutil](https://cloud.google.com/storage/docs/gsutil_install)
* [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl)
* [gcloud](https://cloud.google.com/sdk/gcloud)
* [gsutil](https://cloud.google.com/storage/docs/gsutil_install)
* [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl)

Find the following project IDs:
Find the following values as defined in your copy of `deployment.hcl`:

* `<apps-project-id>`
* `<data-project-id>`
* `<prefix>`
* `<env>`

Substitute these in the following instructions.
Also note the following project IDs which will be used in subsequent
instructions:

* Apps project ID: `<prefix>-<env>-apps`
* Data project ID: `<prefix>-<env>-data`
* Firebase project ID: `<prefix>-<env>-firebase`
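
As a minimal sketch, the three project IDs and the SQL import bucket name used later in this README can be derived from the two values in shell. The values `example` and `dev` and all variable names here are placeholders chosen for illustration; they are not defined anywhere in the repo.

```bash
# Hypothetical values: replace with the prefix and env from your deployment.hcl.
PREFIX="example"
DEPLOY_ENV="dev"

# Project IDs follow the <prefix>-<env>-<suffix> pattern listed above.
APPS_PROJECT="${PREFIX}-${DEPLOY_ENV}-apps"
DATA_PROJECT="${PREFIX}-${DEPLOY_ENV}-data"
FIREBASE_PROJECT="${PREFIX}-${DEPLOY_ENV}-firebase"

# Bucket used for the SQL imports.
SQL_IMPORT_BUCKET="gs://${PREFIX}-${DEPLOY_ENV}-mystudies-sql-import"

echo "apps:     ${APPS_PROJECT}"
echo "data:     ${DATA_PROJECT}"
echo "firebase: ${FIREBASE_PROJECT}"
echo "bucket:   ${SQL_IMPORT_BUCKET}"
```

Setting these once per shell session makes it easier to substitute them consistently in the commands that follow.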

### Terraform

Follow the [Terraform README.md](../Terraform/README.md) to create the
infrastructure. This will create a GKE cluster and a Cloud SQL MySQL database
instance.
Follow [deployment.md](../deployment.md) to create the infrastructure. This
will create a GKE cluster and a Cloud SQL MySQL database instance.

### SQL

@@ -70,19 +74,19 @@ deploying the apps.

The gcloud import command only imports from GCS buckets. The Terraform setup
creates a bucket and gives the SQL instance permission to read files from it.
The bucket is named "<data-project>-sql-import"; for example,
"fda-mystudies-dev-data-sql-import"
The bucket is named `<prefix>-<env>-mystudies-sql-import`; for example,
`example-dev-mystudies-sql-import`.

Upload the SQL files to the bucket:

```
```bash
$ gsutil cp \
./auth-server-ws/auth_server_db_script.sql \
./WCP/sqlscript/* \
./response-server-ws/mystudies_response_server_db_script.sql \
./user-registration-server-ws/sqlscript/mystudies_app_info_update_db_script.sql \
./user-registration-server-ws/sqlscript/mystudies_user_registration_db_script.sql \
gs://<data-project-id>-sql-import
gs://<prefix>-<env>-mystudies-sql-import
```

Find the name of your Cloud SQL DB instance. If looking at the GCP Console, this
@@ -93,49 +97,62 @@ just "mystudies".
Import the scripts, in this order:

#### Auth server
```
$ gcloud sql import sql --project=<data-project-id> <instance-name> gs://<data-project-id>-sql-import/auth_server_db_script.sql

```bash
gcloud sql import sql --project=<prefix>-<env>-data <instance-name> gs://<prefix>-<env>-mystudies-sql-import/auth_server_db_script.sql
```

#### Study builder
```
$ gcloud sql import sql --project=<data-project-id> <instance-name> gs://<data-project-id>-sql-import/HPHC_My_Studies_DB_Create_Script.sql
$ gcloud sql import sql --project=<data-project-id> <instance-name> gs://<data-project-id>-sql-import/procedures.sql
$ gcloud sql import sql --project=<data-project-id> <instance-name> gs://<data-project-id>-sql-import/version_info_script.sql

```bash
gcloud sql import sql --project=<prefix>-<env>-data <instance-name> gs://<prefix>-<env>-mystudies-sql-import/HPHC_My_Studies_DB_Create_Script.sql
gcloud sql import sql --project=<prefix>-<env>-data <instance-name> gs://<prefix>-<env>-mystudies-sql-import/procedures.sql
gcloud sql import sql --project=<prefix>-<env>-data <instance-name> gs://<prefix>-<env>-mystudies-sql-import/version_info_script.sql
```

#### Response datastore
```
$ gcloud sql import sql --project=<data-project-id> <instance-name> gs://<data-project-id>-sql-import/mystudies_response_server_db_script.sql

```bash
gcloud sql import sql --project=<prefix>-<env>-data <instance-name> gs://<prefix>-<env>-mystudies-sql-import/mystudies_response_server_db_script.sql
```

#### User registration datastore
```
$ gcloud sql import sql --project=<data-project-id> <instance-name> gs://<data-project-id>-sql-import/mystudies_user_registration_db_script.sql
$ gcloud sql import sql --project=<data-project-id> <instance-name> gs://<data-project-id>-sql-import/mystudies_app_info_update_db_script.sql

```bash
gcloud sql import sql --project=<prefix>-<env>-data <instance-name> gs://<prefix>-<env>-mystudies-sql-import/mystudies_user_registration_db_script.sql
gcloud sql import sql --project=<prefix>-<env>-data <instance-name> gs://<prefix>-<env>-mystudies-sql-import/mystudies_app_info_update_db_script.sql
```
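
Because the import order matters, it may help to generate the commands from a single ordered list. The following is a dry-run sketch only: the project, instance, and bucket values are hypothetical placeholders, and the function name `print_import_commands` is made up for this example. It prints the commands rather than running them.

```bash
# Hypothetical values: substitute your own project, instance, and bucket.
DATA_PROJECT="example-dev-data"
INSTANCE="mystudies"
BUCKET="gs://example-dev-mystudies-sql-import"

# Print one import command per script, in the order given above:
# auth server, study builder, response datastore, user registration datastore.
print_import_commands() {
  for script in \
    auth_server_db_script.sql \
    HPHC_My_Studies_DB_Create_Script.sql \
    procedures.sql \
    version_info_script.sql \
    mystudies_response_server_db_script.sql \
    mystudies_user_registration_db_script.sql \
    mystudies_app_info_update_db_script.sql
  do
    echo gcloud sql import sql "--project=${DATA_PROJECT}" "${INSTANCE}" "${BUCKET}/${script}"
  done
}

print_import_commands
```

After reviewing the printed commands, they can be run one at a time, or the `echo` can be removed so the loop executes them directly.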

### Kubernetes Config Values

You may need to make some changes to the Kubernetes configs to match your
organization and deployment.

In each tf-deployment.yaml file:
In each tf-deployment.yaml file listed below (paths are relative to the
root of the repo):

1. auth-server-ws/tf-deployment.yaml
1. response-server-module/response-server-service/tf-deployment.yaml
1. WCP/tf-deployment.yaml
1. WCP-WS/tf-deployment.yaml
1. user-registration-server-ws/tf-deployment.yaml

* For all images except `gcr.io/cloudsql-docker/gce-proxy`, replace the
`gcr.io/<project>` part with `gcr.io/<apps-project-id>`
* For the cloudsql-proxy container, set the `-instances` flag with
Do the following:

* For all images except `gcr.io/cloudsql-docker/gce-proxy`, replace the
`gcr.io/<project>` part with `gcr.io/<prefix>-<env>-apps`
* For the cloudsql-proxy container, set the `-instances` flag with
`-instances=<cloudsql-instance-connection-name>=tcp:3306`

In the ./kubernetes/cert.yaml file:

* Change the name and domain to match your organization.
* Change the name and domain to match your organization.

In the ./kubernetes/ingress.yaml file:

* Change the `networking.gke.io/managed-certificates` annotation to match the
* Change the `networking.gke.io/managed-certificates` annotation to match the
name in ./kubernetes/cert.yaml.
* Change the name and the `kubernetes.io/ingress.global-static-ip-name`
* Change the name and the `kubernetes.io/ingress.global-static-ip-name`
annotation to match your organization.
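
The image substitution described above for the tf-deployment.yaml files could be scripted roughly as follows. This is an illustrative sketch, not part of the repo: the `rewrite_images` function, the project ID `example-dev-apps`, and the sample image lines are all assumptions, and the pattern assumes every image line has the form `gcr.io/<something>/<image>`.

```bash
# Hypothetical apps project ID (<prefix>-<env>-apps).
APPS_PROJECT="example-dev-apps"

# Read YAML on stdin and point gcr.io/<project>/... images at the apps project,
# leaving the Cloud SQL proxy image (gcr.io/cloudsql-docker/gce-proxy) untouched.
rewrite_images() {
  sed -E '/gcr\.io\/cloudsql-docker\//!s#gcr\.io/[^/]+/#gcr.io/'"${APPS_PROJECT}"'/#'
}

# Demonstration on two made-up image lines.
printf '%s\n' \
  'image: gcr.io/<project>/auth-server-ws:latest' \
  'image: gcr.io/cloudsql-docker/gce-proxy:latest' | rewrite_images
```

To apply it to a real file, something like `rewrite_images < auth-server-ws/tf-deployment.yaml > /tmp/tf-deployment.yaml` followed by a manual diff is a cautious way to check the result before overwriting anything.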

### GKE Cluster - Terraform
@@ -148,33 +165,33 @@ they can't be applied by the CI/CD automation.

First, authenticate via gcloud:

```
$ gcloud auth login
$ gcloud auth application-default login
```bash
gcloud auth login
gcloud auth application-default login
```

Enter the Kubernetes Terraform directory:

```
$ cd Terraform/kubernetes/
```bash
cd Terraform/kubernetes/
```

**Edit the file `terraform.tfvars`. Make sure the project and cluster
information is correct.**

Init, plan, and apply the Terraform configs:

```
$ terraform init
$ terraform plan
$ terraform apply
```bash
terraform init
terraform plan
terraform apply
```

(Optional) Lastly, revoke gcloud authentication:

```
$ gcloud auth revoke
$ gcloud auth application-default revoke
```bash
gcloud auth revoke
gcloud auth application-default revoke
```

### GKE Cluster - kubectl
@@ -183,21 +200,21 @@ Run all commands below from the repo root.

First, get kubectl credentials so you can interact with the cluster:

```
$ gcloud container clusters get-credentials "<cluster-name>" --region="<region>" --project="<apps-project-id>"
```bash
gcloud container clusters get-credentials "<cluster-name>" --region="<region>" --project="<prefix>-<env>-apps"
```

Apply the pod security policies:

```
```bash
$ kubectl apply \
-f ./kubernetes/pod_security_policy.yaml \
-f ./kubernetes/pod_security_policy-istio.yaml
```

Apply all deployments:

```
```bash
$ kubectl apply \
-f ./WCP-WS/tf-deployment.yaml \
-f ./response-server-ws/tf-deployment.yaml \
@@ -208,7 +225,7 @@ $ kubectl apply \

Apply all services:

```
```bash
$ kubectl apply \
-f ./WCP-WS/tf-service.yaml \
-f ./response-server-ws/tf-service.yaml \
@@ -219,7 +236,7 @@ $ kubectl apply \

Apply the certificate and the ingress:

```
```bash
$ kubectl apply \
-f ./kubernetes/cert.yaml \
-f ./kubernetes/ingress.yaml
@@ -229,19 +246,22 @@ $ kubectl apply \

If the cluster has issues, there are a few things you can check:

* Wait. It can take some time for all deployments to come up.
* Run `kubectl describe pods` and `kubectl logs <pod> <container>`. A useful
* Wait. It can take some time for all deployments to come up.
* Run `kubectl describe pods` and `kubectl logs <pod> <container>`. A useful
container to look at is `cloudsql-proxy`, to see if the DB connection was
established correctly.
* Make sure all the secrets in Secret Manager have values and are not empty.
* Make sure Pod Security Policies were applied. The cluster has enforcement
* Make sure all the secrets in Secret Manager have values and are not empty.
* Make sure Pod Security Policies were applied. The cluster has enforcement
enabled, and will not start any containers if there are no Pod Security
Policies.
* Follow a troubleshooting guide. Examples are
* Follow a troubleshooting guide. Examples are
[this](https://learnk8s.io/troubleshooting-deployments) and
[this](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster/).
* As of now there is a known issue with Firewalls in ingress-gce. References [kubernetes/ingress-gce#485](https://github.com/kubernetes/ingress-gce/issues/485)
and/or [kubernetes/ingress-gce#584](https://github.com/kubernetes/ingress-gce/issues/584)
1. Run kubectl describe ingress <ingress-name>
1. Look at the suggested commands under "Events", in the form of "Firewall change required by network admin: <gcloud command>".
* There is a known issue with firewalls in ingress-gce; see
[kubernetes/ingress-gce#485](https://github.com/kubernetes/ingress-gce/issues/485)
and/or
[kubernetes/ingress-gce#584](https://github.com/kubernetes/ingress-gce/issues/584).
To work around it:
1. Run `kubectl describe ingress <ingress-name>`
1. Look at the suggested commands under "Events", in the form of "Firewall
change required by network admin: `<gcloud command>`".
1. Run each of the suggested commands.
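
The firewall steps above can be partly automated by filtering the `kubectl describe ingress` output for the message prefix. This is a sketch under assumptions: the helper name `extract_firewall_commands` is made up, and the exact quoting around the command in real event output may differ, so review each extracted line before running it.

```bash
# Read `kubectl describe ingress` output on stdin and print whatever follows
# the "Firewall change required by network admin:" prefix on each matching line.
extract_firewall_commands() {
  sed -n 's/.*Firewall change required by network admin: *//p'
}

# Demonstration on placeholder input (not real cluster output).
printf '%s\n' \
  'Firewall change required by network admin: <gcloud command>' \
  'some unrelated event line' | extract_firewall_commands

# Against a real cluster:
#   kubectl describe ingress <ingress-name> | extract_firewall_commands
```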
