We demonstrate here how to use Kasten on the Outscale platform:
- Build a simple Kubernetes cluster with Rancher (one Rancher server and one downstream cluster)
- Install the Outscale CSI driver to get dynamic PVC provisioning and snapshot features
- Install Kasten and use it to back up and restore stateful applications.
Go to the Cockpit interface and create an EIM user with the policies required to create the infrastructure. A policy with full access is fine for a quick start; you can tighten it later.
Note the AK/SK of this EIM user: you will use it for both your client and your Terraform configuration.
Configure the client
# we use pip3
pip3 --version
pip 20.3.1 from /usr/local/lib/python3.9/site-packages/pip (python 3.9)
# adapt X.X depending on your version
wget https://github.com/outscale/osc-cli/releases/download/vX.X.X/osc_sdk-X.X-py3-none-any.whl
pip3 install osc_sdk-X.X-py3-none-any.whl
mkdir $HOME/.osc_sdk && cd $HOME/.osc_sdk
wget https://raw.githubusercontent.com/outscale/osc-cli/master/osc_sdk/config.json
Now edit the config.json file and check that the CLI works as expected.
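For reference, a minimal config.json could look like the sketch below. The key names follow the osc-cli sample file; double-check them against the version you downloaded:

```shell
# Sketch of a minimal ~/.osc_sdk/config.json -- replace the
# placeholders with the AK/SK of your EIM user.
mkdir -p "$HOME/.osc_sdk"
cat > "$HOME/.osc_sdk/config.json" <<'EOF'
{
    "default": {
        "access_key": "<YOUR_OUTSCALE_ACCESSKEYID>",
        "secret_key": "<YOUR_OUTSCALE_SECRETKEYID>",
        "host": "outscale.com",
        "https": true,
        "method": "POST",
        "region_name": "eu-west-2"
    }
}
EOF
```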
osc-cli icu ListAccessKeys
{
"accessKeys": [
{
"status": "ACTIVE",
"tags": [],
"createDate": "2021-02-08T13:11:06.000Z",
"accessKeyId": "XXXXXXXXXXXXXXXX",
"secretAccessKey": "YYYYYYYYYYYYYYYYYY",
"expirationDate": null,
"touchDate": "2021-02-08T13:11:06.000Z",
"ownerId": "204741904112"
}
],
"marker": null,
"isTruncated": false,
"ResponseMetadata": {
"RequestId": "9aa29d73-0782-4578-8692-f252bff9f7a1"
}
}
We'll need an SSH keypair to access your machines, install Docker, and run the Rancher container.
Create one based on your default SSH keypair:
osc-cli fcu CreateKeyPair \
    --KeyName michael \
    --PublicKeyMaterial $(cat ~/.ssh/id_rsa.pub | base64)
You may change the KeyName; it will be used in the variable files of the Terraform script.
We now work in the terraform directory:
cd terraform
To run Terraform you need to export the AK/SK you created previously as environment variables.
export OUTSCALE_ACCESSKEYID="<YOUR_OUTSCALE_ACCESSKEYID>"
export OUTSCALE_SECRETKEYID="<YOUR_OUTSCALE_SECRETKEYID>"
Check the Terraform files. We basically create one public network and three private networks in a VPC.
In the public network we build the Rancher server, and on the private networks we build the downstream cluster. See the Rancher architecture page to better understand the link between the Rancher server and the downstream cluster.
The Rancher server is also used as a bastion to access the other nodes on the private networks.
We did not implement an Authorized Cluster Endpoint because our goal is to quickly validate Kasten on Outscale, but this can easily be done with an Outscale load balancer.
In eu-west-2 only two AZs are available, so the first and third networks belong to the same AZ.
Edit variables.tf and create your infrastructure
# import the outscale provider and init a local terraform backend
terraform init
# check your plan
terraform plan
# execute it
terraform apply
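Rather than editing variables.tf in place, you can override the defaults in a terraform.tfvars file. The variable names below are assumptions about this repository's variables.tf; check the file for the real names:

```shell
# Hypothetical terraform.tfvars -- the variable names must match
# those declared in variables.tf (these are assumptions).
cat > terraform.tfvars <<'EOF'
keypair_name = "michael"
region       = "eu-west-2"
EOF
```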
Use the Rancher server as a bastion to reach the other machines. Here is an example:
terraform output
cluster_1_master_1_private_ip = "10.0.1.253"
cluster_1_worker_1_private_ip = "10.0.1.193"
cluster_1_worker_2_private_ip = "10.0.2.51"
cluster_1_worker_3_private_ip = "10.0.3.27"
rke_vm_public_ip = "<public_ip>"
ssh-add
ssh -A outscale@<public_ip>
# access to other machines from there
ssh -A [email protected]
ssh -A [email protected]
ssh -A [email protected]
ssh -A [email protected]
On the machine in the public network, install Docker and start the Rancher server:
sudo apt-get update
sudo apt install --assume-yes docker.io
sudo docker run --privileged -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher:stable
Browse to https://<public_ip> and follow the install wizard.
From the rancher server UI add a cluster using existing nodes https://<public_ip>/g/clusters/add/launch/custom.
Obtain the command for provisioning master and worker nodes:
Choose the control plane and etcd roles and use the generated command on cluster_1_master_1:
sudo apt-get update
sudo apt install --assume-yes docker.io
# use the command given by the ui to provision master node
# Depending on your options it should look like the one below. We use the same node for control plane and etcd.
# sudo docker run -d --privileged --restart=unless-stopped --net=host -v /etc/kubernetes:/etc/kubernetes -v /var/run:/var/run rancher/rancher-agent:v2.5.8 --server https://<public_ip> --token <token> --ca-checksum <checksum> --etcd --controlplane
Choose the worker role and use the generated command on cluster_1_worker_1, cluster_1_worker_2 and cluster_1_worker_3:
sudo apt-get update
sudo apt install --assume-yes docker.io
# use the command given by the ui to provision worker node
# Depending on your options it should look like the one below.
# sudo docker run -d --privileged --restart=unless-stopped --net=host -v /etc/kubernetes:/etc/kubernetes -v /var/run:/var/run rancher/rancher-agent:v2.5.8 --server https://<public_ip> --token <token> --ca-checksum <checksum> --worker
When the Rancher server has identified the different nodes, click the "Done" button and wait for the downstream cluster to be ready.
Once the Kubernetes cluster is ready, download the kubeconfig file from the Rancher server UI. Change the name of the context to something more meaningful, like "outscale-cluster-1", merge it into your config file, and set it as your current context.
kubectl konfig import -s kubeconfig
kubectl config use-context outscale-cluster-1
kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-1-193 Ready worker 18d v1.20.6
ip-10-0-1-253 Ready controlplane,etcd 18d v1.20.6
ip-10-0-2-51 Ready worker 18d v1.20.6
ip-10-0-3-27 Ready worker 18d v1.20.6
You're now ready to install the Outscale CSI driver.
Outscale provides a CSI driver with snapshot support, which makes taking snapshots with Kasten much easier.
The README.md instructions describe everything you need to get the CSI driver up and running. Basically we just run these commands:
# ENV VARS
export OSC_ACCESS_KEY=XXXXX
export OSC_SECRET_KEY=XXXXX
export OSC_REGION=eu-west-2
## set the secrets
curl https://raw.githubusercontent.com/outscale-dev/osc-bsu-csi-driver/OSC-MIGRATION/deploy/kubernetes/secret.yaml > secret.yaml
cat secret.yaml | \
sed "s/secret_key: \"\"/secret_key: \"$OSC_SECRET_KEY\"/g" | \
sed "s/access_key: \"\"/access_key: \"$OSC_ACCESS_KEY\"/g" > osc-secret.yaml
# delete first so the apply is idempotent on re-runs
/usr/local/bin/kubectl delete -f osc-secret.yaml --namespace=kube-system
/usr/local/bin/kubectl apply -f osc-secret.yaml --namespace=kube-system
## deploy the pod
export IMAGE_NAME=outscale/osc-ebs-csi-driver
export IMAGE_TAG="v0.0.9beta"
git clone [email protected]:outscale-dev/osc-bsu-csi-driver.git
cd osc-bsu-csi-driver
helm uninstall osc-bsu-csi-driver --namespace kube-system
helm install osc-bsu-csi-driver ./osc-bsu-csi-driver \
--namespace kube-system --set enableVolumeScheduling=true \
--set enableVolumeResizing=true --set enableVolumeSnapshot=true \
--set region=$OSC_REGION \
--set image.repository=$IMAGE_NAME \
--set image.tag=$IMAGE_TAG
## Check the pod is running
kubectl get pods -o wide -n kube-system
kubectl create -f https://raw.githubusercontent.com/outscale-dev/osc-bsu-csi-driver/OSC-MIGRATION/examples/kubernetes/dynamic-provisioning/specs/storageclass.yaml
kubectl patch storageclass ebs-sc -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
kubectl get storageclass
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
ebs-sc (default) ebs.csi.aws.com Delete WaitForFirstConsumer false 7d16h
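To sanity-check dynamic provisioning outside Kasten, you can create a small test PVC against the ebs-sc class (the claim name and size below are our own choices):

```shell
# Test PVC for the ebs-sc storage class; the name and size are
# arbitrary. With WaitForFirstConsumer the volume is only created
# once a pod mounts the claim, so it stays Pending until then.
cat > test-pvc.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-sc-test
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 4Gi
EOF
# kubectl apply -f test-pvc.yaml
# kubectl get pvc ebs-sc-test
```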
Create the CRDs and the snapshot controller.
git clone https://github.com/kubernetes-csi/external-snapshotter.git
cd external-snapshotter
kubectl create -f client/config/crd
# optionally review the manifests in deploy/kubernetes/snapshot-controller before creating them
kubectl create -f deploy/kubernetes/snapshot-controller
Create a VolumeSnapshotClass and annotate it for Kasten:
kubectl create -f https://raw.githubusercontent.com/outscale-dev/osc-bsu-csi-driver/OSC-MIGRATION/examples/kubernetes/snapshot/specs/classes/snapshotclass.yaml
kubectl annotate volumesnapshotclass csi-aws-vsc k10.kasten.io/is-snapshot-class=true
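You can also take a manual snapshot to verify the class works before running the Kasten pre-check. The snapshot name and the source PVC below are hypothetical; point it at any bound PVC in your cluster (older snapshot CRDs may need apiVersion v1beta1):

```shell
# Manual VolumeSnapshot against the annotated class; the snapshot
# name and source PVC are hypothetical examples.
cat > test-snapshot.yaml <<'EOF'
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: test-snap
spec:
  volumeSnapshotClassName: csi-aws-vsc
  source:
    persistentVolumeClaimName: ebs-sc-test
EOF
# kubectl apply -f test-snapshot.yaml
# kubectl get volumesnapshot test-snap
```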
Now prepare the Kasten installation:
kubectl create ns kasten-io
helm repo add kasten https://charts.kasten.io/
helm repo update
At this point you should get the following output from the pre-check script (warnings removed):
curl -s https://docs.kasten.io/tools/k10_primer.sh | sh /dev/stdin -c "storage csi-checker -s ebs-sc --runAsUser=1000"
Namespace option not provided, using default namespace
Checking for tools
--> Found kubectl
--> Found helm
Checking if the Kasten Helm repo is present
--> The Kasten Helm repo was found
Checking for required Helm version (>= v3.0.0)
--> No Tiller needed with Helm v3.5.4
K10Primer image
--> Using Image (gcr.io/kasten-images/k10tools:4.0.2) to run test
Checking access to the Kubernetes context cluster-1
--> Able to access the default Kubernetes namespace
Running K10Primer Job in cluster with command-
./k10tools primer storage csi-checker -s ebs-sc --runAsUser=1000
serviceaccount/k10-primer created
clusterrolebinding.rbac.authorization.k8s.io/k10-primer created
job.batch/k10primer created
Waiting for pod k10primer-v65jd to be ready - ContainerCreating
Waiting for pod k10primer-v65jd to be ready - ContainerCreating
Waiting for pod k10primer-v65jd to be ready - ContainerCreating
Waiting for pod k10primer-v65jd to be ready - ContainerCreating
Pod Ready!
Starting CSI Checker. Could take up to 5 minutes
Creating application
-> Created pod (kubestr-csi-original-podnztrk) and pvc (kubestr-csi-original-pvcnsz5t)
-> Created snapshot (kubestr-snapshot-20210525225623)
Restoring application
-> Restored pod (kubestr-csi-cloned-pod49kpv) and pvc (kubestr-csi-cloned-pvcpp2mf)
Cleaning up resources
CSI Snapshot Walkthrough:
Using annotated VolumeSnapshotClass (csi-aws-vsc)
Successfully tested snapshot restore functionality. - OK
serviceaccount "k10-primer" deleted
clusterrolebinding.rbac.authorization.k8s.io "k10-primer" deleted
job.batch "k10primer" deleted
Install Kasten and wait for all pods to be ready.
helm install k10 kasten/k10 -n kasten-io
kubectl get pod -n kasten-io
NAME READY STATUS RESTARTS AGE
aggregatedapis-svc-b85f954f-5p8bm 1/1 Running 0 7d15h
auth-svc-5754fcfccb-284m5 1/1 Running 0 7d15h
catalog-svc-69d6c5f8bc-ngpjk 2/2 Running 0 7d15h
config-svc-689b56cfd8-jmtmh 1/1 Running 0 7d15h
crypto-svc-7988bcdcb4-sxcwx 1/1 Running 0 7d15h
dashboardbff-svc-56c6c6646d-wndlc 1/1 Running 0 7d15h
executor-svc-b8fc79d7f-hm5gk 2/2 Running 0 7d15h
executor-svc-b8fc79d7f-lcbnw 2/2 Running 0 7d15h
executor-svc-b8fc79d7f-wlr29 2/2 Running 0 7d15h
frontend-svc-7b47b5cb5d-bv8cl 1/1 Running 0 7d15h
gateway-6c7d4fc6d5-zj5dd 1/1 Running 0 5d17h
jobs-svc-79cc56ff4d-997dv 1/1 Running 0 7d15h
kanister-svc-7797d44589-xlrdb 1/1 Running 0 7d15h
logging-svc-79cc4d9689-qkmtn 1/1 Running 0 7d15h
metering-svc-7b9f789857-29j5x 1/1 Running 0 7d15h
prometheus-server-78b94b85fb-qtxjd 2/2 Running 0 7d15h
state-svc-d4f7d56c8-2ssqc 1/1 Running 0 7d15h
Now you can use Kasten and try some backups and restores.