In this demo, we will see how to deploy HPA controller. HPA will automatically scale the number of pods based on CPU utilization whereas VPA scales by increasing or decreasing CPU and memory resources within the existing pod containers—thus scaling capacity vertically
- Create 1 virtual machine on AWS with 2 CPU, 4GB of RAM (t2.medium)
- Setup minikube on it Minikube setup.
- Ensure you have the Metrics Server installed in your Minikube cluster to enable HPA. If not already installed, you can install it using:
minikube addons enable metrics-server
- Check minikube cluster status and nodes :
minikube status
kubectl get nodes
- If you are using a Kind cluster install Metrics Server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
- Edit the Metrics Server Deployment
kubectl -n kube-system edit deployment metrics-server
- Add the security bypass to deployment under
container.args
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP,Hostname,ExternalIP
- Restart the deployment
kubectl -n kube-system rollout restart deployment metrics-server
- Verify if the metrics server is running
kubectl get pods -n kube-system
kubectl top nodes
- For VPA
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh
- Verify the pods on VPA
kubectl get pods -n kube-system
In this demo, we will create an deployment & service files for Apache and with the help of HPA, we will automatically scale the number of pods based on CPU utilization.
-
Update the Deployments:
- We'll modify the existing Apache deployment YAML files to include resource requests and limits. This is required for HPA to monitor CPU usage.
#apache-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: apache-deployment
spec:
replicas: 1
selector:
matchLabels:
app: apache
template:
metadata:
labels:
app: apache
spec:
containers:
- name: apache
image: httpd:2.4
ports:
- containerPort: 80
resources:
requests:
cpu: 100m
limits:
cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
name: apache-service
spec:
selector:
app: apache
ports:
- protocol: TCP
port: 80
targetPort: 80
type: ClusterIP
- Apply the updated deployments:
kubectl apply -f apache-deployment.yaml
- Create HPA Resources
- We will create HPA resources for both Apache and NGINX deployments. The HPA will scale the number of pods based on CPU utilization.
#apache-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: apache-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: apache-deployment
minReplicas: 1
maxReplicas: 5
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 20
- Apply the HPA resources:
kubectl apply -f apache-hpa.yaml
- port forward to access the Apache service on browser.
kubectl port-forward svc/apache-service 8081:80 --address 0.0.0.0 &
- Verify HPA
- You can check the status of the HPA using the following command:
kubectl get hpa
This will show you the current state of the HPA, including the current and desired number of replicas.
- To see HPA in action, you can perform a stress test on your deployments. Here is an example of how to generate load on the Apache deployment using 'BusyBox':
kubectl run -i --tty load-generator --image=busybox /bin/sh
- Inside the container, use 'wget' to generate load:
while true; do wget -q -O- http://apache-service.default.svc.cluster.local; done
This will generate continuous load on the Apache service, causing the HPA to scale up the number of pods.
- Now to check if HPA worked or not, open a same new terminal and run the following command
kubectl get hpa -w
Note: Wait for few minutes to get the status reflected.