
EKS and AWS Controllers for Kubernetes

Jump to

  • AWS EKS, AWS Controllers for Kubernetes and related tools and libs
  • Kubernetes tools (non-AWS)
  • Best Practices Guides
  • Quick Start

Networking

Max number of pods per EC2 instance

  • With certain EC2 instance types (e.g. m5.24xlarge), the instance can have 15 ENIs with 50 IPv4 addresses per ENI. (See Available IP Per ENI)
  • That is 750 IP addresses per host as an upper bound, but pods are only assigned the secondary IPs on each ENI, so the recommended EKS max-pods value for this instance type is 15 × (50 − 1) + 2 = 737 (see the sketch below).
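
A minimal sketch of that arithmetic; the formula follows the eni-max-pods calculation shipped with the amazon-vpc-cni-k8s plugin, and the ENI figures are the m5.24xlarge values above:

    // m5.24xlarge: 15 ENIs, 50 IPv4 addresses per ENI (from the Available IP Per ENI table).
    const enis = 15;
    const ipv4PerEni = 50;

    // Each ENI's primary IP is not assigned to pods; the "+ 2" allows for host-networking
    // pods such as aws-node and kube-proxy.
    const maxPods = enis * (ipv4PerEni - 1) + 2;
    console.log(`recommended max pods: ${maxPods}`); // 737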

Amazon VPC CNI plugin for K8s

  • amazon-vpc-cni-k8s - Networking plugin for pod networking in Kubernetes using ENIs on AWS.

Calico add-on - network policy engine for Kubernetes

AWS Secrets Manager and Kubernetes Secrets

Node-based autoscaling

Adding or removing nodes as needed

Cluster Autoscaler vs. Karpenter

Karpenter improvements (Source)

  • Designed to handle the full flexibility of the cloud: Karpenter has the ability to efficiently address the full range of instance types available through AWS. Cluster autoscaler was not originally built with the flexibility to handle hundreds of instance types, zones, and purchase options.
  • Group-less node provisioning: Karpenter manages each instance directly, without using additional orchestration mechanisms like node groups. This enables it to retry in milliseconds instead of minutes when capacity is unavailable. It also allows Karpenter to leverage diverse instance types, availability zones, and purchase options without the creation of hundreds of node groups.
  • Scheduling enforcement: Cluster autoscaler doesn’t bind pods to the nodes it creates. Instead, it relies on the kube-scheduler to make the same scheduling decision after the node has come online. A node that Karpenter launches has its pods bound immediately. The kubelet doesn’t have to wait for the scheduler or for the node to become ready. It can start preparing the container runtime immediately, including pre-pulling the image. This can shave seconds off of node startup latency.
  • Workload consolidation: Karpenter automatically looks for opportunities to reschedule running workloads onto a set of more cost-efficient EC2 instances, whether those instances are already in the cluster or need to be launched (see the NodePool sketch after this list).
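
A minimal sketch of what group-less, flexible provisioning looks like in practice, expressed as a Karpenter NodePool applied through CDK. It assumes Karpenter (v1beta1 API) is already installed on the cluster; the requirements, the EC2NodeClass name "default" and the CPU limit are illustrative.

    import * as eks from 'aws-cdk-lib/aws-eks';

    // Assumption: an existing eks.Cluster with Karpenter installed.
    declare const cluster: eks.Cluster;

    cluster.addManifest('KarpenterNodePool', {
      apiVersion: 'karpenter.sh/v1beta1', // assumes the v1beta1 Karpenter API
      kind: 'NodePool',
      metadata: { name: 'default' },
      spec: {
        template: {
          spec: {
            // Deliberately broad requirements: Karpenter picks any instance type, zone and
            // purchase option that satisfies them (no node groups involved).
            requirements: [
              { key: 'karpenter.sh/capacity-type', operator: 'In', values: ['spot', 'on-demand'] },
              { key: 'karpenter.k8s.aws/instance-category', operator: 'In', values: ['c', 'm', 'r'] },
            ],
            nodeClassRef: { name: 'default' }, // assumes an EC2NodeClass named "default"
          },
        },
        // Workload consolidation: allow Karpenter to replace under-utilized nodes
        // with cheaper capacity.
        disruption: { consolidationPolicy: 'WhenUnderutilized' },
        limits: { cpu: '1000' },
      },
    });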

Pod-based autoscaling

  1. Horizontal Pod Autoscaler – adds or removes pods in a Deployment as needed (a minimal HPA sketch follows this list)
  2. Vertical Pod Autoscaler – resizes a pod's CPU and memory requests and limits to match the observed load
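
As a minimal sketch of (1), an autoscaling/v2 HorizontalPodAutoscaler can be applied through CDK. It assumes an existing eks.Cluster with the Kubernetes Metrics Server installed; the Deployment name "web" and the 70% CPU target are illustrative.

    import * as eks from 'aws-cdk-lib/aws-eks';

    declare const cluster: eks.Cluster; // assumption: an existing cluster with metrics-server installed

    cluster.addManifest('WebHpa', {
      apiVersion: 'autoscaling/v2',
      kind: 'HorizontalPodAutoscaler',
      metadata: { name: 'web', namespace: 'default' },
      spec: {
        scaleTargetRef: { apiVersion: 'apps/v1', kind: 'Deployment', name: 'web' },
        minReplicas: 2,
        maxReplicas: 10,
        // Add or remove pods to keep average CPU utilization around 70%.
        metrics: [
          {
            type: 'Resource',
            resource: { name: 'cpu', target: { type: 'Utilization', averageUtilization: 70 } },
          },
        ],
      },
    });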

Autoscaling EKS on Fargate

  1. Autoscaling EKS on Fargate using custom metrics with the HorizontalPodAutoscaler
    • Examples of configuring autoscaling based on HTTP traffic, CPU and/or memory usage, and App Mesh traffic

EKS Monitoring, Logging, Alerting

EKS cluster endpoint

EKS cluster endpoint access control

  • Private access for EKS cluster's Kubernetes API server endpoint (Kubernetes control plane API)
    • A cluster that has been configured to only allow private access can only be accessed from the following: (Source)
      • The VPC where the worker nodes reside.
      • Networks that have been peered with that VPC.
      • A network that has been connected to AWS through Direct Connect (DX) or a VPN.
    • However, the name of the Kubernetes cluster endpoint is only resolvable from the worker node VPC, for the following reasons:
      • The Route 53 private hosted zone that is created for the endpoint is only associated with the worker node VPC.
      • The private hosted zone is created in a separate AWS managed account and cannot be altered.
    • The cluster's API server endpoint is resolved by public DNS servers to a private IP address from the VPC. When you do a DNS query for your API server endpoint (e.g. 9FF86DB0668DC670F27F426024E7CDBD.sk1.us-east-1.eks.amazonaws.com), it returns the private IP of the EKS endpoint (e.g. 10.10.10.20).
      • Note, however, that the subdomain label (e.g. gr7 in .gr7.ap-southeast-2.eks.amazonaws.com) is NOT unique to an account or region:
        Account-1:
            ap-southeast-2:  https://FD743253263F9932A1C1359F134D9B08.gr7.ap-southeast-2.eks.amazonaws.com
            ap-southeast-1:  https://7296B66098508057814BDC28DD6442FE.gr7.ap-southeast-1.eks.amazonaws.com
        Account-2:
            ap-southeast-2:  https://0395CA66195A658536B531CA06F3246D.gr7.ap-southeast-2.eks.amazonaws.com
            ap-southeast-2:  https://C2D57012E496E6903F975147013EE156.sk1.ap-southeast-2.eks.amazonaws.com
        Account-3:
            us-east-1:       https://9FF86DB0668DC670F27F426024E7CDBD.sk1.us-east-1.eks.amazonaws.com
        
  • How to enable private access for the EKS cluster's Kubernetes API server endpoint?
    1. Update Cluster.endpointAccess to PRIVATE.
    2. Create or update your "Security Group to use for Control Plane ENIs" with ingress rules that allow your clients (e.g. kubectl running in deployment agents such as a GitHub Actions workflow) to reach the Kubernetes control plane API.
    3. If your deployment agent is in another VPC / AWS account, you will also need a way to direct DNS requests for the endpoint to the corresponding VPC / account. (A CDK sketch of steps 1 and 2 follows this list.)
  • How do I lock down API access to specific IP addresses in my Amazon EKS cluster?
    • You can, optionally, limit the CIDR blocks that can access the public endpoint. If you limit access to specific CIDR blocks, then it is recommended that you also enable the private endpoint, or ensure that the CIDR blocks that you specify include the addresses that nodes and Fargate pods (if you use them) access the public endpoint from.
    1. Update Cluster.endpointAccess to PUBLIC_AND_PRIVATE.
      • In CloudFormation, add the CIDR to ResourcesVpcConfig.PublicAccessCidrs.
      • In CDK, add the CIDR to endpointAccess: EndpointAccess.PUBLIC_AND_PRIVATE.onlyFrom(...) in aws-cdk-lib.aws_eks.Cluster ClusterProps.
  • DNS resolution for EKS cluster endpoints
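
A minimal CDK sketch of the endpoint-access set-ups described above. The VPC, Kubernetes version and CIDRs are illustrative; in particular, 10.0.0.0/16 stands in for the network your deployment agents connect from.

    import { Stack } from 'aws-cdk-lib';
    import * as ec2 from 'aws-cdk-lib/aws-ec2';
    import * as eks from 'aws-cdk-lib/aws-eks';
    import { Construct } from 'constructs';

    export class PrivateEksStack extends Stack {
      constructor(scope: Construct, id: string) {
        super(scope, id);

        const vpc = new ec2.Vpc(this, 'Vpc');

        // Step 2: the "Security Group to use for Control Plane ENIs", opened to the
        // network the deployment agents (e.g. GitHub Actions runners) connect from.
        const controlPlaneSg = new ec2.SecurityGroup(this, 'ControlPlaneSg', { vpc });
        controlPlaneSg.addIngressRule(
          ec2.Peer.ipv4('10.0.0.0/16'),
          ec2.Port.tcp(443),
          'kubectl from deployment agents',
        );

        new eks.Cluster(this, 'Cluster', {
          vpc,
          version: eks.KubernetesVersion.V1_29, // illustrative
          securityGroup: controlPlaneSg,
          // Step 1: private-only API endpoint. To keep a public endpoint locked down to
          // specific CIDR blocks instead, use:
          //   endpointAccess: eks.EndpointAccess.PUBLIC_AND_PRIVATE.onlyFrom('203.0.113.0/24')
          endpointAccess: eks.EndpointAccess.PRIVATE,
        });
      }
    }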

EKS access control

  • eks-pod-identity-agent - Amazon EKS Pod Identity agent
  • IRSA vs. Pod Identity
  • How do I provide access to other IAM users and roles after cluster creation in Amazon EKS?
    • Important: Keep the following in mind:
      • Avoid syntax errors (such as typos) when you update the aws-auth ConfigMap. These errors can affect the permissions of all IAM users and roles updated within the ConfigMap of the Amazon EKS cluster. (A CDK sketch that manages aws-auth programmatically follows this list.)
      • It's a best practice to avoid adding cluster_creator to the ConfigMap, because improperly modifying the ConfigMap can cause all IAM users and roles (including cluster_creator) to permanently lose access to the Amazon EKS cluster.
      • You don't need to add cluster_creator to the aws-auth ConfigMap to get admin access to the Amazon EKS cluster. By default, the cluster_creator has admin access to the Amazon EKS cluster that it created.
  • Manage Amazon EKS with Okta SSO
    • EKS uses IAM to provide authentication to your Kubernetes cluster, but it still relies on native Kubernetes Role-Based Access Control (RBAC) for authorization. This means IAM is only used for authentication of valid IAM entities; all permissions for interacting with your Amazon EKS cluster's Kubernetes API are managed through the native Kubernetes RBAC system.

    • https://github.com/aws-samples/eks-rbac-sso
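
A minimal sketch of granting an existing IAM role access through CDK's awsAuth construct, which writes the aws-auth ConfigMap for you and avoids the hand-editing mistakes warned about above. The role, username and group are illustrative; prefer a narrower RBAC group over system:masters in practice.

    import * as eks from 'aws-cdk-lib/aws-eks';
    import * as iam from 'aws-cdk-lib/aws-iam';

    // Assumptions: an existing cluster and an existing IAM role used by your deploy pipeline.
    declare const cluster: eks.Cluster;
    declare const deployRole: iam.IRole;

    // Maps the IAM role to a Kubernetes user/group entry in the aws-auth ConfigMap.
    cluster.awsAuth.addRoleMapping(deployRole, {
      username: 'deploy-role',
      groups: ['system:masters'], // illustrative only; bind a least-privilege RBAC group instead
    });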

EKS security

EKS IAM OIDC Provider

  1. iam:*OpenIDConnectProvider* permissions are not required when creating an EKS cluster with CreateCluster, which creates an OpenID Connect provider URL (OpenID Connect issuer URL) for the cluster (e.g. https://oidc.eks.ap-southeast-2.amazonaws.com/id/ABCABC111222333444ABCABC11122233).

    • And in CloudTrail, there are no *OpenIDConnectProvider* events.
  2. After (1), the cluster has an OpenID Connect issuer URL associated with it. To use IAM roles for service accounts, an IAM OIDC provider must exist for the cluster. See here. You need to run eksctl utils associate-iam-oidc-provider:

    $ eksctl utils associate-iam-oidc-provider --cluster=k-test-oicd --approve --region=ap-southeast-2 --profile test-oidc
    
    • An OpenID Connect provider with the same URL as in (1) is created.
    • For this step, the role needs the following permissions:
      iam:CreateOpenIDConnectProvider
      iam:GetOpenIDConnectProvider
      iam:TagOpenIDConnectProvider
      
    • CloudTrail does NOT show these events either (e.g. CreateOpenIDConnectProvider).
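
With CDK, the OIDC provider association and the IRSA trust policy are handled for you when you create a service account on the cluster. A minimal sketch; the service-account name, namespace and the S3 read-only policy are illustrative.

    import * as eks from 'aws-cdk-lib/aws-eks';
    import * as iam from 'aws-cdk-lib/aws-iam';

    declare const cluster: eks.Cluster; // assumption: an existing cluster

    // Creates a Kubernetes ServiceAccount annotated with an IAM role whose trust policy
    // references the cluster's OIDC provider (i.e. IAM roles for service accounts).
    const sa = cluster.addServiceAccount('AppServiceAccount', {
      name: 'app-sa',
      namespace: 'default',
    });
    sa.role.addManagedPolicy(iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonS3ReadOnlyAccess'));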

EKS with Fargate

There are some potential drawbacks to using Fargate with EKS, both operational and for workload security. (Source)

  1. Kubernetes network policies silently have no effect on pods assigned to Fargate nodes. DaemonSets, which put a pod for a service on each node, cannot place pods on the Fargate virtual nodes. Even if Calico could run as a sidecar in a pod, it would not have permission to manage the pod's routing, which requires root privileges; Fargate only allows unprivileged containers.
  2. Active security monitoring of a container’s actions on Fargate becomes difficult or nearly impossible.
  3. Any metrics or log collectors that a user would normally run as a cluster DaemonSet will also have to be converted to sidecars, where possible.
  4. EKS still requires clusters that use Fargate for all their pod scheduling to have at least one node.
  5. The exact security implications and vulnerabilities of running EKS pods on Fargate remain unknown for now.

See also AWS Fargate considerations.

Stress test

  • AWS FIS (Fault Injection Simulator)
    • FIS supports ChaosMesh and Litmus experiments for containerized applications running on EKS.

      E.g. run a stress test on a pod’s CPU using ChaosMesh or Litmus faults while terminating a randomly selected percentage of cluster nodes using FIS fault actions.

CDK EKS+K8s Examples

cdk / cdk8s Gotchas

  • aws-cdk-lib.aws_eks.Cluster supports specifying only one Security Group (but CloudFormation/Console support a list of Security Groups).
  • Currently there is no way to get FilterName using CDK, see aws-cdk/issues/8141. Workaround: const clusterSubFilterName = clusterSubFilter.stack.getLogicalId(clusterSubFilter.node.defaultChild as CfnSubscriptionFilter).
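
Expanded into a compilable sketch of that workaround; the SubscriptionFilter variable is assumed to be defined elsewhere in the stack.

    import * as logs from 'aws-cdk-lib/aws-logs';

    declare const clusterSubFilter: logs.SubscriptionFilter;

    // CDK does not expose the filter name, so fall back to the logical ID of the
    // underlying CfnSubscriptionFilter (see aws-cdk/issues/8141).
    const clusterSubFilterName = clusterSubFilter.stack.getLogicalId(
      clusterSubFilter.node.defaultChild as logs.CfnSubscriptionFilter,
    );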