Cluster Autoscaler


Overview

The Cluster Autoscaler automatically adjusts the number of nodes in your EKS cluster based on pending pods and resource utilization. When combined with the Horizontal Pod Autoscaler (HPA), it provides end-to-end autoscaling from application load to infrastructure capacity.

How It Works

Flow:

  1. HPA scales pods based on metrics
  2. New pods enter “Pending” state (insufficient resources)
  3. Cluster Autoscaler detects pending pods
  4. Adds nodes to Auto Scaling Group
  5. Pods scheduled on new nodes
  6. After scale-down period, removes underutilized nodes
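Step 2 is ordinary Kubernetes scheduling: a workload whose aggregate requests exceed current capacity leaves some pods Pending. A sketch of such a workload (the name and request sizes are illustrative, not part of the chart):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-worker        # hypothetical workload
  namespace: smallest
spec:
  replicas: 20
  selector:
    matchLabels:
      app: demo-worker
  template:
    metadata:
      labels:
        app: demo-worker
    spec:
      containers:
        - name: worker
          image: nginx
          resources:
            requests:
              cpu: "1"     # 20 replicas need 20 full cores in total;
              memory: 1Gi  # replicas that cannot fit go Pending and trigger scale-up
```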

Prerequisites

1. IAM Role: create an IAM role with autoscaling permissions (see IAM & IRSA)

2. Node Group Tags: ensure node groups have the proper tags:

   k8s.io/cluster-autoscaler/<cluster-name>: owned
   k8s.io/cluster-autoscaler/enabled: true

3. Service Account: an IRSA-enabled service account for Cluster Autoscaler
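If you create this service account yourself instead of letting the Helm chart manage it, the manifest is roughly as follows (the account ID and role name are placeholders):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  annotations:
    # IRSA: binds this service account to the IAM role from step 1
    eks.amazonaws.com/role-arn: arn:aws:iam::YOUR_ACCOUNT_ID:role/cluster-autoscaler-role
```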

Installation

Using Helm Chart

The Smallest Self-Host chart includes Cluster Autoscaler as a dependency:

values.yaml

cluster-autoscaler:
  enabled: true
  rbac:
    serviceAccount:
      name: cluster-autoscaler
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::YOUR_ACCOUNT_ID:role/cluster-autoscaler-role
  autoDiscovery:
    clusterName: smallest-cluster
  awsRegion: us-east-1

  extraArgs:
    balance-similar-node-groups: true
    skip-nodes-with-system-pods: false
    scale-down-delay-after-add: 5m
    scale-down-unneeded-time: 10m

Deploy:

$helm upgrade --install smallest-self-host smallest-self-host/smallest-self-host \
> -f values.yaml \
> --namespace smallest

Standalone Installation

Install Cluster Autoscaler separately:

$helm repo add autoscaler https://kubernetes.github.io/autoscaler
$helm repo update
$
$helm install cluster-autoscaler autoscaler/cluster-autoscaler \
> --namespace kube-system \
> --set autoDiscovery.clusterName=smallest-cluster \
> --set awsRegion=us-east-1 \
> --set rbac.serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::ACCOUNT_ID:role/cluster-autoscaler-role

Configuration

Auto-Discovery

Auto-discover Auto Scaling Groups by cluster name:

autoDiscovery:
  clusterName: smallest-cluster
  tags:
    - k8s.io/cluster-autoscaler/enabled
    - k8s.io/cluster-autoscaler/smallest-cluster

Manual Configuration

Explicitly specify Auto Scaling Groups:

autoscalingGroups:
  - name: eks-cpu-nodes
    minSize: 1
    maxSize: 10
  - name: eks-gpu-nodes
    minSize: 0
    maxSize: 20

Scale-Down Configuration

Control when and how nodes are removed:

extraArgs:
  scale-down-enabled: true
  scale-down-delay-after-add: 10m
  scale-down-unneeded-time: 10m
  scale-down-utilization-threshold: 0.5
  max-graceful-termination-sec: 600

Parameters:

  • scale-down-delay-after-add: Wait time after adding node before considering scale-down
  • scale-down-unneeded-time: How long node must be underutilized before removal
  • scale-down-utilization-threshold: CPU/memory threshold (0.5 = 50%)
  • max-graceful-termination-sec: Max time for pod eviction

Node Group Priorities

Scale specific node groups first:

extraArgs:
  expander: priority

expanderPriorities:
  50:
    - .*-spot-.*
  10:
    - .*-ondemand-.*

Priorities:

  • Higher number = higher priority
  • Regex patterns match node group names
  • Useful for preferring spot instances
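Behind the chart values, the priority expander reads its configuration from a ConfigMap named cluster-autoscaler-priority-expander in the autoscaler's namespace. If you run the autoscaler outside the chart, you can create it directly; a sketch that prefers spot node groups:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |
    50:
      - .*-spot-.*
    10:
      - .*-ondemand-.*
```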

Verify Installation

Check Cluster Autoscaler Pod

$kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-cluster-autoscaler

Check Logs

$kubectl logs -n kube-system -l app.kubernetes.io/name=aws-cluster-autoscaler -f

Look for:

Starting cluster autoscaler
Auto-discovery enabled
Discovered node groups: [eks-gpu-nodes, eks-cpu-nodes]

Verify IAM Permissions

$kubectl logs -n kube-system -l app.kubernetes.io/name=aws-cluster-autoscaler | grep -i "error\|permission"

Should show no permission errors.

Testing Cluster Autoscaler

Trigger Scale-Up

Create pods that exceed cluster capacity:

$kubectl create deployment test-scale-up-1 \
> --image=nginx \
> --replicas=20 \
> --namespace=smallest
$kubectl set resources deployment test-scale-up-1 \
> --requests=cpu=1,memory=1Gi \
> --namespace=smallest

Watch nodes:

$watch -n 5 'kubectl get nodes'

Watch Cluster Autoscaler:

$kubectl logs -n kube-system -l app.kubernetes.io/name=aws-cluster-autoscaler -f

Expected behavior:

  1. Pods enter “Pending” state
  2. Cluster Autoscaler detects pending pods
  3. Logs show: “Scale-up: setting group size to X”
  4. New nodes appear in kubectl get nodes
  5. Pods transition to “Running”

Trigger Scale-Down

Delete test pods:

$kubectl delete deployment test-scale-up-1 -n smallest

After scale-down-unneeded-time (default 10 minutes):

  1. Cluster Autoscaler marks underutilized nodes
  2. Drains pods gracefully
  3. Terminates EC2 instances
  4. Node count decreases
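A node can be exempted from this process entirely (useful for nodes running hard-to-move pods) via an annotation the autoscaler checks before draining:

```yaml
# Node metadata sketch: this annotation tells the Cluster Autoscaler
# never to scale this node down (the value must be the string "true")
metadata:
  annotations:
    cluster-autoscaler.kubernetes.io/scale-down-disabled: "true"
```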

GPU Node Scaling

Configure GPU Node Groups

Tag GPU node groups for autoscaling:

cluster-config.yaml
managedNodeGroups:
  - name: gpu-nodes
    instanceType: g5.xlarge
    minSize: 0
    maxSize: 10
    desiredCapacity: 1
    tags:
      k8s.io/cluster-autoscaler/smallest-cluster: "owned"
      k8s.io/cluster-autoscaler/enabled: "true"
      k8s.io/cluster-autoscaler/node-template/label/workload: "gpu"
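The node-template/label tag is what makes scale-from-zero work: with zero GPU nodes running, the autoscaler can only learn the labels a new node would carry from the ASG tags. A pod selecting on that label (the pod name and image are illustrative) can then trigger a GPU node launch even when no GPU node exists yet:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test     # hypothetical test pod
spec:
  nodeSelector:
    workload: gpu          # matches the node-template/label tag above
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
      resources:
        limits:
          nvidia.com/gpu: 1
```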

Prevent Cluster Autoscaler on GPU Nodes

Schedule the Cluster Autoscaler itself on CPU nodes so it does not occupy GPU capacity:

values.yaml
cluster-autoscaler:
  nodeSelector:
    workload: cpu

  tolerations: []

Scale to Zero

Allow GPU nodes to scale to zero during off-hours:

managedNodeGroups:
  - name: gpu-nodes
    minSize: 0
    maxSize: 10

Cluster Autoscaler will:

  • Add GPU nodes when Lightning ASR pods are pending
  • Remove GPU nodes when all GPU workloads complete

First startup after scale-to-zero takes longer (node provisioning + model download).

Spot Instance Integration

Mixed Instance Groups

Use spot and on-demand instances:

cluster-config.yaml
nodeGroups:
  - name: gpu-nodes-mixed
    minSize: 1
    maxSize: 10
    instancesDistribution:
      onDemandBaseCapacity: 1
      onDemandPercentageAboveBaseCapacity: 20
      spotAllocationStrategy: capacity-optimized
      instanceTypes:
        - g5.xlarge
        - g5.2xlarge
        - g4dn.xlarge

Configuration:

  • Base capacity: 1 on-demand node always
  • Additional capacity: 20% on-demand, 80% spot
  • Multiple instance types increase spot availability

Handle Spot Interruptions

Configure Cluster Autoscaler for spot:

extraArgs:
  balance-similar-node-groups: true
  skip-nodes-with-local-storage: false
  max-node-provision-time: 15m

Add AWS Node Termination Handler:

$helm repo add eks https://aws.github.io/eks-charts
$helm install aws-node-termination-handler eks/aws-node-termination-handler \
> --namespace kube-system \
> --set enableSpotInterruptionDraining=true

Advanced Configuration

Multiple Node Groups

Scale different workloads independently:

cluster-autoscaler:
  autoscalingGroups:
    - name: cpu-small
      minSize: 2
      maxSize: 10
    - name: cpu-large
      minSize: 0
      maxSize: 5
    - name: gpu-a10
      minSize: 0
      maxSize: 10
    - name: gpu-t4
      minSize: 0
      maxSize: 5

Scale-Up Policies

Control scale-up behavior:

extraArgs:
  max-nodes-total: 50
  max-empty-bulk-delete: 10
  new-pod-scale-up-delay: 0s
  scan-interval: 10s

Resource Limits

Prevent runaway scaling:

extraArgs:
  cores-total: "0:512"
  memory-total: "0:2048"
  max-nodes-total: 100

Monitoring

CloudWatch Metrics

View Auto Scaling Group metrics in CloudWatch:

  • GroupDesiredCapacity
  • GroupInServiceInstances
  • GroupPendingInstances
  • GroupTerminatingInstances

Kubernetes Events

$kubectl get events -n smallest --sort-by='.lastTimestamp' | grep -i scale

Cluster Autoscaler Status

$kubectl get configmap cluster-autoscaler-status -n kube-system -o yaml

Grafana Dashboard

Import Cluster Autoscaler dashboard:

Dashboard ID: 3831

See Grafana Dashboards

Troubleshooting

Nodes Not Scaling Up

Check pending pods:

$kubectl get pods --all-namespaces --field-selector=status.phase=Pending

Check Cluster Autoscaler logs:

$kubectl logs -n kube-system -l app.kubernetes.io/name=aws-cluster-autoscaler --tail=100

Common issues:

  • Max nodes reached (max-nodes-total)
  • IAM permission denied
  • Auto Scaling Group at max capacity
  • Node group not tagged properly

Nodes Not Scaling Down

Check node utilization:

$kubectl top nodes

Check for blocking conditions:

$kubectl describe node <node-name> | grep -i "scale-down-disabled"

Common causes:

  • Pods with restrictive PodDisruptionBudgets
  • Pods with local storage (unless skip-nodes-with-local-storage: false)
  • Pods not managed by a controller (bare pods)
  • System pods (unless skip-nodes-with-system-pods: false)
  • Node utilization above scale-down-utilization-threshold
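For the local-storage case, individual pods can opt in to eviction with an annotation on the pod template:

```yaml
# Pod template metadata sketch: mark this pod evictable despite
# local storage or other default scale-down blockers
metadata:
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
```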

Permission Errors

Check service account:

$kubectl describe sa cluster-autoscaler -n kube-system

Verify IAM role:

$kubectl logs -n kube-system -l app.kubernetes.io/name=aws-cluster-autoscaler | grep AccessDenied

Update IAM policy if needed (see IAM & IRSA)

Best Practices

Always tag Auto Scaling Groups:

k8s.io/cluster-autoscaler/smallest-cluster: owned
k8s.io/cluster-autoscaler/enabled: true

Configure appropriate min/max for each node group:

gpu-nodes:
  minSize: 0 # Save costs
  maxSize: 10 # Prevent runaway

Protect critical workloads during scale-down:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: lightning-asr-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: lightning-asr

Track scaling decisions in Grafana

Set alerts for scale failures

Periodically test scale-up and scale-down:

$kubectl scale deployment lightning-asr --replicas=20

Watch for proper node addition/removal
