AWS EKS Setup
Overview
This guide walks you through creating an Amazon EKS cluster optimized for running Smallest Self-Host with GPU acceleration.
Prerequisites
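The prerequisite list is not reproduced here. As a minimal sketch, assuming the guide relies on the AWS CLI, eksctl, kubectl, and Helm (a tool list inferred from the later sections, not confirmed by this page), the following check reports which of them are on your PATH:

```shell
# Assumed tool list: aws, eksctl, kubectl, helm (inferred from later sections).
missing=0
for tool in aws eksctl kubectl helm; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
    missing=$((missing + 1))
  fi
done
echo "$missing required tool(s) missing"
```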
Cluster Configuration
Option 1: Quick Start with eksctl
Create the cluster and its CPU node group with a single command:
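A possible form of the command is sketched below. The region, node group name, instance type, and node counts are assumptions chosen for illustration; only the cluster name `smallest-cluster` comes from this guide.

```shell
# Assumed values: region, node group name, instance type, and sizes.
eksctl create cluster \
  --name smallest-cluster \
  --region us-east-1 \
  --nodegroup-name cpu-nodes \
  --node-type m5.xlarge \
  --nodes 2 \
  --nodes-min 1 \
  --nodes-max 4 \
  --managed
```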
Then add a GPU node group:
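A sketch of the node group command, assuming a group named `gpu-nodes` (hypothetical) in the `us-east-1` region (also an assumption):

```shell
# Assumed values: region, node group name, and sizes.
eksctl create nodegroup \
  --cluster smallest-cluster \
  --region us-east-1 \
  --name gpu-nodes \
  --node-type g5.xlarge \
  --nodes 1 \
  --nodes-min 0 \
  --nodes-max 4 \
  --managed
```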
This creates a cluster with separate CPU and GPU node groups, allowing for cost-effective scaling.
Option 2: Using Cluster Config File
Create a cluster configuration file for more control:
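The configuration below is an illustrative sketch, not the guide's original file. The file name `cluster-config.yaml`, the region, the node group names, the label, and all sizes are assumptions. It writes an eksctl `ClusterConfig` with one CPU and one GPU managed node group.

```shell
# Writes an example eksctl config; every value except the cluster name is assumed.
cat > cluster-config.yaml <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: smallest-cluster
  region: us-east-1          # assumed region

managedNodeGroups:
  - name: cpu-nodes          # hypothetical name
    instanceType: m5.xlarge
    desiredCapacity: 2
    minSize: 1
    maxSize: 4
  - name: gpu-nodes          # hypothetical name
    instanceType: g5.xlarge
    desiredCapacity: 1
    minSize: 0
    maxSize: 4
    volumeSize: 100          # GiB, assumed
    labels:
      workload: gpu          # hypothetical label
EOF
```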
Create the cluster:
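Assuming the configuration was saved as `cluster-config.yaml` (a hypothetical file name), the cluster can be created from it like this:

```shell
# File name is an assumption; pass the path to your own config file.
eksctl create cluster -f cluster-config.yaml
```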
Cluster creation takes 15-20 minutes. Monitor progress in the AWS CloudFormation console.
GPU Instance Types
Choose the right GPU instance type for your workload:
Recommendation: Start with g5.xlarge for development and testing. Scale to g5.2xlarge or higher for production.
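To compare candidate instance types yourself, the AWS CLI can print their vCPU, memory, and GPU details. This is a sketch that assumes the `us-east-1` region and uses the two instance types named above.

```shell
# Prints instance type, vCPUs, memory (MiB), GPU model, and total GPU memory (MiB).
aws ec2 describe-instance-types \
  --region us-east-1 \
  --instance-types g5.xlarge g5.2xlarge \
  --query 'InstanceTypes[].[InstanceType, VCpuInfo.DefaultVCpus, MemoryInfo.SizeInMiB, GpuInfo.Gpus[0].Name, GpuInfo.TotalGpuMemoryInMiB]' \
  --output table
```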
Verify Cluster
Check Cluster Status
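One way to check the status, assuming the `us-east-1` region, is shown below. The second command reports `ACTIVE` once the cluster is ready.

```shell
# Region is an assumption; adjust to match your cluster.
eksctl get cluster --name smallest-cluster --region us-east-1

aws eks describe-cluster \
  --name smallest-cluster \
  --region us-east-1 \
  --query 'cluster.status' \
  --output text
```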
Verify Node Groups
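A sketch for listing the node groups and their sizes, again assuming the `us-east-1` region:

```shell
# Lists every node group in the cluster with its instance type and capacity.
eksctl get nodegroup --cluster smallest-cluster --region us-east-1
```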
Configure kubectl
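The AWS CLI can write the cluster's credentials into your kubeconfig. The region below is an assumption.

```shell
# Adds or updates a context for the cluster in ~/.kube/config.
aws eks update-kubeconfig --name smallest-cluster --region us-east-1
```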
Verify access:
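A minimal check that kubectl can reach the API server:

```shell
# Lists all worker nodes registered with the cluster.
kubectl get nodes
```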
Expected output: one row per node, each with STATUS Ready.
Verify GPU Nodes
Check GPU availability:
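One way to do this, sketched here, is to filter the node descriptions for the GPU resource name:

```shell
# Prints the Capacity and Allocatable lines that mention the GPU resource.
kubectl describe nodes | grep -i "nvidia.com/gpu"
```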
Look for nvidia.com/gpu in the output. GPU nodes list it under both Capacity and Allocatable once the device plugin is running.
Install NVIDIA Device Plugin
The NVIDIA device plugin enables GPU scheduling in Kubernetes.
Using Helm (Recommended)
The Smallest Self-Host chart includes the NVIDIA GPU Operator. Enable it in your values:
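The exact values key is not shown on this page. The sketch below assumes the operator is toggled through a `gpu-operator.enabled` key, which is a hypothetical name; check the chart's own `values.yaml` for the real one. The file name `gpu-values.yaml` is also an assumption.

```shell
# ASSUMPTION: the key "gpu-operator" is hypothetical; confirm it in the chart's values.yaml.
cat > gpu-values.yaml <<'EOF'
gpu-operator:
  enabled: true
EOF
```

Pass the file with `-f gpu-values.yaml` when you install or upgrade the chart.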
Manual Installation
If installing separately:
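One option, sketched below, is the upstream Helm chart published by NVIDIA's k8s-device-plugin project. The release name `nvdp` and the namespace are assumptions, and this may differ from the method the guide originally showed.

```shell
# Release name and namespace are assumptions.
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update

helm upgrade --install nvdp nvdp/nvidia-device-plugin \
  --namespace nvidia-device-plugin \
  --create-namespace
```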
Verify:
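A namespace-independent check, as a sketch, is to search all namespaces for the plugin's DaemonSet and pods:

```shell
# The plugin runs as a DaemonSet with one pod per GPU node.
kubectl get daemonsets --all-namespaces | grep -i nvidia
kubectl get pods --all-namespaces -o wide | grep -i nvidia
```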
Install EBS CSI Driver
The EBS CSI driver is required for EBS-backed persistent volumes.
Using eksctl
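The sequence below is a sketch of a common way to install the add-on with an IAM role for its service account. The region and the role name `AmazonEKS_EBS_CSI_DriverRole` are assumptions; the policy ARN is the AWS managed policy for the driver.

```shell
CLUSTER=smallest-cluster
REGION=us-east-1                           # assumed region
ROLE_NAME=AmazonEKS_EBS_CSI_DriverRole     # assumed role name
ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"

# IAM roles for service accounts require an OIDC provider on the cluster.
eksctl utils associate-iam-oidc-provider \
  --cluster "$CLUSTER" --region "$REGION" --approve

# Create only the IAM role; the add-on creates the service account itself.
eksctl create iamserviceaccount \
  --name ebs-csi-controller-sa \
  --namespace kube-system \
  --cluster "$CLUSTER" --region "$REGION" \
  --role-name "$ROLE_NAME" \
  --role-only \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
  --approve

# Install the managed add-on and bind it to the role.
eksctl create addon \
  --name aws-ebs-csi-driver \
  --cluster "$CLUSTER" --region "$REGION" \
  --service-account-role-arn "arn:aws:iam::${ACCOUNT_ID}:role/${ROLE_NAME}" \
  --force
```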
Using AWS Console
- Navigate to EKS → Clusters → smallest-cluster → Add-ons
- Click “Add new”
- Select “Amazon EBS CSI Driver”
- Click “Add”
Verify EBS CSI Driver
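A sketch of two checks, assuming the `us-east-1` region: the add-on status, and the driver pods in `kube-system`.

```shell
# Add-on status as reported by EKS.
eksctl get addon --name aws-ebs-csi-driver \
  --cluster smallest-cluster --region us-east-1

# Controller and node pods should be Running.
kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-ebs-csi-driver
```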
Install EFS CSI Driver (Optional)
Recommended for shared model storage across pods.
Create IAM Policy
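As a sketch, the policy can be created from the example document published in the aws-efs-csi-driver repository. The local file name and the policy name `AmazonEKS_EFS_CSI_Driver_Policy` are assumptions.

```shell
# Download the example policy document from the driver's repository.
curl -fsSL -o efs-iam-policy.json \
  https://raw.githubusercontent.com/kubernetes-sigs/aws-efs-csi-driver/master/docs/iam-policy-example.json

# Policy name is an assumption; reuse it when creating the service account.
aws iam create-policy \
  --policy-name AmazonEKS_EFS_CSI_Driver_Policy \
  --policy-document file://efs-iam-policy.json
```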
Create IAM Service Account
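A possible command, assuming the `us-east-1` region, a policy named `AmazonEKS_EFS_CSI_Driver_Policy`, and a service account named `efs-csi-controller-sa` (all assumptions):

```shell
# Creates the service account and an IAM role bound to the EFS policy.
eksctl create iamserviceaccount \
  --cluster smallest-cluster \
  --region us-east-1 \
  --namespace kube-system \
  --name efs-csi-controller-sa \
  --attach-policy-arn arn:aws:iam::YOUR_ACCOUNT_ID:policy/AmazonEKS_EFS_CSI_Driver_Policy \
  --approve
```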
Replace YOUR_ACCOUNT_ID with your AWS account ID.
Install EFS CSI Driver
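The driver can be installed from its upstream Helm chart. This sketch assumes a pre-created service account named `efs-csi-controller-sa` (an assumption) and tells the chart to reuse it instead of creating a new one.

```shell
helm repo add aws-efs-csi-driver https://kubernetes-sigs.github.io/aws-efs-csi-driver/
helm repo update

# Reuse the existing IAM-bound service account (name is an assumption).
helm upgrade --install aws-efs-csi-driver aws-efs-csi-driver/aws-efs-csi-driver \
  --namespace kube-system \
  --set controller.serviceAccount.create=false \
  --set controller.serviceAccount.name=efs-csi-controller-sa
```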
Verify:
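A minimal check, assuming the driver was installed into `kube-system` from the upstream Helm chart:

```shell
# Controller and node pods should be Running.
kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-efs-csi-driver
```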
Enable Cluster Autoscaler
See the Cluster Autoscaler guide for detailed setup.
Quick setup:
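The sketch below uses the upstream Cluster Autoscaler Helm chart with auto-discovery, which may differ from the setup in the dedicated guide. The release name and region are assumptions, and the autoscaler additionally needs IAM permissions to manage Auto Scaling groups, which this snippet does not configure.

```shell
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo update

# Release name and region are assumptions; IAM permissions must be set up separately.
helm upgrade --install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --set autoDiscovery.clusterName=smallest-cluster \
  --set awsRegion=us-east-1
```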
Cost Optimization
Use Spot Instances for GPU Nodes
Reduce costs by up to 70% with Spot instances:
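An illustrative eksctl config for a Spot-backed GPU node group follows. The file name, region, node group name, and sizes are assumptions. Listing more than one instance type gives EC2 more Spot capacity pools to draw from.

```shell
# Writes an example config; apply it with: eksctl create nodegroup -f gpu-spot-nodegroup.yaml
cat > gpu-spot-nodegroup.yaml <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: smallest-cluster
  region: us-east-1            # assumed region

managedNodeGroups:
  - name: gpu-spot-nodes       # hypothetical name
    spot: true
    instanceTypes: ["g5.xlarge", "g5.2xlarge"]
    desiredCapacity: 1
    minSize: 0
    maxSize: 4
EOF
```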
Spot instances can be interrupted with a 2-minute warning. Ensure your application handles graceful shutdowns.
Right-Size Node Groups
Start small and scale based on metrics:
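For an existing node group, the size limits can be adjusted in place. This sketch assumes a group named `gpu-nodes` (hypothetical) in `us-east-1`; `--nodes-min` and `--nodes-max` correspond to `minSize` and `maxSize` in a config file.

```shell
# Node group name, region, and sizes are assumptions.
eksctl scale nodegroup \
  --cluster smallest-cluster \
  --region us-east-1 \
  --name gpu-nodes \
  --nodes 1 \
  --nodes-min 0 \
  --nodes-max 4
```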
Set minSize: 0 to scale down to zero during off-hours.
Enable Cluster Autoscaler
Automatically adjust node count based on demand. Setup is covered under Enable Cluster Autoscaler earlier in this guide.
Security Best Practices
Enable Private Endpoint
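A sketch of switching the API server to private-only access, assuming the `us-east-1` region. With public access disabled, kubectl only works from inside the VPC or a network connected to it.

```shell
# Disables the public endpoint and enables the private one.
aws eks update-cluster-config \
  --name smallest-cluster \
  --region us-east-1 \
  --resources-vpc-config endpointPublicAccess=false,endpointPrivateAccess=true
```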
Enable Logging
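Control plane logs can be sent to CloudWatch Logs. The sketch below enables all five log types and assumes the `us-east-1` region; trim the list to reduce log volume and cost.

```shell
# Enables every control plane log type.
aws eks update-cluster-config \
  --name smallest-cluster \
  --region us-east-1 \
  --logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}'
```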
Update Security Groups
Restrict inbound access to the API server.
Update the security group rules to allow only specific IP ranges.
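As a sketch, you can look up the cluster security group, review its rules, and then add a narrowly scoped HTTPS rule. The region is an assumption, and `203.0.113.0/24` is a documentation-only placeholder range to replace with your own.

```shell
REGION=us-east-1   # assumed region

# Find the security group that EKS attached to the cluster.
SG_ID="$(aws eks describe-cluster \
  --name smallest-cluster --region "$REGION" \
  --query 'cluster.resourcesVpcConfig.clusterSecurityGroupId' \
  --output text)"

# Review the current rules before changing anything.
aws ec2 describe-security-group-rules \
  --region "$REGION" \
  --filters Name=group-id,Values="$SG_ID"

# Allow HTTPS only from a specific range (placeholder CIDR).
aws ec2 authorize-security-group-ingress \
  --region "$REGION" \
  --group-id "$SG_ID" \
  --protocol tcp --port 443 \
  --cidr 203.0.113.0/24
```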
Troubleshooting
GPU Nodes Not Ready
Check NVIDIA device plugin:
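A few diagnostic commands, sketched without assuming which namespace the plugin runs in. `NAMESPACE` and `POD_NAME` are placeholders to fill in from the first command's output.

```shell
# Find the plugin pods and see which nodes they run on.
kubectl get pods --all-namespaces -o wide | grep -i nvidia

# Show how many GPUs each node advertises (<none> means no GPU registered).
kubectl get nodes \
  -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'

# Inspect the logs of a plugin pod (placeholders).
kubectl logs -n NAMESPACE POD_NAME
```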
Pods Stuck in Pending
Check node capacity:
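The commands below are one way to see why a pod is unschedulable. `NAMESPACE` and `POD_NAME` are placeholders.

```shell
# List pods that have not been scheduled yet.
kubectl get pods --all-namespaces --field-selector status.phase=Pending

# The Events section explains the scheduling failure (placeholders).
kubectl describe pod POD_NAME -n NAMESPACE

# Compare requested resources with what each node has left.
kubectl describe nodes | grep -A 10 "Allocated resources"
```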
EBS Volumes Not Mounting
Verify EBS CSI driver:
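A sketch of the usual checks, assuming the driver was installed as the EKS add-on in `kube-system`:

```shell
# Driver pods should be Running on every node.
kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-ebs-csi-driver

# Recent controller logs often show IAM or volume attachment errors.
kubectl logs -n kube-system deployment/ebs-csi-controller -c ebs-plugin --tail=50

# Claims stuck in Pending point to provisioning problems.
kubectl get pvc --all-namespaces
```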

