EFS Configuration
Overview
Amazon Elastic File System (EFS) provides shared, persistent file storage for Kubernetes pods. This is ideal for storing AI models that can be shared across multiple Lightning ASR pods, eliminating duplicate downloads and reducing startup time.
Benefits of EFS
Multiple pods can read/write simultaneously (ReadWriteMany)
Storage grows and shrinks automatically
Models cached once, used by all pods
Pay only for storage used, no upfront provisioning
Prerequisites
Create EFS File System
Using AWS Console
Configure File System
- Name: smallest-models
- VPC: Select your EKS cluster VPC
- Availability and Durability: Regional (recommended)
- Click “Customize”
File System Settings
- Performance mode: General Purpose
- Throughput mode: Bursting (or Elastic for production)
- Encryption: Enable encryption at rest
- Click “Next”
Using AWS CLI
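As a minimal sketch, the same file system can be created with the standard `aws efs` subcommands, mirroring the console settings above (General Purpose, Bursting, encryption at rest). The function names are illustrative, and the subnet and security group IDs are placeholders you would supply yourself.

```bash
# Create an encrypted, General Purpose file system and print its ID.
# Arguments: $1 = Name tag and creation token (this guide uses "smallest-models")
create_efs() {
  aws efs create-file-system \
    --creation-token "$1" \
    --performance-mode generalPurpose \
    --throughput-mode bursting \
    --encrypted \
    --tags "Key=Name,Value=$1" \
    --query 'FileSystemId' \
    --output text
}

# Create one mount target; repeat for every subnet your nodes run in.
# Arguments: $1 = file system ID, $2 = subnet ID, $3 = security group ID
create_mount_target() {
  aws efs create-mount-target \
    --file-system-id "$1" \
    --subnet-id "$2" \
    --security-groups "$3"
}
```

A typical call would be `FS_ID=$(create_efs smallest-models)` followed by one `create_mount_target "$FS_ID" <subnet-id> <security-group-id>` per subnet.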
Configure Security Group
Ensure the security group allows NFS traffic (port 2049) from cluster nodes:
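One way to add the rule is with `aws ec2 authorize-security-group-ingress`. This is a sketch: the function name is illustrative, and both security group IDs are placeholders for the group attached to your EFS mount targets and the group used by your cluster nodes.

```bash
# Allow inbound NFS (TCP 2049) on the EFS security group from the node security group.
# Arguments: $1 = EFS mount-target security group ID, $2 = node security group ID
# Example: allow_nfs_from_nodes <efs-sg-id> <node-sg-id>
allow_nfs_from_nodes() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol tcp \
    --port 2049 \
    --source-group "$2"
}
```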
If the rule already exists, you’ll see an error. This is safe to ignore.
Deploy with EFS in Helm
Update your values.yaml to enable EFS:
Replace fs-0123456789abcdef with your actual EFS file system ID.
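If you do not have the ID at hand, it can be looked up by the file system's Name tag. The sketch below assumes the file system was named as in this guide; the function name is illustrative.

```bash
# Print the ID of every file system whose Name tag matches $1.
# Arguments: $1 = Name tag (this guide uses "smallest-models")
find_efs_id() {
  aws efs describe-file-systems \
    --query "FileSystems[?Name=='$1'].FileSystemId" \
    --output text
}
```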
Deploy or Upgrade
Verify EFS Configuration
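The checks in this section can be run with plain `kubectl`. The sketch below bundles them into one helper; the namespace and pod name are placeholders, and the last command assumes the container image ships `df`.

```bash
# Run the storage checks described in this section.
# Arguments: $1 = namespace of the release, $2 = name of a running Lightning ASR pod
verify_efs_storage() {
  kubectl get storageclass                              # storage class
  kubectl get pv                                        # persistent volume
  kubectl get pvc -n "$1"                               # persistent volume claim
  kubectl exec -n "$1" "$2" -- df -h /app/models        # mount inside the pod
}
```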
Check Storage Class
Should show:
Check Persistent Volume
Should show:
Check Persistent Volume Claim
Should show:
Verify Mount in Pod
Should show the EFS mount:
Test EFS
Create a test file in one pod and verify it’s visible in another:
Write test file:
Read from another pod:
Should output: test
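A sketch of the write and read steps using `kubectl exec`. The namespace, the pod names, and the file name `test.txt` are placeholders; the mount path `/app/models` is the one used in this guide.

```bash
# Write a marker file from one pod and read it back from another.
# Arguments: $1 = namespace, $2 = writer pod, $3 = reader pod
test_shared_mount() {
  kubectl exec -n "$1" "$2" -- sh -c 'echo test > /app/models/test.txt'
  kubectl exec -n "$1" "$3" -- cat /app/models/test.txt
}
```

Remove the marker file afterwards so it does not linger on the shared volume.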
How Model Caching Works
With EFS enabled:
- First Pod Startup:
  - Pod downloads model from asrModelUrl
  - Saves model to /app/models (EFS mount)
  - Takes 5-10 minutes (one-time download)
- Subsequent Pod Startups:
  - Pod checks /app/models for existing model
  - Finds model already downloaded
  - Skips download, loads from EFS
  - Takes 30-60 seconds
This is especially valuable when using autoscaling, as new pods start much faster.
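The check-then-download pattern described above can be sketched in a few lines of shell. This is an illustration of the idea, not the Lightning ASR startup code: the file name `model.bin` and the `MODEL_FILE` and `FETCH` variables are hypothetical, and the sketch does not prevent two pods that start at the same moment from both downloading.

```bash
# Download the model only if the shared directory does not already hold it.
# Arguments: $1 = model URL, $2 = model directory (this guide mounts EFS at /app/models)
# MODEL_FILE and FETCH are hypothetical knobs for this sketch.
ensure_model() {
  target="$2/${MODEL_FILE:-model.bin}"
  if [ -s "$target" ]; then
    echo "cache hit: $target"      # a previous pod already saved the model
    return 0
  fi
  mkdir -p "$2"
  # Download under a temporary name, then rename, so other pods never see a partial file.
  tmp="$target.part.$$"
  if ! ${FETCH:-curl -fsSL -o} "$tmp" "$1"; then
    rm -f "$tmp"
    return 1
  fi
  mv "$tmp" "$target"
  echo "downloaded: $target"
}
```

The first call against an empty directory downloads the file; every later call, from any pod sharing the mount, takes the cache-hit branch.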
Performance Tuning
Choose Throughput Mode
Available modes: Bursting (default), Elastic, and Provisioned.
Bursting (Default)
Best for: Development, testing, variable workloads
- Throughput scales with storage size
- 50 MB/s per TB of storage
- Bursting to 100 MB/s
- Most cost-effective
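To see what the 50 MB/s per TB figure means for a small file system, the baseline can be computed directly. The sketch assumes decimal units (1 TB = 1000 GB); the function name is illustrative.

```bash
# Baseline Bursting throughput in MB/s for a given amount of stored data.
# Arguments: $1 = stored data in GB
baseline_throughput_mbps() {
  awk -v gb="$1" 'BEGIN { printf "%.2f\n", gb / 1000 * 50 }'
}

baseline_throughput_mbps 50   # prints 2.50
```

A file system holding only a 50 GB model therefore has a low baseline and relies on burst credits to reach higher rates, which is worth keeping in mind when many pods load the model at once.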
Enable Lifecycle Management
Automatically move infrequently accessed files to lower-cost storage:
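A sketch using `aws efs put-lifecycle-configuration`. The 30-day threshold matches the cost example in this guide; the function name and the file system ID argument are placeholders.

```bash
# Move files not accessed for 30 days to the Infrequent Access storage class.
# Arguments: $1 = file system ID
enable_ia_transition() {
  aws efs put-lifecycle-configuration \
    --file-system-id "$1" \
    --lifecycle-policies '[{"TransitionToIA":"AFTER_30_DAYS"}]'
}
```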
Cost Optimization
Monitor EFS Usage
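One way to check how much data the file system holds is to read its metered size from `aws efs describe-file-systems`. The function name is illustrative; the returned object reports the total size and its split between Standard and Infrequent Access storage.

```bash
# Print the metered size of a file system as JSON.
# Arguments: $1 = file system ID
show_efs_size() {
  aws efs describe-file-systems \
    --file-system-id "$1" \
    --query 'FileSystems[0].SizeInBytes' \
    --output json
}
```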
Estimate Costs
EFS pricing (us-east-1):
- Standard storage: ~$0.30/GB/month
- Infrequent Access: ~$0.025/GB/month
- Data transfer: Free within same AZ
For a 50 GB model:
- Standard: ~$15/month
- With IA (after 30 days): ~$1.25/month
Use lifecycle policies to automatically move old models to Infrequent Access storage.
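The estimate above is just size multiplied by the per-GB rate. A small helper makes it easy to redo the arithmetic for other model sizes; the rates are the approximate figures quoted in this section, not live pricing.

```bash
# Monthly storage cost in USD.
# Arguments: $1 = size in GB, $2 = price per GB per month
monthly_cost() {
  awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", gb * rate }'
}

monthly_cost 50 0.30    # Standard storage, prints 15.00
monthly_cost 50 0.025   # Infrequent Access, prints 1.25
```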
Backup and Recovery
Enable AWS Backup
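A sketch that turns on automatic backups for a file system with `aws efs put-backup-policy`. The function name and the file system ID argument are placeholders.

```bash
# Enable automatic backups through AWS Backup for one file system.
# Arguments: $1 = file system ID
enable_efs_backup() {
  aws efs put-backup-policy \
    --file-system-id "$1" \
    --backup-policy Status=ENABLED
}
```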
Manual Backup
When automatic backups are enabled, EFS creates point-in-time backups through AWS Backup. Access them via AWS Console → EFS → Backups.
Troubleshooting
Mount Failed
Check EFS CSI driver:
Verify security group rules:
Ensure port 2049 is open.
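Both checks can be sketched as follows. The label selector assumes the EFS CSI driver was installed in `kube-system` with its upstream default labels; adjust it if your installation differs. The security group ID is a placeholder.

```bash
# List the EFS CSI driver pods (assumes upstream default namespace and labels).
check_csi_driver() {
  kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-efs-csi-driver
}

# Print the inbound rules of a security group; look for TCP port 2049.
# Arguments: $1 = security group ID attached to the EFS mount targets
show_sg_ingress() {
  aws ec2 describe-security-groups \
    --group-ids "$1" \
    --query 'SecurityGroups[0].IpPermissions' \
    --output json
}
```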
Slow Performance
Check throughput mode:
Consider upgrading to Elastic or Provisioned.
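A sketch of reading the current mode and switching to Elastic with the `aws efs` CLI. The function names are illustrative. Note that AWS limits how often the throughput mode can be changed, so check the current mode first.

```bash
# Print the current throughput mode (bursting, elastic, or provisioned).
# Arguments: $1 = file system ID
show_throughput_mode() {
  aws efs describe-file-systems \
    --file-system-id "$1" \
    --query 'FileSystems[0].ThroughputMode' \
    --output text
}

# Switch the file system to Elastic throughput.
# Arguments: $1 = file system ID
set_throughput_elastic() {
  aws efs update-file-system --file-system-id "$1" --throughput-mode elastic
}
```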
Monitor CloudWatch metrics:
- PermittedThroughput
- BurstCreditBalance
- ClientConnections
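These metrics live in the `AWS/EFS` CloudWatch namespace and can be pulled with `aws cloudwatch get-metric-statistics`. In this sketch the function name, the 5-minute period, and the default statistic are assumptions; start and end times are ISO 8601 timestamps you supply.

```bash
# Fetch datapoints for one EFS metric over a time window.
# Arguments: $1 = file system ID, $2 = metric name (e.g. BurstCreditBalance),
#            $3 = start time, $4 = end time, $5 = statistic (optional, default Average)
efs_metric() {
  aws cloudwatch get-metric-statistics \
    --namespace AWS/EFS \
    --metric-name "$2" \
    --dimensions "Name=FileSystemId,Value=$1" \
    --start-time "$3" \
    --end-time "$4" \
    --period 300 \
    --statistics "${5:-Average}"
}
```

A steadily falling BurstCreditBalance suggests the workload is outrunning the Bursting baseline.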
Permission Denied
Check mount options in PV:
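The mount options can be read straight from the PersistentVolume spec with a JSONPath query. The function name is illustrative and the PV name is a placeholder; list the available names with `kubectl get pv`.

```bash
# Print the mountOptions array of a PersistentVolume.
# Arguments: $1 = PersistentVolume name
show_pv_mount_options() {
  kubectl get pv "$1" -o jsonpath='{.spec.mountOptions}'
}
```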
Should include:
Alternative: EBS for Single Pod
If you don’t need shared storage (single replica only):
An EBS volume can be attached to only one node at a time (ReadWriteOnce), so pods scheduled on other nodes cannot mount it. This prevents horizontal scaling.

