Prerequisites
Kubernetes deployment is currently available only for ASR (Speech-to-Text); Kubernetes support for TTS (Text-to-Speech) is coming soon. For TTS deployments, please use Docker.
Overview
Before deploying Smallest Self-Host ASR on Kubernetes, ensure your cluster meets the requirements and you have the necessary tools and credentials.
Kubernetes Cluster Requirements
Minimum Cluster Specifications
| Requirement | Specification |
| --- | --- |
| Kubernetes version | v1.19 or higher (v1.24+ recommended) |
| Nodes | Minimum 2: 1 CPU node (control plane/general), 1 GPU node (Lightning ASR) |
| Cluster capacity | Minimum 8 CPU cores, 32 GB RAM, 1 NVIDIA GPU |
| Storage | Persistent volume support: storage class available, 100 GB minimum capacity |
We recommend NVIDIA L4 or L40S GPUs for the best performance.
Required Tools
Install the following tools on your local machine:
Helm
Helm 3.0 or higher is required.
macOS
Linux
Windows
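The per-platform install commands appear to be missing here; the standard ones from Helm's official documentation are (verify against the current docs for your platform):

```shell
# macOS (Homebrew)
brew install helm

# Linux (official installer script)
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

# Windows (Chocolatey)
choco install kubernetes-helm
```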
Verify installation:
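If Helm is on your PATH, the following prints the client version (it should report v3.x):

```shell
helm version --short
```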
kubectl
Kubernetes CLI tool for cluster management.
macOS
Linux
Windows
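The per-platform install commands appear to be missing here; the standard ones from the Kubernetes documentation are (verify against the current docs for your platform and architecture):

```shell
# macOS (Homebrew)
brew install kubectl

# Linux (download the latest stable release for amd64)
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

# Windows (Chocolatey)
choco install kubernetes-cli
```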
Verify installation:
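For example:

```shell
kubectl version --client
```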
Cluster Access
Configure kubectl
Ensure kubectl is configured to access your cluster:
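A quick sanity check:

```shell
# Show the active context and list cluster nodes
kubectl config current-context
kubectl get nodes
```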
Expected output should show your cluster nodes.
Test Cluster Access
Verify you have sufficient permissions:
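A typical permission check looks like the following (the exact resource list needed by the chart may differ):

```shell
kubectl auth can-i create deployments
kubectl auth can-i create services
kubectl auth can-i create secrets
kubectl auth can-i create persistentvolumeclaims
```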
All should return yes.
GPU Support
NVIDIA GPU Operator
For Kubernetes clusters, install the NVIDIA GPU Operator to manage GPU resources.
The Smallest Self-Host Helm chart includes the GPU Operator as an optional dependency. You can enable it during installation or install it separately.
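If you install the GPU Operator separately, the standard commands from NVIDIA's documentation are:

```shell
# Add NVIDIA's Helm repository and install the GPU Operator
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator --create-namespace
```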
Verify GPU Nodes
Check that GPU nodes are properly labeled:
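When the GPU Operator is running, its feature discovery labels GPU nodes; you can filter on that label (label key per NVIDIA's GPU Feature Discovery):

```shell
kubectl get nodes -l nvidia.com/gpu.present=true
```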
Verify GPU resources are available:
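For example (replace `<gpu-node-name>` with one of your GPU nodes):

```shell
kubectl describe node <gpu-node-name> | grep "nvidia.com/gpu"
```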
Look for nvidia.com/gpu in the capacity.
Credentials
Obtain the following from Smallest.ai before installation:
License Key
Container Registry Credentials
Credentials to pull Docker images from quay.io:
- Username
- Password
Contact: support@smallest.ai
You’ll add these to values.yaml:
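The exact key names depend on the chart; the snippet below shows an illustrative structure only (key names are assumptions, not confirmed from the chart's values.yaml):

```shell
# Append illustrative credential keys to values.yaml
# (confirm key names against the chart's own values.yaml)
cat >> values.yaml <<'EOF'
imageCredentials:
  registry: quay.io
  username: <your-username>
  password: <your-password>
licenseKey: <your-license-key>
EOF
```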
Model URLs
Storage Requirements
Storage Class
Verify a storage class is available:
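For example:

```shell
kubectl get storageclass
```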
You should see at least one storage class marked as (default) or available.
For AWS Deployments
If deploying on AWS EKS, you’ll need:
- EBS CSI Driver for block storage
- EFS CSI Driver for shared file storage (recommended for model storage)
See the AWS Deployment guide for detailed setup instructions.
Network Requirements
Required Ports
Ensure the following ports are accessible within the cluster:
External Access
The License Proxy requires outbound HTTPS access to:
console-api.smallest.ai (port 443)
Ensure your cluster’s network policies and security groups allow outbound HTTPS traffic from pods.
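One way to spot-check outbound access from inside the cluster is to run a throwaway pod (the image name here is an assumption; any image with curl works):

```shell
# Prints the HTTP status code if outbound HTTPS is allowed
kubectl run egress-test --rm -it --restart=Never \
  --image=curlimages/curl -- \
  curl -sS -o /dev/null -w "%{http_code}\n" https://console-api.smallest.ai
```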
Optional Components
Prometheus & Grafana
For monitoring and autoscaling based on custom metrics:
- Prometheus Operator (included in chart)
- Grafana (included in chart)
- Prometheus Adapter (included in chart)
These are required for:
- Custom metrics-based autoscaling
- Advanced monitoring dashboards
- Performance visualization
Cluster Autoscaler
For automatic node scaling on AWS EKS:
- IAM role with autoscaling permissions
- IRSA (IAM Roles for Service Accounts) configured
See the Cluster Autoscaler guide for setup.
Namespace
Decide on a namespace for deployment:
Default Namespace
Custom Namespace
Deploy to the default namespace:
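For example (the namespace name below is illustrative):

```shell
# Default namespace: no extra setup needed; omit the -n flag on later commands.
# Custom namespace:
kubectl create namespace smallest
kubectl config set-context --current --namespace=smallest
```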
Verification Checklist
Before proceeding, ensure:
- Cluster meets the minimum specifications (v1.19+, 2 nodes, 8 CPU cores, 32 GB RAM, 1 NVIDIA GPU)
- Helm 3.0+ and kubectl are installed and verified
- kubectl is configured with sufficient permissions on the target cluster
- NVIDIA GPU Operator is installed and GPU resources are visible on GPU nodes
- License key and quay.io registry credentials have been obtained from Smallest.ai
- A storage class with at least 100 GB capacity is available
- Outbound HTTPS access to console-api.smallest.ai is allowed
AWS-Specific Prerequisites
If deploying on AWS EKS, see the AWS Deployment guide for EKS-specific setup (EBS/EFS CSI drivers, IRSA).
What’s Next?
Once all prerequisites are met, proceed to the Quick Start guide.

