Prerequisites

Kubernetes deployment is currently only available for ASR (Speech-to-Text). TTS (Text-to-Speech) Kubernetes support is coming soon. For TTS deployments, please use Docker.

Overview

Before deploying Smallest Self-Host ASR on Kubernetes, ensure your cluster meets the requirements and you have the necessary tools and credentials.

Kubernetes Cluster Requirements

Minimum Cluster Specifications

Kubernetes Version

v1.19 or higher

v1.24+ recommended
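The version check above can be scripted. A minimal sketch, using a hypothetical sample version string in place of live `kubectl version` output:

```shell
# Sketch: compare a cluster's minor version against the v1.24+ recommendation.
# SERVER_VERSION is a hypothetical sample; on a live cluster, capture it from
# `kubectl version` output instead.
SERVER_VERSION="v1.27.4"
MINOR=${SERVER_VERSION#v1.}   # strip the "v1." prefix -> "27.4"
MINOR=${MINOR%%.*}            # strip the patch suffix  -> "27"
if [ "$MINOR" -ge 24 ]; then
  echo "Kubernetes v1.$MINOR meets the v1.24+ recommendation"
else
  echo "Kubernetes v1.$MINOR works (v1.19 minimum), but v1.24+ is recommended"
fi
```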

Node Count

Minimum 2 nodes

  • 1 CPU node (control plane/general)
  • 1 GPU node (Lightning ASR)

Total Resources

Minimum cluster capacity

  • 8 CPU cores
  • 32 GB RAM
  • 1 NVIDIA GPU

Storage

Persistent volume support

  • Storage class available
  • 100 GB minimum capacity
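A quick way to sanity-check the totals above is to sum per-node figures against the documented minimums. A minimal sketch with hypothetical sample values; on a live cluster these would come from `kubectl get nodes` output:

```shell
# Sketch: total up per-node CPU and memory against the documented minimums
# (8 cores, 32 GB RAM). The sample values below are hypothetical.
NODE_CPUS="4 4"        # cores per node, space-separated
NODE_MEM_GB="16 16"    # GB of RAM per node, space-separated
cpu_total=0; mem_total=0
for c in $NODE_CPUS;  do cpu_total=$((cpu_total + c)); done
for m in $NODE_MEM_GB; do mem_total=$((mem_total + m)); done
[ "$cpu_total" -ge 8 ]  && echo "CPU: $cpu_total cores (OK)" || echo "CPU: $cpu_total cores (below minimum)"
[ "$mem_total" -ge 32 ] && echo "RAM: $mem_total GB (OK)"    || echo "RAM: $mem_total GB (below minimum)"
```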

We recommend NVIDIA L4 or L40S GPUs for the best performance.

Required Tools

Install the following tools on your local machine:

Helm

Helm 3.0 or higher is required. On macOS, you can install it with Homebrew:

$ brew install helm

Verify installation:

$ helm version

kubectl

The Kubernetes CLI tool for cluster management. On macOS, you can install it with Homebrew:

$ brew install kubectl

Verify installation:

$ kubectl version --client

Cluster Access

Configure kubectl

Ensure kubectl is configured to access your cluster:

$ kubectl cluster-info
$ kubectl get nodes

Expected output should show your cluster nodes.

Test Cluster Access

Verify you have sufficient permissions:

$ kubectl auth can-i create deployments
$ kubectl auth can-i create services
$ kubectl auth can-i create secrets

All should return yes.
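The three checks above can be run in one loop. A minimal sketch; the `kubectl` stub below only makes the sketch runnable without a cluster, so delete it to run the real checks:

```shell
# Sketch: run all three permission checks in one loop.
kubectl() { echo yes; }   # stub standing in for the real kubectl binary; remove on a real cluster
for kind in deployments services secrets; do
  answer=$(kubectl auth can-i create "$kind")
  echo "create $kind: $answer"
  [ "$answer" = "yes" ] || echo "WARNING: missing permission to create $kind"
done
```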

GPU Support

NVIDIA GPU Operator

For Kubernetes clusters, install the NVIDIA GPU Operator to manage GPU resources.

The Smallest Self-Host Helm chart includes the GPU Operator as an optional dependency. You can enable it during installation or install it separately.

Verify GPU Nodes

Check that GPU nodes are properly labeled:

$ kubectl get nodes -l node.kubernetes.io/instance-type

Verify GPU resources are available:

$ kubectl get nodes -o json | jq '.items[].status.capacity'

Look for nvidia.com/gpu in the capacity.
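That check can be scripted without `jq`. A minimal sketch; the JSON below is a hypothetical, trimmed-down stand-in for real `kubectl get nodes -o json` output:

```shell
# Sketch: confirm nvidia.com/gpu appears in node capacity.
# On a live cluster, replace the sample file with `kubectl get nodes -o json` output.
cat > /tmp/nodes-sample.json <<'EOF'
{"items":[{"status":{"capacity":{"cpu":"8","memory":"32Gi","nvidia.com/gpu":"1"}}}]}
EOF
if grep -q '"nvidia.com/gpu"' /tmp/nodes-sample.json; then
  echo "GPU resource advertised by at least one node"
else
  echo "no nvidia.com/gpu capacity found - check the GPU Operator installation"
fi
```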

Credentials

Obtain the following from Smallest.ai before installation:

License Key

Your unique license key for validation.

Contact: support@smallest.ai

You’ll add this to values.yaml:

global:
  licenseKey: "your-license-key-here"

Image Pull Credentials

Credentials to pull Docker images from quay.io:

  • Username
  • Password
  • Email

Contact: support@smallest.ai

You’ll add these to values.yaml:

global:
  imageCredentials:
    username: "your-username"
    password: "your-password"
    email: "your-email"

Model Download URL

Download URL for the ASR models.

Contact: support@smallest.ai

You’ll add this to values.yaml:

models:
  asrModelUrl: "your-model-url"
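Taken together, the three snippets above all live in the same values.yaml. A sketch of the combined file (every value is a placeholder to be replaced with the credentials from Smallest.ai):

```yaml
global:
  licenseKey: "your-license-key-here"
  imageCredentials:
    username: "your-username"
    password: "your-password"
    email: "your-email"

models:
  asrModelUrl: "your-model-url"
```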

Storage Requirements

Storage Class

Verify a storage class is available:

$ kubectl get storageclass

You should see at least one storage class marked as (default) or available.

For AWS Deployments

If deploying on AWS EKS, you’ll need:

  • EBS CSI Driver for block storage
  • EFS CSI Driver for shared file storage (recommended for model storage)

See the AWS Deployment guide for detailed setup instructions.

Network Requirements

Required Ports

Ensure the following ports are accessible within the cluster:

Port   Service         Purpose
7100   API Server      Client API requests
2269   Lightning ASR   Internal ASR processing
3369   License Proxy   Internal license validation
6379   Redis           Internal caching

External Access

The License Proxy requires outbound HTTPS access to:

  • console-api.smallest.ai (port 443)

Ensure your cluster’s network policies and security groups allow outbound HTTPS traffic from pods.
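One way to verify egress is to probe the endpoint from a short-lived pod. A minimal sketch; the pod name and curl image are our own choices, and the `kubectl` stub below only makes the sketch runnable without a cluster, so delete it to run the real command:

```shell
# Sketch: probe outbound HTTPS from inside the cluster with a throwaway curl pod.
kubectl() { echo "200"; }   # stub standing in for the real kubectl binary; remove on a real cluster
code=$(kubectl run egress-test --rm -i --restart=Never --image=curlimages/curl -- \
  curl -s -o /dev/null -w "%{http_code}" https://console-api.smallest.ai)
echo "HTTPS to console-api.smallest.ai returned: $code"
```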

Optional Components

Prometheus & Grafana

For monitoring and autoscaling based on custom metrics:

  • Prometheus Operator (included in chart)
  • Grafana (included in chart)
  • Prometheus Adapter (included in chart)

These are required for:

  • Custom metrics-based autoscaling
  • Advanced monitoring dashboards
  • Performance visualization

Cluster Autoscaler

For automatic node scaling on AWS EKS:

  • IAM role with autoscaling permissions
  • IRSA (IAM Roles for Service Accounts) configured

See the Cluster Autoscaler guide for setup.

Namespace

Decide on a namespace for deployment:

To deploy to the default namespace:

$ kubectl config set-context --current --namespace=default

Verification Checklist

Before proceeding, ensure:

1. Cluster Access

$ kubectl get nodes

Shows all cluster nodes in Ready state

2. GPU Nodes Available

$ kubectl get nodes -o json | jq '.items[].status.capacity."nvidia.com/gpu"'

Shows GPU count for GPU nodes

3. Helm Installed

$ helm version

Shows Helm 3.x

4. Storage Available

$ kubectl get storageclass

Shows at least one storage class

5. Credentials Ready

  • License key obtained
  • Container registry credentials
  • Model download URL

6. Sufficient Resources

$ kubectl top nodes

Shows available resources for deployment

AWS-Specific Prerequisites

If deploying on AWS EKS, see the AWS Deployment guide for additional prerequisites.

What’s Next?

Once all prerequisites are met, proceed to the quick start guide.