Prerequisites

Kubernetes deployment is currently only available for ASR (Speech-to-Text). TTS (Text-to-Speech) Kubernetes support is coming soon. For TTS deployments, please use Docker.

Overview

Before deploying Smallest Self-Host ASR on Kubernetes, ensure your cluster meets the requirements and you have the necessary tools and credentials.

Kubernetes Cluster Requirements

Minimum Cluster Specifications

Kubernetes Version

v1.19 or higher

v1.24+ recommended
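The version check above can be scripted. A minimal sketch, using a hypothetical sample version string in place of live `kubectl version` output:

```shell
# Sketch: compare a cluster's minor version against the v1.24+ recommendation.
# SERVER_VERSION is a hypothetical sample; on a live cluster, capture it from
# `kubectl version` output instead.
SERVER_VERSION="v1.27.4"
MINOR=${SERVER_VERSION#v1.}   # strip the "v1." prefix -> "27.4"
MINOR=${MINOR%%.*}            # strip the patch suffix  -> "27"
if [ "$MINOR" -ge 24 ]; then
  echo "Kubernetes v1.$MINOR meets the v1.24+ recommendation"
else
  echo "Kubernetes v1.$MINOR works (v1.19 minimum), but v1.24+ is recommended"
fi
```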

Node Count

Minimum 2 nodes

  • 1 CPU node (control plane/general)
  • 1 GPU node (Lightning ASR)

Total Resources

Minimum cluster capacity

  • 8 CPU cores
  • 32 GB RAM
  • 1 NVIDIA GPU

Storage

Persistent volume support

  • Storage class available
  • 100 GB minimum capacity
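A quick way to sanity-check the totals above is to sum per-node figures against the documented minimums. A minimal sketch with hypothetical sample values; on a live cluster these would come from `kubectl get nodes` output:

```shell
# Sketch: total up per-node CPU and memory against the documented minimums
# (8 cores, 32 GB RAM). The sample values below are hypothetical.
NODE_CPUS="4 4"        # cores per node, space-separated
NODE_MEM_GB="16 16"    # GB of RAM per node, space-separated
cpu_total=0; mem_total=0
for c in $NODE_CPUS;  do cpu_total=$((cpu_total + c)); done
for m in $NODE_MEM_GB; do mem_total=$((mem_total + m)); done
[ "$cpu_total" -ge 8 ]  && echo "CPU: $cpu_total cores (OK)" || echo "CPU: $cpu_total cores (below minimum)"
[ "$mem_total" -ge 32 ] && echo "RAM: $mem_total GB (OK)"    || echo "RAM: $mem_total GB (below minimum)"
```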

We recommend NVIDIA L4 or L40S GPUs for the best performance.

Required Tools

Install the following tools on your local machine:

Helm

Helm 3.0 or higher is required. On macOS, you can install it with Homebrew:

$ brew install helm

Verify installation:

$ helm version

kubectl

The Kubernetes CLI tool for cluster management. On macOS, you can install it with Homebrew:

$ brew install kubectl

Verify installation:

$ kubectl version --client

Cluster Access

Configure kubectl

Ensure kubectl is configured to access your cluster:

$ kubectl cluster-info
$ kubectl get nodes

Expected output should show your cluster nodes.

Test Cluster Access

Verify you have sufficient permissions:

$ kubectl auth can-i create deployments
$ kubectl auth can-i create services
$ kubectl auth can-i create secrets

All should return yes.
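The three checks above can be run in one loop. A minimal sketch; the `kubectl` stub below only makes the sketch runnable without a cluster, so delete it to run the real checks:

```shell
# Sketch: run all three permission checks in one loop.
kubectl() { echo yes; }   # stub standing in for the real kubectl binary; remove on a real cluster
for kind in deployments services secrets; do
  answer=$(kubectl auth can-i create "$kind")
  echo "create $kind: $answer"
  [ "$answer" = "yes" ] || echo "WARNING: missing permission to create $kind"
done
```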

GPU Support

NVIDIA GPU Operator

For Kubernetes clusters, install the NVIDIA GPU Operator to manage GPU resources.

The Smallest Self-Host Helm chart includes the GPU Operator as an optional dependency. You can enable it during installation or install it separately.

Verify GPU Nodes

Check that GPU nodes are properly labeled:

$ kubectl get nodes -l node.kubernetes.io/instance-type

Verify GPU resources are available:

$ kubectl get nodes -o json | jq '.items[].status.capacity'

Look for nvidia.com/gpu in the capacity.
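That check can be scripted without `jq`. A minimal sketch; the JSON below is a hypothetical, trimmed-down stand-in for real `kubectl get nodes -o json` output:

```shell
# Sketch: confirm nvidia.com/gpu appears in node capacity.
# On a live cluster, replace the sample file with `kubectl get nodes -o json` output.
cat > /tmp/nodes-sample.json <<'EOF'
{"items":[{"status":{"capacity":{"cpu":"8","memory":"32Gi","nvidia.com/gpu":"1"}}}]}
EOF
if grep -q '"nvidia.com/gpu"' /tmp/nodes-sample.json; then
  echo "GPU resource advertised by at least one node"
else
  echo "no nvidia.com/gpu capacity found - check the GPU Operator installation"
fi
```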

Credentials

Obtain the following from Smallest.ai before installation:

License Key

Your unique license key for validation.

Contact: support@smallest.ai

You’ll add this to values.yaml:

global:
  licenseKey: "your-license-key-here"

Image Pull Credentials

Credentials to pull Docker images from quay.io:

  • Username
  • Password
  • Email

Contact: support@smallest.ai

You’ll add these to values.yaml:

global:
  imageCredentials:
    username: "your-username"
    password: "your-password"
    email: "your-email"

Model Download URL

Download URL for the ASR models.

Contact: support@smallest.ai

You’ll add this to values.yaml:

models:
  asrModelUrl: "your-model-url"
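Taken together, the three snippets above all live in the same values.yaml. A sketch of the combined file (every value is a placeholder to be replaced with the credentials from Smallest.ai):

```yaml
global:
  licenseKey: "your-license-key-here"
  imageCredentials:
    username: "your-username"
    password: "your-password"
    email: "your-email"

models:
  asrModelUrl: "your-model-url"
```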

Storage Requirements

Storage Class

Verify a storage class is available:

$ kubectl get storageclass

You should see at least one storage class marked as (default) or available.

For AWS Deployments

If deploying on AWS EKS, you’ll need:

  • EBS CSI Driver for block storage
  • EFS CSI Driver for shared file storage (recommended for model storage)

See the AWS Deployment guide for detailed setup instructions.

Network Requirements

Required Ports

Ensure the following ports are accessible within the cluster:

Port   Service         Purpose
7100   API Server      Client API requests
2269   Lightning ASR   Internal ASR processing
3369   License Proxy   Internal license validation
6379   Redis           Internal caching

External Access

The License Proxy requires outbound HTTPS access to:

  • console-api.smallest.ai (port 443)

Ensure your cluster’s network policies and security groups allow outbound HTTPS traffic from pods.
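One way to verify egress is to probe the endpoint from a short-lived pod. A minimal sketch; the pod name and curl image are our own choices, and the `kubectl` stub below only makes the sketch runnable without a cluster, so delete it to run the real command:

```shell
# Sketch: probe outbound HTTPS from inside the cluster with a throwaway curl pod.
kubectl() { echo "200"; }   # stub standing in for the real kubectl binary; remove on a real cluster
code=$(kubectl run egress-test --rm -i --restart=Never --image=curlimages/curl -- \
  curl -s -o /dev/null -w "%{http_code}" https://console-api.smallest.ai)
echo "HTTPS to console-api.smallest.ai returned: $code"
```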

Optional Components

Prometheus & Grafana

For monitoring and autoscaling based on custom metrics:

  • Prometheus Operator (included in chart)
  • Grafana (included in chart)
  • Prometheus Adapter (included in chart)

These are required for:

  • Custom metrics-based autoscaling
  • Advanced monitoring dashboards
  • Performance visualization

Cluster Autoscaler

For automatic node scaling on AWS EKS:

  • IAM role with autoscaling permissions
  • IRSA (IAM Roles for Service Accounts) configured

See the Cluster Autoscaler guide for setup.

Namespace

Decide on a namespace for deployment:

To deploy to the default namespace:

$ kubectl config set-context --current --namespace=default

Verification Checklist

Before proceeding, ensure:

1. Cluster Access

$ kubectl get nodes

Shows all cluster nodes in Ready state

2. GPU Nodes Available

$ kubectl get nodes -o json | jq '.items[].status.capacity."nvidia.com/gpu"'

Shows GPU count for GPU nodes

3. Helm Installed

$ helm version

Shows Helm 3.x

4. Storage Available

$ kubectl get storageclass

Shows at least one storage class

5. Credentials Ready

  • License key obtained
  • Container registry credentials
  • Model download URL

6. Sufficient Resources

$ kubectl top nodes

Shows available resources for deployment

AWS-Specific Prerequisites

If deploying on AWS EKS, see the AWS Deployment guide for additional prerequisites.

What’s Next?

Once all prerequisites are met, proceed to the quick start guide.