
Quick Start

Overview

This guide walks you through deploying Smallest Self-Host using Docker Compose. You’ll have a fully functional speech-to-text service running in under 15 minutes.

Ensure you’ve completed all prerequisites before starting this guide.

Step 1: Create Project Directory

Create a directory for your deployment:

$mkdir -p ~/smallest-self-host
$cd ~/smallest-self-host

Step 2: Login to Container Registry

Authenticate with the Smallest container registry using credentials provided by support:

$docker login quay.io

Enter your username and password when prompted.

Save your credentials securely. You’ll need them if you restart or redeploy the containers.
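
For scripted or repeated deployments, the login step can also run without a prompt. The sketch below assumes the credentials are exported in two environment variables, REGISTRY_USER and REGISTRY_PASSWORD; these names are hypothetical and not defined by Smallest. It relies on the standard `--password-stdin` option of `docker login`.

```shell
# Sketch: non-interactive registry login.
# REGISTRY_USER and REGISTRY_PASSWORD are hypothetical variable names.
registry_login() {
  local registry="${1:-quay.io}"
  if [ -z "${REGISTRY_USER:-}" ] || [ -z "${REGISTRY_PASSWORD:-}" ]; then
    echo "REGISTRY_USER and REGISTRY_PASSWORD must be set" >&2
    return 1
  fi
  # --password-stdin keeps the password off the command line.
  printf '%s' "$REGISTRY_PASSWORD" |
    docker login --username "$REGISTRY_USER" --password-stdin "$registry"
}

# Usage (not run here): registry_login quay.io
```

Passing the password on standard input keeps it out of the process list, which a plain `--password` argument would not.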

Step 3: Create Environment File

Create a .env file with your license key:

$cat > .env << 'EOF'
LICENSE_KEY=your-license-key-here
EOF

Replace your-license-key-here with the actual license key provided by Smallest.ai.

Never commit your .env file to version control. Add it to .gitignore if using git.
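
One way to act on this warning is sketched below: restrict the file to its owner and add it to .gitignore only if it is not already listed. The function name protect_env_file is illustrative, and the sketch assumes it runs from the project directory.

```shell
# Sketch: lock down the env file and make sure git ignores it.
protect_env_file() {
  local env_file="${1:-.env}"
  local ignore_file="${2:-.gitignore}"
  local entry
  entry=$(basename "$env_file")
  # Owner read/write only, since the file holds the license key.
  chmod 600 "$env_file" || return 1
  # -x matches whole lines, -F treats the entry literally, -q stays silent.
  if ! grep -qxF -- "$entry" "$ignore_file" 2>/dev/null; then
    printf '%s\n' "$entry" >> "$ignore_file"
  fi
}

# Usage (not run here): protect_env_file
```

Running it more than once is safe because the .gitignore entry is only appended when missing.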

Step 4: Create Docker Compose File

Lightning ASR (Standard)

Best for: Fast inference, real-time applications

Create a docker-compose.yml file:

docker-compose.yml
version: "3.8"

services:
  lightning-asr:
    image: quay.io/smallestinc/lightning-asr:latest
    ports:
      - "2233:2233"
    environment:
      - MODEL_URL=${MODEL_URL}
      - LICENSE_KEY=${LICENSE_KEY}
      - REDIS_URL=redis://redis:6379
      - PORT=2233
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped
    networks:
      - smallest-network

  api-server:
    image: quay.io/smallestinc/self-hosted-api-server:latest
    container_name: api-server
    environment:
      - LICENSE_KEY=${LICENSE_KEY}
      - LIGHTNING_ASR_BASE_URL=http://lightning-asr:2233
      - API_BASE_URL=http://license-proxy:3369
    ports:
      - "7100:7100"
    networks:
      - smallest-network
    restart: unless-stopped
    depends_on:
      - lightning-asr
      - license-proxy

  license-proxy:
    image: quay.io/smallestinc/license-proxy:latest
    container_name: license-proxy
    environment:
      - LICENSE_KEY=${LICENSE_KEY}
    networks:
      - smallest-network
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    networks:
      - smallest-network
    restart: unless-stopped
    command: redis-server --appendonly yes
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

networks:
  smallest-network:
    driver: bridge
    name: smallest-network

Step 5: Additional Configuration for Lightning ASR

Add the model URL to your .env file (required for Lightning ASR):

$echo "MODEL_URL=your-model-url-here" >> .env

The MODEL_URL is provided by Smallest.ai support.
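
Before starting the services, it can help to confirm that both values in .env were actually replaced. The following is a minimal sketch; check_env_file is an illustrative name, and it assumes plain unquoted KEY=value lines like the ones written in Steps 3 and 5.

```shell
# Sketch: fail if LICENSE_KEY or MODEL_URL is missing or still a placeholder.
check_env_file() {
  local env_file="${1:-.env}"
  local status=0 key value
  [ -f "$env_file" ] || { echo "missing $env_file" >&2; return 1; }
  for key in LICENSE_KEY MODEL_URL; do
    # If a key appears more than once, inspect its last occurrence.
    value=$(sed -n "s/^${key}=//p" "$env_file" | tail -n 1)
    case "$value" in
      ''|your-*-here)
        echo "$key is unset or still a placeholder in $env_file" >&2
        status=1
        ;;
    esac
  done
  return "$status"
}

# Usage (not run here): check_env_file .env
```

The placeholder pattern matches the sample values used in this guide (your-license-key-here and your-model-url-here).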

Step 6: Start Services

Launch all services with Docker Compose:

$docker compose up -d
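
As an optional safeguard, the Compose file can be validated before anything is started. The sketch below wraps the launch in a function (start_stack is an illustrative name) that first runs `docker compose config --quiet`, which parses the file and resolves the ${...} variables without printing the result.

```shell
# Sketch: validate docker-compose.yml, then start the services.
start_stack() {
  # Exits non-zero on YAML or interpolation errors; prints nothing on success.
  if ! docker compose config --quiet; then
    echo "docker-compose.yml failed validation; nothing was started" >&2
    return 1
  fi
  docker compose up -d
}

# Usage (not run here): start_stack
```

This catches indentation mistakes in the file before any container is created.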

Step 7: Monitor Startup

Watch the logs to monitor startup progress:

$docker compose logs -f

Look for these success indicators:

1. Redis Ready

redis-1 | Ready to accept connections

2. License Proxy Ready

license-proxy | License validated successfully
license-proxy | Server listening on port 3369

3. Model Service Ready (Lightning ASR)

lightning-asr-1 | Model loaded successfully
lightning-asr-1 | Server ready on port 2233

4. API Server Ready

api-server | Connected to Lightning ASR
api-server | API server listening on port 7100
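
Rather than watching the log stream by eye, the same indicators can be checked with a small filter. This sketch reads log text on standard input and reports which of the success lines listed above have not appeared yet. It assumes the log wording matches this guide exactly, which may differ between image versions; check_startup_markers is an illustrative name.

```shell
# Sketch: succeed only when every expected startup line is present in the logs.
check_startup_markers() {
  local logs missing=0 marker
  logs=$(cat)
  # One marker per service, copied from the success indicators in this guide.
  while IFS= read -r marker; do
    if ! printf '%s\n' "$logs" | grep -qF -- "$marker"; then
      echo "not seen yet: $marker" >&2
      missing=1
    fi
  done <<'EOF'
Ready to accept connections
License validated successfully
Model loaded successfully
API server listening on port 7100
EOF
  return "$missing"
}

# Usage (not run here): docker compose logs --no-color | check_startup_markers
```

The `--no-color` option keeps terminal color codes out of the text being searched.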

Common Startup Issues

GPU Not Found

Error: could not select device driver "nvidia"

Solution:

$sudo systemctl restart docker
$docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi

If this fails, reinstall the NVIDIA Container Toolkit.
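
To narrow down which piece is missing, a quick host-side check can help. The sketch below assumes a standard installation in which the NVIDIA driver provides the nvidia-smi command and the NVIDIA Container Toolkit provides nvidia-ctk. It only checks that the commands exist on the PATH, not that they work; gpu_preflight is an illustrative name.

```shell
# Sketch: report which GPU prerequisites are not visible on the host PATH.
gpu_preflight() {
  local status=0
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: the NVIDIA driver may not be installed" >&2
    status=1
  fi
  if ! command -v nvidia-ctk >/dev/null 2>&1; then
    echo "nvidia-ctk not found: the NVIDIA Container Toolkit may not be installed" >&2
    status=1
  fi
  return "$status"
}

# Usage (not run here): gpu_preflight && echo "GPU prerequisites found"
```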

License Validation Failed

Error: License validation failed

Solution:

  • Verify LICENSE_KEY in .env is correct
  • Check internet connectivity
  • Ensure firewall allows HTTPS to api.smallest.ai

Model Download Failed

Error: Failed to download model

Solution:

  • Verify MODEL_URL in .env is correct
  • Check disk space: df -h
  • Check internet connectivity
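
The disk space check can be turned into a pass/fail test, as in the sketch below. The 20 GB default is a hypothetical threshold, since this guide does not state the model size. The path should be on the filesystem where Docker stores its data, often /var/lib/docker.

```shell
# Sketch: succeed if PATH's filesystem has at least MIN_GB gigabytes free.
has_free_space() {
  local path="${1:-.}" min_gb="${2:-20}" avail_kb
  # -P forces the portable one-line-per-filesystem format, -k reports in KiB.
  avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}') || return 2
  [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Usage (not run here): has_free_space /var/lib/docker 20 || echo "low disk space"
```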

Port Already in Use

Error: port is already allocated

Solution: Check what’s using the port:

$sudo lsof -i :7100

Either stop the conflicting service or change the host-side port (the left-hand number in the ports entry) in docker-compose.yml.
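
The Compose file in Step 4 publishes three host ports (7100, 2233, and 6379), so it can be quicker to check them all at once. The following sketch prints each port that already has a listener. It requires lsof and may need sudo to see sockets owned by other users; busy_ports is an illustrative name.

```shell
# Sketch: print the given ports that already have a TCP listener.
# Returns 0 if at least one port is busy, 1 if all are free.
busy_ports() {
  local port found=1
  for port in "$@"; do
    # -n and -P skip name lookups; -sTCP:LISTEN keeps only listening sockets.
    if lsof -nP -iTCP:"$port" -sTCP:LISTEN >/dev/null 2>&1; then
      echo "$port"
      found=0
    fi
  done
  return "$found"
}

# Usage (not run here): busy_ports 7100 2233 6379
```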

Managing Your Deployment

Stop Services

$docker compose stop

Restart Services

$docker compose restart

View Logs

$docker compose logs -f [service-name]

Examples:

$docker compose logs -f api-server
$docker compose logs -f lightning-asr

Update Images

Pull latest images and restart:

$docker compose pull
$docker compose up -d

Remove Deployment

Stop and remove all containers:

$docker compose down

Remove containers and volumes (including downloaded models):

$docker compose down -v

The -v flag deletes all volume data, including downloaded models. The models will be downloaded again on the next startup.

What’s Next?

  • STT Configuration: Customize your deployment with advanced configuration options.
  • STT Services Overview: Learn about each service component in detail.
  • STT Troubleshooting: Debug common issues and optimize performance.
  • API Reference: Integrate with your applications using the API.