This guide walks you through deploying Smallest Self-Host using Docker Compose. You’ll have a fully functional speech-to-text service running in under 15 minutes.
Ensure you’ve completed all prerequisites before starting this guide.
Create a directory for your deployment:
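A minimal sketch of this step. The directory name smallest-self-host is only an illustrative choice; any name works.

```shell
# Create the deployment directory (hypothetical name) and switch into it
mkdir -p smallest-self-host
cd smallest-self-host
```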
Authenticate with the Smallest container registry using credentials provided by support:
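The login step might look like the sketch below. The registry host is not stated in this guide, so the hypothetical helper registry_login takes it as an argument; use the host that support gave you.

```shell
# Hypothetical helper: the registry host is an argument because this guide does not name it
registry_login() {
  if [ -z "$1" ]; then
    echo "usage: registry_login <registry-host>" >&2
    return 2
  fi
  # docker login prompts interactively for the username and password
  docker login "$1"
}
```

Call it as registry_login followed by the registry host from support.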
Enter your username and password when prompted.
Save your credentials securely. You’ll need them if you restart or redeploy the containers.
Create a .env file with your license key:
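One way to create the file is sketched below. The variable name LICENSE_KEY is an assumption; use the name your docker-compose.yml actually reads.

```shell
# Write the license key to .env (LICENSE_KEY is an assumed variable name)
cat > .env <<'EOF'
LICENSE_KEY=your-license-key-here
EOF

# Restrict the file to the current user, since it holds a secret
chmod 600 .env
```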
Replace your-license-key-here with the actual license key provided by Smallest.ai.
Never commit your .env file to version control. If you use Git, add it to .gitignore.
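For a Git repository, the ignore rule can be added like this. The sketch assumes .gitignore lives in the current directory and appends the entry only if it is missing.

```shell
# Append ".env" to .gitignore unless an identical line is already present
grep -qxF '.env' .gitignore 2>/dev/null || echo '.env' >> .gitignore
```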
Best for: Fast inference, real-time applications
Create a docker-compose.yml file:
Add the model URL to your .env file (required for Lightning ASR):
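A sketch of appending the variable to the existing file. The value your-model-url-here is a placeholder, not a real URL.

```shell
# Append the model URL to .env without overwriting the license key entry
echo 'MODEL_URL=your-model-url-here' >> .env
```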
The MODEL_URL is provided by Smallest.ai support.
Launch all services with Docker Compose:
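A typical invocation is sketched below. It assumes Docker Compose v2 (the "docker compose" plugin) and that docker-compose.yml and .env are in the current directory.

```shell
# Start every service defined in docker-compose.yml in the background (-d = detached)
if [ -f docker-compose.yml ]; then
  docker compose up -d
else
  echo "docker-compose.yml not found; run this from the deployment directory." >&2
fi
```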
Watch the logs to monitor startup progress:
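Following the logs could look like this sketch, again assuming Docker Compose v2 and the deployment directory as the working directory. The --tail value is an arbitrary choice.

```shell
# Show the last 100 lines per service, then keep streaming new output (-f = follow).
# Press Ctrl+C to stop watching; the services keep running.
if [ -f docker-compose.yml ]; then
  docker compose logs -f --tail 100
else
  echo "docker-compose.yml not found; run this from the deployment directory." >&2
fi
```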
Look for these success indicators:
Error: could not select device driver "nvidia"
Solution:
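A common way to confirm that Docker can reach the GPU is to run nvidia-smi in a throwaway container. This sketch follows the sample workload in NVIDIA's Container Toolkit documentation and is not necessarily the exact command this guide originally showed.

```shell
# Run nvidia-smi in a temporary container with all GPUs attached; --rm deletes the container afterwards
docker run --rm --gpus all ubuntu nvidia-smi || echo "Docker could not access the GPU." >&2
```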
If this fails, reinstall NVIDIA Container Toolkit.
Error: License validation failed
Solution:
Verify that the license key in .env is correct.
Error: Failed to download model
Solution:
Verify that MODEL_URL in .env is correct.
Check available disk space with df -h.
Error: port is already allocated
Solution: Check what’s using the port:
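One way to find the process is with lsof, as sketched below. The port number 8080 is hypothetical; use the port named in the error message.

```shell
PORT=8080   # hypothetical; replace with the port from the error message

# List processes bound to the port. Run with sudo to see processes owned by other users.
lsof -i :"$PORT" || echo "No process found on port $PORT (or lsof is not installed)." >&2
```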
Either stop the conflicting service or change the port in docker-compose.yml.
Examples:
Pull latest images and restart:
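A sketch of the update, assuming Docker Compose v2 and the deployment directory as the working directory:

```shell
# Fetch newer images, then recreate any containers whose image changed
docker compose pull && docker compose up -d || echo "Update did not complete." >&2
```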
Stop and remove all containers:
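With Docker Compose v2 this is typically a single command, run from the deployment directory:

```shell
# Stop and remove the containers and the project network; volumes are kept
docker compose down || echo "Shutdown did not complete." >&2
```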
Remove containers and volumes (including downloaded models):
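The same command with the -v flag, sketched under the same Docker Compose v2 assumption:

```shell
# -v also removes the named volumes declared in docker-compose.yml and any anonymous volumes
docker compose down -v || echo "Cleanup did not complete." >&2
```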
The -v flag deletes all data, including downloaded models. They will need to be re-downloaded on the next startup.