***
title: Docker Troubleshooting
description: Debug common issues and optimize your STT Docker deployment
------------------------------------------------------------------------
## Common Issues
### GPU Not Accessible
**Symptoms:**
* Error: `could not select device driver "nvidia"`
* Error: `no NVIDIA GPU devices found`
* Lightning ASR fails to start
**Diagnosis:**
```bash
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
```
```bash
sudo systemctl restart docker
docker compose up -d
```
```bash
sudo apt-get remove nvidia-container-toolkit
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```
```bash
nvidia-smi
```
If driver version is below 470, update:
```bash
sudo ubuntu-drivers autoinstall
sudo reboot
```
Verify `/etc/docker/daemon.json` contains:
```json
{
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
}
}
```
Restart Docker after changes:
```bash
sudo systemctl restart docker
```
### License Validation Failed
**Symptoms:**
* Error: `License validation failed`
* Error: `Invalid license key`
* Services fail to start
**Diagnosis:**
Check license-proxy logs:
```bash
docker compose logs license-proxy
```
Check `.env` file:
```bash
cat .env | grep LICENSE_KEY
```
Ensure there are no:
* Extra spaces
* Quotes around the key
* Line breaks
Correct format:
```bash
LICENSE_KEY=abc123def456
```
Test connection to license server:
```bash
curl -v https://console-api.smallest.ai
```
If this fails, check:
* Firewall rules
* Proxy settings
* DNS resolution
If the key appears correct and network is accessible, your license may be:
* Expired
* Revoked
* Invalid
Contact **[support@smallest.ai](mailto:support@smallest.ai)** with:
* Your license key
* License-proxy logs
* Error messages
### Model Download Failed
**Symptoms:**
* Lightning ASR stuck at startup
* Error: `Failed to download model`
* Error: `Connection timeout`
**Diagnosis:**
Check Lightning ASR logs:
```bash
docker compose logs lightning-asr
```
Check `.env` file:
```bash
cat .env | grep MODEL_URL
```
Test URL accessibility:
```bash
curl -I "${MODEL_URL}"
```
Models require \~20-30 GB:
```bash
df -h
```
Free up space if needed:
```bash
docker system prune -a
```
Download model manually and use volume mount:
```bash
mkdir -p ~/models
cd ~/models
wget "${MODEL_URL}" -O model.bin
```
Update docker-compose.yml:
```yaml
lightning-asr:
volumes:
- ~/models:/app/models
```
For slow connections, increase download timeout:
```yaml
lightning-asr:
environment:
- DOWNLOAD_TIMEOUT=3600
```
### Port Already in Use
**Symptoms:**
* Error: `port is already allocated`
* Error: `bind: address already in use`
**Diagnosis:**
Find what's using the port:
```bash
sudo lsof -i :7100
sudo netstat -tulpn | grep 7100
```
If another service is using the port:
```bash
sudo systemctl stop [service-name]
```
Or kill the process:
```bash
sudo kill -9 [PID]
```
Modify docker-compose.yml to use different port:
```yaml
api-server:
ports:
- "8080:7100"
```
Access API at [http://localhost:8080](http://localhost:8080) instead
Old containers may still be bound:
```bash
docker compose down
docker container prune -f
docker compose up -d
```
### Out of Memory
**Symptoms:**
* Container killed unexpectedly
* Error: `OOMKilled`
* System becomes unresponsive
**Diagnosis:**
Check container status:
```bash
docker compose ps
docker inspect [container-name] | grep OOMKilled
```
Lightning ASR requires minimum 16 GB RAM
Check current memory:
```bash
free -h
```
Prevent one service from consuming all memory:
```yaml
services:
lightning-asr:
deploy:
resources:
limits:
memory: 14G
reservations:
memory: 12G
```
Add swap space (temporary solution):
```bash
sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
```
Use smaller model or reduce batch size:
```yaml
lightning-asr:
environment:
- BATCH_SIZE=1
- MODEL_PRECISION=fp16
```
### Container Keeps Restarting
**Symptoms:**
* Container status shows `Restarting`
* Logs show crash loop
**Diagnosis:**
View recent logs:
```bash
docker compose logs --tail=100 [service-name]
```
```bash
docker inspect [container-name] --format='{{.State.ExitCode}}'
```
Common exit codes:
* `137`: Out of memory (OOMKilled)
* `139`: Segmentation fault
* `1`: General error
Temporarily disable restart to debug:
```yaml
lightning-asr:
restart: "no"
```
Start manually and watch logs:
```bash
docker compose up lightning-asr
```
Ensure required services are healthy:
```bash
docker compose ps
```
All should show `Up (healthy)` or `Up`
### Slow Performance
**Symptoms:**
* High latency (>500ms)
* Low throughput
* GPU underutilized
**Diagnosis:**
Monitor GPU usage:
```bash
watch -n 1 nvidia-smi
```
Check container resources:
```bash
docker stats
```
Ensure GPU is not throttling:
```bash
nvidia-smi -q -d PERFORMANCE
```
Enable persistence mode:
```bash
sudo nvidia-smi -pm 1
```
```yaml
lightning-asr:
deploy:
resources:
limits:
cpus: '8'
```
For maximum performance (loses isolation):
```yaml
api-server:
network_mode: host
```
Use Redis with persistence disabled for speed:
```yaml
redis:
command: redis-server --save ""
```
Scale Lightning ASR workers:
```bash
docker compose up -d --scale lightning-asr=2
```
## Performance Optimization
### Best Practices
Cache models to avoid re-downloading:
```yaml
volumes:
- model-cache:/app/models
```
Reduces GPU initialization time:
```bash
sudo nvidia-smi -pm 1
```
Allocate appropriate CPU/memory:
```yaml
deploy:
resources:
limits:
cpus: '8'
memory: 14G
```
Use monitoring tools:
```bash
docker stats
nvidia-smi dmon
```
### Benchmark Your Deployment
Test transcription performance:
```bash
time curl -X POST http://localhost:7100/v1/listen \
-H "Authorization: Token ${LICENSE_KEY}" \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com/test-audio-60s.wav"
}'
```
Expected performance:
* **Cold start**: First request after container start (5-10 seconds)
* **Warm requests**: Subsequent requests (50-200ms)
* **Real-time factor**: 0.05-0.15x (60s audio in 3-9 seconds)
## Debugging Tools
### View All Logs
```bash
docker compose logs -f
```
### Follow Specific Service
```bash
docker compose logs -f lightning-asr
```
### Last N Lines
```bash
docker compose logs --tail=100 api-server
```
### Save Logs to File
```bash
docker compose logs > deployment-logs.txt
```
### Execute Commands in Container
```bash
docker compose exec lightning-asr bash
```
### Check Container Configuration
```bash
docker inspect lightning-asr-1
```
### Network Debugging
Test connectivity between containers:
```bash
docker compose exec api-server ping lightning-asr
docker compose exec api-server curl http://lightning-asr:2233/health
```
## Health Checks
### API Server
```bash
curl http://localhost:7100/health
```
Expected: `{"status": "healthy"}`
### Lightning ASR
```bash
curl http://localhost:2233/health
```
Expected: `{"status": "ready", "gpu": "NVIDIA A10"}`
### License Proxy
```bash
docker compose exec license-proxy wget -q -O- http://localhost:6699/health
```
Expected: `{"status": "valid"}`
### Redis
```bash
docker compose exec redis redis-cli ping
```
Expected: `PONG`
## Log Analysis
### Common Log Patterns
```log
redis-1 | Ready to accept connections
license-proxy | License validated successfully
lightning-asr-1 | Model loaded successfully
lightning-asr-1 | GPU: NVIDIA A10 (24GB)
lightning-asr-1 | Server ready on port 2233
api-server | Connected to Lightning ASR
api-server | API server listening on port 7100
```
```log
license-proxy | ERROR: License validation failed
license-proxy | ERROR: Invalid license key
license-proxy | ERROR: Connection to license server failed
```
```log
lightning-asr-1 | ERROR: No CUDA-capable device detected
lightning-asr-1 | ERROR: CUDA out of memory
lightning-asr-1 | ERROR: GPU not accessible
```
```log
api-server | ERROR: Connection refused: lightning-asr:2233
api-server | ERROR: Timeout connecting to license-proxy
```
## Getting Help
### Before Contacting Support
Collect the following information:
```bash
docker version
docker compose version
nvidia-smi
uname -a
```
```bash
docker compose ps > status.txt
docker stats --no-stream > resources.txt
```
```bash
docker compose logs > all-logs.txt
```
Sanitize and include:
* docker-compose.yml
* .env (remove license key)
### Contact Support
Email: **[support@smallest.ai](mailto:support@smallest.ai)**
Include:
* Description of the issue
* Steps to reproduce
* System information
* Logs and configuration
* License key (via secure channel)
## What's Next?
Advanced configuration options
Integrate with your applications