---
title: Docker Troubleshooting
description: Debug common issues and optimize your STT Docker deployment
---

## Common Issues

### GPU Not Accessible

**Symptoms:**

* Error: `could not select device driver "nvidia"`
* Error: `no NVIDIA GPU devices found`
* Lightning ASR fails to start

**Diagnosis:**

```bash
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
```

<AccordionGroup>
  <Accordion title="Solution 1: Restart Docker">
    ```bash
    sudo systemctl restart docker
    docker compose up -d
    ```
  </Accordion>

  <Accordion title="Solution 2: Reinstall NVIDIA Container Toolkit">
    ```bash
    sudo apt-get remove nvidia-container-toolkit
    sudo apt-get update
    sudo apt-get install -y nvidia-container-toolkit

    sudo systemctl restart docker
    ```
  </Accordion>

  <Accordion title="Solution 3: Update NVIDIA Driver">
    ```bash
    nvidia-smi
    ```

    If the driver version is below 470, update it:

    ```bash
    sudo ubuntu-drivers autoinstall
    sudo reboot
    ```
  </Accordion>

  <Accordion title="Solution 4: Check Docker Daemon Configuration">
    Verify `/etc/docker/daemon.json` contains:

    ```json
    {
      "runtimes": {
        "nvidia": {
          "path": "nvidia-container-runtime",
          "runtimeArgs": []
        }
      }
    }
    ```

    Restart Docker after changes:

    ```bash
    sudo systemctl restart docker
    ```
  </Accordion>
</AccordionGroup>
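
The driver check from Solution 3 can be scripted. The sketch below compares a driver version string against the 470 threshold mentioned above. The function name `driver_ok` is hypothetical, and the commented `nvidia-smi --query-gpu` call is one way to read the installed version on a GPU host.

```bash
# driver_ok VERSION: succeed if the major driver version is at least 470.
# Hypothetical helper; the 470 threshold comes from Solution 3 above.
driver_ok() {
  major=${1%%.*}               # "535.104.05" -> "535"
  case $major in
    ''|*[!0-9]*) return 2 ;;   # not a number: cannot decide
  esac
  [ "$major" -ge 470 ]
}

# On a GPU host, feed it the installed driver version:
#   version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   driver_ok "$version" && echo "driver OK" || echo "update the driver"
```

A return status of 2 means the version string could not be parsed, which usually indicates that `nvidia-smi` itself failed.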

### License Validation Failed

**Symptoms:**

* Error: `License validation failed`
* Error: `Invalid license key`
* Services fail to start

**Diagnosis:**

Check license-proxy logs:

```bash
docker compose logs license-proxy
```

<AccordionGroup>
  <Accordion title="Solution 1: Verify License Key">
    Check `.env` file:

    ```bash
    grep LICENSE_KEY .env
    ```

    Ensure there are no:

    * Extra spaces
    * Quotes around the key
    * Line breaks

    Correct format:

    ```bash
    LICENSE_KEY=abc123def456
    ```
  </Accordion>

  <Accordion title="Solution 2: Check Network Connectivity">
    Test connection to license server:

    ```bash
    curl -v https://api.smallest.ai
    ```

    If this fails, check:

    * Firewall rules
    * Proxy settings
    * DNS resolution
  </Accordion>

  <Accordion title="Solution 3: Contact Support">
    If the key appears correct and network is accessible, your license may be:

    * Expired
    * Revoked
    * Invalid

    Contact **[support@smallest.ai](mailto:support@smallest.ai)** with:

    * Your license key (via secure channel)
    * License-proxy logs
    * Error messages
  </Accordion>
</AccordionGroup>
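
The formatting rules from Solution 1 can also be checked mechanically. This is a minimal sketch, assuming the key lives in a `LICENSE_KEY=` line as shown above. The function name `check_license_key` is hypothetical, and the check covers formatting only, not whether the key is valid.

```bash
# check_license_key FILE: report formatting problems with LICENSE_KEY in an env file.
# Hypothetical helper; it checks only the formatting rules listed in Solution 1.
check_license_key() {
  line=$(grep -E '^[[:space:]]*LICENSE_KEY[[:space:]]*=' "$1" | head -n 1)
  if [ -z "$line" ]; then
    echo "LICENSE_KEY is not set"; return 1
  fi
  case $line in
    LICENSE_KEY=*) value=${line#LICENSE_KEY=} ;;
    *) echo "extra spaces around the variable name or '='"; return 1 ;;
  esac
  case $value in
    '')            echo "key is empty"; return 1 ;;
    *[[:space:]]*) echo "key contains whitespace"; return 1 ;;   # includes a stray carriage return
    *\"*|*\'*)     echo "key contains quotes"; return 1 ;;
  esac
  echo "format OK"
}

# Usage:
#   check_license_key .env
```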

### Model Download Failed

**Symptoms:**

* Lightning ASR stuck at startup
* Error: `Failed to download model`
* Error: `Connection timeout`

**Diagnosis:**

Check Lightning ASR logs:

```bash
docker compose logs lightning-asr
```

<AccordionGroup>
  <Accordion title="Solution 1: Verify Model URL">
    Check `.env` file:

    ```bash
    grep MODEL_URL .env
    ```

    Test URL accessibility (load `MODEL_URL` from `.env` into your shell first if it is not already exported):

    ```bash
    curl -I "${MODEL_URL}"
    ```
  </Accordion>

  <Accordion title="Solution 2: Check Disk Space">
    Models require \~20-30 GB of free disk space:

    ```bash
    df -h
    ```

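    Free up space if needed. Note that `docker system prune -a` removes all stopped containers, unused networks, build cache, and every image not used by a container: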

    ```bash
    docker system prune -a
    ```
  </Accordion>

  <Accordion title="Solution 3: Manual Download">
    Download the model manually and mount it into the container:

    ```bash
    mkdir -p ~/models
    cd ~/models
    wget "${MODEL_URL}" -O model.bin
    ```

    Update `docker-compose.yml`:

    ```yaml
    lightning-asr:
      volumes:
        - ~/models:/app/models
    ```
  </Accordion>

  <Accordion title="Solution 4: Increase Timeout">
    For slow connections, increase download timeout:

    ```yaml
    lightning-asr:
      environment:
        - DOWNLOAD_TIMEOUT=3600
    ```
  </Accordion>
</AccordionGroup>
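
The disk space check from Solution 2 can be turned into a pass/fail test. The sketch below assumes the models land on the filesystem that holds `/`; if Docker's data directory (by default `/var/lib/docker`) lives on a separate disk, pass that path instead. The function name `free_gb` and the 30 GB threshold (the upper end of the estimate above) are illustrative choices.

```bash
# free_gb PATH: print the free space, in whole GB, on the filesystem holding PATH.
free_gb() {
  # -P forces the portable single-line format; -k reports 1024-byte blocks.
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

required_gb=30   # upper end of the ~20-30 GB estimate above
available_gb=$(free_gb /)
if [ "$available_gb" -lt "$required_gb" ]; then
  echo "Only ${available_gb} GB free; the model download may fail" >&2
else
  echo "${available_gb} GB free"
fi
```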

### Port Already in Use

**Symptoms:**

* Error: `port is already allocated`
* Error: `bind: address already in use`

**Diagnosis:**

Find what's using the port:

```bash
sudo lsof -i :7100
sudo netstat -tulpn | grep 7100
```

<AccordionGroup>
  <Accordion title="Solution 1: Stop Conflicting Service">
    If another service is using the port:

    ```bash
    sudo systemctl stop [service-name]
    ```

    Or kill the process (try a plain `kill` first; `-9` gives the process no chance to clean up):

    ```bash
    sudo kill -9 [PID]
    ```
  </Accordion>

  <Accordion title="Solution 2: Change Port">
    Modify `docker-compose.yml` to map a different host port:

    ```yaml
    api-server:
      ports:
        - "8080:7100"
    ```

    The API is then available at [http://localhost:8080](http://localhost:8080).
  </Accordion>

  <Accordion title="Solution 3: Remove Old Containers">
    Old containers may still hold the port:

    ```bash
    docker compose down
    docker container prune -f
    docker compose up -d
    ```
  </Accordion>
</AccordionGroup>
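
To check a port before starting the stack, the listening-socket table can be filtered in a script. This sketch uses `ss -ltn` (from iproute2) rather than the `lsof` and `netstat` commands shown above, and the helper name `port_listening` is hypothetical. It reads the `ss` output on stdin so the parsing can be tried without a live system.

```bash
# port_listening PORT: read `ss -ltn` output on stdin and succeed if
# some socket is listening on PORT. Hypothetical helper name.
port_listening() {
  awk -v port="$1" '
    # Column 4 is the local address, e.g. 0.0.0.0:7100 or [::]:7100.
    $1 == "LISTEN" { n = split($4, parts, ":"); if (parts[n] == port) found = 1 }
    END { exit !found }
  '
}

# Typical use on the Docker host:
#   ss -ltn | port_listening 7100 && echo "port 7100 is taken"
```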

### Out of Memory

**Symptoms:**

* Container killed unexpectedly
* Error: `OOMKilled`
* System becomes unresponsive

**Diagnosis:**

Check container status:

```bash
docker compose ps
docker inspect [container-name] | grep OOMKilled
```

<AccordionGroup>
  <Accordion title="Solution 1: Increase System Memory">
    Lightning ASR requires a minimum of 16 GB of RAM.

    Check current memory:

    ```bash
    free -h
    ```
  </Accordion>

  <Accordion title="Solution 2: Add Memory Limits">
    Prevent one service from consuming all memory:

    ```yaml
    services:
      lightning-asr:
        deploy:
          resources:
            limits:
              memory: 14G
            reservations:
              memory: 12G
    ```
  </Accordion>

  <Accordion title="Solution 3: Enable Swap">
    Add swap space as a temporary measure (the swap file is not re-enabled after a reboot unless you add it to `/etc/fstab`):

    ```bash
    sudo fallocate -l 16G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile
    ```
  </Accordion>

  <Accordion title="Solution 4: Optimize Model Loading">
    Reduce the batch size or load the model at lower precision:

    ```yaml
    lightning-asr:
      environment:
        - BATCH_SIZE=1
        - MODEL_PRECISION=fp16
    ```
  </Accordion>
</AccordionGroup>
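
The 16 GB requirement from Solution 1 can be checked from `/proc/meminfo` on Linux. In this minimal sketch the helper name `mem_total_gb` is hypothetical. Because the kernel reserves some memory, a machine sold as 16 GB typically reports slightly less, so read the result with that in mind.

```bash
# mem_total_gb: read /proc/meminfo-style text on stdin and print total RAM in GB.
mem_total_gb() {
  # MemTotal is reported in kB; convert to GB with one decimal place.
  awk '/^MemTotal:/ { printf "%.1f\n", $2 / 1024 / 1024 }'
}

# On the Docker host:
#   mem_total_gb < /proc/meminfo
#
# To read the OOM flag of one container directly:
#   docker inspect --format '{{.State.OOMKilled}}' [container-name]
```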

### Container Keeps Restarting

**Symptoms:**

* Container status shows `Restarting`
* Logs show crash loop

**Diagnosis:**

View recent logs:

```bash
docker compose logs --tail=100 [service-name]
```

<AccordionGroup>
  <Accordion title="Solution 1: Check Exit Code">
    ```bash
    docker inspect [container-name] --format='{{.State.ExitCode}}'
    ```

    Common exit codes:

    * `137`: Killed by SIGKILL (128 + 9), most often the kernel OOM killer
    * `139`: Segmentation fault (SIGSEGV, 128 + 11)
    * `1`: General application error
  </Accordion>

  <Accordion title="Solution 2: Disable Auto-Restart">
    Temporarily disable restart to debug:

    ```yaml
    lightning-asr:
      restart: "no"
    ```

    Start manually and watch logs:

    ```bash
    docker compose up lightning-asr
    ```
  </Accordion>

  <Accordion title="Solution 3: Check Dependencies">
    Ensure required services are healthy:

    ```bash
    docker compose ps
    ```

    All services should show `Up (healthy)` or `Up`.
  </Accordion>
</AccordionGroup>
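
The exit codes listed in Solution 1 follow the usual convention that a value above 128 means the process was terminated by signal number `code - 128`. The sketch below applies that rule; the function name `explain_exit_code` is hypothetical.

```bash
# explain_exit_code CODE: print a short interpretation of a container exit code.
explain_exit_code() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "clean exit"
  elif [ "$code" -gt 128 ]; then
    # By convention, 128 + N means the process died from signal N.
    sig=$((code - 128))
    case $sig in
      9)  echo "killed by signal 9 (SIGKILL), often the OOM killer" ;;
      11) echo "killed by signal 11 (SIGSEGV), segmentation fault" ;;
      15) echo "killed by signal 15 (SIGTERM), asked to stop" ;;
      *)  echo "killed by signal $sig" ;;
    esac
  else
    echo "application error (exit code $code)"
  fi
}

# Usage:
#   explain_exit_code "$(docker inspect [container-name] --format='{{.State.ExitCode}}')"
```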

### Slow Performance

**Symptoms:**

* High latency (>500ms)
* Low throughput
* GPU underutilized

**Diagnosis:**

Monitor GPU usage:

```bash
watch -n 1 nvidia-smi
```

Check container resources:

```bash
docker stats
```

<AccordionGroup>
  <Accordion title="Solution 1: Optimize GPU Usage">
    Ensure GPU is not throttling:

    ```bash
    nvidia-smi -q -d PERFORMANCE
    ```

    Enable persistence mode:

    ```bash
    sudo nvidia-smi -pm 1
    ```
  </Accordion>

  <Accordion title="Solution 2: Increase CPU Allocation">
    ```yaml
    lightning-asr:
      deploy:
        resources:
          limits:
            cpus: '8'
    ```
  </Accordion>

  <Accordion title="Solution 3: Use Host Network">
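    For maximum performance, at the cost of network isolation. Host networking exposes the container's ports directly on the host, so remove the service's `ports` mapping: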

    ```yaml
    api-server:
      network_mode: host
    ```
  </Accordion>

  <Accordion title="Solution 4: Optimize Redis">
    Disable Redis persistence for speed (data held in Redis is lost on restart):

    ```yaml
    redis:
      command: redis-server --save ""
    ```
  </Accordion>

  <Accordion title="Solution 5: Add More Workers">
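    Scale Lightning ASR workers (each replica needs GPU capacity, and a fixed host port mapping on the service will conflict between replicas):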

    ```bash
    docker compose up -d --scale lightning-asr=2
    ```
  </Accordion>
</AccordionGroup>

## Performance Optimization

### Best Practices

<Steps>
  <Step title="Use Persistent Volumes">
    Cache models to avoid re-downloading:

    ```yaml
    volumes:
      - model-cache:/app/models
    ```
  </Step>

  <Step title="Enable GPU Persistence Mode">
    Reduces GPU initialization time:

    ```bash
    sudo nvidia-smi -pm 1
    ```
  </Step>

  <Step title="Optimize Container Resources">
    Allocate appropriate CPU/memory:

    ```yaml
    deploy:
      resources:
        limits:
          cpus: '8'
          memory: 14G
    ```
  </Step>

  <Step title="Monitor and Tune">
    Use monitoring tools:

    ```bash
    docker stats
    nvidia-smi dmon
    ```
  </Step>
</Steps>

### Benchmark Your Deployment

Test transcription performance:

```bash
time curl -X POST http://localhost:7100/v1/listen \
  -H "Authorization: Token ${LICENSE_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/test-audio-60s.wav"
  }'
```

Expected performance:

* **Cold start**: First request after container start (5-10 seconds)
* **Warm requests**: Subsequent requests (50-200ms)
* **Real-time factor**: 0.05-0.15x (60s audio in 3-9 seconds)
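
The real-time factor is the processing time divided by the audio duration, so 60 seconds of audio transcribed in 3 to 9 seconds gives 0.05 to 0.15. A small helper makes the arithmetic repeatable across benchmark runs; the name `rtf` is hypothetical, and the processing time would come from the `time` output above.

```bash
# rtf PROCESSING_SECONDS AUDIO_SECONDS: print the real-time factor.
rtf() {
  awk -v p="$1" -v a="$2" 'BEGIN { printf "%.3f\n", p / a }'
}

rtf 4.5 60   # 4.5 s to transcribe 60 s of audio; prints 0.075
```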

## Debugging Tools

### View All Logs

```bash
docker compose logs -f
```

### Follow Specific Service

```bash
docker compose logs -f lightning-asr
```

### Last N Lines

```bash
docker compose logs --tail=100 api-server
```

### Save Logs to File

```bash
docker compose logs > deployment-logs.txt
```

### Execute Commands in Container

```bash
docker compose exec lightning-asr bash
```

### Check Container Configuration

```bash
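# Compose usually prefixes container names with the project name, so resolve the ID first
docker inspect $(docker compose ps -q lightning-asr)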
```

### Network Debugging

Test connectivity between containers:

```bash
docker compose exec api-server ping lightning-asr
docker compose exec api-server curl http://lightning-asr:2233/health
```

## Health Checks

### API Server

```bash
curl http://localhost:7100/health
```

Expected: `{"status": "healthy"}`

### Lightning ASR

```bash
curl http://localhost:2233/health
```

Expected (the `gpu` value reflects your hardware): `{"status": "ready", "gpu": "NVIDIA A10"}`

### License Proxy

```bash
docker compose exec license-proxy wget -q -O- http://localhost:6699/health
```

Expected: `{"status": "valid"}`

### Redis

```bash
docker compose exec redis redis-cli ping
```

Expected: `PONG`
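
Right after `docker compose up -d`, the services may need some time before these checks pass, in particular Lightning ASR while it loads the model. The sketch below wraps any check in a retry loop. The helper name `retry` and the attempt counts are illustrative; the endpoints in the comments are the ones listed in this section.

```bash
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds,
# sleeping DELAY seconds between tries; fail after ATTEMPTS tries.
retry() {
  attempts=$1; delay=$2; shift 2
  i=1
  while ! "$@" >/dev/null 2>&1; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Endpoints and ports below are the ones listed in this section:
#   retry 30 2 curl -fsS http://localhost:7100/health && echo "api-server healthy"
#   retry 30 2 curl -fsS http://localhost:2233/health && echo "lightning-asr ready"
#   retry 30 2 docker compose exec -T redis redis-cli ping && echo "redis up"
```

The `-f` flag makes `curl` return a non-zero status on HTTP errors, which is what lets the loop distinguish a failing endpoint from a healthy one.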

## Log Analysis

### Common Log Patterns

<Tabs>
  <Tab title="Successful Startup">
    ```log
    redis-1              | Ready to accept connections
    license-proxy        | License validated successfully
    lightning-asr-1      | Model loaded successfully
    lightning-asr-1      | GPU: NVIDIA A10 (24GB)
    lightning-asr-1      | Server ready on port 2233
    api-server           | Connected to Lightning ASR
    api-server           | API server listening on port 7100
    ```
  </Tab>

  <Tab title="License Issues">
    ```log
    license-proxy        | ERROR: License validation failed
    license-proxy        | ERROR: Invalid license key
    license-proxy        | ERROR: Connection to license server failed
    ```
  </Tab>

  <Tab title="GPU Issues">
    ```log
    lightning-asr-1      | ERROR: No CUDA-capable device detected
    lightning-asr-1      | ERROR: CUDA out of memory
    lightning-asr-1      | ERROR: GPU not accessible
    ```
  </Tab>

  <Tab title="Network Issues">
    ```log
    api-server           | ERROR: Connection refused: lightning-asr:2233
    api-server           | ERROR: Timeout connecting to license-proxy
    ```
  </Tab>
</Tabs>
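
To see at a glance which service is producing errors, the log output can be summarized per service. This is a minimal sketch that assumes the `service | message` layout and the `ERROR` marker shown in the tabs above; the function name `errors_by_service` is hypothetical.

```bash
# errors_by_service: read `docker compose logs` output on stdin and print
# the number of ERROR lines per service, one "service count" pair per line.
errors_by_service() {
  awk -F'|' '
    /ERROR/ {
      service = $1
      gsub(/[ \t]+/, "", service)   # strip the column padding
      count[service]++
    }
    END { for (s in count) print s, count[s] }
  ' | sort
}

# Usage (--no-color keeps escape codes out of the service names):
#   docker compose logs --no-color | errors_by_service
```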

## Getting Help

### Before Contacting Support

Collect the following information:

<Steps>
  <Step title="System Information">
    ```bash
    docker version
    docker compose version
    nvidia-smi
    uname -a
    ```
  </Step>

  <Step title="Container Status">
    ```bash
    docker compose ps > status.txt
    docker stats --no-stream > resources.txt
    ```
  </Step>

  <Step title="Logs">
    ```bash
    docker compose logs > all-logs.txt
    ```
  </Step>

  <Step title="Configuration">
    Sanitize and include:

    * docker-compose.yml
    * .env (remove license key)
  </Step>
</Steps>
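
Removing the license key from `.env` by hand is easy to forget, so the sanitizing step can be scripted. The sketch below masks the value of the `LICENSE_KEY` line and leaves everything else untouched. The function name `redact_env` and the output file names are hypothetical, and logs may still contain the key, so review them before sending.

```bash
# redact_env: copy an env file from stdin to stdout with the license key masked.
redact_env() {
  sed 's/^\([[:space:]]*LICENSE_KEY[[:space:]]*=\).*/\1REDACTED/'
}

# Usage, followed by bundling the files collected in the steps above:
#   redact_env < .env > env-sanitized.txt
#   tar czf support-bundle.tar.gz status.txt resources.txt all-logs.txt \
#     env-sanitized.txt docker-compose.yml
```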

### Contact Support

Email: **[support@smallest.ai](mailto:support@smallest.ai)**

Include:

* Description of the issue
* Steps to reproduce
* System information
* Logs and configuration
* License key (via secure channel)

## What's Next?

<CardGroup cols={2}>
  <Card title="STT Configuration" href="/waves/self-host/docker-setup/stt-deployment/configuration">
    Advanced configuration options
  </Card>

  <Card title="API Reference" href="/waves/self-host/api-reference/authentication">
    Integrate with your applications
  </Card>
</CardGroup>