---
title: Architecture Overview
description: Understanding the components and architecture of Smallest Self-Host deployments
---


## System Architecture

```mermaid
graph TB
    Client[Client Applications] -->|HTTP/WebSocket| API[API Server]
    API -->|STT Requests| ASR[Lightning ASR]
    API -->|TTS Requests| TTS[Lightning TTS]
    API -->|Validate License| LP[License Proxy]
    LP -->|Report Usage| LS[Smallest License Server]

    subgraph YourInfrastructure[Your Infrastructure]
        API
        ASR
        TTS
        LP
    end

    subgraph SmallestCloud[Smallest Cloud]
        LS
    end

    style ASR fill:#0D9373
    style TTS fill:#0D9373
    style API fill:#07C983
    style LP fill:#1E90FF
    style LS fill:#FF6B6B
```
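The diagram can also be read as a small graph. The sketch below is illustrative only: it copies the edges and the two zones from the diagram into plain Python data (the names `EDGES`, `ZONES`, and `outbound_to_cloud` are made up for this example) and lists which connections leave your infrastructure.

```python
# Edges copied from the diagram above: (source, target, label).
EDGES = [
    ("Client", "API Server", "HTTP/WebSocket"),
    ("API Server", "Lightning ASR", "STT Requests"),
    ("API Server", "Lightning TTS", "TTS Requests"),
    ("API Server", "License Proxy", "Validate License"),
    ("License Proxy", "Smallest License Server", "Report Usage"),
]

# Zone membership, taken from the two subgraphs in the diagram.
# The client is left out because the diagram places it in neither zone.
ZONES = {
    "API Server": "Your Infrastructure",
    "Lightning ASR": "Your Infrastructure",
    "Lightning TTS": "Your Infrastructure",
    "License Proxy": "Your Infrastructure",
    "Smallest License Server": "Smallest Cloud",
}


def outbound_to_cloud(edges, zones):
    """Return the edges that go from your infrastructure to Smallest Cloud."""
    return [
        (src, dst, label)
        for src, dst, label in edges
        if zones.get(src) == "Your Infrastructure"
        and zones.get(dst) == "Smallest Cloud"
    ]


if __name__ == "__main__":
    for src, dst, label in outbound_to_cloud(EDGES, ZONES):
        print(f"{src} -> {dst} ({label})")
```

In this model, the only connection that crosses into Smallest Cloud is the one from the License Proxy to the Smallest License Server.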

## Components

<AccordionGroup>
  <Accordion title="API Server">
    Routes requests to Lightning ASR/TTS workers, manages WebSocket connections, and exposes a unified REST API.

    **Resources:** 0.5-2 CPU cores, 512 MB - 2 GB RAM, no GPU
  </Accordion>

  <Accordion title="Lightning ASR">
    GPU-accelerated speech-to-text engine with a real-time factor of 0.05-0.15x (processing takes roughly 5-15% of the audio's duration). Supports real-time and batch transcription.

    **Resources:** 4-8 CPU cores, 12-16 GB RAM, 1x NVIDIA GPU (16+ GB VRAM)
  </Accordion>

  <Accordion title="Lightning TTS">
    GPU-accelerated text-to-speech engine for natural voice synthesis. Supports streaming and batch generation.

    **Resources:** 4-8 CPU cores, 12-16 GB RAM, 1x NVIDIA GPU (16+ GB VRAM)
  </Accordion>

  <Accordion title="License Proxy">
    Validates license keys and reports usage metadata. Supports offline grace periods.

    **Resources:** 0.25-1 CPU core, 256-512 MB RAM, no GPU
  </Accordion>

  <Accordion title="Redis">
    Handles request queuing, session state, and caching. Runs embedded or as an external service such as ElastiCache.

    **Resources:** 0.5-1 CPU core, 512 MB - 2 GB RAM, no GPU
  </Accordion>
</AccordionGroup>
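For rough capacity planning, the figures above can be turned into simple arithmetic. The sketch below is a back-of-the-envelope estimate, not an official sizing tool. It assumes one replica of every component, a self-hosted Redis, 1 GB = 1024 MB, and the usual definition of real-time factor (processing time divided by audio duration). The names `RESOURCES`, `totals`, and `transcription_seconds` are invented for this example.

```python
# Per-component ranges copied from the component list above.
# Each entry: (min CPU cores, max CPU cores, min RAM GB, max RAM GB, GPUs).
# MB values are converted assuming 1 GB = 1024 MB (512 MB -> 0.5 GB).
RESOURCES = {
    "API Server": (0.5, 2.0, 0.5, 2.0, 0),
    "Lightning ASR": (4.0, 8.0, 12.0, 16.0, 1),
    "Lightning TTS": (4.0, 8.0, 12.0, 16.0, 1),
    "License Proxy": (0.25, 1.0, 0.25, 0.5, 0),
    "Redis": (0.5, 1.0, 0.5, 2.0, 0),
}


def totals(resources):
    """Sum each column across components, assuming one replica of each."""
    columns = zip(*resources.values())
    return tuple(sum(column) for column in columns)


def transcription_seconds(audio_seconds, rtf):
    """Estimate processing time, assuming RTF = processing time / audio duration."""
    return audio_seconds * rtf


if __name__ == "__main__":
    min_cpu, max_cpu, min_ram, max_ram, gpus = totals(RESOURCES)
    print(f"CPU cores: {min_cpu}-{max_cpu}")
    print(f"RAM (GB): {min_ram}-{max_ram}")
    print(f"GPUs: {gpus}")

    # A 60-second clip at the quoted RTF range of 0.05-0.15.
    fast = transcription_seconds(60, 0.05)
    slow = transcription_seconds(60, 0.15)
    print(f"60 s of audio: about {fast:.1f}-{slow:.1f} s of processing")
```

Under these assumptions, a single-replica deployment needs about 9.25-20 CPU cores, 25.25-36.5 GB of RAM, and 2 GPUs, and a 60-second clip takes roughly 3-9 seconds to transcribe.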

## Data Flow

1. **Client Request** — Your application sends audio (STT) or text (TTS) via HTTP or WebSocket
2. **API Server** — Routes the request to the appropriate worker and validates the license
3. **Worker Processing** — Lightning ASR or TTS processes the request on GPU
4. **Response** — Results stream back through the API server to your application

All audio and text processing happens within your infrastructure. Only license validation and usage metadata is sent to Smallest Cloud.
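The steps above can be modeled as a toy, in-process pipeline. This is a conceptual sketch rather than the API server's actual implementation. Every name in it (`Request`, `validate_license`, `handle`, the `demo-key` value) is hypothetical, the workers are stubs that return strings, and checking the license before dispatching is an assumption about ordering.

```python
from dataclasses import dataclass


@dataclass
class Request:
    kind: str     # "stt" for audio input, "tts" for text input
    payload: str  # stand-in for audio bytes or text


def validate_license(license_key: str) -> bool:
    """Hypothetical stand-in for the check made through the License Proxy."""
    return license_key == "demo-key"


def lightning_asr(payload: str) -> str:
    """Stub worker: a real deployment would run speech-to-text on a GPU."""
    return f"transcript of {payload}"


def lightning_tts(payload: str) -> str:
    """Stub worker: a real deployment would run text-to-speech on a GPU."""
    return f"audio for {payload}"


# Step 2: the API server picks a worker based on the request type.
WORKERS = {"stt": lightning_asr, "tts": lightning_tts}


def handle(request: Request, license_key: str) -> str:
    """Validate the license, dispatch to a worker, and return its result."""
    if not validate_license(license_key):
        raise PermissionError("license validation failed")
    worker = WORKERS.get(request.kind)
    if worker is None:
        raise ValueError(f"unsupported request kind: {request.kind}")
    # Steps 3 and 4: the worker processes the request and the result goes back.
    return worker(request.payload)


if __name__ == "__main__":
    print(handle(Request("stt", "clip.wav"), "demo-key"))
    print(handle(Request("tts", "Hello"), "demo-key"))
```

A real client would reach the API server over HTTP or WebSocket instead of calling `handle` directly. The direct call here only keeps the example self-contained.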

## What's Next?

<CardGroup cols={2}>
  <Card title="Prerequisites" href="/waves/self-host/getting-started/prerequisites">
    License key, credentials, and infrastructure requirements
  </Card>

  <Card title="Why Self-Host?" href="/waves/self-host/getting-started/why-self-host">
    Benefits of self-hosting for your use case
  </Card>
</CardGroup>