Logs Analysis
Overview
Understanding log messages is crucial for diagnosing issues. This guide helps you interpret logs from each component and identify common error patterns.
Log Levels
All components use the standard log levels: DEBUG, INFO, WARNING, ERROR, and CRITICAL.
Lightning ASR Logs
Successful Startup
Request Processing
Common Errors
GPU Not Found
Cause: GPU not available or drivers not installed
Solution:
- Check that `nvidia-smi` works
- Verify the GPU device plugin (Kubernetes)
- Check the NVIDIA Container Toolkit (Docker)
Out of GPU Memory
Cause: Not enough GPU memory
Solution:
- Reduce concurrent requests
- Use a larger GPU (e.g., A10 instead of T4)
- Scale horizontally (more pods)
Model Download Failed
Cause: Network issues, invalid URL, disk full
Solution:
- Verify MODEL_URL
- Check disk space: `df -h`
- Test the URL: `curl -I $MODEL_URL`
- Use shared storage (EFS)
Audio Processing Error
Cause: Invalid audio file
Solution:
- Verify audio format (WAV, MP3, FLAC supported)
- Check file is not corrupted
- Ensure proper sample rate (16kHz+)
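A quick local check of the file, assuming FFmpeg's `ffprobe` is installed (`input.wav` is a placeholder):

```shell
# Print the codec, sample rate, and channel count of an audio file
ffprobe -v error -show_entries stream=codec_name,sample_rate,channels \
  -of default=noprint_wrappers=1 input.wav
```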
API Server Logs
Successful Startup
Request Handling
Common Errors
Authentication Failed
Cause: Invalid, missing, or expired license key
Solution:
- Verify the `Authorization: Token <key>` header
- Check the license key is correct
- Renew expired license
No ASR Workers
Cause: All Lightning ASR pods busy or down
Solution:
- Check Lightning ASR pods: `kubectl get pods`
- Scale up replicas
- Check HPA configuration
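The checks above might look like this, assuming the pods carry an `app=lightning-asr` label and the deployment is named `lightning-asr` (both are assumptions about your install):

```shell
# Are any ASR pods up and ready?
kubectl get pods -l app=lightning-asr

# Is the autoscaler configured and within its bounds?
kubectl get hpa

# Manually scale up while investigating (replica count is illustrative)
kubectl scale deployment lightning-asr --replicas=4
```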
Request Timeout
Cause: Lightning ASR overloaded or crashed
Solution:
- Check Lightning ASR logs
- Increase timeout
- Scale up pods
License Proxy Logs
Successful Validation
Usage Reporting
Common Errors
License Validation Failed
Cause: Invalid or expired license
Solution:
- Verify LICENSE_KEY is correct
- Check license hasn’t expired
- Contact support@smallest.ai
Connection Failed
Cause: Network connectivity issue
Solution:
- Test: `curl https://console-api.smallest.ai`
- Check the firewall allows HTTPS
- Restore connectivity before grace period expires
Grace Period Expiring
Cause: Extended network outage
Solution:
- Restore network connectivity immediately
- Check firewall rules
- Contact support if persistent
Redis Logs
Normal Operation
Common Errors
Memory Limit Reached
Solution:
- Increase memory limit
- Enable eviction policy
- Clear old keys
Persistence Issues
Solution:
- Increase disk space
- Disable persistence if not needed
- Clean up old snapshots
Log Pattern Analysis
Error Rate Analysis
Count errors in last 1000 lines:
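For example (the pod name is a placeholder):

```shell
# Count ERROR lines in the last 1000 log lines of a pod
kubectl logs <lightning-asr-pod> --tail=1000 | grep -c "ERROR"
```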
Group errors by type:
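One way to group them, assuming error lines contain the literal string `ERROR` followed by the message:

```shell
# Group ERROR lines by message text and sort by frequency
kubectl logs <lightning-asr-pod> --tail=1000 \
  | grep -o 'ERROR.*' | sort | uniq -c | sort -rn | head
```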
Performance Analysis
Extract response times:
Calculate average:
Request Tracking
Follow a specific request ID:
Across all pods:
Log Aggregation
Using stern
Install stern:
Follow logs from all Lightning ASR pods:
Filter by pattern:
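A sketch of the steps above, assuming the pods are named with a `lightning-asr` prefix:

```shell
# Install stern (macOS; on Linux, download a release binary from GitHub)
brew install stern

# Follow logs from all pods whose names match the regex
stern lightning-asr

# Only show lines matching a pattern
stern lightning-asr --include "ERROR"
```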
Using Loki (if installed)
Query logs via LogQL:
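A sketch of LogQL queries, assuming your pipeline labels logs with `app`:

```
{app="lightning-asr"} |= "ERROR"

count_over_time({app="lightning-asr"} |= "ERROR" [5m])
```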
Structured Logging
Parse JSON Logs
If logs are in JSON format:
Filter by Field
Log Retention
Configure Log Rotation
Docker:
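For example, in `/etc/docker/daemon.json` (the size and file-count values are illustrative):

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "5"
  }
}
```

Restart the Docker daemon after changing this file.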
Kubernetes:
Kubernetes automatically rotates logs via kubelet.
Export Logs
Save logs for analysis:
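For example (pod names are placeholders):

```shell
# Save current and previous container logs to files for offline analysis
kubectl logs <lightning-asr-pod> > lightning-asr.log
kubectl logs <lightning-asr-pod> --previous > lightning-asr-previous.log
```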
Debugging Log Issues
No Logs Appearing
Check pod is running:
Check stdout/stderr:
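The two checks above can be run as:

```shell
# Is the pod running and ready?
kubectl get pods

# Is the container writing anything to stdout/stderr?
kubectl logs <pod-name> --tail=50
```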
Logs Truncated
Increase log size limits:
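On Kubernetes, the kubelet's rotation thresholds can be raised in its `KubeletConfiguration` (values below are illustrative):

```yaml
# KubeletConfiguration fragment
containerLogMaxSize: 50Mi
containerLogMaxFiles: 5
```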
Best Practices
Use Structured Logging
Prefer JSON format for easier parsing:
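An illustrative shape for a structured log line (the field names are examples, not a schema the components are guaranteed to emit):

```json
{"timestamp": "2024-01-01T12:00:00Z", "level": "INFO", "component": "api-server", "request_id": "req-abc123", "message": "transcription completed"}
```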
Include Context
Always include relevant context in logs:
- Request ID
- Component name
- Timestamp
- User/session info (if applicable)
Set Appropriate Levels
Use correct log levels:
- DEBUG: Development only
- INFO: Normal operation
- WARNING: Potential issues
- ERROR: Actual problems
- CRITICAL: Service-breaking issues
Aggregate Logs
Use centralized logging:
- ELK Stack (Elasticsearch, Logstash, Kibana)
- Loki + Grafana
- CloudWatch Logs (AWS)
- Cloud Logging (GCP)

