Grafana Dashboards
Overview
Grafana provides powerful visualization of Lightning ASR metrics, autoscaling behavior, and system performance. This guide covers accessing Grafana, importing dashboards, and creating custom visualizations.
Access Grafana
Enable Grafana
Ensure Grafana is enabled in your Helm values:
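A minimal values sketch, assuming the kube-prometheus-stack chart layout used elsewhere in this guide:

```yaml
# values.yaml -- assumes the kube-prometheus-stack chart's Grafana subchart
grafana:
  enabled: true
  adminPassword: prom-operator  # override this in production
```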
Port Forward
Access Grafana locally:
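For example (the service name is assumed from the `smallest-prometheus-stack` release name used elsewhere in this guide):

```bash
kubectl port-forward svc/smallest-prometheus-stack-grafana 3000:80
```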
Open http://localhost:3000 in your browser.
Default Credentials
- Username: `admin`
- Password: `prom-operator` (or the custom password set in `adminPassword`)
Change the default password immediately in production:
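One way is `grafana-cli` inside the Grafana pod (the deployment name here is an assumption):

```bash
kubectl exec -it deploy/smallest-prometheus-stack-grafana -- \
  grafana-cli admin reset-admin-password <new-password>
```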
Expose Externally
For permanent access, expose via LoadBalancer or Ingress:
LoadBalancer
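A values sketch, assuming the kube-prometheus-stack Grafana subchart:

```yaml
grafana:
  service:
    type: LoadBalancer
```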
Ingress
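A values sketch; the hostname and ingress class are placeholders for your environment:

```yaml
grafana:
  ingress:
    enabled: true
    ingressClassName: nginx   # assumption: an nginx ingress controller
    hosts:
      - grafana.example.com
```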
Import ASR Dashboard
The Smallest Self-Host repository includes a pre-built ASR dashboard.
Import from File
In Grafana, open Dashboards → Import, upload the dashboard JSON file from the repository, and select the Prometheus data source.
Import via ConfigMap
Automatically load dashboard on Grafana startup:
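A sketch of a ConfigMap the Grafana dashboard sidecar can pick up (the ConfigMap name and the placeholder JSON are hypothetical; the real dashboard JSON comes from the repository):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: asr-dashboard
  labels:
    grafana_dashboard: "1"   # label the Grafana sidecar watches for
data:
  asr-dashboard.json: |-
    {"title": "Lightning ASR", "panels": []}
```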
Or enable via Helm:
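A values sketch enabling the dashboard sidecar, assuming the kube-prometheus-stack Grafana subchart:

```yaml
grafana:
  sidecar:
    dashboards:
      enabled: true
      label: grafana_dashboard
```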
ASR Dashboard Overview
The pre-built dashboard includes the following panels:
Active Requests
Shows current requests being processed:
- Metric: `asr_active_requests`
- Visualization: Stat panel with thresholds
- Colors:
  - Green: 0-5 requests
  - Yellow: 5-10 requests
  - Orange: 10-20 requests
  - Red: 20+ requests
Request Rate
Requests per second over time:
- Metric: `rate(asr_total_requests[5m])`
- Visualization: Time series graph
- Use: Track traffic patterns
Error Rate
Failed requests percentage:
- Metric: `rate(asr_failed_requests[5m]) / rate(asr_total_requests[5m]) * 100`
- Visualization: Stat panel + time series
- Alert: Warning if > 5%
Response Time
Request duration percentiles:
- Metrics:
  - P50: `histogram_quantile(0.50, asr_request_duration_seconds_bucket)`
  - P95: `histogram_quantile(0.95, asr_request_duration_seconds_bucket)`
  - P99: `histogram_quantile(0.99, asr_request_duration_seconds_bucket)`
- Visualization: Time series graph
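For live dashboards, the bucket counters are usually wrapped in `rate()` and aggregated by `le` so the quantile reflects recent traffic; a sketch:

```promql
histogram_quantile(0.95, sum by (le) (rate(asr_request_duration_seconds_bucket[5m])))
```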
Pod Count
Number of Lightning ASR replicas:
- Metric: `count(asr_active_requests)`
- Visualization: Stat panel
- Use: Monitor autoscaling
GPU Utilization
GPU usage per pod:
- Metric: `asr_gpu_utilization`
- Visualization: Time series graph
- Use: Ensure GPUs are utilized
GPU Memory
GPU memory usage:
- Metric: `asr_gpu_memory_used_bytes / 1024 / 1024 / 1024`
- Visualization: Gauge + time series
- Use: Monitor memory leaks
Create Custom Dashboards
Add New Dashboard
Useful Queries
Average Active Requests
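A sketch using the metric introduced above:

```promql
avg(asr_active_requests)
```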
Total Throughput (requests/hour)
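A sketch using the request counter from the dashboard panels:

```promql
sum(increase(asr_total_requests[1h]))
```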
Pod Resource Usage
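A sketch assuming cAdvisor container metrics and a `lightning-asr` pod-name prefix (both assumptions):

```promql
sum by (pod) (rate(container_cpu_usage_seconds_total{pod=~"lightning-asr.*"}[5m]))
sum by (pod) (container_memory_working_set_bytes{pod=~"lightning-asr.*"})
```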
Autoscaling Events
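A sketch assuming kube-state-metrics is installed; the HPA name is hypothetical:

```promql
changes(kube_horizontalpodautoscaler_status_current_replicas{horizontalpodautoscaler="lightning-asr"}[1h])
```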
GPU Temperature
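A sketch assuming the NVIDIA DCGM exporter is installed:

```promql
DCGM_FI_DEV_GPU_TEMP
```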
Dashboard Variables
Add variables for dynamic filtering:
Namespace Variable
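A query-type variable sketch using Grafana's `label_values` helper:

```promql
label_values(asr_active_requests, namespace)
```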
Pod Variable
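A query-type variable sketch, filtered by the namespace variable:

```promql
label_values(asr_active_requests{namespace="$namespace"}, pod)
```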
Time Range Variable
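One option is an interval-type variable with a list of candidate windows (values here are illustrative), then use it as `[$interval]` inside `rate()` queries:

```promql
1m,5m,15m,1h,6h
```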
Use in queries for dynamic aggregation.
Alerting
Configure Alert Rules
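One approach is a PrometheusRule that mirrors the dashboard's 5% error-rate warning; the resource and alert names below are hypothetical (alerts can also be created directly in the Grafana UI):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: lightning-asr-alerts
spec:
  groups:
    - name: lightning-asr
      rules:
        - alert: ASRHighErrorRate
          expr: rate(asr_failed_requests[5m]) / rate(asr_total_requests[5m]) * 100 > 5
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Lightning ASR error rate above 5%"
```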
Alert Notification Channels
Configure notifications:
Slack
PagerDuty
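Both channel types can be provisioned via Grafana's notifier provisioning file; the names, webhook URL, and integration key below are placeholders:

```yaml
notifiers:
  - name: slack-alerts
    type: slack
    uid: slack-1
    settings:
      url: https://hooks.slack.com/services/XXX/YYY/ZZZ  # hypothetical webhook
  - name: pagerduty-alerts
    type: pagerduty
    uid: pagerduty-1
    settings:
      integrationKey: <your-integration-key>
```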
Email
Grafana → Alerting → Notification channels → Add channel
- Type: Email
- Addresses: ops@example.com
Pre-Built Dashboard Examples
System Overview Dashboard
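Example cluster-level queries, assuming node-exporter metrics are available:

```promql
# Cluster CPU utilization
1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m]))

# Cluster memory utilization
1 - sum(node_memory_MemAvailable_bytes) / sum(node_memory_MemTotal_bytes)
```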
Autoscaling Dashboard
Track HPA behavior:
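A sketch assuming kube-state-metrics; the HPA name is hypothetical:

```promql
kube_horizontalpodautoscaler_status_current_replicas{horizontalpodautoscaler="lightning-asr"}
kube_horizontalpodautoscaler_status_desired_replicas{horizontalpodautoscaler="lightning-asr"}
kube_horizontalpodautoscaler_spec_max_replicas{horizontalpodautoscaler="lightning-asr"}
```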
Cost Dashboard
Monitor resource costs:
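A rough sketch: with one GPU per replica, multiply the replica count by a hypothetical hourly GPU rate (the $2.50 figure is a placeholder, not a real price):

```promql
count(asr_active_requests) * 2.50
```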
Best Practices
Use Dashboard Folders
Organize dashboards by category:
- Smallest Overview: High-level metrics
- Lightning ASR: Detailed ASR metrics
- Infrastructure: Node and cluster metrics
- Autoscaling: HPA and scaling behavior
Set Appropriate Time Ranges
Default time ranges for different views:
- Real-time monitoring: Last 15 minutes
- Troubleshooting: Last 1 hour
- Analysis: Last 24 hours
- Trends: Last 7 days
Use Annotations
Mark important events:
- Deployments
- Scaling events
- Incidents
- Configuration changes
Template Dashboards
Create template dashboards for:
- Different environments (dev, staging, prod)
- Different namespaces
- Different models
Export and Version Control
Save dashboard JSON to git:
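A sketch using Grafana's dashboards API; the dashboard UID, API token, and file path are placeholders:

```bash
curl -s -H "Authorization: Bearer $GRAFANA_API_TOKEN" \
  "http://localhost:3000/api/dashboards/uid/<dashboard-uid>" \
  | jq '.dashboard' > dashboards/lightning-asr.json
git add dashboards/lightning-asr.json
git commit -m "Update Lightning ASR dashboard"
```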
Troubleshooting
Grafana Not Showing Data
Check Prometheus data source:
Grafana → Configuration → Data Sources → Prometheus
- URL: `http://smallest-prometheus-stack-prometheus:9090`
- Access: Server (default)

Test the connection with the "Save & Test" button.
Check Prometheus is running:
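For example (the label selector is an assumption based on the standard Prometheus operator labels):

```bash
kubectl get pods -l app.kubernetes.io/name=prometheus
```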
Queries Returning No Data
Verify metric exists in Prometheus:
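Port-forward the Prometheus service named earlier in this guide:

```bash
kubectl port-forward svc/smallest-prometheus-stack-prometheus 9090:9090
```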
Open http://localhost:9090 and query the metric.
Check time range: Ensure time range includes data.
Dashboard Not Loading
Check Grafana logs:
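For example (the deployment name is assumed from the release name used elsewhere in this guide):

```bash
kubectl logs deploy/smallest-prometheus-stack-grafana --tail=100
```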
Increase memory if needed:
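A values sketch, assuming the kube-prometheus-stack Grafana subchart; the limits shown are illustrative:

```yaml
grafana:
  resources:
    requests:
      memory: 256Mi
    limits:
      memory: 512Mi
```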

