This page focuses on collecting and validating Lightning ASR metrics with Prometheus and exposing them through the Prometheus Adapter.
Autoscaling documentation is currently under active development. Use this page as a metrics reference. If you need autoscaling now, configure your own HPA/KEDA rules using these metrics.
Collects and stores metrics from Lightning ASR pods.
Included in chart:
CRD that tells Prometheus which services to scrape.
Enabled for Lightning ASR:
Exposes Prometheus metrics through the Kubernetes custom metrics API.
Configuration:
Lightning ASR exposes the following metrics:
Forward Prometheus port:
Open http://localhost:9090 and verify:
asr_active_requests or asr_batch_queue_depth - should return data
Expected output:
Describe the ServiceMonitor:
Should show:
Verify custom metrics are available:
Expected output:
Query specific metric:
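As a rough sketch of what such a query looks like, the helper below builds the request path for a pod-scoped metric on the `custom.metrics.k8s.io/v1beta1` API and reads the per-pod values out of the returned `MetricValueList`. The namespace `default` in the usage note is a placeholder, and whether `asr_active_requests` is served under that exact name depends on your adapter rules, so treat both as assumptions.

```python
from urllib.parse import quote

# Root of the Kubernetes custom metrics API served by the Prometheus Adapter.
API_ROOT = "/apis/custom.metrics.k8s.io/v1beta1"


def pod_metric_path(namespace, metric, pod="*"):
    """Build the API path for a pod-scoped custom metric.

    pod="*" asks for the metric on every pod in the namespace.
    """
    return (
        f"{API_ROOT}/namespaces/{quote(namespace, safe='')}"
        f"/pods/{quote(pod, safe='*')}/{quote(metric, safe='')}"
    )


def metric_values(payload):
    """Map pod name -> metric value from a MetricValueList response.

    Values are Kubernetes quantity strings (for example "4" or "500m"),
    so they are returned unparsed.
    """
    return {
        item["describedObject"]["name"]: item["value"]
        for item in payload.get("items", [])
    }
```

The path can be passed to `kubectl get --raw`, for example with `pod_metric_path("default", "asr_active_requests")`, and the JSON response decoded and handed to `metric_values`.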
To expose additional metrics for your own autoscaling setup:
Configure how long metrics are stored:
Persist Prometheus data:
Adjust how frequently metrics are collected:
Lower intervals (e.g., 15s) surface metric changes faster but increase storage use.
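To make the trade-off concrete, the sketch below applies the sizing rule from the Prometheus storage documentation (disk space ≈ retention time in seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample). The series count and retention period are hypothetical inputs, not values taken from the Lightning ASR chart.

```python
SECONDS_PER_DAY = 86_400


def estimate_disk_bytes(active_series, scrape_interval_s, retention_days,
                        bytes_per_sample=2.0):
    """Rough Prometheus disk estimate.

    Each active series yields one sample per scrape, so the ingestion
    rate is active_series / scrape_interval_s samples per second.
    """
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * SECONDS_PER_DAY
    return retention_seconds * samples_per_second * bytes_per_sample


if __name__ == "__main__":
    # Hypothetical workload: 10,000 active series kept for 15 days.
    for interval in (15, 30, 60):
        size = estimate_disk_bytes(10_000, interval, retention_days=15)
        print(f"{interval:>2}s interval: ~{size / 1e9:.2f} GB")
```

Halving the scrape interval doubles the estimate, so the interval is worth choosing alongside the retention period rather than in isolation.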
Pre-compute expensive queries:
Use recording rules in your autoscaling queries for better performance.
Create alerts for anomalies:
Directly query Lightning ASR metrics:
Expected output:
Access Prometheus UI and test queries:
Navigate to: http://localhost:9090/targets
Verify that Lightning ASR targets are “UP”.
Look for scrape errors.
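The same check can be scripted against the Prometheus HTTP API, which serves the targets page data at `/api/v1/targets`. This is a minimal sketch that assumes the port-forward to `localhost:9090` is active; the job name used in the usage note is a hypothetical placeholder.

```python
import json
import urllib.request


def fetch_targets(base_url="http://localhost:9090"):
    """Fetch the target list from the Prometheus HTTP API."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=10) as resp:
        return json.load(resp)


def unhealthy_targets(payload, job_substring=""):
    """Return (scrape_url, health, last_error) for targets that are not up.

    job_substring optionally limits the check to targets whose `job`
    label contains that text.
    """
    problems = []
    for target in payload.get("data", {}).get("activeTargets", []):
        job = target.get("labels", {}).get("job", "")
        if job_substring and job_substring not in job:
            continue
        if target.get("health") != "up":
            problems.append((
                target.get("scrapeUrl", ""),
                target.get("health", "unknown"),
                target.get("lastError", ""),
            ))
    return problems
```

Calling `unhealthy_targets(fetch_targets(), job_substring="lightning-asr")` returns an empty list when every matching target is up; any entries carry the scrape error reported by Prometheus.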
Check that the ServiceMonitor was created:
Check that Prometheus is discovering the targets:
Check that the service exposes a metrics port:
Should show:
Check Prometheus Adapter logs:
Verify adapter configuration:
Test API manually:
If Prometheus is using too much memory:
Pre-compute expensive queries:
Then use the recorded metric in your autoscaling logic instead of the raw query.
Balance responsiveness vs storage:
Always persist Prometheus data:
Track Prometheus performance: