Optionally collect real GPU utilization metrics via NVIDIA DCGM
(Data Center GPU Manager). Currently, EC2 GPU utilization is inferred
from CPU/network proxy signals. DCGM reports actual GPU compute and
memory utilization, which removes the false positives those proxies
produce on instances with high GPU but low CPU usage.
- Query the DCGM exporter's Prometheus endpoint if available
- Fall back to proxy signals when DCGM is not present
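
The query-then-fall-back behavior could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the endpoint URL (dcgm-exporter commonly serves `/metrics` on port 9400), the choice of `DCGM_FI_DEV_GPU_UTIL` as the metric, the averaging across GPUs, and the helper names `fetch_dcgm_metrics`, `parse_gpu_util`, and `gpu_utilization` are all assumptions to verify against the actual deployment.

```python
from __future__ import annotations

import re
import urllib.request
from typing import Callable, Optional

# Assumed defaults; check them against the dcgm-exporter actually deployed.
DCGM_URL = "http://localhost:9400/metrics"
GPU_UTIL_METRIC = "DCGM_FI_DEV_GPU_UTIL"  # GPU utilization in percent

# Matches one Prometheus text-format sample: name, optional {labels}, value.
# Simplification: assumes label values never contain a "}" character.
_SAMPLE_RE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)")


def fetch_dcgm_metrics(url: str = DCGM_URL, timeout: float = 2.0) -> Optional[str]:
    """Return the raw metrics text, or None if the exporter is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts, and HTTP errors;
        # ValueError covers a malformed URL.
        return None


def parse_gpu_util(metrics_text: str, metric: str = GPU_UTIL_METRIC) -> list[float]:
    """Extract one utilization value per GPU from the metrics text."""
    values = []
    for line in metrics_text.splitlines():
        if line.startswith("#"):  # skip HELP/TYPE comment lines
            continue
        match = _SAMPLE_RE.match(line)
        if match and match.group(1) == metric:
            try:
                values.append(float(match.group(2)))
            except ValueError:
                continue
    return values


def gpu_utilization(
    proxy_estimate: Callable[[], float], url: str = DCGM_URL
) -> tuple[float, str]:
    """Return (utilization percent, source), preferring DCGM over the proxy."""
    text = fetch_dcgm_metrics(url)
    if text is not None:
        values = parse_gpu_util(text)
        if values:
            return sum(values) / len(values), "dcgm"
    # DCGM absent or reporting nothing: fall back to the existing proxy signal.
    return proxy_estimate(), "proxy"
```

Returning the source alongside the value lets downstream checks tell a measured reading from an inferred one, so proxy-based results can be treated with less confidence.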