NVIDIA GPU performance monitoring for the crucible performance testing framework. Collects GPU metrics using the NVIDIA Management Library (pynvml) during benchmark execution.
- GPU utilization
- GPU temperature
- Memory usage
- Power consumption
Nvidia runs as a profiler tool on endpoint nodes with NVIDIA GPUs. It is allowed on profiler, master, worker, and compute collector roles but blocked on client and server roles. The post-processor (nvidia-post-process) converts raw GPU metrics into crucible's canonical metric format.