Scan your AWS account for GPU waste and get actionable recommendations to cut your cloud spend.
$ gpuaudit scan --profile ml-prod
GPU Fleet Summary
Total GPU instances: 14
Total monthly GPU spend: $47,832
Estimated monthly waste: $18,240 (38%)
CRITICAL (3 instances, $8,940/mo potential savings)
i-0a1b2c3d4e g5.12xlarge (4x A10G) $4,380/mo Idle — no activity for 18 days → terminate
i-9f8e7d6c5b p4d.24xlarge (8x A100) $23,652/mo Idle — <1% CPU for 6 days → terminate
sagemaker:asr ml.g6.48xlarge (8x L4) $9,490/mo GPU util avg 8% → downsize to ml.g5.xlarge
- Idle GPU instances — running but doing nothing (low CPU + near-zero network for 24+ hours)
- Oversized GPU — multi-GPU instances where utilization suggests a single GPU would suffice
- Pricing mismatch — on-demand instances running 30+ days that should be Reserved Instances
- Stale instances — non-production instances running 90+ days
- SageMaker low utilization — endpoints with <10% GPU utilization
- SageMaker oversized — endpoints using <30% GPU memory on multi-GPU instances
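
The thresholds in the list above can be read as simple predicates. The following is a minimal sketch of that reading, not gpuaudit's actual rules engine: the `Instance` type, its field names, and the rule identifiers are all hypothetical. The EC2 "oversized GPU" rule is left out because the list gives no numeric threshold for it.

```go
package main

import "fmt"

// Instance is a hypothetical, simplified view of one scanned resource.
// Field names are illustrative and are not taken from gpuaudit's source.
type Instance struct {
	Name       string
	SageMaker  bool    // true for SageMaker endpoints, false for EC2
	Production bool    // false for dev/test/staging resources
	OnDemand   bool    // true when billed at on-demand rates
	GPUCount   int     // number of GPUs on the instance type
	UptimeDays int     // days since launch
	IdleHours  int     // consecutive hours of low CPU and near-zero network
	GPUUtilPct float64 // average GPU utilization (SageMaker endpoints)
	GPUMemPct  float64 // average GPU memory use (SageMaker endpoints)
}

// detect returns the (hypothetical) names of the rules an instance trips,
// using the thresholds stated in the list above.
func detect(in Instance) []string {
	var signals []string
	if in.SageMaker {
		if in.GPUUtilPct < 10 {
			signals = append(signals, "sagemaker-low-utilization")
		}
		if in.GPUCount > 1 && in.GPUMemPct < 30 {
			signals = append(signals, "sagemaker-oversized")
		}
		return signals
	}
	if in.IdleHours >= 24 {
		signals = append(signals, "idle")
	}
	if in.OnDemand && in.UptimeDays >= 30 {
		signals = append(signals, "pricing-mismatch")
	}
	if !in.Production && in.UptimeDays >= 90 {
		signals = append(signals, "stale")
	}
	return signals
}

func main() {
	// Made-up inputs for illustration only.
	fleet := []Instance{
		{Name: "dev-box", OnDemand: true, GPUCount: 4, UptimeDays: 120, IdleHours: 48},
		{Name: "endpoint", SageMaker: true, GPUCount: 8, GPUUtilPct: 8, GPUMemPct: 22},
	}
	for _, in := range fleet {
		fmt.Println(in.Name, detect(in))
	}
}
```

A single resource can trip several rules at once, which is why the sketch returns a slice rather than one label.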
go install github.com/gpuaudit/cli/cmd/gpuaudit@latest

Or build from source:
git clone https://github.com/gpuaudit/cli.git
cd cli
go build -o gpuaudit ./cmd/gpuaudit

# Uses default AWS credentials (~/.aws/credentials or environment variables)
gpuaudit scan
# Specific profile and region
gpuaudit scan --profile production --region us-east-1
# JSON output for automation
gpuaudit scan --format json --output report.json
# Markdown for docs/PRs
gpuaudit scan --format markdown
# Slack Block Kit payload (pipe to webhook)
gpuaudit scan --format slack --output - | curl -X POST -H 'Content-Type: application/json' -d @- "$SLACK_WEBHOOK"
# Skip CloudWatch metrics (faster, less accurate)
gpuaudit scan --skip-metrics
# Skip SageMaker scanning
gpuaudit scan --skip-sagemaker

gpuaudit is read-only. It never modifies your infrastructure. Generate the minimal IAM policy:

gpuaudit iam-policy

This outputs a JSON policy requiring only Describe*, List*, Get* permissions for EC2, SageMaker, CloudWatch, Cost Explorer, and Pricing APIs.
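
To give a feel for the shape of such a policy, here is a rough sketch that builds a read-only IAM policy document in Go. It is an approximation under stated assumptions, not the output of `gpuaudit iam-policy`: the helper `readOnlyPolicy` is hypothetical, the service prefixes (`ec2`, `sagemaker`, `cloudwatch`, `ce`, `pricing`) are the standard IAM prefixes for the services named above, and the real command very likely emits a narrower, per-action list.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statement and policy model the standard IAM JSON policy layout.
type statement struct {
	Effect   string   `json:"Effect"`
	Action   []string `json:"Action"`
	Resource string   `json:"Resource"`
}

type policy struct {
	Version   string      `json:"Version"`
	Statement []statement `json:"Statement"`
}

// readOnlyPolicy (hypothetical helper) grants Describe*, List*, and Get*
// on each IAM service prefix it is given, and nothing else.
func readOnlyPolicy(services []string) policy {
	var actions []string
	for _, svc := range services {
		for _, verb := range []string{"Describe*", "List*", "Get*"} {
			actions = append(actions, svc+":"+verb)
		}
	}
	return policy{
		Version: "2012-10-17",
		Statement: []statement{
			{Effect: "Allow", Action: actions, Resource: "*"},
		},
	}
}

func main() {
	// Assumed IAM prefixes: "ce" is Cost Explorer, "pricing" is the Price List API.
	p := readOnlyPolicy([]string{"ec2", "sagemaker", "cloudwatch", "ce", "pricing"})
	out, err := json.MarshalIndent(p, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because no action here can create, modify, or delete a resource, a role limited to a policy of this shape cannot change infrastructure even if the tool misbehaves.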
# List all GPU instance pricing
gpuaudit pricing
# Filter by GPU model
gpuaudit pricing --gpu H100
gpuaudit pricing --gpu A10G
gpuaudit pricing --gpu T4

| Format | Flag | Use case |
|---|---|---|
| Table | --format table (default) | Terminal viewing |
| JSON | --format json | Automation, CI/CD pipelines |
| Markdown | --format markdown | PRs, wikis, docs |
| Slack | --format slack | Slack webhook integration |
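
As an illustration of the automation use case, the sketch below sums the potential savings of critical findings from a JSON report, which a CI job could use as a gate. The report schema here (`findings`, `resource`, `severity`, `monthly_savings`) is entirely an assumption made for this example, and the sample data is made up; check the field names in a real `report.json` before relying on anything like this.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// finding and report mirror a HYPOTHETICAL report layout.
// The actual field names in gpuaudit's JSON output may differ.
type finding struct {
	Resource       string  `json:"resource"`
	Severity       string  `json:"severity"`
	MonthlySavings float64 `json:"monthly_savings"`
}

type report struct {
	Findings []finding `json:"findings"`
}

// criticalSavings returns the number of critical findings and the sum
// of their estimated monthly savings.
func criticalSavings(raw []byte) (int, float64, error) {
	var r report
	if err := json.Unmarshal(raw, &r); err != nil {
		return 0, 0, err
	}
	count, total := 0, 0.0
	for _, f := range r.Findings {
		if strings.EqualFold(f.Severity, "critical") {
			count++
			total += f.MonthlySavings
		}
	}
	return count, total, nil
}

func main() {
	// Made-up sample in the assumed schema.
	sample := []byte(`{"findings":[
		{"resource":"i-0a1b2c3d4e","severity":"critical","monthly_savings":4380},
		{"resource":"example-endpoint","severity":"warning","monthly_savings":1200}
	]}`)
	n, total, err := criticalSavings(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d critical finding(s), $%.0f/mo potential savings\n", n, total)
}
```

In a pipeline, `main` would read the file written by `--output report.json` instead of an inline sample and exit non-zero when the count is above zero.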
- Discovery — Scans EC2 and SageMaker across multiple regions for GPU instance families (g4dn, g5, g6, g6e, p4d, p4de, p5, inf2, trn1)
- Metrics — Collects 7-day CloudWatch metrics: CPU, network I/O for EC2; GPU utilization, GPU memory, invocations for SageMaker
- Analysis — Applies 6 waste detection rules with severity levels (critical/warning)
- Recommendations — Generates specific actions (terminate, downsize, switch pricing) with estimated monthly savings
Regions scanned by default: us-east-1, us-east-2, us-west-2, eu-west-1, eu-west-2, eu-central-1, ap-southeast-1, ap-northeast-1, ap-south-1.
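
The discovery step boils down to matching instance types against the GPU family list. A minimal sketch of that check, assuming the families listed above and a hypothetical `isGPUType` helper (the real scanner may match differently, for example via its bundled pricing database):

```go
package main

import (
	"fmt"
	"strings"
)

// gpuFamilies holds the instance families named in the discovery step.
var gpuFamilies = map[string]bool{
	"g4dn": true, "g5": true, "g6": true, "g6e": true,
	"p4d": true, "p4de": true, "p5": true,
	"inf2": true, "trn1": true,
}

// family extracts the family from an EC2 type ("g5.12xlarge") or a
// SageMaker type ("ml.g5.xlarge"), giving "g5" in both cases.
func family(instanceType string) string {
	t := strings.TrimPrefix(instanceType, "ml.")
	fam, _, _ := strings.Cut(t, ".")
	return fam
}

// isGPUType (hypothetical helper) reports whether the instance type
// belongs to one of the scanned families. The match is exact on the
// family name, so "g5g.xlarge" does not count as "g5".
func isGPUType(instanceType string) bool {
	return gpuFamilies[family(instanceType)]
}

func main() {
	for _, t := range []string{"g5.12xlarge", "ml.g6.48xlarge", "m5.large", "g5g.xlarge"} {
		fmt.Printf("%-16s gpu=%v\n", t, isGPUType(t))
	}
}
```

Matching on the exact family rather than a string prefix avoids false positives from families that merely share leading characters.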
gpuaudit/
├── cmd/gpuaudit/ CLI entry point (cobra)
├── internal/
│ ├── models/ Core data types (GPUInstance, WasteSignal, Recommendation)
│ ├── pricing/ Bundled GPU pricing database (40+ instance types)
│ ├── analysis/ Waste detection rules engine
│ ├── output/ Formatters (table, JSON, markdown, Slack)
│ └── providers/aws/ EC2, SageMaker, CloudWatch, scanner orchestrator
└── LICENSE Apache 2.0
- AWS Cost Explorer integration (actual vs projected spend)
- EKS GPU pod discovery
- SageMaker training job analysis
- Multi-account (AWS Organizations) scanning
- GCP + Azure support
- GitHub Action for scheduled scans
- Historical scan comparison (gpuaudit diff)
Apache 2.0