Closed
Labels
bounty, enhancement (New feature or request), kernel-features (Kernel-level AI enhancements), tier-1 (Tier 1 priority feature)
Accelerator-Aware Resource Limits - cgroups v2 Wrapper for AI Workloads
Part of: Kernel-Level AI Enhancements (Tier 1 - User-Space)
Description
Wraps Linux cgroups v2 with an AI-friendly CLI for managing GPU, CPU, and memory resources with workload presets.
Effort: 1-2 weeks | Bounty: $100
The Solution
```
cortex limits create inference-job --preset inference --gpus 2
cortex limits apply inference-job --pid 12345
eval $(cortex limits env inference-job)  # for new processes
cortex limits status inference-job
```
Workload Presets
| Preset | CPU | Memory | GPU | OOM Score | Use Case |
|---|---|---|---|---|---|
| inference | 4 cores | 32G | 100% | -500 | Low-latency serving |
| training | 16 cores | 128G | 100% | -800 | Long training jobs |
| batch | 8 cores | 64G | 80% | 0 | Background processing |
| interactive | 2 cores | 16G | 50% | -200 | Development |
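The preset table might be encoded roughly as follows. This is an illustrative sketch, not the actual `accelerator_limits.py` API; the `PRESETS` dict, its field names, and the `cpu_max` helper are assumptions. The `cpu.max` rendering follows the cgroups v2 convention of `"<quota> <period>"` in microseconds:

```python
# Hypothetical encoding of the preset table above (names and fields
# are illustrative, not the issue's actual implementation).
PRESETS = {
    "inference":   {"cpu_cores": 4,  "memory": "32G",  "gpu_pct": 100, "oom_score_adj": -500},
    "training":    {"cpu_cores": 16, "memory": "128G", "gpu_pct": 100, "oom_score_adj": -800},
    "batch":       {"cpu_cores": 8,  "memory": "64G",  "gpu_pct": 80,  "oom_score_adj": 0},
    "interactive": {"cpu_cores": 2,  "memory": "16G",  "gpu_pct": 50,  "oom_score_adj": -200},
}

def cpu_max(preset: str, period_us: int = 100_000) -> str:
    """Render the cgroup v2 cpu.max value ("<quota> <period>") for a preset:
    N cores means a quota of N full periods of CPU time per period."""
    quota = PRESETS[preset]["cpu_cores"] * period_us
    return f"{quota} {period_us}"

print(cpu_max("inference"))  # 4 cores -> "400000 100000"
```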
Features
- cgroups v2 unified hierarchy support
- Workload presets with sensible defaults
- NVIDIA GPU isolation via environment variables
- OOM score adjustment for priority
- CPU quota, weight, and affinity
- Memory hard and soft limits
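To make the features above concrete, here is a sketch of the mechanics a wrapper like this typically uses: hard/soft memory limits map to the cgroup v2 `memory.max` and `memory.high` files, and NVIDIA GPU isolation is done by restricting `CUDA_VISIBLE_DEVICES`. The function names, the 90% soft-limit ratio, and the dict-of-writes shape are assumptions for illustration; the real implementation lives in `accelerator_limits.py`:

```python
def parse_size(s: str) -> int:
    """Convert a human-readable size like '32G' to bytes for cgroup files."""
    units = {"K": 1024, "M": 1024**2, "G": 1024**3}
    if s[-1] in units:
        return int(s[:-1]) * units[s[-1]]
    return int(s)

def gpu_env(gpu_ids: list[int]) -> dict[str, str]:
    """NVIDIA isolation via environment: only the listed GPUs are
    visible to CUDA applications launched with this environment."""
    return {"CUDA_VISIBLE_DEVICES": ",".join(str(i) for i in gpu_ids)}

def memory_writes(memory: str, soft_ratio: float = 0.9) -> dict[str, str]:
    """Values a wrapper would write under /sys/fs/cgroup/<profile>/
    (illustrative; assumes a delegated cgroup v2 subtree)."""
    hard = parse_size(memory)
    return {
        "memory.max": str(hard),                       # hard limit: OOM kill above this
        "memory.high": str(int(hard * soft_ratio)),    # soft limit: reclaim/throttle first
    }

print(gpu_env([0, 1]))  # {'CUDA_VISIBLE_DEVICES': '0,1'}
print(memory_writes("32G")["memory.max"])
```

OOM priority works the same way: the preset's score (e.g. -500 for inference) is written to `/proc/<pid>/oom_score_adj` so latency-sensitive jobs are killed last.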
Acceptance Criteria
- Profiles can be created with presets
- Limits are correctly applied to cgroups
- GPU environment variables are generated
- Works in user mode with delegation
- Unit tests pass
Files
Complete implementation available:
- accelerator_limits.py (~700 lines)
- test_accelerator_limits.py (~300 lines)
- README_ACCELERATOR_LIMITS.md
Priority
Medium - Important for multi-tenant environments