
[Kernel Feature] Accelerator-Aware Resource Limits - cgroups v2 Wrapper for AI #222

@mikejmorgan-ai


Accelerator-Aware Resource Limits - cgroups v2 Wrapper for AI Workloads

Part of: Kernel-Level AI Enhancements (Tier 1 - User-Space)

Description

An AI-friendly CLI wrapper around Linux cgroups v2 for managing GPU, CPU, and memory resources, with workload presets for common AI job types.

Effort: 1-2 weeks | Bounty: $100

The Solution

```shell
cortex limits create inference-job --preset inference --gpus 2
cortex limits apply inference-job --pid 12345
eval $(cortex limits env inference-job)  # for new processes
cortex limits status inference-job
```
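The `apply` step maps naturally onto the cgroups v2 process-migration mechanism: writing a PID into a group's `cgroup.procs` file moves the process into that group. A minimal sketch (the helper name and `cortex-` group prefix are assumptions, not the actual `accelerator_limits.py` API; the cgroup root is a parameter so the logic can be exercised outside `/sys/fs/cgroup`):

```python
from pathlib import Path

def apply_to_pid(profile: str, pid: int, cgroup_root: str = "/sys/fs/cgroup") -> Path:
    """Move a running process into the profile's cgroup.

    Hypothetical helper illustrating the cgroups v2 mechanism; the real
    implementation may name things differently.
    """
    group = Path(cgroup_root) / f"cortex-{profile}"
    group.mkdir(parents=True, exist_ok=True)
    # cgroups v2: writing a PID to cgroup.procs migrates the whole process
    (group / "cgroup.procs").write_text(str(pid))
    return group
```

Running against the real hierarchy requires either root or a delegated user subtree.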

Workload Presets

| Preset      | CPU      | Memory | GPU  | OOM Score | Use Case              |
|-------------|----------|--------|------|-----------|-----------------------|
| inference   | 4 cores  | 32G    | 100% | -500      | Low-latency serving   |
| training    | 16 cores | 128G   | 100% | -800      | Long training jobs    |
| batch       | 8 cores  | 64G    | 80%  | 0         | Background processing |
| interactive | 2 cores  | 16G    | 50%  | -200      | Development           |
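As a sketch of how the presets above might translate into cgroups v2 control-file values (the dictionary keys and helper names are illustrative, not the actual implementation): `cpu.max` takes the form `"<quota_us> <period_us>"`, and memory limits are plain byte counts.

```python
# Preset table transcribed from this issue; field names are assumptions.
PRESETS = {
    "inference":   {"cpus": 4,  "memory": "32G",  "gpu_pct": 100, "oom_score_adj": -500},
    "training":    {"cpus": 16, "memory": "128G", "gpu_pct": 100, "oom_score_adj": -800},
    "batch":       {"cpus": 8,  "memory": "64G",  "gpu_pct": 80,  "oom_score_adj": 0},
    "interactive": {"cpus": 2,  "memory": "16G",  "gpu_pct": 50,  "oom_score_adj": -200},
}

def cpu_max(cpus: int, period_us: int = 100_000) -> str:
    # cgroups v2 cpu.max format: "<quota_us> <period_us>";
    # N full cores = N * period microseconds of quota per period.
    return f"{cpus * period_us} {period_us}"

def mem_bytes(spec: str) -> int:
    # Convert a "32G"-style suffix spec into bytes for memory.max.
    units = {"K": 1024, "M": 1024**2, "G": 1024**3}
    return int(spec[:-1]) * units[spec[-1]]
```

For example, the `inference` preset would yield `cpu.max = "400000 100000"` and a `memory.max` of 32 GiB in bytes.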

Features

  • cgroups v2 unified hierarchy support
  • Workload presets with sensible defaults
  • NVIDIA GPU isolation via environment variables
  • OOM score adjustment for priority
  • CPU quota, weight, and affinity
  • Memory hard and soft limits
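GPU isolation via environment variables typically means emitting `CUDA_VISIBLE_DEVICES` (and usually `CUDA_DEVICE_ORDER=PCI_BUS_ID` so indices are stable), which is what `cortex limits env` presumably generates for `eval`. A minimal sketch, assuming GPUs are assigned from index 0 upward (the real tool may select devices differently, e.g. by current utilization):

```python
def gpu_env(num_gpus: int, total_gpus: int = 8) -> dict:
    """Build NVIDIA isolation env vars for a profile requesting num_gpus GPUs.

    Illustrative only: device-selection policy here (lowest indices first)
    is an assumption about the implementation.
    """
    visible = ",".join(str(i) for i in range(min(num_gpus, total_gpus)))
    return {
        # Pin device enumeration order so indices match nvidia-smi output
        "CUDA_DEVICE_ORDER": "PCI_BUS_ID",
        # Restrict the process to the selected devices
        "CUDA_VISIBLE_DEVICES": visible,
    }
```

CUDA applications launched with these variables see only the listed devices, renumbered from 0.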

Acceptance Criteria

  • Profiles can be created with presets
  • Limits are correctly applied to cgroups
  • GPU environment variables are generated
  • Works in user mode with delegation
  • Unit tests pass
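For the "works in user mode with delegation" criterion: on systemd hosts, rootless cgroup v2 management requires the user session's subtree to have controllers delegated. This is the standard systemd drop-in for that (which controllers `cortex` actually needs is an assumption):

```ini
# /etc/systemd/system/user@.service.d/delegate.conf
# Delegate cgroup v2 controllers to user sessions so cortex can
# create and manage subgroups without root.
[Service]
Delegate=cpu cpuset memory pids
```

After `systemctl daemon-reload` and re-login, the delegated controllers appear in the user's `cgroup.controllers` file.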

Files

Complete implementation available:

  • accelerator_limits.py (~700 lines)
  • test_accelerator_limits.py (~300 lines)
  • README_ACCELERATOR_LIMITS.md

Priority

Medium - Important for multi-tenant environments
