feat: Complete accelerator-aware resource limits (Issue #222) #230
Conversation
Closes cortexlinux#222

## Summary

- Full cgroups v2 integration with unified hierarchy support
- Workload presets (inference, training, batch, interactive)
- CPU quota/weight, memory max/high limits
- GPU isolation via CUDA_VISIBLE_DEVICES
- OOM score adjustment for process priority
- User mode delegation support
- 32 passing unit tests

## Files

- cortex/kernel_features/accelerator_limits.py (718 lines)
- tests/test_accelerator_limits.py (394 lines)
- README_ACCELERATOR_LIMITS.md
### Walkthrough

Introduces accelerator-limits functionality for managing GPU, CPU, and memory resources for AI workloads via cgroups v2. Adds comprehensive resource management classes (AcceleratorLimitsManager, LimitsDatabase, CgroupsV2Controller, OOMScoreManager) with a CLI supporting profile creation, application to live processes, environment variable generation, status reporting, and preset-based configuration.
### Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant CLI as CLI<br/>(main)
    participant ALM as AcceleratorLimitsManager
    participant DB as LimitsDatabase
    participant Cgroups as CgroupsV2Controller
    participant OOM as OOMScoreManager
    participant SysFS as Linux cgroups v2
    User->>CLI: cortex limits apply profile-name --pid 12345
    CLI->>ALM: apply(profile_name, pid)
    ALM->>DB: get(profile_name)
    DB-->>ALM: ResourceLimits
    ALM->>Cgroups: create_cgroup(path)
    Cgroups->>SysFS: mkdir cgroup hierarchy
    SysFS-->>Cgroups: success
    Cgroups-->>ALM: true
    ALM->>Cgroups: apply_cpu_limits(path, quota, weight)
    Cgroups->>SysFS: write cpu.max, cpu.weight
    SysFS-->>Cgroups: success
    Cgroups-->>ALM: true
    ALM->>Cgroups: apply_memory_limits(path, max, high)
    Cgroups->>SysFS: write memory.max, memory.high
    SysFS-->>Cgroups: success
    Cgroups-->>ALM: true
    ALM->>Cgroups: add_pid(path, pid)
    Cgroups->>SysFS: write cgroup.procs
    SysFS-->>Cgroups: success
    Cgroups-->>ALM: true
    ALM->>OOM: set_oom_score_adj(pid, score)
    OOM->>SysFS: write /proc/[pid]/oom_score_adj
    SysFS-->>OOM: success
    OOM-->>ALM: true
    ALM->>DB: record_pid(pid, profile_name)
    DB-->>ALM: recorded
    ALM-->>CLI: true (success)
    CLI-->>User: Profile applied successfully
```
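The cgroups steps in the diagram ultimately reduce to writing values into files under the cgroups v2 hierarchy. A minimal sketch of that mechanism, simulating `/sys/fs/cgroup` with a temporary directory (the function and argument names here are illustrative, not the PR's actual API):

```python
import tempfile
from pathlib import Path

def apply_cpu_memory_limits(cgroup_root: Path, name: str,
                            cpu_quota_pct: int, mem_max_bytes: int) -> Path:
    """Create a cgroup dir and write cpu.max / memory.max (cgroups v2 style)."""
    cg = cgroup_root / "cortex.slice" / name
    cg.mkdir(parents=True, exist_ok=True)
    # cpu.max takes "<quota_us> <period_us>"; 100% of one CPU = "100000 100000".
    period_us = 100_000
    quota_us = cpu_quota_pct * period_us // 100
    (cg / "cpu.max").write_text(f"{quota_us} {period_us}")
    (cg / "memory.max").write_text(str(mem_max_bytes))
    return cg

# Simulate /sys/fs/cgroup with a temp dir for demonstration.
root = Path(tempfile.mkdtemp())
cg = apply_cpu_memory_limits(root, "demo-job", cpu_quota_pct=200,
                             mem_max_bytes=8 * 1024**3)
print((cg / "cpu.max").read_text())  # 200000 100000
```

Against the real `/sys/fs/cgroup`, the same writes require root or delegated user-mode permissions, which is what the PR's delegation support addresses.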
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (5 passed)
Actionable comments posted: 2
🧹 Nitpick comments (7)
cortex/kernel_features/accelerator_limits.py (4)
**29-37: Avoid configuring global logging at import time in a library module**

Calling `logging.basicConfig` at module import can interfere with applications' own logging configuration. Prefer configuring logging in the CLI entrypoint (or the top-level app) and keep this module to just creating a logger with `getLogger(__name__)`.

```diff
-# Set up logging
-logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')
-logger = logging.getLogger(__name__)
+# Set up module-level logger; configuration should be done by the caller/CLI.
+logger = logging.getLogger(__name__)
```

You can then configure logging inside `main()` if desired, e.g. `logging.basicConfig(...)` there, rather than at import time.
**243-266: Tighten broad `except Exception` usage and improve logging**

Several methods (`create_cgroup`, `apply_cpu_limits`, `apply_memory_limits`, `add_pid`, `get_pids`, and the `OOMScoreManager` methods) catch bare `Exception`, sometimes with `pass`, which can hide unexpected errors and complicate debugging.

- Narrow the caught exceptions where feasible (`OSError`, `IOError`, `ValueError`, etc.).
- Where you do intend to catch "anything" for robustness, log with `logger.exception(...)` instead of `logger.error(...)` or `pass` so stack traces are preserved.

For example:

```diff
-        except Exception as e:
-            logger.error(f"Failed to set CPU limits: {e}")
+        except Exception:
+            logger.exception("Failed to set CPU limits")
```

And for `try`/`except` blocks that currently just `pass`, at least log at debug level so silent failures are observable when needed.

Also applies to: 284-300, 313-329, 336-344, 351-357, 386-394, 400-404
**467-471: Remove unnecessary f-string in log message**

The string has no replacement fields, so the `f` prefix is extraneous:

```diff
-        else:
-            print(f"Could not set OOM score (may need root)")
+        else:
+            print("Could not set OOM score (may need root)")
             success = False
```

This also addresses the Ruff/Sonar warning about an f-string with no replacement fields.
**638-672: Optional: deduplicate repeated `"Profile name"` help text**

The literal `"Profile name"` is repeated across multiple subparsers; Sonar flags this for duplication. Not critical, but you could factor it into a small constant to centralize future changes.

```python
PROFILE_NAME_HELP = "Profile name"

create_parser.add_argument("name", help=PROFILE_NAME_HELP)
apply_parser.add_argument("name", help=PROFILE_NAME_HELP)
delete_parser.add_argument("name", help=PROFILE_NAME_HELP)
env_parser.add_argument("name", help=PROFILE_NAME_HELP)
```

This is purely cosmetic and can be deferred if you prefer.
tests/test_accelerator_limits.py (2)
**1-12: Consider removing the shebang from this test module**

The file has a `#!/usr/bin/env python3` shebang but is not typically executed directly (it's run via `pytest`/`unittest`). This triggers EXE001 because the file isn't executable, and the shebang is unnecessary for a test module.

```diff
-#!/usr/bin/env python3
-"""
-Unit tests for Cortex Accelerator-Aware Resource Limits
-"""
+"""
+Unit tests for Cortex Accelerator-Aware Resource Limits
+"""
```

Keeping tests as plain modules avoids the linter warning and matches common practice.
**212-223: Use `_` for the unused `path` variable in the cgroup creation test**

In `test_create_cgroup_mock`, `path` is unpacked but never used, which Ruff flags as RUF059. Using `_` clarifies that the second return value is intentionally ignored.

```diff
-        controller = CgroupsV2Controller(root)
-        success, path = controller.create_cgroup("cortex.slice/test-job")
+        controller = CgroupsV2Controller(root)
+        success, _ = controller.create_cgroup("cortex.slice/test-job")
```

This keeps the test behavior unchanged while addressing the linter warning.
README_ACCELERATOR_LIMITS.md (1)

**124-135: Add a language to the architecture fenced code block**

markdownlint flags this block for missing a language identifier. Since it's ASCII art / a diagram, `text` (or `none`) would be appropriate:

````diff
-```
+```text
 AcceleratorLimitsManager
 ├── LimitsDatabase (SQLite storage)
 ├── CgroupsV2Controller (cgroups interface)
 └── OOMScoreManager (OOM score adjustment)
@@
 ├── GPU: device IDs, percentage
 └── OOM: score adjustment
````

This keeps rendering unchanged while satisfying MD040.

📜 Review details

**Configuration used**: CodeRabbit UI · **Review profile**: CHILL · **Plan**: Pro

📥 Commits: reviewing files that changed from the base of the PR and between da3e6355b164ac48826016b8745cbdb2a59df02b and e7ec868a2a3b2206c6e3d5af92fe60faa8713c97.

📒 Files selected for processing (3):

- `README_ACCELERATOR_LIMITS.md` (1 hunk)
- `cortex/kernel_features/accelerator_limits.py` (1 hunk)
- `tests/test_accelerator_limits.py` (1 hunk)

🧬 Code graph analysis (1): `tests/test_accelerator_limits.py` uses 24 definitions from `cortex/kernel_features/accelerator_limits.py`, including `ResourceLimits` (90-135), `LimitsDatabase` (138-213), `CgroupsV2Controller` (216-368), `OOMScoreManager` (371-405), `AcceleratorLimitsManager` (408-613), `WorkloadPreset` (39-44), and `main` (616-714).

🪛 GitHub Check: SonarCloud Code Analysis

- `tests/test_accelerator_limits.py` — [warning] 219: Replace the unused local variable "path" with "_".
- `cortex/kernel_features/accelerator_limits.py` — [warning] 470: Add replacement fields or use a normal string instead of an f-string.
- `cortex/kernel_features/accelerator_limits.py` — [failure] 243: Refactor this function to reduce its Cognitive Complexity from 17 to the 15 allowed.
- `cortex/kernel_features/accelerator_limits.py` — [failure] 639: Define a constant instead of duplicating this literal "Profile name" 4 times.

🪛 markdownlint-cli2 (0.18.1)

- `README_ACCELERATOR_LIMITS.md` — 124: Fenced code blocks should have a language specified (MD040, fenced-code-language)

🪛 Ruff (0.14.7)

- `tests/test_accelerator_limits.py` — 1: Shebang is present but file is not executable (EXE001); 219: Unpacked variable `path` is never used (RUF059)
- `cortex/kernel_features/accelerator_limits.py` — 265, 298, 327, 342, 356, 392, 403: Do not catch blind exception: `Exception` (BLE001); 294, 323, 366: Consider moving this statement to an `else` block (TRY300); 299, 328, 343, 393: Use `logging.exception` instead of `logging.error` (TRY400); 356-357, 403-404: `try`-`except`-`pass` detected, consider logging the exception (S110); 470: f-string without any placeholders (F541)

⏰ Context from checks skipped due to timeout of 90000ms (1): GitHub Check: Agent
```python
    def __post_init__(self):
        if self.gpu_ids is None:
            self.gpu_ids = []
        if self.cpu_affinity is None:
            self.cpu_affinity = []
        if not self.cgroup_path:
            self.cgroup_path = f"{CORTEX_CGROUP}/{self.name}"
```
**Sanitize profile names before embedding them into cgroup paths**

`cgroup_path` is built directly from `name` and later used as a relative path under `/sys/fs/cgroup`. If a caller passes characters like `..` or path separators, this can escape the intended `CORTEX_CGROUP` subtree and potentially alter limits of unrelated cgroups when run as root.

```diff
 def __post_init__(self):
     if self.gpu_ids is None:
         self.gpu_ids = []
     if self.cpu_affinity is None:
         self.cpu_affinity = []
     if not self.cgroup_path:
-        self.cgroup_path = f"{CORTEX_CGROUP}/{self.name}"
+        # Sanitize profile name to avoid escaping the Cortex cgroup subtree.
+        safe_name = "".join(
+            c for c in self.name
+            if c.isalnum() or c in ("-", "_", ".")
+        )
+        if not safe_name or set(safe_name) <= {"."}:
+            # Reject empty and all-dot names ("." / "..") outright.
+            raise ValueError(f"Invalid profile name {self.name!r}")
+        self.cgroup_path = f"{CORTEX_CGROUP}/{safe_name}"
```

Note that `.` stays in the allow-list (for names like `job.v2`), so all-dot results must be rejected explicitly or a name of `..` would still traverse upward. This keeps existing tests (which use simple names) passing while preventing path traversal via the profile name.
🤖 Prompt for AI Agents

```text
In cortex/kernel_features/accelerator_limits.py around lines 104-111, sanitize
self.name before embedding it into cgroup_path: compute a safe_name (e.g., use
os.path.basename(self.name) or strip all path separators and '..', then replace
any non-alphanumeric/_/- characters with a safe substitute or fall back to a
generated slug), ensure it has no leading '/' and no '..' segments, then build
cgroup_path using os.path.join(CORTEX_CGROUP, safe_name), normalize with
os.path.normpath, and finally assert or raise if the resulting path does not
start with CORTEX_CGROUP to prevent path traversal.
```
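A standalone sketch of the sanitization idea for experimentation (the allow-list mirrors the review suggestion; the explicit all-dot rejection is an extra safeguard, since `.` alone in the allow-list would otherwise let a name of `..` traverse upward):

```python
CORTEX_CGROUP = "cortex.slice"  # assumed constant from the module

def build_cgroup_path(name: str) -> str:
    # Keep only alphanumerics, '-', '_', and '.' from the profile name.
    safe_name = "".join(c for c in name if c.isalnum() or c in ("-", "_", "."))
    if not safe_name or set(safe_name) <= {"."}:
        # Reject empty and all-dot names ("." / "..") outright.
        raise ValueError(f"Invalid profile name {name!r}")
    return f"{CORTEX_CGROUP}/{safe_name}"

print(build_cgroup_path("train-job_1"))  # cortex.slice/train-job_1
print(build_cgroup_path("../../etc"))    # cortex.slice/....etc (single component, no traversal)
```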
```python
        # CPU hints
        if limits.cpu_quota > 0:
            cores = int(limits.cpu_quota / 100)
            env["OMP_NUM_THREADS"] = str(cores)
            env["MKL_NUM_THREADS"] = str(cores)
            env["OPENBLAS_NUM_THREADS"] = str(cores)
```
**Guard against `OMP_NUM_THREADS=0` when `cpu_quota < 100`**

`cores = int(limits.cpu_quota / 100)` will produce 0 if a user supplies a `cpu_quota` lower than 100%, leading to `OMP_NUM_THREADS=0` and similar values, which are invalid for most runtimes.

```diff
-        if limits.cpu_quota > 0:
-            cores = int(limits.cpu_quota / 100)
+        if limits.cpu_quota > 0:
+            # Ensure at least 1 thread even for sub-100% quotas.
+            cores = max(1, int(limits.cpu_quota / 100))
             env["OMP_NUM_THREADS"] = str(cores)
             env["MKL_NUM_THREADS"] = str(cores)
             env["OPENBLAS_NUM_THREADS"] = str(cores)
```

This keeps current behavior for all presets (multiples of 100) while making custom low-CPU profiles safe.
📝 Committable suggestion

> ‼️ **IMPORTANT**: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test and benchmark the code to ensure it meets the requirements.

```python
        # CPU hints
        if limits.cpu_quota > 0:
            # Ensure at least 1 thread even for sub-100% quotas.
            cores = max(1, int(limits.cpu_quota / 100))
            env["OMP_NUM_THREADS"] = str(cores)
            env["MKL_NUM_THREADS"] = str(cores)
            env["OPENBLAS_NUM_THREADS"] = str(cores)
```
🤖 Prompt for AI Agents

```text
In cortex/kernel_features/accelerator_limits.py around lines 500 to 506, the
calculation cores = int(limits.cpu_quota / 100) can produce 0 for cpu_quota <
100, causing an invalid OMP_NUM_THREADS=0; change the calculation to ensure at
least 1 thread by using cores = max(1, int(limits.cpu_quota / 100)) (keeping
the existing if limits.cpu_quota > 0 guard) so presets (multiples of 100)
behave the same while low custom cpu_quota values result in a safe minimum of 1.
```
Pull request overview
This PR implements a comprehensive accelerator-aware resource limits system for AI workloads using cgroups v2. It introduces a Python-based wrapper that allows users to create and manage resource profiles with predefined workload presets (inference, training, batch, interactive) and apply them to processes. The implementation includes CPU quota/weight controls, memory limits (hard and soft), GPU device isolation via environment variables for multiple vendors (NVIDIA, AMD, Intel), OOM score adjustments, and SQLite-based profile storage with a full CLI interface.
Key Changes:
- Complete cgroups v2 controller implementation with unified hierarchy support and user mode delegation
- Four workload presets optimized for different AI use cases with sensible CPU, memory, GPU, and OOM defaults
- Multi-vendor GPU isolation supporting NVIDIA CUDA, AMD ROCm, and Intel oneAPI through environment variables
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 17 comments.
| File | Description |
|---|---|
| cortex/kernel_features/accelerator_limits.py | Complete implementation with ResourceLimits dataclass, LimitsDatabase (SQLite), CgroupsV2Controller, OOMScoreManager, AcceleratorLimitsManager, and CLI (718 lines) |
| tests/test_accelerator_limits.py | Comprehensive unit tests covering all major components with 32 test cases including preset validation, database operations, cgroups mocking, and CLI testing (394 lines) |
| README_ACCELERATOR_LIMITS.md | Complete documentation with quick start guide, preset descriptions, command reference, environment variables, user mode setup, architecture overview, and testing instructions |
Comments suppressed due to low confidence (1)

`cortex/kernel_features/accelerator_limits.py:22` — Import of `shutil` is not used.

```python
import shutil
```
```python
import logging

# Configuration
CORTEX_DB = Path.home() / ".cortex/limits.db"
```
**Copilot AI** · Dec 4, 2025
The database path uses Path.home() which resolves to the home directory of the user running the process. If the code is run with sudo or as a different user, this could lead to inconsistent database locations (e.g., /root/.cortex/limits.db vs /home/user/.cortex/limits.db). This could cause profiles created by one user to be invisible to another user, even when they have the necessary permissions to manage cgroups.
Consider documenting this behavior or using a consistent system-wide location (e.g., /var/lib/cortex/limits.db) when running as root, with fallback to user home directory for non-root users.
Suggested change:

```python
def get_cortex_db_path():
    if hasattr(os, "geteuid") and os.geteuid() == 0:
        db_path = Path("/var/lib/cortex/limits.db")
    else:
        db_path = Path.home() / ".cortex/limits.db"
    db_path.parent.mkdir(parents=True, exist_ok=True)
    return db_path

CORTEX_DB = get_cortex_db_path()
```
```text
Features:
- cgroups v2 unified hierarchy support
- Workload presets with sensible defaults
- NVIDIA GPU isolation via environment variables
```
**Copilot AI** · Dec 4, 2025
The documentation in the docstring at line 11 states "NVIDIA GPU isolation via environment variables", but the implementation also supports AMD ROCm (`HIP_VISIBLE_DEVICES`) and Intel oneAPI (`ONEAPI_DEVICE_SELECTOR`), as seen in lines 488-492. The docstring should be updated to reflect this multi-vendor support.

Suggested update:

```diff
-- NVIDIA GPU isolation via environment variables
+- GPU isolation via CUDA_VISIBLE_DEVICES, HIP_VISIBLE_DEVICES, ONEAPI_DEVICE_SELECTOR
```
```python
        for key, value in env.items():
            print(f"export {key}={value}")
```
**Copilot AI** · Dec 4, 2025
The environment variable values are printed without any shell escaping, which could lead to shell injection vulnerabilities if a malicious profile name or GPU IDs are stored in the database. While the values are currently generated from integers and controlled data, the profile name comes from user input and is displayed in error messages.

Consider using `shlex.quote()` to properly escape the values, especially since this output is intended to be evaluated with `eval $(cortex limits env ...)`:

```python
import shlex
# ...
for key, value in env.items():
    print(f"export {key}={shlex.quote(value)}")
```

Additionally, the `profile_name` used in the error message should be escaped or validated.
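A standalone sketch of how `shlex.quote` behaves on such output (the hostile value is contrived for illustration):

```python
import shlex

env = {
    "CUDA_VISIBLE_DEVICES": "0,1",      # safe characters pass through unquoted
    "CORTEX_PROFILE": "job; rm -rf ~",  # contrived hostile value gets quoted
}
for key, value in env.items():
    print(f"export {key}={shlex.quote(value)}")
```

Quoting neutralizes the semicolon, so `eval`-ing the output assigns the literal string instead of executing a second command.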
```python
CORTEX_CGROUP = "cortex.slice"

# Set up logging
logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')
```
**Copilot AI** · Dec 4, 2025
The `logging.basicConfig()` call at module level can interfere with the logging configuration of other modules or the main application, because `basicConfig()` only works if the root logger hasn't been configured yet, and calling it in a library module is generally discouraged.

Consider removing this line and letting the application configure logging, or check whether the root logger is already configured before calling `basicConfig()`:

```python
if not logging.getLogger().hasHandlers():
    logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')
```
```python
    if args.command == "create":
        limits = ResourceLimits.from_preset(args.name, args.preset, args.gpus)
```
**Copilot AI** · Dec 4, 2025
The `name` parameter from user input is used directly in f-strings for cgroup paths and database operations without validation. This could potentially lead to path traversal or SQL injection issues (though SQLite with parameterized queries provides some protection).

Consider adding validation to ensure the profile name only contains safe characters:

```python
import re

def validate_profile_name(name: str) -> bool:
    """Validate profile name contains only safe characters."""
    return bool(re.match(r'^[a-zA-Z0-9_-]+$', name))
```

This should be called in the `create()` method and other places where profile names are accepted from user input.
```python
import os
import sys
import json
```
**Copilot AI** · Dec 4, 2025
Import of `json` is not used.

```diff
-import json
```
```python
import tempfile
import unittest
from pathlib import Path
from unittest.mock import patch, MagicMock
```
**Copilot AI** · Dec 4, 2025
Import of `MagicMock` is not used.

```diff
-from unittest.mock import patch, MagicMock
+from unittest.mock import patch
```
```python
            try:
                subtree_control.write_text(f"+{ctrl}")
            except PermissionError:
                pass
```
**Copilot AI** · Dec 4, 2025
The `except` clause does nothing but `pass`, and there is no explanatory comment.

```diff
-                pass
+                logger.warning(f"Permission denied enabling controller '{ctrl}' in {subtree_control}")
```
```python
        except Exception:
            pass
```
**Copilot AI** · Dec 4, 2025
The `except` clause does nothing but `pass`, and there is no explanatory comment.

```diff
-        except Exception:
-            pass
+        except Exception as e:
+            logger.error(f"Failed to read PIDs from {procs_file}: {e}")
```
```python
        except Exception:
            pass
```
**Copilot AI** · Dec 4, 2025
The `except` clause does nothing but `pass`, and there is no explanatory comment.

```diff
-        except Exception:
-            pass
+        except Exception as e:
+            logger.error(f"Failed to get OOM score for PID {pid}: {e}")
```



## Summary

Completes the implementation of accelerator-aware resource limits for AI workloads.

Closes #222

## Files

- `cortex/kernel_features/accelerator_limits.py` (718 lines) - Complete implementation
- `tests/test_accelerator_limits.py` (394 lines) - 32 unit tests
- `README_ACCELERATOR_LIMITS.md` - Documentation

## Test Plan

```shell
pytest tests/test_accelerator_limits.py -v  # All 32 tests pass
```