Add Kernel-Level AI Features (Tier 1) #224

mikejmorgan-ai · 2025-12-02T10:56:05Z

Summary

This PR adds four kernel-level AI features implemented in user-space:

Feature	Description	Lines
Model Lifecycle Manager	systemd-based LLM services	~300
KV-Cache Manager	shared memory cache pools	~300
Accelerator Limits	cgroups v2 wrapper	~250
/dev/llm Device	FUSE LLM interface	~300

Why This Matters

These are Tier 1 features from our kernel enhancement strategy:

Run on current Ubuntu 24.04 (no kernel patches required)
Demonstrate kernel-level thinking
Support patent portfolio (especially KV-Cache)
Differentiate from 'just a wrapper' positioning

Usage

# Model management
cortex model register llama --path meta-llama/Llama-2-7b --backend vllm
cortex model start llama

# Cache management  
cortex cache create my-cache --size 16G

# Resource limits
cortex limits create inference-job --preset inference --gpus 2

# File-based LLM access
cortex-llm-device mount /mnt/llm
echo 'Hello' > /mnt/llm/claude/prompt
cat /mnt/llm/claude/response

Testing

pytest tests/kernel_features/ -v

Next Steps

Add comprehensive tests
Integration with main Cortex CLI
Documentation in main README
Create GitHub issues for bounties

Part of Kernel Enhancement Roadmap - Tier 1 (User-Space)

Summary by CodeRabbit

Release Notes

New Features
- Model Lifecycle Manager for registering and managing LLM services
- KV-Cache Manager for creating and managing inference cache pools
- Accelerator Limits Manager for configuring resource constraints and GPU allocation
- Virtual LLM Device interface for model interaction
Documentation
- Added comprehensive Kernel Features guide with CLI usage examples
Tests
- Added test coverage for new features

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Implements user-space versions of kernel-level concepts: - Model Lifecycle Manager: systemd-based LLM service management - KV-Cache Manager: shared memory cache pools for inference - Accelerator Limits: cgroups v2 wrapper for AI workloads - /dev/llm Device: FUSE-based file interface to LLMs These demonstrate kernel-level thinking while shipping on Ubuntu 24.04. Part of our patent portfolio for AI-native operating systems. Total: ~1,500 lines of production code

coderabbitai · 2025-12-02T10:56:14Z

Walkthrough

This PR introduces the cortex.kernel_features package, implementing four kernel-level feature managers: Model Lifecycle Manager for systemd-based LLM service management, KV-Cache Manager for shared memory cache pools with configurable policies, Accelerator Limits for cgroups v2-based resource constraints, and a FUSE-based /dev/llm virtual device interface. Tests and documentation are included.

Changes

Cohort / File(s)	Summary
Package initialization `cortex/kernel_features/__init__.py`	Exports six public entities (ModelLifecycleManager, ModelConfig, KVCacheManager, CacheConfig, AcceleratorLimitsManager, ResourceLimits) imported from three sub-modules; defines `__all__` for public API surface.
Model Lifecycle Manager `cortex/kernel_features/model_lifecycle.py`	Implements systemd user-service lifecycle management for LLM models. Includes ModelConfig dataclass, ModelDatabase for SQLite persistence (~/.cortex/models.db), ServiceGenerator for backend-specific systemd unit templates (vllm, llamacpp, ollama), ModelLifecycleManager orchestrator, and CLI entry point supporting register/start/stop/status/list commands.
KV-Cache Manager `cortex/kernel_features/kv_cache_manager.py`	Implements user-space KV-cache pool management using POSIX shared memory. Includes CachePolicy enum (LRU/LFU/FIFO), CacheConfig and CacheEntry dataclasses, CacheDatabase for metadata persistence, SharedMemoryPool for lifecycle management, KVCacheManager coordinator, and CLI supporting create/destroy/status commands with size parsing (K/M/G suffixes).
Accelerator Limits Manager `cortex/kernel_features/accelerator_limits.py`	Implements cgroups v2-based resource profile management for workloads. Defines WorkloadPreset enum with PRESETS dictionary (inference, training, batch, interactive), ResourceLimits dataclass, LimitsDatabase for SQLite persistence (~/.cortex/limits.db), AcceleratorLimitsManager wrapper, and CLI supporting create/env/status/list commands.
FUSE Virtual Device `cortex/kernel_features/llm_device.py`	Implements a FUSE-backed /dev/llm virtual file system exposing models and sessions as files/directories. Includes Session dataclass, MockLLM fallback, LLMDevice with path parsing and file-system operations (getattr, readdir, read, write, truncate, open, create), mount/unmount CLI, and optional Anthropic API integration.
Documentation `docs/KERNEL_FEATURES.md`	User-facing documentation describing the four kernel-feature components, architecture positioning, patent note, and command-line usage examples.
Tests `tests/kernel_features/test_kv_cache.py`, `tests/kernel_features/test_model_lifecycle.py`	Test modules validating CacheConfig defaults/construction and ModelConfig serialization round-trips (to_dict/from_dict).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

model_lifecycle.py: Requires understanding of systemd service generation, template injection, and user-level systemctl integration; review focus on service file correctness and error handling.
kv_cache_manager.py: Requires verification of POSIX shared memory lifecycle, database schema consistency, and size parsing logic.
llm_device.py: Requires familiarity with FUSE file-system semantics, path parsing correctness, and integration with optional Anthropic API.
accelerator_limits.py: Requires validation of cgroups v2 configuration and preset definitions.
Cross-cutting concerns: SQLite schema design and serialization (to_dict/from_dict) consistency across modules; CLI argument parsing patterns and error reporting.

Possibly related issues

Implements the systemd-based Model Lifecycle Manager with ModelConfig, ModelDatabase, and service generation as described in issue #220.
Adds the KV-Cache Manager (KVCacheManager, CacheConfig, SharedMemoryPool, CacheDatabase) and related tests as specified in issue #221.
Directly implements the systemd-based model lifecycle functionality outlined in issues #216 and #217.
Adds the KV-cache component with full POSIX shared memory and database persistence as detailed in issue #218.
Implements the FUSE-backed /dev/llm virtual device (LLMDevice, mount CLI) as described in issue #223.
Adds the accelerator_limits module with cgroups v2 presets and resource profile management as described in issues #219 and #222.

Poem

🐰 Four features bloom, a rabbit's delight!
Services orchestrate, cache pools take flight,
Limits are managed, resources aligned,
A virtual device and a filesystem combined!
Hop into Cortex, where kernels are tamed,
With FUSE and cgroups, the workloads are framed! 🚀

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Add Kernel-Level AI Features (Tier 1)' is clear, specific, and accurately describes the main addition of four kernel-level AI features implemented in user-space.
Description check	✅ Passed	The PR description comprehensively covers the Summary, provides context on Why This Matters, includes Usage examples, Testing instructions, and Next Steps—exceeding the template requirements.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feature/kernel-features

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

sonarqubecloud · 2025-12-02T10:56:39Z

Quality Gate passed

Issues
11 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

cortex/kernel_features/accelerator_limits.py

+cgroups v2 wrapper for AI workloads.
+"""
+
+import os


cortex/kernel_features/accelerator_limits.py

+import os
+import json
+import sqlite3
+import subprocess


cortex/kernel_features/kv_cache_manager.py

+User-space KV-cache management for LLM inference optimization.
+"""
+
+import os


cortex/kernel_features/kv_cache_manager.py

+import os
+import json
+import sqlite3
+import time


cortex/kernel_features/kv_cache_manager.py

+import json
+import sqlite3
+import time
+import hashlib


cortex/kernel_features/kv_cache_manager.py

+import hashlib
+from pathlib import Path
+from dataclasses import dataclass, asdict
+from typing import Optional, List, Dict


cortex/kernel_features/llm_device.py

+"""
+
+import os
+import sys


cortex/kernel_features/llm_device.py

+import stat
+import errno
+from dataclasses import dataclass, field
+from typing import Optional, Dict, List


cortex/kernel_features/model_lifecycle.py

+Manages LLM models as first-class system services using systemd.
+"""
+
+import os


cortex/kernel_features/model_lifecycle.py

+"""
+
+import os
+import sys


coderabbitai

Actionable comments posted: 7

🧹 Nitpick comments (8)

cortex/kernel_features/accelerator_limits.py (2)
33-49: Fix type hint for gpu_ids to match the default value.

The type hint declares List[int] but the default is None. This violates PEP 484 and triggers static analysis warnings.
 @dataclass
 class ResourceLimits:
     name: str
     preset: str = "inference"
     cpu_quota: float = 400.0
     memory_max: int = 32 * 1024**3
-    gpu_ids: List[int] = None
+    gpu_ids: Optional[List[int]] = None
     oom_score_adj: int = 0
97-124: Consider adding help output when no subcommand is provided.

When invoked without a subcommand, the CLI silently exits. Adding a fallback to print help would improve UX. This is consistent with other managers in this PR, so it's optional.
     elif args.cmd in ("status", "list"):
         mgr.status()
+    else:
+        parser.print_help()
 
 
 if __name__ == "__main__":
cortex/kernel_features/llm_device.py (1)
136-138: Consider validating mountpoint and using absolute path for fusermount.

The mountpoint argument is passed directly to subprocess.run(). While command injection is limited here, using the absolute path to fusermount and validating the mountpoint would improve security.
     elif args.cmd == "umount":
         import subprocess
-        subprocess.run(["fusermount", "-u", args.mountpoint])
+        from pathlib import Path
+        mp = Path(args.mountpoint).resolve()
+        subprocess.run(["/bin/fusermount", "-u", str(mp)], check=True)
cortex/kernel_features/kv_cache_manager.py (1)
118-125: Fix implicit Optional and inconsistent tuple handling in status().

The name parameter should be Optional[str]. Also, the tuple handling is inconsistent—get_pool returns (CacheConfig, str) but list_pools returns [CacheConfig], requiring awkward wrapping at line 119.
-    def status(self, name: str = None):
-        pools = [self.db.get_pool(name)] if name else [(p, "") for p in self.db.list_pools()]
+    def status(self, name: Optional[str] = None):
+        if name:
+            result = self.db.get_pool(name)
+            pools = [result[0]] if result else []
+        else:
+            pools = self.db.list_pools()
         print(f"\n{'POOL':<20} {'SIZE':<12} {'POLICY':<10}")
         print("-" * 50)
-        for item in pools:
-            if item:
-                cfg = item[0] if isinstance(item, tuple) else item
-                print(f"{cfg.name:<20} {cfg.size_bytes/1e9:.1f}G{'':<6} {cfg.policy:<10}")
+        for cfg in pools:
+            print(f"{cfg.name:<20} {cfg.size_bytes/1e9:.1f}G{'':<6} {cfg.policy:<10}")
cortex/kernel_features/model_lifecycle.py (1)
134-142: Fix implicit Optional type hint.

The name parameter defaults to None but is typed as str.
-    def status(self, name: str = None):
+    def status(self, name: Optional[str] = None):
tests/kernel_features/test_model_lifecycle.py (1)
9-14: Incomplete round-trip validation.

The test only verifies name and backend after deserialization, but doesn't check other fields like model_path ("/model"), port (8080), or the remaining ModelConfig attributes. A thorough round-trip test should verify all fields are preserved.

Apply this diff to verify all key fields:
 def test_config_roundtrip():
     cfg = ModelConfig("test", "/model", "llamacpp", 8080)
     data = cfg.to_dict()
     restored = ModelConfig.from_dict(data)
     assert restored.name == cfg.name
+    assert restored.model_path == cfg.model_path
     assert restored.backend == cfg.backend
+    assert restored.port == cfg.port
tests/kernel_features/test_kv_cache.py (1)

4-7: Basic config validation is correct, but coverage is minimal.

The test correctly validates CacheConfig defaults. However, the test suite lacks coverage for KVCacheManager operations (create_pool, destroy_pool, status) and actual cache functionality. Consider adding tests for the manager's core operations when expanding test coverage.
cortex/kernel_features/__init__.py (1)
15-19: Static analysis flags unsorted __all__, but current grouping is logical.

Ruff reports that __all__ is not alphabetically sorted. However, the current organization groups related exports (ModelLifecycle, KVCache, AcceleratorLimits), which may be more maintainable than strict alphabetical order.

If you prefer to suppress the warning while keeping the logical grouping, add this comment:
 __all__ = [
+    # ruff: noqa: RUF022
     'ModelLifecycleManager', 'ModelConfig',
     'KVCacheManager', 'CacheConfig', 
     'AcceleratorLimitsManager', 'ResourceLimits',
 ]
Alternatively, sort alphabetically:
 __all__ = [
+    'AcceleratorLimitsManager',
+    'CacheConfig',
+    'KVCacheManager',
+    'ModelConfig',
     'ModelLifecycleManager',
-    'ModelConfig',
-    'KVCacheManager',
-    'CacheConfig', 
-    'AcceleratorLimitsManager',
     'ResourceLimits',
 ]

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ae97db8 and 5dc1038.

📒 Files selected for processing (8)

cortex/kernel_features/__init__.py (1 hunks)
cortex/kernel_features/accelerator_limits.py (1 hunks)
cortex/kernel_features/kv_cache_manager.py (1 hunks)
cortex/kernel_features/llm_device.py (1 hunks)
cortex/kernel_features/model_lifecycle.py (1 hunks)
docs/KERNEL_FEATURES.md (1 hunks)
tests/kernel_features/test_kv_cache.py (1 hunks)
tests/kernel_features/test_model_lifecycle.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (5)

cortex/kernel_features/__init__.py (3)

cortex/kernel_features/model_lifecycle.py (2)

ModelLifecycleManager (107-142)

ModelConfig (22-40)

cortex/kernel_features/kv_cache_manager.py (2)

KVCacheManager (97-125)

CacheConfig (28-32)

cortex/kernel_features/accelerator_limits.py (2)

AcceleratorLimitsManager (73-94)

ResourceLimits (34-49)

tests/kernel_features/test_kv_cache.py (1)

cortex/kernel_features/kv_cache_manager.py (2)

CacheConfig (28-32)

KVCacheManager (97-125)

tests/kernel_features/test_model_lifecycle.py (1)

cortex/kernel_features/model_lifecycle.py (4)

ModelConfig (22-40)

ModelLifecycleManager (107-142)

to_dict (35-36)

from_dict (39-40)

cortex/kernel_features/accelerator_limits.py (3)

cortex/kernel_features/llm_device.py (2)

create (110-110)

main (124-138)

cortex/kernel_features/kv_cache_manager.py (2)

status (118-125)

main (128-153)

cortex/kernel_features/model_lifecycle.py (2)

status (134-142)

main (145-175)

cortex/kernel_features/kv_cache_manager.py (3)

cortex/kernel_features/accelerator_limits.py (4)

create (77-80)

status (88-94)

main (97-120)

get (63-66)

cortex/kernel_features/llm_device.py (3)

create (110-110)

main (124-138)

getattr (66-73)

cortex/kernel_features/model_lifecycle.py (2)

status (134-142)

main (145-175)

🪛 GitHub Check: SonarCloud Code Analysis

cortex/kernel_features/model_lifecycle.py

[failure] 62-62: Don't use datetime.datetime.utcnow to create this datetime object.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQQLi-4iu7DOY4jo&open=AZretQQLi-4iu7DOY4jo&pullRequest=224

cortex/kernel_features/llm_device.py

[warning] 45-45: Remove the unused function parameter "max_tokens".

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQMui-4iu7DOY4jg&open=AZretQMui-4iu7DOY4jg&pullRequest=224

[warning] 45-45: Remove the unused function parameter "temp".

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQMui-4iu7DOY4ji&open=AZretQMui-4iu7DOY4ji&pullRequest=224

[warning] 67-67: Replace the unused local variable "m" with "_".

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQMui-4iu7DOY4jj&open=AZretQMui-4iu7DOY4jj&pullRequest=224

[warning] 45-45: Remove the unused function parameter "system".

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQMui-4iu7DOY4jf&open=AZretQMui-4iu7DOY4jf&pullRequest=224

[warning] 85-85: Replace the unused local variable "m" with "_".

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQMui-4iu7DOY4jl&open=AZretQMui-4iu7DOY4jl&pullRequest=224

[warning] 45-45: Remove the unused function parameter "model".

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQMui-4iu7DOY4jh&open=AZretQMui-4iu7DOY4jh&pullRequest=224

[warning] 76-76: Replace the unused local variable "f" with "_".

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQMui-4iu7DOY4jk&open=AZretQMui-4iu7DOY4jk&pullRequest=224

cortex/kernel_features/accelerator_limits.py

[warning] 39-39: Replace the type hint "list[int]" with "Optional[list[int]]" or don't assign "None" to "gpu_ids"

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQQai-4iu7DOY4jp&open=AZretQQai-4iu7DOY4jp&pullRequest=224

cortex/kernel_features/kv_cache_manager.py

[failure] 93-93: Specify an exception class to catch or reraise the exception

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQP9i-4iu7DOY4jn&open=AZretQP9i-4iu7DOY4jn&pullRequest=224

[failure] 80-80: Specify an exception class to catch or reraise the exception

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQP9i-4iu7DOY4jm&open=AZretQP9i-4iu7DOY4jm&pullRequest=224

🪛 Ruff (0.14.7)

cortex/kernel_features/__init__.py

15-19: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

cortex/kernel_features/model_lifecycle.py

1-1: Shebang is present but file is not executable

(EXE001)

81-85: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

113-113: subprocess call: check for execution of untrusted input

(S603)

113-113: Consider ["systemctl", "--user", *list(args)] instead of concatenation

Replace with ["systemctl", "--user", *list(args)]

(RUF005)

134-134: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

cortex/kernel_features/llm_device.py

1-1: Shebang is present but file is not executable

(EXE001)

24-24: Multiple statements on one line (colon)

(E701)

45-45: Unused method argument: model

(ARG002)

45-45: Unused method argument: max_tokens

(ARG002)

45-45: Unused method argument: temp

(ARG002)

45-45: Unused method argument: system

(ARG002)

50-50: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

60-60: Multiple statements on one line (colon)

(E701)

61-61: Multiple statements on one line (colon)

(E701)

62-62: Multiple statements on one line (colon)

(E701)

63-63: Multiple statements on one line (colon)

(E701)

66-66: Unused method argument: fh

(ARG002)

67-67: Unpacked variable m is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

75-75: Unused method argument: fh

(ARG002)

76-76: Unpacked variable f is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

78-78: Multiple statements on one line (colon)

(E701)

79-79: Multiple statements on one line (colon)

(E701)

79-79: Consider [*base, 'prompt', 'response', 'config'] instead of concatenation

Replace with [*base, 'prompt', 'response', 'config']

(RUF005)

80-80: Multiple statements on one line (colon)

(E701)

81-81: Multiple statements on one line (colon)

(E701)

81-81: Consider [*base, 'prompt', 'response', 'history'] instead of concatenation

Replace with [*base, 'prompt', 'response', 'history']

(RUF005)

84-84: Unused method argument: fh

(ARG002)

85-85: Unpacked variable m is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

87-87: Multiple statements on one line (colon)

(E701)

88-88: Multiple statements on one line (colon)

(E701)

91-91: Unused method argument: offset

(ARG002)

91-91: Unused method argument: fh

(ARG002)

101-101: Do not catch blind exception: Exception

(BLE001)

108-108: Unused method argument: path

(ARG002)

108-108: Unused method argument: length

(ARG002)

108-108: Unused method argument: fh

(ARG002)

109-109: Unused method argument: path

(ARG002)

109-109: Unused method argument: flags

(ARG002)

110-110: Unused method argument: path

(ARG002)

110-110: Unused method argument: mode

(ARG002)

110-110: Unused method argument: fi

(ARG002)

138-138: subprocess call: check for execution of untrusted input

(S603)

138-138: Starting a process with a partial executable path

(S607)

cortex/kernel_features/accelerator_limits.py

1-1: Shebang is present but file is not executable

(EXE001)

cortex/kernel_features/kv_cache_manager.py

1-1: Shebang is present but file is not executable

(EXE001)

80-80: Do not use bare except

(E722)

80-81: try-except-pass detected, consider logging the exception

(S110)

93-93: Do not use bare except

(E722)

93-94: try-except-pass detected, consider logging the exception

(S110)

118-118: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

🔇 Additional comments (6)

docs/KERNEL_FEATURES.md (1)

1-48: Documentation looks good and aligns with implementations.

The documentation provides clear usage examples for all four kernel features. The CLI commands shown match the implementations in the corresponding Python modules.

cortex/kernel_features/accelerator_limits.py (1)

52-95: Database and manager implementation looks good.

The LimitsDatabase and AcceleratorLimitsManager classes follow consistent patterns with other kernel feature managers in this PR. The SQLite persistence and status output formatting are well-implemented.

cortex/kernel_features/llm_device.py (1)

66-89: FUSE operations implementation looks correct.

The getattr, readdir, and read methods properly implement the FUSE interface. The unused parameter warnings from static analysis are expected for FUSE callback signatures.

cortex/kernel_features/kv_cache_manager.py (1)

45-68: Database layer implementation looks solid.

The CacheDatabase class properly initializes tables and provides consistent CRUD operations with proper SQLite connection handling.

cortex/kernel_features/model_lifecycle.py (1)

21-41: ModelConfig dataclass is well-structured.

Good use of dataclass with sensible defaults and proper serialization methods.

tests/kernel_features/test_model_lifecycle.py (1)

4-7: LGTM: Basic defaults validated correctly.

The test appropriately verifies the default backend and port values match the ModelConfig specification.

coderabbitai · 2025-12-02T11:00:45Z

cortex/kernel_features/kv_cache_manager.py

+class SharedMemoryPool:
+    def __init__(self, name: str, size: int, create: bool = True):
+        self.name = f"{SHM_PREFIX}{name}"
+        self.size = size
+        if create:
+            try:
+                old = shared_memory.SharedMemory(name=self.name)
+                old.close()
+                old.unlink()
+            except:
+                pass
+            self.shm = shared_memory.SharedMemory(name=self.name, create=True, size=size + 8192)
+        else:
+            self.shm = shared_memory.SharedMemory(name=self.name)
+
+    def get_usage(self):
+        return self.size, 0, 0  # Simplified
+
+    def destroy(self):
+        self.shm.close()
+        try:
+            self.shm.unlink()
+        except:
+            pass


⚠️ Potential issue | 🟡 Minor

Replace bare except clauses with specific exception types.

Bare except: catches all exceptions including KeyboardInterrupt and SystemExit, which can mask critical errors. The static analysis tools flag this as a failure.

class SharedMemoryPool: def __init__(self, name: str, size: int, create: bool = True): self.name = f"{SHM_PREFIX}{name}" self.size = size if create: try: old = shared_memory.SharedMemory(name=self.name) old.close() old.unlink() - except: + except FileNotFoundError: pass self.shm = shared_memory.SharedMemory(name=self.name, create=True, size=size + 8192) else: self.shm = shared_memory.SharedMemory(name=self.name) def get_usage(self): return self.size, 0, 0 # Simplified def destroy(self): self.shm.close() try: self.shm.unlink() - except: + except FileNotFoundError: pass

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

class SharedMemoryPool:

def __init__(self, name: str, size: int, create: bool = True):

self.name = f"{SHM_PREFIX}{name}"

self.size = size

if create:

try:

old = shared_memory.SharedMemory(name=self.name)

old.close()

old.unlink()

except:

pass

self.shm = shared_memory.SharedMemory(name=self.name, create=True, size=size + 8192)

else:

self.shm = shared_memory.SharedMemory(name=self.name)

def get_usage(self):

return self.size, 0, 0 # Simplified

def destroy(self):

self.shm.close()

try:

self.shm.unlink()

except:

pass

class SharedMemoryPool:

def __init__(self, name: str, size: int, create: bool = True):

self.name = f"{SHM_PREFIX}{name}"

self.size = size

if create:

try:

old = shared_memory.SharedMemory(name=self.name)

old.close()

old.unlink()

except FileNotFoundError:

pass

self.shm = shared_memory.SharedMemory(name=self.name, create=True, size=size + 8192)

else:

self.shm = shared_memory.SharedMemory(name=self.name)

def get_usage(self):

return self.size, 0, 0 # Simplified

def destroy(self):

self.shm.close()

try:

self.shm.unlink()

except FileNotFoundError:

pass

🧰 Tools

🪛 GitHub Check: SonarCloud Code Analysis

[failure] 93-93: Specify an exception class to catch or reraise the exception

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQP9i-4iu7DOY4jn&open=AZretQP9i-4iu7DOY4jn&pullRequest=224

[failure] 80-80: Specify an exception class to catch or reraise the exception

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQP9i-4iu7DOY4jm&open=AZretQP9i-4iu7DOY4jm&pullRequest=224

🪛 Ruff (0.14.7)

80-80: Do not use bare except

(E722)

80-81: try-except-pass detected, consider logging the exception

(S110)

93-93: Do not use bare except

(E722)

93-94: try-except-pass detected, consider logging the exception

(S110)

🤖 Prompt for AI Agents

In cortex/kernel_features/kv_cache_manager.py around lines 71 to 94, replace the bare except blocks with specific exception types: when attempting to open/unlink an existing shared memory segment (the first try), catch FileNotFoundError and PermissionError (or OSError) only, and let other exceptions propagate; when unlinking in destroy(), catch FileNotFoundError (and optionally PermissionError/OSError) instead of a bare except. If you want to be explicit, catch (FileNotFoundError, PermissionError, OSError) and avoid swallowing KeyboardInterrupt/SystemExit by not using a bare except.

coderabbitai · 2025-12-02T11:00:45Z

cortex/kernel_features/kv_cache_manager.py

+    def destroy_pool(self, name: str) -> bool:
+        if name in self.pools:
+            self.pools[name].destroy()
+            del self.pools[name]
+        with sqlite3.connect(CORTEX_DB) as conn:
+            conn.execute("DELETE FROM pools WHERE name=?", (name,))
+        print(f"✅ Destroyed pool '{name}'")
+        return True


⚠️ Potential issue | 🟠 Major

Resource leak: shared memory not cleaned up if pool not in current instance.

If a pool was created by a different process instance, it won't be in self.pools. The method deletes the DB record but leaves the shared memory segment orphaned.

def destroy_pool(self, name: str) -> bool: if name in self.pools: self.pools[name].destroy() del self.pools[name] + else: + # Try to clean up SHM created by another process + try: + pool = SharedMemoryPool(name, 0, create=False) + pool.destroy() + except FileNotFoundError: + pass with sqlite3.connect(CORTEX_DB) as conn: conn.execute("DELETE FROM pools WHERE name=?", (name,)) print(f"✅ Destroyed pool '{name}'") return True

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2025-12-02T11:00:45Z

cortex/kernel_features/llm_device.py

+class MockLLM:
+    def complete(self, model, messages, max_tokens, temp, system=None):
+        return f"[Mock] Response to: {messages[-1]['content'][:50]}..."


⚠️ Potential issue | 🟠 Major

MockLLM.complete() interface doesn't match actual Anthropic client usage.

MockLLM.complete() is called with positional args (model, messages, etc.), but the real Anthropic client at line 98 uses self.llm.messages.create(). This interface mismatch will cause issues when testing or when the API key is unavailable.

class MockLLM: - def complete(self, model, messages, max_tokens, temp, system=None): - return f"[Mock] Response to: {messages[-1]['content'][:50]}..." + class messages: + @staticmethod + def create(model, messages, max_tokens, **kwargs): + class Response: + content = [type('obj', (object,), {'text': f"[Mock] Response to: {messages[-1]['content'][:50]}..."})()] + return Response()

Alternatively, unify the call pattern at lines 98-100 to use a consistent interface.

🧰 Tools

🪛 GitHub Check: SonarCloud Code Analysis

[warning] 45-45: Remove the unused function parameter "max_tokens".

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQMui-4iu7DOY4jg&open=AZretQMui-4iu7DOY4jg&pullRequest=224

[warning] 45-45: Remove the unused function parameter "temp".

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQMui-4iu7DOY4ji&open=AZretQMui-4iu7DOY4ji&pullRequest=224

[warning] 45-45: Remove the unused function parameter "system".

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQMui-4iu7DOY4jf&open=AZretQMui-4iu7DOY4jf&pullRequest=224

[warning] 45-45: Remove the unused function parameter "model".

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQMui-4iu7DOY4jh&open=AZretQMui-4iu7DOY4jh&pullRequest=224

🪛 Ruff (0.14.7)

45-45: Unused method argument: model

(ARG002)

45-45: Unused method argument: max_tokens

(ARG002)

45-45: Unused method argument: temp

(ARG002)

45-45: Unused method argument: system

(ARG002)

🤖 Prompt for AI Agents

In cortex/kernel_features/llm_device.py around lines 44-46, MockLLM.complete(...) does not match how the real client is used at lines ~98-100 (self.llm.messages.create(...)); update the mock to mirror the real Anthropic client shape by adding a messages attribute with a create(...) method (accepting messages, model=None, max_tokens=None, temperature=None, system=None, **kwargs) that returns the same mock response format, or alternatively change the call sites at lines 98-100 to call MockLLM.complete(...) consistently; pick one approach and implement the matching signature so tests and runtime code use the same interface.

coderabbitai · 2025-12-02T11:00:45Z

cortex/kernel_features/llm_device.py

+    def write(self, path, data, offset, fh):
+        t, m, f = self._parse(path)
+        if t == 'model' and f == 'prompt':
+            s = self.sessions["default"]
+            s.prompt = data.decode().strip()
+            s.messages.append({"role": "user", "content": s.prompt})
+            try:
+                resp = self.llm.messages.create(model=self.MODELS.get(m, "claude-3-sonnet-20240229"),
+                    max_tokens=s.max_tokens, messages=s.messages) if HAS_API else self.llm.complete(m, s.messages, s.max_tokens, s.temp)
+                s.response = resp.content[0].text if HAS_API else resp
+            except Exception as e:
+                s.response = f"Error: {e}"
+            s.messages.append({"role": "assistant", "content": s.response})
+            self.requests += 1
+            return len(data)
+        raise FuseOSError(errno.EACCES)


⚠️ Potential issue | 🔴 Critical

Runtime error when using MockLLM fallback.

When HAS_API is False, line 99 calls self.llm.complete() which returns a string, but line 100 tries to access resp.content[0].text regardless of HAS_API. The conditional only affects how the response is created, not how it's processed.

The current logic:

resp = self.llm.messages.create(...) if HAS_API else self.llm.complete(...) s.response = resp.content[0].text if HAS_API else resp

But self.llm is MockLLM() when API is unavailable, and MockLLM doesn't have a messages attribute—it only has complete(). The if HAS_API check is on the wrong object method call.

try: - resp = self.llm.messages.create(model=self.MODELS.get(m, "claude-3-sonnet-20240229"), - max_tokens=s.max_tokens, messages=s.messages) if HAS_API else self.llm.complete(m, s.messages, s.max_tokens, s.temp) - s.response = resp.content[0].text if HAS_API else resp + if HAS_API and not isinstance(self.llm, MockLLM): + resp = self.llm.messages.create( + model=self.MODELS.get(m, "claude-3-sonnet-20240229"), + max_tokens=s.max_tokens, + messages=s.messages + ) + s.response = resp.content[0].text + else: + s.response = self.llm.complete(m, s.messages, s.max_tokens, s.temp)

🧰 Tools

🪛 Ruff (0.14.7)

91-91: Unused method argument: offset

(ARG002)

91-91: Unused method argument: fh

(ARG002)

101-101: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents

In cortex/kernel_features/llm_device.py around lines 91 to 106, the code calls self.llm.messages.create only when HAS_API is True but still tries to access resp.content[0].text unconditionally; when HAS_API is False and self.llm is a MockLLM (which only implements complete()), this raises an attribute error. Fix by branching the call and the response extraction together: if HAS_API call self.llm.messages.create(...) and set s.response = resp.content[0].text (or the appropriate API response field), else call self.llm.complete(...) and set s.response to the returned string; keep the existing try/except, append the assistant message, increment requests, and return len(data).

coderabbitai · 2025-12-02T11:00:45Z

cortex/kernel_features/model_lifecycle.py

+    def save_model(self, config: ModelConfig):
+        with sqlite3.connect(CORTEX_DB_PATH) as conn:
+            conn.execute(
+                "INSERT OR REPLACE INTO models VALUES (?, ?, ?)",
+                (config.name, json.dumps(config.to_dict()), datetime.utcnow().isoformat())
+            )


⚠️ Potential issue | 🟡 Minor

Replace deprecated datetime.utcnow() with timezone-aware alternative.

datetime.utcnow() is deprecated in Python 3.12+ and returns a naive datetime. Use datetime.now(timezone.utc) instead.

+from datetime import datetime, timezone -from datetime import datetime ... def save_model(self, config: ModelConfig): with sqlite3.connect(CORTEX_DB_PATH) as conn: conn.execute( "INSERT OR REPLACE INTO models VALUES (?, ?, ?)", - (config.name, json.dumps(config.to_dict()), datetime.utcnow().isoformat()) + (config.name, json.dumps(config.to_dict()), datetime.now(timezone.utc).isoformat()) )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def save_model(self, config: ModelConfig):

with sqlite3.connect(CORTEX_DB_PATH) as conn:

conn.execute(

"INSERT OR REPLACE INTO models VALUES (?, ?, ?)",

(config.name, json.dumps(config.to_dict()), datetime.utcnow().isoformat())

)

from datetime import datetime, timezone

def save_model(self, config: ModelConfig):

with sqlite3.connect(CORTEX_DB_PATH) as conn:

conn.execute(

"INSERT OR REPLACE INTO models VALUES (?, ?, ?)",

(config.name, json.dumps(config.to_dict()), datetime.now(timezone.utc).isoformat())

)

🧰 Tools

🪛 GitHub Check: SonarCloud Code Analysis

[failure] 62-62: Don't use datetime.datetime.utcnow to create this datetime object.

See more on https://sonarcloud.io/project/issues?id=cortexlinux_cortex&issues=AZretQQLi-4iu7DOY4jo&open=AZretQQLi-4iu7DOY4jo&pullRequest=224

🤖 Prompt for AI Agents

In cortex/kernel_features/model_lifecycle.py around lines 58 to 63, the save_model method uses the deprecated naive datetime.utcnow(); replace it with a timezone-aware timestamp by calling datetime.now(timezone.utc).isoformat() (import timezone from datetime if missing) so the inserted timestamp is aware and compliant with Python 3.12+.

coderabbitai · 2025-12-02T11:00:46Z

cortex/kernel_features/model_lifecycle.py

+class ServiceGenerator:
+    BACKENDS = {
+        "vllm": "python -m vllm.entrypoints.openai.api_server --model {model_path} --port {port}",
+        "llamacpp": "llama-server -m {model_path} --port {port}",
+        "ollama": "ollama serve",
+    }
+
+    def generate(self, config: ModelConfig) -> str:
+        cmd = self.BACKENDS.get(config.backend, self.BACKENDS["vllm"]).format(**asdict(config))
+        return f"""[Unit]
+Description=Cortex Model: {config.name}
+After=network.target
+
+[Service]
+Type=simple
+ExecStart={cmd}
+Environment=CUDA_VISIBLE_DEVICES={','.join(map(str, config.gpu_ids))}
+CPUQuota={int(config.cpu_limit * 100)}%
+MemoryMax={config.memory_limit}
+Restart={config.restart_policy}
+NoNewPrivileges=true
+
+[Install]
+WantedBy=default.target
+"""


⚠️ Potential issue | 🔴 Critical

Command injection risk in generated systemd service.

The model_path from user input is directly interpolated into the ExecStart command without sanitization. A malicious path like foo; rm -rf / would execute arbitrary commands when the service starts.

Consider validating model_path to ensure it's a valid file path without shell metacharacters:

+import shlex +import re + class ServiceGenerator: BACKENDS = { "vllm": "python -m vllm.entrypoints.openai.api_server --model {model_path} --port {port}", "llamacpp": "llama-server -m {model_path} --port {port}", "ollama": "ollama serve", } def generate(self, config: ModelConfig) -> str: + # Validate model_path doesn't contain shell metacharacters + if not re.match(r'^[\w\-./]+$', config.model_path): + raise ValueError(f"Invalid model_path: {config.model_path}") cmd = self.BACKENDS.get(config.backend, self.BACKENDS["vllm"]).format(**asdict(config))

Alternatively, use shlex.quote() on user-provided values, though systemd ExecStart has its own quoting rules.

🧰 Tools

🪛 Ruff (0.14.7)

81-85: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

🤖 Prompt for AI Agents

In cortex/kernel_features/model_lifecycle.py around lines 80-104 the generated systemd ExecStart interpolates config.model_path directly into a shell command allowing command-injection (e.g., "foo; rm -rf /"); fix by validating and sanitizing model_path before use: ensure it is an absolute or relative filesystem path that exists (or matches an allowed pattern), reject or normalize unexpected characters, and strip/escape any shell metacharacters; then build ExecStart using a safe, quoted form (e.g., apply shlex.quote or the appropriate systemd ExecStart escaping rules) or avoid a shell invocation entirely by constructing ExecStart as a command+argv list so untrusted values are never interpreted by a shell.

coderabbitai · 2025-12-02T11:00:46Z

cortex/kernel_features/model_lifecycle.py

+    for cmd in ["start", "stop", "unregister"]:
+        p = sub.add_parser(cmd)
+        p.add_argument("name")
+
+    sub.add_parser("status").add_argument("name", nargs="?")
+    sub.add_parser("list")
+
+    args = parser.parse_args()
+    mgr = ModelLifecycleManager()
+
+    if args.cmd == "register":
+        mgr.register(ModelConfig(args.name, args.path, args.backend, args.port, 
+                                 gpu_ids=[int(x) for x in args.gpus.split(",")]))
+    elif args.cmd == "start":
+        mgr.start(args.name)
+    elif args.cmd == "stop":
+        mgr.stop(args.name)
+    elif args.cmd in ("status", "list"):
+        mgr.status(getattr(args, 'name', None))


⚠️ Potential issue | 🟠 Major

Missing handler for unregister command.

The unregister subcommand is defined in the parser (line 157) but not handled in the if/elif chain, causing it to silently do nothing.

elif args.cmd == "stop": mgr.stop(args.name) + elif args.cmd == "unregister": + mgr.unregister(args.name) elif args.cmd in ("status", "list"): mgr.status(getattr(args, 'name', None))

You'll also need to implement the unregister method in ModelLifecycleManager:

def unregister(self, name: str) -> bool: service_path = CORTEX_SERVICE_DIR / f"cortex-{name}.service" self._systemctl("stop", f"cortex-{name}.service") self._systemctl("disable", f"cortex-{name}.service") service_path.unlink(missing_ok=True) self.db.delete_model(name) self._systemctl("daemon-reload") print(f"✅ Unregistered model '{name}'") return True

🤖 Prompt for AI Agents

In cortex/kernel_features/model_lifecycle.py around lines 157 to 175, the CLI defines an "unregister" subcommand but the if/elif chain never handles args.cmd == "unregister", so the command is a no-op; add an elif branch to call mgr.unregister(args.name) alongside the existing start/stop/status/list handlers, and implement ModelLifecycleManager.unregister(name: str) to stop and disable the systemd service, remove the service file, delete the model from the DB, reload systemd daemon, print a success message and return True (ensure service file removal uses a safe unlink and that systemctl stop/disable/daemon-reload calls use the existing _systemctl helper).

mikejmorgan-ai added enhancement New feature or request kernel-features Kernel-level AI enhancements labels Dec 2, 2025

github-code-quality bot found potential problems Dec 2, 2025

View reviewed changes

mikejmorgan-ai merged commit b77a004 into main Dec 2, 2025
7 of 12 checks passed

mikejmorgan-ai deleted the feature/kernel-features branch December 2, 2025 10:58

coderabbitai bot reviewed Dec 2, 2025

View reviewed changes

Sahilbhatane mentioned this pull request Dec 7, 2025

feat: Add LLM Device Abstraction Layer #240

Closed

coderabbitai bot mentioned this pull request Dec 23, 2025

[#273] Add Python 3.14 free-threading compatibility 2.0 #346

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add Kernel-Level AI Features (Tier 1) #224

Add Kernel-Level AI Features (Tier 1) #224

Uh oh!

mikejmorgan-ai commented Dec 2, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Dec 2, 2025 •

edited

Loading

Uh oh!

sonarqubecloud bot commented Dec 2, 2025

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Dec 2, 2025

Uh oh!

coderabbitai bot Dec 2, 2025

Uh oh!

coderabbitai bot Dec 2, 2025

Uh oh!

coderabbitai bot Dec 2, 2025

Uh oh!

coderabbitai bot Dec 2, 2025

Uh oh!

coderabbitai bot Dec 2, 2025

Uh oh!

coderabbitai bot Dec 2, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Add Kernel-Level AI Features (Tier 1) #224

Add Kernel-Level AI Features (Tier 1) #224

Uh oh!

Conversation

mikejmorgan-ai commented Dec 2, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why This Matters

Usage

Testing

Next Steps

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai bot commented Dec 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Poem

Pre-merge checks and finishing touches

Uh oh!

sonarqubecloud bot commented Dec 2, 2025

Quality Gate passed

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 2, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 2, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 2, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 2, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 2, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 2, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 2, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mikejmorgan-ai commented Dec 2, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 2, 2025 •

edited

Loading