feat: add system resource monitoring with TUI and install integration #664
base: main
Conversation
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
CLA Verification Failed
The following contributors have not signed the Contributor License Agreement:
How to Sign
This check runs automatically.
Warning: Rate limit exceeded

⌛ How to resolve this issue? After the wait time has elapsed, a review can be triggered with the usual review command. We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.

Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Adds a real-time monitoring subsystem (sampler, UI, analyzer, exporter, storage), integrates optional install-time monitoring into the coordinator and CLI, promotes psutil to a core dependency, and includes comprehensive unit tests for the monitoring features.
Sequence Diagram(s)

sequenceDiagram
participant User
participant CLI as CortexCLI
participant Sampler as ResourceSampler
participant UI as MonitorUI
participant Exporter as export_samples
User->>CLI: cortex monitor --duration 60
CLI->>Sampler: create & start()
activate Sampler
Sampler->>Sampler: _sample_loop (background)
CLI->>UI: create MonitorUI(sampler)
activate UI
UI->>Sampler: get_latest_sample() (periodic)
Sampler-->>UI: ResourceSample
UI->>UI: render live bars & metrics
Note over Sampler,UI: Duration elapsed or user interrupts
UI->>Sampler: stop()
deactivate Sampler
Sampler-->>UI: PeakUsage
UI->>UI: show_summary(peak_usage)
UI-->>CLI: PeakUsage (exit code 0)
opt --export provided
CLI->>Exporter: export_samples(samples, path)
Exporter->>Exporter: format JSON/CSV
end
CLI-->>User: exit code
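To make the first diagram concrete, here is a rough Python sketch of the standalone flow. Names like ResourceSampler, MonitorUI, export_samples, and PeakUsage come from the diagram; the exact signatures, the get_samples() accessor, and stop() returning the peak are assumptions.

```python
# Hypothetical sketch of the `cortex monitor` flow above; not the PR's actual code.
from cortex.monitor.exporter import export_samples  # names taken from the diagram
from cortex.monitor.monitor_ui import MonitorUI
from cortex.monitor.sampler import ResourceSampler


def run_monitor(duration: int = 60, export_path: str | None = None) -> int:
    sampler = ResourceSampler()
    sampler.start()                # spawns the background _sample_loop thread
    ui = MonitorUI(sampler)
    try:
        ui.run(duration=duration)  # periodically renders get_latest_sample()
    finally:
        peak = sampler.stop()      # assumed to return a PeakUsage object
    ui.show_summary(peak)
    if export_path:                # the `--export` branch from the diagram
        export_samples(sampler.get_samples(), export_path, peak)  # get_samples() is assumed
    return 0
```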
sequenceDiagram
participant User
participant CLI as CortexCLI
participant Coord as InstallationCoordinator
participant Sampler as ResourceSampler
participant Storage as MonitorStorage
User->>CLI: cortex install cuda --monitor
CLI->>Coord: execute(enable_monitoring=True)
activate Coord
Coord->>Sampler: create & start()
activate Sampler
Sampler->>Sampler: _sample_loop (background)
Coord->>Coord: run installation steps
Sampler->>Sampler: collect samples
Note over Sampler: Installation completes
Coord->>Sampler: stop()
deactivate Sampler
Sampler-->>Coord: PeakUsage
Coord->>Storage: finalize_session(peak_usage, metadata)
Coord->>Coord: populate InstallationResult with peak metrics
Coord-->>CLI: InstallationResult (with peak metrics)
deactivate Coord
CLI->>CLI: print peak usage summary
CLI-->>User: exit code + peak metrics
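And a similarly hedged sketch of the install-time path. The constructor arguments follow the proposed fix later in this review; the peak-metric field names on the result are assumptions.

```python
# Hypothetical sketch of `cortex install cuda --monitor`; result field names are assumed.
from cortex.coordinator import InstallationCoordinator

coordinator = InstallationCoordinator(
    commands=["apt-get install -y cuda-toolkit"],  # illustrative command only
    descriptions=["Step 1"],
    timeout=300,
    stop_on_error=True,
    enable_monitoring=True,  # starts/stops the ResourceSampler around the steps
)
result = coordinator.execute()  # InstallationResult, now carrying peak metrics
# Exact attribute names are assumptions; the PR only states the result includes peaks.
print(getattr(result, "peak_cpu_percent", None), getattr(result, "peak_ram_percent", None))
```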
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Summary of Changes

Hello @Vaibhav701161, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a robust system resource monitoring capability to Cortex, allowing users to gain deep insights into their system's performance. It provides both a real-time interactive TUI for immediate observation and an integrated monitoring option during software installations to capture peak resource demands. The collected data is stored for historical review, can be exported for external analysis, and is intelligently analyzed to offer actionable recommendations, significantly enhancing the diagnostic and optimization toolkit for Cortex users.
Code Review
This is a substantial and well-executed pull request that introduces a comprehensive system resource monitoring feature. The code is well-structured, with a clear separation of concerns into modules for sampling, UI, storage, exporting, and analysis. I appreciate the thoughtful design choices, such as the graceful degradation when optional dependencies like rich are unavailable, the thread-safe data collection in the sampler, and the fallback to a user-writable directory for the database. The addition of extensive unit tests is also a great strength. My review includes a few suggestions: adding a missing type hint, a couple of refactoring opportunities to reduce code duplication, and cleaning up some leftover development artifacts and a copy-paste error in the new tests. Overall, this is a high-quality contribution.
@Vaibhav701161 Thanks for contributing to Cortex Linux.
Please open a separate PR for CLA verification, following the pattern used in #401.
Also, kindly address all bot comments.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cortex/cli.py (1)
1463-1470: Fix `monitor` plumbing to avoid TypeError/NameError

`cli.install(..., monitor=...)` is called (Line 5611), but `install()` doesn’t accept `monitor`, and `monitor` is referenced later (Line 1694). This will raise `TypeError` (unexpected kwarg) or `NameError` at runtime. Add a `monitor` parameter and wire it through.

🛠️ Proposed fix
    -    def install(
    -        self,
    -        software: str,
    -        execute: bool = False,
    -        dry_run: bool = False,
    -        parallel: bool = False,
    -        json_output: bool = False,
    -    ):
    +    def install(
    +        self,
    +        software: str,
    +        execute: bool = False,
    +        dry_run: bool = False,
    +        parallel: bool = False,
    +        json_output: bool = False,
    +        monitor: bool = False,
    +    ) -> int:
    @@
             coordinator = InstallationCoordinator(
                 commands=commands,
                 descriptions=[f"Step {i + 1}" for i in range(len(commands))],
                 timeout=300,
                 stop_on_error=True,
                 progress_callback=progress_callback,
                 enable_monitoring=monitor,
             )

Also applies to: 1694-1706, 5611-5616
🤖 Fix all issues with AI agents
In `@cortex/coordinator.py`:
- Around line 245-255: The code logs "Resource monitoring started" regardless of
whether ResourceSampler actually began; modify the try block that creates
self._sampler (ResourceSampler, self._sampler.start()) to check the sampler's
running state (e.g., call self._sampler.is_running or equivalent) after start(),
only log "Resource monitoring started" when it reports running, and if it is not
running set self._sampler = None to avoid misleading logs and zero peaks; ensure
ImportError handling remains unchanged.
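A minimal sketch of that guard, assuming the sampler exposes an `is_running` flag as the comment speculates (attribute and logger names may differ in the actual module):

```python
# Sketch only: mirrors the structure described above, not the PR's exact code.
import logging
from typing import Any

logger = logging.getLogger(__name__)


def start_monitoring(coordinator: Any, enable_monitoring: bool) -> None:
    if not enable_monitoring:
        return
    try:
        from cortex.monitor.sampler import ResourceSampler
    except ImportError:
        coordinator._sampler = None  # psutil / monitor module unavailable
        return
    coordinator._sampler = ResourceSampler()
    coordinator._sampler.start()
    if getattr(coordinator._sampler, "is_running", False):
        logger.info("Resource monitoring started")
    else:
        # Sampler never actually started: drop it so we don't report zeroed peaks.
        coordinator._sampler = None
```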
- Around line 67-77: The Coordinator class sets monitoring attributes without
type hints and the from_plan factory doesn't accept/forward enable_monitoring;
add explicit type annotations for self._sampler and self._peak_usage (e.g.,
import Optional and Any from typing and declare self._sampler: Optional[Any] =
None and self._peak_usage: Optional[Any] = None) and update the from_plan method
signature to include enable_monitoring: bool = False, then pass that parameter
through to the Coordinator constructor when instantiating (ensure the call to
Coordinator(...) includes enable_monitoring=enable_monitoring).
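A condensed sketch of both changes (most constructor parameters and the plan type are elided or assumed):

```python
# Sketch: only the monitoring-related pieces are shown; other parameters elided.
from typing import Any, Optional


class InstallationCoordinator:
    def __init__(self, enable_monitoring: bool = False) -> None:
        self.enable_monitoring = enable_monitoring
        self._sampler: Optional[Any] = None    # explicit annotation, starts unset
        self._peak_usage: Optional[Any] = None  # populated when monitoring stops

    @classmethod
    def from_plan(cls, plan: dict[str, Any], enable_monitoring: bool = False) -> "InstallationCoordinator":
        # Forward the flag so factory-constructed coordinators can monitor too.
        return cls(enable_monitoring=enable_monitoring)
```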
In `@cortex/monitor/exporter.py`:
- Around line 104-157: The _export_csv function accepts a metadata dict but
never writes it to the CSV; update _export_csv to iterate over the metadata
argument (when not None) and write each key/value as a comment line (e.g., "#
key: value") to the opened file before writing the header and samples, similar
to how the JSON exporter preserves metadata; locate the metadata parameter on
_export_csv and the with open(...) block to insert the metadata comment writes
so existing fields and writer.writeheader() remain unchanged.
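A sketch of how `_export_csv` could mirror the JSON exporter; the CSV column names here are an assumed subset of the sample fields:

```python
# Sketch only; the real exporter writes all ResourceSample fields.
import csv
from typing import Any


def _export_csv(samples: list[Any], path: str, metadata: dict[str, Any] | None = None) -> None:
    fieldnames = ["timestamp", "cpu_percent", "ram_percent", "disk_percent"]  # assumed subset
    with open(path, "w", newline="") as fh:
        # Preserve metadata as '#'-prefixed comment lines before the header.
        for key, value in (metadata or {}).items():
            fh.write(f"# {key}: {value}\n")
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        for sample in samples:
            writer.writerow({name: getattr(sample, name) for name in fieldnames})
```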
In `@cortex/monitor/monitor_ui.py`:
- Around line 11-17: The success message in export_samples() currently prints
the original export_path even though the code appends ".json" when no suffix is
provided; change export_samples() to resolve the final path up front (e.g.,
compute final_export_path from the given export_path by adding ".json" when
Path(export_path).suffix is empty), use final_export_path for the file
write/serialization and for the success print/log, and apply the same fix to the
similar export block around the other occurrence (the block referenced at lines
~331-339) so both use the resolved final path in I/O and messaging.
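The core of the fix is resolving the final path once and reusing it everywhere; a sketch:

```python
from pathlib import Path


def resolve_export_path(export_path: str) -> Path:
    """Append '.json' only when the user gave no suffix, so the write and the
    success message always refer to the same file."""
    final_path = Path(export_path)
    if not final_path.suffix:
        final_path = final_path.with_suffix(".json")
    return final_path
```

Both the export call and the success print would then take `resolve_export_path(export_path)` instead of the raw `export_path`.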
- Around line 221-257: The _create_bar method currently always uses RAM
thresholds and allows percent >100 to overflow the bar; update it to (1)
determine the correct threshold set based on the metric label (e.g., map label
or introduce/inspect a metric key like "cpu"/"ram"/"disk" to pick
self.thresholds.cpu_warning/critical, self.thresholds.disk_warning/critical,
etc.) instead of always using self.thresholds.ram_warning/critical, and (2)
clamp the incoming percent value to the 0–100 range before computing filled =
int((percent / 100) * BAR_WIDTH) so the bar never overruns; keep the rest of the
UI logic (append label, bar, percent, cores/suffix) unchanged but use the
clamped percent variable and the chosen warning/critical thresholds to set
color.
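A sketch of the two pieces, with threshold attribute names taken from the comment and `BAR_WIDTH` plus the color strings assumed:

```python
from typing import Any

BAR_WIDTH = 30  # assumed width constant


def pick_color(metric: str, percent: float, thresholds: Any) -> str:
    """Pick warning/critical limits for the given metric ('cpu', 'ram', or 'disk')
    instead of always applying the RAM thresholds."""
    warning, critical = {
        "cpu": (thresholds.cpu_warning, thresholds.cpu_critical),
        "ram": (thresholds.ram_warning, thresholds.ram_critical),
        "disk": (thresholds.disk_warning, thresholds.disk_critical),
    }[metric]
    if percent >= critical:
        return "red"
    if percent >= warning:
        return "yellow"
    return "green"


def filled_cells(percent: float) -> int:
    """Clamp to 0-100 first so a value above 100% can never overrun the bar."""
    clamped = max(0.0, min(100.0, percent))
    return int((clamped / 100) * BAR_WIDTH)
```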
- Around line 74-86: The run method currently only checks RICH_AVAILABLE before
starting the Live UI; add a TTY check and fall back when output is not a
terminal by returning self._run_fallback(duration). Concretely, import sys and
modify run(self, duration: int | None = None) to first check if not
RICH_AVAILABLE or not sys.stdout.isatty() and in that case call and return
self._run_fallback(duration); keep the rest of the Live-based logic unchanged so
Live is only used when Rich is available and stdout is a TTY.
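A tiny sketch of the gate; the `RICH_AVAILABLE` flag and `_run_fallback` already exist per the comment, while the helper name here is hypothetical:

```python
import sys


def should_use_live_ui(rich_available: bool) -> bool:
    """Use rich.Live only when Rich is importable AND stdout is an interactive TTY
    (i.e. not piped or redirected)."""
    return rich_available and sys.stdout.isatty()
```

Inside `MonitorUI.run()` the first line would then be roughly `if not should_use_live_ui(RICH_AVAILABLE): return self._run_fallback(duration)`.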
In `@cortex/monitor/sampler.py`:
- Around line 131-134: The constructor currently treats max_samples using
truthiness so max_samples=0 becomes "unlimited"; change the assignment for
self.max_samples to explicitly handle None, zero, and negative values: if
max_samples is None set DEFAULT_MAX_SAMPLES, if max_samples is an int and < 0
raise ValueError (guard against negatives), otherwise set self.max_samples =
int(max_samples) allowing 0 as a valid value; update the assignment that
currently references DEFAULT_MAX_SAMPLES/self.max_samples to use this new
explicit logic (refer to self.max_samples and DEFAULT_MAX_SAMPLES in
sampler.py).
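A sketch of the explicit handling (the default value shown is an assumption; the real one lives in `sampler.py` as `DEFAULT_MAX_SAMPLES`):

```python
DEFAULT_MAX_SAMPLES = 3600  # assumed value for illustration


def normalize_max_samples(max_samples: int | None) -> int:
    """None -> default; negatives rejected; 0 allowed (keep no samples)."""
    if max_samples is None:
        return DEFAULT_MAX_SAMPLES
    if max_samples < 0:
        raise ValueError("max_samples must be >= 0 or None")
    return int(max_samples)
```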
In `@cortex/monitor/storage.py`:
- Around line 46-61: The writability check in _get_db_path is incorrect: using
db_path.parent.stat() only checks existence/readability, not write permissions,
so replace that logic to verify actual write access to DEFAULT_DB_PATH's parent
(or fallback) before returning it; for example, use os.access(db_path.parent,
os.W_OK) or attempt creating a temporary file in db_path.parent and clean it up,
and if writable return str(db_path) else create USER_DB_PATH.parent and return
str(USER_DB_PATH); update references in _get_db_path and ensure downstream
callers like _ensure_tables continue to rely on the verified path.
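A sketch of a real writability check with `os.access`; both path constants are assumptions standing in for the module's `DEFAULT_DB_PATH` and `USER_DB_PATH`:

```python
import os
from pathlib import Path

DEFAULT_DB_PATH = Path("/var/lib/cortex/monitor.db")           # assumed system path
USER_DB_PATH = Path.home() / ".local/share/cortex/monitor.db"  # assumed user fallback


def get_db_path() -> str:
    """Prefer the system path only when its parent directory is actually writable."""
    parent = DEFAULT_DB_PATH.parent
    if parent.is_dir() and os.access(parent, os.W_OK):
        return str(DEFAULT_DB_PATH)
    USER_DB_PATH.parent.mkdir(parents=True, exist_ok=True)
    return str(USER_DB_PATH)
```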
In `@pyproject.toml`:
- Around line 58-60: Update the psutil dependency declaration to require
psutil>=6.1.0 so it supports Python 3.13; locate the psutil entry (currently
"psutil>=5.9.0") in the dependency list and change the version specifier to
"psutil>=6.1.0" ensuring no other constraints conflict with the project's
declared Python version or classifiers.
In `@tests/test_monitor.py`:
- Around line 658-719: The nested test functions (test_analyze_high_cpu_and_ram,
test_analyze_mixed_usage_cpu_only, test_analyze_borderline_thresholds,
test_analyze_single_sample) are indented inside test_analyzer_performance_score
so pytest won't collect them; move each of these functions out to top-level
(unindent them to the same indentation as test_analyzer_performance_score), keep
their bodies and imports (or remove duplicate imports if preferred), and ensure
they accept the sample_data fixture as declared so they run independently.
🧹 Nitpick comments (5)
tests/test_monitor.py (2)
24-115: Add type hints to fixtures/helpers to align with repo standards

The new fixtures and helper functions are untyped. If the repo standard is “type hints required in Python code,” consider annotating fixture return types (e.g., `Iterator[str]`, `list[ResourceSample]`, `MagicMock`). As per coding guidelines, please add type hints.

💡 Example adjustments
    -from unittest.mock import MagicMock, patch
    +from collections.abc import Iterator
    +from unittest.mock import MagicMock, patch
    @@
    -@pytest.fixture
    -def mock_psutil():
    +@pytest.fixture
    +def mock_psutil() -> Iterator[MagicMock]:
    @@
    -@pytest.fixture
    -def sample_data():
    +@pytest.fixture
    +def sample_data() -> list["ResourceSample"]:
    @@
    -@pytest.fixture
    -def temp_db():
    +@pytest.fixture
    +def temp_db() -> Iterator[str]:
    @@
    -@pytest.fixture
    -def temp_export_dir():
    +@pytest.fixture
    +def temp_export_dir() -> Iterator[str]:
875-1019: Remove the duplicated storage test block
`TestStorage` and the related storage tests are duplicated in the same module. The second definition overwrites the first and makes maintenance noisy. Recommend keeping a single copy.

🧹 Suggested cleanup
    -# =============================================================================
    -# Storage Tests
    -# =============================================================================
    -
    -
    -class TestStorage:
    -    """Tests for the storage module."""
    -    ...
    -
    -def test_storage_list_sessions_limit(temp_db):
    -    ...
    -
    -def test_storage_get_nonexistent_session(temp_db):
    -    ...
    -
    -def test_storage_save_samples_invalid_session(temp_db, sample_data):
    -    ...

cortex/monitor/storage.py (3)
11-17: Move `timedelta` import to the top of the file.

`timedelta` is imported inside `cleanup_old_sessions` (line 423) but should be imported at module level alongside `datetime` for consistency and PEP 8 compliance.

Suggested fix
    import json
    import logging
    import sqlite3
    import uuid
    -from datetime import datetime
    +from datetime import datetime, timedelta
    from pathlib import Path
    from typing import Any

And remove line 423:
    - from datetime import timedelta

Also applies to: 423-423
108-112: Consider adding an index on `start_time` for query performance.

`list_sessions` orders by `start_time DESC` and `cleanup_old_sessions` filters on `start_time < ?`. Without an index, these queries perform full table scans as sessions accumulate.

Add index on start_time
         # Create index for faster queries
         cursor.execute("""
             CREATE INDEX IF NOT EXISTS idx_metrics_session
             ON resource_metrics(session_id)
         """)
    +    cursor.execute("""
    +        CREATE INDEX IF NOT EXISTS idx_sessions_start_time
    +        ON monitor_sessions(start_time)
    +    """)
    +
         conn.commit()
179-212: Use `executemany()` for batch insert performance.

Inserting samples one-by-one in a loop is inefficient for large batches. For long monitoring sessions, this could accumulate thousands of samples. `executemany()` provides significantly better performance.

Refactor to use executemany
     try:
         with sqlite3.connect(self.db_path) as conn:
             cursor = conn.cursor()
    -        for sample in samples:
    -            cursor.execute(
    -                """
    -                INSERT INTO resource_metrics
    -                (session_id, timestamp, cpu_percent, cpu_count,
    -                 ram_used_gb, ram_total_gb, ram_percent,
    -                 disk_used_gb, disk_total_gb, disk_percent,
    -                 disk_read_rate, disk_write_rate,
    -                 net_recv_rate, net_sent_rate)
    -                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
    -                """,
    -                (
    -                    session_id,
    -                    sample.timestamp,
    -                    sample.cpu_percent,
    -                    sample.cpu_count,
    -                    sample.ram_used_gb,
    -                    sample.ram_total_gb,
    -                    sample.ram_percent,
    -                    sample.disk_used_gb,
    -                    sample.disk_total_gb,
    -                    sample.disk_percent,
    -                    sample.disk_read_rate,
    -                    sample.disk_write_rate,
    -                    sample.net_recv_rate,
    -                    sample.net_sent_rate,
    -                ),
    -            )
    +        cursor.executemany(
    +            """
    +            INSERT INTO resource_metrics
    +            (session_id, timestamp, cpu_percent, cpu_count,
    +             ram_used_gb, ram_total_gb, ram_percent,
    +             disk_used_gb, disk_total_gb, disk_percent,
    +             disk_read_rate, disk_write_rate,
    +             net_recv_rate, net_sent_rate)
    +            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
    +            """,
    +            [
    +                (
    +                    session_id,
    +                    s.timestamp,
    +                    s.cpu_percent,
    +                    s.cpu_count,
    +                    s.ram_used_gb,
    +                    s.ram_total_gb,
    +                    s.ram_percent,
    +                    s.disk_used_gb,
    +                    s.disk_total_gb,
    +                    s.disk_percent,
    +                    s.disk_read_rate,
    +                    s.disk_write_rate,
    +                    s.net_recv_rate,
    +                    s.net_sent_rate,
    +                )
    +                for s in samples
    +            ],
    +        )
             conn.commit()
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cortex/cli.py (1)
1689-1707: `--monitor` is ignored for parallel installs.

Monitoring is only wired into the sequential `InstallationCoordinator` path. If a user runs `--parallel --monitor`, monitoring silently does nothing. Please either support monitoring in the parallel path or block this combination explicitly.

🛠️ Suggested guard (short-term fix)
         print(f"\n{t('install.executing')}")

         if parallel:
    +        if monitor:
    +            self._print_error("Monitoring is not supported with --parallel yet.")
    +            return 1
             import asyncio
♻️ Duplicate comments (2)
cortex/monitor/sampler.py (1)
113-138: Reject negative max_samples instead of clamping.
Line 138 uses `max(0, max_samples)`, so `-1` silently disables storage; raising a `ValueError` makes misconfigurations explicit and aligns with the docstring.

🛠️ Proposed fix
    -        else:
    -            self.max_samples = max(0, max_samples)
    +        else:
    +            if max_samples < 0:
    +                raise ValueError("max_samples must be >= 0 or None")
    +            self.max_samples = max_samples

cortex/monitor/monitor_ui.py (1)
352-364: Use the resolved export path for the write.
Line 362 still passes `export_path` to `export_samples` even when `actual_path` differs, so the exported filename can diverge from the success message. Pass the resolved path to keep I/O and messaging aligned.

🛠️ Proposed fix
    - export_samples(samples, export_path, peak)
    + export_samples(samples, actual_path, peak)
🧹 Nitpick comments (2)
tests/test_monitor.py (2)
332-358: Reduce reliance on real sleeps to avoid flaky tests.

These tests depend on `time.sleep`, which can be unreliable under load. Consider polling for `get_sample_count()` with a bounded timeout or mocking the sampling loop for determinism.
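One way to do that is a small bounded-polling helper instead of fixed sleeps (sketch):

```python
import time
from collections.abc import Callable


def wait_for(condition: Callable[[], bool], timeout: float = 5.0, interval: float = 0.05) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()


# In a test (sketch): only fails if no sample arrives within the timeout.
# assert wait_for(lambda: sampler.get_sample_count() > 0)
```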
883-897: Replace the placeholder assertion with a real check.
`assert True` doesn’t validate anything. Consider asserting the subcommand is present in parser help or checking that `cli.monitor` is callable from routing.
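For instance, assuming the CLI exposes an argparse-based parser factory (the `build_parser` name here is hypothetical), the placeholder could become:

```python
def test_monitor_subcommand_registered() -> None:
    from cortex.cli import build_parser  # hypothetical factory name

    parser = build_parser()
    assert "monitor" in parser.format_help()
```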
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Related Issue
Closes #51
Summary
This PR introduces system resource monitoring to Cortex, providing visibility into CPU, RAM, Disk, and Network usage both standalone and during installations.
What’s included

- Standalone monitoring
  - `cortex monitor` launches a real-time TUI dashboard
- Install-time monitoring
  - `cortex install <package> --monitor`
- Persistent storage
- Advisory analysis
- Graceful fallbacks
  - Degrades gracefully when `psutil` or `rich` is unavailable
- Memory-safe design
Implementation Overview
- New `cortex/monitor/` module:
  - `sampler.py` – thread-safe resource sampling via `psutil`
  - `monitor_ui.py` – real-time TUI using `rich.Live`
  - `storage.py` – SQLite-backed persistence
  - `exporter.py` – JSON / CSV export
  - `analyzer.py` – advisory recommendations
- CLI integration:
  - `monitor` command
  - `--monitor` flag wired through `install` → `InstallationCoordinator`
- Backward-compatible extension of `InstallationResult` to include peak metrics

Implementation
Export functionality
Monitoring Installation
Stress Check
cortex.stress.mp4
Testing
Comprehensive unit tests for:
Manual verification:
Coverage target met (>80% for `cortex/monitor/`)
cortex.test.coverage.mp4
Non-Goals (Intentional)
AI Disclosure
Used AI assistance (Google Antigravity) for:
All code, design decisions, and final implementations were reviewed, validated, and iterated manually.
Checklist
- feat(monitor): add system resource monitoring
- pytest tests/

Summary by CodeRabbit
New Features
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.