feat: System health score and recommendations #604
Summary of Changes
Hello @mikejmorgan-ai, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a significant new feature to the
Highlights
Code Review
This pull request introduces a valuable system health check feature. The implementation is well-structured, using dataclasses and the rich library effectively for a clean user experience. The accompanying tests are comprehensive and demonstrate good use of mocking. My feedback primarily focuses on enhancing the robustness and maintainability of the health checks, particularly in parsing command outputs and handling platform-specific commands. Addressing these points will make the feature more reliable and easier to extend in the future.
if code == 0:
    lines = output.strip().split("\n")
    if len(lines) >= 2:
        parts = lines[1].split()
        for part in parts:
            if part.endswith("%"):
                try:
                    usage_percent = int(part.rstrip("%"))
                except ValueError:
                    pass
                break
Parsing the output of df -h by splitting on whitespace is fragile and can easily break on systems with different locales or output formatting (e.g., when long filesystem names cause line wrapping).
A more robust method is to use the --output=pcent flag, which provides a predictable, single-column output. This simplifies parsing and makes the check more reliable across different environments.
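A minimal sketch of the `--output=pcent` approach described above, assuming GNU coreutils `df` (the `--output` option is not available in BSD, macOS, or BusyBox `df`). The function names `parse_pcent` and `disk_usage_percent` are hypothetical and not part of this PR; returning `None` on failure is one possible way to let the caller distinguish "check failed" from a real reading.

```python
import subprocess
from typing import Optional


def parse_pcent(output: str) -> Optional[int]:
    """Parse the two-line output of `df --output=pcent <path>`.

    Expected shape (header line, then one value line):
        Use%
         42%
    Returns None when the value line is missing or not a number.
    """
    lines = output.strip().splitlines()
    if len(lines) < 2:
        return None
    try:
        return int(lines[1].strip().rstrip("%"))
    except ValueError:
        return None


def disk_usage_percent(path: str = "/") -> Optional[int]:
    """Run df for `path` and return the usage percentage, or None on failure."""
    try:
        result = subprocess.run(
            ["df", "--output=pcent", path],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return parse_pcent(result.stdout)
```

Keeping the parsing in a pure function makes it testable without mocking `subprocess`.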
if code == 0:
    lines = output.strip().split("\n")
    if len(lines) >= 2:
        try:
            # Get the second line, strip whitespace and the '%' suffix
            usage_percent = int(lines[1].strip().rstrip("%"))
        except (ValueError, IndexError):
            pass

code, output, _ = self._run_command(["free", "-m"])

usage_percent = 50  # Default
if code == 0:
    lines = output.strip().split("\n")
    for line in lines:
        if line.startswith("Mem:"):
            parts = line.split()
            if len(parts) >= 3:
                try:
                    total = int(parts[1])
                    used = int(parts[2])
                    if total > 0:
                        usage_percent = int((used / total) * 100)
                except ValueError:
                    pass
            break
Parsing the output of free -m by splitting on whitespace is fragile and can break with different versions or configurations of free. For greater robustness, consider checking for the availability of free --json (on supported versions) and using it to parse the memory information from a structured format. If JSON output is not available, the current parsing logic could be improved to be less dependent on exact spacing.
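As a hedged illustration of reading memory data from a more structured source, the sketch below parses `/proc/meminfo` instead of `free` output. This is a swapped-in technique, not the `free --json` route mentioned above (whose availability is not confirmed here), and it assumes a Linux kernel that reports `MemAvailable` (3.14 or later). The function names are hypothetical. Note that "total minus available" is not the same quantity as the `used` column of `free`, so scores would shift slightly.

```python
from typing import Dict, Optional


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Lines look like "MemTotal:       16316412 kB"; only the leading
    integer is kept (the unit, when present, is kB).
    """
    values: Dict[str, int] = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_usage_percent(text: str) -> Optional[int]:
    """Return used memory as a percentage, or None if fields are missing."""
    info = parse_meminfo(text)
    total = info.get("MemTotal", 0)
    available = info.get("MemAvailable")
    if total <= 0 or available is None:
        return None
    return int((total - available) / total * 100)
```

In the health check this could be fed with `Path("/proc/meminfo").read_text()`, falling back to the existing `free -m` parsing when the file is unavailable.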
code, output, _ = self._run_command(
    ["apt", "list", "--upgradable"],
    timeout=60,
)
The update check is hardcoded to use apt, which limits this feature to Debian-based systems. Since Cortex is positioned as a general Linux tool, this check should be more platform-agnostic. Consider detecting the system's package manager (e.g., yum, dnf, pacman) and executing the appropriate command. If that's out of scope, this limitation should be clearly documented.
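One possible shape for the package-manager detection suggested above, sketched with `shutil.which`. The table `UPDATE_COMMANDS` and the function `detect_update_command` are hypothetical names, and the probe order is an assumption; parsing each tool's output is left out.

```python
import shutil
from typing import Callable, List, Optional, Tuple

# Candidate package managers and the command each uses to list pending
# updates. Checked in order; the first one found on PATH wins.
UPDATE_COMMANDS: List[Tuple[str, List[str]]] = [
    ("apt", ["apt", "list", "--upgradable"]),
    ("dnf", ["dnf", "check-update"]),
    ("yum", ["yum", "check-update"]),
    ("pacman", ["pacman", "-Qu"]),
]


def detect_update_command(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[Tuple[str, List[str]]]:
    """Return (manager name, command) for the first manager found, else None.

    `which` is injectable so tests can simulate different systems.
    """
    for name, command in UPDATE_COMMANDS:
        if which(name):
            return name, command
    return None
```

Exit codes differ between tools (for example, `dnf check-update` exits with 100 when updates are available), so the caller could not treat every non-zero exit status as a failed check.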
health_parser.add_argument(
    "-v", "--verbose",
    action="store_true",
    help="Enable verbose output",
)
The health sub-command defines its own --verbose / -v flag, but a global one already exists for the main cortex command. This is redundant and could be confusing. The existing logic correctly uses the global verbose flag, so this local definition can be removed to avoid duplication and potential conflicts.
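To illustrate relying on the global flag only, here is a small standalone `argparse` sketch. The parser layout (a positional `action` with the four choices named elsewhere in this review) is an assumption about how the `cortex` CLI is wired, not a copy of it.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy CLI with one global --verbose flag and a health subcommand."""
    parser = argparse.ArgumentParser(prog="cortex")
    # Defined once, on the top-level parser only.
    parser.add_argument(
        "-v", "--verbose",
        action="store_true",
        help="Enable verbose output",
    )
    subparsers = parser.add_subparsers(dest="command")
    health = subparsers.add_parser("health", help="System health score")
    # No second -v/--verbose here; the subcommand reads args.verbose.
    health.add_argument(
        "action",
        nargs="?",
        default="check",
        choices=["check", "history", "factors", "quick"],
    )
    return parser
```

With this layout the flag must precede the subcommand (`cortex -v health`), because `argparse` hands the remaining arguments to the subparser, which does not know `-v`. If `cortex health -v` needs to keep working, that trade-off is worth weighing before removing the local flag.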
"""Check disk space usage."""
code, output, _ = self._run_command(["df", "-h", "/"])

usage_percent = 50  # Default
Defaulting usage_percent to 50 when the df command fails is misleading, as it results in a "FAIR" score of 50, masking the check's failure. It would be more transparent to default to a value that yields a score of 0 and provide a detail message indicating the failure. This ensures the user is aware that the check could not be completed.
- usage_percent = 50  # Default
+ usage_percent = 100  # Default to 100% on error for a score of 0
content = ssh_config.read_text()
if "PermitRootLogin yes" in content:
    issues.append("Root SSH login enabled")
    score -= 15
if "PasswordAuthentication yes" in content:
    issues.append("Password SSH enabled")
    score -= 10
Using simple string containment to check for settings like PermitRootLogin yes in /etc/ssh/sshd_config is not robust. This check can be easily fooled by commented-out lines, extra whitespace, or different casing. Using regular expressions to match the setting at the start of a non-commented line would be much more reliable.
Note: You will need to add import re at the top of the file for the suggestion to work.
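A minimal sketch of the regex check described above. The helper name `sshd_option_is_yes` is hypothetical. It assumes the simple `Keyword value` form separated by spaces or tabs and does not handle `Match` blocks, `Include` directives, the `Keyword=value` form, or quoted arguments, so it is an approximation of how `sshd` actually resolves its configuration.

```python
import re


def sshd_option_is_yes(content: str, option: str) -> bool:
    """Return True if `option` is set to `yes` on an uncommented line.

    - Anchored at line start, so "#PermitRootLogin yes" does not match.
    - Tolerates leading whitespace and any run of spaces/tabs between
      the keyword and its value.
    - Case-insensitive, since sshd_config keywords are case-insensitive.
    """
    pattern = re.compile(
        rf"^[ \t]*{re.escape(option)}[ \t]+yes[ \t]*$",
        re.IGNORECASE | re.MULTILINE,
    )
    return pattern.search(content) is not None
```

In the security check this would replace the two substring tests, for example `if sshd_option_is_yes(content, "PermitRootLogin"):`.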
content = ssh_config.read_text()
if "PermitRootLogin yes" in content:  # A more robust regex check is recommended here
    issues.append("Root SSH login enabled")
    score -= 15
if "PasswordAuthentication yes" in content:  # A more robust regex check is recommended here
    issues.append("Password SSH enabled")
    score -= 10

try:
    factor = check_func()
    report.factors.append(factor)
except Exception as e:
    if self.verbose:
        console.print(f"[yellow]Warning: {name} check failed: {e}[/yellow]")
progress.advance(task)
When a health check function fails with an exception, it is silently skipped unless in verbose mode. This can hide critical issues and give a misleadingly high health score. It would be better to always create a HealthFactor with a score of 0 and an error message in the details field. This ensures that failures are always visible to the user in the final report.
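One way to make failures always visible is to carry each check's category alongside its name, so a failed factor can be built without a placeholder. The sketch below is standalone: `HealthFactor` and `HealthCategory` are simplified stand-ins for the PR's classes (only the fields used in this thread are modeled), `HealthCategory.STORAGE` and `run_checks` are hypothetical, and the layout of the real `checks` list is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Tuple


class HealthCategory(Enum):
    STORAGE = "storage"          # hypothetical member, for illustration
    PERFORMANCE = "performance"


@dataclass
class HealthFactor:
    name: str
    category: HealthCategory
    score: int
    details: str = ""


# Each entry: (display name, category, check function).
Check = Tuple[str, HealthCategory, Callable[[], HealthFactor]]


def run_checks(checks: List[Check]) -> List[HealthFactor]:
    """Run every check; a check that raises becomes a score-0 factor."""
    factors: List[HealthFactor] = []
    for name, category, check_func in checks:
        try:
            factors.append(check_func())
        except Exception as exc:  # broad on purpose: one bad check must not abort the run
            factors.append(
                HealthFactor(
                    name=name,
                    category=category,
                    score=0,
                    details=f"Check failed to run: {exc}",
                )
            )
    return factors
```

Because the failed factor is recorded regardless of verbosity, the verbose-only warning can remain as extra diagnostics rather than being the sole signal.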
try:
    factor = check_func()
    report.factors.append(factor)
except Exception as e:
    if self.verbose:
        console.print(f"[yellow]Warning: {name} check failed: {e}[/yellow]")
    # Create a failed factor to make the error visible in the report
    # Note: This requires knowing the category for the check.
    # Consider refactoring the 'checks' list to include the category.
    report.factors.append(HealthFactor(
        name=name,
        category=HealthCategory.PERFORMANCE,  # This is a placeholder
        score=0,
        details=f"Check failed to run: {e}"
    ))

"""Run system health check.
Args:
    action: Action to perform (check, history, fix)
The docstring for the action parameter mentions fix as a possible value, but this action is not implemented in the function logic or defined in the argument parser's choices. The docstring should be updated to accurately reflect the available actions (check, history, factors, quick).
action: Action to perform (check, history, factors, quick)

elif action == "factors":
    console.print("[bold cyan]Health Factors:[/bold cyan]")
    factors = [
        ("Disk Space", "Monitors disk usage percentage", "1.0"),
        ("Memory", "Monitors RAM usage", "0.8"),
        ("System Updates", "Checks for available package updates", "1.2"),
        ("Security", "Checks firewall, SSH config, auto-updates", "1.5"),
        ("System Services", "Monitors failed systemd services", "1.0"),
        ("Performance", "Checks load average and swap usage", "1.0"),
    ]
    for name, desc, weight in factors:
        console.print(f" [cyan]{name}[/cyan] (weight: {weight})")
        console.print(f" {desc}")
    return 0
The factors action uses a hardcoded list of factor names, descriptions, and weights. This creates a maintenance issue, as any changes to the health checks (e.g., adding a new one, modifying a weight) will require this list to be updated manually. This data should be sourced from a single source of truth, such as a central configuration or by adding metadata to the check functions themselves.
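A hedged sketch of the single-source-of-truth idea: one registry of factor metadata that both the scoring code and the `factors` action could read. `FactorSpec`, `FACTOR_SPECS`, and `format_factors` are hypothetical names; the names, descriptions, and weights are copied from the hardcoded list above, and Rich markup is left out to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FactorSpec:
    name: str
    description: str
    weight: float


# The only place factor metadata is defined.
FACTOR_SPECS: List[FactorSpec] = [
    FactorSpec("Disk Space", "Monitors disk usage percentage", 1.0),
    FactorSpec("Memory", "Monitors RAM usage", 0.8),
    FactorSpec("System Updates", "Checks for available package updates", 1.2),
    FactorSpec("Security", "Checks firewall, SSH config, auto-updates", 1.5),
    FactorSpec("System Services", "Monitors failed systemd services", 1.0),
    FactorSpec("Performance", "Checks load average and swap usage", 1.0),
]


def format_factors(specs: Iterable[FactorSpec]) -> List[str]:
    """Render the listing for the `factors` action from the registry."""
    lines: List[str] = []
    for spec in specs:
        lines.append(f"  {spec.name} (weight: {spec.weight})")
        lines.append(f"    {spec.description}")
    return lines
```

Adding a new check or changing a weight then means touching `FACTOR_SPECS` only, and the CLI listing follows automatically.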
- Overall health score 0-100 with weighted factors
- Health factors: disk, memory, updates, security, services, performance
- Actionable recommendations with point values
- History tracking with trend analysis
- Quick check mode for scripting
- 36 tests with >80% coverage

Usage:
  cortex health          # Full health report
  cortex health quick    # Quick score check
  cortex health history  # View score history
  cortex health factors  # List all health factors

Health Factors:
- Disk Space (weight: 1.0)
- Memory Usage (weight: 0.8)
- System Updates (weight: 1.2)
- Security (weight: 1.5)
- System Services (weight: 1.0)
- Performance (weight: 1.0)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
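For readers wondering how weighted factors could combine into a 0-100 score, here is a worked sketch. It assumes the overall score is a weighted average of per-factor scores, which the commit message does not state explicitly; the function `overall_score` is hypothetical, and the weights are the ones listed in the commit message.

```python
from typing import Dict

# Weights as listed in the commit message.
WEIGHTS: Dict[str, float] = {
    "Disk Space": 1.0,
    "Memory Usage": 0.8,
    "System Updates": 1.2,
    "Security": 1.5,
    "System Services": 1.0,
    "Performance": 1.0,
}


def overall_score(scores: Dict[str, int], weights: Dict[str, float]) -> int:
    """Weighted average of factor scores (each 0-100), rounded to an int.

    Assumption: score = sum(score_i * weight_i) / sum(weight_i),
    taken over the factors present in `scores`.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        return 0
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return round(weighted_sum / total_weight)
```

Under this assumption, a system scoring 100 on every factor except a 0 on Security would land at 500 / 6.5, or about 77, which shows how the 1.5 weight makes Security the most influential single factor.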
5d62315 to 4a90d9f
Summary
Implements #128 - System Health Score and Recommendations.
Features
Commands
Example Output
Testing
Fixes #128