-
-
Notifications
You must be signed in to change notification settings - Fork 52
[feature] Instant smart suggestions - offline package stack recommendations #335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughAdds an offline suggestion system and interactive suggestion-driven install flow: new SuggestionDatabase, IntentMatcher (FuzzyMatcher), SuggestionTUI with prompt_toolkit fallback, and CLI integration (CortexCLI.suggest/demo). Simplifies install to sequential execution, removes stack/doctor commands, and adds install previews and per-step progress/error handling. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI as CortexCLI
participant DB as SuggestionDatabase
participant Matcher as IntentMatcher
participant TUI as SuggestionTUI
participant Shell as HostShell
User->>CLI: suggest(query)
CLI->>DB: load/open database
CLI->>Matcher: init with DB
CLI->>TUI: init with DB
TUI->>Matcher: match(query)
Matcher->>DB: detect intent/language, fetch candidate stacks
Matcher-->>TUI: scored results
TUI->>User: render interactive results
User->>TUI: select stack / request preview
TUI->>Matcher: get_install_preview(stack_id)
Matcher->>DB: fetch apt/pip/npm package lists
Matcher-->>TUI: install preview (commands)
TUI->>User: confirm install
User->>CLI: confirm
CLI->>Shell: execute commands sequentially (progress + per-step status)
Shell-->>User: installation results/status
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: defaults Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (1)**/*.py📄 CodeRabbit inference engine (AGENTS.md)
Files:
🧬 Code graph analysis (1)test/test_cli.py (2)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
🔇 Additional comments (2)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Nitpick comments (7)
cortex/tui.py (2)
291-299: Input parsing can raise an unhandled exception for empty input.If the user presses Enter without typing anything,
int("")raisesValueError, which is caught. However, the UX could be improved by treating empty input as cancellation.🔎 Minor improvement
try: choice = console.input("\n[bold]Enter choice (1-5, 0 to cancel):[/bold] ") + if not choice.strip(): + return None idx = int(choice) - 1
358-365: Avoid re-instantiating database and matcher in the main block.Creating
FuzzyMatcher(PackageDatabase())here loads the database a second time. Theresultdict already containsdatawith the full stack information from the prior search.🔎 Proposed fix
if result: console.print(f"\n[bold green]✅ Selected:[/bold green] {result['name']}") - preview = FuzzyMatcher(PackageDatabase()).get_install_preview(result["type"], result["id"]) + # Use the data already present in result + db = PackageDatabase() + matcher = FuzzyMatcher(db) + preview = matcher.get_install_preview(result["type"], result["id"])Or better, expose the preview data directly from
run_suggestionsor cache the database instance.cortex/cli.py (1)
716-721: Consider extracting known commands to a constant.The hardcoded list of commands could drift out of sync with the actual subparsers. Consider defining a module-level constant for maintainability.
🔎 Example
KNOWN_COMMANDS = frozenset({ 'demo', 'wizard', 'status', 'install', 'history', 'rollback', 'check-pref', 'edit-pref', 'notify', 'cache', '-h', '--help', '-V', '--version' })cortex/matcher.py (2)
145-175: Simple fuzzy matching may miss some typos.The character overlap approach works for cases like "mongodo" → "mongodb" but won't catch transposition errors well (e.g., "dokcer" → "docker" where character sets are identical). Consider using edit distance for more robust typo correction if needed.
207-216: Consolidate exact ID match checks.Two separate loops check for exact stack ID matches (per-token and query-as-id). These could be combined for clarity.
🔎 Suggested consolidation
# Strategy 1: Exact stack ID match query_as_id = query.lower().replace(" ", "-") + potential_ids = set(tokens) | {query_as_id} - for token in tokens: - stack = self.db.get_stack(token) - if stack: - candidates[token] = self.SCORE_EXACT_ID - - stack = self.db.get_stack(query_as_id) - if stack: - candidates[query_as_id] = self.SCORE_EXACT_ID + for potential_id in potential_ids: + if self.db.get_stack(potential_id): + candidates[potential_id] = self.SCORE_EXACT_IDcortex/database.py (2)
49-52: Specify encoding when opening JSON file.Opening files without explicit encoding uses platform default, which may cause issues on some systems. JSON files are typically UTF-8.
🔎 Proposed fix
def _load(self) -> None: """Load the database from JSON file.""" - with open(self.db_path, 'r') as f: + with open(self.db_path, 'r', encoding='utf-8') as f: self._data = json.load(f)
120-132: Consider using dict lookup for get_intent and get_language.These methods iterate through lists when dict-based lookup would be O(1). If these are called frequently during matching, this could impact performance.
🔎 Proposed optimization
Add indexes in
_build_indexes():self._intents_by_id: Dict[str, Dict] = {i["id"]: i for i in self._intents} self._languages_by_id: Dict[str, Dict] = {l["id"]: l for l in self._languages}Then update the methods:
def get_intent(self, intent_id: str) -> Optional[Dict]: """Get an intent by its ID.""" - for intent in self._intents: - if intent["id"] == intent_id: - return intent - return None + return self._intents_by_id.get(intent_id) def get_language(self, lang_id: str) -> Optional[Dict]: """Get a language by its ID.""" - for lang in self._languages: - if lang["id"] == lang_id: - return lang - return None + return self._languages_by_id.get(lang_id)
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
cortex/cli.pycortex/database.pycortex/matcher.pycortex/tui.pydata/suggestions.json
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
cortex/matcher.pycortex/tui.pycortex/database.pycortex/cli.py
🧠 Learnings (1)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*install*.py : Implement audit logging to ~/.cortex/history.db for all package operations
Applied to files:
cortex/cli.py
🧬 Code graph analysis (2)
cortex/matcher.py (1)
cortex/database.py (8)
stacks(107-108)detect_language_from_keywords(193-198)detect_intent_from_keywords(144-191)intents(99-100)get_stack(134-136)get_stacks_by_intent_and_language(210-215)get_stacks_by_language(206-208)get_stacks_by_intent(202-204)
cortex/tui.py (2)
cortex/hardware_detection.py (1)
detect_hardware(646-648)cortex/matcher.py (2)
match(177-323)get_install_preview(325-357)
🪛 GitHub Actions: CI
cortex/cli.py
[error] 33-33: Ruff I001: Import block is un-sorted or un-formatted.
🪛 GitHub Check: Lint
cortex/cli.py
[failure] 33-34: Ruff (I001)
cortex/cli.py:33:5: I001 Import block is un-sorted or un-formatted
[failure] 151-151: Ruff (W293)
cortex/cli.py:151:1: W293 Blank line contains whitespace
[failure] 146-146: Ruff (W293)
cortex/cli.py:146:1: W293 Blank line contains whitespace
[failure] 142-142: Ruff (W293)
cortex/cli.py:142:1: W293 Blank line contains whitespace
[failure] 138-138: Ruff (W293)
cortex/cli.py:138:1: W293 Blank line contains whitespace
[failure] 136-137: Ruff (I001)
cortex/cli.py:136:21: I001 Import block is un-sorted or un-formatted
[failure] 130-130: Ruff (W293)
cortex/cli.py:130:1: W293 Blank line contains whitespace
[failure] 125-125: Ruff (W293)
cortex/cli.py:125:1: W293 Blank line contains whitespace
[failure] 121-121: Ruff (W293)
cortex/cli.py:121:1: W293 Blank line contains whitespace
[failure] 117-117: Ruff (W293)
cortex/cli.py:117:1: W293 Blank line contains whitespace
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Agent
🔇 Additional comments (10)
cortex/tui.py (2)
19-28: LGTM - Class initialization is well-structured.The
SuggestionTUIclass is properly initialized with database and matcher dependencies. Type hints are present and the docstring describes the purpose.
103-112: LGTM - Hardware info display is clean.Uses
detect_quick()appropriately and formats output nicely.cortex/cli.py (2)
111-116: Type hint missing on return value of suggest method.Per coding guidelines, type hints are required. The method returns
intbut the signature only shows-> intimplicitly through fallback calls.🔎 Add explicit return type
- def suggest(self, query: str) -> int: + def suggest(self, query: str) -> int:The signature already has
-> intimplied but ensure consistency. Actually, looking again, the docstring mentions it returns int via the return statements. This is fine.Actually, reviewing again - the method signature at line 111 doesn't show a return type annotation in the provided code. The method should have
-> intadded.🔎 Add return type annotation
- def suggest(self, query: str) -> int: + def suggest(self, query: str) -> int:Wait, I see the code shows
def suggest(self, query: str) -> int:- that's already correct. Let me re-check the annotated code... Line 111 showsdef suggest(self, query: str) -> int:. The return type is present.
659-665: Demo method has placeholder implementation.The
demo()method comment says "(Keep existing demo logic)" but the implementation only prints a message and returns. Ensure the actual demo logic is preserved or this is intentional.cortex/matcher.py (2)
1-21: LGTM - Well-documented module header.The docstring clearly explains the matching strategy and provides useful examples.
360-361: LGTM - Backward compatibility alias is helpful.Provides
FuzzyMatcheralias for code expecting the old name.cortex/database.py (4)
33-47: LGTM - Database discovery with fallback paths.The search order prioritizes local development paths before system-wide locations, and includes backward compatibility for
packages.json.
144-191: LGTM - Intent detection with careful prioritization.The multi-pass approach correctly handles ambiguous tokens and prioritizes more specific matches.
217-223: LGTM - GPU filtering logic is correct.Properly filters based on GPU availability and vendor matching.
260-261: LGTM - Backward compatibility alias maintains API stability.
PackageDatabase = SuggestionDatabaseallows existing code to continue working.
cortex/cli.py
Outdated
|
|
||
| try: | ||
| # Show suggestions interactively | ||
| selected = run_suggestions(query, interactive=True) | ||
|
|
||
| if not selected: | ||
| # User cancelled | ||
| return 0 | ||
|
|
||
| # User selected something, now install it | ||
| console.print() | ||
| cx_print(f"Selected: {selected['name']}", "success") | ||
| console.print() | ||
|
|
||
| # Ask if they want to install | ||
| try: | ||
| choice = console.input("[bold cyan]Install this? (y/n):[/bold cyan] ").strip().lower() | ||
| if choice in ('y', 'yes'): | ||
| # Use stack data directly - no LLM needed! | ||
| from cortex.matcher import IntentMatcher | ||
| from cortex.database import SuggestionDatabase | ||
|
|
||
| db = SuggestionDatabase() | ||
| matcher = IntentMatcher(db) | ||
| preview = matcher.get_install_preview(selected['type'], selected['id']) | ||
|
|
||
| if preview.get("error"): | ||
| self._print_error(preview["error"]) | ||
| return 1 | ||
|
|
||
| commands = preview.get("commands", []) | ||
| if not commands: | ||
| cx_print("No installation commands for this stack", "warning") | ||
| return 0 | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove trailing whitespace from blank lines.
Static analysis reports W293: Blank line contains whitespace on multiple lines in this method (117, 121, 125, 130, 138, 142, 146, 151). Run a formatter or remove trailing spaces to fix the CI failure.
🧰 Tools
🪛 GitHub Check: Lint
[failure] 151-151: Ruff (W293)
cortex/cli.py:151:1: W293 Blank line contains whitespace
[failure] 146-146: Ruff (W293)
cortex/cli.py:146:1: W293 Blank line contains whitespace
[failure] 142-142: Ruff (W293)
cortex/cli.py:142:1: W293 Blank line contains whitespace
[failure] 138-138: Ruff (W293)
cortex/cli.py:138:1: W293 Blank line contains whitespace
[failure] 136-137: Ruff (I001)
cortex/cli.py:136:21: I001 Import block is un-sorted or un-formatted
[failure] 130-130: Ruff (W293)
cortex/cli.py:130:1: W293 Blank line contains whitespace
[failure] 125-125: Ruff (W293)
cortex/cli.py:125:1: W293 Blank line contains whitespace
[failure] 121-121: Ruff (W293)
cortex/cli.py:121:1: W293 Blank line contains whitespace
[failure] 117-117: Ruff (W293)
cortex/cli.py:117:1: W293 Blank line contains whitespace
🤖 Prompt for AI Agents
In cortex/cli.py around lines 117 to 151 there are blank lines that contain
trailing whitespace (W293) causing CI failures; remove any trailing spaces on
the blank lines (and optionally run your formatter/auto-strip trailing
whitespace across the file) so the blank lines are truly empty, then save and
re-run tests.
cortex/cli.py
Outdated
| from cortex.matcher import IntentMatcher | ||
| from cortex.database import SuggestionDatabase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix import sorting within function.
Static analysis flags these imports as unsorted (I001).
🔎 Proposed fix
- from cortex.matcher import IntentMatcher
from cortex.database import SuggestionDatabase
+ from cortex.matcher import IntentMatcher📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| from cortex.matcher import IntentMatcher | |
| from cortex.database import SuggestionDatabase | |
| from cortex.database import SuggestionDatabase | |
| from cortex.matcher import IntentMatcher |
🧰 Tools
🪛 GitHub Check: Lint
[failure] 136-137: Ruff (I001)
cortex/cli.py:136:21: I001 Import block is un-sorted or un-formatted
🤖 Prompt for AI Agents
In cortex/cli.py around lines 136-137, the two local imports are unsorted
(I001); reorder them so module names are alphabetically sorted within the
function (import cortex.database.SuggestionDatabase before
cortex.matcher.IntentMatcher) to satisfy the import-sorting rule.
cortex/cli.py
Outdated
| # Execute commands with quiet output and spinner | ||
| import subprocess | ||
| from rich.progress import Progress, SpinnerColumn, TextColumn | ||
|
|
||
| with Progress( | ||
| SpinnerColumn(), | ||
| TextColumn("[progress.description]{task.description}"), | ||
| console=console, | ||
| transient=True | ||
| ) as progress: | ||
| for i, cmd in enumerate(commands, 1): | ||
| # Make commands quieter | ||
| quiet_cmd = cmd | ||
| if "apt install" in cmd or "apt update" in cmd: | ||
| quiet_cmd = cmd.replace("apt ", "apt -qq ") | ||
| if "pip install" in cmd: | ||
| quiet_cmd = cmd.replace("pip install", "pip install -q") | ||
|
|
||
| task = progress.add_task(f"Step {i}/{len(commands)}: Installing...", total=None) | ||
|
|
||
| # sudo commands need interactive terminal for password prompt | ||
| if cmd.strip().startswith("sudo"): | ||
| progress.stop() # Pause spinner for sudo | ||
| returncode = subprocess.call(quiet_cmd, shell=True) | ||
| result_returncode = returncode | ||
| result_stderr = "" | ||
| progress.start() # Resume spinner | ||
| else: | ||
| # pip/other commands can run quietly | ||
| result = subprocess.run( | ||
| quiet_cmd, | ||
| shell=True, | ||
| capture_output=True, | ||
| text=True | ||
| ) | ||
| result_returncode = result.returncode | ||
| result_stderr = result.stderr | ||
|
|
||
| progress.remove_task(task) | ||
|
|
||
| if result_returncode != 0: | ||
| self._print_error(f"Step {i} failed") | ||
| if result_stderr: | ||
| # Show only the last few error lines | ||
| error_lines = result_stderr.strip().split('\n')[-5:] | ||
| for line in error_lines: | ||
| console.print(f" [dim]{line}[/dim]") | ||
| return 1 | ||
| else: | ||
| cx_print(f"Step {i}/{len(commands)} complete", "success") | ||
|
|
||
| console.print() | ||
| cx_print("Installation complete!", "success") | ||
| return 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's find the cortex/cli.py file and read lines 159-212
find . -name "cli.py" -path "*/cortex/*" | head -5Repository: cortexlinux/cortex
Length of output: 77
🏁 Script executed:
# Read the specific lines to see the method context
sed -n '140,220p' ./cortex/cli.py | cat -nRepository: cortexlinux/cortex
Length of output: 4549
🏁 Script executed:
# Read a wider range to see the method definition
sed -n '100,160p' ./cortex/cli.py | cat -nRepository: cortexlinux/cortex
Length of output: 2892
🏁 Script executed:
# Now search for InstallationHistory usage and the install() method
grep -n "InstallationHistory" ./cortex/cli.pyRepository: cortexlinux/cortex
Length of output: 294
🏁 Script executed:
# Let's see the context around those InstallationHistory usages
grep -B 5 "InstallationHistory()" ./cortex/cli.py | head -30Repository: cortexlinux/cortex
Length of output: 732
🏁 Script executed:
# Let's find the install() method definition and see how it records history
sed -n '200,400p' ./cortex/cli.py | cat -nRepository: cortexlinux/cortex
Length of output: 9490
🏁 Script executed:
# Let's see the full install() method to understand the pattern
sed -n '302,400p' ./cortex/cli.py | cat -nRepository: cortexlinux/cortex
Length of output: 4493
🏁 Script executed:
# Get the suggest() method signature line number to confirm the range
grep -n "def suggest" ./cortex/cli.pyRepository: cortexlinux/cortex
Length of output: 107
🏁 Script executed:
# Let's look at the complete suggest() method to see if it records to history
sed -n '112,300p' ./cortex/cli.py | grep -A 150 "def suggest" | grep -i "history\|InstallationHistory"Repository: cortexlinux/cortex
Length of output: 124
🏁 Script executed:
# Get the full suggest() method - find where it ends
grep -n "def suggest\|def install\|def notify" ./cortex/cli.py | head -5Repository: cortexlinux/cortex
Length of output: 227
🏁 Script executed:
# Read the complete suggest() method (lines 112-230)
sed -n '112,230p' ./cortex/cli.pyRepository: cortexlinux/cortex
Length of output: 5485
Implement audit logging for suggestion-based installations.
The suggest() method executes package operations without recording them to ~/.cortex/history.db, while the install() method uses InstallationHistory to maintain an audit trail. This creates an inconsistency with the audit logging requirement for all package operations. Add InstallationHistory recording to the suggest() method to ensure all installations are tracked and rollback-capable.
cortex/matcher.py
Outdated
| post_install = stack.get("post_install", []) | ||
|
|
||
| commands = [] | ||
| if apt_packages: | ||
| commands.append(f"sudo apt update && sudo apt install -y {' '.join(apt_packages)}") | ||
| if pip_packages: | ||
| commands.append(f"pip install {' '.join(pip_packages)}") | ||
| if npm_packages: | ||
| commands.append(f"npm install -g {' '.join(npm_packages)}") | ||
|
|
||
| return { | ||
| "name": stack.get("name"), | ||
| "description": stack.get("description"), | ||
| "apt_packages": apt_packages, | ||
| "pip_packages": pip_packages, | ||
| "npm_packages": npm_packages, | ||
| "post_install": post_install, | ||
| "commands": commands, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
post_install commands are loaded but not included in the commands list.
The post_install field is extracted from the stack but not appended to commands. If these are meant to be executed, they should be added; otherwise, clarify the intent.
🔎 If post_install should be included
if npm_packages:
commands.append(f"npm install -g {' '.join(npm_packages)}")
+
+ # Add post-install commands
+ commands.extend(post_install)
return {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| post_install = stack.get("post_install", []) | |
| commands = [] | |
| if apt_packages: | |
| commands.append(f"sudo apt update && sudo apt install -y {' '.join(apt_packages)}") | |
| if pip_packages: | |
| commands.append(f"pip install {' '.join(pip_packages)}") | |
| if npm_packages: | |
| commands.append(f"npm install -g {' '.join(npm_packages)}") | |
| return { | |
| "name": stack.get("name"), | |
| "description": stack.get("description"), | |
| "apt_packages": apt_packages, | |
| "pip_packages": pip_packages, | |
| "npm_packages": npm_packages, | |
| "post_install": post_install, | |
| "commands": commands, | |
| post_install = stack.get("post_install", []) | |
| commands = [] | |
| if apt_packages: | |
| commands.append(f"sudo apt update && sudo apt install -y {' '.join(apt_packages)}") | |
| if pip_packages: | |
| commands.append(f"pip install {' '.join(pip_packages)}") | |
| if npm_packages: | |
| commands.append(f"npm install -g {' '.join(npm_packages)}") | |
| # Add post-install commands | |
| commands.extend(post_install) | |
| return { | |
| "name": stack.get("name"), | |
| "description": stack.get("description"), | |
| "apt_packages": apt_packages, | |
| "pip_packages": pip_packages, | |
| "npm_packages": npm_packages, | |
| "post_install": post_install, | |
| "commands": commands, |
🤖 Prompt for AI Agents
In cortex/matcher.py around lines 334 to 351, the post_install list is read from
the stack but never added to the commands list; if post_install entries should
be executed, extend the commands list with those entries (e.g., validate
post_install is a list, filter out non-string/empty items, then
commands.extend(post_install) or append each string) so the returned "commands"
includes apt/pip/npm commands plus post_install steps in the intended order; if
post_install is optional, leave the existing default but still normalize it
before extending.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds an instant, offline smart suggestion system that enables users to query packages using natural language (e.g., cortex docker or cortex python ml) without requiring LLM API calls. The system uses intent-based matching with a curated database of 500+ package stacks, fuzzy typo correction, and an interactive TUI for selection.
Key Changes
- Adds offline package suggestion engine with intent + language detection and fuzzy matching
- Introduces interactive TUI with scrolling viewport and keyboard navigation
- Integrates suggestions as the default behavior for non-command queries (bypassing argparse)
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 20 comments.
| File | Description |
|---|---|
| cortex/tui.py | New interactive TUI with selection interface, viewport scrolling, and fallback modes |
| cortex/matcher.py | Intent-based matching engine with 6-strategy scoring system and fuzzy matching |
| cortex/database.py | Database loader with keyword indexing for fast intent/language detection |
| cortex/cli.py | CLI integration with early query detection, suggestion flow, and direct installation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
cortex/tui.py
Outdated
| if not self.results: | ||
| console.print(f"[dim]No matches found for '{query}'[/dim]") | ||
| return None | ||
|
|
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function displays up to 5 items but prompts for "choice (1-5, 0 to cancel)". If fewer than 5 results exist, entering a number for a non-existent item (e.g., entering "5" when only 3 results exist) will fail silently. Consider dynamically generating the prompt text to reflect the actual number of available results.
cortex/tui.py
Outdated
| event.app.exit() | ||
|
|
||
| @kb.add("tab") | ||
| def _(event): |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Tab key handler exits the app, shows details, then tries to restart with event.app.run(). This creates a nested event loop which can cause unexpected behavior or failures. Consider restructuring to use a loop that re-runs the application instead of nesting calls to app.run().
| event.app.exit() | |
| @kb.add("tab") | |
| def _(event): | |
| # Show full details without exiting and restarting the app, | |
| # to avoid nesting event loops. | |
| self.show_full_details() | |
| event.app.invalidate() |
cortex/cli.py
Outdated
| f"Step {i}/{len(commands)}: Installing...", total=None | ||
| ) | ||
|
|
||
| # sudo commands need interactive terminal for password prompt | ||
| if cmd.strip().startswith("sudo"): | ||
| progress.stop() # Pause spinner for sudo | ||
| returncode = subprocess.call(quiet_cmd, shell=True) | ||
| result_returncode = returncode | ||
| result_stderr = "" | ||
| progress.start() # Resume spinner | ||
| else: | ||
| # pip/other commands can run quietly | ||
| result = subprocess.run( | ||
| quiet_cmd, shell=True, capture_output=True, text=True |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shell injection vulnerability: Commands are executed with shell=True using user-controlled input from the database. If the database contains malicious commands (e.g., through injection or tampering), they will be executed with the user's privileges (potentially with sudo). Consider using subprocess with a list of arguments instead of shell=True, or at minimum add validation/sanitization of commands before execution.
cortex/database.py
Outdated
| # Only check multi-word keywords | ||
| if " " in kw and kw.lower() in query: | ||
| return intent["id"] | ||
|
|
||
| # Second pass: check multi-word aliases | ||
| for alias, intent_ids in self._aliases.items(): | ||
| if " " in alias and alias.lower() in query: |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential bug: When checking multi-word keywords/aliases in lines 161 and 166, the check kw.lower() in query performs a substring search. This could produce false matches, e.g., "machinelearning" would match "machine learning" even though it's one word. Consider using word boundary checks or splitting the query properly to avoid partial word matches.
cortex/database.py
Outdated
|
|
||
| def _load(self) -> None: | ||
| """Load the database from JSON file.""" | ||
| with open(self.db_path) as f: |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing error handling: If the JSON file is malformed or contains invalid data, the json.load() will raise a JSONDecodeError which is not caught. This could lead to unhelpful error messages for users. Consider wrapping the JSON loading in a try-except block with a more informative error message about the corrupted database file.
| with open(self.db_path) as f: | |
| try: | |
| self._data = json.load(f) | |
| except json.JSONDecodeError as e: | |
| # Provide a clearer error message indicating which database file is corrupted. | |
| raise json.JSONDecodeError( | |
| f"Failed to parse suggestion database file '{self.db_path}': {e.msg}", | |
| e.doc, | |
| e.pos, | |
| ) from e |
cortex/database.py
Outdated
| for intent in self._intents: | ||
| for kw in intent.get("keywords", []): | ||
| self._intent_keywords[kw.lower()] = intent["id"] | ||
|
|
||
| # Language keyword index | ||
| self._language_keywords: dict[str, str] = {} # keyword -> language_id | ||
| for lang in self._languages: | ||
| for kw in lang.get("keywords", []): |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential data integrity issue: If multiple intents or languages share the same keyword, later entries will silently overwrite earlier ones in the keyword index (lines 66, 72). This could lead to incorrect intent/language detection. Consider either validating keyword uniqueness during loading, using a list of intent/language IDs per keyword, or documenting this behavior as expected.
| for intent in self._intents: | |
| for kw in intent.get("keywords", []): | |
| self._intent_keywords[kw.lower()] = intent["id"] | |
| # Language keyword index | |
| self._language_keywords: dict[str, str] = {} # keyword -> language_id | |
| for lang in self._languages: | |
| for kw in lang.get("keywords", []): | |
| intent_id = intent.get("id") | |
| for kw in intent.get("keywords", []): | |
| key = kw.lower() | |
| existing = self._intent_keywords.get(key) | |
| if existing is not None and existing != intent_id: | |
| raise ValueError( | |
| f"Duplicate intent keyword '{key}' for intents " | |
| f"'{existing}' and '{intent_id}' in {self.db_path}" | |
| ) | |
| self._intent_keywords[key] = intent_id | |
| # Language keyword index | |
| self._language_keywords: Dict[str, str] = {} # keyword -> language_id | |
| for lang in self._languages: | |
| lang_id = lang.get("id") | |
| for kw in lang.get("keywords", []): | |
| key = kw.lower() | |
| existing = self._language_keywords.get(key) | |
| if existing is not None and existing != lang_id: | |
| raise ValueError( | |
| f"Duplicate language keyword '{key}' for languages " | |
| f"'{existing}' and '{lang_id}' in {self.db_path}" | |
| ) | |
| self._language_keywords[key] = lang_id |
cortex/cli.py
Outdated
| # Import suggestion system components | ||
| try: | ||
| from cortex.database import PackageDatabase | ||
| from cortex.tui import run_suggestions |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import of 'PackageDatabase' is not used.
| from cortex.tui import run_suggestions |
cortex/tui.py
Outdated
| from typing import Optional | ||
|
|
||
| from rich.console import Console | ||
| from rich.panel import Panel |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import of 'Text' is not used.
| from rich.panel import Panel |
cortex/tui.py
Outdated
| from rich.table import Table | ||
| from rich.text import Text | ||
|
|
||
| from .database import PackageDatabase |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import of 'detect_hardware' is not used.
| from .database import PackageDatabase | |
| from .hardware_detection import detect_quick |
cortex/tui.py
Outdated
| for i, item in enumerate(self.results[:5]): | ||
| prefix = "→" if i == 0 else " " | ||
| name = item["id"].ljust(16) # Pad for alignment | ||
| desc = item["description"] |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'except' clause does nothing but pass and there is no explanatory comment.
| desc = item["description"] | |
| except (ValueError, KeyboardInterrupt): | |
| # Intentionally treat invalid input or user cancellation as "no selection" | |
| # and fall through to the final `return None` below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
♻️ Duplicate comments (3)
cortex/matcher.py (1)
349-381:post_installcommands are loaded but not included in the commands list.The
post_installfield is extracted from the stack (line 358) and returned in the preview dict, but it's never appended tocommands. If these are intended to be executed as part of the installation, they should be added to the commands list.🔎 If post_install should be included
if npm_packages: commands.append(f"npm install -g {' '.join(npm_packages)}") + + # Add post-install commands + if post_install: + commands.extend(post_install) return {cortex/tui.py (1)
176-181: Tab key handler logic is broken.After calling
event.app.exit(), callingevent.app.run()on the same Application instance won't restart the interactive loop properly. The application has already exited and its state is invalidated.Consider using a sentinel value and handling details display in an outer loop:
🔎 Suggested approach
@kb.add('tab') def _(event): - event.app.exit() - self.show_full_details() - # Restart the app - event.app.run() + self.selected_item = {"__show_details__": True} + event.app.exit()Then wrap the
app.run()call in a loop that checks for the sentinel and re-runs after showing details.cortex/cli.py (1)
112-228: Audit logging missing for suggestion-based installations.The
suggest()method executes package operations via subprocess but doesn't record them to~/.cortex/history.dbusingInstallationHistory, unlike theinstall()method. This creates an inconsistency and violates the audit logging requirement for all package operations.Based on learnings, implement audit logging similar to the
install()method.
🧹 Nitpick comments (9)
cortex/matcher.py (2)
20-20: Remove unused importOptional.The
Optionaltype is imported but never used; the code consistently uses thestr | Noneunion syntax instead.🔎 Proposed fix
-from typing import Optional
165-195: Use Damerau-Levenshtein distance or RapidFuzz for transposition-aware typo detection.The current character set overlap approach won't effectively catch transposition typos (e.g., "dokcer" vs "docker"). Standard Levenshtein distance does not handle transposed characters well; Damerau-Levenshtein is better suited. RapidFuzz provides Damerau-Levenshtein among its distance metrics and is a robust choice for production use.
cortex/database.py (3)
8-8: Remove unused importOptional.The
Optionaltype is imported but the code usesstr | Noneanddict | Noneunion syntax throughout.🔎 Proposed fix
-from typing import Any, Optional +from typing import Any
121-133:get_intentandget_languageuse linear search despite having indexed data.These methods iterate through lists when indexed lookups could be used. Consider building
_intents_by_idand_languages_by_idindexes in_build_indexesfor O(1) lookups, similar to how_stacks_by_idis implemented.🔎 Suggested improvement
def _build_indexes(self) -> None: """Build lookup indexes for fast access.""" + # Intent and language indexes by ID + self._intents_by_id: dict[str, dict] = {i["id"]: i for i in self._intents} + self._languages_by_id: dict[str, dict] = {l["id"]: l for l in self._languages} + # Intent keyword index ... def get_intent(self, intent_id: str) -> dict | None: """Get an intent by its ID.""" - for intent in self._intents: - if intent["id"] == intent_id: - return intent - return None + return self._intents_by_id.get(intent_id) def get_language(self, lang_id: str) -> dict | None: """Get a language by its ID.""" - for lang in self._languages: - if lang["id"] == lang_id: - return lang - return None + return self._languages_by_id.get(lang_id)
211-216:get_stacks_by_intent_and_languagecreates intermediate sets unnecessarily.The method creates two sets and computes their intersection. This could be optimized by iterating once and checking membership.
🔎 Minor optimization
def get_stacks_by_intent_and_language(self, intent_id: str, lang_id: str) -> list[dict]: """Get stacks matching both intent and language.""" - intent_stacks = {s["id"] for s in self.get_stacks_by_intent(intent_id)} - lang_stacks = {s["id"] for s in self.get_stacks_by_language(lang_id)} - matching_ids = intent_stacks & lang_stacks - return [self._stacks_by_id[sid] for sid in matching_ids] + lang_stack_ids = {s["id"] for s in self.get_stacks_by_language(lang_id)} + return [s for s in self.get_stacks_by_intent(intent_id) if s["id"] in lang_stack_ids]cortex/tui.py (3)
7-7: Remove unused importOptional.
Optionalis imported but the code usesdict | Noneunion syntax.🔎 Proposed fix
-from typing import Optional +from typing import TYPE_CHECKINGOr simply remove the import if not needed elsewhere.
15-15: Remove unused importdetect_hardware.Only
detect_quickis used in the file;detect_hardwareis imported but never called.🔎 Proposed fix
-from .hardware_detection import detect_hardware, detect_quick +from .hardware_detection import detect_quick
366-379: Redundant database instantiation in CLI entry point.When the user selects a result, a new
PackageDatabase()is instantiated (line 374) even though one was already created insiderun_suggestions. This duplicates work and could cause inconsistencies if database state differs.🔎 Proposed fix - pass matcher through result
The result dict could include the preview directly, or the TUI could expose the matcher instance.
cortex/cli.py (1)
718-752: Early argument interception bypasses argparse validation.The main function intercepts arguments before argparse runs, which bypasses standard argument validation and help text. This could cause confusion when users mix flags incorrectly (e.g.,
cortex docker --invalid-flag).Additionally, the hardcoded command list (lines 720-735) needs to be kept in sync with subparser definitions, creating a maintenance burden.
🔎 Consider using argparse's parse_known_args instead
# Parse known args first, then check if remaining looks like a query args, remaining = parser.parse_known_args() if not args.command and remaining: query = " ".join(remaining) return cli.suggest(query)
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
cortex/cli.pycortex/database.pycortex/matcher.pycortex/tui.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
cortex/matcher.pycortex/cli.pycortex/tui.pycortex/database.py
🧠 Learnings (1)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*install*.py : Implement audit logging to ~/.cortex/history.db for all package operations
Applied to files:
cortex/cli.py
🧬 Code graph analysis (2)
cortex/matcher.py (1)
cortex/database.py (9)
SuggestionDatabase(11-258)stacks(108-109)detect_language_from_keywords(194-199)detect_intent_from_keywords(145-192)intents(100-101)get_stack(135-137)get_stacks_by_intent_and_language(211-216)get_stacks_by_language(207-209)get_stacks_by_intent(203-205)
cortex/tui.py (2)
cortex/hardware_detection.py (1)
detect_hardware(646-648)cortex/matcher.py (2)
match(197-347)get_install_preview(349-381)
🔇 Additional comments (8)
cortex/matcher.py (2)
77-99: Docstring missing for private method_build_keyword_index.The docstring "Build reverse index from keywords to stacks." is present, which is good. The implementation correctly indexes both explicit keywords and name words.
197-347: Core matching logic is well-structured with clear scoring priorities.The
matchmethod implements a comprehensive multi-strategy matching approach with well-documented scoring constants. The hardware filtering for GPU requirements and the bonus system for complexity/GPU availability are good UX touches.cortex/database.py (1)
145-192: Intent detection logic is comprehensive with clear priority documentation.The six-pass detection approach with documented priorities handles multi-word keywords, aliases, and ambiguity with language tokens well. The docstring clearly explains the detection order.
cortex/tui.py (1)
118-243: Interactive mode implementation with scrolling viewport is well designed.The scrolling viewport with scroll indicators, keyboard navigation, and auto-selection for high-confidence single results provides a good UX. The graceful fallback to simple mode when prompt_toolkit is unavailable is a nice touch.
cortex/cli.py (4)
33-34: Import sorting issue was flagged in previous review.The imports in the try block should be alphabetically sorted to satisfy linting rules.
139-140: Import sorting issue was flagged in previous review.The local imports should be sorted alphabetically.
112-117: Type hint for return type present and docstring provided.The
suggestmethod has proper return type annotation (-> int) and a clear docstring.
662-668: Add type hints and complete the demo method implementation.The
demomethod is missing type hints (the return type should be-> int) which violates the coding guidelines. Additionally, the implementation only contains a placeholder comment without actual demo logic beyond printing messages and returning 0.
cortex/cli.py
Outdated
| if cmd.strip().startswith("sudo"): | ||
| progress.stop() # Pause spinner for sudo | ||
| returncode = subprocess.call(quiet_cmd, shell=True) | ||
| result_returncode = returncode | ||
| result_stderr = "" | ||
| progress.start() # Resume spinner | ||
| else: | ||
| # pip/other commands can run quietly | ||
| result = subprocess.run( | ||
| quiet_cmd, shell=True, capture_output=True, text=True | ||
| ) | ||
| result_returncode = result.returncode | ||
| result_stderr = result.stderr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
fd -t f "cli.py" | head -5Repository: cortexlinux/cortex
Length of output: 110
🏁 Script executed:
wc -l cortex/cli.pyRepository: cortexlinux/cortex
Length of output: 79
🏁 Script executed:
sed -n '160,220p' cortex/cli.pyRepository: cortexlinux/cortex
Length of output: 3108
🏁 Script executed:
sed -n '100,160p' cortex/cli.pyRepository: cortexlinux/cortex
Length of output: 2465
🏁 Script executed:
fd -t f "database.py\|suggestions.json\|\.json" cortex/ | grep -E "(database|suggestion)" | head -20Repository: cortexlinux/cortex
Length of output: 44
🏁 Script executed:
rg "SuggestionDatabase|commands.*=" cortex/cli.py -A 2 -B 2Repository: cortexlinux/cortex
Length of output: 1330
🏁 Script executed:
fd -t f "database.py" cortex/Repository: cortexlinux/cortex
Length of output: 80
🏁 Script executed:
wc -l cortex/database.py && head -80 cortex/database.pyRepository: cortexlinux/cortex
Length of output: 3168
🏁 Script executed:
fd -t f "matcher.py" cortex/Repository: cortexlinux/cortex
Length of output: 79
🏁 Script executed:
grep -n "get_install_preview\|def get_install_preview" cortex/matcher.py -A 20Repository: cortexlinux/cortex
Length of output: 999
🏁 Script executed:
sed -n '360,380p' cortex/matcher.pyRepository: cortexlinux/cortex
Length of output: 1013
🏁 Script executed:
find . -name "suggestions.json" -o -name "packages.json" | head -5Repository: cortexlinux/cortex
Length of output: 85
🏁 Script executed:
ls -la data/suggestions.json && head -50 data/suggestions.jsonRepository: cortexlinux/cortex
Length of output: 1103
🏁 Script executed:
git log --oneline data/suggestions.json | head -5Repository: cortexlinux/cortex
Length of output: 107
🏁 Script executed:
grep -n "apt_packages\|pip_packages\|npm_packages" cortex/database.py -B 2 -A 2 | head -40Repository: cortexlinux/cortex
Length of output: 1066
🏁 Script executed:
cat > /tmp/check_injection.py << 'EOF'
import json
# Check if package names could contain shell metacharacters
with open('data/suggestions.json') as f:
data = json.load(f)
# Look for packages with potentially dangerous characters
dangerous_chars = [';', '$', '`', '|', '&', '(', ')', '<', '>', '\n']
issues = []
for stack in data.get('stacks', []):
for pkg_type in ['apt_packages', 'pip_packages', 'npm_packages']:
for pkg in stack.get(pkg_type, []):
for char in dangerous_chars:
if char in pkg:
issues.append(f"Stack {stack['id']}: {pkg_type} contains '{char}': {pkg}")
if issues:
print("Found potentially dangerous package names:")
for issue in issues[:5]:
print(f" {issue}")
else:
print("No obviously malicious package names found in current database")
print("However, if database is user-editable, attacker could inject commands")
EOF
python /tmp/check_injection.pyRepository: cortexlinux/cortex
Length of output: 148
Command injection risk depends on suggestions.json file permissions.
The subprocess.call and subprocess.run calls use shell=True with package names from suggestions.json. While the database is bundled and read-only in the repository, the code searches for user-writable locations (~/.cortex/suggestions.json). If this file is modified by a local attacker, arbitrary commands could be injected via package names like vim && malicious-command. This is a local privilege concern rather than a remote vulnerability. Consider restricting file permissions for suggestions.json in user directories or validating/escaping package names before shell execution.
🤖 Prompt for AI Agents
In cortex/cli.py around lines 186 to 198, the code executes package install
commands with shell=True using untrusted package names from suggestions.json,
creating a local command-injection risk; fix by validating and normalizing
package names on load (allow only expected pattern [a-z0-9._+-]+ and
reject/escape anything else), enforce restrictive permissions on user
suggestions.json (e.g., owner-only read/write) and fail-safe if permissions are
too open, and change subprocess calls to avoid shell=True: construct argument
lists and call subprocess.run([...], shell=False) (for sudo use ['sudo',
package] or ['sudo', 'pip', 'install', package] as appropriate) so arbitrary
shell operators cannot be interpreted.
cortex/cli.py
Outdated
| def install(self, software: str, execute: bool = False, dry_run: bool = False): | ||
| """Install software using LLM to interpret the request. No suggestions - direct execution.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Type hint missing on return type for install method.
The install method lacks a return type annotation. Based on the implementation, it returns int.
🔎 Proposed fix
- def install(self, software: str, execute: bool = False, dry_run: bool = False):
+ def install(self, software: str, execute: bool = False, dry_run: bool = False) -> int:As per coding guidelines, type hints are required in Python code.
🤖 Prompt for AI Agents
In cortex/cli.py around lines 301 to 302, the install method signature is
missing a return type annotation; update the method signature to include the
correct return type (-> int) so it reads install(self, software: str, execute:
bool = False, dry_run: bool = False) -> int:, and keep the existing
implementation and docstring unchanged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test_parallel_llm.py (1)
1-1: CI Pipeline Failure: Black formatting required.The Black formatting check has failed. Run
black .from the repository root to format this file and the 3 other files that need reformatting.#!/bin/bash # Fix formatting issues black .
♻️ Duplicate comments (10)
cortex/tui.py (3)
181-186: Tab key handler creates nested event loops.Calling
event.app.exit()followed byevent.app.run()creates a nested event loop, which can cause unexpected behavior or failures. Consider restructuring to use a loop that re-runs the application instead of nesting calls, or use a modal/overlay for details without exiting.
304-304: Misleading UI hints in non-interactive mode.The hint
[↑↓ to select, Enter to install, Tab for details]is misleading becauserun_simpleonly accepts numeric input, not arrow keys or Tab.🔎 Proposed fix
console.print(f"{prefix} {name} {desc}") - console.print("[dim] [↑↓ to select, Enter to install, Tab for details][/dim]") + console.print("[dim] Enter a number to select[/dim]") console.print()
328-328: Update references from "packages.json" to "suggestions.json".The docstring and error message reference "packages.json" but the database loader searches for "suggestions.json" (see cortex/database.py line 54). This inconsistency could confuse users.
🔎 Proposed fix
Args: query: Initial search query - db_path: Optional path to packages.json + db_path: Optional path to suggestions.json interactive: Run in interactive mode Returns: Selected package/stack dict or None if cancelled """ try: db = PackageDatabase(db_path) except FileNotFoundError as e: console.print(f"[red]Error:[/red] {e}") - console.print("[dim]Make sure packages.json exists in data/ directory[/dim]") + console.print("[dim]Make sure suggestions.json exists in data/ directory[/dim]") return NoneAlso applies to: 336-338
cortex/database.py (6)
58-58: Specify explicit encoding when opening the JSON file.Opening files without specifying encoding may cause issues on systems with non-UTF-8 default encodings.
🔎 Proposed fix
def _load(self) -> None: """Load the database from JSON file.""" - with open(self.db_path) as f: + with open(self.db_path, encoding="utf-8") as f: self._data = json.load(f)
58-59: Add error handling for malformed JSON files.If the JSON file is malformed,
json.load()will raise aJSONDecodeErrorwith a potentially unclear error message. Consider wrapping in a try-except block to provide a more informative message.🔎 Proposed fix
def _load(self) -> None: """Load the database from JSON file.""" with open(self.db_path, encoding="utf-8") as f: - self._data = json.load(f) + try: + self._data = json.load(f) + except json.JSONDecodeError as e: + raise ValueError( + f"Failed to parse suggestion database file '{self.db_path}': {e.msg}" + ) from e
70-79: Duplicate keywords will silently overwrite earlier entries.If multiple intents or languages share the same keyword, later entries will silently overwrite earlier ones in the keyword index. This could lead to incorrect intent/language detection. Consider either validating keyword uniqueness during loading, using a list of intent/language IDs per keyword, or documenting this as expected behavior.
82-82: Duplicate stack IDs will silently overwrite earlier entries.The dictionary comprehension will silently overwrite earlier stacks if duplicate IDs exist. Consider validating that all stack IDs are unique during loading, or at minimum log a warning when duplicates are detected.
132-144: Consider adding O(1) lookups for intents and languages.The
get_intentandget_languagemethods perform linear searches through lists. Since_build_indexesalready creates dictionaries for other lookups, consider adding_intents_by_idand_languages_by_iddictionaries for O(1) lookups to improve performance.
168-179: Substring search for multi-word keywords could produce false matches.The checks
kw.lower() in queryandalias.lower() in queryperform substring searches, which could produce false positives. For example, "machinelearning" would match "machine learning" even though it's one continuous word. Consider using word boundary checks or proper tokenization to avoid partial word matches.cortex/matcher.py (1)
363-380:post_installcommands are loaded but not included in the commands list.The
post_installfield is extracted from the stack but not appended tocommands. If these are meant to be executed, they should be added; otherwise, clarify the intent with a comment.🔎 If post_install should be included
if npm_packages: commands.append(f"npm install -g {' '.join(npm_packages)}") + + # Add post-install commands + commands.extend(post_install) return {
🧹 Nitpick comments (8)
cortex/matcher.py (2)
276-286: Consider optimizing partial keyword matching for large databases.The nested loop iterates through all keywords for each token, which is O(tokens × keywords). For large databases, consider optimizing with n-gram indexing, prefix trees, or adding early termination if enough high-scoring candidates are found.
288-313: Consider running fuzzy matching only as a fallback.Fuzzy matching currently runs for all stacks even when other strategies have already found candidates. For large databases, consider only running fuzzy matching when no other strategies produced results, or adding early termination once enough high-scoring candidates are found.
cortex/tui.py (6)
12-12: Remove unused import.The
Textimport fromrich.textis not used in this file.🔎 Proposed fix
from rich.console import Console from rich.panel import Panel from rich.table import Table -from rich.text import Text
15-15: Remove unused import.Only
detect_quickis used (line 114), butdetect_hardwareis also imported.🔎 Proposed fix
-from .hardware_detection import detect_hardware, detect_quick +from .hardware_detection import detect_quick
189-193: Clarify the max_name_width calculation logic.The calculation
results[:viewport_size + 5]examines up to 10 items (5 + 5), but the rationale for looking beyond the viewport isn't clear. Consider adding a comment explaining why you need to look at more than just the visible items, or simplify if the extra items aren't necessary.
299-311: Prompt range doesn't reflect actual number of results.The prompt says "choice (1-5, 0 to cancel)" but if fewer than 5 results exist, entering a number for a non-existent item will fail silently. Consider dynamically generating the prompt to reflect the actual number of results.
🔎 Proposed fix
console.print() console.print(" 0. Cancel") - try: - choice = console.input("\n[bold]Enter choice (1-5, 0 to cancel):[/bold] ") + num_displayed = min(len(self.results), 5) + try: + choice = console.input(f"\n[bold]Enter choice (1-{num_displayed}, 0 to cancel):[/bold] ") idx = int(choice) - 1
314-315: Add explanatory comment for the exception handler.The
exceptclause silently catches exceptions without explanation. Consider adding a comment to clarify that invalid input or user cancellation is intentionally treated as "no selection."🔎 Proposed fix
except (ValueError, KeyboardInterrupt): + # Treat invalid input or user cancellation as "no selection" pass
379-379: Avoid recreating PackageDatabase for the preview.Line 379 creates a new
PackageDatabase()instance just to get the install preview. Consider refactoring to either include the database/matcher reference in the result dict or cache the database instance, to avoid redundant file loading and parsing.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
cortex/database.pycortex/matcher.pycortex/tui.pyexamples/parallel_llm_demo.pytest_parallel_llm.py
✅ Files skipped from review due to trivial changes (1)
- examples/parallel_llm_demo.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
cortex/tui.pytest_parallel_llm.pycortex/matcher.pycortex/database.py
🧬 Code graph analysis (3)
cortex/tui.py (3)
cortex/hardware_detection.py (1)
detect_hardware(646-648)tests/test_graceful_degradation.py (1)
matcher(112-113)cortex/matcher.py (2)
match(202-352)get_install_preview(354-386)
test_parallel_llm.py (1)
cortex/llm_router.py (1)
acomplete(441-505)
cortex/matcher.py (1)
cortex/database.py (9)
SuggestionDatabase(11-269)stacks(116-118)detect_language_from_keywords(205-210)detect_intent_from_keywords(156-203)intents(106-108)get_stack(146-148)get_stacks_by_intent_and_language(222-227)get_stacks_by_language(218-220)get_stacks_by_intent(214-216)
🪛 GitHub Actions: CI
test_parallel_llm.py
[error] 1-1: Black formatting check failed. 4 files would be reformatted by black; run 'black .' to format.
🔇 Additional comments (2)
test_parallel_llm.py (2)
51-51: LGTM - unnecessary f-strings removed.The cosmetic changes removing f-strings where no interpolation is needed improve code clarity.
Also applies to: 98-98, 145-145, 247-247
236-237: Good fix - properly propagates task_type parameter.The change ensures
task_typeis correctly passed toacompleteby filtering it from the dict unpacking and passing it explicitly, which prevents potential duplicate keyword argument issues while ensuring proper LLM routing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (6)
cortex/database.py (3)
137-149: Prefer O(1) dictionary lookups over O(n) linear search.The
get_intentandget_languagemethods perform linear searches through lists. Since_build_indexesalready creates dictionaries for other lookups, consider addingintent_by_idandlanguage_by_iddictionaries for O(1) performance.🔎 Proposed refactor
Add to
_build_indexes:def _build_indexes(self) -> None: """Build lookup indexes for fast access.""" + # Intent and language ID indexes + self._intents_by_id: dict[str, dict] = {i["id"]: i for i in self._intents} + self._languages_by_id: dict[str, dict] = {l["id"]: l for l in self._languages} + # Intent keyword index self._intent_keywords: dict[str, str] = {} # keyword -> intent_idThen update the methods:
def get_intent(self, intent_id: str) -> dict | None: """Get an intent by its ID.""" - for intent in self._intents: - if intent["id"] == intent_id: - return intent - return None + return self._intents_by_id.get(intent_id) def get_language(self, lang_id: str) -> dict | None: """Get a language by its ID.""" - for lang in self._languages: - if lang["id"] == lang_id: - return lang - return None + return self._languages_by_id.get(lang_id)Based on past review comments, this performance concern was previously identified but not addressed.
75-84: Duplicate keywords silently overwrite in indexes.If multiple intents or languages share the same keyword in the JSON data, later entries will silently overwrite earlier ones in the keyword indexes. This could lead to incorrect intent/language detection. Consider validating keyword uniqueness during loading or documenting this as expected behavior.
Based on past review comments, this issue was previously identified but not addressed.
87-87: Duplicate stack IDs will be silently overwritten.If the JSON contains stacks with duplicate IDs, the dictionary comprehension will silently overwrite earlier entries. Consider validating that all stack IDs are unique during loading or logging a warning when duplicates are detected.
Based on past review comments, this issue was previously identified but not addressed.
cortex/tui.py (3)
187-193: Clarify max_name_width lookahead logic.The calculation
self.results[:viewport_size + 5]looks at up to 10 items, but the rationale for adding 5 to viewport_size isn't documented. Consider adding a comment explaining why you look beyond the visible viewport.🔎 Suggested comment
# Calculate max name width for alignment (from visible results) + # Look at viewport_size + 5 items to stabilize column width as user scrolls max_name_width = ( max(len(item["id"]) for item in self.results[: viewport_size + 5]) if self.results else 16 )Based on past review comments, this clarity concern was previously identified but not addressed.
313-314: Add explanatory comment to empty except clause.The except clause silently catches
ValueErrorandKeyboardInterruptwithout explanation. Consider adding a comment to clarify the intent.🔎 Proposed improvement
except (ValueError, KeyboardInterrupt): + # Treat invalid input or user cancellation as "no selection" passBased on past review comments, this clarity concern was previously identified but not addressed.
180-185: Nested event loop created by Tab handler.The Tab key handler exits the app, shows details, then tries to restart with
event.app.run(). This creates a nested event loop which can cause unexpected behavior or failures. Consider restructuring to use a sentinel value and an outer loop, or show details without exiting the app.Based on past review comments, this nested event loop issue was previously identified but not addressed.
🧹 Nitpick comments (1)
cortex/tui.py (1)
369-383: Consider reusing database instances in CLI entry point.Line 378 creates a new
PackageDatabase()andFuzzyMatcherto get the preview, but the database instance was already created earlier. Consider refactoring to reuse instances or pass them through for better efficiency.🔎 Proposed refactor
if __name__ == "__main__": query = sys.argv[1] if len(sys.argv) > 1 else "" interactive = "--simple" not in sys.argv + db = PackageDatabase() + matcher = FuzzyMatcher(db) - result = run_suggestions(query, interactive=interactive) + # Refactor run_suggestions to accept db instance or create tui directly + tui = SuggestionTUI(db) + result = tui.run_interactive(query) if interactive else tui.run_simple(query) if result: console.print(f"\n[bold green]✅ Selected:[/bold green] {result['name']}") - preview = FuzzyMatcher(PackageDatabase()).get_install_preview(result["type"], result["id"]) + preview = matcher.get_install_preview(result["type"], result["id"])
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
cortex/database.pycortex/tui.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
cortex/tui.pycortex/database.py
🧬 Code graph analysis (1)
cortex/tui.py (1)
cortex/matcher.py (2)
match(202-352)get_install_preview(354-386)
🔇 Additional comments (2)
cortex/database.py (2)
56-71: LGTM: Proper error handling and encoding.The UTF-8 encoding and JSON error handling with informative messages address previous review concerns effectively.
161-217: Well-structured multi-pass intent detection with proper word boundaries.The multi-word keyword matching correctly uses word boundary checks by splitting queries into words and checking sequential matches. This addresses the previous concern about false substring matches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
test_parallel_llm.py (1)
231-233: Acceptable formatting, though line 232 is slightly long.The call formatting is readable. Line 232 exceeds typical line length limits (~111 characters), but for test code this is acceptable, especially since further splitting might reduce clarity.
examples/parallel_llm_demo.py (1)
25-25: Consider adding type hints to demonstrate best practices.The async functions lack return type annotations. Per the coding guidelines, type hints are required in Python code. Since this is an example file, adding type hints would demonstrate best practices to users.
🔎 Suggested type hints for async functions
-async def demo_multi_package_queries(): +async def demo_multi_package_queries() -> None: """Demonstrate parallel package queries."""-async def demo_parallel_error_diagnosis(): +async def demo_parallel_error_diagnosis() -> None: """Demonstrate parallel error diagnosis."""-async def demo_hardware_config_checks(): +async def demo_hardware_config_checks() -> None: """Demonstrate parallel hardware config checks."""-async def demo_batch_completion(): +async def demo_batch_completion() -> None: """Demonstrate generic batch completion."""-async def demo_sequential_vs_parallel(): +async def demo_sequential_vs_parallel() -> None: """Compare sequential vs parallel performance."""-async def main(): +async def main() -> None: """Run all demos."""As per coding guidelines, type hints are required in Python code.
Also applies to: 55-55, 93-93, 127-127, 184-184, 218-218
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
cortex/llm_router.pyexamples/parallel_llm_demo.pytest_parallel_llm.pytests/test_cli.pytests/test_llm_router.py
✅ Files skipped from review due to trivial changes (2)
- tests/test_llm_router.py
- cortex/llm_router.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
examples/parallel_llm_demo.pytests/test_cli.pytest_parallel_llm.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Maintain >80% test coverage for pull requests
Files:
tests/test_cli.py
🧬 Code graph analysis (3)
examples/parallel_llm_demo.py (2)
cortex/llm_router.py (1)
query_multiple_packages(725-772)cortex/database.py (1)
packages(126-128)
tests/test_cli.py (1)
test/test_cli.py (2)
test_main_install_with_execute(281-285)test_main_install_with_dry_run(289-293)
test_parallel_llm.py (1)
cortex/llm_router.py (4)
query_multiple_packages(725-772)diagnose_errors_parallel(775-821)check_hardware_configs_parallel(824-872)acomplete(441-505)
🔇 Additional comments (4)
test_parallel_llm.py (2)
51-51: LGTM - Good practice removing unnecessary f-strings.Removing f-strings that don't contain any variable interpolation is a best practice, as it avoids unnecessary overhead. The changes are consistent throughout the file.
Also applies to: 98-98, 146-146, 243-243
173-173: LGTM - Single-line formatting improves readability.The conversion to single-line calls for these helper functions is clean and maintains good readability. Line lengths are acceptable and align with the PR's streamlined approach.
Also applies to: 184-184, 195-195
tests/test_cli.py (1)
187-187: LGTM - Test assertions correctly updated for API change.The removal of the
parallelparameter from test assertions correctly reflects the API signature change described in the PR summary. All three test cases are consistently updated to match the newCortexCLI.install()signature.Also applies to: 195-195, 203-203
examples/parallel_llm_demo.py (1)
15-15: LGTM! Clean formatting improvements.The blank line at line 15 properly separates standard library imports from local imports (PEP 8 compliant), and the single-line call at line 39 improves readability while staying well within line length limits.
Also applies to: 39-39
| print( | ||
| f" Time saved: {elapsed_seq - elapsed_par:.2f}s ({((elapsed_seq - elapsed_par)/elapsed_seq*100):.1f}%)" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add defensive guard for division by zero.
The percentage calculation divides by elapsed_seq without checking for zero, unlike line 211 which guards against elapsed_par == 0. While unlikely in practice due to the asyncio.sleep() calls, adding a guard would be more defensive.
Additionally, the complex nested expression could be extracted to a variable for improved readability.
🔎 Proposed fix with guard and improved readability
speedup = elapsed_seq / elapsed_par if elapsed_par > 0 else 1.0
print(f"\n⚡ Speedup: {speedup:.2f}x")
+ if elapsed_seq > 0:
+ time_saved = elapsed_seq - elapsed_par
+ percent_saved = (time_saved / elapsed_seq) * 100
+ print(f" Time saved: {time_saved:.2f}s ({percent_saved:.1f}%)")
+ else:
+ print(" Time saved: 0.00s (0.0%)")
- print(
- f" Time saved: {elapsed_seq - elapsed_par:.2f}s ({((elapsed_seq - elapsed_par)/elapsed_seq*100):.1f}%)"
- )Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In examples/parallel_llm_demo.py around lines 213 to 215, the percentage
calculation divides by elapsed_seq without guarding against zero and uses a
nested expression that hurts readability; update the print to first compute a
percentage variable with a guard (e.g., if elapsed_seq == 0 then set percentage
to 0.0 or "N/A"), then format that variable in the f-string so you avoid
division-by-zero and improve readability.
|



Summary
Adds instant, offline smart suggestions when users type queries like
cortex dockerorcortex "python ml". No LLM calls needed - works entirely offline with a curated database of 500+ package stacks.Features
Demo
$ cortex docker
→ docker-dev Docker optimized for local development
docker-python Docker environment for Python development
docker-nodejs Docker environment for Node.js applications
↓ 7 more below
[↑↓ to select, Enter to install, Tab for details, q to quit]
Files Added/Changed
Testing
Summary by CodeRabbit
New Features
Removed
Changed
Tests / Style
✏️ Tip: You can customize this high-level summary in your review settings.