fix: Proper refresh of Groq models #12158

Merged
erichare merged 21 commits into main from fix-groq-component
Mar 13, 2026

Conversation

@erichare
Collaborator

@erichare erichare commented Mar 11, 2026

This pull request enhances the Groq model discovery process by improving how models are filtered and tested for their capabilities. The main updates involve more accurate exclusion of non-LLM models, explicit testing for chat completion support, and clearer error handling when checking tool calling features.

Model filtering and capability detection improvements:

  • Expanded the SKIP_PATTERNS list in GroqModelDiscovery to exclude additional non-LLM models such as speech-related models (orpheus, playai).
  • Updated the model testing logic in get_models to first check whether each model supports chat completions before testing for tool calling, ensuring that models incapable of chat are filtered out early.
  • Added a new _test_chat_completion method to explicitly verify if a model supports basic chat completions, improving detection of non-chat models.
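
The ordering described above (skip by pattern, then probe chat, then probe tool calling) might be sketched as follows. This is a minimal illustration, not the PR's code: `classify_models`, the probe callables, and the model IDs are hypothetical, and only the `orpheus` and `playai` patterns are taken from the description.

```python
from typing import Callable

# Only these two patterns are named in the PR description; the real list is longer.
SKIP_PATTERNS = ["orpheus", "playai"]


def classify_models(
    model_ids: list[str],
    supports_chat: Callable[[str], bool],
    supports_tools: Callable[[str], bool],
) -> dict[str, dict]:
    """Skip by name pattern, then probe chat, and only then probe tool calling."""
    metadata: dict[str, dict] = {}
    for model_id in model_ids:
        if any(pattern in model_id.lower() for pattern in SKIP_PATTERNS):
            metadata[model_id] = {"not_supported": True}
            continue
        if not supports_chat(model_id):
            # Models that cannot chat are filtered out before the tool probe runs.
            metadata[model_id] = {"not_supported": True}
            continue
        metadata[model_id] = {"tool_calling": supports_tools(model_id)}
    return metadata


# Hypothetical model IDs and stub probes, for illustration only.
result = classify_models(
    ["llama-x", "playai-tts", "embed-only"],
    supports_chat=lambda model_id: model_id != "embed-only",
    supports_tools=lambda model_id: True,
)
```

With these stubs, only `llama-x` reaches the tool-calling probe; the other two are marked unsupported earlier.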

Error handling enhancements:

  • Refined error handling in _test_tool_calling to log specific issues, distinguish between missing Groq package and tool support errors, and conservatively handle other API errors.

Test updates:

  • Modified the unit test for mixed tool calling support to reflect the new call order and logic, ensuring both chat and tool capabilities are properly tested for each model.

Summary by CodeRabbit

  • Improvements
    • Enhanced model discovery to validate chat capability before testing tool calling functionality
    • Improved error handling and logging during model capability validation
    • Expanded model filtering to exclude additional models from LLM selection

@erichare erichare linked an issue Mar 11, 2026 that may be closed by this pull request
@github-actions github-actions Bot added the bug Something isn't working label Mar 11, 2026
@github-actions github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Mar 11, 2026
@github-actions
Contributor

github-actions Bot commented Mar 11, 2026

Frontend Unit Test Coverage Report

Coverage Summary

| Lines | Statements | Branches | Functions |
| --- | --- | --- | --- |
| Coverage: 24% | 24.14% (8609/35660) | 16.89% (4743/28067) | 16.87% (1261/7473) |

Unit Test Results

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 2776 | 0 💤 | 0 ❌ | 0 🔥 | 45.848s ⏱️ |

@langflow-ai langflow-ai deleted a comment from coderabbitai Bot Mar 11, 2026
@coderabbitai
Contributor

coderabbitai Bot commented Mar 11, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 23bf6ab8-1878-41ec-a040-ad816b02784d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The Groq model discovery process was enhanced to validate chat completion capability before testing tool-calling support, introducing a new validation method and refining error handling logic across both the implementation and corresponding test suite.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Model Discovery Logic: src/lfx/src/lfx/base/models/groq_model_discovery.py | Added new _test_chat_completion() method to validate chat capability. Updated get_models() flow to test chat completion first; models without chat support are marked as non-LLM and skip further testing. Expanded SKIP_PATTERNS with additional models. Refined _test_tool_calling() error handling to consistently return False on errors. Updated logging and docstrings to reflect new testing sequence. |
| Test Updates: src/backend/tests/unit/groq/test_groq_model_discovery.py | Rewrote mock tool-calling flow in test_mixed_tool_calling_support to model a sequence of three successful calls (chat llama, tool llama, chat gemma) followed by a single failure on the fourth call (tool gemma). |
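
The four-call sequence described for the test can be modelled with `unittest.mock`'s `side_effect` list, where an exception instance in the list is raised instead of returned. The sketch below is an assumption about the shape of the mock, not the actual test; `create`, `probe`, and the error text are hypothetical.

```python
from unittest.mock import MagicMock

# Hypothetical stand-in for the patched client's completion call.
create = MagicMock(
    side_effect=[
        MagicMock(),  # call 1: chat probe, llama
        MagicMock(),  # call 2: tool probe, llama
        MagicMock(),  # call 3: chat probe, gemma
        ValueError("tool calling not supported"),  # call 4: tool probe, gemma
    ]
)


def probe(**kwargs) -> bool:
    """Return True when the mocked call succeeds, False when it raises."""
    try:
        create(**kwargs)
    except ValueError:
        return False
    return True


results = [
    probe(model="llama", kind="chat"),
    probe(model="llama", kind="tool"),
    probe(model="gemma", kind="chat"),
    probe(model="gemma", kind="tool"),
]
```

Because the side effects are consumed in order, the test depends on the chat probe running before the tool probe for each model.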

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Test Coverage For New Implementations | ❌ Error | PR adds new _test_chat_completion method but only modifies one existing test without adding dedicated unit tests for the new functionality. | Add test cases for _test_chat_completion method, refined error handling in _test_tool_calling, and verification of expanded SKIP_PATTERNS functionality. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Quality And Coverage | ⚠️ Warning | New _test_chat_completion method lacks dedicated unit tests for its error handling logic (ImportError, API errors, fallback behavior). | Add unit tests for _test_chat_completion covering: successful completion, ImportError, unsupported chat completions errors, and transient failures. Verify non-chat models are filtered from LLM list. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix: Proper refresh of Groq models' accurately reflects the main purpose of the PR: fixing and improving the Groq model discovery and refresh logic, including model filtering, capability detection, and error handling. |
| Test File Naming And Structure | ✅ Passed | The test file follows correct pytest naming patterns and demonstrates proper structure with clear test function names that describe the test purpose. |
| Excessive Mock Usage Warning | ✅ Passed | Test uses 2 @patch decorators appropriately for isolating external Groq API dependencies. Mock configurations reasonably simulate API behavior to test model filtering logic. Assertions verify actual outcomes, not just mock interactions. |

@codecov

codecov Bot commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 10.41667% with 43 lines in your changes missing coverage. Please review.
✅ Project coverage is 38.38%. Comparing base (4511419) to head (56fb02c).
⚠️ Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...rc/lfx/src/lfx/base/models/groq_model_discovery.py | 10.41% | 43 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (10.41%) is below the target coverage (40.00%). You can increase the patch coverage or adjust the target coverage.
❌ Your project status has failed because the head coverage (44.27%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main   #12158      +/-   ##
==========================================
+ Coverage   38.36%   38.38%   +0.01%     
==========================================
  Files        1630     1630              
  Lines       80250    80289      +39     
  Branches    12114    12120       +6     
==========================================
+ Hits        30791    30818      +27     
- Misses      47723    47735      +12     
  Partials     1736     1736              
| Flag | Coverage Δ |
| --- | --- |
| backend | 57.24% <ø> (+0.10%) ⬆️ |
| frontend | 21.56% <ø> (ø) |
| lfx | 44.27% <10.41%> (-0.06%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
| --- | --- |
| ...rc/lfx/src/lfx/base/models/groq_model_discovery.py | 15.27% <10.41%> (-2.82%) ⬇️ |

... and 10 files with indirect coverage changes

@github-actions github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Mar 11, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/lfx/src/lfx/base/models/groq_model_discovery.py (1)

157-173: Consider whether "terms_required" should be treated differently from "does not support chat".

Models that return terms_required or model_terms_required errors do support chat completions—they just require the user to accept terms first. Currently, these models get added to non_llm_models with not_supported=True, which is semantically incorrect.

If this is intentional (treating unusable-for-this-user as unsupported), a brief comment clarifying the rationale would help. Otherwise, consider handling terms-required models differently—perhaps keeping them in models_metadata but with a separate flag like requires_terms_acceptance=True.
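
One way to keep "requires terms acceptance" separate from "does not support chat" is a small result type. The sketch below is illustrative only: `ChatSupport`, `classify_chat_error`, and the `"does not support chat"` phrase are hypothetical, and only the `terms_required` and `model_terms_required` strings come from the review comment.

```python
from enum import Enum
from typing import Optional


class ChatSupport(Enum):
    """Hypothetical result type distinguishing the three outcomes."""

    SUPPORTED = "supported"
    REQUIRES_TERMS = "requires_terms_acceptance"
    UNSUPPORTED = "unsupported"


# Error codes named in the review comment.
TERMS_PHRASES = ("terms_required", "model_terms_required")
# Assumed wording for a genuine "no chat support" error.
UNSUPPORTED_PHRASES = ("does not support chat",)


def classify_chat_error(error_msg: str) -> Optional[ChatSupport]:
    """Map an API error message to a support state; None means indeterminate."""
    lowered = error_msg.lower()
    if any(phrase in lowered for phrase in TERMS_PHRASES):
        # The model can chat; this account just has not accepted its terms.
        return ChatSupport.REQUIRES_TERMS
    if any(phrase in lowered for phrase in UNSUPPORTED_PHRASES):
        return ChatSupport.UNSUPPORTED
    return None
```

A caller could then keep `REQUIRES_TERMS` models in `models_metadata` with a `requires_terms_acceptance=True` flag rather than moving them to `non_llm_models`.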

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lfx/src/lfx/base/models/groq_model_discovery.py` around lines 157 - 173,
In the except block in groq_model_discovery (the Exception as e handler), stop
treating "terms_required" and "model_terms_required" as equivalent to "does not
support chat": remove those phrases from the unsupported list, and instead
detect them explicitly and return True (chat supported) while recording that the
model requires user terms acceptance—e.g., set a requires_terms_acceptance=True
flag on the model metadata (models_metadata) or return a tuple/structured result
so callers (and non_llm_models processing) can distinguish "supported but
requires terms" from truly unsupported models; if the original behavior was
intentional, add a short clarifying comment in that handler referencing
non_llm_models and the rationale.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a5d8fbe6-3c90-4d15-970e-890701732eea

📥 Commits

Reviewing files that changed from the base of the PR and between 911bc9d and a141f37.

📒 Files selected for processing (2)
  • src/backend/tests/unit/groq/test_groq_model_discovery.py
  • src/lfx/src/lfx/base/models/groq_model_discovery.py

Contributor

Copilot AI left a comment

Pull request overview

This PR improves Groq model discovery by filtering out non-LLM models more accurately and validating model capabilities in a safer order (chat support before tool-calling support), with corresponding unit test updates.

Changes:

  • Expanded SKIP_PATTERNS to exclude additional non-LLM/speech model families.
  • Added an explicit chat-completions capability probe and updated discovery flow to run it before tool-calling checks.
  • Refined error handling/logging in tool-calling capability detection and updated the mixed-support unit test to match the new call order.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/lfx/src/lfx/base/models/groq_model_discovery.py | Updates filtering and capability detection; adds _test_chat_completion; adjusts error handling. |
| src/backend/tests/unit/groq/test_groq_model_discovery.py | Updates the mixed tool-calling support test to reflect the new capability-check sequence. |

Comment thread src/lfx/src/lfx/base/models/groq_model_discovery.py Outdated
Comment thread src/lfx/src/lfx/base/models/groq_model_discovery.py Outdated
Comment thread src/lfx/src/lfx/base/models/groq_model_discovery.py Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

Comment thread src/lfx/src/lfx/base/models/groq_model_discovery.py
Comment thread src/lfx/src/lfx/base/models/groq_model_discovery.py Outdated
Comment thread src/lfx/src/lfx/base/models/groq_model_discovery.py Outdated
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@github-actions github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Mar 12, 2026
erichare and others added 2 commits March 12, 2026 10:18
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@github-actions github-actions Bot removed the bug Something isn't working label Mar 12, 2026
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@github-actions github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Mar 12, 2026
@erichare erichare requested a review from HimavarshaVS March 12, 2026 21:01
@Cristhianzl
Member

Summary

This PR improves the Groq model discovery process by adding:

  • Chat completion validation before testing tool calling
  • Better error handling with tri-state return (True/False/None)
  • New skip patterns for non-LLM models (orpheus, playai)
  • Fallback when get_models returns an empty list (ids = ids or GROQ_MODELS)
  • Comprehensive unit tests

Results by Category

| Category | Status | Notes |
| --- | --- | --- |
| 🔴 PII in Logs | ✅ OK | No personal data in logs |
| 🔴 Security | ✅ OK | No hardcoded secrets, API key via parameter |
| 🔴 DRY | ⚠️ VIOLATION | Access error detection logic duplicated |
| 🔴 File Limits | VIOLATION | Test file has 718 lines (limit: ~500) |
| 🟠 Single Responsibility | ✅ OK | Cohesive class with single responsibility |
| 🟠 Error Handling | ⚠️ PARTIAL | Inconsistent ImportError handling |
| 🟠 Code Quality | ⚠️ PARTIAL | except Exception broad catch |
| 🟡 Observability | ✅ OK | Structured logging at key points |
| 🟢 Tests | ⚠️ PARTIAL | Good coverage but file exceeds limit |

🔴 CRITICAL

1. DRY - Duplicated error detection logic

File: src/lfx/src/lfx/base/models/groq_model_discovery.py

The access/entitlement error detection block is identically duplicated in _test_chat_completion (lines 173-181) and _test_tool_calling (lines 238-246):

# Appears TWICE - identical
if any(
    phrase in error_msg
    for phrase in [
        "terms acceptance",
        "terms_required",
        "model_terms_required",
        "not available",
    ]
):

Recommendation: Extract to a class constant or helper method:

ACCESS_ERROR_PHRASES = ["terms acceptance", "terms_required", "model_terms_required", "not available"]

def _is_access_error(self, error_msg: str) -> bool:
    return any(phrase in error_msg for phrase in self.ACCESS_ERROR_PHRASES)
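
A runnable version of this recommendation, with both probes delegating to the shared helper, might look like the sketch below. The class name and the two wrapper methods are hypothetical; the phrase list is copied from the duplicated block quoted above, and lower-casing inside the helper is an added assumption.

```python
class ProbeErrorClassifier:
    """Hypothetical holder for the shared check; not the PR's actual class."""

    # Phrases copied from the duplicated block quoted in the review.
    ACCESS_ERROR_PHRASES = (
        "terms acceptance",
        "terms_required",
        "model_terms_required",
        "not available",
    )

    def _is_access_error(self, error_msg: str) -> bool:
        # Lower-casing here is an assumption, so callers need not normalize first.
        return any(phrase in error_msg.lower() for phrase in self.ACCESS_ERROR_PHRASES)

    def chat_error_is_access_issue(self, exc: Exception) -> bool:
        """Stand-in for the check inside the chat probe."""
        return self._is_access_error(str(exc))

    def tool_error_is_access_issue(self, exc: Exception) -> bool:
        """Stand-in for the check inside the tool-calling probe."""
        return self._is_access_error(str(exc))


classifier = ProbeErrorClassifier()
```

Both call sites now share a single phrase list, so adding a new access-error phrase is a one-line change.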

2. File Limit - Test file with 718 lines

File: src/backend/tests/unit/groq/test_groq_model_discovery.py (718 lines)

The hard limit is ~500 lines (up to ~530 OK; 600+ is a red flag). 718 lines is a clear violation.

Recommendation: Split the test file into smaller modules:

  • test_groq_discovery_success.py: TestGroqModelDiscoverySuccess
  • test_groq_discovery_errors.py: TestGroqModelDiscoveryErrors
  • test_groq_chat_completion.py: TestChatCompletionDetection
  • test_groq_discovery_edge_cases.py: TestGroqModelDiscoveryEdgeCases
  • test_groq_convenience_function.py: TestGetGroqModelsConvenienceFunction

🟠 IMPORTANT

3. Inconsistent ImportError handling

File: groq_model_discovery.py

  • _test_chat_completion (line 162): re-raises ImportError (with raise)
  • _test_tool_calling (line 229): returns False on ImportError

This inconsistency can cause unexpected behavior. If the groq package is not installed:

  • Chat test will propagate the exception to get_models
  • Tool test will never be reached

get_models catches ImportError in its general except block (line 124), so it works, but the intent is confusing.

Recommendation: Unify the behavior. Both should either re-raise or return the same value. Since ImportError means no test can be performed, re-raising in both makes more sense.


4. except Exception broad catch (BLE001)

File: groq_model_discovery.py, lines 166 and 232

Both _test_chat_completion and _test_tool_calling use except Exception with # noqa: BLE001. While this works and has justification (the Groq API can throw any type of exception), it violates the principle of strong error typing.

Recommendation: If it's not possible to enumerate specific Groq API exceptions, at least document why the broad catch is necessary with an inline comment explaining the trade-off.


5. logger.exception with redundant f-string

File: groq_model_discovery.py, line 125

logger.exception(f"Error discovering models: {e}")

logger.exception() already includes the full exception traceback. The {e} in the f-string is redundant. Same pattern in groq.py lines 105 and 118.

Recommendation:

logger.exception("Error discovering models")
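
The redundancy can be seen with the standard library's `logging` module, used here as a stand-in because the project's own logger may behave differently. The logger name and the raised error are hypothetical.

```python
import io
import logging

# Capture log output in memory so it can be inspected.
stream = io.StringIO()
logger = logging.getLogger("groq_discovery_demo")  # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.ERROR)
logger.propagate = False

try:
    raise RuntimeError("boom")
except RuntimeError:
    # No f-string needed: the active exception and traceback are appended automatically.
    logger.exception("Error discovering models")

output = stream.getvalue()
```

The captured output contains the message, the traceback, and the final `RuntimeError: boom` line, so interpolating `{e}` into the message only repeats the exception text.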

6. _test_tool_calling returns False instead of None for ImportError

File: groq_model_discovery.py, line 231

except ImportError:
    logger.warning("groq package not installed, cannot test tool calling")
    return False

False means "the model definitely does NOT support tools", but ImportError means we couldn't test. Following the tri-state logic established in this PR, it should return None (indeterminate).
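
A tri-state probe along these lines could be sketched as below. This is an assumption about the intended semantics rather than the merged implementation: `probe_tool_calling`, the helper functions, and the error wording are hypothetical.

```python
from typing import Callable, Optional


def probe_tool_calling(call: Callable[[], object]) -> Optional[bool]:
    """Return True (supported), False (definitely unsupported), or None (could not tell)."""
    try:
        call()
    except ImportError:
        # The client library is missing, so nothing was actually tested.
        return None
    except Exception as exc:  # noqa: BLE001 - the API may raise arbitrary error types
        message = str(exc).lower()
        # Assumed wording for an explicit "no tool support" error.
        if "tool" in message and "not supported" in message:
            return False
        return None
    return True


def _missing_package() -> None:
    raise ImportError("No module named 'groq'")


def _no_tools() -> None:
    raise ValueError("Tool calling is not supported for this model")


outcomes = [
    probe_tool_calling(lambda: "ok"),
    probe_tool_calling(_no_tools),
    probe_tool_calling(_missing_package),
]
```

Callers can then treat `None` as "keep the previous or default value" instead of recording the model as lacking tool support.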


🟡 RECOMMENDED

7. base_url not passed in original _test_tool_calling (FIXED in this PR)

The PR correctly added base_url=self.base_url when creating the Groq client in _test_tool_calling (it was previously missing). Good fix.

8. Defensive fallback in groq.py

File: src/lfx/src/lfx/components/groq/groq.py, line 120

ids = ids or GROQ_MODELS

Good defensive addition for the case where get_models returns an empty list.
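
The fallback relies on an empty list being falsy in Python. A minimal sketch, with a placeholder `GROQ_MODELS` list and a hypothetical `resolve_model_ids` wrapper:

```python
from typing import Optional

# Placeholder defaults; the real GROQ_MODELS constant lives in the project.
GROQ_MODELS = ["default-model-a", "default-model-b"]


def resolve_model_ids(discovered: Optional[list[str]]) -> list[str]:
    """Use discovered IDs when there are any, else fall back to the static list."""
    ids = discovered
    # An empty list (or None) is falsy, so `or` selects the fallback.
    ids = ids or GROQ_MODELS
    return ids
```

Here `resolve_model_ids([])` returns the static defaults, while a non-empty discovery result is passed through unchanged.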


🟢 TESTS

Strengths

  • ✅ Good coverage of both happy path AND adversarial tests
  • ✅ Tests error scenarios: rate limit, terms required, import error, corrupted cache
  • ✅ Tests edge cases: expired cache, empty list, nonexistent directory
  • ✅ Tests the convenience function get_groq_models()
  • ✅ Well-organized fixtures in conftest.py
  • ✅ Arrange-Act-Assert structure

Areas of Concern

  • Test file has 718 lines, exceeding the ~500-line limit
  • ⚠️ Coverage was not run and shown — must run and present results
  • ⚠️ Missing adversarial tests for:
    • Model with empty ID ("")
    • Model with special characters in ID
    • _get_provider_name with empty string
    • _save_cache with metadata containing non-serializable types
    • get_models when _fetch_available_models returns duplicate models
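
Two of the missing cases might be covered along the following lines. Everything here is a hypothetical stand-in: `save_cache` and `dedupe` are not the project's functions, and the real `_save_cache` may raise instead of returning `False`.

```python
import json
import tempfile
from pathlib import Path


def save_cache(path: Path, metadata: dict) -> bool:
    """Hypothetical cache writer: returns False instead of raising on bad input."""
    try:
        payload = json.dumps(metadata)
    except TypeError:
        # json.dumps raises TypeError for non-serializable values such as sets.
        return False
    path.write_text(payload)
    return True


def dedupe(ids: list[str]) -> list[str]:
    """Hypothetical helper: drop duplicate IDs while keeping first-seen order."""
    return list(dict.fromkeys(ids))


def test_save_cache_rejects_non_serializable() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "cache.json"
        assert save_cache(target, {"model": {"tags": {1, 2}}}) is False
        assert not target.exists()


def test_dedupe_keeps_first_occurrence() -> None:
    assert dedupe(["a", "b", "a", ""]) == ["a", "b", ""]
```

Both tests follow the Arrange-Act-Assert shape noted above and run under pytest without fixtures.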

Pre-Submission Checklist

CRITICAL (Blockers)
[x] No PII in any logs, prints, or webhook messages
[x] No secrets/credentials in code
[ ] No duplicate types, classes, or logic (DRY) — VIOLATION: duplicated access error logic
[x] No file exceeds ~500 lines — SOURCE OK (343 lines)
[ ] TEST FILE exceeds ~500 lines — VIOLATION: 718 lines
[x] No file has more than 5 functions with DIFFERENT responsibilities
[x] No file has more than 10 functions even with same responsibility
[x] No file has more than 1 main class
[x] No mixed responsibility prefixes in same file

IMPORTANT (Must fix)
[x] Each file/function has single responsibility
[ ] Proper error handling — PARTIAL: inconsistent ImportError between methods
[x] Strong typing (no any/object types)
[x] Inputs validated at boundaries
[x] Types in dedicated types file (N/A)
[x] Constants in dedicated constants file

RECOMMENDED (Should fix)
[x] Appropriate logging at key points
[ ] No unnecessary comments — logger.exception with redundant f-string
[x] No over-engineering

TESTS
[x] Unit tests for core logic
[x] Tests cover success, error, and edge cases
[x] Tests have BOTH happy path AND adversarial tests
[ ] Adversarial tests included — PARTIAL: missing extreme edge cases
[ ] Coverage ran and shown — NOT EXECUTED
[x] All created tests pass — assuming yes (CI)
[ ] Not prolonging legacy bad patterns — inconsistent ImportError

Verdict

| Item | Score |
| --- | --- |
| Security & PII | ✅ 10/10 |
| DRY | ⚠️ 6/10 |
| File Structure | ⚠️ 6/10 |
| Architecture | ✅ 9/10 |
| Error Handling | ⚠️ 7/10 |
| Tests | ⚠️ 7/10 |
| TOTAL | ~75/100 |

@github-actions github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Mar 13, 2026
Member

@Cristhianzl Cristhianzl left a comment

lgtm

@github-actions github-actions Bot added the lgtm This PR has been approved by a maintainer label Mar 13, 2026
@erichare erichare added this pull request to the merge queue Mar 13, 2026
Merged via the queue into main with commit aea0796 Mar 13, 2026
98 of 100 checks passed
@erichare erichare deleted the fix-groq-component branch March 13, 2026 18:33
Labels

bug Something isn't working lgtm This PR has been approved by a maintainer

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Groq Component : error to fetch models LLM

3 participants