
Conversation

@JayaSurya-27 (Collaborator) commented May 28, 2025

Pull Request

Description

Describe the changes in this pull request:

  • What feature/bug does this PR address?
  • Provide any relevant links or screenshots.

Checklist

  • Code compiles correctly.
  • Created/updated tests.
  • Linting and formatting applied.
  • Documentation updated.

Related Issues

Closes #<issue_number>

Summary by CodeRabbit

  • New Features

    • Added support for custom evaluation templates and model validation in evaluation tagging.
    • Introduced a new function to fetch custom evaluation templates from a remote service.
  • Improvements

    • Enhanced validation and flexibility for evaluation configurations and mappings.
    • Updated evaluation name handling to use standardized identifiers.
  • Bug Fixes

    • Corrected documentation references for protection methods.
  • Dependency Updates

    • Updated multiple framework packages and dependencies to ensure compatibility.
    • Added a new dependency on the "futureagi" package.
    • Adjusted Python version requirement to exclude Python 3.9.

coderabbitai bot commented May 28, 2025

Walkthrough

This update introduces support for custom evaluation templates and model validation in the fi_instrumentation package. It adds a ModelChoices enum, refactors evaluation name handling, extends the EvalTag dataclass with a model field, and implements stricter config and mapping validation. Multiple framework instrumentation modules are updated to reference the new Protect class. Several package versions and dependencies are incremented.
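
For orientation, a minimal sketch of the shape this gives EvalTag (only eval_name, model, config, and mapping are named in this PR; the enum members and defaults below are assumptions, not copied from fi_types.py):

from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional, Union


class EvalName(str, Enum):
    # built-in evaluation identifiers; values are snake_case after this PR
    CONVERSATION_COHERENCE = "conversation_coherence"


class ModelChoices(str, Enum):
    # hypothetical member for illustration; the real members live in fi_types.py
    turing_large = "turing_large"


@dataclass
class EvalTag:
    eval_name: Union[EvalName, str]        # a plain str names a custom eval template
    model: Optional[ModelChoices] = None   # new field introduced by this PR
    config: Dict[str, Any] = field(default_factory=dict)
    mapping: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # validation summarized in the walkthrough: custom names are resolved via
        # get_custom_eval_template(), standard names must be EvalName members, and
        # config/mapping/model are checked against the resolved template.
        ...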

Changes

File(s) and change summaries:

  • python/fi_instrumentation/fi_types.py: Adds ModelChoices enum, refactors EvalName, extends EvalTag with model field and validation, supports custom eval templates, removes deprecated evals.
  • python/fi_instrumentation/settings.py: Adds get_custom_eval_template for fetching custom eval templates from a remote service.
  • python/fi_instrumentation/otel.py: Ensures eval_name is always a string in EvalTag before processing; minor import and newline changes.
  • python/fi_instrumentation/instrumentation/_protect_wrapper.py: Updates docstring to reference Protect.protect instead of ProtectClient.protect.
  • python/frameworks/*/pyproject.toml: Bumps package and fi-instrumentation-otel dependency versions across all frameworks.
  • python/pyproject.toml: Increments version, excludes Python 3.9, adds futureagi dependency.
  • python/frameworks/*/traceai_*/__init__.py: Updates instrumentation to reference Protect.protect instead of ProtectClient.protect.
  • python/frameworks/langchain/examples/tool_calling_agent.py: Updates to use Protect and Evaluator instead of ProtectClient and EvalClient.
  • python/frameworks/anthropic/examples/guardrail_example.py: Updates to use Protect and Evaluator instead of ProtectClient and EvalClient.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant EvalTag
    participant Settings
    participant RemoteService

    User->>EvalTag: Instantiate with eval_name, model, config, mapping
    EvalTag->>EvalTag: __post_init__ (validation)
    alt Custom eval_name
        EvalTag->>Settings: get_custom_eval_template(eval_name)
        Settings->>RemoteService: POST /custom-eval-template
        RemoteService-->>Settings: Template JSON
        Settings-->>EvalTag: Template
        EvalTag->>EvalTag: Validate config/mapping against template
    else Standard eval_name
        EvalTag->>EvalTag: Validate eval_name and model against enums
        EvalTag->>EvalTag: Validate config/mapping
    end
    EvalTag-->>User: Instance ready or error raised
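
Continuing the sketch after the walkthrough above, the two branches of the diagram roughly correspond to these calls (all names and mapping keys are placeholders, not values from the codebase):

# Standard template: eval_name is an EvalName member, validated locally against the enums
standard_tag = EvalTag(
    eval_name=EvalName.CONVERSATION_COHERENCE,
    model=ModelChoices.turing_large,          # hypothetical member
    mapping={"output": "llm.output"},         # placeholder mapping
)

# Custom template: eval_name is a plain string, so __post_init__ calls
# get_custom_eval_template() and validates config/mapping against the returned template
custom_tag = EvalTag(
    eval_name="my_team_custom_eval",          # placeholder custom template name
    mapping={"input": "llm.input"},
)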

Possibly related PRs

  • future-agi/traceAI#25: Implements the same enhancements in fi_types.py, including ModelChoices, custom eval support, and validation refactoring.
  • future-agi/traceAI#27: Contains identical changes to fi_types.py for custom eval templates, model validation, and enum updates.
  • future-agi/traceAI#26: Directly overlaps with this PR, modifying fi_types.py to add model support and custom eval validation.

Poem

In the warren of code, new models hop in,
Custom evals now join, let the testing begin!
With enums and checks, our logic’s robust,
Protect hops forward, in frameworks we trust.
Version bumps leap, dependencies align,
This rabbit’s quite proud—our code’s looking fine! 🐇✨

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@NVJKKartik merged commit 250cb84 into main on May 28, 2025 (1 check was pending).
@coderabbitai bot left a comment

Actionable comments posted: 1

🔭 Outside diff range comments (3)
python/frameworks/langchain/traceai_langchain/__init__.py (1)

62-67: 🛠️ Refactor suggestion

Consider adding protect method restoration in uninstrumentation.

The _uninstrument method doesn't restore the original protect method like it does for other wrapped methods. This could leave the protection method wrapped after uninstrumentation.

Add protection method restoration:

 def _uninstrument(self, **kwargs: Any) -> None:
     langchain_core.callbacks.BaseCallbackManager.__init__ = self._original_callback_manager_init  # type: ignore
     self._original_callback_manager_init = None  # type: ignore
+    if hasattr(self, '_original_protect') and self._original_protect is not None:
+        from fi.evals import Protect
+        Protect.protect = self._original_protect
+        self._original_protect = None
     self._tracer = None
     self.fi_tracer = None
python/frameworks/instructor/traceai_instructor/__init__.py (1)

56-66: 🛠️ Refactor suggestion

Missing protect method restoration in uninstrumentation.

Similar to the langchain instrumentor, the _uninstrument method doesn't restore the original protect method. This could cause issues if the instrumentation is removed and then re-applied.

Add the missing uninstrumentation for the protect method:

             patch_module.handle_response_model = self._original_handle_response_model  # type: ignore[attr-defined]
             self._original_handle_response_model = None
 
+        if hasattr(self, '_original_protect') and self._original_protect is not None:
+            from fi.evals import Protect
+            Protect.protect = self._original_protect
+            self._original_protect = None
python/frameworks/groq/traceai_groq/__init__.py (1)

65-73: 🛠️ Refactor suggestion

Missing protect method uninstrumentation.

The _uninstrument method handles the groq-specific methods but doesn't restore the original protect method like it should.

Add the missing protect method restoration:

             groq_module.AsyncCompletions.create = (
                 self._original_async_completions_create
             )
+        
+        if hasattr(self, '_original_protect') and self._original_protect is not None:
+            from fi.evals import Protect
+            Protect.protect = self._original_protect
+            self._original_protect = None
🧹 Nitpick comments (3)
python/fi_instrumentation/fi_types.py (3)

900-903: Remove unnecessary f-string prefix.

The f-string at line 902 has no placeholders and should be a regular string.

-                f"eval_name is required"
+                "eval_name is required"
🧰 Tools
🪛 Ruff (0.11.9)

902-902: f-string without any placeholders

Remove extraneous f prefix

(F541)


984-984: Remove unnecessary f-string prefix.

The f-string has no placeholders and should be a regular string.

-                f"eval_name must be an Present."
+                "eval_name must be Present."

Also, fix the grammatical error: "must be Present" → "must be present".

🧰 Tools
🪛 Ruff (0.11.9)

984-984: f-string without any placeholders

Remove extraneous f prefix

(F541)


987-992: Simplify nested if statements.

Combine the nested if statements for better readability.

-        if not is_custom_eval:
-            if not isinstance(self.eval_name, EvalName):
-                raise ValueError(
-                    f"eval_name must be an EvalName enum, got {type(self.eval_name)}"
-                )
+        if not is_custom_eval and not isinstance(self.eval_name, EvalName):
+            raise ValueError(
+                f"eval_name must be an EvalName enum, got {type(self.eval_name)}"
+            )
🧰 Tools
🪛 Ruff (0.11.9)

987-988: Use a single if statement instead of nested if statements

Combine if statements using and

(SIM102)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe6a211 and 936c2cc.

📒 Files selected for processing (34)
  • python/fi_instrumentation/fi_types.py (7 hunks)
  • python/fi_instrumentation/instrumentation/_protect_wrapper.py (1 hunks)
  • python/fi_instrumentation/otel.py (3 hunks)
  • python/fi_instrumentation/settings.py (3 hunks)
  • python/frameworks/anthropic/examples/guardrail_example.py (2 hunks)
  • python/frameworks/anthropic/pyproject.toml (2 hunks)
  • python/frameworks/anthropic/traceai_anthropic/__init__.py (3 hunks)
  • python/frameworks/autogen/pyproject.toml (2 hunks)
  • python/frameworks/bedrock/pyproject.toml (2 hunks)
  • python/frameworks/bedrock/traceai_bedrock/__init__.py (2 hunks)
  • python/frameworks/crewai/pyproject.toml (2 hunks)
  • python/frameworks/dspy/pyproject.toml (2 hunks)
  • python/frameworks/groq/pyproject.toml (2 hunks)
  • python/frameworks/groq/traceai_groq/__init__.py (2 hunks)
  • python/frameworks/guardrails/pyproject.toml (2 hunks)
  • python/frameworks/haystack/pyproject.toml (2 hunks)
  • python/frameworks/haystack/traceai_haystack/__init__.py (2 hunks)
  • python/frameworks/instructor/pyproject.toml (2 hunks)
  • python/frameworks/instructor/traceai_instructor/__init__.py (2 hunks)
  • python/frameworks/langchain/examples/tool_calling_agent.py (2 hunks)
  • python/frameworks/langchain/pyproject.toml (2 hunks)
  • python/frameworks/langchain/traceai_langchain/__init__.py (2 hunks)
  • python/frameworks/litellm/pyproject.toml (2 hunks)
  • python/frameworks/litellm/traceai_litellm/__init__.py (2 hunks)
  • python/frameworks/llama_index/pyproject.toml (2 hunks)
  • python/frameworks/llama_index/traceai_llamaindex/__init__.py (2 hunks)
  • python/frameworks/mistralai/pyproject.toml (2 hunks)
  • python/frameworks/openai-agents/pyproject.toml (2 hunks)
  • python/frameworks/openai/pyproject.toml (2 hunks)
  • python/frameworks/openai/traceai_openai/__init__.py (2 hunks)
  • python/frameworks/smolagents/pyproject.toml (2 hunks)
  • python/frameworks/vertexai/pyproject.toml (2 hunks)
  • python/frameworks/vertexai/traceai_vertexai/__init__.py (2 hunks)
  • python/pyproject.toml (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
python/fi_instrumentation/otel.py (1)
python/fi_instrumentation/fi_types.py (1)
  • EvalName (434-489)
python/frameworks/anthropic/traceai_anthropic/__init__.py (1)
python/fi_instrumentation/instrumentation/_protect_wrapper.py (1)
  • GuardrailProtectWrapper (71-164)
🪛 Ruff (0.11.9)
python/fi_instrumentation/settings.py

136-136: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

python/fi_instrumentation/fi_types.py

902-902: f-string without any placeholders

Remove extraneous f prefix

(F541)


921-923: Use a single if statement instead of nested if statements

(SIM102)


984-984: f-string without any placeholders

Remove extraneous f prefix

(F541)


987-988: Use a single if statement instead of nested if statements

Combine if statements using and

(SIM102)

🔇 Additional comments (67)
python/fi_instrumentation/settings.py (1)

6-6: LGTM! Import additions support the new functionality.

The added imports Any for type hints and requests for HTTP functionality are appropriately used in the new get_custom_eval_template function.

Also applies to: 9-9

python/frameworks/openai-agents/pyproject.toml (1)

3-3: LGTM! Version updates are consistent and well-coordinated.

The package version increment and dependency update to fi-instrumentation-otel 0.1.6 align with the broader ecosystem updates mentioned in the PR summary. This ensures compatibility with the enhanced evaluation template and model validation features.

Also applies to: 15-15

python/frameworks/langchain/pyproject.toml (1)

3-3: LGTM! Coordinated version updates maintain ecosystem consistency.

The version bump and dependency update are properly aligned with the other framework packages in this PR, ensuring compatibility with the new Protect class and enhanced evaluation features.

Also applies to: 14-14

python/frameworks/anthropic/pyproject.toml (1)

3-3: LGTM! Final framework package successfully updated.

This completes the coordinated version updates across all framework packages, ensuring consistent dependency versions and compatibility with the updated instrumentation library that supports custom evaluation templates and the new Protect class.

Also applies to: 15-15

python/frameworks/dspy/pyproject.toml (2)

3-3: Approve version bump for traceAI-DSPy.
The package version has been updated from 0.1.5 to 0.1.6, matching the coordinated release plan.


15-15:

✅ Verification successful

Verify updated fi-instrumentation-otel dependency.
The fi-instrumentation-otel dependency was bumped to 0.1.6. Please confirm that this version is published on PyPI and resolves correctly in CI.
You can run:


🏁 Script executed:

#!/bin/bash
# Check if fi-instrumentation-otel 0.1.6 exists on PyPI
curl -s https://pypi.org/pypi/fi-instrumentation-otel/json | jq '.releases | has("0.1.6")'

Length of output: 97


fi-instrumentation-otel 0.1.6 is published on PyPI
Verified via curl https://pypi.org/pypi/fi-instrumentation-otel/json | jq '.releases | has("0.1.6")' returning true. The dependency bump should resolve correctly in CI.

python/frameworks/guardrails/pyproject.toml (2)

3-3: Approve version bump for traceAI-guardrails.
The package version has been updated from 0.1.2 to 0.1.3, consistent with the rest of the framework releases.


16-16:

✅ Verification successful

Verify updated fi-instrumentation-otel dependency.
The fi-instrumentation-otel dependency was updated to 0.1.6. Ensure that this version is available and passes dependency resolution.


🏁 Script executed:

#!/bin/bash
# Check if fi-instrumentation-otel 0.1.6 exists on PyPI
curl -s https://pypi.org/pypi/fi-instrumentation-otel/json | jq '.releases | has("0.1.6")'

Length of output: 97


fi-instrumentation-otel 0.1.6 is available
Confirmed that version 0.1.6 of fi-instrumentation-otel is published on PyPI and will resolve correctly.

python/frameworks/smolagents/pyproject.toml (2)

3-3: Approve version bump for traceAI-smolagents.
The package version has been updated from 0.1.2 to 0.1.3, aligning with the multi-framework release.


15-15:

✅ Verification successful

Verify updated fi-instrumentation-otel dependency.
The fi-instrumentation-otel version was bumped to 0.1.6. Please confirm its publication and installability.


🏁 Script executed:

#!/bin/bash
# Check if fi-instrumentation-otel 0.1.6 exists on PyPI
curl -s https://pypi.org/pypi/fi-instrumentation-otel/json | jq '.releases | has("0.1.6")'

Length of output: 97


fi-instrumentation-otel 0.1.6 verified on PyPI
The fi-instrumentation-otel package version 0.1.6 is published and installable:

curl -s https://pypi.org/pypi/fi-instrumentation-otel/json | jq '.releases | has("0.1.6")'
# => true

No further changes needed.

  • python/frameworks/smolagents/pyproject.toml (line 15): fi-instrumentation-otel = "0.1.6"
python/frameworks/litellm/pyproject.toml (2)

3-3: Approve version bump for traceAI-litellm.
The package version has been updated from 0.1.5 to 0.1.6, consistent with the overall release.


15-15:

✅ Verification successful

Verify updated fi-instrumentation-otel dependency.
The fi-instrumentation-otel dependency was updated to 0.1.6. Ensure this version is released and resolves without issues.


🏁 Script executed:

#!/bin/bash
# Check if fi-instrumentation-otel 0.1.6 exists on PyPI
curl -s https://pypi.org/pypi/fi-instrumentation-otel/json | jq '.releases | has("0.1.6")'

Length of output: 97


fi-instrumentation-otel 0.1.6 release verified
The specified version exists on PyPI (.releases contains “0.1.6”), so the dependency should resolve without issues.

python/frameworks/groq/pyproject.toml (2)

3-3: Approve version bump for traceAI-groq.
The package version has been updated from 0.1.5 to 0.1.6, in line with other framework updates.


15-15:

❓ Verification inconclusive

Verify updated fi-instrumentation-otel dependency.
The fi-instrumentation-otel dependency was bumped to 0.1.6. Please verify that this version is published and integrates successfully.


🏁 Script executed:

#!/bin/bash
# Check if fi-instrumentation-otel 0.1.6 exists on PyPI
curl -s https://pypi.org/pypi/fi-instrumentation-otel/json | jq '.releases | has("0.1.6")'

Length of output: 97


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Install the specific version of fi-instrumentation-otel
pip install fi-instrumentation-otel==0.1.6

# Verify that it can be imported and its version matches
python - << 'EOF'
import fi_instrumentation_otel as fio
version = getattr(fio, "__version__", None) or fio.__dict__.get("VERSION", None)
if version != "0.1.6":
    echo "Unexpected version: $version" >&2
    exit 1
print("Import successful, version:", version)
EOF

Length of output: 12275


Confirm fi-instrumentation-otel 0.1.6 integration
The bump to 0.1.6 is published on PyPI, but our integration check script failed due to a syntax error. Please manually verify that the package installs, imports cleanly, and reports the correct version:

• File: python/frameworks/groq/pyproject.toml (line 15) now reads:
fi-instrumentation-otel = "0.1.6"

Suggested verification steps:

pip install fi-instrumentation-otel==0.1.6
python - <<'EOF'
import fi_instrumentation_otel as fio
print("Imported version:", fio.__version__)
EOF

Ensure the import succeeds and __version__ is “0.1.6”.

python/frameworks/vertexai/pyproject.toml (2)

3-3: Version bump is correct and consistent.
Bumped the traceAI-vertexai package version to 0.1.6 to align with the planned release. Ensure that the corresponding Git tag and release notes (e.g., in CHANGELOG.md) are updated to reflect this new version.


14-14:

❓ Verification inconclusive

Verify availability of the new dependency version.
Updated the fi-instrumentation-otel dependency to 0.1.6. Please confirm that this version has been published and is compatible with the updated instrumentation code.


🏁 Script executed:

#!/bin/bash
# Check PyPI for the latest version of fi-instrumentation-otel
curl -s https://pypi.org/pypi/fi-instrumentation-otel/json | jq '.info.version'

Length of output: 87


fi-instrumentation-otel 0.1.6 is published on PyPI
I’ve confirmed via PyPI that version 0.1.6 exists. Please ensure this release is compatible with your updated instrumentation code by running your integration tests and CI pipelines.

python/frameworks/bedrock/pyproject.toml (2)

3-3: Version bump is correct and consistent.
Upgraded traceAI-bedrock package version to 0.1.6, matching the other framework packages. Remember to update the release tag and changelog accordingly.


15-15:

❓ Verification inconclusive

Verify availability of the new dependency version.
Updated fi-instrumentation-otel to 0.1.6. Double-check that this version is released and fully compatible with your Bedrock instrumentation updates.


🏁 Script executed:

#!/bin/bash
# Check that fi-instrumentation-otel@0.1.6 is available on PyPI
curl -s https://pypi.org/pypi/fi-instrumentation-otel/json | jq '.info.version'

Length of output: 87


fi-instrumentation-otel@0.1.6 confirmed on PyPI
Version 0.1.6 of fi-instrumentation-otel is published (PyPI reports 0.1.6). Please proceed to validate that this release works correctly with your Bedrock instrumentation updates.

python/frameworks/mistralai/pyproject.toml (2)

3-3: Version bump is correct and consistent.
Incremented the traceAI-mistralai package version to 0.1.6 in line with the coordinated release across frameworks. Ensure this is reflected in your release workflow.


13-13:

✅ Verification successful

Verify availability of the new dependency version.
Bumped fi-instrumentation-otel dependency to 0.1.6. Please confirm that this version is published and works with your updated Mistral AI instrumentation.


🏁 Script executed:

#!/bin/bash
# Verify fi-instrumentation-otel version on PyPI
curl -s https://pypi.org/pypi/fi-instrumentation-otel/json | jq '.info.version'

Length of output: 87


Dependency Version Confirmed on PyPI
fi-instrumentation-otel version 0.1.6 is published and available on PyPI. Proceed with this update.

python/frameworks/openai/pyproject.toml (2)

3-3: Version bump is correct and consistent.
Updated traceAI-openai package version to 0.1.6 to match the cross-framework release. Don’t forget to tag and note this in your changelog.


13-13:

❓ Verification inconclusive

Verify availability of the new dependency version.
Changed fi-instrumentation-otel to 0.1.6. Ensure that this version is live on PyPI and compatible with your OpenAI instrumentation changes.


🏁 Script executed:

#!/bin/bash
# Confirm fi-instrumentation-otel@0.1.6 is published
curl -s https://pypi.org/pypi/fi-instrumentation-otel/json | jq '.info.version'

Length of output: 87


Dependency Version Confirmed on PyPI
The package fi-instrumentation-otel version 0.1.6 is published and available. Please verify that this release’s changes are compatible with your OpenAI instrumentation updates and adjust any related tests or configurations as needed.

python/frameworks/instructor/pyproject.toml (2)

3-3: Version bump is correct and consistent.
Bumped the traceAI-instructor package version to 0.1.6. Ensure that your release process (tags, CHANGELOG.md) is updated accordingly.


14-14:

✅ Verification successful

Verify availability of the new dependency version.
Updated fi-instrumentation-otel to 0.1.6. Please verify that this version exists on PyPI and aligns with the new instructor instrumentation.


🏁 Script executed:

#!/bin/bash
# Check fi-instrumentation-otel version on PyPI
curl -s https://pypi.org/pypi/fi-instrumentation-otel/json | jq '.info.version'

Length of output: 87


Dependency version confirmed on PyPI
The fi-instrumentation-otel package at version 0.1.6 is available on PyPI and aligns with the instructor instrumentation. No further action required.

python/frameworks/autogen/pyproject.toml (2)

3-3: Approve version bump for traceAI-autogen.
The package version is correctly incremented from 0.1.2 to 0.1.3 following semver conventions.


15-15: Approve fi-instrumentation-otel dependency update.
Updating fi-instrumentation-otel to 0.1.6 aligns this package with the latest core instrumentation enhancements.

python/frameworks/crewai/pyproject.toml (2)

3-3: Approve version bump for traceAI-crewai.
The version has been safely bumped from 0.1.5 to 0.1.6, matching the coordinated release across frameworks.


16-16: Approve fi-instrumentation-otel dependency update.
Aligning fi-instrumentation-otel to 0.1.6 ensures all framework instrumentations use the latest validation logic.

python/pyproject.toml (3)

3-3: Approve core package version bump.
Increasing fi-instrumentation-otel version to 0.1.6 is correct per the coordinated release.


17-17: Request verification of Python version specifier change.
The Python requirement was changed to ">3.9,<3.13", thereby excluding 3.9 entirely. Please confirm this exclusion is intentional and compatible with downstream frameworks.


28-28: Approve addition of futureagi dependency.
Adding the new futureagi package (^0.6.0) aligns with the expanded feature set. Ensure it is published and available on PyPI.

python/frameworks/haystack/pyproject.toml (2)

3-3: Approve version bump for traceAI-haystack.
Version updated to 0.1.6, consistent with the other framework packages.


15-15: Approve fi-instrumentation-otel dependency update.
Bumping to 0.1.6 keeps this instrumentation module in sync with core enhancements.

python/frameworks/llama_index/pyproject.toml (2)

3-3: Approve version bump for traceAI-llamaindex.
The version has been incremented to 0.1.6 in line with other framework releases.


14-14: Approve fi-instrumentation-otel dependency update.
Updating the dependency to 0.1.6 ensures access to the latest evaluation and model validation features.

python/frameworks/langchain/traceai_langchain/__init__.py (2)

40-40: Import update looks good.

The import change from ProtectClient to Protect aligns with the broader refactor mentioned in the AI summary.


54-54: Method references updated consistently.

The changes correctly update the method references from ProtectClient.protect to Protect.protect for both the original method storage and the wrapping call.

Also applies to: 57-57

python/frameworks/instructor/traceai_instructor/__init__.py (2)

11-11: Import change is consistent with the refactor.

The import update from ProtectClient to Protect follows the same pattern as other framework instrumentors.


50-50: Method references updated correctly.

Both the original method storage and wrapper function call are properly updated to use Protect.protect.

Also applies to: 53-53

python/frameworks/anthropic/traceai_anthropic/__init__.py (3)

45-45: Import update is consistent.

The change from ProtectClient to Protect aligns with the refactor across all framework instrumentors.


92-92: Method wrapping updated correctly.

The protect method references are properly updated to use the new Protect class.

Also applies to: 95-95


101-101: Proper uninstrumentation implemented.

Good practice - the uninstrumentation method correctly imports the Protect class and restores the original protect method, ensuring clean teardown of the instrumentation.

Also applies to: 114-114

python/frameworks/groq/traceai_groq/__init__.py (2)

13-13: Import change follows the refactor pattern.

The update from ProtectClient to Protect is consistent with other framework instrumentors.


59-59: Protection method wrapping updated correctly.

The method references are properly updated to use Protect.protect instead of ProtectClient.protect.

Also applies to: 62-62

python/fi_instrumentation/instrumentation/_protect_wrapper.py (1)

72-72: LGTM! Documentation update aligns with API refactoring.

The docstring has been correctly updated to reflect the change from ProtectClient.protect to Protect.protect, ensuring documentation stays in sync with the codebase refactoring.

python/frameworks/haystack/traceai_haystack/__init__.py (2)

18-18: LGTM! Import updated for ProtectClient → Protect refactoring.

The import has been correctly updated to use the new Protect class instead of ProtectClient.


79-84:

✅ Verification successful

LGTM! Method references updated consistently.

The instrumentation logic has been properly updated to reference Protect.protect instead of ProtectClient.protect. The refactoring maintains the same functionality while using the new class structure.

Let me verify that all ProtectClient references have been updated across the codebase to prevent runtime errors:


🏁 Script executed:

#!/bin/bash
# Description: Search for any remaining ProtectClient references that might have been missed in the refactoring
# Expected: No remaining ProtectClient imports or references

echo "Searching for remaining ProtectClient references..."
rg -i "protectclient" --type py -A 2 -B 2

echo "Searching for old import patterns..."
rg "from fi\.evals import.*ProtectClient" --type py

echo "Searching for old method references..."
rg "ProtectClient\.protect" --type py

Length of output: 418


LGTM! No remaining ProtectClient references detected.

Ripgrep searches for

  • case-insensitive “protectclient”
  • old import patterns (from fi.evals import ProtectClient)
  • method calls (ProtectClient.protect)

all returned no matches, confirming the refactoring is complete.

python/frameworks/litellm/traceai_litellm/__init__.py (2)

44-44: LGTM! Import refactoring implemented consistently.

The import statement has been correctly updated from ProtectClient to Protect, maintaining consistency with the codebase refactoring.


272-277: LGTM! Instrumentation references updated properly.

The instrumentation setup correctly references the new Protect.protect method instead of ProtectClient.protect. The refactoring maintains all existing functionality while using the updated class structure.

python/frameworks/bedrock/traceai_bedrock/__init__.py (2)

51-51: LGTM! Import statement updated for refactoring.

The import has been correctly changed from ProtectClient to Protect, consistent with the framework-wide refactoring effort.


424-429: LGTM! Instrumentation wrapper updated correctly.

The protect method instrumentation has been properly updated to target Protect.protect instead of ProtectClient.protect. The changes maintain existing functionality while using the refactored class structure.

python/fi_instrumentation/otel.py (3)

18-18: LGTM!

The EvalName import is necessary for the enum conversion logic added later in the function.


71-73: Well-implemented enum conversion logic!

This preprocessing step properly handles both EvalName enum instances and string values, ensuring consistent string representation before calling prepare_eval_tags. The type checking prevents unintended conversions.
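
The normalization described here can be pictured roughly as follows (a hedged sketch; the helper name is invented and the actual otel.py lines 71-73 may inline this differently):

from enum import Enum


def _normalize_eval_name(eval_name):
    """Return eval_name as a plain string, whether it is an EvalName member or already a str."""
    if isinstance(eval_name, Enum):
        return eval_name.value
    return eval_name  # already a string, e.g. a custom eval template name

# applied to each tag before prepare_eval_tags(eval_tags) is invoked:
# tag.eval_name = _normalize_eval_name(tag.eval_name)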


654-654: Good formatting improvement!

Adding the trailing newline follows Python file formatting conventions.

python/frameworks/anthropic/examples/guardrail_example.py (2)

7-7: LGTM!

The import update aligns with the class name refactoring from ProtectClient/EvalClient to Protect/Evaluator.


52-53: Consistent class name updates!

The instantiation changes from EvalClient to Evaluator and ProtectClient to Protect are consistent with the import updates and maintain the same API usage.

python/frameworks/llama_index/traceai_llamaindex/__init__.py (3)

11-11: LGTM!

The import update from ProtectClient to Protect aligns with the systematic refactoring across framework instrumentors.


89-89: Consistent method reference update!

The change from ProtectClient.protect to Protect.protect maintains the same instrumentation pattern while using the updated class name.


92-92: Correct wrapper function name update!

The change from ProtectClient.protect to Protect.protect correctly updates the wrapped function name while maintaining the same instrumentation behavior.

python/frameworks/openai/traceai_openai/__init__.py (3)

13-13: LGTM!

The import change from ProtectClient to Protect is consistent with the systematic refactoring across all framework instrumentors.


58-58: Consistent reference update!

The change from ProtectClient.protect to Protect.protect follows the established pattern and maintains the same instrumentation functionality.


61-61: Correct wrapper update!

The change from ProtectClient.protect to Protect.protect correctly updates the wrapped function name while maintaining identical instrumentation behavior.

python/frameworks/langchain/examples/tool_calling_agent.py (1)

24-24: LGTM! Clean refactor to the new API structure.

The migration from ProtectClient/EvalClient to Protect/Evaluator is implemented correctly, with proper dependency injection where Protect now receives the Evaluator instance.

Also applies to: 288-289
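
For readers updating their own scripts, the migration amounts to wiring along these lines (a sketch only; the exact constructor signatures of Evaluator and Protect are assumptions and should be checked against the futureagi documentation):

from fi.evals import Evaluator, Protect  # formerly EvalClient and ProtectClient

evaluator = Evaluator()         # argument-free construction is an assumption
protector = Protect(evaluator)  # Protect now receives the Evaluator instance
# protector.protect(...)        # same protect() entry point, now on Protect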

python/frameworks/vertexai/traceai_vertexai/__init__.py (1)

12-12: LGTM! Instrumentation correctly updated for the new Protect API.

The wrapper registration is properly updated to reference Protect.protect instead of ProtectClient.protect, maintaining consistency with the API refactor.

Also applies to: 49-54

python/fi_instrumentation/fi_types.py (3)

418-424: LGTM! Well-defined model choices enum.

The ModelChoices enum provides a clean way to specify available models for evaluations.
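
As a hedged illustration of how such an enum constrains model names (the members below are invented; see fi_types.py lines 418-424 for the real ones):

from enum import Enum


class ModelChoices(str, Enum):
    turing_large = "turing_large"   # invented member
    turing_small = "turing_small"   # invented member


def coerce_model(model) -> ModelChoices:
    """Accept a ModelChoices member or its raw string value, rejecting unknown names."""
    if isinstance(model, ModelChoices):
        return model
    try:
        return ModelChoices(model)
    except ValueError as exc:
        valid = [m.value for m in ModelChoices]
        raise ValueError(f"model must be one of {valid}, got {model!r}") from exc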


910-941: Well-structured validation logic for custom eval templates.

The implementation correctly handles both system and custom evaluation templates, with appropriate validation for model choices and configuration fields. The distinction between custom and system evals is clearly maintained throughout the validation process.

🧰 Tools
🪛 Ruff (0.11.9)

921-923: Use a single if statement instead of nested if statements

(SIM102)
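
A hedged sketch of the branching described in the comment above on lines 910-941 (get_custom_eval_template comes from this PR; the helper name, control flow, and template shape are assumptions):

# assumes EvalName, ModelChoices and get_custom_eval_template are importable from fi_instrumentation
def _validate_eval_tag(tag) -> None:
    """Illustrative split between custom (string) and system (enum) evaluation names."""
    is_custom_eval = isinstance(tag.eval_name, str) and not isinstance(tag.eval_name, EvalName)

    if is_custom_eval:
        # custom template: fetch its definition remotely and check config keys against it
        template = get_custom_eval_template(tag.eval_name)
        missing = set(template.get("config", {})) - set(tag.config)
        if missing:
            raise ValueError(f"config is missing required keys: {sorted(missing)}")
    else:
        # system template: eval_name and model must be valid enum members
        if not isinstance(tag.eval_name, EvalName):
            raise ValueError(f"eval_name must be an EvalName enum, got {type(tag.eval_name)}")
        if tag.model is not None and not isinstance(tag.model, ModelChoices):
            raise ValueError(f"model must be a ModelChoices enum, got {type(tag.model)}")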


434-490:

❓ Verification inconclusive

Verify the impact of the breaking change from UPPER_CASE to snake_case enum values.

This change converts all EvalName enum values from UPPER_CASE to snake_case, which is a breaking change that could affect existing code referencing these values.

Run the following script to check for any remaining references to the old UPPER_CASE enum values:


🏁 Script executed:

#!/bin/bash
# Description: Search for references to old UPPER_CASE EvalName enum values

# Search for patterns like EvalName.CONVERSATION_COHERENCE (old style)
rg -A 2 'EvalName\.[A-Z_]+' --glob '*.py'

# Also search for string references to the old values
rg '"CONVERSATION_COHERENCE"|"CONTENT_MODERATION"|"CONTEXT_ADHERENCE"|"PROMPT_PERPLEXITY"' --glob '*.py'

Length of output: 44106


Enum .value migration: internal impact verified

  • A repo-wide search found no literal references to the old UPPER_CASE strings (e.g. "CONVERSATION_COHERENCE"), and all example code uses EvalName.FOO members (attribute names unchanged).
  • The breaking change affects only the enum’s .value strings (now snake_case instead of UPPER_CASE), as illustrated in the sketch after this list.
  • Examples and internal mappings remain valid; no code changes are required in this PR.
  • Please review any external clients, configuration files, scripts, or documentation that read or compare against the enum’s .value and update them to snake_case.
  • Consider adding a changelog entry (and version bump) to highlight this migration.
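
To make the .value migration concrete (the member name comes from the search above; the before/after value strings are the presumed forms):

from enum import Enum


class EvalName(str, Enum):
    # before this PR the value mirrored the member name ("CONVERSATION_COHERENCE");
    # after this PR the member name is unchanged but the value is snake_case:
    CONVERSATION_COHERENCE = "conversation_coherence"

# attribute-style references keep working unchanged:
coherence = EvalName.CONVERSATION_COHERENCE
# external config or scripts that stored the raw value must move to snake_case:
assert coherence.value == "conversation_coherence"  # no longer "CONVERSATION_COHERENCE"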

Comment on lines +106 to +136
def get_custom_eval_template(
    eval_name: str, base_url: Optional[str] = None
) -> Dict[str, Any]:
    """
    Check if a custom eval template exists for a given eval name.
    """
    if not eval_name:
        raise ValueError("Eval name is required")

    if base_url is None:
        base_url = get_env_collector_endpoint()

    url = f"{base_url}/tracer/custom-eval-config/get_custom_eval_by_name/"

    try:
        headers = {
            "Content-Type" : "application/json",
            **(get_env_fi_auth_header() or {}),
        }

        response = requests.post(
            url,
            headers=headers,
            json={"eval_template_name": eval_name},
        )

        response.raise_for_status()
        return response.json().get("result", {})
    except Exception as e:
        raise ValueError(f"Failed to check custom eval template: {e}")

🛠️ Refactor suggestion

⚠️ Potential issue

Improve exception handling and add response validation.

The new function has several areas for improvement:

  1. Exception chaining: The static analysis correctly identifies that exception handling should preserve the original exception context.
  2. Response validation: The function assumes the response contains a "result" key without validation.
  3. Broad exception handling: Catching all exceptions and converting to ValueError may hide important debugging information.

Apply these improvements:

def get_custom_eval_template(
-   eval_name: str, base_url: Optional[str] = None
+    eval_name: str, base_url: Optional[str] = None
) -> Dict[str, Any]:
    """
-    Check if a custom eval template exists for a given eval name.
+    Fetch a custom evaluation template by name from the remote service.
+    
+    Args:
+        eval_name: The name of the evaluation template to fetch
+        base_url: Optional base URL, defaults to environment collector endpoint
+        
+    Returns:
+        Dictionary containing the custom evaluation template data
+        
+    Raises:
+        ValueError: If eval_name is empty or request fails
+        requests.RequestException: If HTTP request fails
    """
    if not eval_name:
        raise ValueError("Eval name is required")
    
    if base_url is None:
        base_url = get_env_collector_endpoint()

    url = f"{base_url}/tracer/custom-eval-config/get_custom_eval_by_name/"

    try:
        headers = {
-            "Content-Type" : "application/json",
+            "Content-Type": "application/json",
            **(get_env_fi_auth_header() or {}),
        }

        response = requests.post(
            url,
            headers=headers,
            json={"eval_template_name": eval_name},
        )

        response.raise_for_status()
-        return response.json().get("result", {})
+        response_data = response.json()
+        if "result" not in response_data:
+            raise ValueError("Invalid response format: missing 'result' key")
+        return response_data["result"]
-    except Exception as e:
-        raise ValueError(f"Failed to check custom eval template: {e}")
+    except requests.RequestException as e:
+        raise ValueError(f"Failed to fetch custom eval template: {e}") from e
+    except (KeyError, ValueError) as e:
+        if "Invalid response format" in str(e):
+            raise
+        raise ValueError(f"Failed to parse custom eval template response: {e}") from e
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def get_custom_eval_template(
    eval_name: str, base_url: Optional[str] = None
) -> Dict[str, Any]:
    """
    Check if a custom eval template exists for a given eval name.
    """
    if not eval_name:
        raise ValueError("Eval name is required")

    if base_url is None:
        base_url = get_env_collector_endpoint()

    url = f"{base_url}/tracer/custom-eval-config/get_custom_eval_by_name/"

    try:
        headers = {
            "Content-Type" : "application/json",
            **(get_env_fi_auth_header() or {}),
        }

        response = requests.post(
            url,
            headers=headers,
            json={"eval_template_name": eval_name},
        )

        response.raise_for_status()
        return response.json().get("result", {})
    except Exception as e:
        raise ValueError(f"Failed to check custom eval template: {e}")

def get_custom_eval_template(
    eval_name: str, base_url: Optional[str] = None
) -> Dict[str, Any]:
    """
    Fetch a custom evaluation template by name from the remote service.

    Args:
        eval_name: The name of the evaluation template to fetch
        base_url: Optional base URL, defaults to environment collector endpoint

    Returns:
        Dictionary containing the custom evaluation template data

    Raises:
        ValueError: If eval_name is empty or request fails
        requests.RequestException: If HTTP request fails
    """
    if not eval_name:
        raise ValueError("Eval name is required")

    if base_url is None:
        base_url = get_env_collector_endpoint()

    url = f"{base_url}/tracer/custom-eval-config/get_custom_eval_by_name/"

    try:
        headers = {
            "Content-Type": "application/json",
            **(get_env_fi_auth_header() or {}),
        }

        response = requests.post(
            url,
            headers=headers,
            json={"eval_template_name": eval_name},
        )

        response.raise_for_status()
        response_data = response.json()
        if "result" not in response_data:
            raise ValueError("Invalid response format: missing 'result' key")
        return response_data["result"]
    except requests.RequestException as e:
        raise ValueError(f"Failed to fetch custom eval template: {e}") from e
    except (KeyError, ValueError) as e:
        if "Invalid response format" in str(e):
            raise
        raise ValueError(f"Failed to parse custom eval template response: {e}") from e
🧰 Tools
🪛 Ruff (0.11.9)

136-136: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
In python/fi_instrumentation/settings.py around lines 106 to 136, improve
exception handling by using exception chaining with 'raise ... from e' to
preserve the original context. Add validation to check if the response JSON
contains the "result" key before accessing it, and raise a clear error if
missing. Replace the broad except clause with more specific exceptions like
requests.RequestException and JSONDecodeError to avoid hiding other errors.
