15 changes: 15 additions & 0 deletions README.md
@@ -203,6 +203,21 @@ poetry run python3 -m pytest tests -v

Latest verification (2025-10-24): 144 passed, 10 skipped, 32 warnings. Warnings originate from third-party dependencies (`langchain` pydantic shim deprecations and `datetime.utcnow` usage). Track upstream fixes or pin patched releases as needed.

### Running CI Checks Locally

Before opening a pull request, you can run the same checks locally that are executed in CI.

#### Requirements
- Python **3.10 or higher**
- [Poetry](https://python-poetry.org/) installed

#### Setup
Install dependencies (including dev tools):

```bash
poetry install --with dev
```
Comment on lines +206 to +219

⚠️ Potential issue | 🔴 Critical

Critical: Section is incomplete - missing CI check commands.

This section addresses issue #3 but is incomplete. According to the acceptance criteria, you need to document how to run:

  • poetry check - verify Poetry config
  • poetry run ruff check . - run linting
  • poetry run ruff format --check . - verify formatting
  • poetry run pytest - run tests

Currently, only the setup (poetry install --with dev) is shown. Additionally, the AI summary mentions this content "appears twice in the diff" - please verify there's no duplication in the README.

Apply this diff to complete the section:

 #### Setup
 Install dependencies (including dev tools):
 
 ```bash
 poetry install --with dev
 ```
+
+#### Verify Your Changes
+
+Run these commands before opening a pull request:
+
+```bash
+# Verify pyproject.toml and lock file consistency
+poetry check
+
+# Lint code with ruff
+poetry run ruff check .
+
+# Verify code formatting
+poetry run ruff format --check .
+
+# Run all tests
+poetry run pytest
+```
+
+If formatting issues are detected, you can auto-fix them:
+
+```bash
+poetry run ruff format .
+```


<details>
<summary>🤖 Prompt for AI Agents</summary>

In README.md around lines 206 to 219, the "Running CI Checks Locally" section
only shows setup but is missing the CI verification commands; add a "Verify Your
Changes" subsection immediately after the existing "poetry install --with dev"
block that lists the required commands (poetry check, poetry run ruff check .,
poetry run ruff format --check ., poetry run pytest) and mention how to auto-fix
formatting with poetry run ruff format ., and also scan the README for any
duplicate copies of this section and remove the duplicate so the instructions
appear only once.


</details>

<!-- This is an auto-generated comment by CodeRabbit -->


## Contributing

We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for:
30 changes: 27 additions & 3 deletions src/agentunit/adapters/base.py
@@ -32,14 +32,38 @@ class BaseAdapter(abc.ABC):

     @abc.abstractmethod
     def prepare(self) -> None:
-        """Perform any lazy setup (loading graphs, flows, etc.)."""
+        """
+        Perform any lazy setup required before execution.
+
+        This may include loading graphs, flows, or other resources.
+
+        Returns:
+            None
+        """

     @abc.abstractmethod
     def execute(self, case: DatasetCase, trace: TraceLog) -> AdapterOutcome:
-        """Run the agent flow on a single dataset case."""
+        """
+        Run the agent flow on a single dataset case.
+
+        Args:
+            case (DatasetCase): The dataset case to be processed.
+            trace (TraceLog): Trace log used to record execution details.
+
+        Returns:
+            AdapterOutcome: The outcome produced by executing the adapter.
+        """

     def cleanup(self) -> None:  # pragma: no cover - default no-op
-        """Hook for cleaning up resources such as temporary files or servers."""
+        """
+        Clean up resources after execution.
+
+        This hook can be used to release resources such as temporary files
+        or running servers.
+
+        Returns:
+            None
+        """

     def supports_replay(self) -> bool:
         return True
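
To illustrate the contract these docstrings describe, here is a minimal sketch of a concrete adapter driven through `prepare`, `execute`, and `cleanup`. The `DatasetCase`, `TraceLog`, and `AdapterOutcome` classes below are simplified stand-ins written only for this example (their real fields are not shown in this diff), and `EchoAdapter` and `run_case` are hypothetical names.

```python
from __future__ import annotations

import abc
from dataclasses import dataclass, field
from typing import Any


# Stand-in types: assumptions for illustration, not AgentUnit's real definitions.
@dataclass
class DatasetCase:
    query: str


@dataclass
class TraceLog:
    events: list[dict[str, Any]] = field(default_factory=list)


@dataclass
class AdapterOutcome:
    success: bool
    output: str


class BaseAdapter(abc.ABC):
    """Mirrors the abstract interface shown in the hunk above."""

    @abc.abstractmethod
    def prepare(self) -> None: ...

    @abc.abstractmethod
    def execute(self, case: DatasetCase, trace: TraceLog) -> AdapterOutcome: ...

    def cleanup(self) -> None:
        """Default no-op hook."""

    def supports_replay(self) -> bool:
        return True


class EchoAdapter(BaseAdapter):
    """Hypothetical adapter that echoes the query back as its output."""

    def __init__(self) -> None:
        self.ready = False

    def prepare(self) -> None:
        # A real adapter would load graphs, flows, or other resources here.
        self.ready = True

    def execute(self, case: DatasetCase, trace: TraceLog) -> AdapterOutcome:
        # Record what was sent so the run can be inspected later.
        trace.events.append({"type": "prompt", "payload": {"query": case.query}})
        return AdapterOutcome(success=True, output=case.query)

    def cleanup(self) -> None:
        self.ready = False


def run_case(adapter: BaseAdapter, case: DatasetCase) -> tuple[AdapterOutcome, TraceLog]:
    """Drive one case through prepare, execute, and cleanup."""
    trace = TraceLog()
    adapter.prepare()
    try:
        outcome = adapter.execute(case, trace)
    finally:
        # cleanup runs even if execute raises.
        adapter.cleanup()
    return outcome, trace
```

Calling `run_case(EchoAdapter(), DatasetCase(query="hi"))` returns the outcome together with the trace that `execute` populated. How the actual runner sequences these hooks may differ.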
4 changes: 3 additions & 1 deletion src/agentunit/core/__init__.py
@@ -1,4 +1,6 @@
-"""Core components for AgentUnit."""
+"""
+Core components for AgentUnit.
+"""

 from agentunit.datasets.base import DatasetCase, DatasetSource
 from agentunit.reporting.results import ScenarioResult
16 changes: 12 additions & 4 deletions src/agentunit/core/exceptions.py
Original file line number Diff line number Diff line change
@@ -1,15 +1,23 @@
"""Custom exceptions for AgentUnit."""
"""
Custom exceptions for AgentUnit.
"""

from __future__ import annotations


class AgentUnitError(Exception):
"""Base class for AgentUnit exceptions."""
"""
Base class for AgentUnit exceptions.
"""


class AdapterNotAvailableError(AgentUnitError):
"""Raised when an adapter cannot be initialized due to missing dependencies."""
"""
Raised when an adapter cannot be initialized due to missing dependencies.
"""


class ScenarioExecutionError(AgentUnitError):
"""Raised when a scenario fails during execution."""
"""
Raised when a scenario fails during execution.
"""
8 changes: 6 additions & 2 deletions src/agentunit/core/replay.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
"""Replay utilities leveraging stored traces."""
"""
Replay utilities leveraging stored traces.
"""

from __future__ import annotations

Expand All @@ -8,7 +10,9 @@


def load_traces(traces_dir: str | Path) -> list[TraceLog]:
"""Load stored traces from disk for deterministic replay or analysis."""
"""
Load stored traces from disk for deterministic replay or analysis.
"""

path = Path(traces_dir)
logs: list[TraceLog] = []
4 changes: 3 additions & 1 deletion src/agentunit/core/runner.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
"""Scenario runner orchestration."""
"""
Scenario runner orchestration.
"""

from __future__ import annotations

16 changes: 12 additions & 4 deletions src/agentunit/core/scenario.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
"""Scenario definition API exposed to end users."""
"""
Scenario definition API exposed to end users.
"""

from __future__ import annotations

Expand All @@ -19,7 +21,9 @@

@dataclass(slots=True)
class Scenario:
"""Defines a reproducible agent evaluation scenario."""
"""
Defines a reproducible agent evaluation scenario.
"""

name: str
adapter: BaseAdapter
Expand Down Expand Up @@ -75,7 +79,9 @@ def from_crewai(
name: str | None = None,
**options: object,
) -> Scenario:
"""Create scenario from CrewAI crew."""
"""
Create scenario from CrewAI crew.
"""
from agentunit.adapters.crewai import CrewAIAdapter

adapter = CrewAIAdapter.from_crew(crew, **options)
Expand All @@ -91,7 +97,9 @@ def from_autogen(
name: str | None = None,
**options: object,
) -> Scenario:
"""Create scenario from AutoGen orchestrator."""
"""
Create scenario from AutoGen orchestrator.
"""
from agentunit.adapters.autogen import AutoGenAdapter

adapter = AutoGenAdapter(orchestrator=orchestrator, **options)
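
Both factory methods import their adapter inside the function body, so the optional framework is only needed when that factory is actually called. The sketch below shows one way such a lazy import might pair with `AdapterNotAvailableError`. This pairing is an assumption: the diff does not show where that exception is raised, and `load_optional` is a hypothetical helper, not part of AgentUnit.

```python
from __future__ import annotations

import importlib
from types import ModuleType


class AgentUnitError(Exception):
    """Base class, as declared in core/exceptions.py."""


class AdapterNotAvailableError(AgentUnitError):
    """Raised when an adapter cannot be initialized due to missing dependencies."""


def load_optional(module_name: str) -> ModuleType:
    """Import a module on demand and translate a missing dependency.

    Hypothetical helper: the import happens at call time, so a package
    that is not installed only causes an error for callers who need it.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise AdapterNotAvailableError(
            f"optional dependency {module_name!r} is not installed"
        ) from exc
```

Callers can then catch `AgentUnitError` to handle any framework-specific failure uniformly, or `AdapterNotAvailableError` to skip scenarios whose framework is absent.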
12 changes: 9 additions & 3 deletions src/agentunit/core/trace.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
"""Tracing utilities shared between adapters and the runner."""
"""
Tracing utilities shared between adapters and the runner.
"""

from __future__ import annotations

Expand All @@ -11,7 +13,9 @@

@dataclass(slots=True)
class TraceEvent:
"""Represents a single prompt, tool call, or response in an agent run."""
"""
Represents a single prompt, tool call, or response in an agent run.
"""

type: str
payload: dict[str, Any]
Expand All @@ -20,7 +24,9 @@ class TraceEvent:

@dataclass(slots=True)
class TraceLog:
"""A collection of chronological events for a scenario iteration."""
"""
A collection of chronological events for a scenario iteration.
"""

events: list[TraceEvent] = field(default_factory=list)
