feat: implement pytest plugin for AgentUnit scenario discovery (resolves #22) #38

Closed
sshekhar563 wants to merge 10 commits into aviralgarg05:main from sshekhar563:feature/pytest-plugin-scenario-discovery

@sshekhar563 sshekhar563 commented Dec 15, 2025

🎯 Overview #22

This PR implements a comprehensive pytest plugin that enables automatic discovery and execution of AgentUnit scenarios as pytest tests, resolving issue #22.

✨ Features Added

🔍 Core Plugin Functionality

  • Automatic scenario discovery from tests/eval/ directory
  • Python file support with Scenario objects and scenario_* functions
  • Config file support for YAML/JSON (with nocode module integration)
  • Pytest markers (@pytest.mark.agentunit, @pytest.mark.scenario)
  • Native test execution using AgentUnit's run_suite function
  • Robust error handling for failed scenario loading
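
The Python-file discovery described in the list above could look roughly like the following. This is a minimal sketch, not the plugin's actual code: the `Scenario` dataclass is a hypothetical stand-in for AgentUnit's class, and `load_module` and `discover_scenarios` are illustrative names. It assumes that `scenario_*` functions take no arguments and return a scenario object.

```python
import importlib.util
from dataclasses import dataclass
from pathlib import Path
from typing import Any, List


@dataclass
class Scenario:
    """Hypothetical stand-in for AgentUnit's Scenario class."""

    name: str


def load_module(path: Path) -> Any:
    """Import a Python file by path without requiring it to be on sys.path."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def discover_scenarios(path: Path, scenario_type: type = Scenario) -> List[Any]:
    """Collect module-level scenario objects and results of scenario_* factories."""
    module = load_module(path)
    found = []
    # Snapshot the namespace so factory calls cannot change it mid-iteration.
    for attr_name, value in list(vars(module).items()):
        if isinstance(value, scenario_type):
            found.append(value)
        elif attr_name.startswith("scenario_") and callable(value):
            result = value()
            if isinstance(result, scenario_type):
                found.append(result)
    return found
```

A real implementation would also need to wrap the import in error handling so that one broken scenario file is reported as a failed collection rather than aborting the whole run.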

🛠️ CLI Tool

  • agentunit-init-eval command for quick setup
  • Generates example scenario files with correct API usage
  • Supports custom directory names and example creation

📁 Files Added

  • src/agentunit/pytest/plugin.py - Main plugin implementation
  • src/agentunit/pytest/cli.py - CLI setup command
  • src/agentunit/pytest/__init__.py - Package initialization
  • tests/test_pytest_plugin.py - Comprehensive test suite (6 tests)
  • tests/eval/example_scenarios.py - Working example scenarios
  • docs/pytest-plugin.md - Complete documentation

⚙️ Configuration

  • Added pytest entry point in pyproject.toml
  • Plugin auto-registers when AgentUnit is installed
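
pytest loads installed plugins through the `pytest11` entry-point group, which is why no explicit `-p` flag or `conftest.py` import is needed. One way to confirm that the registration took effect after installation is to list that group. The helper name below is illustrative, and the entry-point name the PR registered is not shown in this description, so the sketch prints whatever is installed rather than assuming one.

```python
import sys
from importlib import metadata
from typing import Dict


def pytest_plugin_entry_points() -> Dict[str, str]:
    """Map each installed pytest11 entry-point name to its module target."""
    if sys.version_info >= (3, 10):
        eps = metadata.entry_points(group="pytest11")
    else:
        # Older Pythons return a dict keyed by group name.
        eps = metadata.entry_points().get("pytest11", [])
    return {ep.name: ep.value for ep in eps}


if __name__ == "__main__":
    for name, target in sorted(pytest_plugin_entry_points().items()):
        print(f"{name} -> {target}")
```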

🔧 API Corrections

  • ✅ Updated all examples to use adapter parameter instead of deprecated agent
  • ✅ Created SimpleAdapter class for function-based agents
  • ✅ Fixed CLI-generated examples to use proper adapter pattern
  • ✅ Updated documentation with correct API usage

🧪 Testing & Quality

  • 6/6 tests passing with comprehensive coverage
  • ✅ Tests for discovery, execution, success/failure scenarios, and error handling
  • ✅ Code passes ruff formatting and linting
  • ✅ No type checking diagnostics
  • ✅ Proper mock objects for pytest integration testing

📖 Usage Example

1. Initialize evaluation directory:

agentunit-init-eval -d tests/eval -e



## Summary by CodeRabbit

* **New Features**
  * Added a pytest plugin for seamless AgentUnit scenario discovery and execution within the pytest test runner.
  * Introduced a CLI command (`agentunit-init-eval`) to initialize evaluation directories with example scenarios and documentation.
  * Added LangGraph integration tests validating agent scenarios with real framework implementations.

* **Documentation**
  * Added comprehensive pytest plugin guide with installation, usage, and configuration instructions.
  * Added integration test documentation with setup and execution guidance.
  * Updated README with test verification summary and integration test overview.

* **Dependencies**
  * Extended langchain support to <0.4.0.
  * Added optional LangGraph dependency (^0.2.0) for integration testing.

- Create tests/integration/ directory with comprehensive LangGraph tests
- Add simple_langgraph_agent.py with mock LangGraph agent for testing
- Implement test_langgraph_integration.py with full evaluation cycle tests
- Configure pytest markers for integration and langgraph tests
- Add graceful dependency handling - tests skip when LangGraph not installed
- Include comprehensive documentation and CI examples
- Update main README with integration test instructions
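
The "tests skip when LangGraph not installed" behaviour can be sketched with a standard-library availability check. This is an assumption about the approach, not the code in this PR; `has_module` and `LANGGRAPH_AVAILABLE` are illustrative names.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


# Evaluated once at import time of the test module.
LANGGRAPH_AVAILABLE = has_module("langgraph")

# In a test module the flag would typically drive a module-level skip:
#
#     import pytest
#     pytestmark = pytest.mark.skipif(
#         not LANGGRAPH_AVAILABLE, reason="LangGraph is not installed"
#     )
```

pytest also ships `pytest.importorskip("langgraph")`, which imports the module and skips the calling test or module when the import fails.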

Fixes aviralgarg05#24

- Add langgraph as optional dependency in pyproject.toml
- Update installation instructions to use poetry extras
- Ensure LangGraph is properly managed within Poetry environment
- Address coderabbit review feedback about pip vs poetry installation

- Update actions/setup-python from v4 to v5
- Update codecov/codecov-action from v3 to v4
- Address coderabbit security and compatibility recommendations

- Correct indentation for steps sections in both jobs
- Ensure proper YAML syntax for GitHub Actions workflow
- Address coderabbit YAML formatting feedback

- Update langchain version constraint to be compatible with langgraph
- Resolve langchain-core version conflicts between langchain and langgraph
- Update poetry.lock with resolved dependencies
- Fix CI test failures caused by outdated lock file

- Change absolute imports to relative imports in test files
- Add missing tests/__init__.py for proper package structure
- Fix all linting errors (whitespace, formatting, imports)
- Improve code quality in simple_langgraph_agent.py
- Resolve ModuleNotFoundError issues in CI

This addresses the test import failures and ensures proper
Python package structure for the tests directory.

- Add proper __init__ method that calls super() with required name and loader
- Change iter_cases to _generate_cases method that returns a list
- This fixes the TypeError about missing positional arguments
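
The shape of this fix can be illustrated as follows. `BaseSource` is a hypothetical stand-in written for this sketch; AgentUnit's real base class, its argument names beyond `name` and `loader`, and the case format may all differ.

```python
from typing import Callable, Dict, Iterator, List

Case = Dict[str, str]


class BaseSource:
    """Hypothetical stand-in for the dataset base class; not AgentUnit's real API."""

    def __init__(self, name: str, loader: Callable[[], List[Case]]) -> None:
        self.name = name
        self._loader = loader

    def iter_cases(self) -> Iterator[Case]:
        # The base class owns iteration and delegates case creation to the loader.
        return iter(self._loader())


class SimpleSource(BaseSource):
    def __init__(self) -> None:
        # Both arguments are required; calling super().__init__() without them
        # raises "TypeError: ... missing 2 required positional arguments".
        super().__init__(name="simple", loader=self._generate_cases)

    def _generate_cases(self) -> List[Case]:
        # Returns a concrete list instead of overriding iter_cases directly.
        return [{"query": "hello", "expected": "world"}]
```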

Resolves the 8 failing LangGraph integration tests.

- Replace invalid 'response_length' metric with 'faithfulness' metric
- Fix missing newline at end of tests/__init__.py file
- This should resolve the last failing test and linting issues

All LangGraph integration tests should now pass successfully.

- Run ruff format on all integration test files
- Fix formatting for conftest.py, simple_langgraph_agent.py,
  test_integration_basic.py, and test_langgraph_integration.py
- This resolves the 'Run ruff formatter check' CI failure

All linting and formatting checks should now pass.

Resolves aviralgarg05#22

This commit implements a comprehensive pytest plugin that enables automatic
discovery and execution of AgentUnit scenarios as pytest tests.

## Features Added:

### Core Plugin Functionality:
- Automatic scenario discovery from tests/eval/ directory
- Support for Python files with Scenario objects and scenario_* functions
- Support for YAML/JSON config files (with nocode module integration)
- Pytest markers (@pytest.mark.agentunit, @pytest.mark.scenario)
- Proper test execution using AgentUnit's run_suite function
- Comprehensive error handling for failed scenario loading

### CLI Tool:
- agentunit-init-eval command for directory setup
- Generates example scenario files with correct API usage
- Supports custom directory names and example creation

### Files Added:
- src/agentunit/pytest/plugin.py - Main plugin implementation
- src/agentunit/pytest/cli.py - CLI command for setup
- src/agentunit/pytest/__init__.py - Package initialization
- tests/test_pytest_plugin.py - Comprehensive test suite (6 tests)
- tests/eval/example_scenarios.py - Example scenarios
- docs/pytest-plugin.md - Complete documentation

### Configuration:
- Added pytest entry point in pyproject.toml
- Plugin auto-registers when AgentUnit is installed

## API Corrections:
- Updated all examples to use 'adapter' parameter instead of deprecated 'agent'
- Created SimpleAdapter class for function-based agents
- Fixed CLI-generated examples to use proper adapter pattern

## Testing:
- All 6 plugin tests pass
- Comprehensive test coverage for discovery, execution, and error handling
- Code passes ruff formatting and linting
- No type checking diagnostics

## Usage:
1. Install AgentUnit (plugin auto-registers)
2. Run: agentunit-init-eval -d tests/eval -e
3. Create scenario files in tests/eval/
4. Run: pytest tests/eval/
5. Filter with: pytest -m agentunit

The plugin integrates seamlessly with pytest's discovery mechanism and
provides a natural way to run AgentUnit evaluations as part of test suites.
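
Filtering with `pytest -m agentunit` works cleanly only when the marker is registered; otherwise pytest emits an unknown-marker warning, or an error under `--strict-markers`. A plugin usually registers markers from its `pytest_configure` hook through `config.addinivalue_line`. The sketch below shows that call; the function name `register_markers` and the marker descriptions are assumptions, not text taken from this PR.

```python
# Marker descriptions are illustrative wording, not copied from the plugin.
MARKERS = (
    "agentunit: mark a test as an AgentUnit scenario",
    "scenario: mark a test as a scenario evaluation",
)


def register_markers(config) -> None:
    """Register custom markers on a pytest Config object.

    Intended to be called from the plugin's pytest_configure(config) hook.
    """
    for line in MARKERS:
        config.addinivalue_line("markers", line)
```

Once registered, the markers appear in the output of `pytest --markers`.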

continue Bot commented Dec 15, 2025

All Green - Keep your PRs mergeable

All Green is an AI agent that automatically:

✅ Addresses code review comments

✅ Fixes failing CI checks

✅ Resolves merge conflicts

coderabbitai Bot commented Dec 15, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

This PR implements a comprehensive pytest plugin for AgentUnit that enables pytest-based scenario discovery and execution. It adds pytest hooks for collecting AgentUnit scenarios from Python files and YAML/JSON configs in a tests/eval/ directory, CLI tooling for scenario initialization, LangGraph integration tests, and updated documentation.
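
The decision a `pytest_collect_file` hook has to make for each candidate path can be sketched as a pure function. This is an illustration under stated assumptions: the set of suffixes (including `.yml`), the exclusion of `__init__.py`, and matching `tests/eval` at any depth are guesses, and `is_eval_file` is a hypothetical name.

```python
from pathlib import Path
from typing import Tuple

# Assumed suffix set; the plugin may accept a different one.
SCENARIO_SUFFIXES = {".py", ".yaml", ".yml", ".json"}


def is_eval_file(path: Path, eval_dir: Tuple[str, ...] = ("tests", "eval")) -> bool:
    """Return True if path has a scenario suffix and lives under .../tests/eval/."""
    if path.suffix not in SCENARIO_SUFFIXES or path.name == "__init__.py":
        return False
    parts = path.parent.parts
    n = len(eval_dir)
    # Look for the eval_dir components as a contiguous run in the parent path.
    return any(parts[i : i + n] == eval_dir for i in range(len(parts) - n + 1))
```

In pytest 7 and later the hook receives a `pathlib.Path` as `file_path`, so a hook body could return `AgentUnitFile.from_parent(parent, path=file_path)` when this check passes and `None` otherwise.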

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Pytest Plugin Core**<br>`src/agentunit/pytest/__init__.py`, `src/agentunit/pytest/plugin.py`, `src/agentunit/pytest/cli.py` | Introduces AgentUnit pytest plugin with `pytest_configure` and `pytest_collect_file` hooks for discovering scenarios in `tests/eval/`; `AgentUnitFile` and `AgentUnitItem` classes for collection and execution; CLI `init_eval` command to scaffold evaluation directories with optional example scenarios. |
| **Configuration & Dependency Updates**<br>`pyproject.toml` | Extended langchain upper bound to <0.4.0; added langgraph ^0.2.0 as optional dependency; registered `agentunit-init-eval` CLI entry point; declared pytest11 plugin; added pytest configuration with markers (agentunit, scenario, integration, langgraph), testpaths, and addopts. |
| **Documentation**<br>`README.md`, `docs/pytest-plugin.md` | Updated README development workflow text; added integration tests guidance section; added comprehensive pytest plugin documentation covering installation, scenario discovery, Python/config-based scenarios, markers, fixtures, configuration, and CI integration. |
| **Example & Test Scenarios**<br>`tests/eval/__init__.py`, `tests/eval/example_scenarios.py`, `tests/eval/failing_scenario.py`, `tests/test_pytest_plugin.py` | Added eval scenarios package with example adapters and datasets; created example and failing scenarios for plugin demonstration; added plugin test suite exercising scenario discovery, item execution, success/failure paths, and load error handling. |
| **Integration Tests**<br>`tests/integration/__init__.py`, `tests/integration/README.md`, `tests/integration/IMPLEMENTATION_SUMMARY.md`, `tests/integration/conftest.py`, `tests/integration/ci-example.yml`, `tests/integration/simple_langgraph_agent.py`, `tests/integration/test_integration_basic.py`, `tests/integration/test_langgraph_integration.py` | Added comprehensive LangGraph integration test suite covering scenario creation from callables/files, full evaluation cycles, metrics collection, error handling, retries, and clone operations; included CI workflow example, pytest markers configuration, and basic integration tests. |
| **Package Initialization**<br>`tests/__init__.py` | Added tests package marker. |

Sequence Diagram

sequenceDiagram
    actor User
    participant Pytest
    participant PluginHooks
    participant AgentUnitFile
    participant ScenarioLoader
    participant AgentUnitCore
    participant FileSystem

    User->>Pytest: pytest tests/eval/
    Pytest->>PluginHooks: pytest_configure()
    PluginHooks->>Pytest: Register markers
    
    Pytest->>PluginHooks: pytest_collect_file(path)
    PluginHooks->>FileSystem: Check if tests/eval/ dir
    FileSystem-->>PluginHooks: Path valid
    PluginHooks->>AgentUnitFile: Create from_parent()
    
    Pytest->>AgentUnitFile: collect()
    AgentUnitFile->>ScenarioLoader: _discover_scenarios()
    ScenarioLoader->>FileSystem: Read .py/.yaml/.json files
    FileSystem-->>ScenarioLoader: File content
    alt Python File
        ScenarioLoader->>ScenarioLoader: _discover_python_scenarios()
        ScenarioLoader->>ScenarioLoader: _import_module()
    else Config File
        ScenarioLoader->>ScenarioLoader: _discover_config_scenarios()
    end
    ScenarioLoader-->>AgentUnitFile: list[Scenario]
    
    loop For Each Scenario
        AgentUnitFile->>AgentUnitFile: Create AgentUnitItem
    end
    AgentUnitFile-->>Pytest: Generator[AgentUnitItem]
    
    Pytest->>AgentUnitFile: runtest()
    AgentUnitFile->>AgentUnitCore: run_suite(scenario)
    AgentUnitCore-->>AgentUnitFile: RunResult
    alt Success
        AgentUnitFile-->>Pytest: Test passes
    else Failure
        AgentUnitFile->>Pytest: Raise AssertionError with details
    end
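
The failure branch at the end of the diagram, where a failed run becomes an `AssertionError` with details, might be implemented along these lines. `CaseOutcome` is a hypothetical result record invented for this sketch; the structure that `run_suite` actually returns is not shown in this PR description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CaseOutcome:
    """Hypothetical per-case result; AgentUnit's real result type may differ."""

    case_id: str
    success: bool
    error: str = ""


def assert_all_passed(scenario_name: str, outcomes: List[CaseOutcome]) -> None:
    """Raise AssertionError listing every failed case, or return silently."""
    failures = [o for o in outcomes if not o.success]
    if failures:
        details = "\n".join(
            f"  - {o.case_id}: {o.error or 'no error message'}" for o in failures
        )
        raise AssertionError(
            f"Scenario '{scenario_name}' failed "
            f"{len(failures)}/{len(outcomes)} case(s):\n{details}"
        )
```

Raising `AssertionError` from the item's `runtest()` lets pytest report the scenario as an ordinary test failure, with the message shown in the failure summary.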

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • src/agentunit/pytest/plugin.py: Dense implementation with multiple discovery strategies (Python modules, YAML/JSON configs), pytest hook compliance, and error handling paths requiring careful review of scenario loading and execution logic.
  • tests/integration/test_langgraph_integration.py: Comprehensive integration test suite with numerous test scenarios covering clone, metrics, retries, and error handling requiring verification of expected behaviors.
  • pyproject.toml: Multiple configuration sections (dependencies, entry points, pytest markers, testpaths) need validation for consistency and compatibility.
  • src/agentunit/pytest/cli.py: CLI implementation with file I/O and example generation logic.

Possibly related issues

  • Create pytest plugin for scenario discovery #22: Directly addressed by this PR, which implements pytest_collect_file, pytest_configure, the AgentUnitFile and AgentUnitItem classes, and tests/eval directory discovery.

Possibly related PRs

  • Add LangGraph integration tests (#24) #36: Both PRs implement the same LangGraph integration test suite, CI workflow examples, pytest configuration, and README updates, indicating overlapping or related work.

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a55eacf and 4724602.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (19)
  • README.md (1 hunks)
  • docs/pytest-plugin.md (1 hunks)
  • pyproject.toml (4 hunks)
  • src/agentunit/pytest/__init__.py (1 hunks)
  • src/agentunit/pytest/cli.py (1 hunks)
  • src/agentunit/pytest/plugin.py (1 hunks)
  • tests/__init__.py (1 hunks)
  • tests/eval/__init__.py (1 hunks)
  • tests/eval/example_scenarios.py (1 hunks)
  • tests/eval/failing_scenario.py (1 hunks)
  • tests/integration/IMPLEMENTATION_SUMMARY.md (1 hunks)
  • tests/integration/README.md (1 hunks)
  • tests/integration/__init__.py (1 hunks)
  • tests/integration/ci-example.yml (1 hunks)
  • tests/integration/conftest.py (1 hunks)
  • tests/integration/simple_langgraph_agent.py (1 hunks)
  • tests/integration/test_integration_basic.py (1 hunks)
  • tests/integration/test_langgraph_integration.py (1 hunks)
  • tests/test_pytest_plugin.py (1 hunks)
