-
Notifications
You must be signed in to change notification settings - Fork 17
Add LangGraph integration tests (#24) #36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
cd8f7f1
c88d027
7a71405
1a9e09b
db0f198
bea72ef
b5c522f
0c17003
f383761
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| # Tests package |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,131 @@ | ||
| # LangGraph Integration Tests - Implementation Summary | ||
|
|
||
| This document summarizes the implementation of LangGraph integration tests for AgentUnit (Issue #24). | ||
|
|
||
| ## ✅ Completed Tasks | ||
|
|
||
| ### 1. Created Integration Test Structure | ||
| - ✅ Created `tests/integration/` directory | ||
| - ✅ Added `__init__.py` and `conftest.py` for proper test configuration | ||
| - ✅ Configured pytest markers for integration and LangGraph tests | ||
|
|
||
| ### 2. Simple LangGraph Agent Implementation | ||
| - ✅ Created `simple_langgraph_agent.py` with a working LangGraph agent | ||
| - ✅ Implemented fallback behavior when LangGraph is not installed | ||
| - ✅ Agent handles multiple query types (quantum, python, weather, general) | ||
| - ✅ Compatible with AgentUnit's payload format | ||
|
|
||
| ### 3. Comprehensive Integration Tests | ||
| - ✅ Created `test_langgraph_integration.py` with full test suite | ||
| - ✅ Tests scenario creation from callable agents and Python files | ||
| - ✅ Tests full evaluation cycle with multiple test cases | ||
| - ✅ Tests metrics integration (when available) | ||
| - ✅ Tests error handling and retry functionality | ||
| - ✅ Tests multiple scenarios running together | ||
|
|
||
| ### 4. Pytest Configuration | ||
| - ✅ Added pytest markers to `pyproject.toml` | ||
| - ✅ Configured automatic test marking for integration tests | ||
| - ✅ Tests are properly skipped when LangGraph is not installed | ||
|
|
||
| ### 5. Documentation | ||
| - ✅ Created comprehensive `README.md` for integration tests | ||
| - ✅ Documented prerequisites and running instructions | ||
| - ✅ Added CI configuration example | ||
| - ✅ Updated main project README with integration test information | ||
|
|
||
| ## ✅ Acceptance Criteria Met | ||
|
|
||
| ### Integration tests pass with LangGraph installed | ||
| - Tests are designed to pass when LangGraph is available | ||
| - Comprehensive test coverage of AgentUnit + LangGraph integration | ||
|
|
||
| ### Tests are skipped gracefully without LangGraph | ||
| - Uses `pytest.importorskip()` to skip tests when LangGraph is not available | ||
| - Provides clear skip messages | ||
| - Fallback mock responses work without LangGraph | ||
|
|
||
| ### CI optionally runs integration tests | ||
| - Provided example CI configuration in `ci-example.yml` | ||
| - Shows how to run integration tests conditionally | ||
| - Demonstrates selective test execution with pytest markers | ||
|
|
||
| ## 📁 Files Created | ||
|
|
||
| ``` | ||
| tests/integration/ | ||
| ├── __init__.py # Package initialization | ||
| ├── conftest.py # Test configuration and markers | ||
| ├── simple_langgraph_agent.py # Simple LangGraph agent for testing | ||
| ├── test_langgraph_integration.py # Main integration tests | ||
| ├── test_integration_basic.py # Basic structure tests | ||
| ├── README.md # Documentation | ||
| ├── ci-example.yml # CI configuration example | ||
| └── IMPLEMENTATION_SUMMARY.md # This file | ||
| ``` | ||
|
|
||
| ## 🧪 Test Coverage | ||
|
|
||
| The integration tests cover: | ||
|
|
||
| 1. **Scenario Creation** | ||
| - From callable functions | ||
| - From Python files | ||
| - With custom configurations | ||
|
|
||
| 2. **Full Evaluation Cycle** | ||
| - Multiple test cases | ||
| - Success and failure scenarios | ||
| - Metrics calculation | ||
| - Trace logging | ||
|
|
||
| 3. **Error Handling** | ||
| - Agent failures | ||
| - Retry logic | ||
| - Graceful degradation | ||
|
|
||
| 4. **Framework Integration** | ||
| - LangGraph adapter registration | ||
| - Multiple scenario execution | ||
| - Scenario cloning and modification | ||
|
|
||
| ## 🚀 Usage Examples | ||
|
|
||
| ### Run all integration tests: | ||
| ```bash | ||
| pytest tests/integration/ | ||
| ``` | ||
|
|
||
| ### Run only LangGraph tests: | ||
| ```bash | ||
| pytest tests/integration/ -m langgraph | ||
| ``` | ||
|
|
||
| ### Skip integration tests: | ||
| ```bash | ||
| pytest -m "not integration" | ||
| ``` | ||
|
|
||
| ### Install LangGraph for testing: | ||
| ```bash | ||
| # Install optional integration test dependencies | ||
| poetry install --extras integration-tests | ||
| ``` | ||
|
|
||
| ## 🔧 Technical Implementation Details | ||
|
|
||
| - **Graceful Dependency Handling**: Uses `pytest.importorskip()` and try/except imports | ||
| - **Mock Fallbacks**: Provides mock responses when dependencies are unavailable | ||
| - **Pytest Markers**: Proper test categorization and selective execution | ||
| - **AgentUnit Integration**: Full compatibility with AgentUnit's Scenario and Runner APIs | ||
| - **CI Ready**: Designed for optional execution in continuous integration | ||
|
|
||
| ## 🎯 Next Steps | ||
|
|
||
| The integration test framework is now ready for: | ||
| 1. Adding more framework integrations (CrewAI, AutoGen, etc.) | ||
| 2. Expanding test coverage with more complex scenarios | ||
| 3. Integration with CI/CD pipelines | ||
| 4. Performance and load testing scenarios | ||
|
|
||
| This implementation fully addresses Issue #24 and provides a solid foundation for future integration testing needs. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,97 @@ | ||
| # Integration Tests | ||
|
|
||
| This directory contains integration tests that verify AgentUnit works with real framework implementations. | ||
|
|
||
| ## LangGraph Integration Tests | ||
|
|
||
| The LangGraph integration tests verify that AgentUnit can properly evaluate LangGraph agents through a complete evaluation cycle. | ||
|
|
||
| ### Prerequisites | ||
|
|
||
| To run LangGraph integration tests, you need to install LangGraph: | ||
|
|
||
| ```bash | ||
| # Install optional integration test dependencies | ||
| poetry install --extras integration-tests | ||
| ``` | ||
|
|
||
| Or install LangGraph manually: | ||
|
|
||
| ```bash | ||
| poetry add langgraph --group dev | ||
| ``` | ||
|
|
||
| ### Running Integration Tests | ||
|
|
||
| #### Run all integration tests: | ||
| ```bash | ||
| pytest tests/integration/ | ||
| ``` | ||
|
|
||
| #### Run only LangGraph tests: | ||
| ```bash | ||
| pytest tests/integration/ -m langgraph | ||
| ``` | ||
|
|
||
| #### Skip integration tests (run only unit tests): | ||
| ```bash | ||
| pytest -m "not integration" | ||
| ``` | ||
|
|
||
| #### Run with verbose output: | ||
| ```bash | ||
| pytest tests/integration/ -v | ||
| ``` | ||
|
|
||
| ### Test Structure | ||
|
|
||
| - `simple_langgraph_agent.py` - Contains a simple LangGraph agent implementation for testing | ||
| - `test_langgraph_integration.py` - Integration tests for LangGraph adapter | ||
| - `conftest.py` - Test configuration and markers | ||
|
|
||
| ### What the Tests Cover | ||
|
|
||
| 1. **Scenario Creation**: Tests creating scenarios from callable agents and Python files | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Add language specification to fenced code block. The bash code block is missing a language identifier, which breaks syntax highlighting and violates Markdown best practices. Apply this diff to add the language tag: 1. **Scenario Creation**: Tests creating scenarios from callable agents and Python files
2. **Full Evaluation Cycle**: Tests running complete evaluation cycles with multiple test casesActually, looking more carefully, line 54 shows the start of the list. Let me check line 55 context again. The issue is on the code block around line 13-16 (bash block). Let me correct the line range: - ```bash
+ ```Wait, I need to look at the actual lines. In the annotated code starting around line 13, I see: This is within a bash code fence that starts at line 13. The static analysis hint says line 55 is missing language spec, but looking at the code, line 55 doesn't show a code fence. Let me recount... Actually, the file shows lines 13-16 have bash code, then 20-22 have another bash code block. Then further down around line 28-44 are bash code blocks. The static analysis error "55-55: Fenced code blocks should have a language specified" points to line 55 which shows: This doesn't match. Let me re-examine. The static analysis hint says line 55 has a fenced code block issue. Looking back at the annotated code, around line 55 is where? Let me count more carefully... The annotation shows lines 1-97. Line 55 in that context is in the "What the Tests Cover" section which is NOT a code block. The static analysis tool might be flagging something else. Let me look for any unclosed or incorrectly specified code fences... Actually, I see the issue. Look at the Example Usage section (lines 79-97). It has a python code fence starting at line 81: ```python
import pytest
...In tests/integration/README.md around lines 54-54, there are fenced code blocks |
||
| 2. **Full Evaluation Cycle**: Tests running complete evaluation cycles with multiple test cases | ||
| 3. **Metrics Integration**: Tests that metrics can be calculated (when available) | ||
| 4. **Error Handling**: Tests graceful handling of agent failures | ||
| 5. **Retry Logic**: Tests scenario retry functionality | ||
| 6. **Multiple Scenarios**: Tests running multiple scenarios together | ||
|
|
||
| ### CI Integration | ||
|
|
||
| The integration tests are designed to be optionally run in CI: | ||
|
|
||
| - Tests are automatically skipped if LangGraph is not installed | ||
| - Use pytest markers to selectively run or skip integration tests | ||
| - All tests are marked with `@pytest.mark.integration` and `@pytest.mark.langgraph` | ||
|
|
||
|
Comment on lines
+65
to
+68
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Doc claim about markers is likely too strong. 🤖 Prompt for AI Agents |
||
| ### Adding New Integration Tests | ||
|
|
||
| When adding integration tests for other frameworks: | ||
|
|
||
| 1. Create a simple agent implementation in the framework | ||
| 2. Create test cases that cover the full evaluation cycle | ||
| 3. Use appropriate pytest markers (e.g., `@pytest.mark.crewai`) | ||
| 4. Ensure tests are skipped gracefully when dependencies are not available | ||
| 5. Document the prerequisites and running instructions | ||
|
|
||
| ### Example Usage | ||
|
|
||
| ```python | ||
| import pytest | ||
| from agentunit import Scenario, run_suite | ||
| from tests.integration.simple_langgraph_agent import invoke_agent | ||
|
|
||
| @pytest.mark.langgraph | ||
| @pytest.mark.integration | ||
| def test_my_langgraph_scenario(): | ||
| scenario = Scenario.load_langgraph( | ||
| path=invoke_agent, | ||
| dataset=my_dataset, | ||
| name="my-test" | ||
| ) | ||
|
|
||
| result = run_suite([scenario]) | ||
| assert len(result.scenarios) == 1 | ||
| ``` | ||
|
Comment on lines
+79
to
+97
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ensure all fenced code blocks have explicit language specifications. Per markdownlint (MD040), all fenced code blocks should declare their language. Review the code blocks in the Example Usage section and earlier to ensure they are properly specified (e.g., 🤖 Prompt for AI Agents |
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| """Integration tests for AgentUnit with real frameworks.""" |
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,59 @@ | ||||||
| # Example CI configuration for running integration tests | ||||||
| # This shows how to optionally run integration tests in CI | ||||||
|
|
||||||
| name: Tests | ||||||
|
|
||||||
| on: [push, pull_request] | ||||||
|
|
||||||
| jobs: | ||||||
| unit-tests: | ||||||
| runs-on: ubuntu-latest | ||||||
| strategy: | ||||||
| matrix: | ||||||
| python-version: ["3.10", "3.11", "3.12"] | ||||||
|
|
||||||
| steps: | ||||||
| - uses: actions/checkout@v4 | ||||||
| - name: Set up Python ${{ matrix.python-version }} | ||||||
| uses: actions/setup-python@v5 | ||||||
| with: | ||||||
| python-version: ${{ matrix.python-version }} | ||||||
|
|
||||||
| - name: Install dependencies | ||||||
| run: | | ||||||
| python -m pip install --upgrade pip | ||||||
| pip install poetry | ||||||
| poetry install | ||||||
|
|
||||||
| - name: Run unit tests (excluding integration) | ||||||
| run: | | ||||||
| poetry run pytest -m "not integration" --cov=agentunit --cov-report=xml | ||||||
|
|
||||||
| - name: Upload coverage to Codecov | ||||||
| uses: codecov/codecov-action@v4 | ||||||
|
|
||||||
| integration-tests: | ||||||
| runs-on: ubuntu-latest | ||||||
| # Only run integration tests on main branch or when explicitly requested | ||||||
| if: github.ref == 'refs/heads/main' || contains(github.event.pull_request.labels.*.name, 'run-integration-tests') | ||||||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Guard On - if: github.ref == 'refs/heads/main' || contains(github.event.pull_request.labels.*.name, 'run-integration-tests')
+ if: github.ref == 'refs/heads/main' || (github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'run-integration-tests'))📝 Committable suggestion
Suggested change
🤖 Prompt for AI Agents |
||||||
|
|
||||||
| steps: | ||||||
| - uses: actions/checkout@v4 | ||||||
| - name: Set up Python | ||||||
| uses: actions/setup-python@v5 | ||||||
| with: | ||||||
| python-version: "3.11" | ||||||
|
|
||||||
| - name: Install dependencies including integration test deps | ||||||
| run: | | ||||||
| python -m pip install --upgrade pip | ||||||
| pip install poetry | ||||||
| poetry install --extras integration-tests | ||||||
|
|
||||||
| - name: Run integration tests | ||||||
| run: | | ||||||
| poetry run pytest tests/integration/ -v | ||||||
|
|
||||||
| - name: Run LangGraph specific tests | ||||||
| run: | | ||||||
| poetry run pytest tests/integration/ -m langgraph -v | ||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,23 @@ | ||
| """Configuration for integration tests.""" | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import pytest | ||
|
|
||
|
|
||
| def pytest_configure(config): | ||
| """Configure pytest markers for integration tests.""" | ||
| config.addinivalue_line( | ||
| "markers", | ||
| "integration: marks tests as integration tests (deselect with '-m \"not integration\"')", | ||
| ) | ||
| config.addinivalue_line( | ||
| "markers", "langgraph: marks tests as requiring LangGraph (skipped if not installed)" | ||
| ) | ||
|
|
||
|
|
||
| def pytest_collection_modifyitems(config, items): | ||
| """Automatically mark integration tests.""" | ||
| for item in items: | ||
| if "integration" in str(item.fspath): | ||
| item.add_marker(pytest.mark.integration) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
pytest --strict-config flag behavior and plugin compatibility💡 Result:
Short answer
What it actually does
Plugin compatibility notes (practical implications)
Versions / changelog pointers
If you want, I can:
Sources
[1] pytest docs — Good Integration Practices / strict mode.
[2] pytest source/docs — Config parsing and _warn_or_fail_if_strict implementation (strict_config behavior) and changelog notes.
[3] pytest changelog (notes about strict/strict_config).
Ensure
--strict-configcompatibility before merging.The
--strict-configflag will error on unknown INI keys from incompatible plugins or legacy configs. Before enabling, audit third-party plugins to confirm they properly register custom INI options, and add a CI check to catch config issues early. Document the stricter requirements for contributors if your project uses external plugins.🤖 Prompt for AI Agents