Merged
11 changes: 10 additions & 1 deletion README.md
@@ -193,14 +193,23 @@ Use the table above as the canonical navigation surface; every document cross-li
## Development workflow

1. Install dependencies (Poetry or pip).
2. Run the unit and integration suite:
2. Run the test suite:

```bash
# Run all tests (unit + integration)
poetry run python3 -m pytest tests -v

# Run only unit tests (skip integration tests)
poetry run python3 -m pytest -m "not integration" -v

# Run only integration tests (requires framework dependencies)
poetry run python3 -m pytest tests/integration/ -v
```

3. Execute targeted suites during active development, then run the full matrix before opening a pull request.

**Integration Tests**: The `tests/integration/` directory contains tests that verify AgentUnit works with real framework implementations (LangGraph, etc.). These tests are automatically skipped if the required dependencies are not installed. See [tests/integration/README.md](tests/integration/README.md) for details.
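A sketch of how such a test can guard itself (marker names match the project's pytest config; the test name and body are illustrative):

```python
import importlib.util

import pytest

# Detect the optional dependency without importing it.
HAS_LANGGRAPH = importlib.util.find_spec("langgraph") is not None


@pytest.mark.integration
@pytest.mark.langgraph
@pytest.mark.skipif(not HAS_LANGGRAPH, reason="LangGraph not installed")
def test_langgraph_smoke():
    # A real test would drive the adapter; this only proves the guard works.
    assert HAS_LANGGRAPH
```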

Latest verification (2025-10-24): 144 passed, 10 skipped, 32 warnings. Warnings originate from third-party dependencies (`langchain` pydantic shim deprecations and `datetime.utcnow` usage). Track upstream fixes or pin patched releases as needed.

## Contributing
253 changes: 187 additions & 66 deletions poetry.lock

Large diffs are not rendered by default.

19 changes: 18 additions & 1 deletion pyproject.toml
@@ -24,7 +24,7 @@ classifiers = [
python = "^3.10"
pyyaml = "^6.0"
crewai = { version = "^0.201.1", python = "<3.14" }
langchain = ">=0.0.353,<0.2.0" # security: stay on patched 0.0.x line compatible with ecosystem
langchain = ">=0.0.353,<0.4.0" # security: stay on patched line compatible with ecosystem
opentelemetry-api = "^1.25.0"
opentelemetry-sdk = "^1.25.0"
opentelemetry-exporter-otlp = "^1.25.0"
@@ -34,9 +34,11 @@ httpx = "^0.27.0"
numpy = "^1.24.0"
scipy = "^1.11.0"
ragas = { version = ">=0.1.9", optional = true }
langgraph = { version = "^0.2.0", optional = true }

[tool.poetry.extras]
ragas = ["ragas"]
integration-tests = ["langgraph"]

[tool.poetry.group.dev.dependencies]
pytest = "^8.2.0"
@@ -54,3 +56,18 @@ agentunit = "agentunit.cli:entrypoint"
requires = ["poetry-core>=1.8.2"]
build-backend = "poetry.core.masonry.api"

[tool.pytest.ini_options]
markers = [
    "integration: marks tests as integration tests (deselect with '-m \"not integration\"')",
    "langgraph: marks tests as requiring LangGraph (skipped if not installed)",
]
testpaths = ["tests"]
python_files = ["test_*.py"]
python_classes = ["Test*"]
python_functions = ["test_*"]
addopts = [
    "--strict-markers",
    "--strict-config",
    "-ra",
]
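With `--strict-markers`, only markers declared above are accepted; an undeclared (for example, misspelled) marker becomes a collection error rather than a silent no-op. A sketch of the distinction (hypothetical test file; the error surfaces when pytest collects it, not when Python imports it):

```python
import pytest


@pytest.mark.integration   # declared in [tool.pytest.ini_options] -> accepted
def test_declared_marker():
    assert True


@pytest.mark.integratoin   # typo, not declared -> error under --strict-markers
def test_misspelled_marker():
    assert True
```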

Comment on lines +59 to +73

⚠️ Potential issue | 🟡 Minor

Ensure --strict-config compatibility before merging.
The --strict-config flag will error on unknown INI keys from incompatible plugins or legacy configs. Before enabling, audit third-party plugins to confirm they properly register custom INI options, and add a CI check to catch config issues early. Document the stricter requirements for contributors if your project uses external plugins.

🤖 Prompt for AI Agents
In pyproject.toml around lines 57 to 71, adding pytest's --strict-config may
cause test runs to fail if third-party or legacy plugins define unknown INI
keys; audit installed pytest plugins and project config for any non-registered
INI options, update or remove offending plugins/config keys, or register custom
options via pytest_addoption in conftest.py as appropriate, add a CI job that
runs pytest with --strict-config to catch regressions, and update
CONTRIBUTING/README to document the stricter INI key requirements for
contributors.
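One way to satisfy `--strict-config` for a project-specific key, as the note above suggests, is to register it in `conftest.py` (sketch; `acceptable_latency_ms` is a made-up option name, but `parser.addini` is pytest's real registration hook):

```python
# conftest.py (sketch)
def pytest_addoption(parser):
    # Once registered, --strict-config no longer flags this INI key
    # as unknown when it appears in pyproject.toml or pytest.ini.
    parser.addini(
        "acceptable_latency_ms",
        help="Hypothetical project-specific threshold read by tests.",
        default="250",
    )
```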

1 change: 1 addition & 0 deletions tests/__init__.py
@@ -0,0 +1 @@
# Tests package
131 changes: 131 additions & 0 deletions tests/integration/IMPLEMENTATION_SUMMARY.md
@@ -0,0 +1,131 @@
# LangGraph Integration Tests - Implementation Summary

This document summarizes the implementation of LangGraph integration tests for AgentUnit (Issue #24).

## ✅ Completed Tasks

### 1. Created Integration Test Structure
- ✅ Created `tests/integration/` directory
- ✅ Added `__init__.py` and `conftest.py` for proper test configuration
- ✅ Configured pytest markers for integration and LangGraph tests

### 2. Simple LangGraph Agent Implementation
- ✅ Created `simple_langgraph_agent.py` with a working LangGraph agent
- ✅ Implemented fallback behavior when LangGraph is not installed
- ✅ Agent handles multiple query types (quantum, python, weather, general)
- ✅ Compatible with AgentUnit's payload format

### 3. Comprehensive Integration Tests
- ✅ Created `test_langgraph_integration.py` with full test suite
- ✅ Tests scenario creation from callable agents and Python files
- ✅ Tests full evaluation cycle with multiple test cases
- ✅ Tests metrics integration (when available)
- ✅ Tests error handling and retry functionality
- ✅ Tests multiple scenarios running together

### 4. Pytest Configuration
- ✅ Added pytest markers to `pyproject.toml`
- ✅ Configured automatic test marking for integration tests
- ✅ Tests are properly skipped when LangGraph is not installed

### 5. Documentation
- ✅ Created comprehensive `README.md` for integration tests
- ✅ Documented prerequisites and running instructions
- ✅ Added CI configuration example
- ✅ Updated main project README with integration test information

## ✅ Acceptance Criteria Met

### Integration tests pass with LangGraph installed
- Tests are designed to pass when LangGraph is available
- Comprehensive test coverage of AgentUnit + LangGraph integration

### Tests are skipped gracefully without LangGraph
- Uses `pytest.importorskip()` to skip tests when LangGraph is not available
- Provides clear skip messages
- Fallback mock responses work without LangGraph
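The fallback typically relies on a guarded import (sketch; `invoke_agent` mirrors the agent module's entry point, and the mock text is illustrative):

```python
try:
    from langgraph.graph import StateGraph  # noqa: F401  (used by the real path)
    HAS_LANGGRAPH = True
except ImportError:
    HAS_LANGGRAPH = False


def invoke_agent(payload: dict) -> dict:
    """Return a real answer when LangGraph is installed, else a mock."""
    query = payload.get("query", "")
    if not HAS_LANGGRAPH:
        # Deterministic canned reply so tests can still exercise the pipeline.
        return {"output": f"[mock] echo: {query}", "mock": True}
    # A real implementation would build and run a StateGraph here.
    return {"output": f"echo: {query}", "mock": False}
```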

### CI optionally runs integration tests
- Provided example CI configuration in `ci-example.yml`
- Shows how to run integration tests conditionally
- Demonstrates selective test execution with pytest markers

## 📁 Files Created

```
tests/integration/
├── __init__.py # Package initialization
├── conftest.py # Test configuration and markers
├── simple_langgraph_agent.py # Simple LangGraph agent for testing
├── test_langgraph_integration.py # Main integration tests
├── test_integration_basic.py # Basic structure tests
├── README.md # Documentation
├── ci-example.yml # CI configuration example
└── IMPLEMENTATION_SUMMARY.md # This file
```

## 🧪 Test Coverage

The integration tests cover:

1. **Scenario Creation**
- From callable functions
- From Python files
- With custom configurations

2. **Full Evaluation Cycle**
- Multiple test cases
- Success and failure scenarios
- Metrics calculation
- Trace logging

3. **Error Handling**
- Agent failures
- Retry logic
- Graceful degradation

4. **Framework Integration**
- LangGraph adapter registration
- Multiple scenario execution
- Scenario cloning and modification

## 🚀 Usage Examples

### Run all integration tests:
```bash
pytest tests/integration/
```

### Run only LangGraph tests:
```bash
pytest tests/integration/ -m langgraph
```

### Skip integration tests:
```bash
pytest -m "not integration"
```

### Install LangGraph for testing:
```bash
# Install optional integration test dependencies
poetry install --extras integration-tests
```

## 🔧 Technical Implementation Details

- **Graceful Dependency Handling**: Uses `pytest.importorskip()` and try/except imports
- **Mock Fallbacks**: Provides mock responses when dependencies are unavailable
- **Pytest Markers**: Proper test categorization and selective execution
- **AgentUnit Integration**: Full compatibility with AgentUnit's Scenario and Runner APIs
- **CI Ready**: Designed for optional execution in continuous integration
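`pytest.importorskip` covers the per-test case: it raises a Skipped exception at call time, so the test is reported as skipped rather than failing with an ImportError (sketch; the test name is illustrative):

```python
import pytest


@pytest.mark.integration
@pytest.mark.langgraph
def test_langgraph_adapter_smoke():
    # Skips (not fails) this single test when the dependency is absent.
    langgraph = pytest.importorskip("langgraph")
    assert langgraph is not None
```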

## 🎯 Next Steps

The integration test framework is now ready for:
1. Adding more framework integrations (CrewAI, AutoGen, etc.)
2. Expanding test coverage with more complex scenarios
3. Integration with CI/CD pipelines
4. Performance and load testing scenarios

This implementation fully addresses Issue #24 and provides a solid foundation for future integration testing needs.
97 changes: 97 additions & 0 deletions tests/integration/README.md
@@ -0,0 +1,97 @@
# Integration Tests

This directory contains integration tests that verify AgentUnit works with real framework implementations.

## LangGraph Integration Tests

The LangGraph integration tests verify that AgentUnit can properly evaluate LangGraph agents through a complete evaluation cycle.

### Prerequisites

To run LangGraph integration tests, you need to install LangGraph:

```bash
# Install optional integration test dependencies
poetry install --extras integration-tests
```

Or install LangGraph manually:

```bash
poetry add langgraph --group dev
```

### Running Integration Tests

#### Run all integration tests:
```bash
pytest tests/integration/
```

#### Run only LangGraph tests:
```bash
pytest tests/integration/ -m langgraph
```

#### Skip integration tests (run only unit tests):
```bash
pytest -m "not integration"
```

#### Run with verbose output:
```bash
pytest tests/integration/ -v
```

### Test Structure

- `simple_langgraph_agent.py` - Contains a simple LangGraph agent implementation for testing
- `test_langgraph_integration.py` - Integration tests for LangGraph adapter
- `conftest.py` - Test configuration and markers

### What the Tests Cover

1. **Scenario Creation**: Tests creating scenarios from callable agents and Python files
2. **Full Evaluation Cycle**: Tests running complete evaluation cycles with multiple test cases
3. **Metrics Integration**: Tests that metrics can be calculated (when available)
4. **Error Handling**: Tests graceful handling of agent failures
5. **Retry Logic**: Tests scenario retry functionality
6. **Multiple Scenarios**: Tests running multiple scenarios together

### CI Integration

The integration tests are designed to be optionally run in CI:

- Tests are automatically skipped if LangGraph is not installed
- Use pytest markers to selectively run or skip integration tests
- All tests are marked with `@pytest.mark.integration` and `@pytest.mark.langgraph`

Comment on lines +65 to +68

⚠️ Potential issue | 🟡 Minor

Doc claim about markers is likely too strong.
Basic integration tests (e.g., fallback behavior) shouldn’t necessarily be @pytest.mark.langgraph. Consider rephrasing to: “All tests are marked integration; LangGraph-dependent tests are marked langgraph.”

🤖 Prompt for AI Agents
In tests/integration/README.md around lines 64 to 67, the README overstates that
all integration tests are marked with both `integration` and `langgraph`; change
the wording to clarify that all integration tests use the `integration` marker
and only tests that depend on LangGraph use the `langgraph` marker. Update the
three bullet points to state that tests are skipped if LangGraph is not
installed, use pytest markers to selectively run or skip integration tests, and
that tests are marked `integration` while LangGraph-dependent tests are
additionally marked `langgraph`.

### Adding New Integration Tests

When adding integration tests for other frameworks:

1. Create a simple agent implementation in the framework
2. Create test cases that cover the full evaluation cycle
3. Use appropriate pytest markers (e.g., `@pytest.mark.crewai`)
4. Ensure tests are skipped gracefully when dependencies are not available
5. Document the prerequisites and running instructions

### Example Usage

```python
import pytest
from agentunit import Scenario, run_suite
from tests.integration.simple_langgraph_agent import invoke_agent

@pytest.mark.langgraph
@pytest.mark.integration
def test_my_langgraph_scenario():
    scenario = Scenario.load_langgraph(
        path=invoke_agent,
        dataset=my_dataset,
        name="my-test",
    )

    result = run_suite([scenario])
    assert len(result.scenarios) == 1
```
Comment on lines +79 to +97

⚠️ Potential issue | 🟡 Minor

Ensure all fenced code blocks have explicit language specifications.

Per markdownlint (MD040), all fenced code blocks should declare their language. Review the code blocks in the Example Usage section and earlier to ensure they are properly specified (e.g., `python`, `bash`, or `yaml`).

🤖 Prompt for AI Agents
In tests/integration/README.md around lines 79 to 97 the fenced code block(s)
under "Example Usage" are missing explicit language identifiers; update each
triple-backtick fence to include the appropriate language (e.g., ```python for
the Python snippet) and scan earlier sections for any other fenced blocks
without language tags, adding the correct language specifiers (bash, yaml, etc.)
to satisfy markdownlint MD040.

1 change: 1 addition & 0 deletions tests/integration/__init__.py
@@ -0,0 +1 @@
"""Integration tests for AgentUnit with real frameworks."""
59 changes: 59 additions & 0 deletions tests/integration/ci-example.yml
@@ -0,0 +1,59 @@
# Example CI configuration for running integration tests
# This shows how to optionally run integration tests in CI

name: Tests

on: [push, pull_request]

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12"]

    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install poetry
          poetry install

      - name: Run unit tests (excluding integration)
        run: |
          poetry run pytest -m "not integration" --cov=agentunit --cov-report=xml

      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4

  integration-tests:
    runs-on: ubuntu-latest
    # Only run integration tests on main branch or when explicitly requested
    if: github.ref == 'refs/heads/main' || contains(github.event.pull_request.labels.*.name, 'run-integration-tests')

⚠️ Potential issue | 🟠 Major

Guard pull_request context in the job if: to avoid push-event evaluation failures.

On push, github.event.pull_request isn’t present; rewrite to avoid touching it unless event_name == 'pull_request'.

-    if: github.ref == 'refs/heads/main' || contains(github.event.pull_request.labels.*.name, 'run-integration-tests')
+    if: github.ref == 'refs/heads/main' || (github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'run-integration-tests'))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if: github.ref == 'refs/heads/main' || contains(github.event.pull_request.labels.*.name, 'run-integration-tests')
if: github.ref == 'refs/heads/main' || (github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'run-integration-tests'))
🤖 Prompt for AI Agents
In tests/integration/ci-example.yml around line 38, the job if: currently
accesses github.event.pull_request.labels unguarded which will fail on push
events; change the condition to first check the event name before touching
pull_request (e.g. keep the main-branch check, and combine with a guarded
pull_request check using github.event_name == 'pull_request' &&
contains(github.event.pull_request.labels.*.name, 'run-integration-tests')).
Ensure the final boolean expression evaluates safely on push events by only
referencing github.event.pull_request when github.event_name is 'pull_request'.


    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install dependencies including integration test deps
        run: |
          python -m pip install --upgrade pip
          pip install poetry
          poetry install --extras integration-tests

      - name: Run integration tests
        run: |
          poetry run pytest tests/integration/ -v

      - name: Run LangGraph specific tests
        run: |
          poetry run pytest tests/integration/ -m langgraph -v
23 changes: 23 additions & 0 deletions tests/integration/conftest.py
@@ -0,0 +1,23 @@
"""Configuration for integration tests."""

from __future__ import annotations

import pytest


def pytest_configure(config):
    """Configure pytest markers for integration tests."""
    config.addinivalue_line(
        "markers",
        "integration: marks tests as integration tests (deselect with '-m \"not integration\"')",
    )
    config.addinivalue_line(
        "markers", "langgraph: marks tests as requiring LangGraph (skipped if not installed)"
    )


def pytest_collection_modifyitems(config, items):
    """Automatically mark integration tests."""
    for item in items:
        if "integration" in str(item.fspath):
            item.add_marker(pytest.mark.integration)