Add LangGraph integration tests (#24) #36
Conversation
- Create tests/integration/ directory with comprehensive LangGraph tests
- Add simple_langgraph_agent.py with mock LangGraph agent for testing
- Implement test_langgraph_integration.py with full evaluation cycle tests
- Configure pytest markers for integration and langgraph tests
- Add graceful dependency handling - tests skip when LangGraph not installed
- Include comprehensive documentation and CI examples
- Update main README with integration test instructions

Fixes aviralgarg05#24
Walkthrough

Adds a LangGraph-focused integration test suite: pytest markers and hooks, a simple LangGraph agent with runtime fallback, comprehensive integration tests and docs (README, IMPLEMENTATION_SUMMARY), pyproject test/config updates, and a CI workflow example. No production APIs or exported public declarations changed.

Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related issues
Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Actionable comments posted: 3
🧹 Nitpick comments (10)
tests/integration/conftest.py (2)
8-15: Avoid marker definition drift (already in `pyproject.toml`).

With `--strict-markers` and global marker definitions in `pyproject.toml`, keeping the same strings here is redundant and can drift over time. Consider removing these lines (or ensure they're generated from a single source).
18-22: Auto-marking via substring match can mislabel tests.
if "integration" in str(item.fspath)may accidentally mark non-integration tests whose path happens to contain that substring. Consider matching a path segment instead.def pytest_collection_modifyitems(config, items): """Automatically mark integration tests.""" for item in items: - if "integration" in str(item.fspath): + # Match directory name, not substring + from pathlib import Path + if "integration" in Path(str(item.fspath)).parts: item.add_marker(pytest.mark.integration)tests/integration/IMPLEMENTATION_SUMMARY.md (1)
92-112: Keep install/run commands consistent with the repo’s Poetry workflow.
The examples are helpful; just ensure "pip install langgraph" vs "poetry add …" guidance matches how CI/devshell actually runs pytest (Poetry venv vs system pip).

README.md (1)
213-213: “Latest verification” will go stale quickly.
Consider linking to CI artifacts / a badge instead, or moving this to a dated changelog entry to avoid implying it's current.

tests/integration/README.md (1)
80-96: Prefer relative import in example to avoid `tests` package ambiguity.

`from tests.integration.simple_langgraph_agent ...` can collide if a different `tests` package is on `PYTHONPATH`. Since this doc lives inside the integration package, a relative import example is more robust.

```diff
-from tests.integration.simple_langgraph_agent import invoke_agent
+from .simple_langgraph_agent import invoke_agent
```

tests/integration/test_integration_basic.py (2)
23-47: Use relative import to avoid `tests` package ambiguity.

This reduces risk of importing the wrong `tests` module in some environments.

```diff
 def test_simple_langgraph_agent_fallback():
     """Test that the simple agent works without LangGraph installed."""
-    from tests.integration.simple_langgraph_agent import invoke_agent, LANGGRAPH_AVAILABLE
+    from .simple_langgraph_agent import invoke_agent, LANGGRAPH_AVAILABLE
```
49-55: `test_pytest_markers_configured` is a no-op; assert markers via `pytestconfig`.

You can make this test actually validate marker registration under `--strict-markers`.

```diff
 @pytest.mark.integration
-def test_pytest_markers_configured():
+def test_pytest_markers_configured(pytestconfig):
     """Test that pytest markers are properly configured."""
-    import pytest
-
-    # This test itself should have the integration marker
-    # We can't easily test this programmatically, but the test runner will validate it
+    markers = "\n".join(pytestconfig.getini("markers"))
+    assert "integration:" in markers
+    assert "langgraph:" in markers
```

tests/integration/test_langgraph_integration.py (2)
89-102: Consider using a path relative to the test file.

The hardcoded path `"tests/integration/simple_langgraph_agent.py"` assumes tests are run from the repository root. This is typically fine for CI but could fail if tests are run from a different directory.

Consider using `Path(__file__).parent / "simple_langgraph_agent.py"` for a more robust path resolution:

```diff
 def test_langgraph_scenario_from_python_file(self):
     """Test creating a scenario from a Python file."""
-    # Create a temporary Python file with the agent
-    agent_file = Path("tests/integration/simple_langgraph_agent.py")
+    # Use path relative to this test file
+    agent_file = Path(__file__).parent / "simple_langgraph_agent.py"
```
171-189: Retry test doesn't verify actual retry behavior.

The test sets `scenario.retries = 2` but uses an agent that never fails, so retry logic is never exercised. This only verifies that setting retries doesn't break the happy path.

Consider adding a test with an agent that fails on first attempts to actually verify retry behavior:

```python
def test_scenario_with_retries_actually_retrying(self):
    """Test that retries actually work when agent fails intermittently."""
    call_count = 0

    def flaky_agent(payload):
        nonlocal call_count
        call_count += 1
        if call_count < 2:
            raise ValueError("Temporary failure")
        return {"result": f"Success on attempt {call_count}"}

    scenario = Scenario.load_langgraph(
        path=flaky_agent,
        dataset=SimpleDataset(),
        name="flaky-retry-test"
    )
    scenario.retries = 3

    result = run_suite([scenario])
    # Verify that retries allowed eventual success
```

tests/integration/simple_langgraph_agent.py (1)
74-92: Consider simplifying the graph structure.

The `should_continue` function always returns `END`, making the conditional edge unnecessary. For a single-step agent, you could use a direct edge instead of conditional edges.

If intentional for demonstration purposes, this is fine. Otherwise, consider simplifying:

```diff
-    def should_continue(state: AgentState) -> str:
-        """Determine if the agent should continue processing."""
-        return END
-
     # Create the graph
     workflow = StateGraph(AgentState)

     # Add nodes
     workflow.add_node("process", process_query)

     # Set entry point
     workflow.set_entry_point("process")

-    # Add edges
-    workflow.add_conditional_edges(
-        "process",
-        should_continue,
-        {END: END}
-    )
+    # Add edge to end
+    workflow.add_edge("process", END)
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- `README.md` (1 hunks)
- `pyproject.toml` (1 hunks)
- `tests/integration/IMPLEMENTATION_SUMMARY.md` (1 hunks)
- `tests/integration/README.md` (1 hunks)
- `tests/integration/__init__.py` (1 hunks)
- `tests/integration/ci-example.yml` (1 hunks)
- `tests/integration/conftest.py` (1 hunks)
- `tests/integration/simple_langgraph_agent.py` (1 hunks)
- `tests/integration/test_integration_basic.py` (1 hunks)
- `tests/integration/test_langgraph_integration.py` (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/integration/simple_langgraph_agent.py (1)
tests/test_framework_adapters.py (1)
- `agent` (90-92)
tests/integration/test_integration_basic.py (1)
tests/integration/simple_langgraph_agent.py (1)
- `invoke_agent` (98-132)
🔇 Additional comments (15)
tests/integration/__init__.py (1)
1-1: Good package initializer docstring.
Clear intent; no functional risk.

README.md (1)
196-207: Test commands/readability look good.
Nice, explicit split between unit vs integration runs.

tests/integration/test_langgraph_integration.py (9)
1-13: LGTM on module setup and skip mechanism.

The use of `pytest.importorskip()` at module level is the correct approach for conditionally skipping all tests when LangGraph is not installed. This provides clear feedback to developers about why tests are skipped.
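As a minimal sketch of this pattern (module name taken from the PR; the test body below is illustrative):

```python
# Module-level skip: everything defined after the importorskip call
# is skipped when the import fails, with the given reason reported.
import pytest

langgraph = pytest.importorskip("langgraph", reason="LangGraph not installed")


def test_langgraph_importable():
    # Runs only when the import above succeeded.
    assert langgraph is not None
```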
16-46: LGTM on SimpleDataset implementation.

The dataset correctly implements `iter_cases()` as a generator yielding well-structured test cases with varied query types for comprehensive coverage.
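For orientation, a hypothetical shape of such a dataset (field names are illustrative; AgentUnit's real `DatasetSource` base class requires extra constructor arguments, as a later commit in this PR shows):

```python
# Hypothetical stand-in for the reviewed SimpleDataset; the real class
# inherits from AgentUnit's DatasetSource and yields richer case objects.
class SimpleDataset:
    def iter_cases(self):
        yield {"id": "easy-1", "query": "What is quantum computing?"}
        yield {"id": "medium-1", "query": "Explain machine learning."}
        yield {"id": "hard-1", "query": "Compare transformers and RNNs."}
```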
54-87: LGTM on callable scenario test.

The test properly verifies scenario creation, adapter preparation, and execution. The assertion order (success → output not None → content check) is correct.
104-129: LGTM on full evaluation cycle test.

Comprehensive test that validates the entire evaluation pipeline, including result structure and individual run outcomes. Good coverage of the happy path.
131-148: LGTM on metrics test.

The test appropriately validates the structure of metrics results without being overly strict about the metric values, allowing for graceful handling when specific metrics are unavailable.
150-169: LGTM on error handling test.

Good test coverage for the failure path. Verifying that errors are captured and recorded without crashing the test suite is essential for robust integration testing.
191-212: LGTM on clone test with minor note.

Good verification of clone immutability. The assertion `original.retries == 1` assumes the default value; consider documenting this expectation or retrieving the default dynamically.
230-251: LGTM on multiple scenarios test.

Clean test for batch scenario execution. The list comprehension for scenario creation and iteration over results is well-structured.
215-227: LGTM on adapter registry test.

Good verification of adapter registration and alias resolution. Testing both the primary name and alias ensures the registry is properly configured. The `langgraph_graph` alias is explicitly registered in `src/agentunit/adapters/langgraph.py` and intentionally tested here.

tests/integration/simple_langgraph_agent.py (4)
7-31: LGTM on conditional imports with fallback.

The try/except pattern with `LANGGRAPH_AVAILABLE` flag is a clean approach for graceful degradation. The fallback stubs enable type checking and import without runtime usage.
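A minimal sketch of this guard pattern (the stub values here are illustrative, not the test helper's exact fallbacks):

```python
# Import guard: prefer the real LangGraph API, fall back to inert stubs
# so this module can be imported even without the dependency.
try:
    from langgraph.graph import StateGraph, END

    LANGGRAPH_AVAILABLE = True
except ImportError:
    LANGGRAPH_AVAILABLE = False
    StateGraph = None  # stub: never called on the fallback path
    END = "__end__"    # sentinel mirroring LangGraph's END constant
```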
33-39: LGTM on AgentState definition.

Clean TypedDict definition that captures the agent state structure appropriately.
98-132: LGTM on invoke_agent implementation.

The function correctly handles both the LangGraph-available and fallback paths. The response structure is consistent between real and mock implementations, which is important for test reliability.
One minor observation: the agent is created fresh on each `invoke_agent` call (line 107). This is acceptable for integration tests but worth noting if performance becomes a concern.
135-139: LGTM on module-level graph instance.

Clean conditional initialization that provides a pre-built graph for direct usage while handling the unavailable case gracefully.
```toml
[tool.pytest.ini_options]
markers = [
    "integration: marks tests as integration tests (deselect with '-m \"not integration\"')",
    "langgraph: marks tests as requiring LangGraph (skipped if not installed)",
]
testpaths = ["tests"]
python_files = ["test_*.py"]
python_classes = ["Test*"]
python_functions = ["test_*"]
addopts = [
    "--strict-markers",
    "--strict-config",
    "-ra",
]
```
🧩 Analysis chain
🌐 Web query:
pytest --strict-config flag behavior and plugin compatibility
💡 Result:
Short answer
- --strict-config (or the ini option strict_config) turns unknown/unsupported pytest INI keys from warnings into errors (UsageError) when enabled. [1][2]
- You can enable it via CLI (--strict-config), in pytest.ini/pyproject.toml (strict_config = true), or enable the umbrella strict = true which enables strict_config plus other strict options. [1][2]
What it actually does
- During config parsing pytest computes unknown ini keys and, if strict_config is true, raises a UsageError instead of issuing a PytestConfigWarning. If strict is set, strict_config is considered true. [2][3]
Plugin compatibility notes (practical implications)
- Plugins that relied on unregistered/typoed INI keys, or that add custom ini keys but didn’t register them with parser.addini, will start failing under strict_config. Such plugins should register their ini options via pytest's parser.addini (or document the keys) to remain compatible. [2][3]
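For instance, a plugin or `conftest.py` can register a custom key so `--strict-config` accepts it (the key name below is hypothetical):

```python
# conftest.py — register a custom ini key so --strict-config
# treats it as known rather than raising a UsageError.
def pytest_addoption(parser):
    parser.addini(
        "my_custom_key",  # hypothetical key name
        help="Example custom ini option registered by this plugin",
        default="",
    )
```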
- If you enable strict (the global flag) you may pick up future strictness checks too — enable it only with a pinned pytest version or be prepared to opt out of individual strict options. [1]
Versions / changelog pointers
- pytest added the umbrella strict option and strict_config handling in recent releases; the reintroduced/changed behavior of --strict/strict mode is documented in the changelog and deprecations pages. Check your pytest version’s docs/changelog if you rely on older behavior. [1][2]
If you want, I can:
- show the exact error message you’ll get with --strict-config, or
- scan your project/plugins (paste pytest.ini and plugin names) and list likely incompatibilities.
Sources
[1] pytest docs — Good Integration Practices / strict mode.
[2] pytest source/docs — Config parsing and _warn_or_fail_if_strict implementation (strict_config behavior) and changelog notes.
[3] pytest changelog (notes about strict/strict_config).
Ensure --strict-config compatibility before merging.
The --strict-config flag will error on unknown INI keys from incompatible plugins or legacy configs. Before enabling, audit third-party plugins to confirm they properly register custom INI options, and add a CI check to catch config issues early. Document the stricter requirements for contributors if your project uses external plugins.
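For example, a hypothetical CI step for such a check (workflow wiring assumed, not taken from this PR):

```yaml
# Hypothetical job step: a collect-only run fails fast on unknown
# ini keys once --strict-config is enabled.
- name: Validate pytest configuration
  run: poetry run pytest --strict-config --collect-only -q
```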
🤖 Prompt for AI Agents
In pyproject.toml around lines 57 to 71, adding pytest's --strict-config may
cause test runs to fail if third-party or legacy plugins define unknown INI
keys; audit installed pytest plugins and project config for any non-registered
INI options, update or remove offending plugins/config keys, or register custom
options via pytest_addoption in conftest.py as appropriate, add a CI job that
runs pytest with --strict-config to catch regressions, and update
CONTRIBUTING/README to document the stricter INI key requirements for
contributors.
```yaml
- name: Install dependencies including integration test deps
  run: |
    python -m pip install --upgrade pip
    pip install poetry
    poetry install
    # Install optional integration test dependencies
    pip install langgraph
```
🧩 Analysis chain
🏁 Script executed:
```bash
cat -n tests/integration/ci-example.yml | head -70
```

Repository: aviralgarg05/agentunit
Length of output: 2233
🌐 Web query:
current recommended version actions/setup-python 2024 2025 latest
💡 Result:
Recommended: use actions/setup-python@v6 (current latest release v6.1.0). [1][2][3]
Sources:
- action repo README (examples use @v6). [1]
- releases (v6.1.0). [2]
- GitHub Marketplace docs / usage examples. [3]
🌐 Web query:
codecov/codecov-action latest version recommended 2024 2025
💡 Result:
Recommendation: use codecov/codecov-action@v5 (the v5 line). The repository README states "@v5 is recommended" and v5.x is the current major with recent patch releases (e.g., v5.5.1). [1][2]
Notes:
- v5 uses the Codecov CLI/uploader and is the supported path for tokenless/global-upload features; v3 and earlier do not include CLI features. [1][3]
Sources:
[1] codecov/codecov-action README (recommended @v5).
[2] codecov/codecov-action releases (v5.x current; example v5.5.1).
[3] Codecov docs on the Codecov Uploader/CLI.
CI example installs langgraph into the wrong environment (will be invisible to `poetry run`) and uses outdated action versions.

`pip install langgraph` runs against the base Python environment and won't be visible to `poetry run pytest`, which executes within Poetry's managed venv. Install via `poetry run pip install langgraph` or add langgraph to Poetry dependencies/extras. Additionally, upgrade action versions: `actions/setup-python@v6` (current latest) and `codecov/codecov-action@v5` (v3 is deprecated).
```diff
   - name: Install dependencies including integration test deps
     run: |
       python -m pip install --upgrade pip
       pip install poetry
       poetry install
       # Install optional integration test dependencies
-      pip install langgraph
+      poetry run pip install langgraph
```

Also update action versions:

- Lines 18, 43: `actions/setup-python@v4` → `actions/setup-python@v6`
- Line 33: `codecov/codecov-action@v3` → `codecov/codecov-action@v5`
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
- name: Install dependencies including integration test deps
  run: |
    python -m pip install --upgrade pip
    pip install poetry
    poetry install
    # Install optional integration test dependencies
    poetry run pip install langgraph
```
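Since the comment above also mentions Poetry dependencies/extras as an alternative, here is a hypothetical wiring (the version constraint and exact table layout are assumptions; the extras name matches the `poetry install --extras integration-tests` command quoted later in this thread):

```toml
# Hypothetical pyproject.toml wiring for langgraph as an optional extra
[tool.poetry.dependencies]
langgraph = { version = ">=0.2", optional = true }

[tool.poetry.extras]
integration-tests = ["langgraph"]
```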
```markdown
- Tests are automatically skipped if LangGraph is not installed
- Use pytest markers to selectively run or skip integration tests
- All tests are marked with `@pytest.mark.integration` and `@pytest.mark.langgraph`
```
Doc claim about markers is likely too strong.
Basic integration tests (e.g., fallback behavior) shouldn't necessarily be `@pytest.mark.langgraph`. Consider rephrasing to: "All tests are marked `integration`; LangGraph-dependent tests are marked `langgraph`."
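A short sketch of the proposed distinction (test bodies are illustrative):

```python
import pytest


@pytest.mark.integration
def test_fallback_without_langgraph():
    # Exercises the mock path; needs no optional dependency.
    ...


@pytest.mark.integration
@pytest.mark.langgraph
def test_real_graph_execution():
    # Only meaningful when LangGraph is installed.
    ...
```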
🤖 Prompt for AI Agents
In tests/integration/README.md around lines 64 to 67, the README overstates that
all integration tests are marked with both `integration` and `langgraph`; change
the wording to clarify that all integration tests use the `integration` marker
and only tests that depend on LangGraph use the `langgraph` marker. Update the
three bullet points to state that tests are skipped if LangGraph is not
installed, use pytest markers to selectively run or skip integration tests, and
that tests are marked `integration` while LangGraph-dependent tests are
additionally marked `langgraph`.
- Add langgraph as optional dependency in pyproject.toml
- Update installation instructions to use poetry extras
- Ensure LangGraph is properly managed within Poetry environment
- Address coderabbit review feedback about pip vs poetry installation
Actionable comments posted: 3
♻️ Duplicate comments (1)
tests/integration/README.md (1)
65-67: Clarify pytest marker claim—not all integration tests require the `langgraph` marker.

Line 67 overstates that "All tests are marked with `@pytest.mark.integration` and `@pytest.mark.langgraph`". Basic integration tests (e.g., fallback behavior, graceful skipping) should only use `@pytest.mark.integration`; the `@pytest.mark.langgraph` marker should only apply to tests that directly depend on LangGraph functionality. Update the three bullet points under "CI Integration" to accurately reflect this distinction.

Apply this diff to correct the claim:

```diff
- - Tests are automatically skipped if LangGraph is not installed
- - Use pytest markers to selectively run or skip integration tests
- - All tests are marked with `@pytest.mark.integration` and `@pytest.mark.langgraph`
+ - Tests are automatically skipped if LangGraph is not installed
+ - Use pytest markers to selectively run or skip integration tests
+ - All tests are marked `integration`; LangGraph-dependent tests are additionally marked `langgraph`
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- `pyproject.toml` (2 hunks)
- `tests/integration/IMPLEMENTATION_SUMMARY.md` (1 hunks)
- `tests/integration/README.md` (1 hunks)
- `tests/integration/ci-example.yml` (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- tests/integration/IMPLEMENTATION_SUMMARY.md
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
tests/integration/README.md
55-55: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
```yaml
steps:
  - uses: actions/checkout@v4
  - name: Set up Python ${{ matrix.python-version }}
    uses: actions/setup-python@v4
```
Update GitHub Actions to current recommended versions.
The workflow uses outdated action versions. Update to the current stable releases for security, feature parity, and future compatibility.
Apply these diffs to modernize the action versions:
```diff
 - name: Set up Python ${{ matrix.python-version }}
-  uses: actions/setup-python@v4
+  uses: actions/setup-python@v6
   with:
     python-version: ${{ matrix.python-version }}
```

```diff
 - name: Upload coverage to Codecov
-  uses: codecov/codecov-action@v3
+  uses: codecov/codecov-action@v5
```

```diff
 - name: Set up Python
-  uses: actions/setup-python@v4
+  uses: actions/setup-python@v6
   with:
     python-version: "3.11"
```

Also applies to: 43-43, 33-33
🤖 Prompt for AI Agents
In tests/integration/ci-example.yml around lines 18, 33 and 43, the workflow is
using outdated action versions; update each "uses:" entry to the current
recommended stable release tags (for example upgrade any actions/checkout,
actions/setup-python, actions/cache, actions/setup-node, etc., to their latest
major stable versions) by replacing the old version suffixes with the current
stable tags, verify compatibility in the action release notes, run the workflow
locally or via a test branch to confirm no breaking changes, and pin to a
specific major tag (not a floating tag) for future stability.
### What the Tests Cover

1. **Scenario Creation**: Tests creating scenarios from callable agents and Python files
Add language specification to fenced code block.
The bash code block is missing a language identifier, which breaks syntax highlighting and violates Markdown best practices.
Apply this diff to add the language tag:
1. **Scenario Creation**: Tests creating scenarios from callable agents and Python files
2. **Full Evaluation Cycle**: Tests running complete evaluation cycles with multiple test cases

Actually, looking more carefully, line 54 shows the start of the list. Let me check line 55 context again. The issue is on the code block around line 13-16 (bash block). Let me correct the line range:

````diff
- ```bash
+ ```
````

Wait, I need to look at the actual lines. In the annotated code starting around line 13, I see:
```
13~ # Install optional integration test dependencies
14~ poetry install --extras integration-tests
```
This is within a bash code fence that starts at line 13. The static analysis hint says line 55 is missing language spec, but looking at the code, line 55 doesn't show a code fence. Let me recount...
Actually, the file shows lines 13-16 have bash code, then 20-22 have another bash code block. Then further down around line 28-44 are bash code blocks. The static analysis error "55-55: Fenced code blocks should have a language specified" points to line 55 which shows:
```
54~ 1. **Scenario Creation**: Tests creating scenarios from callable agents and Python files
55~ 2. **Full Evaluation Cycle**: Tests running complete evaluation cycles with multiple test cases
```
This doesn't match. Let me re-examine. The static analysis hint says line 55 has a fenced code block issue. Looking back at the annotated code, around line 55 is where? Let me count more carefully...
The annotation shows lines 1-97. Line 55 in that context is in the "What the Tests Cover" section which is NOT a code block. The static analysis tool might be flagging something else. Let me look for any unclosed or incorrectly specified code fences...
Actually, I see the issue. Look at the Example Usage section (lines 79-97). It has a python code fence starting at line 81:
```python
import pytest
...
```
This is properly marked. So the static analysis hint might be referring to one of the earlier bash blocks. Let me check if any are missing the bash language tag in the original file... The code shown has ```bash markers properly, so this might be a false positive from the static analysis tool, OR the static analysis tool is pointing to the Example Usage section's python block not being properly closed.
For safety, I'll include a review comment about ensuring all code fences have language specifications, using a more general approach.
🤖 Prompt for AI Agents
In tests/integration/README.md around lines 54-54, there are fenced code blocks
missing a language specifier; update the offending fences (e.g., the bash block
starting near lines ~13-16 and any other unlabelled fences) by adding the
appropriate language identifier (bash or python) to the opening fence,
ensure each fence is properly closed, and scan the file to confirm every fenced
code block includes a language tag.
### Example Usage

```python
import pytest
from agentunit import Scenario, run_suite
from tests.integration.simple_langgraph_agent import invoke_agent


@pytest.mark.langgraph
@pytest.mark.integration
def test_my_langgraph_scenario():
    scenario = Scenario.load_langgraph(
        path=invoke_agent,
        dataset=my_dataset,
        name="my-test"
    )

    result = run_suite([scenario])
    assert len(result.scenarios) == 1
```
Ensure all fenced code blocks have explicit language specifications.
Per markdownlint (MD040), all fenced code blocks should declare their language. Review the code blocks in the Example Usage section and earlier to ensure they are properly specified (e.g., ```python, ```bash, ```yaml).
🤖 Prompt for AI Agents
In tests/integration/README.md around lines 79 to 97 the fenced code block(s)
under "Example Usage" are missing explicit language identifiers; update each
triple-backtick fence to include the appropriate language (e.g., ```python for
the Python snippet) and scan earlier sections for any other fenced blocks
without language tags, adding the correct language specifiers (bash, yaml, etc.)
to satisfy markdownlint MD040.
- Update actions/setup-python from v4 to v5
- Update codecov/codecov-action from v3 to v4
- Address coderabbit security and compatibility recommendations
Actionable comments posted: 2
♻️ Duplicate comments (1)
tests/integration/ci-example.yml (1)
18-18: Update GitHub Actions to current recommended majors (setup-python / codecov).

This repeats prior review feedback: `actions/setup-python@v5` and `codecov/codecov-action@v4` may be behind current recommended majors; please confirm latest and bump accordingly (typically `setup-python@v6`, `codecov-action@v5`).

```diff
-  uses: actions/setup-python@v5
+  uses: actions/setup-python@v6
@@
-  uses: codecov/codecov-action@v4
+  uses: codecov/codecov-action@v5
@@
-  uses: actions/setup-python@v5
+  uses: actions/setup-python@v6
```

What is the current recommended major version for actions/setup-python? What is the current recommended major version for codecov/codecov-action?

Also applies to: 33-33, 43-43
```yaml
integration-tests:
  runs-on: ubuntu-latest
  # Only run integration tests on main branch or when explicitly requested
  if: github.ref == 'refs/heads/main' || contains(github.event.pull_request.labels.*.name, 'run-integration-tests')
```
Guard `pull_request` context in the job `if:` to avoid push-event evaluation failures.

On push, `github.event.pull_request` isn't present; rewrite to avoid touching it unless `event_name == 'pull_request'`.
```diff
-  if: github.ref == 'refs/heads/main' || contains(github.event.pull_request.labels.*.name, 'run-integration-tests')
+  if: github.ref == 'refs/heads/main' || (github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'run-integration-tests'))
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
if: github.ref == 'refs/heads/main' || (github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'run-integration-tests'))
```
🤖 Prompt for AI Agents
In tests/integration/ci-example.yml around line 38, the job if: currently
accesses github.event.pull_request.labels unguarded which will fail on push
events; change the condition to first check the event name before touching
pull_request (e.g. keep the main-branch check, and combine with a guarded
pull_request check using github.event_name == 'pull_request' &&
contains(github.event.pull_request.labels.*.name, 'run-integration-tests')).
Ensure the final boolean expression evaluates safely on push events by only
referencing github.event.pull_request when github.event_name is 'pull_request'.
- Correct indentation for steps sections in both jobs
- Ensure proper YAML syntax for GitHub Actions workflow
- Address coderabbit YAML formatting feedback
- Update langchain version constraint to be compatible with langgraph
- Resolve langchain-core version conflicts between langchain and langgraph
- Update poetry.lock with resolved dependencies
- Fix CI test failures caused by outdated lock file
@aviralgarg05 I have fixed the issue.
- Change absolute imports to relative imports in test files
- Add missing tests/__init__.py for proper package structure
- Fix all linting errors (whitespace, formatting, imports)
- Improve code quality in simple_langgraph_agent.py
- Resolve ModuleNotFoundError issues in CI

This addresses the test import failures and ensures proper Python package structure for the tests directory.
Actionable comments posted: 1
🧹 Nitpick comments (5)
tests/integration/test_integration_basic.py (2)
42-47: Simplify redundant assertion logic.

The assertion on line 47 is somewhat redundant with the check on lines 42-44. When `LANGGRAPH_AVAILABLE` is False, line 44 already asserts `"Mock response"` is in the result. The comment on line 46 suggests checking if the query appears in the response, but the assertion specifically checks for "quantum" rather than validating the query content generically.

Consider simplifying to:

```diff
-    # Should contain the query in the response
-    assert "quantum" in result["result"].lower() or "Mock response" in result["result"]
+    # Response should be relevant to the query
+    assert any(term in result["result"].lower() for term in ["quantum", "mock"])
```
50-55: Placeholder test provides no validation.

This test has no assertions and doesn't validate that pytest markers are configured. Consider either implementing actual validation or removing it.
You could verify markers programmatically:
```diff
 @pytest.mark.integration
 def test_pytest_markers_configured():
     """Test that pytest markers are properly configured."""
-
-    # This test itself should have the integration marker
-    # We can't easily test this programmatically, but the test runner will validate it
+    import pytest
+
+    # Verify markers are registered in pytest config
+    config = pytest.Config.fromdictargs({}, [])
+    markers = {marker.name for marker in config.getini("markers")}
+
+    assert "integration" in markers
+    assert "langgraph" in markers
```

tests/integration/test_langgraph_integration.py (3)
13-14: Unused variable from `pytest.importorskip`.

The `langgraph` variable is assigned but never used. The `importorskip` call will skip tests regardless of assignment.

```diff
-# Skip all tests in this module if LangGraph is not available
-langgraph = pytest.importorskip("langgraph", reason="LangGraph not installed")
+# Skip all tests in this module if LangGraph is not available
+pytest.importorskip("langgraph", reason="LangGraph not installed")
```
90-104: Use relative path instead of hardcoded path.

Line 93 uses a hardcoded path that assumes tests run from the repository root. This can fail if tests are executed from a different directory.
Apply this diff to use a relative path:
```diff
 def test_langgraph_scenario_from_python_file(self):
     """Test creating a scenario from a Python file."""
-    # Create a temporary Python file with the agent
-    agent_file = Path("tests/integration/simple_langgraph_agent.py")
+    agent_file = Path(__file__).parent / "simple_langgraph_agent.py"
```
172-191: Test doesn't verify retry behavior.

This test sets `retries = 2` but uses `invoke_agent` which always succeeds. The retry logic is never exercised or validated. Consider adding a test with an agent that fails initially but succeeds on retry.

Add a test that actually triggers retries:

```python
def test_scenario_with_retries(self):
    """Test scenario retry functionality."""
    attempt_count = {"value": 0}

    def flaky_agent(payload):
        """Agent that fails first time, succeeds on retry."""
        attempt_count["value"] += 1
        if attempt_count["value"] == 1:
            raise ValueError("First attempt fails")
        return {"result": f"Success on attempt {attempt_count['value']}"}

    scenario = Scenario.load_langgraph(
        path=flaky_agent,
        dataset=SimpleDataset(),
        name="retry-test"
    )
    scenario.retries = 2

    result = run_suite([scenario])
    scenario_result = result.scenarios[0]

    # Should succeed after retry
    for run in scenario_result.runs:
        assert run.success is True
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- `tests/__init__.py` (1 hunks)
- `tests/integration/__init__.py` (1 hunks)
- `tests/integration/conftest.py` (1 hunks)
- `tests/integration/simple_langgraph_agent.py` (1 hunks)
- `tests/integration/test_integration_basic.py` (1 hunks)
- `tests/integration/test_langgraph_integration.py` (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- `tests/__init__.py`
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/integration/conftest.py
- `tests/integration/__init__.py`
🧰 Additional context used
🧬 Code graph analysis (3)
tests/integration/test_langgraph_integration.py (5)
src/agentunit/core/scenario.py (3)
- `Scenario` (21-270)
- `load_langgraph` (41-53)
- `clone` (257-270)

tests/integration/simple_langgraph_agent.py (1)

- `invoke_agent` (99-133)

src/agentunit/core/trace.py (1)

- `TraceLog` (22-72)

src/agentunit/adapters/langgraph.py (1)

- `LangGraphAdapter` (34-142)

src/agentunit/adapters/registry.py (1)

- `resolve_adapter` (39-45)
tests/integration/simple_langgraph_agent.py (2)
src/agentunit/cli/__init__.py (1)
- `get` (80-89)

tests/test_framework_adapters.py (1)
- `agent` (90-92)
tests/integration/test_integration_basic.py (1)
tests/integration/simple_langgraph_agent.py (1)
- `invoke_agent` (99-133)
🔇 Additional comments (11)
tests/integration/test_langgraph_integration.py (6)
17-48: LGTM!

The `SimpleDataset` class provides good test coverage with three diverse test cases covering different query types and difficulty levels.
105-131: LGTM!

Comprehensive test of the full evaluation cycle with proper validation of results, including success status, duration, and error handling.
151-171: LGTM!

The error handling test properly validates that failures are recorded with error messages and don't crash the test suite.
233-253: LGTM!

Well-structured test for running multiple independent scenarios in parallel, with proper validation that each scenario executes correctly.
216-229: The adapter alias is properly registered.

The `"langgraph_graph"` alias is correctly defined in `src/agentunit/adapters/langgraph.py` at line 145 via `register_adapter(LangGraphAdapter, aliases=("langgraph_graph",))`. The test assertion will pass as expected.
132-150: Use a valid metric name instead of "response_length".

The metric `"response_length"` is not defined in the system. The available metrics are: `faithfulness`, `tool_success`, `answer_correctness`, `hallucination_rate`, and `retrieval_quality`. Using an undefined metric will raise a `KeyError` in `resolve_metrics()` before the test can reach the assertion. Replace the metric name with one of the valid options or mock the metrics registry for this test.

Likely an incorrect or invalid review comment.
tests/integration/simple_langgraph_agent.py (5)
1-32: LGTM!

Excellent use of graceful dependency handling with `LANGGRAPH_AVAILABLE` flag and fallback types. This allows tests to run even when LangGraph is not installed, with clear distinction between mock and real behavior.
34-41: LGTM!

The `AgentState` TypedDict properly defines the expected state structure for the agent.
43-97: LGTM!

The agent implementation is appropriately simple for testing purposes. Deterministic responses based on query keywords make test assertions predictable. The graph structure is minimal but correctly follows LangGraph patterns.
99-133: LGTM!

The `invoke_agent` function correctly handles both mock and real execution paths. The response extraction logic safely handles edge cases with proper None checks.
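An illustrative sketch of this dual-path shape (the names `LANGGRAPH_AVAILABLE` and `create_simple_agent` come from the module under review; the exact bodies here are assumptions):

```python
# Illustrative dual-path invocation: real graph when available,
# deterministic mock otherwise; both branches return the same shape.
def invoke_agent(payload):
    query = payload.get("query", "")
    if LANGGRAPH_AVAILABLE:
        graph = create_simple_agent()  # fresh graph per call
        state = graph.invoke({"query": query, "messages": []})
        messages = state.get("messages") or []
        result = messages[-1].content if messages else ""
    else:
        result = f"Mock response to: {query}"  # fallback path
    return {"result": result}
```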
136-140: LGTM!

Module-level graph initialization is appropriately conditional and provides a pre-built graph for direct usage while handling the unavailable case gracefully.
```python
from agentunit.core.trace import TraceLog

trace = TraceLog()
outcome = scenario.adapter.execute(test_case, trace)

assert outcome.success is True
assert outcome.output is not None
assert "quantum" in outcome.output.lower()
```
Add type guard before string operation.
Line 88 calls `.lower()` on `outcome.output` without verifying it's a string. According to the `AdapterOutcome` structure, `output` can be `None`.
Apply this diff to add a type guard:
```diff
 outcome = scenario.adapter.execute(test_case, trace)

 assert outcome.success is True
 assert outcome.output is not None
+assert isinstance(outcome.output, str)
 assert "quantum" in outcome.output.lower()
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
from agentunit.core.trace import TraceLog

trace = TraceLog()
outcome = scenario.adapter.execute(test_case, trace)

assert outcome.success is True
assert outcome.output is not None
assert isinstance(outcome.output, str)
assert "quantum" in outcome.output.lower()
```
🤖 Prompt for AI Agents
In tests/integration/test_langgraph_integration.py around lines 82 to 88,
outcome.output may be None but the test calls outcome.output.lower(); add a type
guard before calling .lower() by asserting outcome.output is a string (e.g.,
assert isinstance(outcome.output, str)) or checking isinstance(outcome.output,
str) and then perform the lower() check only inside that branch so the test
won’t raise when output is None.
@aviralgarg05 Could you please run the tests/workflows for this PR again? Thank you.
- Add proper `__init__` method that calls `super()` with required name and loader
- Change `iter_cases` to `_generate_cases` method that returns a list
- This fixes the TypeError about missing positional arguments

Resolves the 8 failing LangGraph integration tests.
Actionable comments posted: 4
♻️ Duplicate comments (1)
tests/integration/test_langgraph_integration.py (1)
86-92: Guard `outcome.output` type before calling `.lower()` to avoid AttributeError.

This matches the earlier review note: `AdapterOutcome.output` may be non-string (or `None`), so `.lower()` can raise.

```diff
 outcome = scenario.adapter.execute(test_case, trace)

 assert outcome.success is True
 assert outcome.output is not None
+assert isinstance(outcome.output, str)
 assert "quantum" in outcome.output.lower()
```
```python
# Create a temporary Python file with the agent
agent_file = Path("tests/integration/simple_langgraph_agent.py")

scenario = Scenario.load_langgraph(
    path=agent_file,
    dataset=SimpleDataset(),
    name="test-langgraph-file",
    callable="invoke_agent"  # Specify the callable name
)
```
Fix cwd-dependent test path: build the agent file path relative to this test file.
Path("tests/integration/simple_langgraph_agent.py") will fail if pytest is invoked from a different working directory (common in CI).
```diff
-    agent_file = Path("tests/integration/simple_langgraph_agent.py")
+    agent_file = Path(__file__).resolve().parent / "simple_langgraph_agent.py"
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# Create a temporary Python file with the agent
agent_file = Path(__file__).resolve().parent / "simple_langgraph_agent.py"

scenario = Scenario.load_langgraph(
    path=agent_file,
    dataset=SimpleDataset(),
    name="test-langgraph-file",
    callable="invoke_agent"  # Specify the callable name
)
```
🤖 Prompt for AI Agents
In tests/integration/test_langgraph_integration.py around lines 96 to 104, the
test constructs agent_file using a cwd-dependent relative path which breaks when
pytest runs from another working directory; instead build the path relative to
the test file by using the test file's directory (e.g.
Path(__file__).resolve().parent / "simple_langgraph_agent.py") so the test
locates the agent file regardless of current working directory.
```python
for run in scenario_result.runs:
    assert run.success is True
    assert run.duration_ms > 0
    assert run.error is None
```
Avoid flaky timing assertion: `duration_ms` can legitimately be 0ms.
```diff
 for run in scenario_result.runs:
     assert run.success is True
-    assert run.duration_ms > 0
+    assert run.duration_ms >= 0
     assert run.error is None
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
for run in scenario_result.runs:
    assert run.success is True
    assert run.duration_ms >= 0
    assert run.error is None
```
🤖 Prompt for AI Agents
In tests/integration/test_langgraph_integration.py around lines 131 to 134, the
assertion "assert run.duration_ms > 0" is flaky because duration can
legitimately be 0ms; change this to check for non-negativity instead (e.g.,
"assert run.duration_ms >= 0") or remove the timing assertion entirely if timing
isn't critical, ensuring the test only asserts that duration is not negative
rather than strictly positive.
```python
cloned = original.clone(
    name="cloned",
    retries=3,
    timeout=120.0
)

assert cloned.name == "cloned"
assert cloned.retries == 3
assert cloned.timeout == 120.0
assert original.name == "original"  # Original unchanged
assert original.retries == 1  # Default value
```
Don't hardcode default `Scenario.retries` in test; assert "unchanged" instead.

`assert original.retries == 1` bakes in an implementation default and will break on harmless default changes.
```diff
 cloned = original.clone(
     name="cloned",
     retries=3,
     timeout=120.0
 )

 assert cloned.name == "cloned"
 assert cloned.retries == 3
 assert cloned.timeout == 120.0
 assert original.name == "original"  # Original unchanged
-assert original.retries == 1  # Default value
+assert original.retries != cloned.retries
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
cloned = original.clone(
    name="cloned",
    retries=3,
    timeout=120.0
)

assert cloned.name == "cloned"
assert cloned.retries == 3
assert cloned.timeout == 120.0
assert original.name == "original"  # Original unchanged
assert original.retries != cloned.retries
```
🤖 Prompt for AI Agents
In tests/integration/test_langgraph_integration.py around lines 207 to 218, the
test currently asserts a hardcoded default original.retries == 1 which pins
behavior to an implementation detail; instead capture the original retries value
before cloning (e.g. original_retries = original.retries) and after cloning
assert original.retries == original_retries (or assert that the original value
is unchanged) rather than asserting a specific numeric default.
```python
assert len(result.scenarios) == 3
for i, scenario_result in enumerate(result.scenarios):
    assert scenario_result.name == f"scenario-{i}"
    assert len(scenario_result.runs) == 3  # Each has 3 test cases
```
Don't rely on `run_suite()` preserving scenario order; assert by name set.
If run_suite() parallelizes or sorts internally, ordering may change and this becomes flaky.
```diff
 result = run_suite(scenarios)

 assert len(result.scenarios) == 3
-for i, scenario_result in enumerate(result.scenarios):
-    assert scenario_result.name == f"scenario-{i}"
-    assert len(scenario_result.runs) == 3  # Each has 3 test cases
+names = {sr.name for sr in result.scenarios}
+assert names == {f"scenario-{i}" for i in range(3)}
+for scenario_result in result.scenarios:
+    assert len(scenario_result.runs) == 3  # Each has 3 test cases
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
assert len(result.scenarios) == 3
names = {sr.name for sr in result.scenarios}
assert names == {f"scenario-{i}" for i in range(3)}
for scenario_result in result.scenarios:
    assert len(scenario_result.runs) == 3  # Each has 3 test cases
```
🤖 Prompt for AI Agents
In tests/integration/test_langgraph_integration.py around lines 254 to 257, the
test currently assumes run_suite() preserves scenario order which can be flaky;
change the assertions to validate scenario names as a set and then look up
scenarios by name to assert their runs. Specifically, assert that the set of
scenario_result.name values equals {"scenario-0","scenario-1","scenario-2"},
then build a mapping from name to scenario_result and for each expected name
assert len(mapping[name].runs) == 3.
- Replace invalid 'response_length' metric with 'faithfulness' metric
- Fix missing newline at end of `tests/__init__.py` file
- This should resolve the last failing test and linting issues

All LangGraph integration tests should now pass successfully.
@aviralgarg05 If possible, could you please re-run the tests/workflows for this PR? Thank you for your time and support.
- Run ruff format on all integration test files
- Fix formatting for conftest.py, simple_langgraph_agent.py, test_integration_basic.py, and test_langgraph_integration.py
- This resolves the 'Run ruff formatter check' CI failure

All linting and formatting checks should now pass.
@aviralgarg05 I have fixed the lint & format issue; could you please re-run the tests/workflows for this PR? Thank you for your time and support.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/integration/simple_langgraph_agent.py (1)
75-89: Consider simplifying the edge pattern.

Since `should_continue` always returns `END`, the conditional edge pattern at line 89 could be simplified to a direct edge: `workflow.add_edge("process", END)`. The current implementation works correctly but is more complex than necessary for this deterministic flow.

Apply this diff to simplify:

```diff
-    def should_continue(state: AgentState) -> str:
-        """Determine if the agent should continue processing."""
-        return END
-
     # Create the graph
     workflow = StateGraph(AgentState)

     # Add nodes
     workflow.add_node("process", process_query)

     # Set entry point
     workflow.set_entry_point("process")

     # Add edges
-    workflow.add_conditional_edges("process", should_continue, {END: END})
+    workflow.add_edge("process", END)
```

tests/integration/test_integration_basic.py (1)
50-55: Consider removing or enhancing this placeholder test.

This test contains no assertions and always passes. While pytest markers can be hard to test programmatically, consider either:
- Adding runtime assertions (e.g., checking the `pytest.mark` registry or test node metadata via a pytest hook in `conftest.py`)
- Documenting in the docstring why this is intentionally minimal
- Removing the test if it serves no verification purpose
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- `tests/integration/conftest.py` (1 hunks)
- `tests/integration/simple_langgraph_agent.py` (1 hunks)
- `tests/integration/test_integration_basic.py` (1 hunks)
- `tests/integration/test_langgraph_integration.py` (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/integration/conftest.py
- tests/integration/test_langgraph_integration.py
🧰 Additional context used
🧬 Code graph analysis (2)
tests/integration/simple_langgraph_agent.py (1)
tests/test_framework_adapters.py (1)
- `agent` (90-92)
tests/integration/test_integration_basic.py (1)
tests/integration/simple_langgraph_agent.py (1)
- `invoke_agent` (95-129)
🔇 Additional comments (5)
tests/integration/simple_langgraph_agent.py (4)
8-34: Good fallback pattern for optional dependencies.

The conditional import with fallback stubs is well-implemented and allows the test module to be imported regardless of LangGraph availability. The simplified `add_messages` fallback is acceptable for test scenarios.
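For context, a plausible shape of such a simplified fallback (illustrative, not the file's exact code):

```python
# Simplified stand-in for LangGraph's add_messages reducer: plain
# list concatenation is enough for deterministic test scenarios.
def add_messages(left, right):
    return list(left) + list(right)
```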
36-44: LGTM!

The `AgentState` TypedDict provides clear structure for the agent's state and aligns well with typical LangGraph patterns.
95-129: Well-structured invocation logic with proper fallback.

The function correctly handles both LangGraph-available and unavailable cases, with matching response structures. The safe extraction of the final message (lines 118-121) properly handles edge cases.
132-136: Module-level graph initialization is acceptable for test code.

Creating the graph at module import time is a reasonable approach for a test agent, as `create_simple_agent()` is lightweight and deterministic.

tests/integration/test_integration_basic.py (1)
23-48: LGTM! Comprehensive fallback testing.

The test effectively validates both the mock response (when LangGraph is unavailable) and the real response structure, with appropriate assertions for each case.
All required checks are passing.
Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

ℹ️ You can also turn on project coverage checks and project coverage reporting on Pull Request comment.

Thanks for integrating Codecov - We've got you covered ☂️
* Add LangGraph integration tests
  - Create tests/integration/ directory with comprehensive LangGraph tests
  - Add simple_langgraph_agent.py with mock LangGraph agent for testing
  - Implement test_langgraph_integration.py with full evaluation cycle tests
  - Configure pytest markers for integration and langgraph tests
  - Add graceful dependency handling - tests skip when LangGraph not installed
  - Include comprehensive documentation and CI examples
  - Update main README with integration test instructions
  Fixes aviralgarg05#24
* Fix LangGraph installation instructions for Poetry environment
  - Add langgraph as optional dependency in pyproject.toml
  - Update installation instructions to use poetry extras
  - Ensure LangGraph is properly managed within Poetry environment
  - Address coderabbit review feedback about pip vs poetry installation
* Update GitHub Actions to current recommended versions
  - Update actions/setup-python from v4 to v5
  - Update codecov/codecov-action from v3 to v4
  - Address coderabbit security and compatibility recommendations
* Fix YAML indentation in CI example file
  - Correct indentation for steps sections in both jobs
  - Ensure proper YAML syntax for GitHub Actions workflow
  - Address coderabbit YAML formatting feedback
* Fix dependency conflicts and update poetry.lock
  - Update langchain version constraint to be compatible with langgraph
  - Resolve langchain-core version conflicts between langchain and langgraph
  - Update poetry.lock with resolved dependencies
  - Fix CI test failures caused by outdated lock file
* Fix import issues and linting errors in integration tests
  - Change absolute imports to relative imports in test files
  - Add missing tests/__init__.py for proper package structure
  - Fix all linting errors (whitespace, formatting, imports)
  - Improve code quality in simple_langgraph_agent.py
  - Resolve ModuleNotFoundError issues in CI
  This addresses the test import failures and ensures proper Python package structure for the tests directory.
* Fix SimpleDataset constructor to properly inherit from DatasetSource
  - Add proper __init__ method that calls super() with required name and loader
  - Change iter_cases to _generate_cases method that returns a list
  - This fixes the TypeError about missing positional arguments
  Resolves the 8 failing LangGraph integration tests.
* Fix final test issue and linting error
  - Replace invalid 'response_length' metric with 'faithfulness' metric
  - Fix missing newline at end of tests/__init__.py file
  - This should resolve the last failing test and linting issues
  All LangGraph integration tests should now pass successfully.
* Fix code formatting issues in integration tests
  - Run ruff format on all integration test files
  - Fix formatting for conftest.py, simple_langgraph_agent.py, test_integration_basic.py, and test_langgraph_integration.py
  - This resolves the 'Run ruff formatter check' CI failure

🎯 Overview
This PR implements comprehensive integration tests for LangGraph with AgentUnit, addressing issue #24.
✅ Tasks Completed
- `tests/integration/`

🧪 What's Added
New Files
- `tests/integration/` - New integration test directory
- `simple_langgraph_agent.py` - Mock LangGraph agent for testing
- `test_langgraph_integration.py` - Comprehensive integration tests
- `test_integration_basic.py` - Basic structure validation tests
- `conftest.py` - Pytest configuration and markers
- `README.md` - Complete documentation
- `ci-example.yml` - CI configuration example

Updated Files
- `pyproject.toml` - Added pytest markers configuration
- `README.md` - Updated testing section with integration test info

🔧 Key Features
Graceful Dependency Handling
- `pytest.importorskip()` for proper dependency checking

Comprehensive Test Coverage
CI Ready
- Pytest markers (`integration`, `langgraph`)

🚀 Usage
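The usage details were truncated in this view; as an assumed illustration based on the markers configured in `pyproject.toml` earlier in this thread:

```bash
# Hypothetical invocations derived from the configured markers
poetry run pytest -m integration          # run only integration tests
poetry run pytest -m "not integration"    # skip integration tests
poetry run pytest -m langgraph            # run LangGraph-dependent tests
```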