Fix ScenarioRun Duration Tracking in SwarmAdapter #63
Conversation
Signed-off-by: Jagriti-student <jagriti7989@gmail.com>
Walkthrough
The changes implement duration tracking in the SwarmAdapter by computing `duration_ms` from the session's `duration_seconds` metric and populating it on `ScenarioRun`.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 3 failed (3 warnings)
Codecov Report: ❌ Patch coverage is …
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pyproject.toml (1)
23-37: Confirm numpy is still available or remove direct imports. Direct `numpy` imports were found in the codebase:
- `tests/test_comparison.py:5`
- `src/agentunit/production/__init___old.py:13` (appears to be a deprecated backup file)

While `scipy` will bring in `numpy` as a transitive dependency, the test file has a direct import. Either add `numpy` back as an explicit dependency, or remove the direct `numpy` imports and rely solely on scipy's transitive numpy.

src/agentunit/adapters/swarm_adapter.py (1)
1-1: Pipeline failure: Ruff format check failed. The CI pipeline indicates that this file needs reformatting. Please run `poetry run ruff format src/agentunit/adapters/swarm_adapter.py` to fix the formatting issues before merging.
🤖 Fix all issues with AI agents

In `src/agentunit/reporting/results.py`:
- Around lines 203-204: When `rows` is empty, the function returns `target` without creating the file. Change the early-return branch so it creates the file (either an empty file or a headers-only one) before returning; for example, call `target.write_text("", encoding="utf-8")` (or write the header text) in the branch that currently checks `if not rows:` and then `return target`. This ensures the file is always created.
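A minimal sketch of the suggested fix, assuming the exporter has roughly this shape (the function name `export_csv` and its signature are illustrative, not the actual `results.py` code):

```python
import csv
from pathlib import Path
from typing import Any


def export_csv(rows: list[dict[str, Any]], target: Path) -> Path:
    """Write rows to target as CSV, creating the file even when rows is empty."""
    if not rows:
        # The bug: an early `return target` here skipped file creation.
        # Writing an empty file ensures the path always exists for callers.
        target.write_text("", encoding="utf-8")
        return target

    fieldnames = sorted({key for row in rows for key in row})
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return target
```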
🧹 Nitpick comments (3)
src/agentunit/reporting/results.py (2)
82-92: `_flatten_metrics` only handles single-level nesting. The helper only flattens one level of nested dictionaries. If `value` is a dict containing another dict, the inner dict will be serialized as a string in the CSV (via `DictWriter`'s default behavior), which may not be the intended outcome. Consider using recursion if deeper nesting is expected, or document this limitation.

♻️ Optional: Recursive flattening

```diff
 def _flatten_metrics(metrics: dict[str, Any], prefix: str = "metric") -> dict[str, Any]:
+    """Flatten nested metric dictionaries recursively."""
     flat: dict[str, Any] = {}
     for key, value in metrics.items():
+        full_key = f"{prefix}_{key}"
         if isinstance(value, dict):
-            for inner_key, inner_value in value.items():
-                flat[f"{prefix}_{key}_{inner_key}"] = inner_value
+            flat.update(_flatten_metrics(value, full_key))
         else:
-            flat[f"{prefix}_{key}"] = value
+            flat[full_key] = value
     return flat
```
206-211: Consider a more intuitive column ordering. Alphabetical sorting places columns like `case_id` before `scenario_name`. For better readability, consider placing key identifying columns first.

♻️ Optional: Fixed column order with dynamic metrics

```diff
-    fieldnames = sorted({key for row in rows for key in row})
+    # Fixed columns first, then sorted metric columns
+    fixed_cols = ["scenario_name", "case_id", "success", "duration_ms", "error"]
+    metric_cols = sorted({k for row in rows for k in row if k not in fixed_cols})
+    fieldnames = fixed_cols + metric_cols
```

src/agentunit/adapters/swarm_adapter.py (1)
353-369: Consider consolidating local imports. `ScenarioRun` and `ScenarioResult` are imported separately at lines 353 and 369. These could be combined into a single import statement for cleaner code.

♻️ Consolidate imports

```diff
     # Create trace log
     from agentunit.core.trace import TraceLog
+    from agentunit.reporting.results import ScenarioResult, ScenarioRun

     trace = TraceLog()
     trace.record(
         "session_complete",
         session_id=session_id,
         metrics=metrics,
     )

     # Create scenario run
-    from agentunit.reporting.results import ScenarioRun
-
     scenario_run = ScenarioRun(
         ...
     )

     # Create result
-    from agentunit.reporting.results import ScenarioResult
-
     result = ScenarioResult(
         ...
     )
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (4)
- pyproject.toml
- src/agentunit/adapters/swarm_adapter.py
- src/agentunit/reporting/results.py
- tests/test_reporting.py
💤 Files with no reviewable changes (1)
- tests/test_reporting.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/agentunit/adapters/swarm_adapter.py (2)
src/agentunit/core/trace.py (2)
- `TraceLog` (26-78)
- `record` (33-34)

src/agentunit/reporting/results.py (1)
- `ScenarioRun` (29-36)
🪛 GitHub Actions: CI
src/agentunit/adapters/swarm_adapter.py
[error] 1-1: Ruff format check failed. 1 file would be reformatted: src/agentunit/adapters/swarm_adapter.py. Command: poetry run ruff format --check src/ tests/.
🔇 Additional comments (7)
pyproject.toml (3)
27-27: LGTM on langchain constraint. The version constraint `>=0.0.353,<0.4.0` is reasonable and unchanged from before.
78-82: LGTM on pytest configuration. The `-ra` flag configuration is appropriate for showing test summary information. This appears to be a formatting-only change with no functional impact.
35-41: Extras configuration and dependency versions are correct. The `integration-tests` extra correctly references the optional `langgraph` dependency, and jsonschema version `4.25.1` is available on PyPI. The configuration aligns with the PR objectives.

src/agentunit/reporting/results.py (1)
100-120: LGTM! The addition of `started_at` and `finished_at` fields to `SuiteResult`, with ISO formatting in `to_dict()` and `to_json()`, is well implemented. These timing metadata fields complement the duration tracking changes in the adapter.
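The ISO-formatting pattern described here can be illustrated with a small stand-in (this is not the actual `SuiteResult` class, just the serialization idea):

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass
class SuiteTiming:
    """Stand-in for the timing fields added to SuiteResult."""
    started_at: datetime
    finished_at: datetime

    def to_dict(self) -> dict[str, Any]:
        # ISO-8601 strings serialize cleanly to JSON and sort lexicographically
        return {
            "started_at": self.started_at.isoformat(),
            "finished_at": self.finished_at.isoformat(),
        }
```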
17-25: LGTM! Good pattern for handling optional dependencies. The import guard with the `HAS_SWARM` flag, combined with the check in `__init__` (lines 51-56), provides clear error messaging when the dependency is missing.
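The guard pattern described here looks roughly like this (a generic sketch of the technique, not the actual adapter code):

```python
# Optional-dependency import guard: the module imports cleanly even when the
# optional package is absent, and fails loudly only when the adapter is used.
try:
    import swarm  # the optional dependency
    HAS_SWARM = True
except ImportError:
    swarm = None
    HAS_SWARM = False


class SwarmAdapterSketch:
    """Illustrative adapter shell; raises on construction if swarm is missing."""

    def __init__(self) -> None:
        if not HAS_SWARM:
            raise ImportError(
                "swarm is not installed; install the optional extra to use this adapter"
            )
```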
331-337: LGTM - Correctly implements duration tracking. This change properly addresses the PR objective by calculating `duration_ms` from session timestamps. The conversion from seconds to milliseconds (`duration_seconds * 1000`) is correct, and the fallback to `0.0` provides defensive handling for edge cases.
355-366: LGTM! The `ScenarioRun` creation correctly uses the calculated `duration_ms` value (line 364), which is the core fix for this PR. The conditional logic for `scenario_name` is appropriately defensive.
I have solved the issue.

Currently, `ScenarioRun` duration is hardcoded to `0.0` in `SwarmAdapter`, which does not reflect the actual runtime of the scenario.

This PR updates `SwarmAdapter.end_session` to:
- Use the already-calculated `duration_seconds` metric.
- Convert it to milliseconds.
- Populate the `duration_ms` field in `ScenarioRun`.

This ensures that scenario durations are accurately recorded, improving metrics tracking and reporting consistency.
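The core change can be sketched as follows (simplified; the hypothetical `compute_duration_ms` helper isolates the conversion that `end_session` performs inline):

```python
from typing import Any


def compute_duration_ms(metrics: dict[str, Any]) -> float:
    """Convert the session's duration_seconds metric to milliseconds.

    Falls back to 0.0 when the metric is missing, matching the defensive
    handling described in the review.
    """
    duration_seconds = metrics.get("duration_seconds")
    if duration_seconds is None:
        return 0.0
    return float(duration_seconds) * 1000
```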
Closes #54