Add CSV export support for SuiteResult (Add csv reports, #62)
Conversation
Signed-off-by: Jagriti-student <jagriti7989@gmail.com>
Walkthrough

Adds CSV export to SuiteResult with metric flattening, refactors JUnit output to a single testsuite and adjusts time reporting, enhances Markdown rendering helpers, and updates pyproject.toml (removes numpy, reorders dependencies, adds an integration-tests extra).
Sequence Diagram(s): omitted (changes are internal export additions and formatting; no multi-component sequential flow requiring visualization).

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 3 failed (2 warnings, 1 inconclusive)
Codecov Report: ❌ Patch coverage is
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/agentunit/reporting/results.py (2)
83-95: One-level flattening is intentional but should be documented.

The helper only flattens one level of nesting. If `metrics` contains deeper structures like `{"outer": {"inner": {"deep": 1}}}`, the `inner` dict becomes a cell value (e.g., `"{'deep': 1}"`), which may not be spreadsheet-friendly. If deeper nesting is a realistic scenario, consider making this recursive or adding a docstring clarifying the depth limit.
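If a recursive variant is ever needed, it could look like the sketch below. The `metric_` prefix follows the naming used in this PR; the function name and the underscore-joined key scheme are assumptions, not the PR's actual code.

```python
from typing import Any


def flatten_metrics(metrics: dict[str, Any], prefix: str = "metric_") -> dict[str, Any]:
    """Recursively flatten nested metric dicts into prefixed scalar columns."""
    flat: dict[str, Any] = {}
    for key, value in metrics.items():
        if isinstance(value, dict):
            # Recurse, joining nested keys with "_" so every leaf gets its own column.
            flat.update(flatten_metrics(value, prefix=f"{prefix}{key}_"))
        else:
            flat[f"{prefix}{key}"] = value
    return flat
```

With this shape, `{"outer": {"inner": {"deep": 1}}}` becomes a single `metric_outer_inner_deep` column instead of a stringified dict cell.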
217-219: Column order is alphabetical — consider prioritizing core columns.

Sorting fieldnames alphabetically places `case_id` before `scenario_name` and interleaves `metric_*` columns unpredictably. For spreadsheet usability, leading with fixed columns (`scenario_name`, `case_id`, `success`, `duration_ms`, `error`) followed by sorted metric columns is more intuitive.

Proposed fix:

```diff
-        fieldnames = sorted(
-            {key for row in rows for key in row.keys()}
-        )
+        base_fields = ["scenario_name", "case_id", "success", "duration_ms", "error"]
+        metric_fields = sorted(
+            {key for row in rows for key in row.keys()} - set(base_fields)
+        )
+        fieldnames = base_fields + metric_fields
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (3)
- pyproject.toml
- src/agentunit/reporting/results.py
- tests/test_reporting.py
💤 Files with no reviewable changes (1)
- tests/test_reporting.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/agentunit/reporting/results.py (3)
- src/agentunit/reporting/html.py (1): render_html_report (11-102)
- src/agentunit/core/runner.py (1): run (45-54)
- src/agentunit/datasets/base.py (1): name (38-39)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test (Python 3.12)
- GitHub Check: Test (Python 3.10)
🔇 Additional comments (4)
pyproject.toml (1)
39-41: LGTM — extras group aligns with test markers.

The new `integration-tests` extras entry correctly groups `langgraph` for optional testing scenarios, matching the pytest marker defined at line 70.
133-177: LGTM — JUnit structure is now a single testsuite element.

The refactored XML uses a single `<testsuite>` with nested `<testcase>` elements, which is a more standard JUnit format. Time reporting and failure message handling are implemented correctly.
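For reference, the single-testsuite shape being described can be sketched with the standard library. Element and attribute values here are schematic, not the PR's exact output.

```python
import xml.etree.ElementTree as ET

# One <testsuite> root with nested <testcase> children; a failing case
# carries a <failure> child with a message attribute.
suite = ET.Element("testsuite", name="agentunit", tests="2", failures="1", time="0.512")
ET.SubElement(suite, "testcase", name="case-1", time="0.250")
failing = ET.SubElement(suite, "testcase", name="case-2", time="0.262")
ET.SubElement(failing, "failure", message="assertion failed")

xml = ET.tostring(suite, encoding="unicode")
```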
189-226: CSV export implementation looks solid overall.

The method correctly:

- Creates parent directories
- Flattens metrics for spreadsheet compatibility
- Uses `DictWriter` with proper encoding and newline handling

Minor improvements noted in separate comments regarding column ordering and empty-file behavior.
268-272: No changes needed — the type annotation is correct.

`ScenarioRun.metrics` is properly typed as `dict[str, float | None]`, which is consistent with the `.2f` format used in the markdown rendering. The `_flatten_metrics` function handles nested dicts, but it's used only in CSV export (line 210), not in the markdown rendering at lines 268-272. The markdown code correctly assumes float or None values.

Likely an incorrect or invalid review comment.
```python
        if not rows:
            return target
```
Inconsistent behavior: empty suite doesn't create file.

When there are no runs, this returns early without writing any file. Other export methods (`to_json`, `to_markdown`, `to_junit`) always create the output file even when empty. This inconsistency could surprise callers who expect the file to exist after a successful call.

Proposed fix: create an empty CSV with headers, or at minimum an empty file:

```diff
         if not rows:
+            target.touch()
             return target
```

Or, to maintain header consistency for empty results, define a fixed set of base fieldnames.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
        if not rows:
            target.touch()
            return target
```
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In src/agentunit/reporting/results.py:
- Line 206: The comprehension building fieldnames uses `row.keys` (a method object) instead of calling it; update the expression to call `row.keys()` so it reads `fieldnames = sorted({key for row in rows for key in row.keys()})`, avoiding a TypeError and correctly collecting keys from each row in rows.
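The failure mode is easy to reproduce in isolation (a hypothetical minimal repro, not the PR's actual code):

```python
rows = [{"case_id": "c1"}, {"case_id": "c2", "error": "timeout"}]

# Buggy: row.keys is a bound method object, not an iterable of keys,
# so iterating it raises TypeError.
try:
    sorted({key for row in rows for key in row.keys})
except TypeError:
    pass  # this branch is always taken

# Fixed: call the method so each row's keys are actually iterated.
fieldnames = sorted({key for row in rows for key in row.keys()})
```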
🧹 Nitpick comments (1)
src/agentunit/reporting/results.py (1)
203-204: Consider documenting the empty-suite behavior.

When there are no runs, the method returns without creating a file. This is reasonable since fieldnames are derived from row data, but callers might expect a file to exist. Consider documenting this in the docstring.
📝 Suggested docstring update
```diff
     def to_csv(self, path: str | Path) -> Path:
         """
         Export suite results to CSV. One row per scenario run.
+
+        Note: If there are no scenario runs, no file is created.
         """
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/agentunit/reporting/results.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/agentunit/reporting/results.py (1)
- src/agentunit/core/runner.py (1): run (45-54)
🔇 Additional comments (5)
src/agentunit/reporting/results.py (5)
5-5: LGTM!

The new imports (the `csv` module and the `Any` type) are appropriate for the CSV export functionality being added.

Also applies to: 10-10
82-92: Implementation handles single-level nesting only.

The helper correctly flattens one level of nested dictionaries. If metrics contain deeper nesting (e.g., `{"a": {"b": {"c": 1}}}`), inner dicts would be serialized as strings rather than flattened. This is likely sufficient for current use cases where `ScenarioRun.metrics` is typed as `dict[str, float | None]`.
116-119: LGTM!

Explicit `encoding="utf-8"` is good practice for cross-platform consistency.
131-166: LGTM!

The simplified JUnit output with a single `<testsuite>` root element is a valid format widely supported by CI tools.
241-261: LGTM!

The explicit type hints and formatting improvements enhance readability without changing behavior.

Summary
This PR adds a `to_csv()` method to `SuiteResult` in src/agentunit/reporting/results.py. It allows exporting all test results to a CSV file, flattening nested metrics into separate columns (e.g., `metric_ExactMatch`, `metric_Latency`) for easy analysis in spreadsheets.
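The column flattening can be illustrated with the standard library alone. The scenario and metric values below are made up; only the `metric_` prefix convention comes from this PR.

```python
import csv
import io

# One scenario run whose metrics are flattened one level into metric_* columns.
run = {"scenario_name": "demo", "case_id": "c1", "success": True}
metrics = {"ExactMatch": 1.0, "Latency": 0.42}
row = {**run, **{f"metric_{name}": value for name, value in metrics.items()}}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=sorted(row))
writer.writeheader()
writer.writerow(row)
```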
Changes

- `_flatten_metrics()` helper (already existing, improved if needed)
- `SuiteResult.to_csv(path: str | Path)` method

Testing
Verified manually using: