feat(metrics): loop metrics tracking and ralph-stats analytics (#21) by frankbria · Pull Request #238 · frankbria/ralph-claude-code

frankbria · 2026-04-02T02:12:37Z

Summary

Implements #21: [P3] Phase 3.7: Implement metrics and analytics

Add track_metrics() to ralph_loop.sh — appends a JSON Lines entry to .ralph/logs/metrics.jsonl after each loop with timestamp, loop number, duration (seconds), success flag, and cumulative API call count
Capture get_epoch_seconds() before/after execute_claude_code() in the main loop to compute duration
Add print_metrics_summary() called on graceful exit — logs aggregate stats (total_loops, successful, avg_duration, total_calls)
Create ralph-stats.sh standalone script — reads metrics.jsonl and outputs a JSON summary via jq
Update install.sh to install ralph-stats wrapper command and copy ralph-stats.sh to ~/.ralph/

Acceptance Criteria

Metrics logged in JSON Lines format
ralph-stats command works
Summary displayed on exit
All tests pass (633 total, 0 failures)
Documented in CLAUDE.md

Test Plan

4 new BATS unit tests in tests/unit/test_metrics_tracking.bats (TDD — written before implementation)
Updated integration install tests to include ralph-stats mock, executable check, and uninstall assertion
Full test suite: 633 tests, all passing

Implementation Notes

track_metrics() placed after increment_call_counter() (~line 529) following codebase conventions
Uses get_iso_timestamp() and get_epoch_seconds() from existing lib/date_utils.sh
ralph-stats.sh uses the same wrapper pattern as all other Ralph commands in install.sh
total_calls in summary = max(calls) from metrics file (cumulative counter at last loop)

Closes #21

Summary by CodeRabbit

New Features
- Added ralph-stats command providing metrics summaries of loop execution analytics, including total loops, success count, average duration, and total calls made
- Automatic metrics tracking for each loop execution
Tests
- Added comprehensive test suite validating metrics tracking and aggregation functionality

…mmand (#21) - Add track_metrics() to ralph_loop.sh: appends JSON Lines to .ralph/logs/metrics.jsonl with timestamp, loop number, duration, success, and API call count - Integrate timing in main loop: captures epoch seconds before/after execute_claude_code - Add print_metrics_summary() called on graceful exit to log aggregate stats - Create ralph-stats.sh: standalone analytics command using jq to compute total_loops, successful, avg_duration, and total_calls from metrics.jsonl - Update install.sh to install ralph-stats wrapper and copy ralph-stats.sh - Add 4 BATS unit tests covering all acceptance criteria - Update integration install tests with ralph-stats mock and assertions

coderabbitai · 2026-04-02T02:12:50Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7ec9e134-4b1f-4a88-aa76-a2fe445c27c6

📥 Commits

Reviewing files that changed from the base of the PR and between 7b927f8 and d812a60.

📒 Files selected for processing (1)

ralph_loop.sh

Walkthrough

This PR implements metrics tracking and analytics for Ralph loop execution. It introduces track_metrics() to log per-loop execution data to metrics.jsonl, adds ralph-stats command to aggregate and display analytics, installs the new command via the installer, and includes comprehensive test coverage.

Changes

Cohort / File(s)	Summary
Metrics Core Logic `ralph_loop.sh`, `ralph-stats.sh`	Added `track_metrics()` to record per-loop metrics (loop number, duration, success, calls) to JSONL file; added `print_metrics_summary()` to aggregate metrics via `jq` and log summary on exit. New `ralph-stats.sh` script reads metrics file and outputs aggregated JSON with `total_loops`, `successful`, `avg_duration`, `total_calls`.
Installation and Documentation `install.sh`, `CLAUDE.md`	Updated installer to create `ralph-stats` wrapper in `~/.local/bin/` and copy `ralph-stats.sh` to `$RALPH_HOME/`. Updated documentation to list new command and test suite.
Integration Testing `tests/integration/test_installation.bats`	Extended installation tests to verify `ralph-stats` wrapper creation, executable permissions on both wrapper and `ralph-stats.sh`, and proper cleanup during uninstallation.
Unit Testing `tests/unit/test_metrics_tracking.bats`	Added comprehensive test suite (4 tests) validating metrics tracking: JSON Lines format, append semantics, `ralph-stats` aggregation output, and `print_metrics_summary` behavior with real metrics data.

Sequence Diagram

sequenceDiagram
    actor User
    participant RalphLoop as Ralph Loop<br/>(ralph_loop.sh)
    participant MetricsFile as metrics.jsonl
    participant jq as jq
    participant StatsCmd as ralph-stats<br/>Command

    User->>RalphLoop: Start Ralph execution
    loop Each loop iteration
        RalphLoop->>RalphLoop: Execute Claude code
        RalphLoop->>RalphLoop: Measure duration & success
        RalphLoop->>RalphLoop: track_metrics(...)
        RalphLoop->>MetricsFile: Append JSONL line
    end
    
    RalphLoop->>RalphLoop: On exit, print_metrics_summary()
    RalphLoop->>MetricsFile: Read all records
    RalphLoop->>jq: jq -s (aggregate)
    jq-->>RalphLoop: Aggregated JSON
    RalphLoop->>User: Output summary

    User->>StatsCmd: Run ralph-stats
    StatsCmd->>MetricsFile: Read metrics.jsonl
    StatsCmd->>jq: jq -s (aggregate)
    jq-->>StatsCmd: Aggregated JSON
    StatsCmd-->>User: Display analytics

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

fix: progress detection improvements (#141, #144) #158: Modifies ralph_loop.sh's loop execution path for execute_claude_code behavior, directly intersects with metrics tracking additions in the same execution flow.

Poem

🐰 Numbers hop through loops so fast,
Each metric caught, forever cast,
In JSON Lines they dance and stay,
Ralph-stats counts them every day!
Loop by loop, the data grows,
Analytics blooms where metrics flows. 📊

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and concisely describes the main change: adding loop metrics tracking and a ralph-stats analytics command to the Ralph codebase.
Linked Issues check	✅ Passed	The PR fully addresses all coding requirements from issue `#21`: track_metrics() and print_metrics_summary() functions added to ralph_loop.sh with proper JSON Lines logging, ralph-stats.sh command created for metrics analysis, install.sh updated to install the command, and comprehensive test coverage added via test_metrics_tracking.bats.
Out of Scope Changes check	✅ Passed	All code changes are directly aligned with issue `#21` requirements: metrics tracking functions, ralph-stats implementation, installation updates, and related tests. CLAUDE.md documentation updates are supporting documentation within scope.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feature/metrics-tracking

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-04-02T02:14:53Z

Review: feat(metrics): loop metrics tracking and ralph-stats analytics

This is a clean, focused implementation. The code follows existing patterns well and the TDD approach with 4 new tests and updated integration tests is solid. A few observations:

Minor Issues

No. 1 — Semantic issue: total_calls uses max instead of last

In both ralph-stats.sh and print_metrics_summary():

total_calls: (map(.calls) | max // 0)

The PR description notes this is intentional — "total_calls = max(calls) from metrics file (cumulative counter at last loop)". Since calls is a monotonically increasing cumulative counter within a run, max and last should agree. However, max is semantically fragile: if the counter were ever reset mid-run (e.g., on hourly rollover), max would give a misleading result. Using last is safer:

total_calls: (.[length-1].calls // 0)

No. 2 — local declaration masks subshell errors

In ralph_loop.sh:

local loop_duration=$(( $(get_epoch_seconds) - loop_start_epoch ))

Combining local and command substitution on one line means if get_epoch_seconds fails, the local declaration still exits 0, hiding the error. This pattern appears elsewhere in the codebase so it is not a blocker, but worth noting for consistency with stricter parts of the code that split the declaration:

local loop_duration
loop_duration=$(( $(get_epoch_seconds) - loop_start_epoch ))

Test Coverage Gap

The 4 tests cover the happy path well. Consider adding edge cases in a follow-on:

Empty metrics file (0 bytes) — jq -s on an empty file returns [], so max on an empty array returns null, which the // 0 handles. But a test would confirm this.
Malformed JSON line in metrics.jsonl — jq -s will error out; the current code has no recovery path.

Positive Notes

track_metrics correctly uses %s for the success field, writing a bare JSON boolean (true/false), not a quoted string. This is correct.
The mkdir -p "$LOG_DIR" inside track_metrics is a good defensive guard.
Install/uninstall symmetry in install.sh is complete — ralph-stats is added to both the install and the uninstall rm -f line.
Error outputs in ralph-stats.sh correctly go to stderr.
The placement of print_metrics_summary on the graceful-exit path (inside the break block) means it only fires once, which is correct.

Overall this is a well-structured, minimal implementation that fits the codebase conventions. The semantic max vs last point is the only item worth revisiting before merge.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (3)

ralph_loop.sh (1)

2071-2071: Address SC2155: Declare and assign separately.

The static analysis correctly flags that declaring and assigning in one statement masks the return value of cat. While the fallback || echo "0" mitigates risk here, separating declaration and assignment follows best practices.

🔧 Separate declaration from assignment

-        local calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")
+        local calls_made
+        calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` at line 2071, The line combining declaration and command
substitution should be split to satisfy SC2155: first declare the local variable
(local calls_made) and then assign it on the next line using the command
substitution that reads CALL_COUNT_FILE with the fallback; update the code
around the existing variable and CALL_COUNT_FILE usage (variable name calls_made
and CALL_COUNT_FILE) so the assignment is done separately and still preserves
the "0" fallback and redirects stderr to /dev/null.

tests/unit/test_metrics_tracking.bats (1)

94-107: print_metrics_summary test could be more specific.

The test checks for presence of "total_loops" and "2" in output, but these are substring matches that could pass for unrelated reasons. Consider checking specific JSON field values.

💡 More specific assertions

     run print_metrics_summary
 
     assert_success
-    [[ "$output" == *"total_loops"* ]]
-    [[ "$output" == *"2"* ]]
+    # Output contains JSON summary - extract and validate
+    [[ "$output" == *'"total_loops": 2'* ]] || [[ "$output" == *'"total_loops":2'* ]]

Or parse the JSON from output for precise validation.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_metrics_tracking.bats` around lines 94 - 107, The test uses
loose substring checks for print_metrics_summary output; instead parse the
command output as JSON and assert exact field values—after running
print_metrics_summary (function print_metrics_summary) load "$output" and use a
JSON parser (e.g. jq) to assert fields like .total_loops == 2 and other expected
fields (e.g. .total_calls, .avg_duration) rather than checking for the
substrings "total_loops" and "2"; update the assertions to read from the
generated "$LOG_DIR/metrics.jsonl" fixture or the captured "$output" and perform
exact equality checks on the JSON fields.

tests/integration/test_installation.bats (1)

557-606: Consider adding ralph-stats to the end-to-end workflow test.

The "complete installation workflow end-to-end" test verifies most commands but doesn't explicitly check for ralph-stats. While other tests cover it, adding assertions here would provide more comprehensive E2E coverage.

📝 Optional: Add ralph-stats verification

     # Verify all commands installed
     assert_file_exists "$TEST_INSTALL_DIR/ralph"
     assert_file_exists "$TEST_INSTALL_DIR/ralph-monitor"
     assert_file_exists "$TEST_INSTALL_DIR/ralph-setup"
     assert_file_exists "$TEST_INSTALL_DIR/ralph-import"
     assert_file_exists "$TEST_INSTALL_DIR/ralph-migrate"
+    assert_file_exists "$TEST_INSTALL_DIR/ralph-stats"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/integration/test_installation.bats` around lines 557 - 606, Add checks
for the ralph-stats binary to the "complete installation workflow end-to-end"
test: in the "Verify all commands installed" block add an assertion to check the
file exists (referencing the TEST_INSTALL_DIR/ralph-stats), and in the "Verify
all permissions correct" block add an executable permission test for
TEST_INSTALL_DIR/ralph-stats; update the test named "complete installation
workflow end-to-end" so these two assertions run alongside the other command
existence and permission checks.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/test_metrics_tracking.bats`:
- Around line 60-74: The per-line JSON validation loop in the "track_metrics:
appends one entry per loop iteration" test silently ignores jq failures; update
the while-read loop that iterates over "$LOG_DIR/metrics.jsonl" (used after
calling track_metrics) so each line is validated with jq in a way that
propagates failures (e.g., use jq -e . or check jq's exit status) and cause the
test to fail if any line is invalid JSON; ensure the loop exits non-zero on the
first jq error so the test correctly fails when track_metrics produces invalid
JSON.
- Around line 42-58: The jq validation in the test "track_metrics: writes valid
JSON Lines to metrics.jsonl" silently ignores parse errors because its output is
redirected to /dev/null and stderr is merged (the jq exit code can be lost);
update the test in tests/unit/test_metrics_tracking.bats so the jq invocation
for JSON parsing (used after reading "$LOG_DIR/metrics.jsonl") does not redirect
stderr/stdout to /dev/null, allowing jq to return a non-zero exit and cause the
BATS test to fail on invalid JSON—locate the jq call in that test and remove the
redirection so failures are propagated.

---

Nitpick comments:
In `@ralph_loop.sh`:
- Line 2071: The line combining declaration and command substitution should be
split to satisfy SC2155: first declare the local variable (local calls_made) and
then assign it on the next line using the command substitution that reads
CALL_COUNT_FILE with the fallback; update the code around the existing variable
and CALL_COUNT_FILE usage (variable name calls_made and CALL_COUNT_FILE) so the
assignment is done separately and still preserves the "0" fallback and redirects
stderr to /dev/null.

In `@tests/integration/test_installation.bats`:
- Around line 557-606: Add checks for the ralph-stats binary to the "complete
installation workflow end-to-end" test: in the "Verify all commands installed"
block add an assertion to check the file exists (referencing the
TEST_INSTALL_DIR/ralph-stats), and in the "Verify all permissions correct" block
add an executable permission test for TEST_INSTALL_DIR/ralph-stats; update the
test named "complete installation workflow end-to-end" so these two assertions
run alongside the other command existence and permission checks.

In `@tests/unit/test_metrics_tracking.bats`:
- Around line 94-107: The test uses loose substring checks for
print_metrics_summary output; instead parse the command output as JSON and
assert exact field values—after running print_metrics_summary (function
print_metrics_summary) load "$output" and use a JSON parser (e.g. jq) to assert
fields like .total_loops == 2 and other expected fields (e.g. .total_calls,
.avg_duration) rather than checking for the substrings "total_loops" and "2";
update the assertions to read from the generated "$LOG_DIR/metrics.jsonl"
fixture or the captured "$output" and perform exact equality checks on the JSON
fields.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6d998b11-3098-497c-af69-beaf6506314d

📥 Commits

Reviewing files that changed from the base of the PR and between 9f5cc34 and 5432344.

📒 Files selected for processing (6)

CLAUDE.md
install.sh
ralph-stats.sh
ralph_loop.sh
tests/integration/test_installation.bats
tests/unit/test_metrics_tracking.bats

- Remove 2>&1 on jq validation so failures surface in tests - Separate local declaration from assignment for SC2155 in ralph_loop.sh

github-actions · 2026-04-02T02:25:58Z

Review: fix(metrics): address CodeRabbit review feedback

The implementation is clean and follows existing conventions well. The TDD approach with 4 unit tests and updated integration tests is solid. Good to see the CodeRabbit feedback being worked through iteratively. A few items still remaining.

Still needs attention

No. 1 - Weak assertion in print_metrics_summary test (CodeRabbit raised this)

In tests/unit/test_metrics_tracking.bats lines 103-104, the substring match on "2" is too loose - it could match a timestamp like "2025-01-01". Since print_metrics_summary outputs structured JSON, extract and assert the specific field value using jq or a tighter match pattern (e.g., checking for "total_loops":2 specifically).

No. 2 - SC2155: combined local declaration and command substitution (CodeRabbit raised this)

In the metrics block added to ralph_loop.sh, the line local calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo 0) masks the command exit code. Split into a separate declaration and assignment, consistent with how other SC2155 instances have been handled in this codebase.

New concern

No. 3 - No rotation for metrics.jsonl

ralph.log has rotation via rotate_logs() in lib/log_utils.sh (10 MB threshold, 4 archives). metrics.jsonl will grow unbounded. After thousands of loops on a long-running project, jq -s in print_metrics_summary and ralph-stats.sh slurps the entire file into memory on each invocation. Consider either adding rotation for metrics.jsonl in rotate_logs(), or capping summarization to the last N entries (e.g., tail -1000 before piping to jq).

Acknowledged from prior review

No. 4 - total_calls uses max (raised previously)

The PR description documents this: "total_calls in summary = max(calls) from metrics file (cumulative counter at last loop)." Since the call counter is monotonically increasing within a run, max is equivalent to last. Acceptable as-is, but a short inline comment in ralph-stats.sh would clarify intent for future readers.

Overall this is a well-structured feature. Resolving items 1 and 2 (both already flagged by CodeRabbit) would close out the feedback loop before merge.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ralph_loop.sh`:
- Around line 2074-2087: The code reads CALL_COUNT_FILE (hourly-reset) when
calling track_metrics, causing total_calls to drop; change to use a persistent
cumulative counter (e.g. TOTAL_CALLS_FILE) that is not rotated hourly: after
execute_claude_code and determining exec_result, increment the persistent
counter when a call occurs (or on success/failure as desired), persist it to
TOTAL_CALLS_FILE, then read that persistent value and pass it into track_metrics
instead of reading CALL_COUNT_FILE; update references around
execute_claude_code, exec_result, and track_metrics to use and maintain
TOTAL_CALLS_FILE.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 895abb1c-513e-40e1-979a-eec287c13692

📥 Commits

Reviewing files that changed from the base of the PR and between 5432344 and d015838.

📒 Files selected for processing (2)

ralph_loop.sh
tests/unit/test_metrics_tracking.bats

✅ Files skipped from review due to trivial changes (1)

tests/unit/test_metrics_tracking.bats

coderabbitai · 2026-04-02T02:27:40Z

+        # Capture loop start time for duration tracking (Issue #21)
+        local loop_start_epoch
+        loop_start_epoch=$(get_epoch_seconds)
+
        # Execute Claude Code
        execute_claude_code "$loop_count"
        local exec_result=$?
-
+
+        # Record metrics for this loop (Issue #21)
+        local loop_duration
+        loop_duration=$(( $(get_epoch_seconds) - loop_start_epoch ))
+        local loop_success="false"
+        [ $exec_result -eq 0 ] && loop_success="true"
+        track_metrics "$loop_count" "$loop_duration" "$loop_success" "$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo 0)"


⚠️ Potential issue | 🟠 Major

calls metric is not cumulative across hourly resets.

At Line 2087, the value passed to track_metrics is read from .call_count, which is reset every hour. That breaks the cumulative-call expectation and can underreport total_calls analytics.

💡 Proposed fix

@@ - while true; do + local total_calls_lifetime=0 + while true; do @@ - local calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0") + local calls_made + calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0") update_status "$loop_count" "$calls_made" "executing" "running" @@ - execute_claude_code "$loop_count" + local calls_before_exec="$calls_made" + execute_claude_code "$loop_count" local exec_result=$? @@ - track_metrics "$loop_count" "$loop_duration" "$loop_success" "$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo 0)" + local calls_after_exec + calls_after_exec=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0") + local calls_delta=0 + if [[ "$calls_after_exec" -ge "$calls_before_exec" ]]; then + calls_delta=$((calls_after_exec - calls_before_exec)) + else + # hourly reset occurred + calls_delta=$calls_after_exec + fi + total_calls_lifetime=$((total_calls_lifetime + calls_delta)) + track_metrics "$loop_count" "$loop_duration" "$loop_success" "$total_calls_lifetime"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@ralph_loop.sh` around lines 2074 - 2087, The code reads CALL_COUNT_FILE (hourly-reset) when calling track_metrics, causing total_calls to drop; change to use a persistent cumulative counter (e.g. TOTAL_CALLS_FILE) that is not rotated hourly: after execute_claude_code and determining exec_result, increment the persistent counter when a call occurs (or on success/failure as desired), persist it to TOTAL_CALLS_FILE, then read that persistent value and pass it into track_metrics instead of reading CALL_COUNT_FILE; update references around execute_claude_code, exec_result, and track_metrics to use and maintain TOTAL_CALLS_FILE.

… hourly resets - Track calls_before_exec and compute calls_this_loop as delta instead of reading cumulative CALL_COUNT_FILE (which resets hourly) - Update ralph-stats to sum .calls instead of max for true lifetime total - Update test assertion: total_calls = sum(2+5+7) = 14

github-actions · 2026-04-02T02:33:40Z

Review: fix(metrics): per-loop call delta for total_calls

Good progress on this PR. The per-loop call delta approach (using calls_after_exec - calls_before_exec) correctly solves the hourly reset problem. A few items remain.

Bug: `print_metrics_summary` and `ralph-stats.sh` are now inconsistent

This is the main issue introduced in the latest commit. ralph-stats.sh was updated to use add:

total_calls: (map(.calls) | add // 0)

But print_metrics_summary in ralph_loop.sh still uses max:

total_calls: (map(.calls) | max // 0)

Since calls is now a per-loop delta (not a cumulative counter), max returns the highest single-loop call count rather than the total. For the 3-entry fixture [2, 5, 7], ralph-stats correctly returns 14, but print_metrics_summary would return 7. The two summaries should agree — print_metrics_summary needs the same add fix.

Still open from prior reviews

No. 1 — Weak assertion (raised in both prior reviews)

In test_metrics_tracking.bats lines 103-104:

[[ "$output" == *"total_loops"* ]]
[[ "$output" == *"2"* ]]

The "2" match is still loose — it would match a timestamp like "2025-01-01". The tests for ralph-stats in the same file use jq -r for precise field assertions; print_metrics_summary should follow the same pattern:

[[ "$(echo "$output" | jq -r '.total_loops')" == "2" ]]

No. 2 — No rotation for metrics.jsonl (raised in prior review)

ralph.log has rotate_logs() gating at 10 MB. metrics.jsonl grows unbounded. Long-running projects will accumulate thousands of entries; jq -s slurps the entire file into memory on every print_metrics_summary call (inside the main loop exit path) and every ralph-stats invocation. A follow-on issue to add rotation (or cap jq -s to the last N entries) would be appropriate if this isn't addressed here.

Items addressed — thank you

SC2155: calls_made is now split into declaration + assignment in the main loop.
ralph-stats.sh correctly uses add for per-loop delta semantics.
Test assertions for the ralph-stats test case use jq -r field extraction — precise and correct.
Install/uninstall symmetry for ralph-stats is complete.

The print_metrics_summary inconsistency (still using max) is the blocker before merge.

coderabbitai

♻️ Duplicate comments (1)

ralph_loop.sh (1)

553-558: ⚠️ Potential issue | 🟠 Major

total_calls aggregation is incorrect (uses max instead of sum).

Line 557 currently returns only the largest per-loop value, which underreports aggregate calls when metrics store per-loop deltas.

💡 Suggested fix

 summary=$(jq -s '{
     total_loops: length,
     successful: (map(select(.success==true)) | length),
     avg_duration: (if length > 0 then (map(.duration) | add) / length else 0 end),
-    total_calls: (map(.calls) | max // 0)
+    total_calls: (map(.calls // 0) | add // 0)
 }' "$metrics_file" 2>/dev/null)

#!/bin/bash
set -euo pipefail

tmp_file="$(mktemp)"
cat > "$tmp_file" <<'EOF'
{"calls":2}
{"calls":5}
{"calls":7}
EOF

echo "Current logic (max):"
jq -s '{total_calls:(map(.calls) | max // 0)}' "$tmp_file"

echo "Expected logic (sum):"
jq -s '{total_calls:(map(.calls // 0) | add // 0)}' "$tmp_file"

rm -f "$tmp_file"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 553 - 558, The aggregation for total_calls in the
jq -s block is using max which undercounts; change the expression for
total_calls to sum per-loop call values and handle missing values and empty
arrays (e.g. replace (map(.calls) | max // 0) with (map(.calls // 0) | add //
0)) so the summary variable correctly totals calls from metrics_file.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@ralph_loop.sh`:
- Around line 553-558: The aggregation for total_calls in the jq -s block is
using max which undercounts; change the expression for total_calls to sum
per-loop call values and handle missing values and empty arrays (e.g. replace
(map(.calls) | max // 0) with (map(.calls // 0) | add // 0)) so the summary
variable correctly totals calls from metrics_file.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 27b06437-c812-4d47-bc6f-c91dd5233614

📥 Commits

Reviewing files that changed from the base of the PR and between d015838 and 7b927f8.

📒 Files selected for processing (3)

ralph-stats.sh
ralph_loop.sh
tests/unit/test_metrics_tracking.bats

✅ Files skipped from review due to trivial changes (1)

ralph-stats.sh

🚧 Files skipped from review as they are similar to previous changes (1)

tests/unit/test_metrics_tracking.bats

…_summary

github-actions · 2026-04-02T02:40:50Z

PR Review: feat(metrics): loop metrics tracking and ralph-stats analytics

This is a clean, well-scoped implementation. The architecture follows existing patterns, the tests cover the main behaviors, and the installation wiring is consistent with how other commands are set up. A few things worth addressing:

Bug: Off-by-one in `calls_this_loop` delta (ralph_loop.sh)

The arithmetic condition for the per-loop call delta uses strict greater-than:

local calls_this_loop=$(( calls_after_exec > calls_before_exec ? calls_after_exec - calls_before_exec : calls_after_exec ))

When calls_after_exec == calls_before_exec (i.e., no API calls were made — timed out, skipped, etc.), the condition is false and the fallback returns the full counter value rather than 0. The fix is straightforward:

local calls_this_loop=$(( calls_after_exec >= calls_before_exec ? calls_after_exec - calls_before_exec : calls_after_exec ))

PR description inconsistency (minor)

The Implementation Notes section still says:

total_calls in summary = max(calls) from metrics file (cumulative counter at last loop)

But the most recent commit switched this to add (sum of per-loop deltas), which is correct given that calls_this_loop is now a delta rather than a cumulative counter. The description should be updated to reflect the actual behavior. The tests confirm the add behavior is right (2+5+7=14), so this is a docs-only fix.

Code duplication: jq query appears twice

The identical jq summary query is in both print_metrics_summary (ralph_loop.sh) and ralph-stats.sh. When this logic needs to change in the future (e.g., adding a new metric field), it will need to be updated in two places. This is not a blocker given the current scope, just worth noting for future maintenance.

Weak assertion in `print_metrics_summary` test

The last test only checks for the string "total_loops" and that "2" appears somewhere in the output:

[[ "$output" == *"total_loops"* ]]
[[ "$output" == *"2"* ]]

The other three tests validate specific field values via jq. Aligning this test to do the same (checking successful, avg_duration, total_calls) would make it consistent and catch regressions more precisely.

Summary

The overall implementation is solid — the track_metrics() function, ralph-stats.sh, install wiring, and integration tests all look correct. The off-by-one in the delta calculation is the only functional concern; the rest are polish items.

coderabbitai Bot reviewed Apr 2, 2026

View reviewed changes

Comment thread tests/unit/test_metrics_tracking.bats

Comment thread tests/unit/test_metrics_tracking.bats

fix(metrics): address CodeRabbit review feedback (#238)

d015838

- Remove 2>&1 on jq validation so failures surface in tests - Separate local declaration from assignment for SC2155 in ralph_loop.sh

coderabbitai Bot reviewed Apr 2, 2026

View reviewed changes

fix(metrics): use add instead of max for total_calls in print_metrics…

d812a60

…_summary

frankbria merged commit dc89f16 into main Apr 2, 2026
5 checks passed

frankbria deleted the feature/metrics-tracking branch April 2, 2026 02:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(metrics): loop metrics tracking and ralph-stats analytics (#21)#238

feat(metrics): loop metrics tracking and ralph-stats analytics (#21)#238
frankbria merged 4 commits into
mainfrom
feature/metrics-tracking

frankbria commented Apr 2, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 2, 2026 •

edited

Loading

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented Apr 2, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented Apr 2, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Apr 2, 2026

Uh oh!

github-actions Bot commented Apr 2, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

github-actions Bot commented Apr 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

frankbria commented Apr 2, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Acceptance Criteria

Test Plan

Implementation Notes

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented Apr 2, 2026

Review: feat(metrics): loop metrics tracking and ralph-stats analytics

Minor Issues

Test Coverage Gap

Positive Notes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented Apr 2, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Apr 2, 2026

Review: fix(metrics): per-loop call delta for total_calls

Bug: print_metrics_summary and ralph-stats.sh are now inconsistent

Still open from prior reviews

Items addressed — thank you

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Apr 2, 2026

PR Review: feat(metrics): loop metrics tracking and ralph-stats analytics

Bug: Off-by-one in calls_this_loop delta (ralph_loop.sh)

PR description inconsistency (minor)

Code duplication: jq query appears twice

Weak assertion in print_metrics_summary test

Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

frankbria commented Apr 2, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 2, 2026 •

edited

Loading

Bug: `print_metrics_summary` and `ralph-stats.sh` are now inconsistent

Bug: Off-by-one in `calls_this_loop` delta (ralph_loop.sh)

Weak assertion in `print_metrics_summary` test