Skip to content

feat(metrics): loop metrics tracking and ralph-stats analytics (#21)#238

Merged
frankbria merged 4 commits into
mainfrom
feature/metrics-tracking
Apr 2, 2026
Merged

feat(metrics): loop metrics tracking and ralph-stats analytics (#21)#238
frankbria merged 4 commits into
mainfrom
feature/metrics-tracking

Conversation

@frankbria
Copy link
Copy Markdown
Owner

@frankbria frankbria commented Apr 2, 2026

Summary

Implements #21: [P3] Phase 3.7: Implement metrics and analytics

  • Add track_metrics() to ralph_loop.sh — appends a JSON Lines entry to .ralph/logs/metrics.jsonl after each loop with timestamp, loop number, duration (seconds), success flag, and cumulative API call count
  • Capture get_epoch_seconds() before/after execute_claude_code() in the main loop to compute duration
  • Add print_metrics_summary() called on graceful exit — logs aggregate stats (total_loops, successful, avg_duration, total_calls)
  • Create ralph-stats.sh standalone script — reads metrics.jsonl and outputs a JSON summary via jq
  • Update install.sh to install ralph-stats wrapper command and copy ralph-stats.sh to ~/.ralph/

Acceptance Criteria

  • Metrics logged in JSON Lines format
  • ralph-stats command works
  • Summary displayed on exit
  • All tests pass (633 total, 0 failures)
  • Documented in CLAUDE.md

Test Plan

  • 4 new BATS unit tests in tests/unit/test_metrics_tracking.bats (TDD — written before implementation)
  • Updated integration install tests to include ralph-stats mock, executable check, and uninstall assertion
  • Full test suite: 633 tests, all passing

Implementation Notes

  • track_metrics() placed after increment_call_counter() (~line 529) following codebase conventions
  • Uses get_iso_timestamp() and get_epoch_seconds() from existing lib/date_utils.sh
  • ralph-stats.sh uses the same wrapper pattern as all other Ralph commands in install.sh
  • total_calls in summary = max(calls) from metrics file (cumulative counter at last loop)

Closes #21

Summary by CodeRabbit

  • New Features

    • Added ralph-stats command providing metrics summaries of loop execution analytics, including total loops, success count, average duration, and total calls made
    • Automatic metrics tracking for each loop execution
  • Tests

    • Added comprehensive test suite validating metrics tracking and aggregation functionality

…mmand (#21)

- Add track_metrics() to ralph_loop.sh: appends JSON Lines to .ralph/logs/metrics.jsonl
  with timestamp, loop number, duration, success, and API call count
- Integrate timing in main loop: captures epoch seconds before/after execute_claude_code
- Add print_metrics_summary() called on graceful exit to log aggregate stats
- Create ralph-stats.sh: standalone analytics command using jq to compute
  total_loops, successful, avg_duration, and total_calls from metrics.jsonl
- Update install.sh to install ralph-stats wrapper and copy ralph-stats.sh
- Add 4 BATS unit tests covering all acceptance criteria
- Update integration install tests with ralph-stats mock and assertions
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7ec9e134-4b1f-4a88-aa76-a2fe445c27c6

📥 Commits

Reviewing files that changed from the base of the PR and between 7b927f8 and d812a60.

📒 Files selected for processing (1)
  • ralph_loop.sh

Walkthrough

This PR implements metrics tracking and analytics for Ralph loop execution. It introduces track_metrics() to log per-loop execution data to metrics.jsonl, adds ralph-stats command to aggregate and display analytics, installs the new command via the installer, and includes comprehensive test coverage.

Changes

Cohort / File(s) Summary
Metrics Core Logic
ralph_loop.sh, ralph-stats.sh
Added track_metrics() to record per-loop metrics (loop number, duration, success, calls) to JSONL file; added print_metrics_summary() to aggregate metrics via jq and log summary on exit. New ralph-stats.sh script reads metrics file and outputs aggregated JSON with total_loops, successful, avg_duration, total_calls.
Installation and Documentation
install.sh, CLAUDE.md
Updated installer to create ralph-stats wrapper in ~/.local/bin/ and copy ralph-stats.sh to $RALPH_HOME/. Updated documentation to list new command and test suite.
Integration Testing
tests/integration/test_installation.bats
Extended installation tests to verify ralph-stats wrapper creation, executable permissions on both wrapper and ralph-stats.sh, and proper cleanup during uninstallation.
Unit Testing
tests/unit/test_metrics_tracking.bats
Added comprehensive test suite (4 tests) validating metrics tracking: JSON Lines format, append semantics, ralph-stats aggregation output, and print_metrics_summary behavior with real metrics data.

Sequence Diagram

sequenceDiagram
    actor User
    participant RalphLoop as Ralph Loop<br/>(ralph_loop.sh)
    participant MetricsFile as metrics.jsonl
    participant jq as jq
    participant StatsCmd as ralph-stats<br/>Command

    User->>RalphLoop: Start Ralph execution
    loop Each loop iteration
        RalphLoop->>RalphLoop: Execute Claude code
        RalphLoop->>RalphLoop: Measure duration & success
        RalphLoop->>RalphLoop: track_metrics(...)
        RalphLoop->>MetricsFile: Append JSONL line
    end
    
    RalphLoop->>RalphLoop: On exit, print_metrics_summary()
    RalphLoop->>MetricsFile: Read all records
    RalphLoop->>jq: jq -s (aggregate)
    jq-->>RalphLoop: Aggregated JSON
    RalphLoop->>User: Output summary

    User->>StatsCmd: Run ralph-stats
    StatsCmd->>MetricsFile: Read metrics.jsonl
    StatsCmd->>jq: jq -s (aggregate)
    jq-->>StatsCmd: Aggregated JSON
    StatsCmd-->>User: Display analytics
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 Numbers hop through loops so fast,
Each metric caught, forever cast,
In JSON Lines they dance and stay,
Ralph-stats counts them every day!
Loop by loop, the data grows,
Analytics blooms where metrics flows. 📊

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely describes the main change: adding loop metrics tracking and a ralph-stats analytics command to the Ralph codebase.
Linked Issues check ✅ Passed The PR fully addresses all coding requirements from issue #21: track_metrics() and print_metrics_summary() functions added to ralph_loop.sh with proper JSON Lines logging, ralph-stats.sh command created for metrics analysis, install.sh updated to install the command, and comprehensive test coverage added via test_metrics_tracking.bats.
Out of Scope Changes check ✅ Passed All code changes are directly aligned with issue #21 requirements: metrics tracking functions, ralph-stats implementation, installation updates, and related tests. CLAUDE.md documentation updates are supporting documentation within scope.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/metrics-tracking

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 2, 2026

Review: feat(metrics): loop metrics tracking and ralph-stats analytics

This is a clean, focused implementation. The code follows existing patterns well and the TDD approach with 4 new tests and updated integration tests is solid. A few observations:

Minor Issues

No. 1 — Semantic issue: total_calls uses max instead of last

In both ralph-stats.sh and print_metrics_summary():

total_calls: (map(.calls) | max // 0)

The PR description notes this is intentional — "total_calls = max(calls) from metrics file (cumulative counter at last loop)". Since calls is a monotonically increasing cumulative counter within a run, max and last should agree. However, max is semantically fragile: if the counter were ever reset mid-run (e.g., on hourly rollover), max would give a misleading result. Using last is safer:

total_calls: (.[length-1].calls // 0)

No. 2 — local declaration masks subshell errors

In ralph_loop.sh:

local loop_duration=$(( $(get_epoch_seconds) - loop_start_epoch ))

Combining local and command substitution on one line means if get_epoch_seconds fails, the local declaration still exits 0, hiding the error. This pattern appears elsewhere in the codebase so it is not a blocker, but worth noting for consistency with stricter parts of the code that split the declaration:

local loop_duration
loop_duration=$(( $(get_epoch_seconds) - loop_start_epoch ))

Test Coverage Gap

The 4 tests cover the happy path well. Consider adding edge cases in a follow-on:

  • Empty metrics file (0 bytes) — jq -s on an empty file returns [], so max on an empty array returns null, which the // 0 handles. But a test would confirm this.
  • Malformed JSON line in metrics.jsonljq -s will error out; the current code has no recovery path.

Positive Notes

  • track_metrics correctly uses %s for the success field, writing a bare JSON boolean (true/false), not a quoted string. This is correct.
  • The mkdir -p "$LOG_DIR" inside track_metrics is a good defensive guard.
  • Install/uninstall symmetry in install.sh is complete — ralph-stats is added to both the install and the uninstall rm -f line.
  • Error outputs in ralph-stats.sh correctly go to stderr.
  • The placement of print_metrics_summary on the graceful-exit path (inside the break block) means it only fires once, which is correct.

Overall this is a well-structured, minimal implementation that fits the codebase conventions. The semantic max vs last point is the only item worth revisiting before merge.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
ralph_loop.sh (1)

2071-2071: Address SC2155: Declare and assign separately.

The static analysis correctly flags that declaring and assigning in one statement masks the return value of cat. While the fallback || echo "0" mitigates risk here, separating declaration and assignment follows best practices.

🔧 Separate declaration from assignment
-        local calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")
+        local calls_made
+        calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` at line 2071, The line combining declaration and command
substitution should be split to satisfy SC2155: first declare the local variable
(local calls_made) and then assign it on the next line using the command
substitution that reads CALL_COUNT_FILE with the fallback; update the code
around the existing variable and CALL_COUNT_FILE usage (variable name calls_made
and CALL_COUNT_FILE) so the assignment is done separately and still preserves
the "0" fallback and redirects stderr to /dev/null.
tests/unit/test_metrics_tracking.bats (1)

94-107: print_metrics_summary test could be more specific.

The test checks for presence of "total_loops" and "2" in output, but these are substring matches that could pass for unrelated reasons. Consider checking specific JSON field values.

💡 More specific assertions
     run print_metrics_summary
 
     assert_success
-    [[ "$output" == *"total_loops"* ]]
-    [[ "$output" == *"2"* ]]
+    # Output contains JSON summary - extract and validate
+    [[ "$output" == *'"total_loops": 2'* ]] || [[ "$output" == *'"total_loops":2'* ]]

Or parse the JSON from output for precise validation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_metrics_tracking.bats` around lines 94 - 107, The test uses
loose substring checks for print_metrics_summary output; instead parse the
command output as JSON and assert exact field values—after running
print_metrics_summary (function print_metrics_summary) load "$output" and use a
JSON parser (e.g. jq) to assert fields like .total_loops == 2 and other expected
fields (e.g. .total_calls, .avg_duration) rather than checking for the
substrings "total_loops" and "2"; update the assertions to read from the
generated "$LOG_DIR/metrics.jsonl" fixture or the captured "$output" and perform
exact equality checks on the JSON fields.
tests/integration/test_installation.bats (1)

557-606: Consider adding ralph-stats to the end-to-end workflow test.

The "complete installation workflow end-to-end" test verifies most commands but doesn't explicitly check for ralph-stats. While other tests cover it, adding assertions here would provide more comprehensive E2E coverage.

📝 Optional: Add ralph-stats verification
     # Verify all commands installed
     assert_file_exists "$TEST_INSTALL_DIR/ralph"
     assert_file_exists "$TEST_INSTALL_DIR/ralph-monitor"
     assert_file_exists "$TEST_INSTALL_DIR/ralph-setup"
     assert_file_exists "$TEST_INSTALL_DIR/ralph-import"
     assert_file_exists "$TEST_INSTALL_DIR/ralph-migrate"
+    assert_file_exists "$TEST_INSTALL_DIR/ralph-stats"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/test_installation.bats` around lines 557 - 606, Add checks
for the ralph-stats binary to the "complete installation workflow end-to-end"
test: in the "Verify all commands installed" block add an assertion to check the
file exists (referencing the TEST_INSTALL_DIR/ralph-stats), and in the "Verify
all permissions correct" block add an executable permission test for
TEST_INSTALL_DIR/ralph-stats; update the test named "complete installation
workflow end-to-end" so these two assertions run alongside the other command
existence and permission checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/test_metrics_tracking.bats`:
- Around line 60-74: The per-line JSON validation loop in the "track_metrics:
appends one entry per loop iteration" test silently ignores jq failures; update
the while-read loop that iterates over "$LOG_DIR/metrics.jsonl" (used after
calling track_metrics) so each line is validated with jq in a way that
propagates failures (e.g., use jq -e . or check jq's exit status) and cause the
test to fail if any line is invalid JSON; ensure the loop exits non-zero on the
first jq error so the test correctly fails when track_metrics produces invalid
JSON.
- Around line 42-58: The jq validation in the test "track_metrics: writes valid
JSON Lines to metrics.jsonl" silently ignores parse errors because its output is
redirected to /dev/null and stderr is merged (the jq exit code can be lost);
update the test in tests/unit/test_metrics_tracking.bats so the jq invocation
for JSON parsing (used after reading "$LOG_DIR/metrics.jsonl") does not redirect
stderr/stdout to /dev/null, allowing jq to return a non-zero exit and cause the
BATS test to fail on invalid JSON—locate the jq call in that test and remove the
redirection so failures are propagated.

---

Nitpick comments:
In `@ralph_loop.sh`:
- Line 2071: The line combining declaration and command substitution should be
split to satisfy SC2155: first declare the local variable (local calls_made) and
then assign it on the next line using the command substitution that reads
CALL_COUNT_FILE with the fallback; update the code around the existing variable
and CALL_COUNT_FILE usage (variable name calls_made and CALL_COUNT_FILE) so the
assignment is done separately and still preserves the "0" fallback and redirects
stderr to /dev/null.

In `@tests/integration/test_installation.bats`:
- Around line 557-606: Add checks for the ralph-stats binary to the "complete
installation workflow end-to-end" test: in the "Verify all commands installed"
block add an assertion to check the file exists (referencing the
TEST_INSTALL_DIR/ralph-stats), and in the "Verify all permissions correct" block
add an executable permission test for TEST_INSTALL_DIR/ralph-stats; update the
test named "complete installation workflow end-to-end" so these two assertions
run alongside the other command existence and permission checks.

In `@tests/unit/test_metrics_tracking.bats`:
- Around line 94-107: The test uses loose substring checks for
print_metrics_summary output; instead parse the command output as JSON and
assert exact field values—after running print_metrics_summary (function
print_metrics_summary) load "$output" and use a JSON parser (e.g. jq) to assert
fields like .total_loops == 2 and other expected fields (e.g. .total_calls,
.avg_duration) rather than checking for the substrings "total_loops" and "2";
update the assertions to read from the generated "$LOG_DIR/metrics.jsonl"
fixture or the captured "$output" and perform exact equality checks on the JSON
fields.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6d998b11-3098-497c-af69-beaf6506314d

📥 Commits

Reviewing files that changed from the base of the PR and between 9f5cc34 and 5432344.

📒 Files selected for processing (6)
  • CLAUDE.md
  • install.sh
  • ralph-stats.sh
  • ralph_loop.sh
  • tests/integration/test_installation.bats
  • tests/unit/test_metrics_tracking.bats

Comment thread tests/unit/test_metrics_tracking.bats
Comment thread tests/unit/test_metrics_tracking.bats
- Remove 2>&1 on jq validation so failures surface in tests
- Separate local declaration from assignment for SC2155 in ralph_loop.sh
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 2, 2026

Review: fix(metrics): address CodeRabbit review feedback

The implementation is clean and follows existing conventions well. The TDD approach with 4 unit tests and updated integration tests is solid. Good to see the CodeRabbit feedback being worked through iteratively. A few items still remaining.

Still needs attention

No. 1 - Weak assertion in print_metrics_summary test (CodeRabbit raised this)

In tests/unit/test_metrics_tracking.bats lines 103-104, the substring match on "2" is too loose - it could match a timestamp like "2025-01-01". Since print_metrics_summary outputs structured JSON, extract and assert the specific field value using jq or a tighter match pattern (e.g., checking for "total_loops":2 specifically).

No. 2 - SC2155: combined local declaration and command substitution (CodeRabbit raised this)

In the metrics block added to ralph_loop.sh, the line local calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo 0) masks the command exit code. Split into a separate declaration and assignment, consistent with how other SC2155 instances have been handled in this codebase.

New concern

No. 3 - No rotation for metrics.jsonl

ralph.log has rotation via rotate_logs() in lib/log_utils.sh (10 MB threshold, 4 archives). metrics.jsonl will grow unbounded. After thousands of loops on a long-running project, jq -s in print_metrics_summary and ralph-stats.sh slurps the entire file into memory on each invocation. Consider either adding rotation for metrics.jsonl in rotate_logs(), or capping summarization to the last N entries (e.g., tail -1000 before piping to jq).

Acknowledged from prior review

No. 4 - total_calls uses max (raised previously)

The PR description documents this: "total_calls in summary = max(calls) from metrics file (cumulative counter at last loop)." Since the call counter is monotonically increasing within a run, max is equivalent to last. Acceptable as-is, but a short inline comment in ralph-stats.sh would clarify intent for future readers.

Overall this is a well-structured feature. Resolving items 1 and 2 (both already flagged by CodeRabbit) would close out the feedback loop before merge.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ralph_loop.sh`:
- Around line 2074-2087: The code reads CALL_COUNT_FILE (hourly-reset) when
calling track_metrics, causing total_calls to drop; change to use a persistent
cumulative counter (e.g. TOTAL_CALLS_FILE) that is not rotated hourly: after
execute_claude_code and determining exec_result, increment the persistent
counter when a call occurs (or on success/failure as desired), persist it to
TOTAL_CALLS_FILE, then read that persistent value and pass it into track_metrics
instead of reading CALL_COUNT_FILE; update references around
execute_claude_code, exec_result, and track_metrics to use and maintain
TOTAL_CALLS_FILE.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 895abb1c-513e-40e1-979a-eec287c13692

📥 Commits

Reviewing files that changed from the base of the PR and between 5432344 and d015838.

📒 Files selected for processing (2)
  • ralph_loop.sh
  • tests/unit/test_metrics_tracking.bats
✅ Files skipped from review due to trivial changes (1)
  • tests/unit/test_metrics_tracking.bats

Comment thread ralph_loop.sh Outdated
Comment on lines +2074 to +2087
# Capture loop start time for duration tracking (Issue #21)
local loop_start_epoch
loop_start_epoch=$(get_epoch_seconds)

# Execute Claude Code
execute_claude_code "$loop_count"
local exec_result=$?


# Record metrics for this loop (Issue #21)
local loop_duration
loop_duration=$(( $(get_epoch_seconds) - loop_start_epoch ))
local loop_success="false"
[ $exec_result -eq 0 ] && loop_success="true"
track_metrics "$loop_count" "$loop_duration" "$loop_success" "$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo 0)"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

calls metric is not cumulative across hourly resets.

At Line 2087, the value passed to track_metrics is read from .call_count, which is reset every hour. That breaks the cumulative-call expectation and can underreport total_calls analytics.

💡 Proposed fix
@@
-    while true; do
+    local total_calls_lifetime=0
+    while true; do
@@
-        local calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")
+        local calls_made
+        calls_made=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")
         update_status "$loop_count" "$calls_made" "executing" "running"
@@
-        execute_claude_code "$loop_count"
+        local calls_before_exec="$calls_made"
+        execute_claude_code "$loop_count"
         local exec_result=$?
@@
-        track_metrics "$loop_count" "$loop_duration" "$loop_success" "$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo 0)"
+        local calls_after_exec
+        calls_after_exec=$(cat "$CALL_COUNT_FILE" 2>/dev/null || echo "0")
+        local calls_delta=0
+        if [[ "$calls_after_exec" -ge "$calls_before_exec" ]]; then
+            calls_delta=$((calls_after_exec - calls_before_exec))
+        else
+            # hourly reset occurred
+            calls_delta=$calls_after_exec
+        fi
+        total_calls_lifetime=$((total_calls_lifetime + calls_delta))
+        track_metrics "$loop_count" "$loop_duration" "$loop_success" "$total_calls_lifetime"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 2074 - 2087, The code reads CALL_COUNT_FILE
(hourly-reset) when calling track_metrics, causing total_calls to drop; change
to use a persistent cumulative counter (e.g. TOTAL_CALLS_FILE) that is not
rotated hourly: after execute_claude_code and determining exec_result, increment
the persistent counter when a call occurs (or on success/failure as desired),
persist it to TOTAL_CALLS_FILE, then read that persistent value and pass it into
track_metrics instead of reading CALL_COUNT_FILE; update references around
execute_claude_code, exec_result, and track_metrics to use and maintain
TOTAL_CALLS_FILE.

… hourly resets

- Track calls_before_exec and compute calls_this_loop as delta instead of
  reading cumulative CALL_COUNT_FILE (which resets hourly)
- Update ralph-stats to sum .calls instead of max for true lifetime total
- Update test assertion: total_calls = sum(2+5+7) = 14
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 2, 2026

Review: fix(metrics): per-loop call delta for total_calls

Good progress on this PR. The per-loop call delta approach (using calls_after_exec - calls_before_exec) correctly solves the hourly reset problem. A few items remain.


Bug: print_metrics_summary and ralph-stats.sh are now inconsistent

This is the main issue introduced in the latest commit. ralph-stats.sh was updated to use add:

total_calls: (map(.calls) | add // 0)

But print_metrics_summary in ralph_loop.sh still uses max:

total_calls: (map(.calls) | max // 0)

Since calls is now a per-loop delta (not a cumulative counter), max returns the highest single-loop call count rather than the total. For the 3-entry fixture [2, 5, 7], ralph-stats correctly returns 14, but print_metrics_summary would return 7. The two summaries should agree — print_metrics_summary needs the same add fix.


Still open from prior reviews

No. 1 — Weak assertion (raised in both prior reviews)

In test_metrics_tracking.bats lines 103-104:

[[ "$output" == *"total_loops"* ]]
[[ "$output" == *"2"* ]]

The "2" match is still loose — it would match a timestamp like "2025-01-01". The tests for ralph-stats in the same file use jq -r for precise field assertions; print_metrics_summary should follow the same pattern:

[[ "$(echo "$output" | jq -r '.total_loops')" == "2" ]]

No. 2 — No rotation for metrics.jsonl (raised in prior review)

ralph.log has rotate_logs() gating at 10 MB. metrics.jsonl grows unbounded. Long-running projects will accumulate thousands of entries; jq -s slurps the entire file into memory on every print_metrics_summary call (inside the main loop exit path) and every ralph-stats invocation. A follow-on issue to add rotation (or cap jq -s to the last N entries) would be appropriate if this isn't addressed here.


Items addressed — thank you

  • SC2155: calls_made is now split into declaration + assignment in the main loop.
  • ralph-stats.sh correctly uses add for per-loop delta semantics.
  • Test assertions for the ralph-stats test case use jq -r field extraction — precise and correct.
  • Install/uninstall symmetry for ralph-stats is complete.

The print_metrics_summary inconsistency (still using max) is the blocker before merge.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
ralph_loop.sh (1)

553-558: ⚠️ Potential issue | 🟠 Major

total_calls aggregation is incorrect (uses max instead of sum).

Line 557 currently returns only the largest per-loop value, which underreports aggregate calls when metrics store per-loop deltas.

💡 Suggested fix
 summary=$(jq -s '{
     total_loops: length,
     successful: (map(select(.success==true)) | length),
     avg_duration: (if length > 0 then (map(.duration) | add) / length else 0 end),
-    total_calls: (map(.calls) | max // 0)
+    total_calls: (map(.calls // 0) | add // 0)
 }' "$metrics_file" 2>/dev/null)
#!/bin/bash
set -euo pipefail

tmp_file="$(mktemp)"
cat > "$tmp_file" <<'EOF'
{"calls":2}
{"calls":5}
{"calls":7}
EOF

echo "Current logic (max):"
jq -s '{total_calls:(map(.calls) | max // 0)}' "$tmp_file"

echo "Expected logic (sum):"
jq -s '{total_calls:(map(.calls // 0) | add // 0)}' "$tmp_file"

rm -f "$tmp_file"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_loop.sh` around lines 553 - 558, The aggregation for total_calls in the
jq -s block is using max which undercounts; change the expression for
total_calls to sum per-loop call values and handle missing values and empty
arrays (e.g. replace (map(.calls) | max // 0) with (map(.calls // 0) | add //
0)) so the summary variable correctly totals calls from metrics_file.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@ralph_loop.sh`:
- Around line 553-558: The aggregation for total_calls in the jq -s block is
using max which undercounts; change the expression for total_calls to sum
per-loop call values and handle missing values and empty arrays (e.g. replace
(map(.calls) | max // 0) with (map(.calls // 0) | add // 0)) so the summary
variable correctly totals calls from metrics_file.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 27b06437-c812-4d47-bc6f-c91dd5233614

📥 Commits

Reviewing files that changed from the base of the PR and between d015838 and 7b927f8.

📒 Files selected for processing (3)
  • ralph-stats.sh
  • ralph_loop.sh
  • tests/unit/test_metrics_tracking.bats
✅ Files skipped from review due to trivial changes (1)
  • ralph-stats.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/test_metrics_tracking.bats

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 2, 2026

PR Review: feat(metrics): loop metrics tracking and ralph-stats analytics

This is a clean, well-scoped implementation. The architecture follows existing patterns, the tests cover the main behaviors, and the installation wiring is consistent with how other commands are set up. A few things worth addressing:


Bug: Off-by-one in calls_this_loop delta (ralph_loop.sh)

The arithmetic condition for the per-loop call delta uses strict greater-than:

local calls_this_loop=$(( calls_after_exec > calls_before_exec ? calls_after_exec - calls_before_exec : calls_after_exec ))

When calls_after_exec == calls_before_exec (i.e., no API calls were made — timed out, skipped, etc.), the condition is false and the fallback returns the full counter value rather than 0. The fix is straightforward:

local calls_this_loop=$(( calls_after_exec >= calls_before_exec ? calls_after_exec - calls_before_exec : calls_after_exec ))

PR description inconsistency (minor)

The Implementation Notes section still says:

total_calls in summary = max(calls) from metrics file (cumulative counter at last loop)

But the most recent commit switched this to add (sum of per-loop deltas), which is correct given that calls_this_loop is now a delta rather than a cumulative counter. The description should be updated to reflect the actual behavior. The tests confirm the add behavior is right (2+5+7=14), so this is a docs-only fix.


Code duplication: jq query appears twice

The identical jq summary query is in both print_metrics_summary (ralph_loop.sh) and ralph-stats.sh. When this logic needs to change in the future (e.g., adding a new metric field), it will need to be updated in two places. This is not a blocker given the current scope, just worth noting for future maintenance.


Weak assertion in print_metrics_summary test

The last test only checks for the string "total_loops" and that "2" appears somewhere in the output:

[[ "$output" == *"total_loops"* ]]
[[ "$output" == *"2"* ]]

The other three tests validate specific field values via jq. Aligning this test to do the same (checking successful, avg_duration, total_calls) would make it consistent and catch regressions more precisely.


Summary

The overall implementation is solid — the track_metrics() function, ralph-stats.sh, install wiring, and integration tests all look correct. The off-by-one in the delta calculation is the only functional concern; the rest are polish items.

@frankbria frankbria merged commit dc89f16 into main Apr 2, 2026
5 checks passed
@frankbria frankbria deleted the feature/metrics-tracking branch April 2, 2026 02:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[P3] Phase 3.7: Implement metrics and analytics

1 participant