test(tmux): add 14 integration tests for tmux session management#245
Conversation
Tests check_tmux_available(), get_tmux_base_index(), and setup_tmux_session() using inline function definitions and a file-based tmux call tracker. Covers: session creation, 3-pane layout (split-window -h and -v), live log tail in pane 1, monitor in pane 2, loop command in pane 0 (no --monitor flag), window title, pane focus, and custom flag forwarding (--calls, --prompt). Total tests: 660 (was 646, +14)
WalkthroughAdds a new Bats integration test Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/integration/test_tmux_integration.bats`:
- Around line 229-405: Add tests that cover session-name uniqueness,
detach/reattach flows, and concurrent sessions: 1) For setup_tmux_session, add a
test that invokes setup_tmux_session multiple times in rapid succession and
asserts tmux new-session -d -s ralph-[0-9]+ was called with unique session
suffixes by inspecting TMUX_CALL_LOG for duplicate names; 2) Add a test that
simulates detach/reattach by calling setup_tmux_session, then invoking the
mocked tmux detach-client/select-session sequence and asserting the script calls
the correct tmux attach or select commands; 3) Add a test that runs multiple
parallel setup_tmux_session invocations (background jobs) and asserts multiple
distinct ralph- sessions are created and that subsequent pane setup/send-keys
calls are routed to the correct session identifiers; use existing helpers like
assert_tmux_called_with and TMUX_CALL_LOG to validate calls.
- Around line 11-14: The test re-defines the real functions inline (see the
INLINE FUNCTION DEFINITIONS block) which duplicates production logic and weakens
test validity; instead remove those duplicated function bodies from
tests/integration/test_tmux_integration.bats and source or invoke the real
implementations from ralph_loop.sh (e.g., source ralph_loop.sh and call the
actual ralph_loop/start/stop/session-management functions), leaving only
environment and tmux mocking scaffolding in the test file so the tests exercise
the real code paths for session management and edge cases.
- Around line 370-375: The test "setup_tmux_session focuses left pane (pane 0)
after setup" uses a too-broad regex in assert_tmux_called_with that also matches
title-setting calls; update the assertion in that test to ensure select-pane is
called with the -t target for pane 0 and not the -T title flag (for example
require "-t <session>:0.0" explicitly or use a tighter regex like 'tmux
select-pane -t [^ ]+\.0($| )' or include a negative lookahead for '-T'); modify
the assert_tmux_called_with call in the setup_tmux_session test to use the
tighter pattern so only actual focus commands match.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e4f06964-b95e-4b1f-b5d8-30e91322f479
📒 Files selected for processing (1)
tests/integration/test_tmux_integration.bats
| # INLINE FUNCTION DEFINITIONS FOR TESTING | ||
| # These mirror the implementations in ralph_loop.sh. | ||
| # IMPORTANT: Keep in sync if ralph_loop.sh changes. | ||
| # ============================================================================== |
There was a problem hiding this comment.
Inline duplicated function bodies weaken integration-test validity.
These tests re-implement the functions under test instead of executing the real ralph_loop.sh implementations, so regressions in production code can go undetected. Please source/invoke the real functions and keep only environment/tmux mocks in this test file.
Based on learnings: "Test suite must achieve 100% test pass rate with comprehensive coverage of ... session management ... edge cases."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/integration/test_tmux_integration.bats` around lines 11 - 14, The test
re-defines the real functions inline (see the INLINE FUNCTION DEFINITIONS block)
which duplicates production logic and weakens test validity; instead remove
those duplicated function bodies from
tests/integration/test_tmux_integration.bats and source or invoke the real
implementations from ralph_loop.sh (e.g., source ralph_loop.sh and call the
actual ralph_loop/start/stop/session-management functions), leaving only
environment and tmux mocking scaffolding in the test file so the tests exercise
the real code paths for session management and edge cases.
PR Review: test(tmux) - 14 integration tests for tmux session managementGood addition - the file-based call tracking pattern ( What is working well
Issues to address1. Inline copy diverges from production code The inline log_status "INFO" " Left: Ralph loop"
log_status "INFO" " Right-top: Claude Code live output"
log_status "INFO" " Right-bottom: Status monitor"
log_status "INFO" ""The comment says "Keep in sync if ralph_loop.sh changes" - but it is already out of sync with main. These lines do not affect test assertions, but worth correcting before the pattern drifts further. 2. No test for base-index = 1 (the most common non-default) Test No. 3 only validates the default (0). Users with 3. Several flag-forwarding paths have no test coverage Tests 13 and 14 cover
Since this test file is the explicit home for flag-forwarding validation, comprehensive coverage here is preferable to discovering a regression when a flag silently stops being forwarded. Minor observations
SummaryTests No. 1-6 and 9-12 are solid. The two main items to address are the diverged inline copy (easy fix) and missing coverage for non-zero |
- Tighten select-pane assertion: anchor to end-of-line so title-setting calls (select-pane -T) do not produce false positives (CR finding 3) - Add test 15: session name follows ralph-EPOCH format (CR finding 2) - Add test 16: detach/reattach log instructions present (CR finding 2) - Add test 17: two invocations create two distinct new-session calls (CR finding 2) - Improve INLINE block comment to explain why sourcing ralph_loop.sh directly is not practical (CR finding 1) Total tests: 663 (was 660, +3)
Follow-up Review: test(tmux) - PR updated to address feedbackThe commit Still unresolved from previous review1. Inline copy is still missing production log lines The inline These do not affect any current test assertion, but the divergence will grow silently. The comment at the top says "Keep in sync if ralph_loop.sh changes" — it is already out of sync. 2. No test for base-index = 1 Only the default (0) is covered. Adding a test that overrides the mock to return 3. Several flag-forwarding paths still have no coverage Tests 13 and 14 cover
These are low-effort additions following the same pattern as tests 13 and 14. Minor itemsPR title/description mismatch: The title still says "add 14 integration tests" but the file now has 17. Worth updating the PR description. Test No. 17 uses sleep 1: The Test No. 3 assertion still broad: What was done well in this update
The three main items to resolve before merging are: the inline copy divergence, the missing base-index-1 test, and the incomplete flag-forwarding coverage. |
There was a problem hiding this comment.
♻️ Duplicate comments (2)
tests/integration/test_tmux_integration.bats (2)
441-459:⚠️ Potential issue | 🟠 MajorConcurrent session behavior is still not actually covered.
Line 441-Line 459runs two sequential calls withsleep 1; this avoids same-second collisions rather than testing rapid/concurrent creation. The linked objective calls out multiple concurrent sessions explicitly.Based on learnings: "Test suite must achieve 100% test pass rate with comprehensive coverage of: ... session management ... edge cases."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_tmux_integration.bats` around lines 441 - 459, The test currently calls setup_tmux_session twice sequentially with sleep 1; change it to exercise concurrent creation by launching two setup_tmux_session invocations in parallel (background/subshell) and wait for both to finish, then assert TMUX_CALL_LOG contains two "tmux new-session" entries and two distinct session names; update the test block (referencing the test name and the setup_tmux_session function and TMUX_CALL_LOG variable) to start both invocations concurrently, wait, then perform the existing grep/count and name-uniqueness checks.
11-20:⚠️ Potential issue | 🟠 MajorInline re-implementation still weakens integration-test trust.
At
Line 11-Line 20, this suite still validates copied function bodies instead of exercisingralph_loop.shbehavior directly, so production regressions can pass unnoticed when implementations drift.Based on learnings: "Test suite must achieve 100% test pass rate with comprehensive coverage of: ... session management ... edge cases."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_tmux_integration.bats` around lines 11 - 20, The tests inline function bodies from ralph_loop.sh which weakens integration coverage; instead extract the reusable functions into a new library (e.g., ralph_loop_lib.sh) and have both ralph_loop.sh source that lib and tests/integration/test_tmux_integration.bats source the same lib so the tests exercise the real implementations; ensure the refactor preserves exported variables (RALPH_DIR, LOG_DIR) by keeping top-level assignments in ralph_loop.sh (not the lib) or provide a sourcing guard (e.g., only define functions when not already defined) so sourcing the lib in tests does not execute production top-level assignments.
🧹 Nitpick comments (1)
tests/integration/test_tmux_integration.bats (1)
419-424: Test 15 duplicates Test 4’s assertion.
Line 419-Line 424overlaps withLine 284-Line 288(new-session+ralph-[0-9]+). Consider merging or repurposing this case to cover an untested behavior.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_tmux_integration.bats` around lines 419 - 424, The test "setup_tmux_session generates session name in ralph-EPOCH format" duplicates the earlier assertion that tmux was called with "new-session -d -s ralph-[0-9]+"; update the tests so they are not redundant by either removing this duplicate block or changing it to exercise a different behavior of setup_tmux_session (for example verify session name prefix only, verify environment variables passed, or assert the call includes a specific flag). Locate the test function name setup_tmux_session and the assertion assert_tmux_called_with "^tmux new-session -d -s ralph-[0-9]+" and replace or modify this second occurrence accordingly so each test covers a unique case.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@tests/integration/test_tmux_integration.bats`:
- Around line 441-459: The test currently calls setup_tmux_session twice
sequentially with sleep 1; change it to exercise concurrent creation by
launching two setup_tmux_session invocations in parallel (background/subshell)
and wait for both to finish, then assert TMUX_CALL_LOG contains two "tmux
new-session" entries and two distinct session names; update the test block
(referencing the test name and the setup_tmux_session function and TMUX_CALL_LOG
variable) to start both invocations concurrently, wait, then perform the
existing grep/count and name-uniqueness checks.
- Around line 11-20: The tests inline function bodies from ralph_loop.sh which
weakens integration coverage; instead extract the reusable functions into a new
library (e.g., ralph_loop_lib.sh) and have both ralph_loop.sh source that lib
and tests/integration/test_tmux_integration.bats source the same lib so the
tests exercise the real implementations; ensure the refactor preserves exported
variables (RALPH_DIR, LOG_DIR) by keeping top-level assignments in ralph_loop.sh
(not the lib) or provide a sourcing guard (e.g., only define functions when not
already defined) so sourcing the lib in tests does not execute production
top-level assignments.
---
Nitpick comments:
In `@tests/integration/test_tmux_integration.bats`:
- Around line 419-424: The test "setup_tmux_session generates session name in
ralph-EPOCH format" duplicates the earlier assertion that tmux was called with
"new-session -d -s ralph-[0-9]+"; update the tests so they are not redundant by
either removing this duplicate block or changing it to exercise a different
behavior of setup_tmux_session (for example verify session name prefix only,
verify environment variables passed, or assert the call includes a specific
flag). Locate the test function name setup_tmux_session and the assertion
assert_tmux_called_with "^tmux new-session -d -s ralph-[0-9]+" and replace or
modify this second occurrence accordingly so each test covers a unique case.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 36fbb0fe-a098-4102-952a-b018b7d57a2f
📒 Files selected for processing (1)
tests/integration/test_tmux_integration.bats
- Test 17: replace sequential sleep-1 calls with true concurrent background subshells; assert count=2 new-session entries in shared log (CR Major) - Test 15: replace duplicate format check with timestamp-currency check — verifies epoch is within 5s of test execution, not just matching [0-9]+ (CR Nitpick — was duplicate of test 4) Skipped: extracting tmux functions to ralph_loop_lib.sh — this requires modifying ralph_loop.sh and is out of scope for a test-only PR. Tracked as a future improvement.
There was a problem hiding this comment.
♻️ Duplicate comments (2)
tests/integration/test_tmux_integration.bats (2)
11-156:⚠️ Potential issue | 🟠 MajorUse production implementations instead of inline mirrors for integration validity.
At Line 11 onward, these tests duplicate the tmux functions instead of executing the real
ralph_loop.shcode path, so production regressions can slip through while tests still pass. Keep tmux/environment mocks, but source/invoke the real implementation (or a shared extracted library) for these assertions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_tmux_integration.bats` around lines 11 - 156, The test currently duplicates production functions (log_status, check_tmux_available, get_tmux_base_index, setup_tmux_session) which masks regressions; instead remove these inline mirrors and source or invoke the real implementations (or a shared extracted library) used by ralph_loop.sh so tests exercise production code paths. Concretely, replace the inline definitions with sourcing of a functions-only file (or refactor ralph_loop.sh to export these functions) and keep only lightweight tmux/environment mocks in the test; ensure setup_tmux_session, get_tmux_base_index and check_tmux_available calls in the test resolve to the production implementations rather than the inline copies.
436-460:⚠️ Potential issue | 🟠 MajorCoverage is still too weak for session uniqueness/reattach behavior.
Line 447-459 only checks that two
new-sessioncalls happened, not that the session names are distinct. Line 436-441 checks printed guidance text, but not actual attach/select call behavior. This can pass even if multi-session handling regresses.Suggested tightening
`@test` "two concurrent setup_tmux_session invocations each create a tmux new-session call" { @@ - local count - count=$(grep -c "^tmux new-session" "$TMUX_CALL_LOG") - [ "$count" -eq 2 ] + local count + count=$(grep -c "^tmux new-session" "$TMUX_CALL_LOG") + [ "$count" -eq 2 ] + + local s1 s2 + s1=$(grep "^tmux new-session" "$TMUX_CALL_LOG" | sed -n '1s/.*-s \(ralph-[0-9]\+\).*/\1/p') + s2=$(grep "^tmux new-session" "$TMUX_CALL_LOG" | sed -n '2s/.*-s \(ralph-[0-9]\+\).*/\1/p') + [ -n "$s1" ] && [ -n "$s2" ] + [ "$s1" != "$s2" ] }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_tmux_integration.bats` around lines 436 - 460, The tests only assert two "tmux new-session" entries and some printed guidance; tighten them to verify actual distinct session names and attach/select behavior: in the concurrent test (referencing setup_tmux_session and TMUX_CALL_LOG) parse the session names from the "tmux new-session" log lines (e.g. extract the -s <name> token) and assert there are two different names, and also assert the TMUX_CALL_LOG contains corresponding "tmux attach -t <name>" or "tmux select-window -t <name>" lines for those names; likewise, for the guidance test that calls setup_tmux_session, assert the command output includes the concrete session name or attach command (not just "Ctrl+B" and "tmux attach") so the test ensures printed reattach instructions reference the actual session created.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@tests/integration/test_tmux_integration.bats`:
- Around line 11-156: The test currently duplicates production functions
(log_status, check_tmux_available, get_tmux_base_index, setup_tmux_session)
which masks regressions; instead remove these inline mirrors and source or
invoke the real implementations (or a shared extracted library) used by
ralph_loop.sh so tests exercise production code paths. Concretely, replace the
inline definitions with sourcing of a functions-only file (or refactor
ralph_loop.sh to export these functions) and keep only lightweight
tmux/environment mocks in the test; ensure setup_tmux_session,
get_tmux_base_index and check_tmux_available calls in the test resolve to the
production implementations rather than the inline copies.
- Around line 436-460: The tests only assert two "tmux new-session" entries and
some printed guidance; tighten them to verify actual distinct session names and
attach/select behavior: in the concurrent test (referencing setup_tmux_session
and TMUX_CALL_LOG) parse the session names from the "tmux new-session" log lines
(e.g. extract the -s <name> token) and assert there are two different names, and
also assert the TMUX_CALL_LOG contains corresponding "tmux attach -t <name>" or
"tmux select-window -t <name>" lines for those names; likewise, for the guidance
test that calls setup_tmux_session, assert the command output includes the
concrete session name or attach command (not just "Ctrl+B" and "tmux attach") so
the test ensures printed reattach instructions reference the actual session
created.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7b171572-6c6d-4d22-9f2f-b813b48bb355
📒 Files selected for processing (1)
tests/integration/test_tmux_integration.bats
|
Round-3 Review: test(tmux) addressing round-2 CodeRabbit feedback The changes in f3535b0 are clean and correct responses to the CodeRabbit round-2 findings. Test 15: Timestamp-currency check (delta -le 5) is a meaningful improvement over the prior duplicate format check. Test 17: True concurrent background subshells with a shared log file is the right way to test session isolation. Still open from previous reviews INLINE COPY MISSING 4 LOG LINES This was raised in both the first and second reviews and remains unresolved. The inline setup_tmux_session() in the test file omits four log_status lines that exist in ralph_loop.sh (lines 383-394). Current inline copy goes directly from: Production code has four additional lines between those two: The inline comment says "Keep in sync if ralph_loop.sh changes" but these lines were never included. Test 16 only checks for Ctrl+B and tmux attach, so it passes against the incomplete copy and would not catch regressions in the pane-description output. Fix: Add the 4 missing lines to the inline function. Optionally extend Test 16 to assert on pane descriptions, e.g. [[ "$output" == "Right-top" ]] Minor: PR description still says 14 tests but the file now has 17. Worth updating before merge. |
Summary
Closes #14
tests/integration/test_tmux_integration.batswith 14 tests covering tmux session managementtest_backup_rollback.bats)run's subshell boundaryKey adaptations from the original plan
setup_tmux_session()creates: left (Ralph loop), right-top (Claude output viatail -f), right-bottom (ralph-monitor). Tests cover bothsplit-window -handsplit-window -v."Ralph: Loop | Output | Status"— not"Ralph: Loop | Monitor"as the issue plan stated.check_tmux_available()callsexit 1(notreturn 1) — tests run it viarun; test 2 skips gracefully when tmux is installed in the environment.setup_tmux_session()ends withexit 0— running viarunlets the exit only kill the subshell; the tmux call log file captures all assertions.get_tmux_base_index()test — this helper affects pane targeting and warrants explicit coverage.Test plan
bats tests/integration/test_tmux_integration.bats)Summary by CodeRabbit