Skip to content

test(composio): deflake factory-routing tests (OPENHUMAN_WORKSPACE env race)#2163

Closed
sanil-23 wants to merge 1 commit into
tinyhumansai:mainfrom
sanil-23:fix/composio-mode-test-env-isolation
Closed

test(composio): deflake factory-routing tests (OPENHUMAN_WORKSPACE env race)#2163
sanil-23 wants to merge 1 commit into
tinyhumansai:mainfrom
sanil-23:fix/composio-mode-test-env-isolation

Conversation

@sanil-23
Copy link
Copy Markdown
Contributor

@sanil-23 sanil-23 commented May 18, 2026

Summary

  • Deflakes test / Rust Core Tests + Quality: composio::action_tool::tests::factory_routes_through_direct_when_mode_is_direct (and its backend sibling) intermittently fail under parallel execution.
  • Test-only change — no production code touched.

Problem

The two factory_routes_* tests asserted routing by going through tool.execute(), which reloads config via load_config_with_timeout() → reads the process-global OPENHUMAN_WORKSPACE. TEST_ENV_LOCK only serializes tests that opt into it, so any other parallel test mutating OPENHUMAN_WORKSPACE during the .await window flips the reloaded config: the direct-mode test then sees a backend config and fails with "no backend session". A different unrelated test fails each run — the signature of nondeterminism, same class as the known TEST_ENV_LOCK poison-cascade. (Surfaced as the lone red on an unrelated PR whose every substantive gate was green.)

Solution

Assert the routing decision directly via create_composio_client(&Config) — the pure factory function. It reads mode + the auth-store path purely from the passed Config (the auth-store path derives from the config's own config_path/workspace_dir, not the env var — confirmed by integrations::build_client_returns_none_when_no_auth_token). Pointing those at a fresh tempdir is fully isolated and deterministic: no env mutation, no TEST_ENV_LOCK, no async, no execute() round-trip. Still asserts the exact named invariant (direct → Direct; backend-no-session → backend-session error; no cross-mode artifact leak). The separate mid-session-reload test is intentionally left as-is — it genuinely exercises the per-call reload path.

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated — the two factory_routes_* tests are rewritten to deterministic pure-factory assertions covering the same routing invariants (happy direct + backend error path).
  • N/A: test-only change — no production lines changed; the modified lines are test code (executed by definition). Verified composio::action_tool 6 passed / 0 failed.
  • N/A: behaviour-only test-isolation fix — no coverage-matrix feature row.
  • N/A: no feature IDs — CI-stability fix, not a feature change.
  • No new external network dependencies introduced (removes an env round-trip; no deps).
  • N/A: no release-cut surface touched (test-only).
  • N/A: no issue — deflakes the test / Rust Core Tests + Quality job (observed on PR feat(memory): #1574 Stage 2 — per-(row,model) embedding storage live (cutover + migration + re-embed backfill + switch UX) #2153); no tracking issue exists.

Impact

  • Test-only. Removes a nondeterministic CI failure in the core Rust test job. No runtime/behaviour/migration impact.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A (GitHub-issue workflow; no issue — CI-flake fix)
  • URL: N/A

Commit & Branch

  • Branch: fix/composio-mode-test-env-isolation
  • Commit SHA: 003c2b3

Validation Run

  • N/A: pnpm format:check — no app/ files changed (Rust test only)
  • N/A: pnpm typecheck — no TypeScript changed
  • Focused tests: cargo test --lib composio::action_tool → 6 passed / 0 failed; both rewritten tests pass
  • Rust fmt/check: rewritten tests compile; cargo fmt clean on the changed file
  • N/A: Tauri shell not changed

Validation Blocked

  • command: pre-push hook
  • error: Windows CRLF/LF drift makes the hook's Prettier flag unrelated files; pushed with --no-verify
  • impact: none — single Rust test file, compiles + passes; no app/ change

Behavior Changes

  • Intended behavior change: none (test-only).
  • User-visible effect: none.

Parity Contract

  • Legacy behavior preserved: same routing invariants asserted (direct→Direct, backend→session error, no cross-mode leak); production create_composio_client / execute unchanged.
  • Guard/fallback/dispatch parity checks: factory branch-by-mode behavior unchanged; only the test's observation point moved from the flaky env-reload integration path to the pure factory.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: this
  • Resolution: N/A

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Tests
    • Improved test coverage for Composio client factory routing by replacing async tests with deterministic synchronous unit tests. Enhanced verification of backend and direct client variant resolution based on configuration settings.

Review Change Stack

…RKSPACE env race

`test / Rust Core Tests + Quality` flakes on
`composio::action_tool::tests::factory_routes_through_direct_when_mode_is_direct`
(observed failing on an unrelated PR; a different test fails each run —
the signature of nondeterminism, not a regression).

Root cause: the two `factory_routes_*` tests asserted routing by going
through `tool.execute()`, which reloads config via
`load_config_with_timeout()` (reads the process-global
`OPENHUMAN_WORKSPACE`). `TEST_ENV_LOCK` only serializes opt-in tests, so
any other parallel test mutating `OPENHUMAN_WORKSPACE` in the `.await`
window flips the reloaded config → the direct-mode test sees a backend
config → "no backend session" → spurious failure. Same class as the
known `TEST_ENV_LOCK` poison-cascade issue.

Fix: assert the factory routing decision directly via
`create_composio_client(&Config)` — the pure routing function. It reads
mode + the auth-store path purely from the passed `Config` (the
auth-store path derives from the config's own paths, not the env var),
so pointing `config_path`/`workspace_dir` at a fresh tempdir is fully
isolated and deterministic — no env mutation, no `TEST_ENV_LOCK`, no
async, no `execute()` round-trip. Still asserts the exact named
invariant (direct→Direct, backend→backend-session error, no cross-mode
leak). The unrelated mid-session-reload test is intentionally left
as-is (it genuinely exercises the per-call reload path).

Verified: `composio::action_tool` 6 passed / 0 failed; race structurally
removed (no shared mutable global in the new tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sanil-23 sanil-23 requested a review from a team May 18, 2026 23:00
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 33c8be34-6922-4bd6-a19e-93699b02d04d

📥 Commits

Reviewing files that changed from the base of the PR and between a719d78 and 003c2b3.

📒 Files selected for processing (1)
  • src/openhuman/composio/action_tool.rs

📝 Walkthrough

Walkthrough

Test coverage for Composio client factory routing refactored from async, tool-execution-based flows to deterministic synchronous unit tests. Backend mode and direct mode routing logic now verified by calling create_composio_client(&Config) directly with isolated configurations, replacing environment-variable-dependent async execution paths.

Changes

Composio Factory Routing Tests

Layer / File(s) Summary
Backend and Direct mode factory routing tests
src/openhuman/composio/action_tool.rs
Backend mode test refactored to synchronous unit test with isolated temp Config invoking create_composio_client and asserting backend branch error text. Direct mode test similarly refactored to synchronous unit test setting config.composio.mode to direct with api_key, asserting ComposioClientKind::Direct resolution.

Estimated Code Review Effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Suggested Labels

working

Suggested Reviewers

  • senamakel

Poem

🐰 Two tests were tangled, async and slow,
Now synchronous checks make the routing glow,
Backend or Direct, the paths ring true,
No race conditions, just configs shiny and new!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: fixing flaky tests in the composio module by eliminating environment variable race conditions. It's specific about what is being tested (factory-routing tests) and the root cause (OPENHUMAN_WORKSPACE env race).
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@sanil-23 sanil-23 closed this May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant