Async Test Infrastructure & Editor Readiness Status + new refresh_unity tool #507

dsarno · 2026-01-03T20:03:13Z

What This Does

Makes Unity MCP more reliable when Unity is busy or running in the background.

Before this PR, tools could fail unpredictably when Unity was compiling scripts, importing assets, or minimized. Now the server checks if Unity is ready before running tools, waits when needed, and gives clear feedback when it can't proceed.

Based on the above, we also add run_tests_async, a new asynchronous (non-blocking) way to run tests—start them, get a job ID, and poll for progress instead of waiting for the entire suite to finish.

We also implement the much-requested refresh_unity tool so IDE-based agents can request a recompile/reload at any point without requiring Unity to be in focus.

New Features

Smarter Tool Execution (addresses #474)

Tools now check if Unity is ready before running (compiling? importing? tests already running?)
If Unity is busy, tools wait automatically or return a clear "try again in X ms" message
Detects when files changed outside Unity and refreshes automatically before proceeding

Non-Blocking Test Runs (addresses #503)

run_tests_async starts tests and returns immediately with a job ID (no more waiting for long suites)
get_test_job lets you poll for progress: how many tests done, which one is running, any failures so far
Progress tracking survives Unity recompiling mid-run
Detects when tests appear stuck and tells you why (e.g., "Unity is in the background")

Refresh Tool

New refresh_unity tool to explicitly tell Unity to reimport assets
Can wait until Unity is fully ready before returning
Handles Unity temporarily disconnecting during script recompilation

Background Throttling Fix

Unity slows way down when it's not the active window, which could stall tests
Now automatically sets Unity to "No Throttling" mode while tests run, then restores your preference after
Tests that trigger recompilation are marked as needing special handling (they work in CI, but may stall locally if Unity is minimized)

CI Changes

Tests that trigger script compilation now run in CI (previously skipped)
Uses the same Unity instance as other tests to avoid slow extra startup

What's New for LLMs

Tools list now includes:

run_tests_async — Start tests without blocking (preferred for long suites)
get_test_job — Check test progress and results
refresh_unity — Force Unity to reimport changed files

Tool descriptions now guide LLMs to prefer run_tests_async over the blocking run_tests.

No Breaking Changes

All existing tools work the same. This is purely additive.

Stats

45 files changed, ~2800 lines added
New README screenshot for v8.6 UI

Summary by Sourcery

Introduce editor readiness and external change tracking to gate tools, add async Unity test job tooling, and expose a refresh_unity workflow for safer background operation.

New Features:

Add run_tests_async and get_test_job tools to start and poll Unity test runs asynchronously with progress and partial failure details.
Expose a refresh_unity tool that triggers asset refresh and optional script compilation, with optional waiting for the editor to become ready.
Provide a v2 editor_state resource and Unity-side editor state cache to give LLMs a canonical readiness snapshot with advice and staleness metadata.

Bug Fixes:

Ensure synchronous run_tests returns a structured busy response when a test run is already in progress instead of a generic failure.
Prevent intermittent flakiness in ManageScriptableObject tests by isolating each run in its own temporary subfolder and reducing reimport churn.
Avoid leaving compilation or domain reload activity running between tests by forcing synchronous AssetDatabase refreshes and cleanup in domain reload resilience tests.

Enhancements:

Gate key tools (tests, asset and scene operations, gameobject management) behind a preflight readiness check that can wait for compilation to finish and refresh stale assets.
Track Unity test run status and job metadata across domain reloads, including per-test progress, failure summaries, and stuck-detection when the editor is backgrounded.
Temporarily disable Unity editor throttling during test runs to improve reliability when the editor is not focused, restoring user preferences afterward.
Standardize AssetDatabase.Refresh calls across tools to use synchronous import options so newly created or modified assets are immediately visible to the editor.
Improve manage_scriptable_object tests by splitting placement and patching concerns and tightening assertions around created asset locations.

CI:

Extend the Unity CI workflow to run explicit domain reload tests in a dedicated step before the main test suite, using the same Unity instance.

Documentation:

Update README tooling list and installation instructions to document run_tests_async, get_test_job, refresh_unity, and the v8.6.0 release and UI.
Add developer documentation explaining domain reload tests and how backgrounded Unity can stall compilation-driven tests, with recommended workarounds.

Summary by CodeRabbit

New Features
- Async test runs with job tracking, polling API, and a no-throttle test mode
- Editor-state v2 with server-side enrichment and external-change detection
- Unified refresh tool that can wait for editor readiness and clear external-dirty state
- Server-side preflight guard for tools
Bug Fixes
- More reliable asset handling via synchronous asset refreshes
- Improved domain-reload test resilience and per-test isolation
Documentation
- README and developer docs updated for async tools, server naming, and troubleshooting
Tests
- New integration tests covering editor-state v2, refresh recovery, external-change scanning, and test-job flows

_{✏️ Tip: You can customize this high-level summary in your review settings.}

… stall prevention - Add TestRunnerNoThrottle.cs: Sets editor to 'No Throttling' mode during test runs with SessionState persistence across domain reload - Add run_tests_async and get_test_job tools for non-blocking test execution - Add TestJobManager for async test job tracking with progress monitoring - Add ForceSynchronousImport to all AssetDatabase.Refresh() calls to prevent stalls - Mark DomainReloadResilienceTests as [Explicit] with documentation explaining the test infrastructure limitation (internal coroutine waits vs MCP socket polling) - MCP workflow is unaffected - socket messages provide external stimulus that keeps Unity responsive even when backgrounded

- Remove unused Newtonsoft.Json.Linq import from TestJobManager - Add throttling to SessionState persistence (once per second) to reduce overhead - Critical job state changes (start/finish) still persist immediately - Fix duplicate XML summary tag in DomainReloadResilienceTests

- Add run_tests_async and get_test_job to main README tools list - Document background stall limitation for domain reload tests in DEV readme

Run [Explicit] domain_reload tests in their own job using -testCategory

Combines into single job with two test steps to reuse cached Library

- Fix TOCTOU race in TestJobManager.StartJob (single lock scope for check-and-set) - Store TestRunnerApi reference with HideAndDontSave to prevent GC/serialization issues

- run_tests_async is now marked as preferred for long-running suites - run_tests description notes it blocks and suggests async alternative

sourcery-ai · 2026-01-03T20:03:21Z

Reviewer's Guide

Introduces a unified editor readiness infrastructure (editor_state v2 + server preflight) and async test job tracking, then wires core tools and CI/tests to use it so Unity MCP behaves reliably when the editor is compiling, importing, or backgrounded. Adds a new refresh_unity tool and async testing tools run_tests_async/get_test_job, plus a background throttling helper for test runs and related documentation updates.

Sequence diagram for async test run lifecycle with run_tests_async and get_test_job

sequenceDiagram
    actor Client
    participant Server as Server_tools
    participant Preflight as Server_preflight
    participant Unity as Unity_Editor
    participant TRService as TestRunnerService
    participant JobMgr as TestJobManager

    Client->>Server: run_tests_async(mode, filters)
    Server->>Unity: run_tests_async (MCP tool)
    Unity->>JobMgr: StartJob(mode, filterOptions)
    JobMgr->>TRService: RunTestsAsync(mode, filterOptions)
    TRService->>TRService: TestRunStatus.MarkStarted(mode)
    TRService->>Unity: UnityTestRunnerApi.Execute(filter)

    Note over Unity,TRService: Unity runs tests and fires callbacks

    Unity-->>TRService: RunStarted(testsToRun)
    TRService->>JobMgr: OnRunStarted(totalTests)

    loop For each test
        Unity-->>TRService: TestStarted(test)
        TRService->>JobMgr: OnTestStarted(testFullName)
        Unity-->>TRService: TestFinished(result)
        TRService->>JobMgr: OnLeafTestFinished(fullName,isFailure,message)
    end

    Unity-->>TRService: RunFinished(result)
    TRService->>JobMgr: OnRunFinished()
    TRService->>JobMgr: FinalizeCurrentJobFromRunFinished(TestRunResult)
    TRService->>TRService: TestRunStatus.MarkFinished()

    loop Client polls
        Client->>Server: get_test_job(job_id)
        Server->>Unity: get_test_job (MCP tool)
        Unity->>JobMgr: GetJob(job_id)
        JobMgr-->>Unity: TestJob
        Unity-->>Server: SuccessResponse(ToSerializable(job,...))
        Server-->>Client: job status + progress
    end

Sequence diagram for refresh_unity tool with disconnect recovery and external changes scanner

sequenceDiagram
    actor Client
    participant Server as Server_refresh_unity_py
    participant State as editor_state_v2_py
    participant Scanner as ExternalChangesScanner
    participant Unity as Unity_Editor
    participant ToolCS as RefreshUnity_cs

    Client->>Server: refresh_unity(mode,scope,compile,wait_for_ready)
    Server->>Unity: call refresh_unity (C# tool) via transport
    Unity->>ToolCS: HandleCommand(params)
    ToolCS->>Unity: AssetDatabase.Refresh / RequestScriptCompilation
    alt wait_for_ready true
        ToolCS->>Unity: WaitForUnityReadyAsync
        Unity-->>ToolCS: ready or timeout
    end
    ToolCS-->>Server: SuccessResponse or error

    alt response is error and retryable and wait_for_ready
        Server->>State: get_editor_state_v2
        State-->>Server: state_v2 with advice.ready_for_tools
        loop until ready or timeout
            Server->>State: get_editor_state_v2
            State-->>Server: updated state_v2
        end
    end

    Server->>State: get_editor_state_v2 (final readiness check)
    State-->>Server: state_v2
    Server->>Scanner: clear_dirty(instance_id)
    Server-->>Client: MCPResponse(success, message, data)

Class diagram for async test job tracking and status (TestJobManager, TestRunStatus, related types)

classDiagram
    class TestJobStatus {
        <<enum>>
        Running
        Succeeded
        Failed
    }

    class TestJobFailure {
        +string FullName
        +string Message
    }

    class TestJob {
        +string JobId
        +TestJobStatus Status
        +string Mode
        +long StartedUnixMs
        +long FinishedUnixMs
        +long LastUpdateUnixMs
        +int TotalTests
        +int CompletedTests
        +string CurrentTestFullName
        +long CurrentTestStartedUnixMs
        +string LastFinishedTestFullName
        +long LastFinishedUnixMs
        +List~TestJobFailure~ FailuresSoFar
        +string Error
        +TestRunResult Result
    }

    class TestJobManager {
        <<static>>
        -Dictionary~string,TestJob~ Jobs
        -string _currentJobId
        -long _lastPersistUnixMs
        +string CurrentJobId
        +bool HasRunningJob
        +string StartJob(TestMode mode, TestFilterOptions filterOptions)
        +void FinalizeCurrentJobFromRunFinished(TestRunResult resultPayload)
        +void OnRunStarted(int totalTests)
        +void OnTestStarted(string testFullName)
        +void OnLeafTestFinished(string testFullName, bool isFailure, string message)
        +void OnRunFinished()
        +TestJob GetJob(string jobId)
        +object ToSerializable(TestJob job, bool includeDetails, bool includeFailedTests)
    }

    class TestRunStatus {
        <<static>>
        -bool _isRunning
        -TestMode _mode
        -long _startedUnixMs
        -long _finishedUnixMs
        +bool IsRunning
        +TestMode Mode
        +long StartedUnixMs
        +long FinishedUnixMs
        +void MarkStarted(TestMode mode)
        +void MarkFinished()
    }

    class TestRunnerService {
        +Task~TestRunResult~ RunTestsAsync(TestMode mode, TestFilterOptions options)
        +void RunStarted(ITestAdaptor testsToRun)
        +void RunFinished(ITestResultAdaptor result)
        +void TestStarted(ITestAdaptor test)
        +void TestFinished(ITestResultAdaptor result)
        -static int CountLeafTests(ITestAdaptor node)
    }

    class RunTestsAsync {
        <<tool>>
        +Task~object~ HandleCommand(JObject @params)
    }

    class GetTestJob {
        <<tool>>
        +object HandleCommand(JObject @params)
    }

    TestJobManager "1" o-- "*" TestJob
    TestJob "*" o-- "*" TestJobFailure
    TestRunnerService ..> TestJobManager : uses
    TestRunnerService ..> TestRunStatus : updates
    RunTestsAsync ..> TestJobManager : StartJob
    GetTestJob ..> TestJobManager : GetJob,ToSerializable

Class diagram for editor readiness infrastructure (EditorStateCache, EditorStateV2, ExternalChangesScanner, preflight)

classDiagram
    class EditorStateCache {
        <<static>>
        -long _sequence
        -long _observedUnixMs
        -bool _lastIsCompiling
        -long _lastCompileStartedUnixMs
        -long _lastCompileFinishedUnixMs
        -bool _domainReloadPending
        -long _domainReloadBeforeUnixMs
        -long _domainReloadAfterUnixMs
        -double _lastUpdateTimeSinceStartup
        -JObject _cached
        +JObject GetSnapshot()
        -void OnUpdate()
        -void ForceUpdate(string reason)
        -JObject BuildSnapshot(string reason)
    }

    class EditorStateV2 {
        <<resource>>
        +object HandleCommand(JObject @params)
    }

    class ExternalChangesState {
        +string project_root
        +int last_scan_unix_ms
        +int last_seen_mtime_ns
        +bool dirty
        +int dirty_since_unix_ms
        +int external_changes_last_seen_unix_ms
        +int last_cleared_unix_ms
        +list~string~ extra_roots
        +int manifest_last_mtime_ns
    }

    class ExternalChangesScanner {
        -dict~string,ExternalChangesState~ _states
        -int _scan_interval_ms
        -int _max_entries
        +void set_project_root(string instance_id, string project_root)
        +void clear_dirty(string instance_id)
        +dict update_and_get(string instance_id)
        -ExternalChangesState _get_state(string instance_id)
        -int _scan_paths_max_mtime_ns(Iterable roots)
        -list~Path~ _resolve_manifest_extra_roots(Path project_root, ExternalChangesState st)
    }

    class editor_state_v2_py {
        <<resource>>
        +async get_editor_state_v2(Context ctx) MCPResponse
        -async _infer_single_instance_id(Context ctx) string
        -dict _build_v2_from_legacy(dict legacy)
        -dict _enrich_advice_and_staleness(dict state_v2)
    }

    class preflight_py {
        <<helper>>
        +async preflight(ctx, bool requires_no_tests, bool wait_for_no_compile, bool refresh_if_dirty, float max_wait_s) MCPResponse
    }

    class refresh_unity_py {
        <<tool>>
        +async refresh_unity(Context ctx, string mode, string scope, string compile, bool wait_for_ready) MCPResponse
    }

    class RefreshUnity_cs {
        <<tool>>
        +Task~object~ HandleCommand(JObject @params)
        -Task WaitForUnityReadyAsync(TimeSpan timeout)
    }

    EditorStateV2 ..> EditorStateCache : uses
    editor_state_v2_py ..> ExternalChangesScanner : uses
    preflight_py ..> editor_state_v2_py : calls
    preflight_py ..> refresh_unity_py : optional refresh_if_dirty
    refresh_unity_py ..> RefreshUnity_cs : calls Unity tool
    refresh_unity_py ..> editor_state_v2_py : polls until ready
    refresh_unity_py ..> ExternalChangesScanner : clear_dirty
    ExternalChangesScanner o-- ExternalChangesState

File-Level Changes

Change	Details	Files
Add Unity-side editor readiness snapshot and test run tracking to support async testing and robust preflight checks.	Introduce EditorStateCache and EditorStateV2 resource to expose a cached editor_state@2 snapshot with compilation, activity, tests, and asset state. Add TestRunStatus and TestJobManager to track current test runs, persist minimal job metadata across domain reloads, and expose serializable job payloads with progress, failures, and stuck detection. Extend TestRunnerService callbacks to mark test runs started/finished, compute total leaf count, and feed per-test progress/failure events into TestJobManager.	`MCPForUnity/Editor/Services/EditorStateCache.cs` `MCPForUnity/Editor/Resources/Editor/EditorStateV2.cs` `MCPForUnity/Editor/Services/TestRunStatus.cs` `MCPForUnity/Editor/Services/TestJobManager.cs` `MCPForUnity/Editor/Services/TestRunnerService.cs`
Add async test tools on Unity and server side (run_tests_async/get_test_job) and improve busy semantics for synchronous run_tests.	Implement RunTestsAsync and GetTestJob editor tools that start a Unity TestRunner run asynchronously and allow polling by job_id with optional filter options and detail flags. Add server-side test_jobs tools run_tests_async and get_test_job that forward parameters to Unity, leverage preflight, and normalize error handling. Update run_tests server tool to run preflight before dispatch, and to special-case Unity's 'test run already in progress' error into a structured busy/retry MCPResponse.	`MCPForUnity/Editor/Tools/RunTestsAsync.cs` `MCPForUnity/Editor/Tools/GetTestJob.cs` `Server/src/services/tools/test_jobs.py` `Server/src/services/tools/run_tests.py`
Introduce editor_state_v2 resource on the server plus an external filesystem change scanner and shared preflight helper, then integrate these into relevant tools.	Add editor_state_v2 server resource that prefers Unity get_editor_state_v2, falls back to legacy get_editor_state, enriches with advice/staleness, infers instance_id, and merges external change info. Implement ExternalChangesScanner to track per-instance external filesystem modifications (including file: package roots) and expose dirty/last-seen timestamps, with unit tests. Add a shared preflight helper that consults editor_state_v2 to optionally refresh if dirty, block while tests are running, and wait out compilation/domain reload with a bounded retry loop, then wire it into run_tests, manage_asset, manage_gameobject, and manage_scene.	`Server/src/services/resources/editor_state_v2.py` `Server/src/services/state/external_changes_scanner.py` `Server/src/services/tools/preflight.py` `Server/src/services/tools/manage_asset.py` `Server/src/services/tools/manage_gameobject.py` `Server/src/services/tools/manage_scene.py` `Server/tests/integration/test_external_changes_scanner.py` `Server/tests/integration/test_editor_state_v2_contract.py`
Add refresh_unity tool on both Unity and server sides, including robustness around compilation/disconnects and clearing external-dirty state.	Create RefreshUnity editor tool that can force/conditional AssetDatabase refresh, optionally request script compilation, and optionally wait for the editor to become ready, with structured error responses on timeouts. Expose refresh_unity as a server tool that forwards parameters, treats disconnect+retry hints as recoverable when waiting, optionally polls editor_state_v2 until ready, and clears external_changes_scanner dirty state for the instance. Add integration tests to assert refresh_unity registration and retry-recovery behavior when Unity disconnects mid-refresh.	`MCPForUnity/Editor/Tools/RefreshUnity.cs` `Server/src/services/tools/refresh_unity.py` `Server/tests/integration/test_refresh_unity_registration.py` `Server/tests/integration/test_refresh_unity_retry_recovery.py`
Mitigate background test throttling and clarify behavior of compilation-triggering domain reload tests, including CI ordering.	Add TestRunnerNoThrottle service that listens to TestRunnerApi callbacks, sets Editor Interaction Mode to 'No Throttling' around test runs, and restores previous settings using SessionState across domain reloads. Adjust DomainReloadResilienceTests to use ForceSynchronousImport AssetDatabase.Refresh calls and clean up compilation state in TearDown to avoid leaving background compilation running. Update unity-tests GitHub workflow to run domain_reload tests as a separate, first step with an explicit category filter so they run reliably in CI.	`MCPForUnity/Editor/Services/TestRunnerNoThrottle.cs` `TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs` `.github/workflows/unity-tests.yml` `docs/README-DEV.md`
Tighten Editor/asset tools and tests to reduce flakiness by using synchronous refresh/import and avoiding heavy folder churn.	Switch various editor tools that create/update assets, shaders, scenes, scripts, prefabs, and screenshots to use AssetDatabase.Refresh with ForceSynchronousImport/ForceUpdate so Unity sees changes immediately. Refactor ManageScriptableObjectTests to use a per-test runRoot instead of deleting the entire TempRoot, and split nested-folder creation and patching into separate tests to reduce reimport churn. Guard manage_asset and manage_gameobject server tools with preflight to wait for compilation to finish and refresh when external changes are detected.	`MCPForUnity/Editor/Tools/ManageScene.cs` `MCPForUnity/Editor/Tools/ManageShader.cs` `MCPForUnity/Editor/Tools/ManageAsset.cs` `MCPForUnity/Editor/Tools/ManageGameObject.cs` `MCPForUnity/Editor/Tools/ManageScript.cs` `MCPForUnity/Editor/Tools/Prefabs/ManagePrefabs.cs` `TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/ManageScriptableObjectTests.cs` `Server/src/services/tools/manage_asset.py` `Server/src/services/tools/manage_gameobject.py`
Add tests to validate new behaviors and update documentation/README for the new tools and version.	Add run_tests busy semantics test, async test_jobs forwarding tests, and ensure refresh_unity and editor_state_v2 resources are registered. Update README and dev docs with new tool list (run_tests_async/get_test_job/refresh_unity), new screenshot and version tags, and updated UI labels for HTTP server controls.	`Server/tests/integration/test_run_tests_busy_semantics.py` `Server/tests/integration/test_test_jobs_async.py` `README.md` `docs/README-DEV.md`

Possibly linked issues

#: PR implements async test-running tools and infrastructure, effectively adding the requested Unity test execution capability.

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

coderabbitai · 2026-01-03T20:03:24Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Adds a v2 editor-state snapshot path with a cached EditorStateCache, server-side editor_state_v2 resource, external-changes scanner, preflight/tool guards, async test-job endpoints and client tools (run_tests_async/get_test_job/refresh_unity), no-throttle test-run support, TestJobManager/TestRunStatus, synchronous AssetDatabase refreshes across editor tools, and a CI domain-reload test step.

Changes

Cohort / File(s)	Change summary
CI workflow `.github/workflows/unity-tests.yml`	Added an explicit domain-reload test step (category filter) prior to standard tests and inlined Checkout/Cache usage.
Editor: state & cache `EditorStateV2` `\`MCPForUnity/Editor/Resources/Editor/EditorStateV2.cs``,` `MCPForUnity/Editor/Resources/Editor/EditorStateV2.cs.meta``	New Mcp resource handler `get_editor_state_v2` calling `EditorStateCache.GetSnapshot()` and returning Success/Error responses.
Editor: state cache implementation `\`MCPForUnity/Editor/Services/EditorStateCache.cs``,` `MCPForUnity/Editor/Services/EditorStateCache.cs.meta``	New EditorStateCache building schema_v2 snapshots, tracking compile/domain/test/asset activity, periodic refresh, and public `GetSnapshot()` API.
Editor: test-run tracking `\`MCPForUnity/Editor/Services/TestJobManager.cs``,` `.../TestJobManager.cs.meta``,` `MCPForUnity/Editor/Services/TestRunStatus.cs``,` `.../TestRunStatus.cs.meta``	New TestJob model, TestJobManager for lifecycle/persistence/serialization, and TestRunStatus for thread-safe run-state tracking.
Editor: test-run helpers & service `\`MCPForUnity/Editor/Services/TestRunnerNoThrottle.cs``,` `.../TestRunnerNoThrottle.cs.meta``,` `MCPForUnity/Editor/Services/TestRunnerService.cs``	New TestRunnerNoThrottle (forces no-throttle prefs during tests) and TestRunnerService now reports richer run/test events to TestJobManager.
Editor: new tools `\`MCPForUnity/Editor/Tools/RunTestsAsync.cs``,` `.../RunTestsAsync.cs.meta``,` `MCPForUnity/Editor/Tools/GetTestJob.cs``,` `.../GetTestJob.cs.meta``,` `MCPForUnity/Editor/Tools/RefreshUnity.cs``,` `.../RefreshUnity.cs.meta``	Added RunTestsAsync (starts async job), GetTestJob (query job), and RefreshUnity (refresh + optional compile + wait-for-ready) tool handlers and metadata.
Editor: tools — synchronous refresh changes `\`MCPForUnity/Editor/Tools/ManageAsset.cs``,` `.../ManageGameObject.cs``,` `.../ManageScene.cs``,` `.../ManageScript.cs``,` `.../ManageShader.cs``,` `.../Prefabs/ManagePrefabs.cs``	Replaced `AssetDatabase.Refresh()` with `AssetDatabase.Refresh(ImportAssetOptions.ForceSynchronousImport)` to force synchronous imports after asset ops.
Editor UI / connection `\`MCPForUnity/Editor/Windows/Components/Connection/McpConnectionSection.cs``	Dynamic HTTP server startup timeout (longer in Dev Mode) when auto-starting a session.
Server: editor_state_v2 `\`Server/src/services/resources/editor_state_v2.py``	New server resource `get_editor_state_v2`: prefers v2 snapshot, falls back to legacy, infers instance_id, enriches advice/staleness, and augments with external-change info.
Server: external changes scanner `\`Server/src/services/state/external_changes_scanner.py``	New ExternalChangesScanner singleton scanning project + manifest `file:` roots to track per-instance dirty/last-seen timestamps and expose clear/update APIs.
Server: preflight & tool guards `\`Server/src/services/tools/preflight.py``, updates in` `Server/src/services/tools/manage_asset.py``,` `.../manage_gameobject.py``,` `.../manage_scene.py``	New preflight(ctx, ...) that loads editor_state_v2 and optionally waits/guards against compile/tests/dirty state; tools call preflight and may early-return a busy response.
Server: refresh & test-job endpoints `\`Server/src/services/tools/refresh_unity.py``,` `Server/src/services/tools/test_jobs.py``,` `Server/src/services/tools/run_tests.py``	Added refresh_unity (wraps editor refresh + optional wait-for-ready + clears external dirty state), run_tests_async/get_test_job server wrappers, and run_tests now performs preflight and returns structured busy responses when tests already run.
Server: manage_asset preflight integration `\`Server/src/services/tools/manage_asset.py``	manage_asset now calls `preflight(wait_for_no_compile=True, refresh_if_dirty=True)` before parsing inputs.
Server tests added/updated `\`Server/tests/integration/*``	New tests for editor_state_v2 contract, external_changes_scanner behavior, refresh_unity retry recovery, run_tests busy semantics, async test-job tooling, and tool registration.
Unity project tests & polish `\`TestProjects/.../DomainReloadResilienceTests.cs``,` `.../ManageScriptableObjectTests.cs``	Replaced Refresh calls with synchronous import option; improved test isolation and cleanup for domain-reload resilience.
Docs & README `\`README.md``,` `docs/README-DEV.md``	README updates (client list, v8.6.0), tool docs rework (run_tests_async/get_test_job/run_tests), and dev note about domain-reload backgrounding.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client as Client/API call
participant Server as MCP Server (editor_state_v2)
participant Transport as Unity Transport
participant Scanner as ExternalChangesScanner
participant Unity as Unity Editor Instance

Note over Client,Server: Request editor state v2
Client->>Server: GET editor_state_v2(ctx)
alt Server requests v2 snapshot
Server->>Transport: send get_editor_state_v2
Transport->>Unity: deliver command
Unity-->>Transport: v2 snapshot (or non-v2)
Transport-->>Server: response
else fallback to legacy
Server->>Transport: send legacy get_editor_state
Transport->>Unity: deliver command
Unity-->>Transport: legacy snapshot
Transport-->>Server: response
Server->>Server: _build_v2_from_legacy(legacy)
end
Server->>Scanner: external_changes_scanner.update_and_get(instance_id)
Scanner-->>Server: external_changes payload
Server->>Server: _enrich_advice_and_staleness(state_v2)
Server-->>Client: MCPResponse(success, data: state_v2 + advice + external_changes)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Add testing and move menu items to resources #316 — Related test-runner infrastructure changes and async test-job/tool wiring.
Feature/run tests summary clean #501 — Related updates to test result handling, includeDetails/includeFailedTests behavior.
Unity MCP CI Test Improvements #452 — Related CI workflow edits affecting Unity test steps and domain-reload handling.

Suggested reviewers

Scriptwonder
msanatan

Poem

"Hello from a rabbit in a dev burrow,
I cached a snapshot, quick as an arrow,
Tests run async, no throttle, hooray! 🐇
Scanners sniff changes, the server will say:
'Fresh state, ready — hop on, let's play!'"

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 12.93% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title directly summarizes the three main features added: async test infrastructure, editor readiness status tracking, and a new refresh_unity tool. It is concise, specific, and clearly reflects the primary changes in the changeset.

✨ Finishing touches

📝 Generate docstrings

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a946be9 and e5774bb.

📒 Files selected for processing (1)

MCPForUnity/Editor/Services/TestJobManager.cs

🚧 Files skipped from review as they are similar to previous changes (1)

MCPForUnity/Editor/Services/TestJobManager.cs

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Sourcery review

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

sourcery-ai

Hey - I've found 4 issues, and left some high level feedback:

In EditorStateCache.BuildSnapshot, activity.since_unix_ms is always set to the current observation time rather than the time the current phase began, which makes it hard for callers to know how long Unity has been compiling/running tests; consider tracking a per-phase start timestamp so this field reflects the duration of the current activity.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In `EditorStateCache.BuildSnapshot`, `activity.since_unix_ms` is always set to the current observation time rather than the time the current phase began, which makes it hard for callers to know how long Unity has been compiling/running tests; consider tracking a per-phase start timestamp so this field reflects the duration of the current activity.

## Individual Comments

### Comment 1
<location> `Server/src/services/tools/manage_asset.py:53` </location>
<code_context>

+    # Best-effort guard: if Unity is compiling/reloading or known external changes are pending,
+    # wait/refresh to avoid stale reads and flaky timeouts.
+    await preflight(ctx, wait_for_no_compile=True, refresh_if_dirty=True)
+
     def _parse_properties_string(raw: str) -> tuple[dict[str, Any] | None, str | None]:
</code_context>

<issue_to_address>
**issue (bug_risk):** The preflight result is ignored, so busy/retry signals are effectively dropped.

`preflight` is meant to short‑circuit with an `MCPResponse` when the tool should not proceed (e.g., compiling, tests running). Here (and in similar sites like `manage_gameobject` / `manage_scene`) the return value is awaited but ignored, so tools run even when `preflight` signals `busy`. Please handle and return the response when it is non‑successful, e.g.:

```python
resp = await preflight(ctx, wait_for_no_compile=True, refresh_if_dirty=True)
if isinstance(resp, MCPResponse) and not resp.success:
    return resp
```

so callers receive the intended retry signal instead of executing while Unity is in an invalid state.
</issue_to_address>

### Comment 2
<location> `MCPForUnity/Editor/Services/TestJobManager.cs:304-279` </location>
<code_context>
+            return jobId;
+        }
+
+        public static void FinalizeCurrentJobFromRunFinished(TestRunResult resultPayload)
+        {
+            long now = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();
+            lock (LockObj)
+            {
+                if (string.IsNullOrEmpty(_currentJobId) || !Jobs.TryGetValue(_currentJobId, out var job))
+                {
+                    return;
+                }
+
+                job.LastUpdateUnixMs = now;
+                job.FinishedUnixMs = now;
+                job.Status = TestJobStatus.Succeeded;
+                job.Error = null;
+                job.Result = resultPayload;
+                job.CurrentTestFullName = null;
+                _currentJobId = null;
+            }
+            PersistToSessionState(force: true);
+        }
+
</code_context>

<issue_to_address>
**issue (bug_risk):** Job status is always marked Succeeded in RunFinished, even when tests fail.

Here the job is always marked `TestJobStatus.Succeeded`, regardless of what `resultPayload` contains. That will mislead clients that read `job.Status` instead of inspecting `TestRunResult`. Consider deriving `job.Status` from `resultPayload` (e.g., mark as `Failed` when any tests fail/error) to keep this path consistent with the main completion logic.
</issue_to_address>

### Comment 3
<location> `Server/tests/integration/test_editor_state_v2_contract.py:8-17` </location>
<code_context>
+@pytest.mark.asyncio
</code_context>

<issue_to_address>
**suggestion (testing):** Add a test that exercises the legacy `get_editor_state` fallback path and verifies advice/staleness enrichment.

This currently only tests the v2 payload path from Unity. Please add a second test that stubs `send_with_unity_instance` to return the legacy `{success, data: { isCompiling, ... }}` shape without `schema_version`, and then assert that:

- The response `schema_version` is set to `unity-mcp/editor_state@2`.
- `advice.ready_for_tools` / `advice.blocking_reasons` are derived from `isCompiling` / `isUpdating` / `tests.is_running`.
- `staleness.age_ms` and `staleness.is_stale` are present.

This will exercise the fallback path and protect the mapping in `_build_v2_from_legacy` and `_enrich_advice_and_staleness`, which is important for older Unity plugins.

Suggested implementation:

```python
    resources = get_registered_resources()


@pytest.mark.asyncio
async def test_editor_state_legacy_fallback_enriches_advice_and_staleness(monkeypatch):
    """
    Ensure the legacy get_editor_state payload (no schema_version) is mapped into v2 shape,
    with advice/staleness enrichment.
    """
    import services.resources.editor_state_v2 as editor_state_v2

    # Legacy payload shape returned by Unity (no schema_version)
    legacy_response = {
        "success": True,
        "data": {
            "isCompiling": True,
            "isUpdating": False,
            "tests": {"is_running": True},
            # Include a timestamp or similar field if the resource uses it for staleness
            "timestamp_ms": 1_000,
        },
    }

    async def fake_send_with_unity_instance(*args, **kwargs):
        return legacy_response

    # Stub out the Unity call to return the legacy shape
    monkeypatch.setattr(
        editor_state_v2,
        "send_with_unity_instance",
        fake_send_with_unity_instance,
    )

    ctx = DummyContext()

    # Exercise the resource, which should detect the legacy shape and fall back
    # through _build_v2_from_legacy + _enrich_advice_and_staleness.
    result = await editor_state_v2.get_editor_state(ctx)

    # 1) v2 schema_version is injected
    assert result["schema_version"] == "unity-mcp/editor_state@2"

    # 2) Advice is enriched from isCompiling / isUpdating / tests.is_running
    advice = result.get("advice") or {}
    assert "ready_for_tools" in advice
    assert "blocking_reasons" in advice

    # When compiling or tests running, we should have at least one blocking reason
    assert advice["ready_for_tools"] is False
    assert isinstance(advice["blocking_reasons"], list)
    assert advice["blocking_reasons"], "Expected at least one blocking reason when editor is busy"

    # 3) Staleness is populated
    staleness = result.get("staleness") or {}
    assert "age_ms" in staleness
    assert "is_stale" in staleness

```

The above implementation assumes:

1. The resource module is `services.resources.editor_state_v2` and exposes an async entrypoint `get_editor_state(context, **kwargs)`. If the entrypoint name or signature differs (e.g. `editor_state`, `call`, or requires parameters like `instance_id`), update the `await editor_state_v2.get_editor_state(ctx)` call accordingly.
2. The Unity RPC function to stub is named `send_with_unity_instance` and is imported/used directly in `editor_state_v2`. If the function is referenced via another module (e.g. `unity_client.send_with_unity_instance`), adjust the `monkeypatch.setattr` target to match the actual attribute being used.
3. The legacy payload keys (`isCompiling`, `isUpdating`, `tests.is_running`, and `timestamp_ms`) match what `_build_v2_from_legacy` expects. If your actual legacy schema differs, update the `legacy_response["data"]` shape so that the fallback path and enrichment logic are exercised correctly.
4. If `DummyContext` requires constructor arguments or offers a helper to invoke resources (e.g. `await DummyContext().call("unity://editor_state")`), change the instantiation and call site to match the existing test patterns in this test module or neighboring tests.
</issue_to_address>

### Comment 4
<location> `Server/tests/integration/test_external_changes_scanner.py:6-15` </location>
<code_context>
+def test_external_changes_scanner_marks_dirty_and_clears(tmp_path, monkeypatch):
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding a small test that covers the scan-interval throttling behavior.

The existing tests cover dirty/clean transitions and `file:` resolution, but not the `scan_interval_ms` throttling in `update_and_get`.

A focused test could:
- Instantiate the scanner with a non-zero `scan_interval_ms` and set a project root.
- Trigger an initial scan, modify a file, then immediately call `update_and_get` again.
- Assert that the second call does not mark the project dirty until time is advanced (or the scanner is recreated with `scan_interval_ms=0`).

This would guard against regressions that remove or bypass throttling, which helps keep scans cheap on large projects.

Suggested implementation:

```python
import os
import time
from pathlib import Path


def test_external_changes_scanner_marks_dirty_and_clears(tmp_path, monkeypatch):
    # Ensure the scanner is active for this unit-style test (not gated by PYTEST_CURRENT_TEST).
    monkeypatch.delenv("PYTEST_CURRENT_TEST", raising=False)

    from services.state.external_changes_scanner import ExternalChangesScanner

    # Create a minimal Unity-like layout
    root = tmp_path / "Project"
    (root / "Assets").mkdir(parents=True)
    (root / "ProjectSettings").mkdir(parents=True)
    (root / "Packages").mkdir(parents=True)


def test_external_changes_scanner_respects_scan_interval_throttling(tmp_path, monkeypatch):
    # Ensure the scanner is active for this unit-style test (not gated by PYTEST_CURRENT_TEST).
    monkeypatch.delenv("PYTEST_CURRENT_TEST", raising=False)

    from services.state.external_changes_scanner import ExternalChangesScanner

    # Non-zero scan interval so that repeated scans are throttled.
    scan_interval_ms = 1000

    # Create a minimal Unity-like layout
    root = tmp_path / "Project"
    assets_dir = root / "Assets"
    assets_dir.mkdir(parents=True)
    (root / "ProjectSettings").mkdir(parents=True)
    (root / "Packages").mkdir(parents=True)

    # Create a file that the scanner would consider part of the project.
    watched_file = assets_dir / "foo.txt"
    watched_file.write_text("initial")

    # Instantiate the scanner with throttling enabled.
    scanner = ExternalChangesScanner(scan_interval_ms=scan_interval_ms)
    # Set the project root so the scanner knows what to watch.
    scanner.set_project_root(root)

    # Perform the initial scan; no external changes yet, so the project is clean.
    initial_state = scanner.update_and_get()
    assert not initial_state["dirty"]

    # Modify the file immediately after the initial scan.
    watched_file.write_text("modified")

    # Capture the current time and ensure the scanner sees that time for both calls
    # so that the second call is still within the scan_interval_ms window.
    now = time.time()
    monkeypatch.setattr(
        "services.state.external_changes_scanner.time.time",
        lambda: now,
    )

    # Because we are still within scan_interval_ms, this call should be throttled
    # and must NOT mark the project as dirty yet, even though the file changed.
    throttled_state = scanner.update_and_get()
    assert not throttled_state["dirty"]

    # Advance time beyond the scan interval and ensure the scanner sees the new time.
    advanced_now = now + (scan_interval_ms / 1000.0) + 0.001
    monkeypatch.setattr(
        "services.state.external_changes_scanner.time.time",
        lambda: advanced_now,
    )

    # Now that the scan interval has elapsed, the scanner should pick up the change
    # and mark the project as dirty.
    after_interval_state = scanner.update_and_get()
    assert after_interval_state["dirty"]

```

If the actual `ExternalChangesScanner` API differs, adjust the test to match:
1. If the constructor signature is different, pass `scan_interval_ms` using the correct parameter name or via configuration.
2. If there is no `set_project_root(root)` method, replace it with the appropriate way to configure the project root (e.g. passing `root` into the constructor or into `update_and_get(root)`).
3. If `update_and_get()` does not return a mapping with a `"dirty"` key, change the assertions to use the real shape (e.g. `state.dirty`, `state.is_dirty`, or a simple boolean return value).
4. If the scanner module does not use `time.time` via `services.state.external_changes_scanner.time.time`, update the `monkeypatch.setattr` targets to match how time is accessed inside `ExternalChangesScanner`.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

Server/src/services/tools/manage_asset.py

MCPForUnity/Editor/Services/TestJobManager.cs

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (15)

TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/ManageScriptableObjectTests.cs (1)
42-43: Optional: Consider per-run isolation for test materials.

The materials are created at TempRoot level rather than under _runRoot. While the unique GUIDs prevent conflicts, moving them under _runRoot would provide complete per-run isolation and simplify cleanup.
🔎 Suggested refactor
-            _matAPath = $"{TempRoot}/MatA_{Guid.NewGuid():N}.mat";
-            _matBPath = $"{TempRoot}/MatB_{Guid.NewGuid():N}.mat";
+            _matAPath = $"{_runRoot}/MatA.mat";
+            _matBPath = $"{_runRoot}/MatB.mat";
This eliminates the need for GUIDs in the material names since _runRoot already provides uniqueness.
Server/tests/integration/test_refresh_unity_registration.py (1)
4-12: LGTM! Simple and effective registration test.

The test correctly verifies that importing the refresh_unity module triggers tool registration.
Optional: Remove unnecessary `noqa` directive

The # noqa: F401 directive on line 9 can be removed since the F401 rule (unused import) is not enabled in your Ruff configuration.
-    import services.tools.refresh_unity  # noqa: F401
+    import services.tools.refresh_unity
.github/workflows/unity-tests.yml (1)
41-54: LGTM! Explicit domain reload test step added correctly.

The new step properly isolates domain reload tests using the -testCategory domain_reload filter. Running these tests first (before the main test suite) ensures they execute in a clean Unity instance, which is appropriate for domain reload testing.

Optional consideration: The domain reload test results aren't explicitly uploaded as artifacts (only the main tests step results are uploaded on line 71). If these tests can fail independently and you want their results preserved, consider adding:
- uses: actions/upload-artifact@v4
  if: always()
  with:
    name: Domain reload test results for ${{ matrix.testMode }}
    path: ${{ steps.domain-tests.outputs.artifactsPath }}
This is optional and depends on whether you need separate artifact preservation for domain reload tests.
Server/src/services/tools/run_tests.py (1)
106-119: Structured busy response improves client retry behavior.

The detection of "test run is already in progress" and conversion to a structured MCPResponse with hint="retry" and retry_after_ms=5000 is a good improvement over generic failures.

However, line 118 passes the Unity response directly to MCPResponse(**response). If the response dict contains unexpected keys, this could raise a validation error. Consider extracting only the expected fields.
🔎 Optional: Explicit field extraction for safety
-        return MCPResponse(**response)
+        return MCPResponse(
+            success=response.get("success", False),
+            error=response.get("error"),
+            message=response.get("message"),
+        )
MCPForUnity/Editor/Services/TestRunnerNoThrottle.cs (1)

76-104: Consider edge case: settings not restored after Unity crash.

If Unity crashes during a test run, EditorPrefs will retain the "No Throttling" settings (ApplicationIdleTime=0, InteractionMode=1) permanently since RestoreThrottling() never executes. This could surprise users who notice unusual battery drain or CPU usage after a crash.

A possible mitigation is to check and restore settings on editor startup if IsTestRunActive() is false but AreSettingsCaptured() is true (orphaned state).
Server/src/services/tools/preflight.py (2)
46-56: Consider logging exceptions for debugging.

The broad exception handling is appropriate for resilience (preflight shouldn't block tools if state is unavailable), but logging the exception at debug level would help diagnose issues in production:
     except Exception:
-        # If we cannot determine readiness, fall back to proceeding (tools already contain retry logic).
+        # If we cannot determine readiness, fall back to proceeding (tools already contain retry logic).
+        import logging
+        logging.debug("preflight: failed to load editor state", exc_info=True)
         return None
This applies similarly to the other except Exception blocks. Alternatively, if the codebase has a centralized logger, use that instead.

63-71: Best-effort refresh is reasonable, but consider logging failures.

The try-except-pass pattern for refresh_unity is documented as best-effort, which is appropriate. However, silently swallowing exceptions makes debugging difficult. Consider logging at debug level.
Server/src/services/tools/refresh_unity.py (2)
57-69: Consider moving the import outside the loop.

The from services.resources.editor_state_v2 import get_editor_state_v2 import on line 60 is inside the while loop. While Python caches imports, moving it before the loop (or at the top of the function) would be cleaner.
🔎 Proposed refactor
     if wait_for_ready:
         timeout_s = 60.0
         start = time.monotonic()
-        from services.resources.editor_state_v2 import get_editor_state_v2
+        # Import at function level or move to top
+        from services.resources.editor_state_v2 import get_editor_state_v2 as _get_state_v2

         while time.monotonic() - start < timeout_s:
-            state_resp = await get_editor_state_v2(ctx)
+            state_resp = await _get_state_v2(ctx)
71-79: Consider logging exceptions in best-effort cleanup blocks.

While the silent except Exception: pass is intentional for non-critical cleanup, logging at debug level would aid troubleshooting without affecting the control flow.
🔎 Proposed refactor
     try:
         from services.resources.editor_state_v2 import _infer_single_instance_id

         inst = unity_instance or await _infer_single_instance_id(ctx)
         if inst:
             external_changes_scanner.clear_dirty(inst)
-    except Exception:
-        pass
+    except Exception as exc:
+        # Best-effort cleanup; log but don't fail the refresh operation.
+        import logging
+        logging.getLogger(__name__).debug("Failed to clear external dirty flag: %s", exc)
MCPForUnity/Editor/Tools/RunTestsAsync.cs (1)

85-105: ParseStringArray duplicates logic from RunTests.cs.

This local function closely mirrors the ParseStringArray method in RunTests.cs (lines 136-168). Consider extracting to a shared utility to reduce duplication.

Note: The implementations have a subtle difference - this version uses .Values<string>() (line 99) while RunTests.cs manually filters by JTokenType.String. The .Values<string>() approach may return null for non-string tokens rather than skipping them.
Server/src/services/tools/test_jobs.py (2)
33-41: _coerce_string_list duplicates the same function in run_tests.py.

This local function is identical to _coerce_string_list in run_tests.py (lines 65-73 per the relevant snippets). Consider importing from a shared location.
🔎 Proposed refactor
 from services.tools.preflight import preflight
+from services.tools.run_tests import _coerce_string_list
 import transport.unity_transport as unity_transport
 from transport.legacy.unity_connection import async_send_command_with_retry

 ...

-    def _coerce_string_list(value) -> list[str] | None:
-        if value is None:
-            return None
-        if isinstance(value, str):
-            return [value] if value.strip() else None
-        if isinstance(value, list):
-            result = [str(v).strip() for v in value if v and str(v).strip()]
-            return result if result else None
-        return None
-
     params: dict[str, Any] = {"mode": mode}
     if (t := _coerce_string_list(test_names)):
64-66: Response handling pattern is duplicated.

Lines 64-66 and 90-92 share identical logic for converting failed dict responses to MCPResponse. This is a minor duplication but could be a shared helper if this pattern is used elsewhere.

Also applies to: 90-92
MCPForUnity/Editor/Tools/RefreshUnity.cs (1)
59-71: Consider making the scripts-only path more explicit.

The empty block at lines 61-65 with a comment is a bit unusual. Consider restructuring to make the intent clearer.
🔎 Proposed refactor
                 if (shouldRefresh)
                 {
-                    if (string.Equals(scope, "scripts", StringComparison.OrdinalIgnoreCase))
+                    bool isScriptsOnly = string.Equals(scope, "scripts", StringComparison.OrdinalIgnoreCase);
+                    if (!isScriptsOnly)
                     {
-                        // For scripts, requesting compilation is usually the meaningful action.
-                        // We avoid a heavyweight full refresh by default.
-                    }
-                    else
-                    {
+                        // Full asset refresh for non-scripts scopes
                         AssetDatabase.Refresh(ImportAssetOptions.ForceUpdate | ImportAssetOptions.ForceSynchronousImport);
                         refreshTriggered = true;
                     }
+                    // For scripts-only, we skip the heavyweight refresh and rely on compile request below
                 }
Server/src/services/state/external_changes_scanner.py (1)
213-216: Silent exception handling for manifest parsing.

Similar to other best-effort blocks in this PR, consider adding debug-level logging for troubleshooting.
🔎 Proposed refactor
         try:
             paths.extend(self._resolve_manifest_extra_roots(root, st))
-        except Exception:
-            pass
+        except Exception as exc:
+            import logging
+            logging.getLogger(__name__).debug("Failed to resolve manifest extra roots: %s", exc)
Server/src/services/resources/editor_state_v2.py (1)
24-66: The ctx parameter in _infer_single_instance_id is intentionally unused.

The static analysis flags ctx as unused (ARG001), but it's passed for interface consistency and potential future use. Consider adding a comment or using _ctx naming convention to signal intentional non-use.
🔎 Proposed refactor
-async def _infer_single_instance_id(ctx: Context) -> str | None:
+async def _infer_single_instance_id(_ctx: Context) -> str | None:
     """
     Best-effort: if exactly one Unity instance is connected, return its Name@hash id.
     This makes editor_state outputs self-describing even when no explicit active instance is set.
     """

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 191b730 and 766112b.

⛔ Files ignored due to path filters (2)

Server/uv.lock is excluded by !**/*.lock
docs/images/unity-mcp-ui-v8.6.png is excluded by !**/*.png

📒 Files selected for processing (43)

.github/workflows/unity-tests.yml
MCPForUnity/Editor/Resources/Editor/EditorStateV2.cs
MCPForUnity/Editor/Resources/Editor/EditorStateV2.cs.meta
MCPForUnity/Editor/Services/EditorStateCache.cs
MCPForUnity/Editor/Services/EditorStateCache.cs.meta
MCPForUnity/Editor/Services/TestJobManager.cs
MCPForUnity/Editor/Services/TestJobManager.cs.meta
MCPForUnity/Editor/Services/TestRunStatus.cs
MCPForUnity/Editor/Services/TestRunStatus.cs.meta
MCPForUnity/Editor/Services/TestRunnerNoThrottle.cs
MCPForUnity/Editor/Services/TestRunnerNoThrottle.cs.meta
MCPForUnity/Editor/Services/TestRunnerService.cs
MCPForUnity/Editor/Tools/GetTestJob.cs
MCPForUnity/Editor/Tools/GetTestJob.cs.meta
MCPForUnity/Editor/Tools/ManageAsset.cs
MCPForUnity/Editor/Tools/ManageGameObject.cs
MCPForUnity/Editor/Tools/ManageScene.cs
MCPForUnity/Editor/Tools/ManageScript.cs
MCPForUnity/Editor/Tools/ManageShader.cs
MCPForUnity/Editor/Tools/Prefabs/ManagePrefabs.cs
MCPForUnity/Editor/Tools/RefreshUnity.cs
MCPForUnity/Editor/Tools/RefreshUnity.cs.meta
MCPForUnity/Editor/Tools/RunTestsAsync.cs
MCPForUnity/Editor/Tools/RunTestsAsync.cs.meta
README.md
Server/src/services/resources/editor_state_v2.py
Server/src/services/state/external_changes_scanner.py
Server/src/services/tools/manage_asset.py
Server/src/services/tools/manage_gameobject.py
Server/src/services/tools/manage_scene.py
Server/src/services/tools/preflight.py
Server/src/services/tools/refresh_unity.py
Server/src/services/tools/run_tests.py
Server/src/services/tools/test_jobs.py
Server/tests/integration/test_editor_state_v2_contract.py
Server/tests/integration/test_external_changes_scanner.py
Server/tests/integration/test_refresh_unity_registration.py
Server/tests/integration/test_refresh_unity_retry_recovery.py
Server/tests/integration/test_run_tests_busy_semantics.py
Server/tests/integration/test_test_jobs_async.py
TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs
TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/ManageScriptableObjectTests.cs
docs/README-DEV.md

🧰 Additional context used

🧠 Learnings (12)

📚 Learning: 2025-11-05T18:23:12.349Z

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 368
File: MCPForUnity/UnityMcpServer~/src/resources/menu_items.py:15-15
Timestamp: 2025-11-05T18:23:12.349Z
Learning: In Unity MCP, the `name` parameter in the `mcp_for_unity_resource` decorator is the external API name exposed to MCP clients (LLMs, AI agents). The command string passed to `async_send_command_with_retry` or `async_send_with_unity_instance` (e.g., "get_menu_items") is the internal command identifier that must match the C# side. These are decoupled, allowing external API naming to evolve independently of internal command routing.

Applied to files:

Server/src/services/tools/manage_scene.py
README.md
Server/src/services/tools/refresh_unity.py
MCPForUnity/Editor/Tools/RefreshUnity.cs
Server/src/services/tools/manage_gameobject.py
Server/src/services/tools/manage_asset.py

📚 Learning: 2025-09-05T16:22:04.960Z

Learnt from: dsarno
Repo: CoplayDev/unity-mcp PR: 265
File: README.md:204-204
Timestamp: 2025-09-05T16:22:04.960Z
Learning: In the Unity MCP project, the ServerInstaller.cs creates a symlink from ~/Library/AppSupport to ~/Library/Application Support on macOS to mitigate argument parsing and quoting issues in some MCP clients. The README documentation should use the shortened AppSupport path, not the full "Application Support" path with spaces.

Applied to files:

docs/README-DEV.md
README.md

📚 Learning: 2025-09-05T16:22:04.960Z

Learnt from: dsarno
Repo: CoplayDev/unity-mcp PR: 265
File: README.md:204-204
Timestamp: 2025-09-05T16:22:04.960Z
Learning: In the Unity MCP project, the ServerInstaller.cs creates a symlink from ~/Library/AppSupport to ~/Library/Application Support on macOS to avoid spaces in paths that can cause arg parsing and quoting issues in some MCP clients. The README documentation should use the shortened AppSupport path, not the full "Application Support" path.

Applied to files:

docs/README-DEV.md
README.md

📚 Learning: 2025-12-29T15:23:17.871Z

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 491
File: MCPForUnity/Editor/Windows/EditorPrefs/EditorPrefsWindow.cs:78-115
Timestamp: 2025-12-29T15:23:17.871Z
Learning: In the MCPForUnity project, the team prefers to rely on their testing process to catch UI initialization issues rather than adding defensive null checks for UI elements in editor windows.

Applied to files:

docs/README-DEV.md
README.md

📚 Learning: 2025-10-13T13:27:23.040Z

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 316
File: TestProjects/UnityMCPTests/Assets/Tests/EditMode/Resources.meta:1-8
Timestamp: 2025-10-13T13:27:23.040Z
Learning: UnityMcpBridge is a legacy project kept for backwards compatibility; MCPForUnity is the only active Unity plugin project. GUID collisions between UnityMcpBridge and MCPForUnity are acceptable.

Applied to files:

docs/README-DEV.md
README.md
MCPForUnity/Editor/Tools/RefreshUnity.cs.meta

📚 Learning: 2025-09-04T01:01:11.927Z

Learnt from: dsarno
Repo: CoplayDev/unity-mcp PR: 260
File: UnityMcpBridge/UnityMcpServer~/src/server_version.txt:1-1
Timestamp: 2025-09-04T01:01:11.927Z
Learning: The UnityMcpBridge project is not maintaining changelogs yet, so don't suggest adding changelog entries for version bumps.

Applied to files:

README.md
MCPForUnity/Editor/Tools/RefreshUnity.cs.meta

📚 Learning: 2025-10-03T22:11:46.002Z

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 301
File: docs/CUSTOM_TOOLS.md:54-62
Timestamp: 2025-10-03T22:11:46.002Z
Learning: In Unity MCP, the `description` parameter in the `mcp_for_unity_tool` decorator is technically optional but should always be included as a best practice. Without it, there's a higher chance that MCP clients will not parse the tool correctly. All Unity MCP tools should include the description in the decorator for compatibility.

Applied to files:

README.md
Server/src/services/tools/refresh_unity.py

📚 Learning: 2025-10-13T13:41:00.086Z

Learnt from: JohanHoltby
Repo: CoplayDev/unity-mcp PR: 309
File: MCPForUnity/Editor/Helpers/ServerInstaller.cs:478-508
Timestamp: 2025-10-13T13:41:00.086Z
Learning: In the MCPForUnityTools feature (MCPForUnity/Editor/Helpers/ServerInstaller.cs), the design intentionally forces users to have only one .py file per MCPForUnityTools folder to keep file tracking simple. Package-style tools (subdirectories with __init__.py) are not supported.

Applied to files:

README.md

📚 Learning: 2025-11-27T21:09:35.011Z

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 401
File: MCPForUnity/Editor/Clients/Configurators/VSCodeConfigurator.cs:10-18
Timestamp: 2025-11-27T21:09:35.011Z
Learning: VS Code GitHub Copilot MCP configuration supports mcp.json files placed directly in the Code/User directory: %APPDATA%\Code\User\mcp.json on Windows, ~/Library/Application Support/Code/User/mcp.json on macOS, and ~/.config/Code/User/mcp.json on Linux. This is in addition to workspace-scoped .vscode/mcp.json files.

Applied to files:

README.md

📚 Learning: 2025-12-19T21:09:44.740Z

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 468
File: Server/DOCKER_OVERVIEW.md:22-22
Timestamp: 2025-12-19T21:09:44.740Z
Learning: In the Unity MCP project, the Docker workflow (.github/workflows/publish-docker.yml) intentionally uses the DOCKER_USERNAME secret instead of hardcoding "msanatan" to allow users who fork the repository to publish to their own Docker Hub repositories, while the documentation (Server/DOCKER_OVERVIEW.md) correctly references the official image location "msanatan/mcp-for-unity-server" for end users.

Applied to files:

README.md

📚 Learning: 2025-12-29T15:23:11.613Z

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 491
File: MCPForUnity/Editor/Windows/EditorPrefs/EditorPrefsWindow.cs:78-115
Timestamp: 2025-12-29T15:23:11.613Z
Learning: In MCPForUnity, prefer relying on the established testing process to catch UI initialization issues instead of adding defensive null checks for UI elements in editor windows. This means during reviews, verify that tests cover UI initialization paths and that code avoids repetitive null-check boilerplate in Editor Windows. If a UI element can be null, ensure there is a well-tested fallback or that its initialization is guaranteed by design, rather than sprinkling null checks throughout editor code.

Applied to files:

MCPForUnity/Editor/Tools/ManageScript.cs
MCPForUnity/Editor/Resources/Editor/EditorStateV2.cs
MCPForUnity/Editor/Tools/Prefabs/ManagePrefabs.cs
MCPForUnity/Editor/Tools/ManageScene.cs
MCPForUnity/Editor/Tools/ManageGameObject.cs
MCPForUnity/Editor/Services/TestRunnerService.cs
MCPForUnity/Editor/Tools/RunTestsAsync.cs
MCPForUnity/Editor/Services/TestRunnerNoThrottle.cs
MCPForUnity/Editor/Tools/RefreshUnity.cs
MCPForUnity/Editor/Services/TestRunStatus.cs
MCPForUnity/Editor/Tools/ManageAsset.cs
MCPForUnity/Editor/Tools/ManageShader.cs
MCPForUnity/Editor/Tools/GetTestJob.cs
MCPForUnity/Editor/Services/EditorStateCache.cs
MCPForUnity/Editor/Services/TestJobManager.cs

📚 Learning: 2025-09-03T16:00:55.839Z

Learnt from: dsarno
Repo: CoplayDev/unity-mcp PR: 0
File: :0-0
Timestamp: 2025-09-03T16:00:55.839Z
Learning: ComponentResolver in UnityMcpBridge/Editor/Tools/ManageGameObject.cs is a nested static class within ManageGameObject, not a sibling type. The `using static MCPForUnity.Editor.Tools.ManageGameObject;` import is required to access ComponentResolver methods directly without the outer class qualifier.

Applied to files:

MCPForUnity/Editor/Tools/RefreshUnity.cs

🧬 Code graph analysis (20)

Server/src/services/tools/manage_scene.py (1)

Server/src/services/tools/preflight.py (1)

preflight (27-105)

MCPForUnity/Editor/Resources/Editor/EditorStateV2.cs (1)

MCPForUnity/Editor/Services/EditorStateCache.cs (3)

JObject (91-217)

JObject (219-230)

EditorStateCache (35-63)

Server/tests/integration/test_test_jobs_async.py (2)

Server/tests/integration/test_helpers.py (1)

DummyContext (16-55)

Server/src/services/tools/test_jobs.py (2)

run_tests_async (17-66)

get_test_job (70-92)

Server/tests/integration/test_refresh_unity_registration.py (1)

Server/src/services/tools/refresh_unity.py (1)

refresh_unity (20-88)

TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs (2)

TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/ManageScriptableObjectTests.cs (1)

TearDown (56-90)

TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/MCPToolParameterTests.cs (1)

TearDown (46-70)

Server/src/services/tools/test_jobs.py (4)

Server/src/services/tools/preflight.py (1)

preflight (27-105)

Server/src/transport/legacy/unity_connection.py (1)

async_send_command_with_retry (753-785)

Server/src/services/tools/run_tests.py (1)

_coerce_string_list (66-74)

Server/src/transport/unity_transport.py (1)

send_with_unity_instance (79-113)

Server/tests/integration/test_external_changes_scanner.py (1)

Server/src/services/state/external_changes_scanner.py (4)

ExternalChangesScanner (34-240)

set_project_root (51-54)

update_and_get (172-240)

clear_dirty (56-62)

Server/tests/integration/test_refresh_unity_retry_recovery.py (3)

Server/src/services/state/external_changes_scanner.py (1)

ExternalChangesState (21-31)

Server/tests/integration/test_helpers.py (3)

DummyContext (16-55)

set_state (49-51)

model_dump (10-13)

Server/src/services/tools/refresh_unity.py (1)

refresh_unity (20-88)

Server/src/services/tools/run_tests.py (1)

Server/src/services/tools/preflight.py (1)

preflight (27-105)

MCPForUnity/Editor/Services/TestRunnerService.cs (2)

MCPForUnity/Editor/Services/TestRunStatus.cs (3)

TestRunStatus (10-59)

MarkStarted (39-48)

MarkFinished (50-58)

MCPForUnity/Editor/Services/TestJobManager.cs (6)

TestJobManager (48-578)

TestJobManager (65-69)

OnRunStarted (325-346)

OnRunFinished (401-415)

OnTestStarted (348-368)

OnLeafTestFinished (370-399)

MCPForUnity/Editor/Tools/RunTestsAsync.cs (2)

MCPForUnity/Editor/Resources/Tests/GetTests.cs (2)

ModeParser (77-105)

TryParse (79-104)

MCPForUnity/Editor/Tools/RunTests.cs (1)

ParseStringArray (137-169)

MCPForUnity/Editor/Services/TestRunnerNoThrottle.cs (3)

MCPForUnity/Editor/Tools/ManageScript.cs (1)

System (2289-2355)

MCPForUnity/Editor/Helpers/McpLog.cs (3)

McpLog (7-52)

Info (37-41)

Warn (43-46)

MCPForUnity/Editor/Services/TestRunnerService.cs (4)

RunStarted (167-184)

RunFinished (186-199)

TestStarted (201-217)

TestFinished (219-262)

MCPForUnity/Editor/Tools/RefreshUnity.cs (1)

MCPForUnity/Editor/Services/TestRunStatus.cs (1)

TestRunStatus (10-59)

Server/src/services/tools/manage_gameobject.py (1)

Server/src/services/tools/preflight.py (1)

preflight (27-105)

MCPForUnity/Editor/Tools/GetTestJob.cs (4)

MCPForUnity/Editor/Tools/ManageScene.cs (2)

McpForUnityTool (18-734)

HandleCommand (90-191)

MCPForUnity/Editor/Tools/RunTestsAsync.cs (1)

McpForUnityTool (16-125)

MCPForUnity/Editor/Services/EditorStateCache.cs (2)

JObject (91-217)

JObject (219-230)

MCPForUnity/Editor/Services/TestJobManager.cs (3)

TestJobManager (48-578)

TestJobManager (65-69)

ToSerializable (429-467)

MCPForUnity/Editor/Services/EditorStateCache.cs (3)

MCPForUnity/Editor/Helpers/McpLog.cs (2)

McpLog (7-52)

Error (48-51)

MCPForUnity/Editor/Services/TestRunStatus.cs (1)

TestRunStatus (10-59)

MCPForUnity/Editor/Services/TestJobManager.cs (2)

TestJobManager (48-578)

TestJobManager (65-69)

MCPForUnity/Editor/Services/TestJobManager.cs (2)

MCPForUnity/Editor/Helpers/McpLog.cs (3)

Error (48-51)

McpLog (7-52)

Warn (43-46)

MCPForUnity/Editor/Services/TestRunnerService.cs (6)

TestRunResult (438-510)

TestRunResult (440-444)

TestRunResult (484-509)

ToSerializable (454-482)

ToSerializable (531-542)

ToSerializable (573-585)

TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/ManageScriptableObjectTests.cs (1)

MCPForUnity/Editor/Tools/ManageScriptableObject.cs (1)

HandleCommand (42-77)

Server/src/services/tools/manage_asset.py (1)

Server/src/services/tools/preflight.py (1)

preflight (27-105)

Server/src/services/resources/editor_state_v2.py (3)

Server/src/transport/legacy/unity_connection.py (1)

async_send_command_with_retry (753-785)

Server/src/services/state/external_changes_scanner.py (4)

_now_unix_ms (11-12)

_in_pytest (15-17)

set_project_root (51-54)

update_and_get (172-240)

Server/src/transport/unity_transport.py (1)

_current_transport (23-25)

🪛 LanguageTool

README.md

[style] ~19-~19: In American English, abbreviations like “etc.” require a period.
Context: ... (Claude, Cursor, Antigravity, VS Code, etc) to interact directly with your Unity E...

(ETC_PERIOD)

🪛 markdownlint-cli2 (0.18.1)

README.md

412-412: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

🪛 Ruff (0.14.10)

Server/tests/integration/test_test_jobs_async.py

12-12: Unused function argument: send_fn

(ARG001)

12-12: Unused function argument: unity_instance

(ARG001)

12-12: Unused function argument: kwargs

(ARG001)

39-39: Unused function argument: send_fn

(ARG001)

39-39: Unused function argument: unity_instance

(ARG001)

39-39: Unused function argument: kwargs

(ARG001)

Server/src/services/tools/refresh_unity.py

78-79: try-except-pass detected, consider logging the exception

(S110)

78-78: Do not catch blind exception: Exception

(BLE001)

Server/tests/integration/test_refresh_unity_registration.py

9-9: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)

Server/tests/integration/test_refresh_unity_retry_recovery.py

25-25: Unused function argument: send_fn

(ARG001)

25-25: Unused function argument: unity_instance

(ARG001)

25-25: Unused function argument: params

(ARG001)

25-25: Unused function argument: kwargs

(ARG001)

Server/tests/integration/test_editor_state_v2_contract.py

16-16: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)

26-26: Unused function argument: send_fn

(ARG001)

26-26: Unused function argument: unity_instance

(ARG001)

26-26: Unused function argument: params

(ARG001)

26-26: Unused function argument: kwargs

(ARG001)

Server/src/services/tools/preflight.py

50-50: Do not catch blind exception: Exception

(BLE001)

69-71: try-except-pass detected, consider logging the exception

(S110)

69-69: Do not catch blind exception: Exception

(BLE001)

100-100: Do not catch blind exception: Exception

(BLE001)

Server/src/services/state/external_changes_scanner.py

126-126: Do not catch blind exception: Exception

(BLE001)

215-216: try-except-pass detected, consider logging the exception

(S110)

215-215: Do not catch blind exception: Exception

(BLE001)

Server/tests/integration/test_run_tests_busy_semantics.py

14-14: Unused function argument: send_fn

(ARG001)

14-14: Unused function argument: unity_instance

(ARG001)

14-14: Unused function argument: params

(ARG001)

14-14: Unused function argument: kwargs

(ARG001)

Server/src/services/resources/editor_state_v2.py

24-24: Unused function argument: ctx

(ARG001)

34-34: Do not catch blind exception: Exception

(BLE001)

50-50: Do not catch blind exception: Exception

(BLE001)

64-64: Do not catch blind exception: Exception

(BLE001)

146-146: Do not catch blind exception: Exception

(BLE001)

263-265: try-except-pass detected, consider logging the exception

(S110)

263-263: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Sourcery review

🔇 Additional comments (69)

TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/ManageScriptableObjectTests.cs (5)

31-36: LGTM! Good test isolation strategy.

The per-run folder isolation using Run_<Guid> effectively prevents test interference while avoiding heavy reimport churn from repeatedly deleting/recreating TempRoot. The approach is well-documented.

73-75: LGTM!

The TearDown correctly deletes the per-run _runRoot folder instead of the entire TempRoot, aligning with the new isolation strategy.

93-121: LGTM! Good test separation.

Splitting the nested folder creation test from the patching test improves test clarity and maintainability. Each test now has a single, well-defined responsibility.

124-160: LGTM!

The patch testing is now isolated in its own test with a clear focus. The comment explaining the use of _runRoot instead of _nestedFolder is helpful, and the assertions thoroughly verify all patched properties.

170-170: LGTM!

The update to use _runRoot maintains consistency with the per-run isolation strategy.

TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/DomainReloadResilienceTests.cs (7)

12-23: Excellent documentation and test isolation strategy.

The expanded XML comments clearly explain why these tests are marked [Explicit] and the distinction between test runner behavior (can stall when backgrounded) versus the MCP workflow (unaffected due to socket stimulus) is well-articulated. The updated attribute message is more concise and less brittle than a version-specific rationale.

27-36: ForceSynchronousImport in Setup ensures deterministic test execution.

Using ForceSynchronousImport after directory creation ensures the AssetDatabase is fully synchronized before tests begin. While directory creation alone doesn't trigger compilation (so synchronous import may be conservative here), this approach prevents any potential race conditions and aligns with the PR's deterministic domain-reload handling strategy.

38-61: Critical cleanup ensures proper test isolation.

The addition of ForceSynchronousImport in TearDown (line 60) is essential for preventing lingering compilation state from affecting subsequent tests. This is particularly important for these domain-reload tests since they create and delete scripts that trigger compilation. The detailed comment explaining the rationale is helpful for future maintainers.

71-130: ForceSynchronousImport improves test determinism.

The change to ForceSynchronousImport (line 85) ensures the domain reload triggered by script creation completes before the subsequent console read operations. This makes the test's 100% success expectation (line 128-129) more reliable and aligns with the PR's goal of handling domain reloads deterministically.

160-198: Synchronous refresh ensures reliable console query after script creation.

The ForceSynchronousImport change (line 176) guarantees the domain reload completes before the console query executes. Note that the yield return null on line 181 may now be redundant since synchronous import already waits for completion, but keeping it is harmless and may help with other Unity async operations.

204-264: Synchronous imports should improve test reliability beyond 80% threshold.

The ForceSynchronousImport change (line 224) ensures each script's domain reload completes before creating the next script, which should significantly improve reliability. The 80% success threshold (line 261-262) is conservative and appropriate given the rapid-fire nature of the test, though the synchronous imports may push actual success rates closer to 100%.

22-22: [Explicit] tests are already properly configured in CI.

The upstream CI workflow (.github/workflows/unity-tests.yml) already includes an explicit step to run domain-reload tests first with -testCategory domain_reload, which correctly targets the [Explicit] tests marked with [Category("domain_reload")]. The concern is resolved.

MCPForUnity/Editor/Tools/Prefabs/ManagePrefabs.cs (1)

226-226: LGTM: Synchronous refresh ensures immediate directory recognition.

The addition of ImportAssetOptions.ForceSynchronousImport ensures Unity immediately recognizes the newly created directory before subsequent prefab operations. This aligns with the PR's broader goal of making asset operations deterministic and reliable for the async test infrastructure.

MCPForUnity/Editor/Tools/ManageScript.cs (1)

981-981: LGTM: Synchronous refresh after deletion ensures clean state.

The synchronous refresh after MoveAssetToTrash ensures Unity's asset database immediately reflects the deletion. This prevents race conditions where subsequent operations might reference the deleted script.

MCPForUnity/Editor/Tools/ManageShader.cs (1)

97-97: LGTM: Consistent synchronous refresh pattern for shader lifecycle operations.

The three refresh calls ensure Unity immediately recognizes:

Line 97: New shader directory

Line 177: Newly created shader asset (note: ImportAsset on line 176 already imports the asset; the refresh ensures full recognition)

Line 245: Updated shader asset

This pattern prevents downstream operations from encountering stale asset state.

Also applies to: 177-177, 245-245

MCPForUnity/Editor/Services/TestRunStatus.cs.meta (1)

1-11: Standard Unity meta file - no concerns.

This is standard Unity-generated metadata for the corresponding .cs file. The configuration is appropriate for a standard editor script.

MCPForUnity/Editor/Tools/ManageAsset.cs (1)

183-183: LGTM: Synchronous refresh for directory operations ensures immediate availability.

The synchronous refresh pattern is applied in two directory-creation contexts:

Line 183: During asset creation when target directory doesn't exist

Line 872: In the EnsureDirectoryExists helper (used by multiple operations including move/rename at line 591)

This ensures newly created directories are immediately recognized by Unity's asset database before subsequent operations attempt to use them.

Also applies to: 872-872

Server/src/services/tools/manage_gameobject.py (1)

11-11: LGTM! Preflight guard improves robustness.

The addition of the preflight check before action validation ensures Unity is ready (not compiling, external changes refreshed) before GameObject operations proceed. This placement is optimal as it guards the tool early while allowing fast-fail for invalid inputs after readiness is confirmed.

Also applies to: 96-97

MCPForUnity/Editor/Services/EditorStateCache.cs.meta (1)

1-11: Standard Unity meta file.

This is a properly formatted Unity asset metadata file with standard MonoImporter settings. No issues.

MCPForUnity/Editor/Services/TestRunnerNoThrottle.cs.meta (1)

1-11: Standard Unity meta file.

This is a properly formatted Unity asset metadata file with standard MonoImporter settings. No issues.

MCPForUnity/Editor/Tools/ManageScene.cs (3)

222-222: LGTM! Synchronous refresh ensures asset visibility.

Using ForceSynchronousImport after scene creation ensures Unity immediately recognizes the new scene file before returning success to the caller, preventing race conditions in subsequent operations.

365-365: LGTM! Synchronous refresh ensures asset visibility.

Using ForceSynchronousImport after saving a scene ensures Unity's asset database is immediately updated before the success response is returned, maintaining consistency for follow-up operations.

411-411: LGTM! Synchronous refresh ensures screenshot availability.

Using ForceSynchronousImport after capturing a screenshot ensures the asset is immediately available in Unity's asset database before returning the path to the caller.

Server/src/services/tools/manage_asset.py (1)

15-15: LGTM! Preflight guard with clear rationale.

The preflight check ensures Unity is ready before asset operations, with helpful comments explaining the guard prevents stale reads and timeouts. The placement before property parsing is optimal, and the parameters are appropriate for asset management operations.

Also applies to: 51-53

MCPForUnity/Editor/Tools/RefreshUnity.cs.meta (1)

1-11: LGTM – Standard Unity metadata file.

This .meta file accompanies the new RefreshUnity tool and contains standard MonoImporter metadata with a unique GUID. No issues detected.

MCPForUnity/Editor/Resources/Editor/EditorStateV2.cs.meta (1)

1-11: LGTM – Standard Unity metadata file.

This .meta file accompanies the new EditorStateV2 resource and follows Unity's standard metadata format. No issues detected.

MCPForUnity/Editor/Tools/ManageGameObject.cs (1)

586-597: LGTM – Synchronous refresh ensures directory availability.

The addition of ImportAssetOptions.ForceSynchronousImport on line 593 ensures the newly created directory is immediately recognized by Unity's asset database before attempting to save the prefab. This change is consistent with the broader PR goal of making asset refresh behavior deterministic and synchronous across editor tooling, reducing race conditions during asset operations.

MCPForUnity/Editor/Tools/GetTestJob.cs.meta (1)

1-12: LGTM – Standard Unity metadata file.

This .meta file accompanies the new GetTestJob tool (part of the async test infrastructure) and contains standard MonoImporter metadata. No issues detected.

MCPForUnity/Editor/Tools/RunTestsAsync.cs.meta (1)

1-12: LGTM – Standard Unity metadata file.

This .meta file accompanies the new RunTestsAsync tool (part of the async test infrastructure) and contains standard MonoImporter metadata. No issues detected.

Server/tests/integration/test_refresh_unity_retry_recovery.py (1)

10-44: LGTM! Well-structured integration test for retry/recovery flow.

The test correctly verifies that refresh_unity recovers from a disconnect with retry hint, polls for readiness, and clears the external dirty state. The mock setup properly simulates the disconnect scenario and readiness restoration.

Note: The Ruff warnings about unused function arguments in fake_send_with_unity_instance (line 25) are false positives—the mock signature must match the real function signature for monkeypatch.setattr to work correctly.

Server/src/services/tools/manage_scene.py (1)

9-9: LGTM! Preflight guard correctly integrated.

The preflight check is properly positioned after obtaining the Unity instance and before scene management operations. The parameters (wait_for_no_compile=True, refresh_if_dirty=True) are appropriate for ensuring Unity is ready and assets are refreshed before manipulating scenes.

Also applies to: 44-44

MCPForUnity/Editor/Services/TestJobManager.cs.meta (1)

1-12: LGTM! Standard Unity metadata file.

This is the required Unity asset metadata for TestJobManager.cs. The GUID and MonoImporter configuration are standard boilerplate.

docs/README-DEV.md (1)

338-349: Clear and helpful documentation for domain reload test caveats.

The documentation accurately describes the macOS background throttling limitation and provides practical workarounds. This aligns well with the PR's introduction of [Explicit] attributes for domain-reload tests and helps developers understand test reliability considerations.

Server/tests/integration/test_editor_state_v2_contract.py (2)

26-38: Mock stub arguments are acceptable for signature matching.

The unused arguments (send_fn, unity_instance, params, kwargs) are expected since this fake function must match the real send_with_unity_instance signature. The static analysis hints can be safely ignored here.

48-57: Server-side enrichment correctly adds advice and staleness fields.

The fake stub at lines 29-38 intentionally omits advice and staleness to verify that the server layer enriches the response. The get_editor_state_v2 function calls _enrich_advice_and_staleness() (line 267) before returning, which adds both fields: advice (containing ready_for_tools, blocking_reasons, etc.) and staleness (containing age_ms and is_stale). The test correctly verifies this server-side enrichment occurs.

MCPForUnity/Editor/Resources/Editor/EditorStateV2.cs (1)

14-25: Clean implementation following the established pattern.

The handler correctly delegates to EditorStateCache.GetSnapshot() and wraps exceptions in an ErrorResponse. The unused @params parameter is acceptable since this is a simple getter with no input parameters.

Server/src/services/tools/run_tests.py (1)

61-63: Good integration of preflight guard.

The preflight check with requires_no_tests=True, wait_for_no_compile=True, and refresh_if_dirty=True ensures the test runner doesn't start when tests are already running or during compilation. This prevents the "already in progress" race condition at the gate level.

MCPForUnity/Editor/Services/TestRunnerService.cs (4)

96-97: Good: Early status marking for accurate readiness snapshots.

Marking the test run as started before calling Execute() ensures that EditorStateCache.GetSnapshot() immediately reflects the busy state, preventing race conditions where clients might try to start another run.

170-183: Robust run start handling with best-effort progress tracking.

The try-catch pattern for counting leaf tests and notifying TestJobManager is appropriate since test adaptor behavior may vary. Falling back to null for the total count when counting fails is a reasonable degradation.

237-253: Failure detection logic covers standard NUnit outcomes.

The case-insensitive check for "failed" or "error" in the ResultState string should correctly identify test failures across NUnit's result states (e.g., "Failed", "Error", "Failed:Error").

266-293: Defensive leaf counting with graceful degradation.

The recursive CountLeafTests helper handles null nodes and enumeration failures gracefully. Returning 0 on exception (line 289) treats it as "unknown total" which is appropriate for progress reporting.

Server/tests/integration/test_run_tests_busy_semantics.py (1)

6-35: Well-structured integration test for busy response semantics.

The test correctly validates that when Unity reports a test run is already in progress, the tool returns a structured retry response rather than a generic failure. The mock accurately simulates Unity's error format, and the assertions properly verify the retry hint and backoff time.

The unused function arguments in the fake are expected for signature matching with send_with_unity_instance.

MCPForUnity/Editor/Tools/GetTestJob.cs (1)

14-50: Clean implementation following established tool patterns.

The handler correctly:

Supports both job_id and jobId parameter names for client flexibility

Uses defensive parsing for boolean flags with try-catch

Returns appropriate error responses for missing or unknown job IDs

Delegates serialization to TestJobManager.ToSerializable with detail flags

The pattern is consistent with RunTestsAsync.cs.

MCPForUnity/Editor/Services/TestRunStatus.cs (1)

1-60: LGTM! Clean thread-safe status tracker.

The implementation correctly uses a single lock object to guard all state access and mutations, ensuring atomicity. The use of DateTimeOffset.UtcNow.ToUnixTimeMilliseconds() for timestamps is appropriate for cross-system consistency.

Server/tests/integration/test_test_jobs_async.py (2)

6-30: LGTM! Well-structured integration test.

The test correctly verifies that run_tests_async forwards the command type and parameters to the Unity transport. The unused arguments (send_fn, unity_instance, kwargs) in the fake function are intentional—they match the real function signature but aren't needed for verification in this mock.

33-50: LGTM! Proper parameter forwarding verification.

The test correctly validates that get_test_job forwards the job_id parameter and returns the expected response structure.

MCPForUnity/Editor/Services/TestRunnerNoThrottle.cs (2)

39-61: Good initialization pattern with domain reload recovery.

The use of HideFlags.HideAndDontSave on the TestRunnerApi instance prevents serialization issues and GC collection. The domain reload recovery logic (checking IsTestRunActive() and reapplying no-throttling) correctly handles mid-test recompilation scenarios.

106-120: Reflection fallback is appropriately defensive.

Silently catching reflection failures for UpdateInteractionModeSettings is reasonable since the method may not exist in all Unity versions and the worst case is that settings take effect on the next editor tick anyway.

Server/tests/integration/test_external_changes_scanner.py (2)

6-44: LGTM! Comprehensive dirty-state lifecycle test.

The test correctly validates the scanner's dirty detection, timestamp tracking, and clearing behavior. Using monkeypatch.delenv("PYTEST_CURRENT_TEST") to bypass the scanner's test-mode short-circuit is appropriate for testing real scanner logic.

46-85: LGTM! Good coverage of external package dependency scanning.

The test validates that changes to external packages referenced via file: paths in manifest.json are correctly detected as dirty. The relative path resolution (../../ExternalPkg) matches real Unity project structures.

MCPForUnity/Editor/Services/EditorStateCache.cs (3)

65-81: Good throttling with compilation edge detection.

The update logic correctly allows through compilation state transitions even when throttled, ensuring timestamps remain accurate for compile start/finish events.

168-173: since_unix_ms always equals observation time—is this intentional?

Currently since_unix_ms is set to _observedUnixMs (line 171), which means it always reflects the current snapshot time rather than when the current activity phase actually began. If the intent is to track phase duration, you'd need to persist the timestamp when the phase transitioned.

If the current behavior is intentional (providing the observation time as context), consider renaming to clarify its meaning or updating the comment.

219-230: LGTM! Defensive snapshot access with deep cloning.

The GetSnapshot() method correctly acquires a lock, rebuilds if null, and returns a deep clone to prevent external mutation of the cached state.

README.md (4)

19-21: LGTM! Documentation updates align with PR changes.

The screenshot update to v8.6 and expanded client list accurately reflect the PR's scope.

54-56: Good documentation for the new async test workflow.

Clearly documenting run_tests_async as preferred with get_test_job for polling, while keeping run_tests available for simple cases, gives users appropriate guidance.

157-157: Version tag updated consistently.

The Git URL version tag is correctly updated to v8.6.0.

173-175: Transport and server terminology updated.

The updated terminology ("HTTP Local", "Start Server") is clearer and consistent with the new UI.

Server/src/services/tools/preflight.py (2)

11-24: LGTM! Clean helper functions.

_in_pytest() correctly detects the test environment to avoid blocking integration tests that mock transports. _busy() provides a consistent retry response structure.

79-101: Polling loop is well-bounded with appropriate backoff.

The compilation wait loop correctly:

Uses a monotonic deadline to prevent clock-skew issues

Sleeps 0.25s between polls to avoid busy-waiting

Returns a busy response with 500ms retry hint if the deadline is exceeded

The redundant import of get_editor_state_v2 inside the loop (line 94) is harmless since Python caches module imports.

MCPForUnity/Editor/Tools/RunTestsAsync.cs (1)

56-65: The includeDetails and includeFailedTests values are returned but not stored.

These flags are parsed and included in the response, but they're not passed to TestJobManager.StartJob or stored with the job. When get_test_job is called later, these preferences won't be available unless the caller re-specifies them.

This appears intentional (the caller re-specifies on each poll), but consider documenting this contract or storing them with the job for consistency.

MCPForUnity/Editor/Tools/RefreshUnity.cs (1)

129-171: Well-implemented async wait pattern.

The WaitForUnityReadyAsync method correctly uses TaskCompletionSource with EditorApplication.update to poll readiness state without blocking. The use of TaskCreationOptions.RunContinuationsAsynchronously is appropriate to avoid deadlocks.

Minor note: The empty catch at line 169 around QueuePlayerLoopUpdate() is acceptable as this API may not be available in all Unity versions.

Server/src/services/state/external_changes_scanner.py (2)

64-104: Solid implementation of bounded recursive mtime scanner.

The scanner correctly:

Limits entries to max_entries to bound runtime

Skips irrelevant directories (Library, Temp, etc.) early

Handles missing files gracefully with OSError catch

Falls back to st_mtime when st_mtime_ns is unavailable

The entry counting includes both directories and files, which provides consistent bounding behavior.

147-153: Edge case in file: URI parsing.

The parsing at lines 148-149 uses suffix.lstrip("/") after stripping ///, which would correctly handle file:///c:/path on Windows but might behave unexpectedly with file:////path (four slashes). This is likely an edge case that won't occur in practice.

Server/src/services/resources/editor_state_v2.py (2)

141-179: Well-structured readiness advice computation.

The _enrich_advice_and_staleness function cleanly computes blocking reasons and staleness metrics. The 2-second staleness threshold is conservative and appropriate for detecting throttled editors.

237-265: External changes integration adds server-side file monitoring.

The integration with external_changes_scanner is well-designed:

Caches project root per instance

Overwrites Unity placeholders with authoritative server-side data (per comment at lines 256-257)

Best-effort with graceful degradation

Consider: The get_project_info(ctx) call on line 243 may add latency if called on every state fetch. The scanner already caches project_root internally, so you might skip this call if already set.

MCPForUnity/Editor/Services/TestJobManager.cs (4)

269-302: Robust job start with TOCTOU protection.

The StartJob method correctly uses a single lock scope (lines 270-278) for the check-and-set of _currentJobId, preventing race conditions. The FinalizeJob helper ensures state mutation happens on the main thread via EditorApplication.delayCall.

The dual finalization path (RunFinished callback + Task continuation as safety net) at lines 294-299 is a good defensive pattern.

190-242: Good throttling strategy for SessionState persistence.

The 1-second throttle (MinPersistIntervalMs) prevents excessive writes during rapid test progress updates, while force: true ensures critical state changes (job start/finish) are persisted immediately.

Minor observation: The SessionState.SetString calls at lines 234-236 are outside the lock, which is correct - the snapshot is already captured, and the lock doesn't need to cover the write.

532-577: Defensive dual-finalization approach.

FinalizeFromTask correctly checks if the job is still running before finalizing (line 543), preventing double-finalization when FinalizeCurrentJobFromRunFinished is called first. This provides resilience against edge cases where the RunFinished callback isn't delivered.

500-514: Clear stuck-test detection logic.

The IsStuck method provides a simple 60-second threshold check for detecting when a test has been running too long (typically due to Unity being backgrounded). The guards at lines 502-510 ensure false negatives are avoided when state is incomplete.

…sults - manage_asset, manage_gameobject, manage_scene now check preflight return value and propagate busy/retry signals to clients (fixes Sourcery #1) - TestJobManager.FinalizeCurrentJobFromRunFinished now sets job status to Failed when resultPayload.Failed > 0, not always Succeeded (fixes Sourcery #2)

When 'Force fresh server install' is enabled, uvx uses --no-cache --refresh which rebuilds the package and takes significantly longer to start. - Increase timeout from 10s to 45s when dev mode is enabled - Add informative log message explaining the longer startup time - Show actual timeout value in warning message

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

MCPForUnity/Editor/Services/TestJobManager.cs (1)

304-325: Past issue resolved - status now correctly reflects test failures.

The finalization logic correctly derives job status from the test results: when resultPayload.Failed > 0, the job is marked as Failed, otherwise Succeeded. This addresses the concern raised in the previous review.

🧹 Nitpick comments (2)

MCPForUnity/Editor/Services/TestJobManager.cs (2)
431-469: Consider serializing results for Failed jobs too.

Currently, resultPayload is only populated when job.Status == TestJobStatus.Succeeded (line 439). This means when a test run completes but contains test failures, clients calling get_test_job receive only the first 25 failures via failures_so_far (line 463), not the complete test results.

Since job.Result is set regardless of pass/fail (line 320 in FinalizeCurrentJobFromRunFinished), and ToSerializable accepts includeDetails and includeFailedTests flags for fine-grained control, consider allowing Failed jobs to serialize their results as well. This would let clients retrieve full test details for failed runs, not just the capped failures.
🔎 Possible approach
             object resultPayload = null;
-            if (job.Status == TestJobStatus.Succeeded && job.Result != null)
+            if (job.Result != null && (job.Status == TestJobStatus.Succeeded || job.Status == TestJobStatus.Failed))
             {
                 resultPayload = job.Result.ToSerializable(job.Mode, includeDetails, includeFailedTests);
             }
Alternatively, if keeping payloads small for failed runs is intentional, consider documenting this design decision.
372-401: Minor: Redundant defensive guards.

Line 383 uses Math.Max(0, job.CompletedTests + 1) and line 389 uses job.FailuresSoFar ??= new List<TestJobFailure>(). Since CompletedTests is initialized to 0 and only incremented, and FailuresSoFar is always initialized in the constructor (line 264) and in OnRunStarted (line 344), these guards are redundant.

While defensive programming is generally good, these specific checks add no safety value. Consider removing them for cleaner code, or document if there's a scenario where they're needed (e.g., manual state manipulation in tests).

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 766112b and 72cb011.

📒 Files selected for processing (4)

MCPForUnity/Editor/Services/TestJobManager.cs
Server/src/services/tools/manage_asset.py
Server/src/services/tools/manage_gameobject.py
Server/src/services/tools/manage_scene.py

🚧 Files skipped from review as they are similar to previous changes (3)

Server/src/services/tools/manage_asset.py
Server/src/services/tools/manage_gameobject.py
Server/src/services/tools/manage_scene.py

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 491
File: MCPForUnity/Editor/Windows/EditorPrefs/EditorPrefsWindow.cs:78-115
Timestamp: 2025-12-29T15:23:17.871Z
Learning: In the MCPForUnity project, the team prefers to rely on their testing process to catch UI initialization issues rather than adding defensive null checks for UI elements in editor windows.

📚 Learning: 2025-12-29T15:23:11.613Z

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 491
File: MCPForUnity/Editor/Windows/EditorPrefs/EditorPrefsWindow.cs:78-115
Timestamp: 2025-12-29T15:23:11.613Z
Learning: In MCPForUnity, prefer relying on the established testing process to catch UI initialization issues instead of adding defensive null checks for UI elements in editor windows. This means during reviews, verify that tests cover UI initialization paths and that code avoids repetitive null-check boilerplate in Editor Windows. If a UI element can be null, ensure there is a well-tested fallback or that its initialization is guaranteed by design, rather than sprinkling null checks throughout editor code.

Applied to files:

MCPForUnity/Editor/Services/TestJobManager.cs

🧬 Code graph analysis (1)

MCPForUnity/Editor/Services/TestJobManager.cs (3)

MCPForUnity/Editor/Helpers/McpLog.cs (3)

Error (48-51)

McpLog (7-52)

Warn (43-46)

MCPForUnity/Editor/Services/TestRunnerService.cs (6)

TestRunResult (438-510)

TestRunResult (440-444)

TestRunResult (484-509)

ToSerializable (454-482)

ToSerializable (531-542)

ToSerializable (573-585)

MCPForUnity/Editor/Tools/RunTestsAsync.cs (1)

TestFilterOptions (78-124)

🔇 Additional comments (4)

MCPForUnity/Editor/Services/TestJobManager.cs (4)

13-43: LGTM - Clean data structures for job tracking.

The enum and classes provide a solid foundation for tracking test job state with appropriate nullable fields for optional/time-sensitive data.

244-302: LGTM - Robust job creation with proper thread safety.

The StartJob method demonstrates good practices:

Single lock scope at lines 270-277 prevents TOCTOU races when checking and setting _currentJobId

Task continuation (lines 294-299) correctly marshals finalization to the main thread via EditorApplication.delayCall, respecting Unity API constraints

Immediate job ID return enables async client polling

471-516: LGTM - Thoughtful stuck detection and diagnostics.

The stuck detection logic (60-second threshold) combined with GetBlockedReason provides actionable diagnostics for clients. Checking InternalEditorUtility.isApplicationActive (line 484) directly addresses the real-world issue mentioned in the PR objectives where backgrounded Unity gets heavily throttled by the OS.

190-242: LGTM - Well-designed persistence with throttling.

The persistence logic demonstrates good engineering:

Throttling at line 195 reduces overhead during large test runs (though the first non-forced call will likely persist since _lastPersistUnixMs defaults to 0)

Capping failures at line 222 prevents SessionState bloat

Taking only the 10 most recent jobs (line 207) bounds memory usage

Intentionally excluding Result (per comment at line 172) avoids ballooning SessionState

coderabbitai · 2026-01-03T20:36:52Z

MCPForUnity/Editor/Services/TestJobManager.cs

+        private static void FinalizeFromTask(string jobId, Task<TestRunResult> task)
+        {
+            lock (LockObj)
+            {
+                if (!Jobs.TryGetValue(jobId, out var existing))
+                {
+                    if (_currentJobId == jobId) _currentJobId = null;
+                    return;
+                }
+
+                // If RunFinished already finalized the job, do nothing.
+                if (existing.Status != TestJobStatus.Running)
+                {
+                    if (_currentJobId == jobId) _currentJobId = null;
+                    return;
+                }
+
+                existing.LastUpdateUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();
+                existing.FinishedUnixMs = existing.LastUpdateUnixMs;
+
+                if (task.IsFaulted)
+                {
+                    existing.Status = TestJobStatus.Failed;
+                    existing.Error = task.Exception?.GetBaseException()?.Message ?? "Unknown test job failure";
+                    existing.Result = null;
+                }
+                else if (task.IsCanceled)
+                {
+                    existing.Status = TestJobStatus.Failed;
+                    existing.Error = "Test job canceled";
+                    existing.Result = null;
+                }
+                else
+                {
+                    existing.Status = TestJobStatus.Succeeded;
+                    existing.Error = null;
+                    existing.Result = task.Result;
+                }
+
+                if (_currentJobId == jobId)
+                {
+                    _currentJobId = null;
+                }
+            }
+            PersistToSessionState(force: true);
+        }


⚠️ Potential issue | 🔴 Critical

Critical: Fallback finalization doesn't detect test failures.

The FinalizeFromTask method always marks the job as Succeeded when the task completes without exceptions (line 568), even if the test results contain failures. This is inconsistent with FinalizeCurrentJobFromRunFinished (lines 316-318), which correctly checks resultPayload.Failed > 0.

While this is a fallback path (per the comment at line 296), it must handle test failures correctly to avoid misleading clients if RunFinished is not delivered.

🔎 Proposed fix

else { - existing.Status = TestJobStatus.Succeeded; + var result = task.Result; + existing.Status = result != null && result.Failed > 0 + ? TestJobStatus.Failed + : TestJobStatus.Succeeded; existing.Error = null; - existing.Result = task.Result; + existing.Result = result; }

🤖 Prompt for AI Agents

In MCPForUnity/Editor/Services/TestJobManager.cs around lines 534 to 579, FinalizeFromTask currently marks a completed task as Succeeded unconditionally; change it to inspect the task.Result (if non-null) for test failures (e.g., result.Failed > 0) and set existing.Status = TestJobStatus.Failed, existing.Error = a descriptive message (or retain task.Result summary) and existing.Result = task.Result when failures are present; for successful results (failed == 0) keep Status = Succeeded and Error = null; also guard for null task.Result (treat as failure with an appropriate Error) and ensure LastUpdate/Finished and _currentJobId handling remain unchanged, then call PersistToSessionState(force: true) as before.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

MCPForUnity/Editor/Windows/Components/Connection/McpConnectionSection.cs (1)
630-643: LGTM! Well-implemented adaptive timeout with helpful user feedback.

The dynamic timeout logic correctly handles the dev mode scenario where package rebuilds take longer. The implementation is solid:

Safe fallback when reading EditorPrefs (consistent with codebase patterns)

Clear informational logging to set user expectations

Improved warning message that includes the actual timeout value
Optional: Consider extracting timeout constants

While the current implementation is clear and acceptable, you could extract the timeout values to named constants for improved maintainability:
+        private static readonly TimeSpan DefaultStartupTimeout = TimeSpan.FromSeconds(10);
+        private static readonly TimeSpan DevModeStartupTimeout = TimeSpan.FromSeconds(45);
+        
         private async Task TryAutoStartSessionAfterHttpServerAsync()
         {
             // ... existing code ...
             
             // Dev mode uses --no-cache --refresh which causes uvx to rebuild the package, taking significantly longer.
             bool devModeEnabled = false;
             try { devModeEnabled = EditorPrefs.GetBool(EditorPrefKeys.DevModeForceServerRefresh, false); } catch { }
-            var startupTimeout = devModeEnabled ? TimeSpan.FromSeconds(45) : TimeSpan.FromSeconds(10);
+            var startupTimeout = devModeEnabled ? DevModeStartupTimeout : DefaultStartupTimeout;
This would make it easier to adjust these values in the future, though it's perfectly fine as-is.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 72cb011 and a946be9.

📒 Files selected for processing (1)

MCPForUnity/Editor/Windows/Components/Connection/McpConnectionSection.cs

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 491
File: MCPForUnity/Editor/Windows/EditorPrefs/EditorPrefsWindow.cs:78-115
Timestamp: 2025-12-29T15:23:17.871Z
Learning: In the MCPForUnity project, the team prefers to rely on their testing process to catch UI initialization issues rather than adding defensive null checks for UI elements in editor windows.

📚 Learning: 2025-12-29T15:23:11.613Z

Learnt from: msanatan
Repo: CoplayDev/unity-mcp PR: 491
File: MCPForUnity/Editor/Windows/EditorPrefs/EditorPrefsWindow.cs:78-115
Timestamp: 2025-12-29T15:23:11.613Z
Learning: In MCPForUnity, prefer relying on the established testing process to catch UI initialization issues instead of adding defensive null checks for UI elements in editor windows. This means during reviews, verify that tests cover UI initialization paths and that code avoids repetitive null-check boilerplate in Editor Windows. If a UI element can be null, ensure there is a well-tested fallback or that its initialization is guaranteed by design, rather than sprinkling null checks throughout editor code.

Applied to files:

MCPForUnity/Editor/Windows/Components/Connection/McpConnectionSection.cs

🧬 Code graph analysis (1)

MCPForUnity/Editor/Windows/Components/Connection/McpConnectionSection.cs (2)

MCPForUnity/Editor/Constants/EditorPrefKeys.cs (1)

EditorPrefKeys (7-56)

MCPForUnity/Editor/Helpers/McpLog.cs (3)

McpLog (7-52)

Info (37-41)

Warn (43-46)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Sourcery review

Apply same logic as FinalizeCurrentJobFromRunFinished: check result.Failed > 0 to correctly mark jobs as Failed when tests fail, even in the fallback path when RunFinished callback is not delivered.

dsarno added 15 commits January 2, 2026 11:56

Add editor readiness v2, refresh tool, and preflight guards

95b6210

Detect external package changes and harden refresh retry

3d58862

docs: add async test tools to README, document domain reload limitation

3550f8f

- Add run_tests_async and get_test_job to main README tools list - Document background stall limitation for domain reload tests in DEV readme

ci: add separate job for domain reload tests

7d75a18

Run [Explicit] domain_reload tests in their own job using -testCategory

ci: run domain reload tests in same job as regular tests

5aa22d9

Combines into single job with two test steps to reuse cached Library

fix: address coderabbit review issues

6e7de9e

- Fix TOCTOU race in TestJobManager.StartJob (single lock scope for check-and-set) - Store TestRunnerApi reference with HideAndDontSave to prevent GC/serialization issues

docs: update tool descriptions to prefer run_tests_async

bb77db4

- run_tests_async is now marked as preferred for long-running suites - run_tests description notes it blocks and suggests async alternative

docs: update README screenshot to v8.6 UI

16b8d00

docs: add v8.6 UI screenshot

f554eec

docs: update v8.6 UI screenshot

a8f34cd

docs: update v8.6 UI screenshot

a3977f6

docs: update v8.6 UI screenshot

3b7a4f8

Update README for MCP version and instructions for v8.7

766112b

sourcery-ai bot reviewed Jan 3, 2026

View reviewed changes

Server/src/services/tools/manage_asset.py Outdated Show resolved Hide resolved

MCPForUnity/Editor/Services/TestJobManager.cs Show resolved Hide resolved

coderabbitai bot reviewed Jan 3, 2026

View reviewed changes

dsarno added 2 commits January 3, 2026 12:29

coderabbitai bot reviewed Jan 3, 2026

View reviewed changes

fix: derive job status from test results in FinalizeFromTask fallback

e5774bb

Apply same logic as FinalizeCurrentJobFromRunFinished: check result.Failed > 0 to correctly mark jobs as Failed when tests fail, even in the fallback path when RunFinished callback is not delivered.

dsarno merged commit 711768d into CoplayDev:main Jan 3, 2026
2 checks passed

dsarno deleted the feat/unity-readiness-v2 branch January 4, 2026 22:57

coderabbitai bot mentioned this pull request Jan 7, 2026

v9 pre-release pruning #528

Merged

This was referenced Jan 14, 2026

Fix PlayMode tests stalling when unfocused (python refresh utility), improve domain reload recovery and refresh tool #554

Merged

Feat:actiontrace Feat:HookSystem #588

Closed

feat:action trace feat:HookSystem #598

Closed

Async Test Infrastructure & Editor Readiness Status + new refresh_unity tool #507

Async Test Infrastructure & Editor Readiness Status + new refresh_unity tool #507

Uh oh!

Conversation

dsarno commented Jan 3, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What This Does

New Features

Smarter Tool Execution (addresses #474)

Non-Blocking Test Runs (addresses #503)

Refresh Tool

Background Throttling Fix

CI Changes

What's New for LLMs

No Breaking Changes

Stats

Summary by Sourcery

Summary by CodeRabbit

Uh oh!

sourcery-ai bot commented Jan 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Sequence diagram for async test run lifecycle with run_tests_async and get_test_job

Sequence diagram for refresh_unity tool with disconnect recovery and external changes scanner

Class diagram for async test job tracking and status (TestJobManager, TestRunStatus, related types)

Class diagram for editor readiness infrastructure (EditorStateCache, EditorStateV2, ExternalChangesScanner, preflight)

File-Level Changes

Possibly linked issues

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

coderabbitai bot commented Jan 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Other AI code review bot(s) detected

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

dsarno commented Jan 3, 2026 •

edited by coderabbitai bot

Loading

sourcery-ai bot commented Jan 3, 2026 •

edited

Loading

coderabbitai bot commented Jan 3, 2026 •

edited

Loading