Lease expiry watchdog keeps running after terminal jobs #24

@nficano

The lease expiry watchdog is started for jobs with LeaseConstraints.ExpiresAt, but it is not cancelled when the agent completes before the lease expires. RunAsync emits the terminal result, marks the job terminal, enters finally, and then awaits the watchdog task. Because the job cancellation token was not cancelled by a successful completion, the watchdog can sleep until expires_at, emit a late tool_result lease-expired event after the terminal result, and keep the background job task alive for the full remaining lease duration.

The relevant locations are src/Arcp.Runtime/JobManager.cs:166, src/Arcp.Runtime/JobManager.cs:171, src/Arcp.Runtime/JobManager.cs:216, and src/Arcp.Runtime/JobManager.cs:274.
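The sequence described above can be reproduced in isolation. The following is a minimal sketch, not the actual `JobManager` code: `WatchdogAsync`, `RunBuggyAsync`, and the `events` list are hypothetical stand-ins, and the lease is shortened to 300 ms so the effect is visible quickly. It assumes a .NET 6+ project with top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the job's event stream (not the real Arcp.Runtime API).
var events = new List<string>();

// Watchdog that sleeps until the lease expires, observing only the job token.
async Task WatchdogAsync(TimeSpan untilExpiry, CancellationToken jobToken)
{
    try
    {
        await Task.Delay(untilExpiry, jobToken);
        lock (events) events.Add("lease-expired");
    }
    catch (OperationCanceledException)
    {
        // Job token was cancelled; nothing to report.
    }
}

// Mirrors the reported flow: emit the terminal result, then await the watchdog in finally.
async Task RunBuggyAsync(TimeSpan lease, CancellationToken jobToken)
{
    Task watchdog = WatchdogAsync(lease, jobToken);
    try
    {
        await Task.Yield(); // the "agent" completes immediately
        lock (events) events.Add("job.result");
    }
    finally
    {
        // Successful completion never cancels jobToken, so this waits out the lease.
        await watchdog;
    }
}

using var jobCts = new CancellationTokenSource();
await RunBuggyAsync(TimeSpan.FromMilliseconds(300), jobCts.Token);
Console.WriteLine(string.Join(", ", events));
```

In this sketch the run does not return until the lease has elapsed, and the recorded events are `job.result` followed by a late `lease-expired`.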

Fix prompt: Change JobManager.RunAsync so the lease watchdog is cancelled as soon as the job reaches any terminal state. Use a watchdog-specific CancellationTokenSource linked to the run token and cancel it in finally before awaiting the watchdog, or have the watchdog also stop when job.Status becomes terminal. Preserve the behavior where an actual lease expiry cancels the job and emits LEASE_EXPIRED. Add a test with a future ExpiresAt and an agent that returns immediately, then assert RunAsync completes promptly and no lease-expired event is emitted after job.result.
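The first option in the fix prompt (a watchdog-specific `CancellationTokenSource` linked to the run token, cancelled in `finally`) might look roughly like the sketch below. All names here (`WatchdogAsync`, `RunAsync`, the `agent` delegate, the `events` list) are illustrative assumptions and do not reflect the real `JobManager` signatures. It assumes a .NET 6+ project with top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the job's event stream (not the real Arcp.Runtime API).
var events = new List<string>();

// The watchdog observes its own token; on a real expiry it still cancels the job.
async Task WatchdogAsync(TimeSpan untilExpiry, CancellationTokenSource jobCts, CancellationToken watchdogToken)
{
    try
    {
        await Task.Delay(untilExpiry, watchdogToken);
    }
    catch (OperationCanceledException)
    {
        return; // job reached a terminal state (or the run was cancelled) first
    }
    lock (events) events.Add("LEASE_EXPIRED");
    jobCts.Cancel();
}

async Task RunAsync(TimeSpan lease, Func<CancellationToken, Task> agent, CancellationToken runToken)
{
    using var jobCts = CancellationTokenSource.CreateLinkedTokenSource(runToken);
    using var watchdogCts = CancellationTokenSource.CreateLinkedTokenSource(runToken);
    Task watchdog = WatchdogAsync(lease, jobCts, watchdogCts.Token);
    try
    {
        await agent(jobCts.Token);
        lock (events) events.Add("job.result");
    }
    catch (OperationCanceledException)
    {
        // Terminal via cancellation or lease expiry; the watchdog reports expiry itself.
    }
    finally
    {
        watchdogCts.Cancel(); // stop the watchdog as soon as the job is terminal
        await watchdog;       // returns promptly now
    }
}

// Case 1: future expiry, agent returns immediately.
var sw = Stopwatch.StartNew();
await RunAsync(TimeSpan.FromSeconds(30), _ => Task.CompletedTask, CancellationToken.None);
sw.Stop();
bool completedPromptly = sw.Elapsed < TimeSpan.FromSeconds(5);
string[] afterFastJob = events.ToArray();

// Case 2: the agent outlives a short lease, so the expiry path must still fire.
events.Clear();
await RunAsync(TimeSpan.FromMilliseconds(100), ct => Task.Delay(Timeout.Infinite, ct), CancellationToken.None);
string[] afterExpiry = events.ToArray();

Console.WriteLine($"fast job: [{string.Join(", ", afterFastJob)}], prompt: {completedPromptly}");
Console.WriteLine($"expired job: [{string.Join(", ", afterExpiry)}]");
```

The two cases correspond to the requested test (prompt completion with no lease-expired event after `job.result`) and to the preserved expiry behavior. A completion that races with the exact expiry instant is not handled here; that would likely need the terminal-status check mentioned as the second option.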

Labels: bug (Something isn't working), severity:high (High severity)