
Conversation

Contributor

@LucaButBoring LucaButBoring commented Oct 23, 2025

This PR implements the required changes for modelcontextprotocol/modelcontextprotocol#1686, which adds task augmentation to requests.

Motivation and Context

The current MCP specification supports tool calls that execute a request and eventually receive a response, and tool calls can be passed a progress token to integrate with MCP’s progress-tracking functionality, enabling host applications to receive status updates for a tool call via notifications. However, there is no way for a client to explicitly request the status of a tool call, resulting in states where it is possible for a tool call to have been dropped on the server, and it is unknown if a response or a notification may ever arrive. Similarly, there is no way for a client to explicitly retrieve the result of a tool call after it has completed — if the result was dropped, clients must call the tool again, which is undesirable for tools expected to take minutes or more. This is particularly relevant for MCP servers abstracting existing workflow-based APIs, such as AWS Step Functions, Workflows for Google Cloud, or APIs representing CI/CD pipelines, among other applications.

This proposal (and implementation) solves this by introducing the concept of Tasks, which are pending work objects that can augment any other request. Clients generate a task ID and augment their request with it — that task ID is both a reference to the request and an idempotency token. If the server accepts the task augmentation request, clients can then poll for the status and eventual result of the task with the tasks/get and tasks/result operations.
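As a rough illustration, the message shapes look approximately like this; the field names (`task`, `ttl`, `taskId`, `status`) follow the sequence diagram later in this thread, and the tool name and literal values are illustrative, not normative spec text:

```typescript
// Illustrative only — shapes follow the sequence diagram in this PR, not the final spec wording.

// A tools/call request augmented with task metadata.
const taskAugmentedCall = {
    jsonrpc: '2.0',
    id: 1,
    method: 'tools/call',
    params: {
        name: 'start_long_job',          // hypothetical tool name
        arguments: { input: 'example' },
        task: { ttl: 60000 }             // how long the server should retain the task and its result
    }
};

// The immediate response is a task reference rather than the tool result.
const createTaskResult = {
    task: { taskId: 'task-123', status: 'working' }
};

// The client then polls for status and, once the task is terminal, fetches the result.
const pollStatus = { jsonrpc: '2.0', id: 2, method: 'tasks/get', params: { taskId: 'task-123' } };
const fetchResult = { jsonrpc: '2.0', id: 3, method: 'tasks/result', params: { taskId: 'task-123' } };
```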

How Has This Been Tested?

Unit tests and an updated Streamable HTTP example.

Breaking Changes

None.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context


@halter73 halter73 left a comment


SEP-1686 currently states that it "introduces a mechanism for requestors (which can be either clients or servers, depending on the direction of communication) to augment their requests with tasks." Is this still the case?

I don't see any tests or examples demonstrating using a TaskStore with an McpClient (it's all McpServer or the Protocol base type), although I suppose it should work considering it's shared code. It still might be nice to have an end-to-end test demonstrating that client-side support for stuff like elicitations works using the public APIs.

```typescript
 * Note: This is not suitable for production use as all data is lost on restart.
 * For production, consider implementing TaskStore with a database or distributed cache.
 */
export class InMemoryTaskStore implements TaskStore {
```


Is there a reason we cannot create a default in-memory task store with settable max limits on the total number of tasks?

I like that this PR adds the TaskStore abstraction which opens the door for a distributed implementation and doesn't require the application developer to manually handle tasks/get, tasks/list, etc., but it feels weird to me to not have a production-ready in-memory solution provided by the SDK.

It should still be off by default to make sure people are explicit about any maxTrackedTask limits and the like, but it should be built in and not left to an example that is "not suitable for production." This will prove it's possible to implement a production-quality TaskStore in the easiest case where everything is tracked in-memory and lost on process restart.
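A minimal sketch of the "settable max limits" idea, written against a deliberately simplified stand-in interface (the real TaskStore in this PR exposes more, e.g. updateTaskStatus, storeTaskResult, and getTaskResult, and a real implementation would also need TTL-based expiry):

```typescript
// Sketch only — SimpleTaskStore is a simplified stand-in, not the real TaskStore interface.
interface SimpleTask {
    taskId: string;
    status: 'working' | 'input_required' | 'completed';
}

interface SimpleTaskStore {
    createTask(task: SimpleTask): Promise<void>;
    getTask(taskId: string): Promise<SimpleTask | undefined>;
}

export class BoundedInMemoryTaskStore implements SimpleTaskStore {
    private readonly tasks = new Map<string, SimpleTask>();

    constructor(private readonly maxTrackedTasks = 1000) {}

    async createTask(task: SimpleTask): Promise<void> {
        // Rejecting new tasks is one policy; evicting the oldest terminal task is another.
        if (this.tasks.size >= this.maxTrackedTasks) {
            throw new Error(`Task limit of ${this.maxTrackedTasks} reached`);
        }
        this.tasks.set(task.taskId, task);
    }

    async getTask(taskId: string): Promise<SimpleTask | undefined> {
        return this.tasks.get(taskId);
    }
}
```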

  • Note: This is not suitable for production use as all data is lost on restart.
  • For production, consider implementing TaskStore with a database or distributed cache.

This is true, but I think the impact is overstated. The logic in protocol.ts that calls _taskStore.updateTaskStatus(taskMetadata.taskId, 'input_required'); and then sets the task back to 'working' when a request completes also breaks if the server restarts. If the process exits before the client provides a response, the task will be left in an 'input_required' state indefinitely without manual intervention outside of the SDK.

In most cases supported by the SDK, where tasks cannot outlive the MCP server process, an in-memory TaskStore would be better than a distributed one, because after a restart the client will automatically be informed that the Task is no longer being tracked by the server.

Contributor Author

@LucaButBoring LucaButBoring Oct 31, 2025


I don't disagree, but precedent was the main reason for leaving this as an example and leaving out any limits, really - it's the same case with InMemoryEventStore and InMemoryOAuthClientProvider. @ihrpr @felixweinberger any input here?

Contributor Author

@LucaButBoring LucaButBoring Oct 31, 2025


If the SDK is going to provide a production-grade implementation, it needs to have limits and some form of logging extension points, but if it is not going to provide a production-grade implementation, I don't want to misrepresent this example as one.



The SDK can never provide a production-ready service implementation, because that will always require some additional resources.

Contributor


Seems right to me to have this as an interface - it seems hard to provide a generic "production-ready" approach here; I guess we could use SQLite? But I think the argument based on InMemoryEventStore and InMemoryOAuthClientProvider is convincing. If we wanted to create production-ready examples, we can do that as follow-ups.

However, it's important this is documented clearly, which it is in the README.

Contributor Author

LucaButBoring commented Oct 31, 2025

It still might be nice to have an end-to-end test demonstrating that client-side support for stuff like elicitations works using the public APIs.

Agreed, will add this.

edit: Done

@LucaButBoring
Contributor Author

Updated @maxisbey

Contributor

felixweinberger commented Nov 25, 2025

@maxisbey @LucaButBoring I tested TS <> Py. The following worked:

  1. TS Client -> TS Server
  2. Py Client -> Py Server
  3. TS Client -> Py Server

But this didn't work:

  1. Py Server -> TS Client

To make this work cleanly I had to make 2 small changes:

Then I get this:

[Screenshot: CleanShot 2025-11-25 at 18 33 18]

@LucaButBoring
Contributor Author

TS SDK fix is in: 681cf0d

@felixweinberger
Contributor

TS SDK fix is in: 681cf0d

Gonna test elicitation examples as well; assuming that works fine, I think we should get this landed.

LucaButBoring and others added 5 commits November 26, 2025 10:18
Responses and errors were incorrectly going through the original stream path instead of being queued. Also, extra.sendRequest was not setting the input_required status. These issues have been fixed and tests have been added/updated for them.

Sequence diagram of the intended flow:

```mermaid
sequenceDiagram
    participant C as Client Protocol
    participant CT as Client Transport
    participant ST as Server Transport
    participant S as Server Protocol
    participant TQ as TaskMessageQueue
    participant TS as TaskStore
    participant H as Async Handler

    Note over C,H: Phase 1: Task Creation
    activate C
    C->>CT: tools/call { task: { ttl: 60000 } }
    activate CT
    CT->>ST: HTTP POST
    activate ST
    ST->>S: _onrequest()
    activate S
    S->>TS: createTask()
    activate TS
    TS-->>S: Task { taskId, status: 'working' }
    deactivate TS
    S--)H: Start async handler (non-blocking)
    activate H
    S-->>ST: CreateTaskResult { task }
    deactivate S
    ST-->>CT: HTTP Response
    deactivate ST
    CT-->>C: CreateTaskResult
    deactivate CT
    deactivate C

    Note over C,H: Phase 2: Server Queues Elicitation Request
    H->>S: extra.sendRequest(elicitation, { relatedTask })
    activate S
    S->>TQ: enqueue({ type: 'request', message: elicitation })
    activate TQ
    TQ-->>S: OK
    deactivate TQ
    S->>S: Store resolver in _requestResolvers
    Note over S: Promise waiting...
    deactivate S
    H->>TS: updateTaskStatus('input_required')
    activate TS
    TS-->>H: OK
    deactivate TS
    Note over H: Blocked awaiting elicitation response

    Note over C,H: Phase 3: Client Polls Status
    activate C
    C->>CT: tasks/get { taskId }
    activate CT
    CT->>ST: HTTP POST
    activate ST
    ST->>S: _onrequest(GetTask)
    activate S
    S->>TS: getTask(taskId)
    activate TS
    TS-->>S: Task { status: 'input_required' }
    deactivate TS
    S-->>ST: Task
    deactivate S
    ST-->>CT: HTTP Response
    deactivate ST
    CT-->>C: Task { status: 'input_required' }
    deactivate CT
    deactivate C

    Note over C,H: Phase 4: Client Fetches Queued Messages
    activate C
    C->>CT: tasks/result { taskId }
    activate CT
    CT->>ST: HTTP POST
    activate ST
    ST->>S: _onrequest(GetTaskPayload)
    activate S
    S->>TQ: dequeue(taskId)
    activate TQ
    TQ-->>S: { type: 'request', message: elicitation }
    deactivate TQ
    S->>ST: send(elicitation, { relatedRequestId })
    ST-->>CT: SSE Event: elicitation request
    Note over S: Handler blocks (task not terminal)

    Note over C,H: Phase 5: Client Handles & Responds
    CT->>C: _onrequest(elicitation)
    activate C
    Note over C: Extract relatedTaskId from _meta
    C->>C: Call ElicitRequestSchema handler
    C->>C: Check: relatedTaskId && _taskMessageQueue
    Note over C: _taskMessageQueue is undefined
    C->>CT: transport.send(response)
    CT->>ST: HTTP POST (elicitation response)
    deactivate C

    Note over C,H: Phase 6: Server Receives Response, Resolves Promise
    ST->>S: _onresponse(elicitation response)
    S->>S: Lookup resolver in _requestResolvers
    S->>S: resolver(response)
    Note over S: Promise resolves
    S-->>H: Elicitation result { action: 'accept', content }
    Note over H: Resumes execution

    Note over C,H: Phase 7: Task Completes
    H->>TS: storeTaskResult('completed', finalResult)
    activate TS
    TS-->>H: OK
    deactivate TS
    deactivate H
    Note over S: GetTaskPayload handler wakes up
    S->>TS: getTask(taskId)
    activate TS
    TS-->>S: Task { status: 'completed' }
    deactivate TS
    S->>TS: getTaskResult(taskId)
    activate TS
    TS-->>S: CallToolResult
    deactivate TS
    S-->>ST: Return final result
    deactivate S
    ST-->>CT: SSE Event: CallToolResult
    deactivate ST
    CT-->>C: CallToolResult { content: [...] }
    deactivate CT
    deactivate C
```
felixweinberger and others added 4 commits November 27, 2025 09:04
Phase 1-2 of tasks experimental isolation:
- Create src/experimental/tasks/ directory structure
- Move TaskStore, TaskMessageQueue, and related interfaces to experimental/tasks/interfaces.ts
- Add experimental/tasks/types.ts for re-exporting spec types
- Update shared/task.ts to re-export from experimental for backward compatibility
- Add barrel exports for experimental module

All tests pass (1399 tests).
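The backward-compatibility shim is presumably a simple barrel re-export; a sketch, with the paths and symbol names assumed from the commit message above:

```typescript
// src/shared/task.ts — sketch of a backward-compatibility re-export.
// Existing imports from shared/task keep working while the canonical
// definitions live under experimental/tasks.
export type { TaskStore, TaskMessageQueue } from '../experimental/tasks/interfaces.js';
export * from '../experimental/tasks/types.js';
```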
Restore callTool() to its original implementation instead of delegating
to experimental.tasks.callToolStream(). This aligns with Python SDK's
approach where call_tool() is task-unaware and call_tool_as_task() is
the explicit experimental method.

Changes:
- Add guard for taskSupport: 'required' tools with clear error message
- Restore original output schema validation logic
- Add _cachedRequiredTaskTools to track required-only task tools
- Remove unused takeResult import

Tools with taskSupport: 'optional' work normally with callTool() since
the server returns CallToolResult. Only 'required' tools need the
experimental API.
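A sketch of what that guard could look like; the taskSupport field and the _cachedRequiredTaskTools name come from this commit message, while the function shape and error wording are illustrative assumptions:

```typescript
// Sketch — assumes the client tracks required-only task tools after tools/list.
function assertCallableWithCallTool(toolName: string, cachedRequiredTaskTools: Set<string>): void {
    if (cachedRequiredTaskTools.has(toolName)) {
        throw new Error(
            `Tool "${toolName}" declares taskSupport: 'required'; ` +
            `call it through the experimental task API instead of callTool().`
        );
    }
}

// Usage inside a hypothetical callTool() wrapper:
// assertCallableWithCallTool(params.name, this._cachedRequiredTaskTools);
```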
@LucaButBoring
Contributor Author

@felixweinberger Merged and synced with main

@felixweinberger felixweinberger dismissed maxisbey’s stale review November 27, 2025 20:10

Dismissing for merge

@felixweinberger felixweinberger merged commit b9538a2 into modelcontextprotocol:main Nov 27, 2025
6 checks passed

He-Pin commented Nov 27, 2025

nice to see this got merged

Development

Successfully merging this pull request may close these issues.

Implement SEP-1686: Tasks
