SEP-1686: Tasks #1041
Conversation
SEP-1686 currently states that it "introduces a mechanism for requestors (which can be either clients or servers, depending on the direction of communication) to augment their requests with tasks." Is this still the case?
I don't see any tests or examples demonstrating the use of a TaskStore with an McpClient (it's all McpServer or the Protocol base type), although I suppose it should work considering it's shared code. It still might be nice to have an end-to-end test demonstrating that client-side support for things like elicitations works using the public APIs.
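For illustration, a rough sketch of what such a test's setup could look like. `Client`, `setRequestHandler`, and `ElicitRequestSchema` are existing SDK APIs; the `taskStore` client option and the `InMemoryTaskStore` import path are assumptions about this PR's surface:

```typescript
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { ElicitRequestSchema } from '@modelcontextprotocol/sdk/types.js';
import { InMemoryTaskStore } from '../examples/shared/inMemoryTaskStore.js'; // path is illustrative

// Assumption: ProtocolOptions.taskStore is honored by Client the same way it
// is by McpServer, since the task plumbing lives in the shared Protocol base.
const client = new Client(
    { name: 'task-test-client', version: '1.0.0' },
    { capabilities: { elicitation: {} }, taskStore: new InMemoryTaskStore() }
);

// A client-side elicitation handler that a server's task-augmented request
// would invoke; an end-to-end test would assert this round-trips correctly.
client.setRequestHandler(ElicitRequestSchema, async () => ({
    action: 'accept' as const,
    content: { confirmed: true }
}));
```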
```typescript
 * Note: This is not suitable for production use as all data is lost on restart.
 * For production, consider implementing TaskStore with a database or distributed cache.
 */
export class InMemoryTaskStore implements TaskStore {
```
Is there a reason we cannot create a default in-memory task store with settable max limits on the total number of tasks?
I like that this PR adds the TaskStore abstraction which opens the door for a distributed implementation and doesn't require the application developer to manually handle tasks/get, tasks/list, etc., but it feels weird to me to not have a production-ready in-memory solution provided by the SDK.
It should still be off by default to make sure people are explicit about any maxTrackedTask limits and the like, but it should be built in and not left to an example that is "not suitable for production." This will prove it's possible to implement a production-quality TaskStore in the easiest case where everything is tracked in-memory and lost on process restart.
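A minimal sketch of the kind of bounded store being suggested; the `maxTrackedTasks` option and the method shape are illustrative assumptions, not this PR's API:

```typescript
// Sketch only: an in-memory store with an explicit, required limit.
interface BoundedStoreOptions {
    maxTrackedTasks: number; // no default, so callers must opt in explicitly
}

class BoundedInMemoryTaskStore {
    private tasks = new Map<string, { status: string; result?: unknown }>();

    constructor(private readonly options: BoundedStoreOptions) {}

    createTask(taskId: string): void {
        if (this.tasks.size >= this.options.maxTrackedTasks) {
            // Map preserves insertion order, so the first key is the oldest task.
            const oldest = this.tasks.keys().next().value;
            if (oldest !== undefined) {
                this.tasks.delete(oldest);
            }
        }
        this.tasks.set(taskId, { status: 'working' });
    }
}
```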
> Note: This is not suitable for production use as all data is lost on restart.
> For production, consider implementing TaskStore with a database or distributed cache.
This is true, but I think the impact is overstated. The logic in protocol.ts that calls _taskStore.updateTaskStatus(taskMetadata.taskId, 'input_required'); and then sets the task back to 'working' when a request completes also breaks if the server restarts. If the process exits before the client provides a response, the task will be left in an 'input_required' state indefinitely without some manual intervention outside of the SDK.
In most cases supported by the SDK, when tasks cannot outlive the MCP server process, an in-memory TaskStore would be better than a distributed one, because after a restart the client will automatically be informed that the task is no longer being tracked by the server.
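Roughly, the failure mode being described (simplified; exact names and signatures in protocol.ts may differ):

```typescript
interface TaskStore {
    updateTaskStatus(taskId: string, status: string): Promise<void>;
}
declare function waitForClientResponse(): Promise<unknown>;

// Simplified from the PR's protocol.ts flow around sendRequest.
async function elicitWithinTask(taskStore: TaskStore, taskId: string) {
    await taskStore.updateTaskStatus(taskId, 'input_required');
    const response = await waitForClientResponse();
    // If the process dies before this line, a durable store keeps the task in
    // 'input_required' forever; an in-memory store simply forgets it, and the
    // client learns the task is gone on its next tasks/get.
    await taskStore.updateTaskStatus(taskId, 'working');
    return response;
}
```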
I don't disagree, but precedent was the main reason for leaving this as an example and leaving out any limits, really - it's the same case with InMemoryEventStore and InMemoryOAuthClientProvider. @ihrpr @felixweinberger any input here?
If the SDK is going to provide a production-grade implementation, it needs to have limits and some form of logging extension points; if it is not, I don't want to misrepresent this example as one.
The SDK can never ship a production-ready service implementation, because that will always require some additional resources
Seems right to me to have this as an interface - it seems hard to provide a generic "production-ready" approach here; I guess we could use sqlite? But I think the precedent of InMemoryEventStore and InMemoryOAuthClientProvider is convincing. If we wanted to create production-ready examples, we can do that as follow-ups.
However, it's important this is documented clearly, which it is in the README.
Agreed, will add this. edit: Done

Updated @maxisbey
@maxisbey @LucaButBoring I tested TS <> Py. The following worked:
But this didn't work:
To make this work cleanly I had to make 2 small changes:
Then I get this:
TS SDK fix is in: 681cf0d

Gonna test elicitation examples as well; assuming that works fine, I think we should get this landed.
Responses and errors were incorrectly going through the original stream path instead of being queued. Also, extra.sendRequest was not setting the input_required status. These issues have been fixed and tests have been added/updated for them.
Sequence diagram of the intended flow:
```mermaid
sequenceDiagram
participant C as Client Protocol
participant CT as Client Transport
participant ST as Server Transport
participant S as Server Protocol
participant TQ as TaskMessageQueue
participant TS as TaskStore
participant H as Async Handler
Note over C,H: Phase 1: Task Creation
activate C
C->>CT: tools/call { task: { ttl: 60000 } }
activate CT
CT->>ST: HTTP POST
activate ST
ST->>S: _onrequest()
activate S
S->>TS: createTask()
activate TS
TS-->>S: Task { taskId, status: 'working' }
deactivate TS
S--)H: Start async handler (non-blocking)
activate H
S-->>ST: CreateTaskResult { task }
deactivate S
ST-->>CT: HTTP Response
deactivate ST
CT-->>C: CreateTaskResult
deactivate CT
deactivate C
Note over C,H: Phase 2: Server Queues Elicitation Request
H->>S: extra.sendRequest(elicitation, { relatedTask })
activate S
S->>TQ: enqueue({ type: 'request', message: elicitation })
activate TQ
TQ-->>S: OK
deactivate TQ
S->>S: Store resolver in _requestResolvers
Note over S: Promise waiting...
deactivate S
H->>TS: updateTaskStatus('input_required')
activate TS
TS-->>H: OK
deactivate TS
Note over H: Blocked awaiting elicitation response
Note over C,H: Phase 3: Client Polls Status
activate C
C->>CT: tasks/get { taskId }
activate CT
CT->>ST: HTTP POST
activate ST
ST->>S: _onrequest(GetTask)
activate S
S->>TS: getTask(taskId)
activate TS
TS-->>S: Task { status: 'input_required' }
deactivate TS
S-->>ST: Task
deactivate S
ST-->>CT: HTTP Response
deactivate ST
CT-->>C: Task { status: 'input_required' }
deactivate CT
deactivate C
Note over C,H: Phase 4: Client Fetches Queued Messages
activate C
C->>CT: tasks/result { taskId }
activate CT
CT->>ST: HTTP POST
activate ST
ST->>S: _onrequest(GetTaskPayload)
activate S
S->>TQ: dequeue(taskId)
activate TQ
TQ-->>S: { type: 'request', message: elicitation }
deactivate TQ
S->>ST: send(elicitation, { relatedRequestId })
ST-->>CT: SSE Event: elicitation request
Note over S: Handler blocks (task not terminal)
Note over C,H: Phase 5: Client Handles & Responds
CT->>C: _onrequest(elicitation)
activate C
Note over C: Extract relatedTaskId from _meta
C->>C: Call ElicitRequestSchema handler
C->>C: Check: relatedTaskId && _taskMessageQueue
Note over C: _taskMessageQueue is undefined
C->>CT: transport.send(response)
CT->>ST: HTTP POST (elicitation response)
deactivate C
Note over C,H: Phase 6: Server Receives Response, Resolves Promise
ST->>S: _onresponse(elicitation response)
S->>S: Lookup resolver in _requestResolvers
S->>S: resolver(response)
Note over S: Promise resolves
S-->>H: Elicitation result { action: 'accept', content }
Note over H: Resumes execution
Note over C,H: Phase 7: Task Completes
H->>TS: storeTaskResult('completed', finalResult)
activate TS
TS-->>H: OK
deactivate TS
deactivate H
Note over S: GetTaskPayload handler wakes up
S->>TS: getTask(taskId)
activate TS
TS-->>S: Task { status: 'completed' }
deactivate TS
S->>TS: getTaskResult(taskId)
activate TS
TS-->>S: CallToolResult
deactivate TS
S-->>ST: Return final result
deactivate S
ST-->>CT: SSE Event: CallToolResult
deactivate ST
CT-->>C: CallToolResult { content: [...] }
deactivate CT
deactivate C
```
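For concreteness, a hedged sketch of Phase 2 from the tool handler's side. `registerTool`, `extra.sendRequest`, and `ElicitResultSchema` are existing SDK APIs; the `relatedTask` option and the hard-coded task ID are assumptions drawn from the diagram above:

```typescript
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { ElicitResultSchema } from '@modelcontextprotocol/sdk/types.js';

const server = new McpServer({ name: 'tasks-demo', version: '1.0.0' });

server.registerTool(
    'confirm-and-run',
    { description: 'Asks the user to confirm before running' },
    async (_args, extra) => {
        // Because the request is related to a task, it is enqueued on the
        // TaskMessageQueue (Phase 2) instead of the original stream, and the
        // task is flipped to 'input_required' until the client responds.
        const result = await extra.sendRequest(
            {
                method: 'elicitation/create',
                params: { message: 'Proceed?', requestedSchema: { type: 'object', properties: {} } }
            },
            ElicitResultSchema,
            { relatedTask: { taskId: 'task-123' } } // assumed option; the ID would come from the task context
        );
        return { content: [{ type: 'text' as const, text: `Client said: ${result.action}` }] };
    }
);
```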
Phase 1-2 of tasks experimental isolation:
- Create src/experimental/tasks/ directory structure
- Move TaskStore, TaskMessageQueue, and related interfaces to experimental/tasks/interfaces.ts
- Add experimental/tasks/types.ts for re-exporting spec types
- Update shared/task.ts to re-export from experimental for backward compatibility
- Add barrel exports for experimental module

All tests pass (1399 tests).
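The backward-compatibility shim described in that commit might look roughly like this; file paths follow the commit message, and the exported names are representative assumptions:

```typescript
// src/shared/task.ts - kept so existing imports continue to compile
export type { TaskStore, TaskMessageQueue } from '../experimental/tasks/interfaces.js';

// src/experimental/index.ts - barrel for the new module
export * from './tasks/interfaces.js';
export * from './tasks/types.js';
```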
Restore callTool() to its original implementation instead of delegating to experimental.tasks.callToolStream(). This aligns with the Python SDK's approach, where call_tool() is task-unaware and call_tool_as_task() is the explicit experimental method.

Changes:
- Add a guard for taskSupport: 'required' tools with a clear error message
- Restore the original output schema validation logic
- Add _cachedRequiredTaskTools to track required-only task tools
- Remove the unused takeResult import

Tools with taskSupport: 'optional' work normally with callTool() since the server returns CallToolResult. Only 'required' tools need the experimental API.
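A minimal sketch of the 'required' guard described in that commit message; the cache name, error text, and helper are assumptions for illustration:

```typescript
const cachedRequiredTaskTools = new Set<string>(); // populated from tools/list results

declare function sendToolsCall(name: string, args: Record<string, unknown>): Promise<unknown>;

async function callTool(name: string, args: Record<string, unknown>): Promise<unknown> {
    if (cachedRequiredTaskTools.has(name)) {
        throw new Error(
            `Tool "${name}" has taskSupport: 'required'; call it via the ` +
            `experimental tasks API instead of callTool().`
        );
    }
    // ...original, task-unaware request path continues here...
    return sendToolsCall(name, args);
}
```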
@felixweinberger Merged and synced with
nice to see this got merged

This PR implements the required changes for modelcontextprotocol/modelcontextprotocol#1686, which adds task augmentation to requests.
Motivation and Context
The current MCP specification supports tool calls that execute a request and eventually receive a response, and tool calls can be passed a progress token to integrate with MCP's progress-tracking functionality, enabling host applications to receive status updates for a tool call via notifications. However, there is no way for a client to explicitly request the status of a tool call, so a tool call may have been dropped on the server with no way to know whether a response or notification will ever arrive. Similarly, there is no way for a client to explicitly retrieve the result of a tool call after it has completed — if the result was dropped, clients must call the tool again, which is undesirable for tools expected to take minutes or more. This is particularly relevant for MCP servers abstracting existing workflow-based APIs, such as AWS Step Functions, Workflows for Google Cloud, or APIs representing CI/CD pipelines, among other applications.
This proposal (and implementation) solves this by introducing the concept of Tasks, which are pending work objects that can augment any other request. Clients generate a task ID and augment their request with it — that task ID is both a reference to the request and an idempotency token. If the server accepts the task augmentation request, clients can then poll for the status and eventual result of the task with the tasks/get and tasks/result operations.
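At the wire level, the flow reads roughly as follows. Note that this description has the client generating the task ID, while the sequence diagram earlier shows the server assigning one, so treat the exact fields as illustrative; `send` is a hypothetical helper standing in for a JSON-RPC request:

```typescript
import { randomUUID } from 'node:crypto';

declare function send(message: { method: string; params: Record<string, unknown> }): Promise<any>;

const taskId = randomUUID(); // doubles as an idempotency token

// 1. Augment an ordinary request with a task.
await send({
    method: 'tools/call',
    params: { name: 'long-running-deploy', arguments: {}, task: { taskId, ttl: 60000 } }
});

// 2. Poll task status until it reaches a terminal state.
let task;
do {
    await new Promise(resolve => setTimeout(resolve, 1000));
    task = await send({ method: 'tasks/get', params: { taskId } });
} while (task.status === 'working' || task.status === 'input_required');

// 3. Fetch the result; retryable without re-running the tool.
const result = await send({ method: 'tasks/result', params: { taskId } });
```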
How Has This Been Tested?
Unit tests and updated sHTTP example.
Breaking Changes
None.