
Feat/OpenAI backend worker #3

Merged
rushilbhat merged 3 commits into main from feat/openai-backend-worker
Apr 27, 2026

Conversation

@rushilbhat

Summary

Add an OpenAI-compatible backend worker for Dynamo so SGLang and vLLM can handle chat processing through their own OpenAI-compatible APIs, reducing the lag between upstream engine changes and Dynamo support.

Problem

Dynamo’s native SGLang/vLLM paths are still tightly coupled to model-specific chat processing, which means upstream engine changes to chat templating, reasoning, tool calling, and related request/response behavior often take time to be reflected in Dynamo. That lag makes it harder to stay current with engine behavior and support new upstream capabilities quickly.

Solution

This PR adds an OpenAI-compatible backend worker for Dynamo that forwards chat/completions requests to a colocated engine’s OpenAI-compatible endpoint, so chat processing can be delegated to the engine itself.

To support that cleanly, this PR also:

  • adds a generic forwarding worker for OpenAI-compatible backends
  • preserves streaming behavior and coalesces streamed tool-call arguments into complete tool calls
  • normalizes compatibility gaps between Dynamo and the underlying engine, including:
    - chat_template_args -> chat_template_kwargs
    - vLLM streamed reasoning -> reasoning_content
  • splits launcher entrypoints by engine:
    - dynamo.openai_backend.sglang
    - dynamo.openai_backend.vllm
  • moves shared launcher logic into a common helper
  • updates the vLLM install flow to support explicit stable and nightly build paths
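The field normalizations in the list above can be sketched as a pair of small helpers. This is an illustrative sketch, not the PR's actual code: the helper names (`normalize_request`, `normalize_delta`) are hypothetical, and the assumption that vLLM streams its reasoning text under a `reasoning` key (to be renamed to `reasoning_content`) is inferred from the bullet, not confirmed by the source.

```python
def normalize_request(payload: dict) -> dict:
    """Rename Dynamo-style request fields to the engine's OpenAI-compatible
    equivalents before forwarding. Hypothetical helper for illustration."""
    out = dict(payload)
    # Dynamo accepts chat_template_args; SGLang/vLLM expect chat_template_kwargs.
    if "chat_template_args" in out:
        out["chat_template_kwargs"] = out.pop("chat_template_args")
    return out


def normalize_delta(delta: dict) -> dict:
    """Map a streamed response delta into Dynamo's expected shape.
    Assumes (unverified) that vLLM emits reasoning text under `reasoning`."""
    out = dict(delta)
    if "reasoning" in out and "reasoning_content" not in out:
        out["reasoning_content"] = out.pop("reasoning")
    return out
```

Keeping these renames in one place means the forwarding path itself stays engine-agnostic.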

- Implement a Dynamo worker that forwards requests to a local OpenAI-compatible server
- Support streaming chat/completions responses
- Coalesce streamed tool call arguments
- Normalize chat template fields before forwarding requests
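The tool-call coalescing step can be sketched as follows. In OpenAI-style streaming, each `chat.completion.chunk` delta may carry tool-call fragments keyed by an `index` field, with `function.arguments` arriving as partial strings that must be concatenated. The function name below is hypothetical; the PR's actual implementation may differ.

```python
def coalesce_tool_calls(deltas: list[dict]) -> list[dict]:
    """Merge streamed tool-call fragments into complete tool calls.

    Fragments for the same call share an `index`; `id` and `function.name`
    usually arrive once, while `function.arguments` arrives piecewise and
    is concatenated. Illustrative sketch, not the PR's exact code.
    """
    calls: dict[int, dict] = {}
    for delta in deltas:
        for frag in delta.get("tool_calls") or []:
            slot = calls.setdefault(
                frag["index"],
                {"id": None, "type": "function",
                 "function": {"name": "", "arguments": ""}},
            )
            if frag.get("id"):
                slot["id"] = frag["id"]
            fn = frag.get("function") or {}
            if fn.get("name"):
                slot["function"]["name"] = fn["name"]
            # Argument JSON is streamed in pieces; append in arrival order.
            slot["function"]["arguments"] += fn.get("arguments", "")
    return [calls[i] for i in sorted(calls)]
```

Downstream consumers then see one complete tool call per index instead of a stream of partial argument strings.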
@rushilbhat merged commit 636fe13 into main on Apr 27, 2026
7 of 13 checks passed
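For the streaming path, the worker has to consume the engine's server-sent-event stream and decode each `data:` line into a chunk before any coalescing or normalization can happen. A minimal sketch of that parsing step, assuming standard OpenAI-style SSE framing (`data: {...}` lines terminated by a `data: [DONE]` sentinel); the function name is hypothetical:

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Decode an OpenAI-style SSE stream into chat.completion.chunk dicts.

    Skips blank lines and `:`-prefixed keep-alive comments, and stops at
    the `[DONE]` sentinel. Sketch only; real code would read bytes from
    the engine's HTTP response.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank line or SSE comment
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(data)
```

Because the parser yields plain dicts, per-chunk fixes (such as renaming reasoning fields) compose naturally on top of it.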
```python
# ...
        return
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task
    future.cancel()
    try:
        await future
    except asyncio.CancelledError:
        pass
    # ...
    try:
        yield event_source
    except BaseException as exc:
        raise
```
