feat: add python binding for rust llm modules by biswapanda · Pull Request #13 · ai-dynamo/dynamo

biswapanda · 2025-03-04T19:06:26Z

What does the PR do?

Adds Python bindings and an example for these Rust LLM modules:

model deployment card
preprocessor
backend

Checklist

Commit Type:

Check the conventional commit type box here and add the label to the GitHub PR.

Related PRs:

Where should the reviewer start?

Test plan:

CI Pipeline ID:

Caveats:

Background

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

closes GitHub issue: #xxx

github-actions · 2025-03-04T19:38:30Z

Test Results

2 files, 2 suites, 52s ⏱️
75 tests: 75 passed ✅, 0 skipped 💤, 0 failed ❌
97 runs: 96 passed ✅, 1 skipped 💤, 0 failed ❌

Results for commit 4f1861f.

Quick-win review fixes from PR #9131. Heavy-lift items (#9 prompt_token_ids env-gate, #11 update_weights atomicity, #13 per-choice completion_token_ids) tracked separately as follow-ups.

handlers.py
- Catch EngineDeadError before the generic except in all 8 RL handlers (pause/resume/liveness_probe/get_state/flush_cache/update_weights_from_path/load_lora_adapter/unload_lora_adapter): match the existing shutdown pattern in this file so admin calls also surface engine death instead of leaving a broken worker alive.
- get_state: fall back to a no-op collective_rpc when check_health is absent. This is the same fallback liveness_probe already uses; otherwise older engines without check_health always look alive.
- load_lora_adapter hot-swap path: a remove_lora() failure now returns a 400-style error response (was: silent log warn + continue, leaving add_lora to no-op against the still-registered ID); a reset_prefix_cache() failure after add_lora succeeds also returns error (was: log error and continue, leaving stale KV from the old adapter routable).
- unload_lora_adapter: an unregister_model() failure after engine remove_lora succeeds now returns error (was: log warn and report success, leaving model=<lora_name> still routed to this worker even though _resolve_lora_request would now fall back to the base model).

container/deps/vllm/install_vllm.sh
- Pin prime-rl install to an immutable commit SHA (d49f3939e7dca29bceb9ed515cc1782497b67e81 ↔ tag v0.5.1.dev101) so a re-pointed tag upstream can't change what we ship. PRIME_RL_REF kept in build logs for human readability; PRIME_RL_COMMIT is the authoritative pin.
- Replace `echo "\n=== ..."` with `printf '\n=== ...\n'` (shellcheck SC2028).

lib/llm/src/http/service/openai.rs
- Force `request.inner.logprobs = Some(true)` unconditionally in both RL token-id promotion blocks (was: only when None). RL extraction of completion_token_ids depends on logprobs being on at the engine; an explicit logprobs=false would otherwise silently drop them.
- Bound `/v1/rl/ready` per-worker probes with a 5s timeout (override via DYN_RL_LIVENESS_TIMEOUT_MS). Was reusing the shared 600s http_client, so one wedged worker could block readiness for 10 minutes instead of failing fast as 503.
- Tokenize Chat handler: call `request.validate()?` before `merged_chat_template_kwargs()` so the continue_final_message + add_generation_prompt mutual-exclusion constraint is enforced (validate() existed but was never invoked).

lib/llm/src/protocols/openai/chat_completions.rs
- Update stale doc comments on the legacy `tokens` and `return_token_ids` fields: they pointed callers at the now-404 `/v1/chat/completions/tokens` URI. Direct callers to the canonical top-level `prompt_token_ids` extension and `nvext.extra_fields` instead.

cargo check -p dynamo-llm: clean (1 pre-existing benign warning).
cargo test -p dynamo-llm --test test_common_ext: 15 passed.
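The bounded readiness probe described above can be illustrated with a small sketch. This is not the Rust code from openai.rs; it is a Python approximation of the idea, assuming each worker exposes an async health check. The names `liveness_timeout_s`, `probe_worker`, and `ready_status` are hypothetical. Only the 5s default, the DYN_RL_LIVENESS_TIMEOUT_MS override, and the 503 result come from the commit message; how the real service parses the variable is an assumption.

```python
import asyncio
import os

# 5s default taken from the commit message; everything else here is illustrative.
DEFAULT_TIMEOUT_MS = 5000


def liveness_timeout_s() -> float:
    """Read the per-probe timeout, falling back to the default on bad input."""
    raw = os.environ.get("DYN_RL_LIVENESS_TIMEOUT_MS")
    try:
        ms = int(raw) if raw is not None else DEFAULT_TIMEOUT_MS
    except ValueError:
        ms = DEFAULT_TIMEOUT_MS
    return ms / 1000.0


async def probe_worker(probe, timeout_s: float) -> bool:
    """Return True only if probe() finishes within timeout_s and reports healthy."""
    try:
        return bool(await asyncio.wait_for(probe(), timeout=timeout_s))
    except asyncio.TimeoutError:
        # A wedged worker counts as not ready instead of blocking the caller.
        return False
    except Exception:
        # Any probe error is also treated as not ready.
        return False


async def ready_status(probes, timeout_s: float) -> int:
    """Probe all workers concurrently; 200 if every one is healthy, else 503."""
    results = await asyncio.gather(*(probe_worker(p, timeout_s) for p in probes))
    return 200 if results and all(results) else 503
```

Because each probe carries its own timeout and the probes run concurrently, the total wait in this sketch is bounded by roughly one timeout rather than by the slowest worker.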

biswapanda added 6 commits March 4, 2025 10:42

feat: add pybindings for mdc, backend and preprocessor

2dd42cb

fix: remove pipeline

8d4afeb

fix: add license text

89f121e

style: mypy and cargo fmt

769550c

style: rename dirs

7ab8913

feat: fixes

4f1861f

biswapanda requested review from grahamking, nnshah1, paulhendricks, ptarasiewiczNV and rmccorm4 as code owners March 4, 2025 19:06

biswapanda temporarily deployed to GITLAB March 4, 2025 19:06 — with GitHub Actions Inactive

biswapanda temporarily deployed to GITLAB March 4, 2025 19:10 — with GitHub Actions Inactive

grahamking approved these changes Mar 4, 2025

biswapanda self-assigned this Mar 4, 2025

fix: clippy err

0b62587

biswapanda requested a review from GuanLuo as a code owner March 4, 2025 22:06

biswapanda temporarily deployed to GITLAB March 4, 2025 22:06 — with GitHub Actions Inactive

biswapanda temporarily deployed to GITLAB March 4, 2025 22:15 — with GitHub Actions Inactive

style: cargo fmt

3b96755

biswapanda temporarily deployed to GITLAB March 4, 2025 22:17 — with GitHub Actions Inactive

biswapanda temporarily deployed to GITLAB March 4, 2025 22:25 — with GitHub Actions Inactive

Merge branch 'main' into bis/pybind-rusty-llm

d0e38c9

biswapanda temporarily deployed to GITLAB March 4, 2025 22:32 — with GitHub Actions Inactive

biswapanda temporarily deployed to GITLAB March 4, 2025 22:42 — with GitHub Actions Inactive

biswapanda merged commit 2da0921 into main Mar 4, 2025

biswapanda deleted the bis/pybind-rusty-llm branch March 4, 2025 22:52

kylehh pushed a commit to kylehh/dynamo that referenced this pull request Apr 11, 2025

feat: add python binding for rust llm modules (ai-dynamo#13)

a32cdad

tanmayv25 mentioned this pull request Apr 15, 2026

DEP: Backend Interface -- LLMEngine ABC and Worker #8251

Open

biswapanda merged 9 commits into main from bis/pybind-rusty-llm

2 participants
