Skip to content

Fix sampler and grpo#179

Merged
Yunnglin merged 2 commits intomainfrom
fix_sampler_0422
Apr 22, 2026
Merged

Fix sampler and grpo#179
Yunnglin merged 2 commits intomainfrom
fix_sampler_0422

Conversation

@Yunnglin
Copy link
Copy Markdown
Collaborator

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

Write the detail information belongs to this PR.

Experiment results

Paste your experiment result here(if needed).

Copilot AI review requested due to automatic review settings April 22, 2026 12:17
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a mechanism to bypass weight caching by using timestamped symlinks for sampler checkpoints and ensures the prefix cache is reset upon loading new weights. It also increases HTTP timeouts and simplifies the sampler initialization to exclusively use vLLMSampler. Feedback was provided regarding the removal of the sampler_type branching logic, which leaves a dead parameter, and suggestions were made to improve symlink robustness by using relative paths and higher-resolution timestamps.

Comment thread src/twinkle/server/sampler/app.py
Comment thread src/twinkle/server/utils/checkpoint_base.py
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes issues around sampler weight saving/loading and GRPO-related sampling behavior by adjusting checkpoint naming, cache invalidation, and request timeouts.

Changes:

  • Increase HTTP client request timeouts for Twinkle client requests.
  • Change sampler checkpoint saving to always write weights under sampler_weights/latest while returning a per-save timestamp path (via symlink) to avoid path-based caching.
  • Reset vLLM prefix cache when new sampler weights/adapters are used, and simplify sampler deployment initialization to always use vLLM.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
src/twinkle_client/http/http_utils.py Increases default request timeout for GET/POST/DELETE helpers.
src/twinkle/server/utils/checkpoint_base.py Adjusts sampler checkpoint naming/storage and adds timestamp symlink return path; improves sampler-weight cleanup for symlinks.
src/twinkle/server/sampler/twinkle_handlers.py Resets sampler prefix cache when an adapter URI is provided for Twinkle sampling requests.
src/twinkle/server/sampler/tinker_handlers.py Resets sampler prefix cache during Tinker sampling flow.
src/twinkle/server/sampler/app.py Removes sampler-type branching and always initializes vLLMSampler.
src/twinkle/server/model/twinkle_handlers.py Ensures sampler weight files are saved under latest/ to match checkpoint manager behavior.
src/twinkle/server/model/tinker_handlers.py Ensures sampler weight files are saved under latest/ for Tinker sampler checkpoints.
src/twinkle/sampler/vllm_sampler/vllm_sampler.py Adds logging when attempting to load LoRA from a path during sampling.
src/twinkle/sampler/vllm_sampler/vllm_engine.py Adds logging when returning a cached LoRA request.

Comment thread src/twinkle/server/sampler/twinkle_handlers.py
Comment thread src/twinkle/sampler/vllm_sampler/vllm_engine.py
Comment thread src/twinkle/sampler/vllm_sampler/vllm_sampler.py
Comment thread src/twinkle_client/http/http_utils.py
Comment thread src/twinkle_client/http/http_utils.py
Comment thread src/twinkle_client/http/http_utils.py
Comment thread src/twinkle/server/utils/checkpoint_base.py
Comment thread src/twinkle/server/sampler/app.py
Comment thread src/twinkle/server/sampler/tinker_handlers.py
@Yunnglin Yunnglin merged commit c321952 into main Apr 22, 2026
6 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants