Fix some potential bugs#168
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces input validation for sampling parameters, modifies the retry sampler to return instead of raising exceptions, and adds a try-finally block for resource cleanup in the vLLM engine. Feedback suggests that the change in the retry sampler could lead to deadlocks in distributed training and should be reverted. Additionally, the cleanup logic in the vLLM engine should be expanded to cover all resource allocations to prevent potential leaks.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces validation logic for sampling parameters and enhances resource management in the VLLM engine by wrapping weight streaming in a try-finally block. Feedback focuses on improving the robustness of these changes: specifically, handling NaN values in parameter validation, ensuring background tasks are explicitly cancelled during cleanup to prevent deadlocks, and expanding the try block to encompass all resource allocations to avoid potential leaks if initialization fails.
(cherry picked from commit 7cc7d45)
(cherry picked from commit 4a51f5a)
PR type
PR information
Write the detail information belongs to this PR.
Experiment results
Paste your experiment result here(if needed).