fix cookbook#152
Code Review
This pull request updates several cookbook examples with revised GPU configurations and model IDs, and refactors the loss calculation logic to support separate training and evaluation statuses. It also includes bug fixes for variable references in the vLLM engine and sequence length calculations. Reviewers identified an inconsistency where some metrics still hardcode training status during evaluation and suggested more robust error handling when dynamically loading model architectures from configurations.
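As an illustration of the error-handling suggestion, here is a minimal sketch of resolving a model class from a configuration with explicit failure messages. The names `resolve_architecture`, `registry`, and the `"architectures"` key are assumptions made for this example and are not taken from the PR's code.

```python
def resolve_architecture(config, registry):
    """Look up a model class named in `config` (hypothetical helper).

    `config` is assumed to be a dict with an "architectures" list, and
    `registry` a dict mapping architecture names to classes.
    """
    architectures = config.get("architectures") or []
    if not architectures:
        # Fail early with a clear message instead of an IndexError later.
        raise ValueError("config has no 'architectures' entry")
    name = architectures[0]
    try:
        return registry[name]
    except KeyError:
        known = ", ".join(sorted(registry))
        raise ValueError(
            f"unknown architecture {name!r}; known architectures: {known}"
        ) from None
```

Raising a `ValueError` that lists the known names makes a misconfigured example fail at load time rather than deep inside model construction.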
/gemini review
Code Review
This pull request introduces multimodal distillation support for OlympiadBench, updates model configurations across several examples, and enhances the core framework's handling of evaluation status, weight synchronization, and vLLM sampling. Key improvements include a new check for uniform batch distribution to prevent hangs and corrected sequence length indexing. Review feedback highlights several critical issues: a potential TypeError in weight synchronization when model_keys is null, a ValueError in template label concatenation caused by numpy array operations, a missing self parameter in a template method, and ineffective type checking for tensor data in the uniformity check.
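Two of the points above can be sketched in a few lines: guarding against a null `model_keys` before iterating, and checking that every rank received the same number of batches. This is a hedged illustration only; `keys_to_sync` and `assert_uniform_batches` are hypothetical helpers, and the real framework's signatures and data types may differ.

```python
def keys_to_sync(state_keys, model_keys=None):
    """Return the weight names to synchronize (hypothetical helper).

    Treats `model_keys=None` as "sync everything" rather than iterating
    over None, which would raise a TypeError.
    """
    if model_keys is None:
        return list(state_keys)
    wanted = set(model_keys)
    return [key for key in state_keys if key in wanted]


def assert_uniform_batches(batches_per_rank):
    """Raise if ranks hold different batch counts (hypothetical helper).

    `batches_per_rank` is assumed to be a list of plain ints, one per rank.
    Uneven counts can leave some ranks waiting in a collective call.
    """
    if len(set(batches_per_rank)) > 1:
        raise ValueError(
            f"non-uniform batch distribution across ranks: {list(batches_per_rank)}"
        )
```

For example, `keys_to_sync(["a", "b"], None)` returns both keys, while `assert_uniform_batches([4, 4, 3])` raises before any rank can block.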
PR type
PR information
Write the detailed information for this PR.
Experiment results
Paste your experiment results here (if needed).