
Fix multi lora device #160

Merged: tastelikefeet merged 10 commits into modelscope:main from tastelikefeet:fix/0416-1 on Apr 16, 2026
Conversation

@tastelikefeet
Collaborator

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

Write the detailed information related to this PR.

Experiment results

Paste your experiment results here (if needed).

@tastelikefeet tastelikefeet merged commit 8d7bd15 into modelscope:main Apr 16, 2026
1 of 3 checks passed
Contributor

@gemini-code-assist (Bot) left a comment


Code Review

This pull request updates the Twinkle framework to version 0.2.0, transitioning the default model to Qwen3.6-35B-A3B and significantly expanding the documentation and cookbook examples. Key technical changes include the implementation of a convenience encode method for single trajectories and the addition of several new components such as Gym, Hub, and various RL-focused loss functions. Feedback suggests clarifying the error message when encode is used with a splitting truncation strategy and verifying template compatibility for text-only models in example scripts.

I was unable to create individual review comments; my feedback is listed below.

src/twinkle/template/base.py (564-566)

medium

The assertion correctly prevents the use of truncation_strategy='split' with the encode method, as splitting results in multiple InputFeature objects which cannot be returned by a method designed for a single trajectory. However, the error message suggests using batch_encode() instead. It would be helpful to clarify that batch_encode should be used when multiple output features are expected from a single input due to splitting.
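The guard and the clearer error message suggested above can be sketched as follows. This is a minimal illustration only: the class name mirrors the reviewed file, but the actual `twinkle` Template API, its attributes, and `InputFeature` handling may differ.

```python
# Hypothetical sketch of the encode() guard discussed in the review.
# Assumptions (not from the real twinkle codebase): the template stores a
# `truncation_strategy` string, and encode() returns one feature dict.
class Template:
    def __init__(self, truncation_strategy: str = "right"):
        self.truncation_strategy = truncation_strategy

    def encode(self, trajectory):
        # 'split' may break one trajectory into several InputFeature
        # objects, which a single-return encode() cannot represent.
        assert self.truncation_strategy != "split", (
            "encode() returns a single InputFeature, but "
            "truncation_strategy='split' can produce several; "
            "use batch_encode() when splitting may yield multiple features."
        )
        # Placeholder for the real tokenization logic.
        return {"input_ids": list(trajectory)}
```

With this wording, the error explains *why* `batch_encode()` is the right alternative (it can return all features produced by splitting), not just that it should be used.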

cookbook/sample/sample.py (82)

medium

The script uses Qwen3_5Template with MODEL_ID='Qwen/Qwen3.5-4B'. According to the documentation in docs/source_en/Components/Template/Template.md, Qwen3_5Template is specifically for Qwen3.5 MLLMs (multimodal models). If Qwen/Qwen3.5-4B is a pure text model, using the base Template class might be more appropriate and less prone to unexpected behavior related to multimodal processing.
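The suggested selection logic can be sketched as a small helper. The class names mirror those mentioned in the review, but this helper and its signature are hypothetical, not part of the twinkle API.

```python
# Hypothetical helper illustrating the review suggestion: pick the
# multimodal template only when the checkpoint is actually multimodal.
def pick_template_name(model_id: str, is_multimodal: bool) -> str:
    # Qwen3_5Template adds MLLM-specific preprocessing; for a pure text
    # model like Qwen/Qwen3.5-4B the base Template avoids that machinery
    # and any unexpected multimodal behavior.
    return "Qwen3_5Template" if is_multimodal else "Template"
```

The point of the indirection is that template choice is a property of the checkpoint, not the script, so encoding it once keeps example scripts consistent with the documentation in `Template.md`.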
