
<p align="center">
<img src="assets/slogan.png" width="200"/>
</p>
<p align="center">
by <a href="https://modelscope.cn/home">ModelScope</a>
<br>
with `torchrun`, or scaling training across Ray clusters,
Twinkle✨ eliminates infrastructure friction by encapsulating
training logic into standardized APIs. Beyond simple
abstraction, Twinkle✨ serves as a robust backend and gateway to enable serverless Training-as-a-Service (TaaS).
It offers interfaces that constitute a _superset_ of the [Tinker](https://thinkingmachines.ai/tinker/) APIs,
making it possible to access a Twinkle✨ training service either through a Tinker client or through the
native Twinkle✨ client, which offers more functionality.

🧩 <b>Decoupled Architecture</b>: Standardized Interfaces, backward compatible with Tinker APIs.<br>
🚀 <b>Multiple Runtime Modes</b>: torchrun / Ray / HTTP.<br>
🔌 <b>Versatile Backends</b>: Transformers / Megatron.<br>
👥 <b>Multi-Tenancy Training Service</b>: Train multiple LoRAs that share one base model deployment.<br>
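
The multi-tenancy idea can be sketched in plain Python. Everything below is illustrative (none of these classes are Twinkle APIs): one expensive, frozen base deployment is shared, while each tenant trains only its own tiny adapter, the way multiple LoRAs share one base model.

```python
# Illustrative sketch of multi-tenant adapter training: one shared base,
# many per-tenant adapters. Class and method names are hypothetical.

class BaseModel:
    """Stands in for one expensive base-model deployment (frozen weights)."""
    def __init__(self, name):
        self.name = name

    def forward(self, x, adapter=None):
        # Base computation, optionally modulated by a tenant's adapter.
        y = 2 * x
        if adapter is not None:
            y += adapter["delta"] * x
        return y

class MultiTenantTrainer:
    """Many tenants share `base`; each trains only its own adapter."""
    def __init__(self, base):
        self.base = base
        self.adapters = {}

    def create_adapter(self, tenant_id):
        self.adapters[tenant_id] = {"delta": 0.0}
        return tenant_id

    def train_step(self, tenant_id, x, target, lr=0.1):
        a = self.adapters[tenant_id]
        pred = self.base.forward(x, a)
        grad = 2 * (pred - target) * x  # d(squared error)/d(delta)
        a["delta"] -= lr * grad         # only the tenant's adapter moves
        return (pred - target) ** 2

base = BaseModel("shared-base")
trainer = MultiTenantTrainer(base)
t1 = trainer.create_adapter("tenant-1")
t2 = trainer.create_adapter("tenant-2")
for _ in range(50):
    trainer.train_step(t1, x=1.0, target=3.0)  # tenant-1 wants y = 3x
    trainer.train_step(t2, x=1.0, target=1.0)  # tenant-2 wants y = 1x
```

The base weights never change; isolation between tenants falls out of each gradient step touching only that tenant's adapter state.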

Note: Twinkle✨ is built by the team behind [ms-swift](https://github.com/modelscope/ms-swift), and
we expect the two projects to evolve together, with some fundamental components of Twinkle✨ likely to
be reused in [ms-swift](https://github.com/modelscope/ms-swift).

sh INSTALL_MEGATRON.sh

## Tutorials

| Training Type | Model Framework | Cookbook Path |
| ------------------------------------ | --------------- | ----------------------------------------------------- |
| FSDP finetuning | transformers | [Script](cookbook/transformers/fsdp2.py) |
| FSDP MoE finetuning | transformers | [Script](cookbook/transformers/fsdp2_moe.py) |
| EP FSDP MoE finetuning | transformers | [Script](cookbook/transformers/ep_fsdp_qwen3_moe.py) |
| SP FSDP finetuning | transformers | [Script](cookbook/transformers/sp_fsdp_dense.py) |
| pp/tp/cp finetuning | megatron | [Script](cookbook/megatron/tp.py) |
| pp/tp/cp MoE finetuning | megatron | [Script](cookbook/megatron/tp_moe.py) |
| Multimodal FSDP finetuning | transformers | [Script](cookbook/mm/fsdp2.py) |
| GRPO RL training | megatron | [Script](cookbook/rl/grpo.py) |
| GRPO Multimodal RL training | megatron | [Script](cookbook/rl/grpo_mm.py) |
| GRPO Math RL training | megatron | [Script](cookbook/rl/short_math_grpo.py) |
| DPO full-parameter training | transformers | [Script](cookbook/rl/dpo_full.py) |
| DPO LoRA training | transformers | [Script](cookbook/rl/dpo_lora.py) |
| DPO multi-LoRA training | transformers | [Script](cookbook/rl/dpo_multi_lora.py) |
| GKD on-policy distillation | megatron | [Script](cookbook/rl/gkd_on_policy.py) |
| GKD off-policy distillation | megatron | [Script](cookbook/rl/gkd_off_policy.py) |
| Tinker client finetuning (self-host) | transformers | [Script](cookbook/client/tinker/self_host) |
| Tinker client finetuning (ModelScope) | transformers | [Script](cookbook/client/tinker/modelscope) |
| Twinkle client finetuning (self-host) | transformers | [Script](cookbook/client/twinkle/self_host) |
| Twinkle client finetuning (ModelScope) | transformers | [Script](cookbook/client/twinkle/modelscope) |
| Server startup scripts | transformers/megatron | [Script](cookbook/client/server) |

## Changelog
- 🎉2026-04-14 The ModelScope service has been deployed to [Qwen/Qwen3.6-35B-A3B](https://www.modelscope.cn/models/Qwen/Qwen3.6-35B-A3B) as part of the new 0.2.0 release.
- 🎉2026-03-28 Support DPO training with both Transformers and Megatron backends. See [dpo_full.py](cookbook/rl/dpo_full.py) and [dpo_lora.py](cookbook/rl/dpo_lora.py).
- 🎉2026-03-24 The Twinkle website is now live at https://modelscope.github.io/twinkle-web/
- 🎉2026-03-19 Support GKD training; see this [cookbook](cookbook/rl/gkd_on_policy.py).
- 🎉2026-02-13 Initial version of Twinkle✨ released, including SFT/PT/RL support for text models.

## Training as a Service on ModelScope

We are rolling out a training service built atop Twinkle✨ on ModelScope. You may
train via the API endpoint `base_url=https://www.modelscope.cn/twinkle`. For more details, please refer to
our [documentation](docs/source_en/Usage%20Guide/Train-as-a-Service.md).
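
As a rough sketch of what "pointing a client at the service" amounts to: only the base URL and an API key are service-specific. The route name and header scheme below are assumptions for illustration, not documented Twinkle behavior.

```python
from urllib.parse import urljoin

# Service-specific settings: the endpoint from the docs and your own key.
base_url = "https://www.modelscope.cn/twinkle"
api_key = "your-api-key"  # issued by ModelScope; placeholder here

# A Tinker-compatible client would append its API routes to base_url.
# "v1/training_runs" is a hypothetical route, shown only for illustration.
endpoint = urljoin(base_url + "/", "v1/training_runs")
headers = {"Authorization": f"Bearer {api_key}"}  # typical bearer-token scheme
```

Any client that accepts a `base_url` and `api_key` pair can be wired up this way, whether it speaks the Tinker-compatible APIs or the native Twinkle✨ ones.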

## Supported Hardware
| Hardware Environment | Notes |
| -------------------- | ---------------------------------------------------------------- |
| Nvidia GPUs | ✅ Support for BF16/Flash-Attn may be incomplete in earlier GPUs |
| Ascend NPU | ✅ Some operators may not be supported |
| PPU | ✅ |
| CPU                  | Partial support: components such as dataset and dataloader       |

supported on the Twinkle✨ framework.
> For serverless training service accessed via `base_url=https://www.modelscope.cn/twinkle`, it
> is currently provided via the Tinker-compatible APIs. We will be rolling out services that support
> both the Tinker APIs and the full-fledged Twinkle✨ native APIs. The serverless endpoint is backed
> by one training base at a time, and currently it is [Qwen3.6-35B-A3B](https://modelscope.cn/models/Qwen/Qwen3.6-35B-A3B).

| Model Type | Model ID on [ModelScope](https://modelscope.cn) | Model Size | Requires | Support Megatron | HF Model ID |
|---------------------|-----------------------------------------------------------------------------------------------------------------|:---------------------------------------:|----------------------|:----------------:|:---------------------------------------------------------------------------------------------------------:|
twinkle.initialize(mode='ray', groups=device_group, global_device_mesh=device_me

def train():
# to load model from Hugging Face, use 'hf://...'
base_model = 'ms://Qwen/Qwen3.6-35B-A3B'
# 1000 samples
dataset = Dataset(dataset_meta=DatasetMeta('ms://swift/self-cognition', data_slice=range(1000)))
# Set template to prepare encoding
from twinkle.dataset import Dataset, DatasetMeta
from twinkle.preprocessor import SelfCognitionProcessor
from twinkle.server.common import input_feature_to_datum

base_model = 'ms://Qwen/Qwen3.6-35B-A3B'
base_url='your-base-url'
api_key='your-api-key'

<h1 align="center">Twinkle: Training workbench to make your model glow</h1>

<p align="center">
<img src="assets/slogan.png" width="200"/>
</p>
<p align="center">
<a href="https://modelscope.cn/home">ModelScope</a>
<br>
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

This script downloads or reuses conda to create a virtual environment named `twinkle-client`, which can be used directly for remote training.

If you need to install the Megatron-related dependencies, you can use the following script:

```shell
sh INSTALL_MEGATRON.sh
```

## Tutorials

| Training Type                          | Model Framework       | Cookbook Path                                         |
| -------------------------------------- | --------------------- | ----------------------------------------------------- |
| FSDP finetuning                        | transformers          | [Script](cookbook/transformers/fsdp2.py)              |
| FSDP MoE finetuning                    | transformers          | [Script](cookbook/transformers/fsdp2_moe.py)          |
| EP FSDP MoE finetuning                 | transformers          | [Script](cookbook/transformers/ep_fsdp_qwen3_moe.py)  |
| SP FSDP finetuning                     | transformers          | [Script](cookbook/transformers/sp_fsdp_dense.py)      |
| pp/tp/cp finetuning                    | megatron              | [Script](cookbook/megatron/tp.py)                     |
| pp/tp/cp MoE finetuning                | megatron              | [Script](cookbook/megatron/tp_moe.py)                 |
| Multimodal FSDP finetuning             | transformers          | [Script](cookbook/mm/fsdp2.py)                        |
| GRPO RL training                       | megatron              | [Script](cookbook/rl/grpo.py)                         |
| GRPO Multimodal RL training            | megatron              | [Script](cookbook/rl/grpo_mm.py)                      |
| GRPO Math RL training                  | megatron              | [Script](cookbook/rl/short_math_grpo.py)              |
| DPO full-parameter training            | transformers          | [Script](cookbook/rl/dpo_full.py)                     |
| DPO LoRA training                      | transformers          | [Script](cookbook/rl/dpo_lora.py)                     |
| DPO multi-LoRA training                | transformers          | [Script](cookbook/rl/dpo_multi_lora.py)               |
| GKD on-policy distillation             | megatron              | [Script](cookbook/rl/gkd_on_policy.py)                |
| GKD off-policy distillation            | megatron              | [Script](cookbook/rl/gkd_off_policy.py)               |
| Tinker client finetuning (self-host)   | transformers          | [Script](cookbook/client/tinker/self_host)            |
| Tinker client finetuning (ModelScope)  | transformers          | [Script](cookbook/client/tinker/modelscope)           |
| Twinkle client finetuning (self-host)  | transformers          | [Script](cookbook/client/twinkle/self_host)           |
| Twinkle client finetuning (ModelScope) | transformers          | [Script](cookbook/client/twinkle/modelscope)          |
| Server startup scripts                 | transformers/megatron | [Script](cookbook/client/server)                      |

Twinkle✨ supports running the same algorithm interfaces across single-GPU, multi-node torchrun, Ray, and client scenarios. The algorithm logic is exposed, which makes it easy to modify and debug. For a full introduction to the framework, see [Quick Start](docs/source_zh/使用指引/快速开始.md).

## Changelog
🎉2026-04-16 The ModelScope training service has been deployed with [Qwen/Qwen3.6-35B-A3B](https://www.modelscope.cn/models/Qwen/Qwen3.6-35B-A3B) as part of the 0.2.0 release.
🎉2026-03-28 Support DPO training with both the Transformers and Megatron backends. See [dpo_full.py](cookbook/rl/dpo_full.py) and [dpo_lora.py](cookbook/rl/dpo_lora.py).
🎉2026-03-24 The Twinkle website is now live at https://modelscope.github.io/twinkle-web/
🎉2026-03-19 Support GKD distillation; see this [cookbook](cookbook/rl/gkd_on_policy.py).
🎉2026-02-13 Initial version of Twinkle✨ released, supporting SFT/PT/RL training for text models. We also provide serverless training on ModelScope through Tinker-compatible APIs.

## Training as a Service on ModelScope

We are rolling out a training service built atop Twinkle✨ on ModelScope. You can train via the API endpoint `base_url=https://www.modelscope.cn/twinkle`. For more details, please refer to our [documentation](docs/source_zh/使用指引/训练服务.md).

## Supported Hardware

As new models are released, we will add support for more of them. The table below lists the models currently supported by the Twinkle✨ framework.

>[!Note]
> The serverless training service accessed via `base_url=https://www.modelscope.cn/twinkle` is currently provided through Tinker-compatible APIs. We will be rolling out services that support both the Tinker APIs and the full-fledged Twinkle✨ native APIs. The serverless endpoint is backed by one training base at a time, currently [Qwen3.6-35B-A3B](https://modelscope.cn/models/Qwen/Qwen3.6-35B-A3B).

| Model Type          | Example Model ID on [ModelScope](https://modelscope.cn)                                                          | Model Size                              | Requires             | Support Megatron | HF Model ID                                                                                               |
|---------------------|-----------------------------------------------------------------------------------------------------------------|:---------------------------------------:|----------------------|:----------------:|:---------------------------------------------------------------------------------------------------------:|
twinkle.initialize(mode='ray', groups=device_group, global_device_mesh=device_me

def train():
# to load model from Hugging Face, use 'hf://...'
base_model = 'ms://Qwen/Qwen3.6-35B-A3B'
# 1000 samples
dataset = Dataset(dataset_meta=DatasetMeta('ms://swift/self-cognition', data_slice=range(1000)))
# Set template to prepare encoding
from twinkle.dataset import Dataset, DatasetMeta
from twinkle.preprocessor import SelfCognitionProcessor
from twinkle.server.common import input_feature_to_datum

base_model = 'ms://Qwen/Qwen3.6-35B-A3B'
base_url='your-base-url'
api_key='your-api-key'
