
<p align="center">
<img src="assets/slogan.png" width="200"/>
</p>
<p align="center">
by <a href="https://modelscope.cn/home">ModelScope</a>
<br>
with `torchrun`, or scaling training across Ray clusters,
Twinkle✨ eliminates infrastructure friction by encapsulating
training logic into standardized APIs. Beyond simple
abstraction, Twinkle✨ serves as a robust backend and gateway to enable serverless Training-as-a-Service (TaaS).
It offers interfaces that constitute a _superset_ of the [Tinker](https://thinkingmachines.ai/tinker/) APIs,
making it possible to access a Twinkle✨ training service either through a Tinker client or through the
native Twinkle✨ client, which offers more functionality.

🧩 <b>Decoupled Architecture</b>: Standardized Interfaces, backward compatible with Tinker APIs.<br>
🚀 <b>Multiple Runtime Modes</b>: torchrun / Ray / HTTP.<br>
🔌 <b>Versatile Backends</b>: Transformers / Megatron.<br>
👥 <b>Multi-Tenancy Training Service</b>: Train multiple LoRAs that share one base model deployment.<br>
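
The multi-tenancy idea can be sketched in plain Python. Everything below is illustrative (none of these classes are Twinkle APIs): one expensive, frozen base deployment is shared, while each tenant trains only its own tiny adapter, the way multiple LoRAs share one base model.

```python
# Illustrative sketch of multi-tenant adapter training: one shared base,
# many per-tenant adapters. Class and method names are hypothetical.

class BaseModel:
    """Stands in for one expensive base-model deployment (frozen weights)."""
    def __init__(self, name):
        self.name = name

    def forward(self, x, adapter=None):
        # Base computation, optionally modulated by a tenant's adapter.
        y = 2 * x
        if adapter is not None:
            y += adapter["delta"] * x
        return y

class MultiTenantTrainer:
    """Many tenants share `base`; each trains only its own adapter."""
    def __init__(self, base):
        self.base = base
        self.adapters = {}

    def create_adapter(self, tenant_id):
        self.adapters[tenant_id] = {"delta": 0.0}
        return tenant_id

    def train_step(self, tenant_id, x, target, lr=0.1):
        a = self.adapters[tenant_id]
        pred = self.base.forward(x, a)
        grad = 2 * (pred - target) * x  # d(squared error)/d(delta)
        a["delta"] -= lr * grad         # only the tenant's adapter moves
        return (pred - target) ** 2

base = BaseModel("shared-base")
trainer = MultiTenantTrainer(base)
t1 = trainer.create_adapter("tenant-1")
t2 = trainer.create_adapter("tenant-2")
for _ in range(50):
    trainer.train_step(t1, x=1.0, target=3.0)  # tenant-1 wants y = 3x
    trainer.train_step(t2, x=1.0, target=1.0)  # tenant-2 wants y = 1x
```

The base weights never change; isolation between tenants falls out of each gradient step touching only that tenant's adapter state.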

Note: Twinkle✨ is built by the team behind [ms-swift](https://github.com/modelscope/ms-swift), and
we expect the two projects to evolve together, with some fundamental components of Twinkle✨ likely to
be reused in [ms-swift](https://github.com/modelscope/ms-swift).

sh INSTALL_MEGATRON.sh

## Tutorials

| Training Type | Model Framework | Cookbook Path |
| ------------------------------------ | --------------- | ----------------------------------------------------- |
| FSDP finetuning | transformers | [Script](cookbook/transformers/fsdp2.py) |
| FSDP MoE finetuning | transformers | [Script](cookbook/transformers/fsdp2_moe.py) |
| EP FSDP MoE finetuning | transformers | [Script](cookbook/transformers/ep_fsdp_qwen3_moe.py) |
| SP FSDP finetuning | transformers | [Script](cookbook/transformers/sp_fsdp_dense.py) |
| pp/tp/cp finetuning | megatron | [Script](cookbook/megatron/tp.py) |
| pp/tp/cp MoE finetuning | megatron | [Script](cookbook/megatron/tp_moe.py) |
| Multimodal FSDP finetuning | transformers | [Script](cookbook/mm/fsdp2.py) |
| GRPO RL training | megatron | [Script](cookbook/rl/grpo.py) |
| GRPO Multimodal RL training | megatron | [Script](cookbook/rl/grpo_mm.py) |
| GRPO Math RL training | megatron | [Script](cookbook/rl/short_math_grpo.py) |
| DPO full-parameter training | transformers | [Script](cookbook/rl/dpo_full.py) |
| DPO LoRA training | transformers | [Script](cookbook/rl/dpo_lora.py) |
| DPO multi-LoRA training | transformers | [Script](cookbook/rl/dpo_multi_lora.py) |
| GKD on-policy distillation | megatron | [Script](cookbook/rl/gkd_on_policy.py) |
| GKD off-policy distillation | megatron | [Script](cookbook/rl/gkd_off_policy.py) |
| Tinker client finetuning (self-host) | transformers | [Script](cookbook/client/tinker/self_host) |
| Tinker client finetuning (ModelScope) | transformers | [Script](cookbook/client/tinker/modelscope) |
| Twinkle client finetuning (self-host) | transformers | [Script](cookbook/client/twinkle/self_host) |
| Twinkle client finetuning (ModelScope) | transformers | [Script](cookbook/client/twinkle/modelscope) |
| Server startup scripts | transformers/megatron | [Script](cookbook/client/server) |

## Changelog
- 🎉2026-04-14 The ModelScope service has been deployed to [Qwen/Qwen3.6-35B-A3B](https://www.modelscope.cn/models/Qwen/Qwen3.6-35B-A3B) as part of the new 0.2.0 release.
- 🎉2026-03-28 Support DPO training with both Transformers and Megatron backends. See [dpo_full.py](cookbook/rl/dpo_full.py) and [dpo_lora.py](cookbook/rl/dpo_lora.py).
- 🎉2026-03-24 The Twinkle website is now live at https://modelscope.github.io/twinkle-web/
- 🎉2026-03-19 Support GKD training; see this [cookbook](cookbook/rl/gkd_on_policy.py).
- 🎉2026-02-13 Initial version of Twinkle✨ released, including SFT/PT/RL support for text models.

## Training as a Service on ModelScope

We are rolling out a training service built atop Twinkle✨ on ModelScope. You may
train via the API endpoint `base_url=https://www.modelscope.cn/twinkle`. For more details, please refer to
our [documentation](docs/source_en/Usage%20Guide/Train-as-a-Service.md).
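
As a rough sketch of what "pointing a client at the service" amounts to: only the base URL and an API key are service-specific. The route name and header scheme below are assumptions for illustration, not documented Twinkle behavior.

```python
from urllib.parse import urljoin

# Service-specific settings: the endpoint from the docs and your own key.
base_url = "https://www.modelscope.cn/twinkle"
api_key = "your-api-key"  # issued by ModelScope; placeholder here

# A Tinker-compatible client would append its API routes to base_url.
# "v1/training_runs" is a hypothetical route, shown only for illustration.
endpoint = urljoin(base_url + "/", "v1/training_runs")
headers = {"Authorization": f"Bearer {api_key}"}  # typical bearer-token scheme
```

Any client that accepts a `base_url` and `api_key` pair can be wired up this way, whether it speaks the Tinker-compatible APIs or the native Twinkle✨ ones.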

## Supported Hardware
| Hardware Environment | Notes |
| -------------------- | ---------------------------------------------------------------- |
| Nvidia GPUs | ✅ Support for BF16/Flash-Attn may be incomplete in earlier GPUs |
| Ascend NPU | ✅ Some operators may not be supported |
| PPU | ✅ |
| CPU                  | Partial support: components such as dataset and dataloader       |

supported on the Twinkle✨ framework.
> For serverless training service accessed via `base_url=https://www.modelscope.cn/twinkle`, it
> is currently provided via the Tinker-compatible APIs. We will be rolling out services that support
> both the Tinker APIs and the full-fledged Twinkle✨ native APIs. The serverless endpoint is backed
> by one training base at a time, and currently it is [Qwen3.6-35B-A3B](https://modelscope.cn/models/Qwen/Qwen3.6-35B-A3B).

| Model Type | Model ID on [ModelScope](https://modelscope.cn) | Model Size | Requires | Support Megatron | HF Model ID |
|---------------------|-----------------------------------------------------------------------------------------------------------------|:---------------------------------------:|----------------------|:----------------:|:---------------------------------------------------------------------------------------------------------:|
twinkle.initialize(mode='ray', groups=device_group, global_device_mesh=device_me

def train():
# to load model from Hugging Face, use 'hf://...'
base_model = 'ms://Qwen/Qwen3.6-35B-A3B'
# 1000 samples
dataset = Dataset(dataset_meta=DatasetMeta('ms://swift/self-cognition', data_slice=range(1000)))
# Set template to prepare encoding
from twinkle.dataset import Dataset, DatasetMeta
from twinkle.preprocessor import SelfCognitionProcessor
from twinkle.server.common import input_feature_to_datum

base_model = 'ms://Qwen/Qwen3.6-35B-A3B'
base_url='your-base-url'
api_key='your-api-key'

<h1 align="center">Twinkle: Training workbench to make your model glow</h1>

<p align="center">
<img src="assets/slogan.png" width="200"/>
</p>
<p align="center">
<a href="https://modelscope.cn/home">ModelScope</a>
<br>
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

This script downloads or reuses conda to create a virtual environment named `twinkle-client`, which can be used directly for remote training.

If you need to install the Megatron-related dependencies, you can use the following script:

```shell
sh INSTALL_MEGATRON.sh
```

## Tutorials

| Training Type                          | Model Framework       | Cookbook Path                                         |
| -------------------------------------- | --------------------- | ----------------------------------------------------- |
| FSDP finetuning                        | transformers          | [Script](cookbook/transformers/fsdp2.py)              |
| FSDP MoE finetuning                    | transformers          | [Script](cookbook/transformers/fsdp2_moe.py)          |
| EP FSDP MoE finetuning                 | transformers          | [Script](cookbook/transformers/ep_fsdp_qwen3_moe.py)  |
| SP FSDP finetuning                     | transformers          | [Script](cookbook/transformers/sp_fsdp_dense.py)      |
| pp/tp/cp finetuning                    | megatron              | [Script](cookbook/megatron/tp.py)                     |
| pp/tp/cp MoE finetuning                | megatron              | [Script](cookbook/megatron/tp_moe.py)                 |
| Multimodal FSDP finetuning             | transformers          | [Script](cookbook/mm/fsdp2.py)                        |
| GRPO RL training                       | megatron              | [Script](cookbook/rl/grpo.py)                         |
| GRPO Multimodal RL training            | megatron              | [Script](cookbook/rl/grpo_mm.py)                      |
| GRPO Math RL training                  | megatron              | [Script](cookbook/rl/short_math_grpo.py)              |
| DPO full-parameter training            | transformers          | [Script](cookbook/rl/dpo_full.py)                     |
| DPO LoRA training                      | transformers          | [Script](cookbook/rl/dpo_lora.py)                     |
| DPO multi-LoRA training                | transformers          | [Script](cookbook/rl/dpo_multi_lora.py)               |
| GKD on-policy distillation             | megatron              | [Script](cookbook/rl/gkd_on_policy.py)                |
| GKD off-policy distillation            | megatron              | [Script](cookbook/rl/gkd_off_policy.py)               |
| Tinker client finetuning (self-host)   | transformers          | [Script](cookbook/client/tinker/self_host)            |
| Tinker client finetuning (ModelScope)  | transformers          | [Script](cookbook/client/tinker/modelscope)           |
| Twinkle client finetuning (self-host)  | transformers          | [Script](cookbook/client/twinkle/self_host)           |
| Twinkle client finetuning (ModelScope) | transformers          | [Script](cookbook/client/twinkle/modelscope)          |
| Server startup scripts                 | transformers/megatron | [Script](cookbook/client/server)                      |

Twinkle✨ supports running the same algorithm interfaces across single-GPU, multi-node torchrun, Ray, and client scenarios. The algorithm logic is exposed, which makes it easy to modify and debug. For a full introduction to the framework, see [Quick Start](docs/source_zh/使用指引/快速开始.md).

## Changelog
🎉2026-04-16 The ModelScope training service has been deployed with [Qwen/Qwen3.6-35B-A3B](https://www.modelscope.cn/models/Qwen/Qwen3.6-35B-A3B) as part of the 0.2.0 release.
🎉2026-03-28 Support DPO training with both the Transformers and Megatron backends. See [dpo_full.py](cookbook/rl/dpo_full.py) and [dpo_lora.py](cookbook/rl/dpo_lora.py).
🎉2026-03-24 The Twinkle website is now live at https://modelscope.github.io/twinkle-web/
🎉2026-03-19 Support GKD distillation; see this [cookbook](cookbook/rl/gkd_on_policy.py).
🎉2026-02-13 Initial version of Twinkle✨ released, supporting SFT/PT/RL training for text models. We also provide serverless training on ModelScope through Tinker-compatible APIs.

## Training as a Service on ModelScope

We are rolling out a training service built atop Twinkle✨ on ModelScope. You can train via the API endpoint `base_url=https://www.modelscope.cn/twinkle`. For more details, please refer to our [documentation](docs/source_zh/使用指引/训练服务.md).

## Supported Hardware

As new models are released, we will add support for more of them. The table below lists the models currently supported by the Twinkle✨ framework.

>[!Note]
> The serverless training service accessed via `base_url=https://www.modelscope.cn/twinkle` is currently provided through Tinker-compatible APIs. We will be rolling out services that support both the Tinker APIs and the full-fledged Twinkle✨ native APIs. The serverless endpoint is backed by one training base at a time, currently [Qwen3.6-35B-A3B](https://modelscope.cn/models/Qwen/Qwen3.6-35B-A3B).

| Model Type          | Example Model ID on [ModelScope](https://modelscope.cn)                                                          | Model Size                              | Requires             | Support Megatron | HF Model ID                                                                                               |
|---------------------|-----------------------------------------------------------------------------------------------------------------|:---------------------------------------:|----------------------|:----------------:|:---------------------------------------------------------------------------------------------------------:|
twinkle.initialize(mode='ray', groups=device_group, global_device_mesh=device_me

def train():
# to load model from Hugging Face, use 'hf://...'
base_model = 'ms://Qwen/Qwen3.6-35B-A3B'
# 1000 samples
dataset = Dataset(dataset_meta=DatasetMeta('ms://swift/self-cognition', data_slice=range(1000)))
# Set template to prepare encoding
from twinkle.dataset import Dataset, DatasetMeta
from twinkle.preprocessor import SelfCognitionProcessor
from twinkle.server.common import input_feature_to_datum

base_model = 'ms://Qwen/Qwen3.6-35B-A3B'
base_url='your-base-url'
api_key='your-api-key'
