From 3ea856c25ecb09b598c65128b1fee5feade57caf Mon Sep 17 00:00:00 2001
From: Shenghai Yuan <140951558+SHYuanBest@users.noreply.github.com>
Date: Thu, 5 Mar 2026 21:12:10 +0800
Subject: [PATCH 1/4] Fix Helios paper link in documentation

Updated the link to the Helios paper for accuracy.
---
 docs/source/en/api/pipelines/helios.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/en/api/pipelines/helios.md b/docs/source/en/api/pipelines/helios.md
index 81559b24c071..54a08240001c 100644
--- a/docs/source/en/api/pipelines/helios.md
+++ b/docs/source/en/api/pipelines/helios.md
@@ -22,7 +22,7 @@
 # Helios
-[Helios: Real Real-Time Long Video Generation Model](https://huggingface.co/papers/) from Peking University & ByteDance & etc, by Shenghai Yuan, Yuanyang Yin, Zongjian Li, Xinwei Huang, Xiao Yang, Li Yuan.
+[Helios: Real Real-Time Long Video Generation Model](https://huggingface.co/papers/2603.04379) from Peking University & ByteDance & etc, by Shenghai Yuan, Yuanyang Yin, Zongjian Li, Xinwei Huang, Xiao Yang, Li Yuan.
 * We introduce Helios, the first 14B video generation model that runs at 17 FPS on a single NVIDIA H100 GPU and supports minute-scale generation while matching a strong baseline in quality. We make breakthroughs along three key dimensions: (1) robustness to long-video drifting without commonly used anti-drift heuristics such as self-forcing, error banks, or keyframe sampling; (2) real-time generation without standard acceleration techniques such as KV-cache, causal masking, or sparse attention; and (3) training without parallelism or sharding frameworks, enabling image-diffusion-scale batch sizes while fitting up to four 14B models within 80 GB of GPU memory. Specifically, Helios is a 14B autoregressive diffusion model with a unified input representation that natively supports T2V, I2V, and V2V tasks.
 To mitigate drifting in long-video generation, we characterize its typical failure modes and propose simple yet effective training strategies that explicitly simulate drifting during training, while eliminating repetitive motion at its source. For efficiency, we heavily compress the historical and noisy context and reduce the number of sampling steps, yielding computational costs comparable to—or lower than—those of 1.3B video generative models. Moreover, we introduce infrastructure-level optimizations that accelerate both inference and training while reducing memory consumption. Extensive experiments demonstrate that Helios consistently outperforms prior methods on both short- and long-video generation. All the code and models are available at [this https URL](https://pku-yuangroup.github.io/Helios-Page).

From e46d2e97f90b15cffacb3c1b6e72c9f13ff30ef3 Mon Sep 17 00:00:00 2001
From: Shenghai Yuan <140951558+SHYuanBest@users.noreply.github.com>
Date: Thu, 5 Mar 2026 21:12:44 +0800
Subject: [PATCH 2/4] Fix reference link in HeliosTransformer3DModel documentation

Updated the reference link for the Helios Transformer model paper.
---
 docs/source/en/api/models/helios_transformer3d.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/en/api/models/helios_transformer3d.md b/docs/source/en/api/models/helios_transformer3d.md
index 5aa2826c32ec..302b91d6c829 100644
--- a/docs/source/en/api/models/helios_transformer3d.md
+++ b/docs/source/en/api/models/helios_transformer3d.md
@@ -11,7 +11,7 @@ specific language governing permissions and limitations under the License. -->
 # HeliosTransformer3DModel
-A 14B Real-Time Autogressive Diffusion Transformer model (support T2V, I2V and V2V) for 3D video-like data from [Helios](https://github.com/PKU-YuanGroup/Helios) was introduced in [Helios: Real Real-Time Long Video Generation Model](https://huggingface.co/papers/) by Peking University & ByteDance & etc.
+A 14B Real-Time Autogressive Diffusion Transformer model (support T2V, I2V and V2V) for 3D video-like data from [Helios](https://github.com/PKU-YuanGroup/Helios) was introduced in [Helios: Real Real-Time Long Video Generation Model](https://huggingface.co/papers/2603.04379) by Peking University & ByteDance & etc.
 The model can be loaded with the following code snippet.

From 21ab05acf92e6e1f4386f67f4670942e299866f7 Mon Sep 17 00:00:00 2001
From: Shenghai Yuan <140951558+SHYuanBest@users.noreply.github.com>
Date: Thu, 5 Mar 2026 21:13:37 +0800
Subject: [PATCH 3/4] Update Helios research paper link in documentation

---
 docs/source/en/using-diffusers/helios.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/en/using-diffusers/helios.md b/docs/source/en/using-diffusers/helios.md
index 8106f1c568f8..ced7c6298f23 100644
--- a/docs/source/en/using-diffusers/helios.md
+++ b/docs/source/en/using-diffusers/helios.md
@@ -130,4 +130,4 @@ pipe.to("cuda")
 Learn more about Helios with the following resources.
 - Watch [video1](https://www.youtube.com/watch?v=vd_AgHtOUFQ) and [video2](https://www.youtube.com/watch?v=1GeIU2Dn7UY) for a demonstration of Helios's key features.
-- The research paper, [Helios: Real Real-Time Long Video Generation Model](https://huggingface.co/papers/) for more details.
+- The research paper, [Helios: Real Real-Time Long Video Generation Model](https://huggingface.co/papers/2603.04379) for more details.

From 513025b817d314b7279d6dc6b4f3696beda69f2b Mon Sep 17 00:00:00 2001
From: Shenghai Yuan <140951558+SHYuanBest@users.noreply.github.com>
Date: Thu, 5 Mar 2026 21:13:59 +0800
Subject: [PATCH 4/4] Update Helios research paper link in documentation

---
 docs/source/zh/using-diffusers/helios.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/zh/using-diffusers/helios.md b/docs/source/zh/using-diffusers/helios.md
index 5c4faed2ca2a..5f7f067eb781 100644
--- a/docs/source/zh/using-diffusers/helios.md
+++ b/docs/source/zh/using-diffusers/helios.md
@@ -131,4 +131,4 @@ pipe.to("cuda")
 通过以下资源了解有关 Helios 的更多信息:
 - [视频1](https://www.youtube.com/watch?v=vd_AgHtOUFQ)和[视频2](https://www.youtube.com/watch?v=1GeIU2Dn7UY)演示了 Helios 的主要功能;
-- 有关更多详细信息,请参阅研究论文 [Helios: Real Real-Time Long Video Generation Model](https://huggingface.co/papers/)。
+- 有关更多详细信息,请参阅研究论文 [Helios: Real Real-Time Long Video Generation Model](https://huggingface.co/papers/2603.04379)。