SwanLab: add id and resume support for resuming runs (fixes #43698) by Rayyan-Oumlil · Pull Request #43739 · huggingface/transformers

Rayyan-Oumlil · 2026-02-04T15:11:22Z

Summary

When using Trainer with SwanLab and resuming training (trainer.train(resume_from_checkpoint=...)), the integration previously had no way to pass id and resume to swanlab.init(), so a new experiment was always created instead of continuing the existing one.

This PR adds support for resuming a previous SwanLab run via environment variables (same pattern as MLflow's MLFLOW_RUN_ID):

SWANLAB_RUN_ID (or SWANLAB_ID): The 21-character SwanLab run ID to resume. Users set this when resuming so the same experiment continues.
SWANLAB_RESUME: Resume mode. Accepts "allow" / True (resume if the run exists, else create a new one), "must" (resume only), or "never" / False (always create a new run).

Changes

In SwanLabCallback.setup(), read SWANLAB_RUN_ID/SWANLAB_ID and SWANLAB_RESUME from the environment and pass them to swanlab.init().
Document the new env vars in the callback docstring.
Parse SWANLAB_RESUME for common values (true/false, allow/must/never) so both string and boolean-like env values work.
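
The changes listed above can be sketched as a standalone function. This is an illustration only, not the code in the PR: `swanlab_init_kwargs_from_env` is a hypothetical helper name, and the precedence of SWANLAB_RUN_ID over SWANLAB_ID is an assumption, since the description does not say which one wins when both are set.

```python
import os
from typing import Any, Dict, Mapping, Optional


def swanlab_init_kwargs_from_env(
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Collect `id` and `resume` keyword arguments from the environment.

    Hypothetical helper for illustration; the PR does this inline in
    SwanLabCallback.setup().
    """
    env = os.environ if environ is None else environ
    kwargs: Dict[str, Any] = {}

    # Assumption: SWANLAB_RUN_ID takes precedence over the SWANLAB_ID alias.
    run_id = env.get("SWANLAB_RUN_ID") or env.get("SWANLAB_ID")
    if run_id:
        kwargs["id"] = run_id

    resume = env.get("SWANLAB_RESUME")
    if resume is not None:
        lowered = resume.lower()
        if lowered in ("true", "1"):
            kwargs["resume"] = True
        elif lowered in ("false", "0"):
            kwargs["resume"] = False
        elif lowered in ("allow", "must", "never"):
            kwargs["resume"] = lowered
        else:
            # Unrecognized values are passed through unchanged.
            kwargs["resume"] = resume
    return kwargs
```

The returned dictionary would then be merged into the arguments passed to swanlab.init(); when neither variable is set it is empty, so existing behavior is unchanged.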

Usage

When resuming training, set the env vars before calling trainer.train(resume_from_checkpoint=...):

import os
os.environ["SWANLAB_RUN_ID"] = "14pk4qbyav4toobziszli"  # from previous run
os.environ["SWANLAB_RESUME"] = "allow"
trainer.train(resume_from_checkpoint="./checkpoint-100")

This keeps metrics and history in a single SwanLab run across restarts.
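
A small wrapper can catch a mistyped run ID before training starts. The sketch below is not part of the PR: `configure_swanlab_resume` is a hypothetical helper, and the length check relies only on the description's statement that run IDs are 21-character strings.

```python
import os
from typing import MutableMapping, Optional

VALID_RESUME_MODES = ("allow", "must", "never")
RUN_ID_LENGTH = 21  # the PR describes SwanLab run IDs as 21-character strings


def configure_swanlab_resume(
    run_id: str,
    mode: str = "allow",
    environ: Optional[MutableMapping[str, str]] = None,
) -> None:
    """Validate the inputs, then set SWANLAB_RUN_ID and SWANLAB_RESUME.

    Hypothetical helper for illustration only.
    """
    env = os.environ if environ is None else environ
    if len(run_id) != RUN_ID_LENGTH:
        raise ValueError(
            f"expected a {RUN_ID_LENGTH}-character run ID, got {len(run_id)} characters"
        )
    if mode not in VALID_RESUME_MODES:
        raise ValueError(f"mode must be one of {VALID_RESUME_MODES}, got {mode!r}")
    env["SWANLAB_RUN_ID"] = run_id
    env["SWANLAB_RESUME"] = mode
```

Calling `configure_swanlab_resume("14pk4qbyav4toobziszli")` before `trainer.train(resume_from_checkpoint=...)` would set the same two variables as the snippet above.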

Copilot

Pull request overview

This PR adds support for resuming SwanLab runs when using Trainer with resume_from_checkpoint, fixing issue #43698. Previously, resuming training always created a new SwanLab experiment because there was no way to pass id and resume parameters to swanlab.init().

Changes:

Added support for SWANLAB_RUN_ID/SWANLAB_ID and SWANLAB_RESUME environment variables
Updated SwanLabCallback.setup() to read these env vars and pass them to swanlab.init()
Added documentation for the new environment variables in the callback's docstring

Copilot · 2026-02-04T15:21:32Z

+        - **SWANLAB_RUN_ID** (`str`, *optional*, defaults to `None`):
+            The SwanLab run ID (21-character string) to resume. When set together with `SWANLAB_RESUME`, enables
+            resuming a previous run so that `trainer.train(resume_from_checkpoint=...)` continues the same
+            experiment instead of creating a new one. The run ID can be found in the experiment's Environment tab
+            or in the URL on the SwanLab dashboard.


The PR description mentions "SWANLAB_ID" as an alternative to "SWANLAB_RUN_ID", but this alias appears only in the implementation code (line 2317), not in the docstring. For consistency and clarity, the docstring should document both environment variable names.

Consider updating the docstring at line 2281 to mention both names, for example:
"SWANLAB_RUN_ID (or SWANLAB_ID) (str, optional, defaults to None):"

This would make it clear to users that either environment variable name can be used.

Copilot · 2026-02-04T15:21:33Z

+            if swanlab_resume is not None:
+                if swanlab_resume.lower() in ("true", "1"):
+                    init_args["resume"] = True
+                elif swanlab_resume.lower() in ("false", "0"):
+                    init_args["resume"] = False
+                elif swanlab_resume.lower() in ("allow", "must", "never"):
+                    init_args["resume"] = swanlab_resume.lower()
+                else:
+                    init_args["resume"] = swanlab_resume


The environment variable parsing for boolean values doesn't follow the established pattern in this codebase. Other integrations in this file use either ENV_VARS_TRUE_VALUES (e.g., MLflowCallback at lines 1300-1301) or utility functions like is_env_variable_true() for consistent parsing.

For consistency with the codebase, consider using the standard approach:

swanlab_resume_env = os.getenv("SWANLAB_RESUME", None)
if swanlab_resume_env is not None:
    resume_upper = swanlab_resume_env.upper()
    if resume_upper in ENV_VARS_TRUE_VALUES or resume_upper == "ALLOW":
        init_args["resume"] = "allow"  # or True, depending on SwanLab's API
    elif resume_upper in ("FALSE", "0", "NEVER"):
        init_args["resume"] = "never"  # or False
    elif resume_upper == "MUST":
        init_args["resume"] = "must"
    else:
        init_args["resume"] = swanlab_resume_env

This approach:

Follows the established pattern (see src/transformers/integrations/integration_utils.py:1300-1301)

Handles case-insensitivity consistently

Avoids potential AttributeError if the value is not a string

Uses the same values that other integrations recognize ("TRUE", "1", "FALSE", "0")
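
The suggested normalization can be tried in isolation. In this sketch `normalize_swanlab_resume` is a hypothetical function name, and `ENV_VARS_TRUE_VALUES` is redefined locally so the snippet runs without transformers installed; its contents are assumed to match the library's constant. Whether swanlab.init() expects the string modes or booleans is left open, as in the suggestion above.

```python
from typing import Optional

# Local stand-in so the example is self-contained; assumed to mirror the
# constant of the same name in transformers.
ENV_VARS_TRUE_VALUES = {"1", "ON", "YES", "TRUE"}


def normalize_swanlab_resume(raw: Optional[str]) -> Optional[str]:
    """Map a raw SWANLAB_RESUME value to "allow", "never", or "must".

    Hypothetical helper; unrecognized values are returned unchanged and
    None means the variable was not set.
    """
    if raw is None:
        return None
    upper = raw.upper()
    if upper in ENV_VARS_TRUE_VALUES or upper == "ALLOW":
        return "allow"
    if upper in ("FALSE", "0", "NEVER"):
        return "never"
    if upper == "MUST":
        return "must"
    return raw
```

Collapsing boolean-like values onto the three string modes gives swanlab.init() a single value type, at the cost of no longer distinguishing "true" from "allow".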

MekkCyber

Thank you @Rayyan-Oumlil! Sorry, this PR is superseded by #43719.

SwanLab: add id and resume support for resuming runs (fixes #43698)

4980702

Copilot AI review requested due to automatic review settings February 4, 2026 15:11

Copilot started reviewing on behalf of Rayyan-Oumlil February 4, 2026 15:11

Copilot AI reviewed Feb 4, 2026

MekkCyber reviewed Feb 9, 2026

Rayyan-Oumlil closed this by deleting the head repository Feb 21, 2026

Rayyan-Oumlil wants to merge 1 commit into huggingface:main from Rayyan-Oumlil:fix/43698-swanlab-id-resume
