53 changes: 51 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -39,7 +39,7 @@ As much as we can, we follow the [CleanRL](https://github.com/vwxyzjn/cleanrl) p
[pixel_cnn_pp_2d](engiopt/pixel_cnn_pp_2d) | Inverse Design | 2D | ✅ | PixelCNN++ Autoregressive Model

## Dashboards
The integration with WandB allows us to access live dashboards of our runs (on the cluster or not). We also upload the trained models there. You can access some of our runs at https://wandb.ai/engibench/engiopt.
The integration with WandB allows us to access live dashboards of our runs (on the cluster or not). New checkpoint packages are stored on the Hugging Face Hub by default, while WandB keeps experiment tracking, metadata, and links back to the canonical checkpoint location. Historical WandB model artifacts remain supported for backward compatibility. You can access some of our runs at https://wandb.ai/engibench/engiopt.
<img src="imgs/wandb_dashboard.png" alt="WandB dashboards"/>


@@ -66,6 +66,11 @@ First, if you want to use weights and biases, you need to set the `WANDB_API_KEY
wandb login
```

If you want to save or load checkpoints from Hugging Face Hub, make sure your environment is authenticated there as well:
```
huggingface-cli login
```
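If you prefer to verify authentication programmatically before launching a long run, here is a minimal stdlib-only sketch. It assumes the default token location written by `huggingface-cli login` (`$HF_HOME/token`, with `HF_HOME` falling back to `~/.cache/huggingface`) and the standard `HF_TOKEN` environment override; the helper name is ours, not part of EngiOpt:

```python
import os
from pathlib import Path


def hf_token_present() -> bool:
    """Best-effort check that a Hugging Face token is available locally."""
    if os.environ.get("HF_TOKEN"):  # explicit token via environment variable
        return True
    # Default token file written by `huggingface-cli login`
    hf_home = Path(os.environ.get("HF_HOME", Path.home() / ".cache" / "huggingface"))
    return (hf_home / "token").is_file()
```

This avoids importing `huggingface_hub` just to check credentials, but it is only a heuristic; an expired token will still pass.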

### Inverse design
Usually, we provide two scripts per algorithm: one to train the model, and one to evaluate it.

@@ -77,6 +82,20 @@ python engiopt/cgan_cnn_2d/cgan_cnn_2d.py --problem-id "beams2d" --track --wandb

This runs a 2D CGAN with a CNN-based model on the beams2d problem. `--track` tracks the run on WandB, `--wandb-entity None` uses the default WandB entity, `--save-model` saves the trained model, `--n-epochs 200` trains for 200 epochs, and `--seed 1` sets the random seed.

When `--save-model` is set, the trained model is now stored as a self-contained checkpoint package on the Hugging Face Hub. The default backend is:
```
--checkpoint-backend hf
```
You can still force legacy or hybrid behavior when needed:
```
--checkpoint-backend wandb
--checkpoint-backend both
--checkpoint-backend none
```
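These backend values boil down to two independent decisions: whether to upload to HF and whether to upload to W&B. A small illustrative sketch of that mapping — `CheckpointBackend` is a real type in `engiopt.checkpoint_store`, but the two helper functions here are hypothetical:

```python
from typing import Literal

# Mirrors the CLI values accepted by --checkpoint-backend
CheckpointBackend = Literal["hf", "wandb", "both", "none"]


def uploads_to_hf(backend: str) -> bool:
    """Whether this backend choice stores a package on the Hugging Face Hub."""
    return backend in ("hf", "both")


def uploads_to_wandb(backend: str) -> bool:
    """Whether this backend choice logs legacy W&B model artifacts."""
    return backend in ("wandb", "both")
```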

All HF-backed checkpoint packages contain the model files together with `run_config.json` and `metadata.json`, so evaluation does not depend on live WandB run config state.
When W&B tracking is active, the HF package metadata also records the originating W&B run identity. The W&B run summary, in turn, records the HF repo, the seed-based convenience path, the exact uploaded HF revision, and an immutable run-specific HF package path.

For reproducible debugging runs, you can additionally enable strict deterministic mode:
```
python engiopt/cgan_cnn_2d/cgan_cnn_2d.py --problem-id "beams2d" --seed 1 --strict-determinism
@@ -95,7 +114,30 @@ Then you can restore a trained model and evaluate it:
```
python engiopt/cgan_cnn_2d/evaluate_cgan_cnn_2d.py --problem-id "beams2d" --wandb-entity None --seed 1 --n-samples 10
```
This will generate 10 designs from the trained model and run some [metrics](https://github.com/IDEALLab/EngiOpt/blob/main/engiopt/metrics.py) on them. This is what we used to generate the results in the paper. This by default will pull the model from wandb. It is possible to restore a model from a local file but is not currently supported.
This will generate 10 designs from the trained model and run some [metrics](https://github.com/IDEALLab/EngiOpt/blob/main/engiopt/metrics.py) on them. This is what we used to generate the results in the paper.

Evaluation now defaults to:
```
--model-source auto
```
In `auto` mode, EngiOpt tries to resolve checkpoints in this order:
1. Hugging Face package for the model family, problem, and seed
2. Legacy WandB model artifact
3. Explicit local checkpoint package directory if you pass `--local-model-dir`
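The fallback chain above can be sketched as follows. This is illustrative only — the actual resolution logic lives in `engiopt.checkpoint_store.resolve_named_checkpoint`, and the helper name here is ours:

```python
from __future__ import annotations


def resolution_order(model_source: str, local_model_dir: str | None) -> list[str]:
    """Return the backends tried, in order, for a given --model-source value."""
    if model_source != "auto":
        return [model_source]  # explicit source: no fallback
    order = ["hf", "wandb"]
    if local_model_dir is not None:
        order.append("local")
    return order
```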

For new HF-backed runs, EngiOpt maintains both:
- a seed-based convenience path such as `beams2d/seed_1`
- an immutable run-specific path such as `beams2d/seed_1/run_<wandb_run_id>`
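The two package paths follow directly from the problem, seed, and (when tracked) the W&B run id. A sketch, with the path layout taken from the examples above and a hypothetical helper name:

```python
from __future__ import annotations


def hf_package_paths(problem_id: str, seed: int, wandb_run_id: str | None = None) -> list[str]:
    """Seed-based convenience path, plus an immutable run-specific path when known."""
    paths = [f"{problem_id}/seed_{seed}"]
    if wandb_run_id:
        paths.append(f"{problem_id}/seed_{seed}/run_{wandb_run_id}")
    return paths
```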

You can force legacy WandB loading for historical runs:
```
python engiopt/cgan_cnn_2d/evaluate_cgan_cnn_2d.py --problem-id "beams2d" --seed 1 --model-source wandb
```

You can also point evaluation at a local package directory:
```
python engiopt/cgan_cnn_2d/evaluate_cgan_cnn_2d.py --problem-id "beams2d" --seed 1 --model-source local --local-model-dir /path/to/package
```

### Surrogate model

@@ -107,6 +149,13 @@ The current surrogate model comprises several steps:

See this [notebook](https://github.com/IDEALLab/EngiOpt/blob/main/engiopt/surrogate_model/case_study_pe_notebook.ipynb) for an example.

Surrogate-model optimization paths now use the same checkpoint abstraction. For example, the power-electronics optimizer can consume:
* legacy WandB artifact refs
* HF package refs such as `hf://IDEALLab/engiopt-mlp-tabular-only/power_electronics/DcGain/seed_42`
* local checkpoint package directories
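An `hf://` ref encodes the repo id followed by the package path. A hedged sketch of how such a ref might be split — the real parsing lives inside the checkpoint abstraction, and this helper is illustrative only:

```python
def parse_hf_ref(ref: str) -> tuple[str, str]:
    """Split 'hf://<org>/<repo>/<package/path>' into (repo_id, package_path)."""
    prefix = "hf://"
    if not ref.startswith(prefix):
        raise ValueError(f"not an hf:// ref: {ref}")
    org, repo, *parts = ref[len(prefix):].split("/")
    return f"{org}/{repo}", "/".join(parts)
```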

For migration guidance on moving historical checkpoint subsets from WandB to the IDEALLab HF organization later, see [docs/checkpoint_migration_playbook.md](docs/checkpoint_migration_playbook.md).




96 changes: 96 additions & 0 deletions docs/checkpoint_migration_playbook.md
@@ -0,0 +1,96 @@
# Checkpoint Migration Playbook

This playbook describes how to migrate historical EngiOpt checkpoints from Weights & Biases (W&B) to the Hugging Face Hub (HF) without breaking backward compatibility.

This guidance covers only phase 1 of the HF checkpoint backend rollout.

What this phase does not do:
- delete historical W&B artifacts
- mutate historical W&B artifacts in place
- assume every historical run should be migrated

## Goals

1. Stop creating new long-lived checkpoint pressure in W&B by making HF the default backend for new saved models.
2. Keep historical W&B checkpoints loadable while HF-backed packages roll out.
3. Create a safe path to reclaim W&B storage later, after validation.

## Recommended Scope

Start with the official or release subset of checkpoints, not the full historical backlog.

Good early candidates:
- checkpoints referenced in papers, benchmarks, or public notebooks
- checkpoints linked from README examples
- checkpoints used by downstream evaluation scripts or case studies

## Migration Package Format

Each migrated HF checkpoint package should be self-contained and include:
- the original model weight file or files
- `run_config.json`
- `metadata.json`

`metadata.json` should record at least:
- `problem_id`
- `algo`
- `seed`
- original W&B project, run id, and artifact names
- HF repo id and package path
- primary file list
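As a concrete, purely illustrative example, a minimal `metadata.json` for one migrated checkpoint might look like the dictionary below. Field names follow the list above; every value (including the repo id) is made up:

```python
import json

# Hypothetical metadata.json contents for one migrated checkpoint
metadata = {
    "problem_id": "beams2d",
    "algo": "cgan_2d",
    "seed": 1,
    "wandb_project": "engiopt",
    "wandb_run_id": "abc123xy",
    "wandb_artifacts": ["beams2d_cgan_2d_generator", "beams2d_cgan_2d_discriminator"],
    "hf_repo_id": "IDEALLab/engiopt-cgan-2d",
    "hf_package_path": "beams2d/seed_1",
    "primary_files": ["generator.pth"],
}

print(json.dumps(metadata, indent=2))
```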

## Recommended Procedure

1. Inventory the target artifacts.
Create a manifest with artifact name, aliases, run id, problem, algorithm, seed, expected files, and whether the artifact is part of the official/release subset.

2. Freeze the migration manifest before uploads.
Use the manifest as the source of truth so the migration is reproducible and reviewable.

3. Download the original W&B artifacts.
For each target artifact, download the stored files and extract the associated W&B run config needed to rebuild the model outside W&B.

4. Build the HF package.
Upload the weights together with `run_config.json` and `metadata.json` into the per-model-family HF repo under the deterministic package path:
`problem_id[/extra_parts]/seed_<seed>`

5. Record the mapping.
For every migrated checkpoint, store a durable mapping from the W&B artifact alias to the HF repo id, revision, and package path.

6. Validate restores before cleanup.
Test a representative sample end-to-end with EngiOpt’s evaluation or surrogate-model loading path and confirm the HF package reproduces the expected model restore behavior.

7. Add pointers back into W&B metadata.
Once validated, update run metadata, summaries, or notes so the W&B run points to the canonical HF checkpoint location.

8. Keep an overlap period.
Retain both HF and W&B copies long enough to verify that downstream users and scripts are not relying on the old blob storage path.

9. Only then define deletion policy.
Deletion or retention changes should happen in a separate maintenance pass, using the manifest as the authoritative record.
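The deterministic package path from step 4 can be built mechanically. A small sketch (the helper name is hypothetical; the layout matches `problem_id[/extra_parts]/seed_<seed>`):

```python
def package_path(problem_id: str, seed: int, extra_parts: tuple[str, ...] = ()) -> str:
    """Build the deterministic package path: problem_id[/extra_parts]/seed_<seed>."""
    return "/".join([problem_id, *extra_parts, f"seed_{seed}"])
```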

## Validation Checklist

Before any cleanup is considered for a migrated checkpoint:

1. The HF package contains all required weight files.
2. `run_config.json` is present and sufficient to rebuild the model.
3. `metadata.json` is present and points back to the original W&B lineage.
4. A real EngiOpt load path has succeeded against the HF package.
5. The corresponding W&B run contains the HF pointer or mapping information.
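Items 1–3 of the checklist lend themselves to automation. A stdlib-only sketch that checks package completeness on disk — file names follow the package format described above, and the function itself is illustrative, not part of EngiOpt:

```python
from pathlib import Path

# Sidecar files every HF checkpoint package must contain
REQUIRED_SIDECARS = ("run_config.json", "metadata.json")


def package_is_complete(package_dir: str, weight_files: list[str]) -> bool:
    """Check that all required sidecar and weight files exist in the package dir."""
    root = Path(package_dir)
    return all((root / name).is_file() for name in (*REQUIRED_SIDECARS, *weight_files))
```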

## Cleanup Guidance for Later

When the project is ready to reclaim W&B storage:

1. Start with official checkpoints only.
2. Delete in small batches, not all at once.
3. Confirm the manifest entry is complete before each deletion.
4. Re-run a small restore audit after each batch.
5. Keep at least one validated overlap window where both copies coexist.

## Notes

- W&B remains part of the lineage story even after HF becomes the canonical checkpoint host.
- HF should be treated as the long-lived storage backend for public or durable checkpoints.
- Backward compatibility matters more than immediate cleanup.
34 changes: 26 additions & 8 deletions engiopt/cgan_1d/cgan_1d.py
@@ -22,6 +22,8 @@
import tyro
import wandb

from engiopt.checkpoint_store import CheckpointBackend
from engiopt.checkpoint_store import save_checkpoint_package
from engiopt.reproducibility import enable_strict_determinism
from engiopt.reproducibility import make_dataloader_generator
from engiopt.reproducibility import seed_training
@@ -64,6 +66,14 @@ class Args:
"""Wandb project name."""
wandb_entity: str | None = None
"""Wandb entity name."""
checkpoint_backend: CheckpointBackend = "hf"
"""Checkpoint backend for saved model weights."""
hf_entity: str = "IDEALLab"
"""HF org/user where checkpoints are stored."""
hf_repo_prefix: str = "engiopt"
"""HF repo prefix used for model-family repositories."""
hf_private: bool = False
"""Whether newly created HF repos should be private."""
seed: int = 1
"""Random seed."""

@@ -398,13 +408,21 @@ def sample_designs(n_designs: int) -> tuple[th.Tensor, th.Tensor]:

th.save(ckpt_gen, "generator.pth")
th.save(ckpt_disc, "discriminator.pth")
if args.track:
    artifact_gen = wandb.Artifact(f"{args.problem_id}_{args.algo}_generator", type="model")
    artifact_gen.add_file("generator.pth")
    artifact_disc = wandb.Artifact(f"{args.problem_id}_{args.algo}_discriminator", type="model")
    artifact_disc.add_file("discriminator.pth")

    wandb.log_artifact(artifact_gen, aliases=[f"seed_{args.seed}"])
    wandb.log_artifact(artifact_disc, aliases=[f"seed_{args.seed}"])
save_checkpoint_package(
    checkpoint_backend=args.checkpoint_backend,
    hf_entity=args.hf_entity,
    hf_repo_prefix=args.hf_repo_prefix,
    hf_private=args.hf_private,
    problem_id=args.problem_id,
    algo=args.algo,
    seed=args.seed,
    checkpoint_files={"generator.pth": "generator.pth", "discriminator.pth": "discriminator.pth"},
    run_config=vars(args),
    primary_files=["generator.pth"],
    wandb_artifacts={
        f"{args.problem_id}_{args.algo}_generator": "generator.pth",
        f"{args.problem_id}_{args.algo}_discriminator": "discriminator.pth",
    },
)

wandb.finish()
47 changes: 27 additions & 20 deletions engiopt/cgan_1d/evaluate_cgan_1d.py
@@ -15,8 +15,9 @@
from engiopt import metrics
from engiopt.cgan_1d.cgan_1d import Generator
from engiopt.cgan_1d.cgan_1d import prepare_data
from engiopt.checkpoint_store import ModelSource
from engiopt.checkpoint_store import resolve_named_checkpoint
from engiopt.dataset_sample_conditions import sample_conditions
import wandb


@dataclasses.dataclass
@@ -31,6 +32,14 @@ class Args:
"""Wandb project name."""
wandb_entity: str | None = None
"""Wandb entity name."""
model_source: ModelSource = "auto"
"""Where to load the checkpoint package from."""
hf_entity: str = "IDEALLab"
"""HF org/user where checkpoints are stored."""
hf_repo_prefix: str = "engiopt"
"""HF repo prefix used for model-family repositories."""
local_model_dir: str | None = None
"""Optional local checkpoint package directory."""
n_samples: int = 10
"""Number of generated samples per seed."""
sigma: float = 10.0
@@ -74,30 +83,28 @@
)

### Set Up Generator ###
if args.wandb_entity is not None:
    artifact_path = f"{args.wandb_entity}/{args.wandb_project}/{args.problem_id}_cgan_1d_generator:seed_{seed}"
else:
    artifact_path = f"{args.wandb_project}/{args.problem_id}_cgan_1d_generator:seed_{seed}"

api = wandb.Api()
artifact = api.artifact(artifact_path, type="model")

class RunRetrievalError(ValueError):
    def __init__(self):
        super().__init__("Failed to retrieve the run")

run = artifact.logged_by()
if run is None or not hasattr(run, "config"):
    raise RunRetrievalError
resolved = resolve_named_checkpoint(
    model_source=args.model_source,
    problem_id=args.problem_id,
    algo="cgan_1d",
    seed=seed,
    hf_entity=args.hf_entity,
    hf_repo_prefix=args.hf_repo_prefix,
    required_files=["generator.pth"],
    wandb_project=args.wandb_project,
    wandb_entity=args.wandb_entity,
    wandb_artifact_names={"generator.pth": f"{args.problem_id}_cgan_1d_generator"},
    local_model_dir=args.local_model_dir,
)
run_config = resolved.run_config

artifact_dir = artifact.download()
ckpt_path = os.path.join(artifact_dir, "generator.pth")
ckpt_path = resolved.files["generator.pth"]
ckpt = th.load(ckpt_path, map_location=device)

_, conds_normalizer, design_normalizer = prepare_data(problem, device)

model = Generator(
    latent_dim=run.config["latent_dim"],
    latent_dim=run_config["latent_dim"],
    n_conds=len(problem.conditions_keys),
    design_shape=design_shape,
    design_normalizer=design_normalizer,
@@ -107,7 +114,7 @@ def __init__(self):
model.eval()

# Sample noise and generate designs
z = th.randn((args.n_samples, run.config["latent_dim"]), device=device)
z = th.randn((args.n_samples, run_config["latent_dim"]), device=device)
gen_designs = model(z, conditions_tensor)
gen_designs_np = gen_designs.detach().cpu().numpy()
print(gen_designs_np.shape)
34 changes: 26 additions & 8 deletions engiopt/cgan_2d/cgan_2d.py
@@ -19,6 +19,8 @@
import tyro
import wandb

from engiopt.checkpoint_store import CheckpointBackend
from engiopt.checkpoint_store import save_checkpoint_package
from engiopt.reproducibility import enable_strict_determinism
from engiopt.reproducibility import make_dataloader_generator
from engiopt.reproducibility import seed_training
@@ -40,6 +42,14 @@ class Args:
"""Wandb project name."""
wandb_entity: str | None = None
"""Wandb entity name."""
checkpoint_backend: CheckpointBackend = "hf"
"""Checkpoint backend for saved model weights."""
hf_entity: str = "IDEALLab"
"""HF org/user where checkpoints are stored."""
hf_repo_prefix: str = "engiopt"
"""HF repo prefix used for model-family repositories."""
hf_private: bool = False
"""Whether newly created HF repos should be private."""
seed: int = 1
"""Random seed."""

@@ -317,13 +327,21 @@ def sample_designs(n_designs: int) -> tuple[th.Tensor, th.Tensor]:

th.save(ckpt_gen, "generator.pth")
th.save(ckpt_disc, "discriminator.pth")
if args.track:
    artifact_gen = wandb.Artifact(f"{args.problem_id}_{args.algo}_generator", type="model")
    artifact_gen.add_file("generator.pth")
    artifact_disc = wandb.Artifact(f"{args.problem_id}_{args.algo}_discriminator", type="model")
    artifact_disc.add_file("discriminator.pth")

    wandb.log_artifact(artifact_gen, aliases=[f"seed_{args.seed}"])
    wandb.log_artifact(artifact_disc, aliases=[f"seed_{args.seed}"])
save_checkpoint_package(
    checkpoint_backend=args.checkpoint_backend,
    hf_entity=args.hf_entity,
    hf_repo_prefix=args.hf_repo_prefix,
    hf_private=args.hf_private,
    problem_id=args.problem_id,
    algo=args.algo,
    seed=args.seed,
    checkpoint_files={"generator.pth": "generator.pth", "discriminator.pth": "discriminator.pth"},
    run_config=vars(args),
    primary_files=["generator.pth"],
    wandb_artifacts={
        f"{args.problem_id}_{args.algo}_generator": "generator.pth",
        f"{args.problem_id}_{args.algo}_discriminator": "discriminator.pth",
    },
)

wandb.finish()