A standalone development repo for mlutils (PyTorch training utilities) plus a minimal reference app in project/.
`mlutils` in this repo is synced from `~/.julia/dev/FLARE-dev.py/mlutils` and validated here as a standalone package/workflow.
- `mlutils/`: training framework (trainer, callbacks, utils, schedules, models, EMA)
- `project/`: minimal runnable example app using `mlutils`
- `tests/`: unit + integration tests (CPU and optional GPU smoke tests)
```
mlutils.py/
├── mlutils/
│   ├── trainer.py
│   ├── callbacks.py
│   ├── utils.py
│   ├── schedule.py
│   ├── models.py
│   └── ema.py
├── project/
│   ├── __main__.py
│   ├── callbacks.py
│   ├── utils.py
│   ├── datasets/
│   │   ├── dummy.py
│   │   └── utils.py
│   └── models/
│       └── transformer.py
├── tests/
├── pyproject.toml
└── scripts/
```
This repo uses Python 3.11+ and works well with uv.
```
uv venv
uv sync
source .venv/bin/activate
```

If `.venv` already exists:

```
source .venv/bin/activate
```

Show the available CLI options:

```
python -m project --help
```

Train:

```
python -m project \
    --train true \
    --dataset dummy \
    --epochs 10 \
    --batch_size 64 \
    --exp_name demo_run
```

Restart an existing run:

```
python -m project \
    --restart true \
    --exp_name demo_run
```

Evaluate:

```
python -m project \
    --evaluate true \
    --exp_name demo_run
```

Experiment outputs are written to:

- `out/project/<exp_name>/config.yaml`
- `out/project/<exp_name>/ckptXX/` (periodic checkpoints)
- `out/project/<exp_name>/eval/` (evaluation artifacts)

Common artifacts: `model.pt`, `stats.json`, `model_stats.json`, `losses.png`, `grad_norm.png`, `learning_rate.png`.
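Since each output directory carries its own `stats.json`, a short stdlib-only sketch can enumerate a run's checkpoints and read their stats. This is illustrative, not part of the repo's API, and the schema inside `stats.json` is an assumption:

```python
import json
from pathlib import Path


def list_checkpoints(exp_dir):
    """Return the ckptXX subdirectories of an experiment, oldest first."""
    return sorted(p for p in Path(exp_dir).glob("ckpt*") if p.is_dir())


def read_stats(artifact_dir):
    """Load stats.json from a checkpoint or eval directory (schema assumed)."""
    with (Path(artifact_dir) / "stats.json").open() as f:
        return json.load(f)
```

For example, `read_stats("out/project/demo_run/eval")` would return the evaluation stats as a dict.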
Example output for `--exp_name demo_run`:
```
out/
└── project/
    └── demo_run/
        ├── config.yaml
        ├── model_stats.json
        ├── losses.png
        ├── grad_norm.png
        ├── learning_rate.png
        ├── ckpt00/
        │   ├── model.pt
        │   ├── stats.json
        │   ├── model_stats.json
        │   ├── losses.png
        │   ├── grad_norm.png
        │   └── learning_rate.png
        ├── ckpt01/
        │   └── ...
        ├── ckpt02/
        │   └── ...
        └── eval/
            ├── model.pt
            ├── stats.json
            ├── model_stats.json
            ├── losses.png
            ├── grad_norm.png
            └── learning_rate.png
```
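With periodic `ckptXX/` directories like these, `--restart` presumably resumes from the most recent one. How `project` actually resolves it is not shown in this repo's docs, but a minimal sketch of that lookup could be:

```python
from pathlib import Path


def latest_checkpoint(exp_dir):
    """Return the highest-numbered ckptXX directory, or None if there is none."""
    ckpts = sorted(p for p in Path(exp_dir).glob("ckpt[0-9][0-9]") if p.is_dir())
    return ckpts[-1] if ckpts else None
```

Zero-padded names (`ckpt00`, `ckpt01`, ...) make plain lexicographic sorting equivalent to numeric sorting, which is why the layout above sorts cleanly.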
`project/datasets/dummy.py` creates a synthetic regression dataset if files are missing:
- Inputs: fixed 2D grid points, shape `[N, 1024, 2]`
- Targets: `sin(pi*x) * sin(pi*y)`, shape `[N, 1024, 1]`
`load_dataset("dummy", ...)` returns an 80/20 train/test split plus metadata (`in_dim=2`, `out_dim=1`, normalizers).
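The shapes above imply a 32×32 grid flattened to 1024 points per sample. A pure-Python sketch of how such a sample could be generated (the grid range `[0, 1]` is an assumption; the actual `dummy.py` may differ, e.g. in range or normalization):

```python
import math


def make_dummy_sample(n_side=32, lo=0.0, hi=1.0):
    """One synthetic sample: n_side**2 grid points with sin(pi*x)*sin(pi*y) targets."""
    step = (hi - lo) / (n_side - 1)
    coords = [lo + i * step for i in range(n_side)]
    inputs, targets = [], []
    for x in coords:
        for y in coords:
            inputs.append([x, y])  # one [x, y] grid point
            targets.append([math.sin(math.pi * x) * math.sin(math.pi * y)])
    return inputs, targets  # shapes [1024, 2] and [1024, 1] when n_side=32
```

Stacking N such samples yields the `[N, 1024, 2]` inputs and `[N, 1024, 1]` targets described above.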
Run all tests:
```
pytest -q
```

Run only integration tests:

```
pytest -q -m integration
```

Notes:
- GPU smoke tests auto-skip when CUDA is unavailable.
- CPU smoke/eval flow tests always run.
This repo includes a skill-oriented sync workflow at:
`~/.codex/skills/flare-mlutils-sync/`
Core scripts:
- `sync_mlutils_from_flare.sh`: sync `mlutils/` from FLARE-dev
- `validate_mlutils_sync.sh`: run test + smoke validation
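In spirit, the sync step mirrors `mlutils/` from the FLARE-dev checkout into this repo. A hedged Python equivalent of that mirror copy (the real script's behavior and exclude list are not documented here, and these paths are illustrative):

```python
import shutil
from pathlib import Path


def sync_mlutils(src, dst):
    """Mirror src into dst, replacing dst wholesale (akin to rsync --delete)."""
    src, dst = Path(src), Path(dst)
    if dst.exists():
        shutil.rmtree(dst)  # drop any stale copy before mirroring
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns("__pycache__", "*.pyc"))
```

Replacing the destination wholesale keeps the synced copy free of files that were deleted upstream, which is the property a sync-then-validate workflow relies on.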
MIT (see LICENSE).