Machine Learning Engineering Open Book
Updated Dec 8, 2025 - Python
dstack is an open-source control plane for running development, training, and inference jobs on GPUs—across hyperscalers, neoclouds, or on-prem.
An easy-to-use workflow engine supporting WDL, CWL, and a Python API. It is scalable, efficient, and cross-platform (Linux/macOS).
Best practices and guides on how to write distributed PyTorch training code
Lightweight fast function pipeline (DAG) creation in pure Python for scientific (HPC) workflows 🕸️🧪
TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and support for E2E production ML pipelines when you're ready.
Simplify HPC and Batch workloads on Azure
SEML: Slurm Experiment Management Library
A toolset for black-box hyperparameter optimisation.
Slurm-Mail is a drop-in replacement for Slurm's e-mails that gives users much more information about their jobs than the standard Slurm e-mails.
A Slurm dashboard for the terminal.
A toolkit featuring artificial intelligence × ab initio methods for computational chemistry research.
🦠🧬🧑💻📇 Microbial genomes-to-report pipeline