The single entry-point repo for learning MinT (Mind Lab Toolkit), from the first API call to advanced RL training.
Important: All experiments run against an already deployed MinT server. This repo does not start MinT backend services locally. You only need a valid server endpoint and an API key.
| # | Demo | Track | Reward Source | Script |
|---|---|---|---|---|
| 1 | RL-1 Verifiable Math | RL | Deterministic verifier | demos/rl/adapters/verifiable_math.py |
| 2 | RL-2 Preference Chat | RL | Pairwise/judge preference | demos/rl/adapters/preference_chat.py |
| 3 | RL-3 Environment Tool Use | RL | Code execution feedback | demos/rl/adapters/environment_tooluse.py |
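The "Deterministic verifier" reward source for RL-1 can be pictured as an exact-match check on the final answer. The sketch below is illustrative only: the function names and the normalization rules are assumptions, not the code in demos/rl/adapters/verifiable_math.py.

```python
from fractions import Fraction


def normalize_answer(text: str) -> str:
    """Strip whitespace, a trailing period, and thousands separators."""
    return text.strip().rstrip(".").replace(",", "").strip()


def exact_match_reward(prediction: str, reference: str) -> float:
    """Return 1.0 when prediction and reference agree, else 0.0.

    Hypothetical helper: the real adapter may normalize answers differently.
    """
    pred, ref = normalize_answer(prediction), normalize_answer(reference)
    try:
        # Compare numerically when both sides parse, so "0.5" matches "1/2".
        return float(Fraction(pred) == Fraction(ref))
    except (ValueError, ZeroDivisionError):
        # Fall back to plain string equality for non-numeric answers.
        return float(pred == ref)
```

Because the check is deterministic, the same completion always receives the same reward, which is what separates this track from the judge-based reward in RL-2.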

| # | Demo | Track | Description | Status |
|---|---|---|---|---|
| 4 | VLM-1 Vision QA | VLM | Image + question -> grounded answer | Planned (M2) |
| 5 | VLM-2 Vision Instruction | VLM | Image + task -> action/decision | Planned (M2) |
| 6 | Embodied-1 Simulator Agent | Embodied | Simplified env -> action sequences | Planned (M3) |
Requirements: Python >= 3.11, a MinT API key
pip install git+https://github.com/MindLab-Research/mindlab-toolkit.git python-dotenv matplotlib numpy

Create .env in the repo root:
MINT_API_KEY=sk-mint-your-api-key-here
Use the MinT endpoint that matches your region:
- Mainland China: https://mint-cn.macaron.xin/
- Outside Mainland China: https://mint.macaron.xin/
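A small pre-flight check can catch a missing key or a mistyped region before any job is submitted. The sketch below uses only the standard library. The region keys ("cn", "global") and both function names are assumptions made for illustration, and it expects .env to have been loaded into the environment already (for example with python-dotenv's load_dotenv()).

```python
import os

# Endpoints copied from the list above; the dictionary keys are illustrative.
MINT_ENDPOINTS = {
    "cn": "https://mint-cn.macaron.xin/",
    "global": "https://mint.macaron.xin/",
}


def resolve_endpoint(region: str) -> str:
    """Map a region key to its MinT endpoint, failing loudly on typos."""
    try:
        return MINT_ENDPOINTS[region]
    except KeyError:
        valid = ", ".join(sorted(MINT_ENDPOINTS))
        raise ValueError(f"unknown region {region!r}; expected one of: {valid}") from None


def require_api_key(env=os.environ) -> str:
    """Return MINT_API_KEY from the given mapping or raise a clear error."""
    key = env.get("MINT_API_KEY", "")
    if not key:
        raise RuntimeError("MINT_API_KEY is not set; add it to .env or export it")
    return key
```

Calling `require_api_key()` and `resolve_endpoint("global")` at the top of a script turns a confusing authentication failure into an immediate, readable error.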
Run the quickstart (SFT then RL in one script):
python quickstart/quickstart.py

Or open the interactive notebook:

jupyter notebook quickstart/mint_quickstart.ipynb

Run the RL demos:

python demos/rl/adapters/verifiable_math.py       # RL-1: math with exact-match reward
python demos/rl/adapters/preference_chat.py       # RL-2: chat with helpfulness proxy
python demos/rl/adapters/environment_tooluse.py   # RL-3: code gen with execution reward

All demos are configurable via environment variables. See demos/rl/README.md for details.
If you want a full checkpoint lifecycle:
python advanced/checkpoint.py save --name my-ckpt
python advanced/checkpoint.py download mint://<run-id>/weights/<ckpt-name> -o ./ckpts
python advanced/checkpoint.py upload ./ckpts/<archive>.tar.gz
python advanced/checkpoint.py resume ckpt_<id> --with-optimizer --steps 3

See advanced/README.md for the full command matrix and guardrails (sampler_weights vs weights).
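Since the guardrails hinge on whether a path points at weights or sampler_weights, it can help to validate a checkpoint URI before passing it to the CLI. The parser below is a sketch under two assumptions: the URI follows the mint://<run-id>/weights/<ckpt-name> shape shown in the download command, and sampler_weights appears in the same path position. CheckpointRef and parse_checkpoint_uri are hypothetical names, not part of advanced/checkpoint.py.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Assumed set of path kinds, taken from the "sampler_weights vs weights" note.
KNOWN_KINDS = {"weights", "sampler_weights"}


@dataclass(frozen=True)
class CheckpointRef:
    run_id: str
    kind: str  # "weights" or "sampler_weights"
    name: str


def parse_checkpoint_uri(uri: str) -> CheckpointRef:
    """Split a mint:// checkpoint URI into run id, kind, and checkpoint name."""
    parsed = urlparse(uri)
    if parsed.scheme != "mint" or not parsed.netloc:
        raise ValueError(f"not a mint:// checkpoint URI: {uri!r}")
    parts = parsed.path.strip("/").split("/")
    if len(parts) != 2 or parts[0] not in KNOWN_KINDS or not parts[1]:
        raise ValueError(f"expected mint://<run-id>/<kind>/<ckpt-name>, got {uri!r}")
    return CheckpointRef(run_id=parsed.netloc, kind=parts[0], name=parts[1])
```

A wrapper script could check `ref.kind` and refuse an operation the guardrails disallow; which kind each subcommand accepts is documented in advanced/README.md, not here.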
If you want a focused end-to-end check for session-level Seq-MIS wiring:
python advanced/validate_mis_rollout_correction.py --base-model Qwen/Qwen3-0.6B

See docs/mis_rollout_correction.md for prerequisites, env vars, expected output, and failure modes.
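For orientation, the sketch below shows one common reading of Seq-MIS: sequence-level masked importance sampling, where each sampled sequence gets a single importance ratio between the training policy and the rollout policy, and sequences whose ratio exceeds a threshold are masked out instead of clipped. This reading, the function name, and the threshold value are all assumptions; docs/mis_rollout_correction.md is the authority on what MinT actually computes.

```python
import math


def seq_mis_weights(train_logprobs, rollout_logprobs, upper=2.0):
    """Return one weight per sequence: its importance ratio, or 0.0 if masked.

    train_logprobs / rollout_logprobs: lists of per-token log-probabilities,
    one inner list per sequence, aligned token by token.
    upper: hypothetical masking threshold on the sequence-level ratio.
    """
    weights = []
    for train, rollout in zip(train_logprobs, rollout_logprobs):
        if len(train) != len(rollout):
            raise ValueError("token log-prob lists must have equal length")
        # Sequence-level log ratio is the sum of the per-token log ratios.
        log_ratio = sum(t - r for t, r in zip(train, rollout))
        ratio = math.exp(log_ratio)
        # Masking drops the whole sequence rather than truncating its weight.
        weights.append(ratio if ratio <= upper else 0.0)
    return weights
```

When the two policies agree exactly, every ratio is 1.0 and nothing is masked, which makes identical log-probabilities a convenient sanity check.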
mint-quickstart/
  .env.example                  # Template for API key configuration
  quickstart/
    quickstart.py               # SFT -> RL in one script
    mint_quickstart.ipynb       # Interactive notebook version
  demos/
    rl/                         # 3 RL demos (available)
      rl_core.py                # Shared GRPO training loop
      adapters/
        verifiable_math.py
        preference_chat.py
        environment_tooluse.py
    vlm/                        # 2 VLM demos (coming soon)
    embodied/                   # 1 embodied demo (coming soon)
  advanced/                     # Checkpoint workflows and MIS validation
  docs/
    roadmap.md                  # 6-demo roadmap with status tags
    troubleshooting.md          # Common issues and fixes
    migration-from-minT-demo.md
  experiments/                  # Validation reports for quickstart flows
  mint-skill/                   # AI coding agent migration skill
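The tree lists rl_core.py as a shared GRPO training loop. The central step in the usual GRPO formulation is turning the rewards of a group of completions sampled for the same prompt into group-relative advantages. The sketch below shows that step in isolation; it is a textbook-style illustration, not an excerpt from rl_core.py, and the function name and epsilon value are assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one prompt's group of sampled completions.

    Each advantage is (reward - group mean) / (group std + eps), so
    completions are scored relative to their siblings, not on an absolute scale.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]
```

If every completion in a group earns the same reward, all advantages are zero and the group contributes no gradient signal, which is why reward functions that rarely vary within a group tend to train slowly.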
If you have existing code using import tinker:
pip install tinker

TINKER_BASE_URL=<your-region-endpoint>
TINKER_API_KEY=<your-mint-api-key>
Use the MinT endpoint that matches your region:
- Mainland China: https://mint-cn.macaron.xin/
- Outside Mainland China: https://mint.macaron.xin/
All code works identically with import tinker instead of import mint.
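If a project already keeps MINT_API_KEY in .env, the TINKER_* variables can be derived at startup instead of being maintained twice. The helper below is a sketch of that idea. mirror_mint_env is a hypothetical name, and the source does not state when tinker reads these variables, so the safe assumption is to call the helper before creating any client.

```python
import os


def mirror_mint_env(env, base_url):
    """Copy MinT credentials into the TINKER_* names without overwriting.

    env: a mutable mapping such as os.environ.
    base_url: the MinT endpoint for your region.
    """
    # setdefault keeps any value the user has already exported explicitly.
    env.setdefault("TINKER_BASE_URL", base_url)
    if "MINT_API_KEY" in env:
        env.setdefault("TINKER_API_KEY", env["MINT_API_KEY"])
    return env
```

For example, `mirror_mint_env(os.environ, "https://mint.macaron.xin/")` near the top of an entry script leaves existing TINKER_* settings untouched and fills in only the missing ones.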
- Roadmap — all 6 demos with availability status
- Troubleshooting — common issues and solutions
- Migration Guide — moving from old MinT-demo repo
- RL Demos — detailed docs for the 3 available RL demos
- Advanced — checkpoint workflows and MIS validation entry points
- MIS Rollout Correction — targeted Seq-MIS validation flow and troubleshooting
- Experiment Report — quickstart upload-download-resume validation template/results
- Migration Skill — AI agent skill for migrating from verl/TRL/OpenRLHF
- 中文 README — Chinese version of this document