diff --git a/README.md b/README.md
index 60c7448..c8ff85f 100644
--- a/README.md
+++ b/README.md
@@ -154,10 +154,13 @@ Serving, reward modeling, and training are fully decoupled. The agent continues
 pip install -e .                        # skills_only mode (lightweight)
 pip install -e ".[rl]"                  # + RL training support (torch, transformers, tinker)
 pip install -e ".[evolve]"              # + skill evolution via OpenAI-compatible LLM
+pip install mindlab-toolkit             # + optional MinT compatibility backend
 pip install -e ".[scheduler]"           # + Google Calendar integration for scheduler
 pip install -e ".[rl,evolve,scheduler]" # recommended for full RL + scheduler setup
 ```
 
+Install `mindlab-toolkit` only if you plan to use MinT as the RL backend (for example, `rl.backend=mint`). MinT docs: [overview](https://mint-doc.macaron.im/), [Tinker compatibility](https://mint-doc.macaron.im/using-the-api/tinker-compatibility), [model lineup](https://mint-doc.macaron.im/using-the-api/model-lineup), [GitHub](https://github.com/MindLab-Research/mindlab-toolkit).
+
 ### 2. Configure
 
 ```bash
@@ -166,6 +169,19 @@ metaclaw setup
 
 The interactive wizard will ask you to choose your LLM provider (Kimi, Qwen, MiniMax, or custom), enter your API key, and optionally enable RL training.
 
+MetaClaw's RL path keeps Tinker as the default reference backend. `rl.backend=auto` is the recommended default: it works with Tinker out of the box and, when `mindlab-toolkit` is installed, can also infer MinT from MinT-style credentials or base URLs. To point the same workflow at MinT explicitly, set:
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.api_key sk-mint-...
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # China mainland; use https://mint.macaron.xin/ otherwise
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
+Use `https://mint-cn.macaron.xin/` for mainland China or `https://mint.macaron.xin/` otherwise.
+
+Legacy aliases `rl.tinker_api_key` and `rl.tinker_base_url` are still accepted for backward compatibility.
+
 ### 3. Start
 
 ```bash
@@ -193,7 +209,9 @@ metaclaw config KEY VALUE       # Set a config value
 
 ```bash
 metaclaw config rl.enabled true           # Enable RL training
-metaclaw config rl.tinker_api_key sk-...  # Set Tinker key
+metaclaw config rl.backend auto           # auto | tinker | mint
+metaclaw config rl.api_key sk-...         # Set RL backend key
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # MinT mainland China endpoint; use https://mint.macaron.xin/ otherwise
 metaclaw config skills.auto_evolve false  # Disable auto skill summarization
 metaclaw config proxy.port 31000          # Change proxy port
 ```
@@ -226,8 +244,12 @@ skills:
 
 rl:
   enabled: false            # set to true to enable RL training
+  backend: auto             # "auto" | "tinker" | "mint"
   model: moonshotai/Kimi-K2.5
-  tinker_api_key: ""
+  api_key: ""
+  base_url: ""              # optional backend endpoint, e.g. https://mint-cn.macaron.xin/ (mainland China) or https://mint.macaron.xin/ for MinT
+  tinker_api_key: ""        # legacy alias for api_key
+  tinker_base_url: ""       # legacy alias for base_url
   prm_url: https://api.openai.com/v1
   prm_model: gpt-5.2
   prm_api_key: ""
@@ -277,22 +299,33 @@ cp -r memory_data/skills/* ~/.metaclaw/skills/
 
 ## 🔬 Advanced: RL Mode
 
-Enable RL training to continuously fine-tune the model from live conversations:
+Enable RL training to continuously fine-tune the model from live conversations. Tinker remains the default reference path, and MetaClaw can also target MinT as a Tinker-compatible alternative:
 
 ```bash
 metaclaw config rl.enabled true
-metaclaw config rl.tinker_api_key sk-...
+metaclaw config rl.backend auto
+metaclaw config rl.api_key sk-...
 metaclaw config rl.prm_url https://api.openai.com/v1
 metaclaw config rl.prm_api_key sk-...
 metaclaw start
 ```
 
+If you want to run the same workflow on MinT, switch the backend and set the MinT endpoint/model:
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # China mainland; use https://mint.macaron.xin/ otherwise
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
 In RL mode:
 - Each conversation turn is tokenized and submitted as a training sample
 - A judge LLM (PRM) scores responses asynchronously
-- Tinker cloud runs LoRA fine-tuning; updated weights are hot-swapped every `batch_size` samples
+- The RL backend runs LoRA fine-tuning (Tinker cloud by default, or a compatible alternative such as MinT); updated weights are hot-swapped every `batch_size` samples
 - A dedicated evolver LLM extracts new skills from failed episodes
 
+If you stay on Tinker, `rl.backend=auto` or `rl.backend=tinker` keeps the default path. If you use MinT, switch to `rl.backend=mint` and provide the MinT endpoint.
+
 **Programmatic rollout** (no OpenClaw TUI needed): set `openclaw_env_data_dir` to a directory of JSONL task files:
 
 ```json
@@ -356,14 +389,17 @@ Each `ConversationSample` is tagged with a `skill_generation` version. When skil
 
 ## 🙏 Acknowledgements
 
-MetaClaw builds on top of the following open-source projects:
+MetaClaw builds on top of the following open-source projects and collaborations:
 
 - [OpenClaw](https://openclaw.ai) – the core agent framework.
 - [SkillRL](https://github.com/aiming-lab/SkillRL) – our skill-augmented RL framework.
-- [Tinker](https://www.thinkingmachines.ai/tinker/) – used for online RL training.
+- [Tinker](https://www.thinkingmachines.ai/tinker/) – the primary reference backend for online RL training in MetaClaw.
+- [MinT](https://mint-doc.macaron.im/) – a Tinker-compatible alternative from [Mind Lab](https://macaron.im/mindlab), available via [`mindlab-toolkit`](https://github.com/MindLab-Research/mindlab-toolkit). Its currently supported models are listed on the [official model lineup page](https://mint-doc.macaron.im/using-the-api/model-lineup).
 - [OpenClaw-RL](https://github.com/Gen-Verse/OpenClaw-RL) – inspiration for our RL design.
 - [awesome-openclaw-skills](https://github.com/VoltAgent/awesome-openclaw-skills) – provides the foundation for our skill bank.
 
+The MinT compatibility work in this repository is one result of a collaboration between the MetaClaw project team and [Mind Lab](https://macaron.im/mindlab). In that collaboration, Mind Lab has focused on infrastructure research and LoRA RL algorithm optimization in support of the broader MetaClaw effort.
+
 ---
 
 ## 📄 License
diff --git a/assets/README_DE.md b/assets/README_DE.md
index b68e8aa..ce8297f 100644
--- a/assets/README_DE.md
+++ b/assets/README_DE.md
@@ -156,10 +156,13 @@ Serving, Reward Modeling und Training sind vollständig entkoppelt. Der Agent an
 pip install -e .                        # skills_only-Modus (leichtgewichtig)
 pip install -e ".[rl]"                  # + RL-Trainingsunterstützung (torch, transformers, tinker)
 pip install -e ".[evolve]"              # + Skill-Evolution via OpenAI-kompatibler LLM
+pip install mindlab-toolkit             # + optionales MinT-Kompatibilitäts-Backend
 pip install -e ".[scheduler]"           # + Google Calendar Integration für Scheduler
 pip install -e ".[rl,evolve,scheduler]" # empfohlen: vollständiges RL + Scheduler-Setup
 ```
 
+`mindlab-toolkit` wird nur benötigt, wenn du `rl.backend=mint` verwenden willst. MinT-Links: [Überblick](https://mint-doc.macaron.im/), [Tinker-Kompatibilität](https://mint-doc.macaron.im/using-the-api/tinker-compatibility), [Modellliste](https://mint-doc.macaron.im/using-the-api/model-lineup), [GitHub](https://github.com/MindLab-Research/mindlab-toolkit).
+
 ### 2. Konfiguration
 
 ```bash
@@ -168,6 +171,19 @@ metaclaw setup
 
 Der interaktive Assistent führt dich durch die Auswahl des LLM-Anbieters (Kimi, Qwen, MiniMax oder benutzerdefiniert), API-Schlüssel und optionale RL-Aktivierung.
 
+Der RL-Pfad von MetaClaw behält Tinker als Standard-Referenzbackend. `rl.backend=auto` ist die empfohlene Voreinstellung und kann bei installiertem MinT-Kompatibilitätspaket auch MinT anhand MinT-typischer Credentials oder Base-URLs erkennen. Wenn du denselben Workflow auf MinT richten willst, kannst du Folgendes setzen:
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.api_key sk-mint-...
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # Festlandchina; sonst https://mint.macaron.xin/
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
+Nutze `https://mint-cn.macaron.xin/` für Festlandchina oder `https://mint.macaron.xin/` andernfalls.
+
+Die Legacy-Aliase `rl.tinker_api_key` und `rl.tinker_base_url` werden aus Kompatibilitätsgründen weiterhin akzeptiert.
+
 ### 3. Start
 
 ```bash
@@ -195,7 +211,9 @@ metaclaw config KEY VALUE       # Konfigurationswert setzen
 
 ```bash
 metaclaw config rl.enabled true           # RL-Training aktivieren
-metaclaw config rl.tinker_api_key sk-...  # Tinker-Schlüssel setzen
+metaclaw config rl.backend auto           # auto | tinker | mint
+metaclaw config rl.api_key sk-...         # Schlüssel für das RL-Backend setzen
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # MinT-Endpunkt für Festlandchina; sonst https://mint.macaron.xin/
 metaclaw config skills.auto_evolve false  # Automatische Skill-Zusammenfassung deaktivieren
 metaclaw config proxy.port 31000          # Proxy-Port ändern
 ```
@@ -228,8 +246,12 @@ skills:
 
 rl:
   enabled: false            # auf true setzen, um RL-Training zu aktivieren
+  backend: auto             # "auto" | "tinker" | "mint"
   model: moonshotai/Kimi-K2.5
-  tinker_api_key: ""
+  api_key: ""
+  base_url: ""              # optionaler Backend-Endpunkt, z. B. https://mint-cn.macaron.xin/ (Festlandchina) oder https://mint.macaron.xin/ für MinT
+  tinker_api_key: ""        # Legacy-Alias für api_key
+  tinker_base_url: ""       # Legacy-Alias für base_url
   prm_url: https://api.openai.com/v1
   prm_model: gpt-5.2
   prm_api_key: ""
@@ -279,22 +301,33 @@ cp -r memory_data/skills/* ~/.metaclaw/skills/
 
 ## 🔬 Erweitert: RL-Modus
 
-RL-Training aktivieren, um das Modell kontinuierlich aus Live-Gesprächen feinabzustimmen:
+RL-Training aktivieren, um das Modell kontinuierlich aus Live-Gesprächen feinabzustimmen. Tinker bleibt der Standard-Referenzpfad, und MetaClaw kann denselben Workflow auch auf MinT als Tinker-kompatible Alternative richten:
 
 ```bash
 metaclaw config rl.enabled true
-metaclaw config rl.tinker_api_key sk-...
+metaclaw config rl.backend auto
+metaclaw config rl.api_key sk-...
 metaclaw config rl.prm_url https://api.openai.com/v1
 metaclaw config rl.prm_api_key sk-...
 metaclaw start
 ```
 
+Wenn du denselben Workflow auf MinT ausführen willst, ergänze Backend, Endpunkt und Modell:
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # Festlandchina; sonst https://mint.macaron.xin/
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
 Im RL-Modus:
 - Jeder Gesprächszug wird tokenisiert und als Trainingsbeispiel eingereicht
 - Ein Richter-LLM (PRM) bewertet Antworten asynchron
-- Tinker Cloud führt LoRA-Fine-Tuning durch; aktualisierte Gewichte werden alle `batch_size` Samples hot-geswappt
+- Standardmäßig übernimmt Tinker Cloud das LoRA-Fine-Tuning; MetaClaw kann aber auch mit einer kompatiblen Alternative wie MinT arbeiten, und aktualisierte Gewichte werden alle `batch_size` Samples hot-geswappt
 - Ein dediziertes Evolver-LLM extrahiert neue Skills aus fehlgeschlagenen Episoden
 
+Wenn du bei Tinker Cloud bleiben willst, nutze `rl.backend=auto` oder `rl.backend=tinker`. Für MinT setze `rl.backend=mint` und konfiguriere den entsprechenden Endpunkt.
+
 **Programmatisches Rollout** (keine OpenClaw TUI nötig): `openclaw_env_data_dir` auf ein Verzeichnis mit JSONL-Aufgabendateien setzen:
 
 ```json
@@ -358,14 +391,17 @@ Jedes `ConversationSample` wird mit einer `skill_generation`-Version getaggt. We
 
 ## 🙏 Danksagungen
 
-MetaClaw baut auf folgenden Open-Source-Projekten auf:
+MetaClaw baut auf folgenden Open-Source-Projekten und Kollaborationen auf:
 
 - [OpenClaw](https://openclaw.ai) — das zentrale Agent-Framework.
 - [SkillRL](https://github.com/aiming-lab/SkillRL) — unser skill-erweitertes RL-Framework.
-- [Tinker](https://www.thinkingmachines.ai/tinker/) — für Online-RL-Training verwendet.
+- [Tinker](https://www.thinkingmachines.ai/tinker/) — das primäre Referenz-Backend für Online-RL-Training in MetaClaw.
+- [MinT](https://mint-doc.macaron.im/) — eine Tinker-kompatible Alternative von [Mind Lab](https://macaron.im/mindlab), verfügbar über [`mindlab-toolkit`](https://github.com/MindLab-Research/mindlab-toolkit); unterstützte Modelle stehen auf der [offiziellen Modellliste](https://mint-doc.macaron.im/using-the-api/model-lineup).
 - [OpenClaw-RL](https://github.com/Gen-Verse/OpenClaw-RL) — Inspiration für unser RL-Design.
 - [awesome-openclaw-skills](https://github.com/VoltAgent/awesome-openclaw-skills) — stellt die Grundlage für unsere Skill-Bank bereit.
 
+Die MinT-Kompatibilitätsarbeit in diesem Repository ist ein Ergebnis der Zusammenarbeit zwischen dem MetaClaw-Projektteam und [Mind Lab](https://macaron.im/mindlab). Mind Lab konzentriert sich in dieser Zusammenarbeit vor allem auf Infrastrukturforschung und LoRA-RL-Algorithmusoptimierung.
+
 ---
 
 ## 📄 Lizenz
diff --git a/assets/README_ES.md b/assets/README_ES.md
index 72c78cb..ebac05a 100644
--- a/assets/README_ES.md
+++ b/assets/README_ES.md
@@ -156,10 +156,13 @@ El serving, el modelado de recompensas y el entrenamiento están completamente d
 pip install -e .                        # modo skills_only (ligero)
 pip install -e ".[rl]"                  # + soporte de entrenamiento RL (torch, transformers, tinker)
 pip install -e ".[evolve]"              # + evolución de skills via LLM compatible con OpenAI
+pip install mindlab-toolkit             # + backend de compatibilidad MinT opcional
 pip install -e ".[scheduler]"           # + integración Google Calendar para planificador
 pip install -e ".[rl,evolve,scheduler]" # recomendado: configuración completa RL + planificador
 ```
 
+Usa `mindlab-toolkit` solo si quieres `rl.backend=mint`. Enlaces de MinT: [visión general](https://mint-doc.macaron.im/), [compatibilidad con Tinker](https://mint-doc.macaron.im/using-the-api/tinker-compatibility), [lista de modelos](https://mint-doc.macaron.im/using-the-api/model-lineup), [GitHub](https://github.com/MindLab-Research/mindlab-toolkit).
+
 ### 2. Configuración
 
 ```bash
@@ -168,6 +171,19 @@ metaclaw setup
 
 El asistente interactivo te pedirá que elijas tu proveedor de LLM (Kimi, Qwen, MiniMax o personalizado), tu clave API y si deseas activar el entrenamiento RL.
 
+La ruta RL de MetaClaw mantiene a Tinker como backend de referencia por defecto. `rl.backend=auto` es el valor recomendado y también puede detectar MinT a partir de credenciales o base URLs estilo MinT cuando el paquete de compatibilidad de MinT está instalado. Si quieres apuntar el mismo flujo a MinT, puedes configurar:
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.api_key sk-mint-...
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # China continental; fuera de China usa https://mint.macaron.xin/
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
+Usa `https://mint-cn.macaron.xin/` para China continental o `https://mint.macaron.xin/` en otros casos.
+
+Los alias heredados `rl.tinker_api_key` y `rl.tinker_base_url` siguen siendo válidos por compatibilidad.
+
 ### 3. Inicio
 
 ```bash
@@ -195,7 +211,9 @@ metaclaw config KEY VALUE       # Establecer un valor de configuración
 
 ```bash
 metaclaw config rl.enabled true           # Activar entrenamiento RL
-metaclaw config rl.tinker_api_key sk-...  # Establecer clave Tinker
+metaclaw config rl.backend auto           # auto | tinker | mint
+metaclaw config rl.api_key sk-...         # Establecer clave del backend RL
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # Endpoint de MinT para China continental; fuera de China usa https://mint.macaron.xin/
 metaclaw config skills.auto_evolve false  # Desactivar resumen automático de skills
 metaclaw config proxy.port 31000          # Cambiar puerto del proxy
 ```
@@ -228,8 +246,12 @@ skills:
 
 rl:
   enabled: false            # poner a true para activar entrenamiento RL
+  backend: auto             # "auto" | "tinker" | "mint"
   model: moonshotai/Kimi-K2.5
-  tinker_api_key: ""
+  api_key: ""
+  base_url: ""              # endpoint opcional del backend, p. ej. https://mint-cn.macaron.xin/ (China continental) o https://mint.macaron.xin/ para MinT
+  tinker_api_key: ""        # alias heredado de api_key
+  tinker_base_url: ""       # alias heredado de base_url
   prm_url: https://api.openai.com/v1
   prm_model: gpt-5.2
   prm_api_key: ""
@@ -279,22 +301,33 @@ cp -r memory_data/skills/* ~/.metaclaw/skills/
 
 ## 🔬 Avanzado: Modo RL
 
-Activa el entrenamiento RL para afinar continuamente el modelo a partir de conversaciones en vivo:
+Activa el entrenamiento RL para afinar continuamente el modelo a partir de conversaciones en vivo. Tinker sigue siendo la ruta de referencia por defecto, y MetaClaw también puede apuntar a MinT como alternativa compatible con Tinker:
 
 ```bash
 metaclaw config rl.enabled true
-metaclaw config rl.tinker_api_key sk-...
+metaclaw config rl.backend auto
+metaclaw config rl.api_key sk-...
 metaclaw config rl.prm_url https://api.openai.com/v1
 metaclaw config rl.prm_api_key sk-...
 metaclaw start
 ```
 
+Si quieres ejecutar el mismo flujo sobre MinT, añade backend, endpoint y modelo:
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # China continental; fuera de China usa https://mint.macaron.xin/
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
 En modo RL:
 - Cada turno de conversación se tokeniza y se envía como muestra de entrenamiento
 - Un LLM juez (PRM) puntúa las respuestas de forma asíncrona
-- Tinker Cloud ejecuta el fine-tuning LoRA; los pesos actualizados se hot-swap cada `batch_size` muestras
+- Por defecto, Tinker Cloud ejecuta el fine-tuning LoRA; MetaClaw también puede funcionar con una alternativa compatible como MinT, y los pesos actualizados se intercambian en caliente cada `batch_size` muestras
 - Un LLM evolucionador dedicado extrae nuevos skills de los episodios fallidos
 
+Si te quedas con Tinker Cloud, usa `rl.backend=auto` o `rl.backend=tinker`. Si usas MinT, cambia a `rl.backend=mint` y configura el endpoint correspondiente.
+
 **Rollout programático** (sin TUI de OpenClaw): establece `openclaw_env_data_dir` en un directorio de archivos JSONL de tareas:
 
 ```json
@@ -358,14 +391,17 @@ Cada `ConversationSample` se etiqueta con una versión `skill_generation`. Cuand
 
 ## 🙏 Agradecimientos
 
-MetaClaw se construye sobre los siguientes proyectos de código abierto:
+MetaClaw se construye sobre los siguientes proyectos de código abierto y colaboraciones:
 
 - [OpenClaw](https://openclaw.ai) — el framework central de agentes.
 - [SkillRL](https://github.com/aiming-lab/SkillRL) — nuestro framework RL aumentado con skills.
-- [Tinker](https://www.thinkingmachines.ai/tinker/) — usado para entrenamiento RL en línea.
+- [Tinker](https://www.thinkingmachines.ai/tinker/) — el backend de referencia principal para el entrenamiento RL en línea de MetaClaw.
+- [MinT](https://mint-doc.macaron.im/) — una alternativa compatible con Tinker de [Mind Lab](https://macaron.im/mindlab), disponible vía [`mindlab-toolkit`](https://github.com/MindLab-Research/mindlab-toolkit); los modelos soportados están en la [página oficial](https://mint-doc.macaron.im/using-the-api/model-lineup).
 - [OpenClaw-RL](https://github.com/Gen-Verse/OpenClaw-RL) — inspiración para nuestro diseño RL.
 - [awesome-openclaw-skills](https://github.com/VoltAgent/awesome-openclaw-skills) — proporciona la base de nuestro banco de skills.
 
+El trabajo de compatibilidad con MinT en este repositorio es uno de los resultados de la colaboración entre el equipo del proyecto MetaClaw y [Mind Lab](https://macaron.im/mindlab). En esa colaboración, Mind Lab se ha centrado principalmente en la investigación de infraestructura y en la optimización de algoritmos LoRA RL.
+
 ---
 
 ## 📄 Licencia
diff --git a/assets/README_FR.md b/assets/README_FR.md
index c99a060..13ef9f4 100644
--- a/assets/README_FR.md
+++ b/assets/README_FR.md
@@ -156,10 +156,13 @@ Le serving, la modélisation des récompenses et l'entraînement sont entièreme
 pip install -e .                        # mode skills_only (léger)
 pip install -e ".[rl]"                  # + support d'entraînement RL (torch, transformers, tinker)
 pip install -e ".[evolve]"              # + évolution des skills via LLM compatible OpenAI
+pip install mindlab-toolkit             # + backend de compatibilité MinT optionnel
 pip install -e ".[scheduler]"           # + intégration Google Calendar pour le planificateur
 pip install -e ".[rl,evolve,scheduler]" # recommandé : configuration complète RL + planificateur
 ```
 
+`mindlab-toolkit` n'est nécessaire que si vous voulez utiliser `rl.backend=mint`. Liens MinT : [aperçu](https://mint-doc.macaron.im/), [compatibilité Tinker](https://mint-doc.macaron.im/using-the-api/tinker-compatibility), [liste des modèles](https://mint-doc.macaron.im/using-the-api/model-lineup), [GitHub](https://github.com/MindLab-Research/mindlab-toolkit).
+
 ### 2. Configuration
 
 ```bash
@@ -168,6 +171,19 @@ metaclaw setup
 
 L'assistant interactif vous demande de choisir votre fournisseur LLM (Kimi, Qwen, MiniMax, ou personnalisé), votre clé API, et d'activer optionnellement l'entraînement RL.
 
+La pile RL de MetaClaw garde Tinker comme backend de référence par défaut. `rl.backend=auto` est la valeur recommandée et peut aussi détecter MinT à partir d'identifiants ou de base URLs de style MinT quand le paquet de compatibilité MinT est installé. Si vous voulez pointer le même workflow vers MinT, vous pouvez configurer :
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.api_key sk-mint-...
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # Chine continentale ; sinon https://mint.macaron.xin/
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
+Utilisez `https://mint-cn.macaron.xin/` pour la Chine continentale ou `https://mint.macaron.xin/` sinon.
+
+Les alias hérités `rl.tinker_api_key` et `rl.tinker_base_url` restent acceptés pour compatibilité.
+
 ### 3. Démarrage
 
 ```bash
@@ -195,7 +211,9 @@ metaclaw config KEY VALUE       # Définir une valeur de configuration
 
 ```bash
 metaclaw config rl.enabled true           # Activer l'entraînement RL
-metaclaw config rl.tinker_api_key sk-...  # Définir la clé Tinker
+metaclaw config rl.backend auto           # auto | tinker | mint
+metaclaw config rl.api_key sk-...         # Définir la clé du backend RL
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # Endpoint MinT pour la Chine continentale ; sinon https://mint.macaron.xin/
 metaclaw config skills.auto_evolve false  # Désactiver le résumé automatique des skills
 metaclaw config proxy.port 31000          # Changer le port du proxy
 ```
@@ -228,8 +246,12 @@ skills:
 
 rl:
   enabled: false            # mettre à true pour activer l'entraînement RL
+  backend: auto             # "auto" | "tinker" | "mint"
   model: moonshotai/Kimi-K2.5
-  tinker_api_key: ""
+  api_key: ""
+  base_url: ""              # endpoint backend optionnel, par ex. https://mint-cn.macaron.xin/ (Chine continentale) ou https://mint.macaron.xin/ pour MinT
+  tinker_api_key: ""        # alias hérité de api_key
+  tinker_base_url: ""       # alias hérité de base_url
   prm_url: https://api.openai.com/v1
   prm_model: gpt-5.2
   prm_api_key: ""
@@ -279,22 +301,33 @@ cp -r memory_data/skills/* ~/.metaclaw/skills/
 
 ## 🔬 Avancé : Mode RL
 
-Activez l'entraînement RL pour affiner continuellement le modèle à partir des conversations en direct :
+Activez l'entraînement RL pour affiner continuellement le modèle à partir des conversations en direct. Tinker reste le chemin de référence par défaut, et MetaClaw peut aussi viser MinT comme alternative compatible Tinker :
 
 ```bash
 metaclaw config rl.enabled true
-metaclaw config rl.tinker_api_key sk-...
+metaclaw config rl.backend auto
+metaclaw config rl.api_key sk-...
 metaclaw config rl.prm_url https://api.openai.com/v1
 metaclaw config rl.prm_api_key sk-...
 metaclaw start
 ```
 
+Si vous voulez exécuter le même workflow sur MinT, ajoutez le backend, l'endpoint et le modèle :
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # Chine continentale ; sinon https://mint.macaron.xin/
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
 En mode RL :
 - Chaque tour de conversation est tokenisé et soumis comme échantillon d'entraînement
 - Un LLM juge (PRM) évalue les réponses de manière asynchrone
-- Tinker exécute le fine-tuning LoRA dans le cloud ; les poids mis à jour sont hot-swappés toutes les `batch_size` samples
+- Par défaut, Tinker Cloud exécute le fine-tuning LoRA ; MetaClaw peut aussi fonctionner avec une alternative compatible comme MinT, et les poids mis à jour sont hot-swappés tous les `batch_size` échantillons
 - Un LLM évolueur dédié extrait de nouveaux skills des épisodes échoués
 
+Si vous restez sur Tinker Cloud, gardez `rl.backend=auto` ou utilisez `rl.backend=tinker`. Si vous utilisez MinT, passez à `rl.backend=mint` et indiquez l'endpoint correspondant.
+
 **Rollout programmatique** (sans TUI OpenClaw) : définissez `openclaw_env_data_dir` sur un répertoire de fichiers JSONL de tâches :
 
 ```json
@@ -358,14 +391,17 @@ Chaque `ConversationSample` est étiqueté avec une version `skill_generation`.
 
 ## 🙏 Remerciements
 
-MetaClaw est construit sur les projets open-source suivants :
+MetaClaw est construit sur les projets open-source et collaborations suivants :
 
 - [OpenClaw](https://openclaw.ai) — le framework d'agent central.
 - [SkillRL](https://github.com/aiming-lab/SkillRL) — notre framework RL augmenté de skills.
-- [Tinker](https://www.thinkingmachines.ai/tinker/) — utilisé pour l'entraînement RL en ligne.
+- [Tinker](https://www.thinkingmachines.ai/tinker/) — le backend de référence principal pour l'entraînement RL en ligne dans MetaClaw.
+- [MinT](https://mint-doc.macaron.im/) — une alternative compatible Tinker proposée par [Mind Lab](https://macaron.im/mindlab), disponible via [`mindlab-toolkit`](https://github.com/MindLab-Research/mindlab-toolkit) ; les modèles pris en charge sont listés sur la [page officielle](https://mint-doc.macaron.im/using-the-api/model-lineup).
 - [OpenClaw-RL](https://github.com/Gen-Verse/OpenClaw-RL) — inspiration pour notre conception RL.
 - [awesome-openclaw-skills](https://github.com/VoltAgent/awesome-openclaw-skills) — fournit la base de notre banque de skills.
 
+Le travail de compatibilité MinT dans ce dépôt est l'un des résultats d'une collaboration entre l'équipe du projet MetaClaw et [Mind Lab](https://macaron.im/mindlab). Dans cette collaboration, Mind Lab s'est concentré sur la recherche en infrastructure et l'optimisation des algorithmes LoRA RL.
+
 ---
 
 ## 📄 Licence
diff --git a/assets/README_JA.md b/assets/README_JA.md
index c8fe387..a8404f7 100644
--- a/assets/README_JA.md
+++ b/assets/README_JA.md
@@ -156,10 +156,13 @@ OPD モードでは、学生モデルが通常通り回答を生成し、教師
 pip install -e .                        # skills_only モード（軽量）
 pip install -e ".[rl]"                  # + RL トレーニングサポート（torch、transformers、tinker）
 pip install -e ".[evolve]"              # + OpenAI 互換 LLM によるスキル進化
+pip install mindlab-toolkit             # + オプションの MinT 互換バックエンド
 pip install -e ".[scheduler]"           # + Google Calendar スケジューラ統合
 pip install -e ".[rl,evolve,scheduler]" # 推奨：完全 RL + スケジューラセットアップ
 ```
 
+`rl.backend=mint` を使う場合にだけ `mindlab-toolkit` が必要です。MinT の資料: [概要](https://mint-doc.macaron.im/)、[Tinker 互換性](https://mint-doc.macaron.im/using-the-api/tinker-compatibility)、[モデル一覧](https://mint-doc.macaron.im/using-the-api/model-lineup)、[GitHub](https://github.com/MindLab-Research/mindlab-toolkit)。
+
 ### 2. 設定
 
 ```bash
@@ -168,6 +171,19 @@ metaclaw setup
 
 対話型ウィザードで LLM プロバイダー（Kimi、Qwen、MiniMax、またはカスタム）、API キー、RL の有効化を設定します。
 
+MetaClaw の RL パスは引き続き Tinker を基準バックエンドとしています。推奨デフォルトは `rl.backend=auto` で、MinT 互換パッケージが入っていれば MinT 形式の credentials や base URL から MinT も自動判定できます。同じ学習フローを MinT に切り替えたい場合は、次のように設定します。
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.api_key sk-mint-...
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # 中国本土。その他の地域では https://mint.macaron.xin/
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
+中国本土では `https://mint-cn.macaron.xin/`、それ以外では `https://mint.macaron.xin/` を使用できます。
+
+互換性のため、旧来の `rl.tinker_api_key` と `rl.tinker_base_url` も引き続き利用できます。
+
 ### 3. 起動
 
 ```bash
@@ -195,7 +211,9 @@ metaclaw config KEY VALUE       # 設定値を変更
 
 ```bash
 metaclaw config rl.enabled true           # RL トレーニングを有効化
-metaclaw config rl.tinker_api_key sk-...  # Tinker キーを設定
+metaclaw config rl.backend auto           # auto | tinker | mint
+metaclaw config rl.api_key sk-...         # RL バックエンドのキーを設定
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # MinT の中国本土 endpoint。その他の地域では https://mint.macaron.xin/
 metaclaw config skills.auto_evolve false  # スキル自動集約を無効化
 metaclaw config proxy.port 31000          # プロキシポートを変更
 ```
@@ -228,8 +246,12 @@ skills:
 
 rl:
   enabled: false            # true にすると RL トレーニングを有効化
+  backend: auto             # "auto" | "tinker" | "mint"
   model: moonshotai/Kimi-K2.5
-  tinker_api_key: ""
+  api_key: ""
+  base_url: ""              # 任意の backend endpoint。例: MinT 用 https://mint-cn.macaron.xin/（中国本土）または https://mint.macaron.xin/
+  tinker_api_key: ""        # api_key の互換エイリアス
+  tinker_base_url: ""       # base_url の互換エイリアス
   prm_url: https://api.openai.com/v1
   prm_model: gpt-5.2
   prm_api_key: ""
@@ -279,22 +301,33 @@ cp -r memory_data/skills/* ~/.metaclaw/skills/
 
 ## 🔬 上級者向け：RL モード
 
-RL トレーニングを有効にして、ライブ会話からモデルを継続的にファインチューニング：
+RL トレーニングを有効にして、ライブ会話からモデルを継続的にファインチューニングします。Tinker がデフォルトの基準パスで、MetaClaw は Tinker 互換の代替として MinT にも切り替えられます：
 
 ```bash
 metaclaw config rl.enabled true
-metaclaw config rl.tinker_api_key sk-...
+metaclaw config rl.backend auto
+metaclaw config rl.api_key sk-...
 metaclaw config rl.prm_url https://api.openai.com/v1
 metaclaw config rl.prm_api_key sk-...
 metaclaw start
 ```
 
+同じフローを MinT で動かしたい場合は、backend・endpoint・model を追加で設定します：
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # 中国本土。その他の地域では https://mint.macaron.xin/
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
 RL モードでは：
 - 各会話ターンがトークン化されてトレーニングサンプルとして提出
 - 審判 LLM（PRM）が非同期で回答にスコアを付与
-- Tinker クラウドが LoRA ファインチューニングを実行。`batch_size` サンプルごとにウェイトをホットスワップ
+- デフォルトでは Tinker クラウドが LoRA ファインチューニングを実行し、MetaClaw は MinT のような互換代替バックエンドにも接続可能です。`batch_size` サンプルごとにウェイトをホットスワップ
 - 専用エボルバー LLM が失敗したエピソードから新しいスキルを抽出
 
+Tinker クラウドを使う場合は `rl.backend=auto` のままか `rl.backend=tinker` に設定します。MinT を使う場合は `rl.backend=mint` にして endpoint を指定してください。
+
 **プログラム的なロールアウト**（OpenClaw TUI 不要）：`openclaw_env_data_dir` を JSONL タスクファイルのディレクトリに設定：
 
 ```json
@@ -358,14 +391,17 @@ metaclaw config scheduler.calendar.credentials_path ~/.metaclaw/client_secrets.j
 
 ## 🙏 謝辞
 
-MetaClaw は以下のオープンソースプロジェクトの上に構築されています：
+MetaClaw は以下のオープンソースプロジェクトと協業成果の上に構築されています：
 
 - [OpenClaw](https://openclaw.ai) — コアエージェントフレームワーク。
 - [SkillRL](https://github.com/aiming-lab/SkillRL) — スキル強化 RL フレームワーク。
-- [Tinker](https://www.thinkingmachines.ai/tinker/) — オンライン RL トレーニングに使用。
+- [Tinker](https://www.thinkingmachines.ai/tinker/) — MetaClaw におけるオンライン RL トレーニングの主要な基準バックエンド。
+- [MinT](https://mint-doc.macaron.im/) — [Mind Lab](https://macaron.im/mindlab) による Tinker 互換の代替バックエンド。[`mindlab-toolkit`](https://github.com/MindLab-Research/mindlab-toolkit) 経由で利用でき、対応モデルは[公式モデル一覧](https://mint-doc.macaron.im/using-the-api/model-lineup)で確認できます。
 - [OpenClaw-RL](https://github.com/Gen-Verse/OpenClaw-RL) — RL 設計のインスピレーション。
 - [awesome-openclaw-skills](https://github.com/VoltAgent/awesome-openclaw-skills) — スキルバンクの基盤を提供。
 
+このリポジトリにおける MinT 互換対応は、MetaClaw プロジェクトチームと [Mind Lab](https://macaron.im/mindlab) の協業成果の一部です。この協業において、Mind Lab は主にインフラ研究と LoRA RL アルゴリズム最適化に注力しています。
+
 ---
 
 ## 📄 ライセンス
diff --git a/assets/README_KO.md b/assets/README_KO.md
index 98d5a5a..e475f94 100644
--- a/assets/README_KO.md
+++ b/assets/README_KO.md
@@ -156,10 +156,13 @@ OPD 모드에서는 학생 모델이 평소와 같이 응답을 생성하고, 
 pip install -e .                        # skills_only 모드 (경량)
 pip install -e ".[rl]"                  # + RL 학습 지원 (torch, transformers, tinker)
 pip install -e ".[evolve]"              # + OpenAI 호환 LLM을 통한 스킬 진화
+pip install mindlab-toolkit             # + 선택적 MinT 호환 백엔드
 pip install -e ".[scheduler]"           # + Google Calendar 스케줄러 통합
 pip install -e ".[rl,evolve,scheduler]" # 권장: 전체 RL + 스케줄러 설정
 ```
 
+`rl.backend=mint`를 사용할 때만 `mindlab-toolkit`가 필요합니다. MinT 자료: [개요](https://mint-doc.macaron.im/), [Tinker 호환성](https://mint-doc.macaron.im/using-the-api/tinker-compatibility), [모델 목록](https://mint-doc.macaron.im/using-the-api/model-lineup), [GitHub](https://github.com/MindLab-Research/mindlab-toolkit).
+
 ### 2. 설정
 
 ```bash
@@ -168,6 +171,19 @@ metaclaw setup
 
 대화형 마법사에서 LLM 공급자(Kimi, Qwen, MiniMax, 또는 커스텀), API 키, RL 활성화 여부를 설정합니다.
 
+MetaClaw의 RL 경로는 기본적으로 Tinker를 기준 백엔드로 유지합니다. 권장 기본값은 `rl.backend=auto`이며, MinT 호환 패키지가 설치되어 있으면 MinT 스타일 credentials나 base URL로 MinT도 자동 추론할 수 있습니다. 같은 워크플로를 MinT로 전환하려면 다음과 같이 설정합니다.
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.api_key sk-mint-...
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # 중국 본토. 그 외 지역은 https://mint.macaron.xin/
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
+중국 본토에서는 `https://mint-cn.macaron.xin/`, 그 외 지역에서는 `https://mint.macaron.xin/`를 사용할 수 있습니다.
+
+하위 호환성을 위해 기존 `rl.tinker_api_key`와 `rl.tinker_base_url`도 계속 지원합니다.
+
 ### 3. 시작
 
 ```bash
@@ -195,7 +211,9 @@ metaclaw config KEY VALUE       # 설정값 변경
 
 ```bash
 metaclaw config rl.enabled true           # RL 학습 활성화
-metaclaw config rl.tinker_api_key sk-...  # Tinker 키 설정
+metaclaw config rl.backend auto           # auto | tinker | mint
+metaclaw config rl.api_key sk-...         # RL 백엔드 키 설정
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # MinT 중국 본토 endpoint. 그 외 지역은 https://mint.macaron.xin/
 metaclaw config skills.auto_evolve false  # 스킬 자동 요약 비활성화
 metaclaw config proxy.port 31000          # 프록시 포트 변경
 ```
@@ -228,8 +246,12 @@ skills:
 
 rl:
   enabled: false            # true로 설정하면 RL 학습 활성화
+  backend: auto             # "auto" | "tinker" | "mint"
   model: moonshotai/Kimi-K2.5
-  tinker_api_key: ""
+  api_key: ""
+  base_url: ""              # 선택적 backend endpoint. 예: MinT용 https://mint-cn.macaron.xin/ (중국 본토) 또는 https://mint.macaron.xin/
+  tinker_api_key: ""        # api_key의 호환 별칭
+  tinker_base_url: ""       # base_url의 호환 별칭
   prm_url: https://api.openai.com/v1
   prm_model: gpt-5.2
   prm_api_key: ""
@@ -279,22 +301,33 @@ cp -r memory_data/skills/* ~/.metaclaw/skills/
 
 ## 🔬 고급: RL 모드
 
-RL 학습을 활성화하여 실시간 대화로부터 모델을 지속적으로 파인튜닝:
+RL 학습을 활성화해 실시간 대화로부터 모델을 지속적으로 파인튜닝합니다. Tinker가 기본 기준 경로이며, MetaClaw는 Tinker 호환 대안인 MinT에도 연결할 수 있습니다:
 
 ```bash
 metaclaw config rl.enabled true
-metaclaw config rl.tinker_api_key sk-...
+metaclaw config rl.backend auto
+metaclaw config rl.api_key sk-...
 metaclaw config rl.prm_url https://api.openai.com/v1
 metaclaw config rl.prm_api_key sk-...
 metaclaw start
 ```
 
+같은 워크플로를 MinT에서 실행하려면 backend, endpoint, model을 추가로 설정합니다:
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # 중국 본토. 그 외 지역은 https://mint.macaron.xin/
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
 RL 모드에서:
 - 각 대화 턴이 토크나이즈되어 학습 샘플로 제출됨
 - 심판 LLM(PRM)이 비동기로 응답 채점
-- Tinker 클라우드가 LoRA 파인튜닝 실행. `batch_size` 샘플마다 가중치 핫스왑
+- 기본적으로 Tinker 클라우드(또는 MinT 같은 호환 백엔드)가 LoRA 파인튜닝 실행. `batch_size` 샘플마다 가중치 핫스왑
 - 전용 에볼버 LLM이 실패한 에피소드에서 새 스킬 추출
 
+Tinker 클라우드를 계속 쓰려면 `rl.backend=auto`를 유지하거나 `rl.backend=tinker`로 설정하면 됩니다. MinT를 쓰려면 `rl.backend=mint`로 바꾸고 endpoint를 지정하세요.
+
 **프로그래매틱 롤아웃** (OpenClaw TUI 불필요): `openclaw_env_data_dir`를 JSONL 태스크 파일 디렉토리로 설정:
 
 ```json
@@ -358,14 +391,17 @@ metaclaw config scheduler.calendar.credentials_path ~/.metaclaw/client_secrets.j
 
 ## 🙏 감사의 말
 
-MetaClaw는 다음 오픈소스 프로젝트를 기반으로 구축되었습니다:
+MetaClaw는 다음 오픈소스 프로젝트와 협업 성과를 바탕으로 구축되었습니다:
 
 - [OpenClaw](https://openclaw.ai) — 핵심 에이전트 프레임워크.
 - [SkillRL](https://github.com/aiming-lab/SkillRL) — 스킬 강화 RL 프레임워크.
-- [Tinker](https://www.thinkingmachines.ai/tinker/) — 온라인 RL 학습에 사용.
+- [Tinker](https://www.thinkingmachines.ai/tinker/) — MetaClaw 온라인 RL 학습의 주요 기준 백엔드.
+- [MinT](https://mint-doc.macaron.im/) — [Mind Lab](https://macaron.im/mindlab)의 Tinker 호환 대안 백엔드. [`mindlab-toolkit`](https://github.com/MindLab-Research/mindlab-toolkit)으로 사용할 수 있으며, 지원 모델은 [공식 모델 목록](https://mint-doc.macaron.im/using-the-api/model-lineup)에서 확인할 수 있습니다.
 - [OpenClaw-RL](https://github.com/Gen-Verse/OpenClaw-RL) — RL 설계의 영감.
 - [awesome-openclaw-skills](https://github.com/VoltAgent/awesome-openclaw-skills) — 스킬 뱅크의 기반 제공.
 
+이 저장소의 MinT 호환 작업은 MetaClaw 프로젝트 팀과 [Mind Lab](https://macaron.im/mindlab)의 협업 결과 중 하나입니다. 이 협업에서 Mind Lab은 주로 인프라 연구와 LoRA RL 알고리즘 최적화에 집중하고 있습니다.
+
 ---
 
 ## 📄 라이선스
diff --git a/assets/README_ZH.md b/assets/README_ZH.md
index e8eb721..75d2c15 100644
--- a/assets/README_ZH.md
+++ b/assets/README_ZH.md
@@ -156,10 +156,13 @@ MetaClaw 同时支持：
 pip install -e .                        # skills_only 模式（轻量）
 pip install -e ".[rl]"                  # + RL 训练支持（torch、transformers、tinker）
 pip install -e ".[evolve]"              # + 通过 OpenAI 兼容 LLM 进行 Skill 进化
+pip install mindlab-toolkit             # + 可选的 MinT 兼容后端
 pip install -e ".[scheduler]"           # + Google Calendar 调度器集成
 pip install -e ".[rl,evolve,scheduler]" # 推荐：完整 RL + 调度器配置
 ```
 
+只有在你要使用 `rl.backend=mint` 时才需要安装 `mindlab-toolkit`。MinT 相关资料见：[总览](https://mint-doc.macaron.im/)、[Tinker 兼容说明](https://mint-doc.macaron.im/using-the-api/tinker-compatibility)、[模型列表](https://mint-doc.macaron.im/using-the-api/model-lineup)、[GitHub](https://github.com/MindLab-Research/mindlab-toolkit)。
+
 ### 2. 配置
 
 ```bash
@@ -168,6 +171,19 @@ metaclaw setup
 
 交互式向导会引导你选择 LLM 提供商（Kimi、Qwen、MiniMax 或自定义），填写 API Key，并可选开启 RL 训练。
 
+MetaClaw 的 RL 路径默认仍以 Tinker 为参考后端。推荐默认值是 `rl.backend=auto`；当环境里安装了 MinT 兼容包时，它也可以根据 MinT 风格的凭证或 base URL 自动识别 MinT。如果你想把同一套流程切到 MinT，可以这样配置：
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.api_key sk-mint-...
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # 中国大陆；其他地区可用 https://mint.macaron.xin/
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
+中国大陆可使用 `https://mint-cn.macaron.xin/`，其他地区可使用 `https://mint.macaron.xin/`。
+
+兼容旧配置的 `rl.tinker_api_key` 和 `rl.tinker_base_url` 仍然可以继续使用。
+
 ### 3. 启动
 
 ```bash
@@ -195,7 +211,9 @@ metaclaw config KEY VALUE       # 设置配置项
 
 ```bash
 metaclaw config rl.enabled true           # 开启 RL 训练
-metaclaw config rl.tinker_api_key sk-...  # 设置 Tinker Key
+metaclaw config rl.backend auto           # auto | tinker | mint
+metaclaw config rl.api_key sk-...         # 设置 RL 后端 Key
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # MinT 中国大陆 endpoint；其他地区可用 https://mint.macaron.xin/
 metaclaw config skills.auto_evolve false  # 关闭 Skill 自动总结
 metaclaw config proxy.port 31000          # 修改代理端口
 ```
@@ -228,8 +246,12 @@ skills:
 
 rl:
   enabled: false            # 设为 true 开启 RL 训练
+  backend: auto             # "auto" | "tinker" | "mint"
   model: moonshotai/Kimi-K2.5
-  tinker_api_key: ""
+  api_key: ""
+  base_url: ""              # 可选后端 endpoint，例如用于 MinT 的 https://mint-cn.macaron.xin/（中国大陆）或 https://mint.macaron.xin/
+  tinker_api_key: ""        # api_key 的兼容别名
+  tinker_base_url: ""       # base_url 的兼容别名
   prm_url: https://api.openai.com/v1
   prm_model: gpt-5.2
   prm_api_key: ""
@@ -279,22 +301,33 @@ cp -r memory_data/skills/* ~/.metaclaw/skills/
 
 ## 🔬 进阶：RL 模式
 
-开启 RL 训练，从实时对话中持续微调模型：
+开启 RL 训练，从实时对话中持续微调模型。Tinker 仍是默认参考路径，MetaClaw 也支持把同一套流程切到作为 Tinker 兼容替代方案的 MinT：
 
 ```bash
 metaclaw config rl.enabled true
-metaclaw config rl.tinker_api_key sk-...
+metaclaw config rl.backend auto
+metaclaw config rl.api_key sk-...
 metaclaw config rl.prm_url https://api.openai.com/v1
 metaclaw config rl.prm_api_key sk-...
 metaclaw start
 ```
 
+如果你要切到 MinT，再额外配置 backend、endpoint 和 model：
+
+```bash
+metaclaw config rl.backend mint
+metaclaw config rl.base_url https://mint-cn.macaron.xin/  # 中国大陆；其他地区可用 https://mint.macaron.xin/
+metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
+```
+
 RL 模式下：
 - 每轮对话被 tokenize 并作为训练样本提交
 - 裁判 LLM（PRM）异步为回复打分
-- Tinker 云端执行 LoRA 微调，每累积 `batch_size` 个样本热更新权重
+- 默认由 Tinker 云端（也可切换到 MinT 这类兼容后端）执行 LoRA 微调，每累积 `batch_size` 个样本热更新权重
 - 专属进化器 LLM 从失败的 episode 中提取新 Skill
 
+如果你继续使用 Tinker 云端，保留 `rl.backend=auto` 或显式设成 `rl.backend=tinker` 即可；如果你使用 MinT，则把 `rl.backend` 改成 `mint` 并配置对应 endpoint。
+
 **程序化 rollout**（无需 OpenClaw TUI）：将 `openclaw_env_data_dir` 设为包含 JSONL 任务文件的目录：
 
 ```json
@@ -358,14 +391,17 @@ metaclaw config scheduler.calendar.credentials_path ~/.metaclaw/client_secrets.j
 
 ## 🙏 致谢
 
-MetaClaw 基于以下开源项目构建：
+MetaClaw 基于以下开源项目与协作成果构建：
 
 - [OpenClaw](https://openclaw.ai) —— 核心 Agent 框架。
 - [SkillRL](https://github.com/aiming-lab/SkillRL) —— 我们的 Skill 增强 RL 框架。
-- [Tinker](https://www.thinkingmachines.ai/tinker/) —— 用于在线 RL 训练。
+- [Tinker](https://www.thinkingmachines.ai/tinker/) —— MetaClaw 在线 RL 训练的主要参考后端。
+- [MinT](https://mint-doc.macaron.im/) —— 来自 [Mind Lab](https://macaron.im/mindlab) 的 Tinker 兼容替代方案，可通过 [`mindlab-toolkit`](https://github.com/MindLab-Research/mindlab-toolkit) 使用；当前支持的模型见[官方模型列表](https://mint-doc.macaron.im/using-the-api/model-lineup)。
 - [OpenClaw-RL](https://github.com/Gen-Verse/OpenClaw-RL) —— 我们 RL 设计的灵感来源。
 - [awesome-openclaw-skills](https://github.com/VoltAgent/awesome-openclaw-skills) —— 为我们的 Skill 库提供基础。
 
+仓库中的 MinT 兼容工作，是 MetaClaw 项目组与 [Mind Lab](https://macaron.im/mindlab) 协作成果的一部分。Mind Lab 在这项协作中主要聚焦于基础设施研究以及 LoRA RL 算法优化。
+
 ---
 
 ## 📄 许可证