Skip to content

AI45Lab/Safactory

Repository files navigation

🧪 Safactory

A universal AI agent sandbox for evaluation, training data construction, and RL training
across ten open-source environments spanning Android, OS, Minecraft, Embodied agents, QA, data processing, scientific discovery, and multimodal reasoning.

Quick StartEnvironmentsRL TrainingCustom EnvConfigurationDataReport

Python 3.9+ License Modes LLM Backends


✨ Why Safactory?

Safactory provides a unified pipeline so you can go from model evaluation to RL training without changing your codebase:

Goal What Safactory does
Evaluate agents Run any LLM against realistic simulated environments and collect reward metrics
Build training data Every interaction is automatically logged to SQLite — ready to be used as SFT / RL data
RL training Feed rollout data directly into Slime-based GRPO training via the built-in Buffer Server

Key strengths:

  • 🌍 Multi-domain environments — Android, OS, Minecraft, RoboTrustBench, Embodied ALFRED and more
  • High concurrency — Environment pool management with async workers for fast parallel rollouts
  • 🔌 LLM-agnostic — Works with any OpenAI-compatible endpoint (vLLM, SGLang, OpenAI API)
  • 🏗️ Two deployment modeslocal (single machine) or remote (Ray-based cluster)
  • 🧩 Extensible — Add a new environment in < 50 lines by implementing a simple BaseEnv interface

🚀 Quick Start

Installation

git clone https://github.com/AI45Lab/Safactory.git
cd Safactory
pip install -r requirements.txt

1 — Evaluate a Model

python launcher.py \
  --env-config env/osgym/os_config.yaml \   # Select the evaluation environment (OS / Android / Minecraft, etc.)
  --llm-base-url http://YOUR_LLM_HOST/v1 \  # Model service address
  --llm-api-key YOUR_API_KEY \              # API Key
  --llm-model YOUR_MODEL \                  # Model name
  --pool-size 1                             # Number of concurrent environment instances

This command will automatically complete environment loading, task scheduling, and evaluation execution.

Configuration

  • CLI parameters: Control model access and concurrent execution(e.g., --llm-*, --pool-size

  • YAML configuration: Defines specific environments and tasks (e.g., dataset, environment parameters)

2 — Collect Training Data

Every run automatically records step-level interactions (messages, response, reward, environment state) to test_envs.db. Records are available immediately after the run completes.

See docs/data-manager.md for the database schema and example queries.

3 — RL Training (Optional)

With a rollout runner active, start the Slime training loop in a second terminal:

# Terminal 1 — Slime training process (requires Slime installation)
cd rl && ./run_slime_generator_vl.sh

# Terminal 2 — Buffer Server (launches the Safactory runner and collects rollouts)
cd rl && ./run_buffer_server.sh

Terminals 1 and 2 can run on different machines as long as they can communicate.

Full setup guide: docs/rl-training.md

4 — Experience Extraction & Injection(Optional)

Safactory supports optional experience extraction and injection. You can distill reusable lessons from historical trajectories into a local experience library, then inject relevant experience into the agent prompt at the start of a new episode.

For a detailed usage guide, see docs/experience-extraction-injection.md.


📚 Documentation

Guide Description
Supported Environments Setup, Prerequisites, Docker images, and Configuration
RL Training Slime integration, Buffer Server setup, and RL parameters
Custom Environment Step-by-step guide to adding a new environment
Configuration Full CLI reference and config.yaml schema
Data Manager Database schema and SQLite query examples
Report Project report PDF

🤝 Contributing

Contributions for new environments, bug fixes, and documentation improvements are welcome.

  1. Fork the repository
  2. Implement your environment under env/your_env_name/
  3. Add a config YAML and a brief README.md in the same directory
  4. Open a Pull Request

For questions and bug reports, please use the issue tracker.

About

Safactory: A Scalable Agent Factory for Trustworthy Autonomous Intelligence

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors