NexRL

中文｜ English

NexRL

NexRL is a production-ready, distributed LLM post-training framework defined by its ultra-loosely-coupled philosophy. Its service-oriented architecture provides maximum flexibility and extensibility while maintaining clean abstractions and ease of use.

Key Features

Training-as-a-Service & Rollout-as-a-Service: Unified API architecture that seamlessly supports different training and inference frameworks through service abstraction. Switch between training backends (FSDP, Megatron, etc.) and inference engines (SGLang, vLLM, TGI, etc.) without modifying your code.
Decoupled Modular Architecture: Clean separation of concerns with well-defined interfaces and extensible components. Each module operates independently, enabling easy customization and maintenance.
Zero-Code Agent-Training Support: Agents can seamlessly integrate with RL training without any RL-specific code modifications.
Intelligent Resource Management: Configurable placement and co-location of services for optimal performance in distributed environments
Comprehensive Monitoring: Built-in activity tracking and health checking system for production deployments
Robust Error Handling: Centralized error reporting and recovery mechanisms for production reliability

Architecture

NexRL follows a modular architecture where components communicate through explicit interfaces and APIs:

Core Components:

DataLoader: Provides training data (supports custom datasets)
RolloutWorker: Executes environment interactions (your agent goes here!)
TrajectoryPool: Manages trajectory collection and batching
AlgorithmProcessor: Computes advantages and prepares training batches
TrainWorker: Coordinates model training through service APIs

Services:

Inference Service: Adopts the standard OpenAI API as the unified interaction interface with inference engines. This API-centric design ensures that the upper-layer modules can interact with various inference engines (such as SGLang, vLLM, etc.) in a consistent manner, eliminating the need for code modifications when switching between different inference engines.
Train Service: Utilizes standardized forward() and forward_backward() APIs to communicate with different training backends (including FSDP, Megatron, etc.). To achieve compatibility with diverse backends, we implement lightweight adapters tailored for each backend. These adapters translate the standardized API calls into backend-specific operations, enabling seamless switching of training backends without altering the core training logic.
Agent Service: Provides a streamlined integration path for agents to participate in RL training. Agents can directly push generated trajectories into the TrajectoryPool through this service, eliminating the need for developers to rewrite or modify agent code to adapt to RL training requirements.

Getting Started

Prerequisites

Python 3.12+
CUDA 12.8+ (for GPU support)
Ray 2.48+ (for distributed mode)
kubectl installed and configured
Access to a Kubernetes cluster
Volcano Scheduler installed in the cluster
High-performance network file system, e.g., GPFS

Check pyproject for the full dependency list.

Quick Start

Install the NexRL repository to enable the CLI:

git clone git@github.com:nex-agi/NexRL.git
cd NexRL
pip install -e .

Next, ask your cluster maintainer to perform the one-time admin-setup:

nexrl admin-setup

This step saves cluster-shared configurations in Kubernetes and launches cluster-level services such as train-router and rollout-router.

Once the setup is complete, you can prepare an rl_train.yaml configuration file in a folder and launch your job:

nexrl launch --job-path /path/to/your/configuration/folder/

For detailed configuration examples, please refer to examples/single_turn_math.

Documentation

Developer Guide: Comprehensive documentation on architecture, APIs, and advanced usage
Configuration Reference: Full configuration options with detailed comments
Test Suite: Testing guide and examples

More on the Way

This release represents a foundational version of NexRL, designed to demonstrate our loosely-coupled and service-oriented architecture. We are actively working on preparing the code for open source and will release more of our work soon, including:

More model & agent support
Additional trainging and inference backend ntegrations
High-performance weight synchronization
Advanced agent training support
Post-training algorithm exploration
More usability tools
...

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Acknowlegement

NexRL aims for ultimate scalability and usability, fully embracing the open-source ecosystem to minimize code adaptation costs and improve experimental efficiency. NexRL is built upon several excellent open-source frameworks, including vLLM, SGLang, FSDP, Megatron, and VeRL (the adapter for the FSDP backend adopts the implementation from VeRL). Additionally, the zero-agent code development design of the Agent Service is inspired by Agent Lightning.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
cli		cli
docker		docker
docs		docs
examples/single_turn_math		examples/single_turn_math
nexrl		nexrl
tests		tests
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
pytest.ini		pytest.ini

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

NexRL

Key Features

Architecture

Getting Started

Prerequisites

Quick Start

Documentation

More on the Way

License

Acknowlegement

About

Uh oh!

Releases

Packages

Languages

License

MachineLearningSystem/NexRL

Folders and files

Latest commit

History

Repository files navigation

NexRL

Key Features

Architecture

Getting Started

Prerequisites

Quick Start

Documentation

More on the Way

License

Acknowlegement

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages