
RLite

🚀 Quick start · 🌰 Examples · 🍲 Recipes · 📚 Docs

A lightweight RL framework with PyTorch-like interfaces.

Features

  • FSDP2 and FSDP support for training.
  • vLLM support for inference.
  • Ray support for resource management.
  • Easy to learn and use. Most interfaces are kept the same as PyTorch, with the parallel engine working seamlessly behind the scenes.
  • Recipes that reproduce SOTA results in a single self-contained Python script.

Installation

pip install pyrlite
Advanced installation options

We recommend using conda to manage the environment.

  1. Create a conda environment:
conda create -n rlite python=3.12
conda activate rlite
  2. Install common dependencies:
# Install vLLM
pip install vllm accelerate

# Flash Attention 2 (set MAX_JOBS to match your CPU core count)
MAX_JOBS=64 pip install flash-attn --no-build-isolation

# Install FlashInfer for faster inference
pip install flashinfer-python==0.2.2.post1 -i https://flashinfer.ai/whl/cu124/torch2.6
  3. Install RLite:
git clone https://github.com/rlite-project/RLite.git
cd RLite && pip install -e .

Recipes

We use recipes as examples for reproducing SOTA RL methods.

Featured recipes

Programming Model


In RLite, users mainly work with Engines: handlers that take input from the main process, organize tasks, and dispatch them to workers. An engine may have multiple Executors, each holding a full copy of the model weights. Both Engines and Executors reside in the main process. Workers are the units that actually perform the computation, with each Worker bound to a GPU. Conversely, a single GPU can host multiple Workers, which share it in a time-multiplexed manner.
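The dispatch flow above can be sketched in a few lines of plain Python. The classes below are illustrative stand-ins only, not RLite's actual API:

```python
# Toy sketch of the Engine/Worker split described above.
# These classes are invented for illustration -- NOT RLite's real API.

class Worker:
    """Performs the actual computation; in RLite each Worker maps to a GPU."""
    def __init__(self, gpu_id: int):
        self.gpu_id = gpu_id

    def run(self, task):
        # A real worker would execute the task on its GPU.
        return f"task {task!r} ran on GPU {self.gpu_id}"

class Engine:
    """Takes input from the main process and dispatches tasks to Workers."""
    def __init__(self, workers):
        self.workers = workers

    def submit(self, tasks):
        # Round-robin dispatch. Two Workers may share one GPU
        # (time-multiplexing), so gpu_ids can repeat across workers.
        return [self.workers[i % len(self.workers)].run(t)
                for i, t in enumerate(tasks)]

engine = Engine([Worker(0), Worker(0), Worker(1)])  # GPU 0 is shared
results = engine.submit(["a", "b", "c", "d"])
```

Note that the fourth task wraps around to the first Worker, illustrating how more tasks than Workers are handled.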

Key Interfaces

RLite provides minimal interfaces that are

  • easy to learn: most interfaces resemble the behavior of PyTorch.
  • super flexible: interfaces are independent and can be used separately. This allows inference without training (e.g. evaluation tasks) or training without inference (e.g. SFT and DPO).
  • super powerful: combined, the interfaces allow reproduction of SOTA RL results.
  • highly extensible: the interfaces allow extensions for advanced features such as other training/inference backends, streaming generation for multi-turn use cases, and asynchronous workers for overlapping time-consuming operations.

Inference

Inference Example
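Since the example content itself is not reproduced here, the following pure-Python toy shows the kind of batched greedy generation an inference engine performs. The bigram table and the `generate` function are invented for illustration and are not RLite's API:

```python
# Illustrative only: a toy greedy decoder standing in for what an
# inference backend (e.g. vLLM behind RLite) does for each prompt.
# The bigram table below is invented for this example.

BIGRAMS = {"the": "cat", "cat": "sat", "sat": "down"}

def generate(prompt: str, max_new_tokens: int = 3) -> str:
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        nxt = BIGRAMS.get(tokens[-1])
        if nxt is None:  # no known continuation: stop early
            break
        tokens.append(nxt)
    return " ".join(tokens)

# "Batched" inference over several prompts at once.
outputs = [generate(p) for p in ["the", "cat"]]
```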

Train

Train Example
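As a stand-in for the example content, here is the generic PyTorch-style training pattern that RLite's interfaces are described as mirroring, written in plain Python with a hand-computed gradient. Nothing here is RLite-specific:

```python
# A PyTorch-style training loop in plain Python, shown only to
# illustrate the loss -> backward -> step pattern that RLite's
# training interface mirrors. Gradient computed by hand for (w - 3)^2.

w = 0.0   # parameter
lr = 0.1  # learning rate

for step in range(100):
    loss = (w - 3.0) ** 2
    grad = 2.0 * (w - 3.0)  # d(loss)/dw, the "backward" pass
    w -= lr * grad          # the "optimizer.step()"

# w converges toward the minimizer 3.0
```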

Offload/Reload/Discard Weights

Device Example
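The offload/reload/discard semantics can be illustrated with plain dicts standing in for GPU and CPU memory. The function names are invented for this sketch and are not RLite's real API:

```python
# Illustrative only: toy offload/reload/discard semantics with plain
# dicts as stand-ins for GPU and CPU memory. Not RLite's real API.

gpu, cpu = {"w": [1.0, 2.0]}, {}

def offload():
    """Move weights GPU -> CPU, freeing GPU memory but keeping a copy."""
    cpu.update(gpu); gpu.clear()

def reload():
    """Move weights CPU -> GPU so computation can resume."""
    gpu.update(cpu); cpu.clear()

def discard():
    """Drop the weights entirely (e.g. when they can be re-synced later)."""
    gpu.clear(); cpu.clear()

offload()
assert not gpu and "w" in cpu  # weights now live on CPU only
reload()
assert not cpu and "w" in gpu  # weights back on GPU
```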

Synchronize Weights

Weight Sync
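Weight synchronization, e.g. pushing freshly trained weights into the inference engine, can be sketched with plain dicts. All names below are invented for illustration:

```python
# Illustrative only: syncing trainer weights into an inference engine,
# sketched with plain dicts (names invented for this example).

trainer_state = {"layer.weight": [0.5, -0.2], "layer.bias": [0.1]}
infer_state   = {"layer.weight": [0.0,  0.0], "layer.bias": [0.0]}

def sync_weights(src: dict, dst: dict) -> None:
    # Copy every parameter the destination expects, in the spirit of
    # PyTorch's load_state_dict.
    for name in dst:
        dst[name] = list(src[name])  # copy, so later training steps
                                     # don't alias inference weights

sync_weights(trainer_state, infer_state)
```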

Join Our Discussion on WeChat

(WeChat group QR code)

Contributing

Developer's guide.

Write code that you would like to read again.

We use pre-commit and git cz to sanitize commits. Run pre-commit before git cz so you don't have to re-enter the commit message repeatedly.

pip install pre-commit
# Install pre-commit hooks
pre-commit install
pre-commit install --hook-type commit-msg
# Install this emoji-style tool
sudo npm install -g git-cz --no-audit --verbose --registry=https://registry.npmmirror.com

# Install rlite
pip install -e ".[dev]"
Code Style
  • Maximum line length is 99 characters for code and 79 for comments and documentation.
  • Write unit tests for atomic capabilities so that pytest passes.
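As a hedged illustration of the unit-test guideline, here is a minimal pytest-style test for an invented helper function (nothing below is part of RLite):

```python
# A minimal pytest-style unit test for an "atomic capability".
# The function under test is invented for illustration.

def clip_reward(r: float, lo: float = -1.0, hi: float = 1.0) -> float:
    """Clamp a reward into [lo, hi]."""
    return max(lo, min(hi, r))

def test_clip_reward():
    assert clip_reward(0.5) == 0.5    # inside range: unchanged
    assert clip_reward(7.0) == 1.0    # above range: clipped to hi
    assert clip_reward(-9.0) == -1.0  # below range: clipped to lo

test_clip_reward()  # pytest would discover and run this automatically
```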

Run pre-commit to automatically lint the code:

pre-commit run --all-files
Run Unit Tests:
# Only run tests
pytest

# Run tests and output test code coverage report
pytest --cov=rlite

Debug with VSCode

Preparation

Install the Command Variable extension for VSCode, then add a launch.json under .vscode/ with the following content:

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Python: Attach Dynamic",
      "type": "debugpy",
      "request": "attach",
      "connect": {
        "host": "${input:host}",
        "port": "${input:port}"
      },
      "pathMappings": [
        {
          "localRoot": "${workspaceFolder}",
          "remoteRoot": "."
        }
      ]
    }
  ],
  "inputs": [
    {
      "id": "host",
      "type": "command",
      "command": "extension.commandvariable.transform",
      "args": {
        "text": "${promptStringRemember:hostPort}",
        "find": ":.*",
        "replace": "",
        "promptStringRemember": {
          "hostPort": {
            "key": "hostPort",
            "description": "Input host:port"
          }
        }
      }
    },
    {
      "id": "port",
      "type": "command",
      "command": "extension.commandvariable.transform",
      "args": {
        "text": "${remember:hostPort}",
        "find": ".*:",
        "replace": ""
      }
    }
  ]
}
Debug by attaching to remote workers
  1. Insert a breakpoint() into the code you want to debug.
  2. Run the code until you see a message containing the IP:Port you can attach to.
  3. Copy the IP:Port, press F5, and paste it into the prompted box.

You should now be able to attach your VSCode to the breakpoint.

About

A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithms with minimal intrusion.
