
RLite

🚀 Quick start · 🌰 Examples · 🍲 Recipes · 📚 Docs

A lightweight RL framework with PyTorch-like interfaces.

Features

  • FSDP2 and FSDP support for training.
  • vLLM support for inference.
  • Ray support for resource management.
  • Easy to learn and use. Most interfaces are kept the same as PyTorch, with the parallel engine working seamlessly behind the scenes.
  • Recipes that reproduce SOTA results in a single self-contained Python script.

Installation

pip install pyrlite
Advanced installation options

We recommend using conda to manage the environment.

  1. Create a conda environment:
conda create -n rlite python=3.12
conda activate rlite
  2. Install common dependencies:
# Install vLLM
pip install vllm accelerate

# Flash Attention 2 (set MAX_JOBS to match your CPU core count)
MAX_JOBS=64 pip install flash-attn --no-build-isolation

# Install FlashInfer for faster inference
pip install flashinfer-python==0.2.2.post1 -i https://flashinfer.ai/whl/cu124/torch2.6
  3. Install RLite:
git clone https://github.com/rlite-project/RLite.git
cd RLite && pip install -e .

Recipes

We use recipes as examples for reproducing SOTA RL methods.

Featured recipes

Programming Model


In RLite, users mainly work with Engines: handlers that take input from the main process, organize tasks, and dispatch them to workers. An engine may have multiple Executors, each holding a full copy of the model weights. Both Engines and Executors reside in the main process. Workers are the units that actually perform the computation, with each Worker bound to a GPU. Conversely, a single GPU can host multiple Workers, which share it in a time-multiplexed manner.
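The dispatch flow above can be sketched in a few lines of plain Python. The classes below are illustrative stand-ins only, not RLite's actual API:

```python
# Toy sketch of the Engine/Worker split described above.
# These classes are invented for illustration -- NOT RLite's real API.

class Worker:
    """Performs the actual computation; in RLite each Worker maps to a GPU."""
    def __init__(self, gpu_id: int):
        self.gpu_id = gpu_id

    def run(self, task):
        # A real worker would execute the task on its GPU.
        return f"task {task!r} ran on GPU {self.gpu_id}"

class Engine:
    """Takes input from the main process and dispatches tasks to Workers."""
    def __init__(self, workers):
        self.workers = workers

    def submit(self, tasks):
        # Round-robin dispatch. Two Workers may share one GPU
        # (time-multiplexing), so gpu_ids can repeat across workers.
        return [self.workers[i % len(self.workers)].run(t)
                for i, t in enumerate(tasks)]

engine = Engine([Worker(0), Worker(0), Worker(1)])  # GPU 0 is shared
results = engine.submit(["a", "b", "c", "d"])
```

Note that the fourth task wraps around to the first Worker, illustrating how more tasks than Workers are handled.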

Key Interfaces

RLite provides minimal interfaces that are

  • easy to learn: most interfaces resemble the behavior of PyTorch.
  • super flexible: interfaces are independent and can be used separately. This allows inference without training (e.g. evaluation tasks) or training without inference (e.g. SFT and DPO).
  • super powerful: combined, the interfaces allow reproduction of SOTA RL results.
  • highly extensible: the interfaces allow extensions for advanced features such as other training/inference backends, streaming generation for multi-turn use cases, and asynchronous workers for overlapping time-consuming operations.

Inference

Inference Example
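Since the example content itself is not reproduced here, the following pure-Python toy shows the kind of batched greedy generation an inference engine performs. The bigram table and the `generate` function are invented for illustration and are not RLite's API:

```python
# Illustrative only: a toy greedy decoder standing in for what an
# inference backend (e.g. vLLM behind RLite) does for each prompt.
# The bigram table below is invented for this example.

BIGRAMS = {"the": "cat", "cat": "sat", "sat": "down"}

def generate(prompt: str, max_new_tokens: int = 3) -> str:
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        nxt = BIGRAMS.get(tokens[-1])
        if nxt is None:  # no known continuation: stop early
            break
        tokens.append(nxt)
    return " ".join(tokens)

# "Batched" inference over several prompts at once.
outputs = [generate(p) for p in ["the", "cat"]]
```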

Train

Train Example
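As a stand-in for the example content, here is the generic PyTorch-style training pattern that RLite's interfaces are described as mirroring, written in plain Python with a hand-computed gradient. Nothing here is RLite-specific:

```python
# A PyTorch-style training loop in plain Python, shown only to
# illustrate the loss -> backward -> step pattern that RLite's
# training interface mirrors. Gradient computed by hand for (w - 3)^2.

w = 0.0   # parameter
lr = 0.1  # learning rate

for step in range(100):
    loss = (w - 3.0) ** 2
    grad = 2.0 * (w - 3.0)  # d(loss)/dw, the "backward" pass
    w -= lr * grad          # the "optimizer.step()"

# w converges toward the minimizer 3.0
```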

Offload/Reload/Discard Weights

Device Example
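The offload/reload/discard semantics can be illustrated with plain dicts standing in for GPU and CPU memory. The function names are invented for this sketch and are not RLite's real API:

```python
# Illustrative only: toy offload/reload/discard semantics with plain
# dicts as stand-ins for GPU and CPU memory. Not RLite's real API.

gpu, cpu = {"w": [1.0, 2.0]}, {}

def offload():
    """Move weights GPU -> CPU, freeing GPU memory but keeping a copy."""
    cpu.update(gpu); gpu.clear()

def reload():
    """Move weights CPU -> GPU so computation can resume."""
    gpu.update(cpu); cpu.clear()

def discard():
    """Drop the weights entirely (e.g. when they can be re-synced later)."""
    gpu.clear(); cpu.clear()

offload()
assert not gpu and "w" in cpu  # weights now live on CPU only
reload()
assert not cpu and "w" in gpu  # weights back on GPU
```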

Synchronize Weights

Weight Sync
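Weight synchronization, e.g. pushing freshly trained weights into the inference engine, can be sketched with plain dicts. All names below are invented for illustration:

```python
# Illustrative only: syncing trainer weights into an inference engine,
# sketched with plain dicts (names invented for this example).

trainer_state = {"layer.weight": [0.5, -0.2], "layer.bias": [0.1]}
infer_state   = {"layer.weight": [0.0,  0.0], "layer.bias": [0.0]}

def sync_weights(src: dict, dst: dict) -> None:
    # Copy every parameter the destination expects, in the spirit of
    # PyTorch's load_state_dict.
    for name in dst:
        dst[name] = list(src[name])  # copy, so later training steps
                                     # don't alias inference weights

sync_weights(trainer_state, infer_state)
```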

Join Our Discussion on WeChat

(WeChat group QR code)

Contributing

Developer's guide.

Write code that you would like to read again.

We use pre-commit and git cz to sanitize commits. Run pre-commit before git cz so you don't have to re-enter the commit message repeatedly.

pip install pre-commit
# Install pre-commit hooks
pre-commit install
pre-commit install --hook-type commit-msg
# Install this emoji-style tool
sudo npm install -g git-cz --no-audit --verbose --registry=https://registry.npmmirror.com

# Install rlite
pip install -e ".[dev]"
Code Style
  • Maximum line length is 99 characters for code and 79 for comments and documentation.
  • Write unit tests for atomic capabilities so that pytest passes.
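As a hedged illustration of the unit-test guideline, here is a minimal pytest-style test for an invented helper function (nothing below is part of RLite):

```python
# A minimal pytest-style unit test for an "atomic capability".
# The function under test is invented for illustration.

def clip_reward(r: float, lo: float = -1.0, hi: float = 1.0) -> float:
    """Clamp a reward into [lo, hi]."""
    return max(lo, min(hi, r))

def test_clip_reward():
    assert clip_reward(0.5) == 0.5    # inside range: unchanged
    assert clip_reward(7.0) == 1.0    # above range: clipped to hi
    assert clip_reward(-9.0) == -1.0  # below range: clipped to lo

test_clip_reward()  # pytest would discover and run this automatically
```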

Run pre-commit to automatically lint the code:

pre-commit run --all-files
Run Unit Tests:
# Only run tests
pytest

# Run tests and output test code coverage report
pytest --cov=rlite

Debug with VSCode

Preparation

Install the Command Variable extension for VSCode, then add a launch.json under .vscode/ with the following content:

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Python: Attach Dynamic",
      "type": "debugpy",
      "request": "attach",
      "connect": {
        "host": "${input:host}",
        "port": "${input:port}"
      },
      "pathMappings": [
        {
          "localRoot": "${workspaceFolder}",
          "remoteRoot": "."
        }
      ]
    }
  ],
  "inputs": [
    {
      "id": "host",
      "type": "command",
      "command": "extension.commandvariable.transform",
      "args": {
        "text": "${promptStringRemember:hostPort}",
        "find": ":.*",
        "replace": "",
        "promptStringRemember": {
          "hostPort": {
            "key": "hostPort",
            "description": "Input host:port"
          }
        }
      }
    },
    {
      "id": "port",
      "type": "command",
      "command": "extension.commandvariable.transform",
      "args": {
        "text": "${remember:hostPort}",
        "find": ".*:",
        "replace": ""
      }
    }
  ]
}
Debug by attaching to remote workers
  1. Insert a breakpoint() into the code you want to debug.
  2. Run the code until you see a message containing the IP:Port you can attach to.
  3. Copy the IP:Port, press F5, and paste it into the prompted box.

You should now be able to attach your VSCode to the breakpoint.

About

A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithms with minimal intrusion.
