XYaoooo/reconstruction_long


VERL Self-Reconstruction (GRPO) Training

This repo provides scripts to install dependencies and run self-construction GRPO long-context training for Llama and Qwen models.

Installation

Create a clean conda environment (Python 3.12), install required backends, then install this repo in editable mode:

conda create -n verl python==3.12
conda activate verl

# install vLLM / SGLang / mcore deps (Megatron disabled)
USE_MEGATRON=0 bash scripts/install_vllm_sglang_mcore.sh

# install this repo (editable)
pip install --no-deps -e .
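
After the install steps above, a quick import check can confirm the environment is usable before launching a long run. This is a minimal sketch, not part of the repo: the helper name check_module and the PYTHON override are hypothetical, and the module names (torch, vllm, sglang, verl) are assumptions based on the install script's name, so adjust them to the backends you actually installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not provided by this repo): returns 0 if the given
# Python module can be imported in the active environment, non-zero otherwise.
check_module() {
  "${PYTHON:-python}" -c "import $1" >/dev/null 2>&1
}

# Assumed module names; edit this list to match your installed backends.
for mod in torch vllm sglang verl; do
  if check_module "$mod"; then
    echo "ok:      $mod"
  else
    echo "missing: $mod"
  fi
done
```

Run it with the verl conda environment activated. A "missing" line points to a backend that did not install or to an environment that is not active.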

Training

Training entry scripts:

  • self_construction_grpo_long_llama.sh
  • self_construction_grpo_long_qwen.sh

Tip: These scripts usually contain the full set of hyperparameters (model path, dataset, output dir, batch sizes, etc.).
Edit them directly to match your hardware and experiment settings.
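
A launch can be sketched as follows. This assumes the entry scripts sit at the repository root (as in the layout described in this README) and take no required arguments. The helper pick_script, the MODEL_FAMILY variable, and the log file name are hypothetical conveniences, not something the repo defines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a model family to its training entry script.
pick_script() {
  case "$1" in
    llama) echo "self_construction_grpo_long_llama.sh" ;;
    qwen)  echo "self_construction_grpo_long_qwen.sh" ;;
    *)     echo "unknown model family: $1" >&2; return 1 ;;
  esac
}

# MODEL_FAMILY is an assumed convenience variable; it defaults to llama.
MODEL_FAMILY="${MODEL_FAMILY:-llama}"
SCRIPT="$(pick_script "$MODEL_FAMILY")" || exit 1

if [ -f "$SCRIPT" ]; then
  # Keep a copy of the console output next to the run (assumed log name).
  bash "$SCRIPT" 2>&1 | tee "train_${MODEL_FAMILY}.log"
else
  echo "missing $SCRIPT; run this from the repository root" >&2
fi
```

For example, setting MODEL_FAMILY=qwen before running the snippet selects the Qwen script. Calling bash on the entry script directly works just as well; the wrapper only adds a log file and a check that the script exists.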

Models

Pretrained / instruction-tuned models used in this project:

Dataset

Self-construction / reconstruction dataset:

Repository structure (expected)

A typical layout for this repo:

.
├─ scripts/
│  └─ install_vllm_sglang_mcore.sh
├─ self_construction_grpo_long_llama.sh
├─ self_construction_grpo_long_qwen.sh
└─ (python package / training code)
