VERL Self-Reconstruction (GRPO) Training

This repo provides scripts to install dependencies and run self-construction GRPO long-context training for Llama and Qwen models.

Installation

Create a clean conda environment (Python 3.12), install required backends, then install this repo in editable mode:

conda create -n verl python=3.12
conda activate verl

# install vLLM / SGLang / mcore deps (Megatron disabled)
USE_MEGATRON=0 bash scripts/install_vllm_sglang_mcore.sh

# install this repo (editable)
pip install --no-deps -e .
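
A quick sanity check after installation can catch a broken environment before a long training run. The sketch below assumes the editable install exposes a `verl` Python package and that the install script provides `vllm`; both module names are assumptions, and the `check_import` helper is hypothetical, not part of this repo.

```shell
# Hypothetical helper (not part of this repo): report whether a Python
# module can be imported with the interpreter of the active environment.
PYTHON="${PYTHON:-python}"

check_import() {
  if "$PYTHON" -c "import $1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "MISSING: $1"
    return 1
  fi
}

# Assumed module names; adjust to the backends you actually installed.
for mod in verl vllm; do
  check_import "$mod" || true
done
```

Any `MISSING` line suggests re-running the corresponding install step inside the activated `verl` environment.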

Training

Training entry scripts (in the repo root):

self_construction_grpo_long_llama.sh
self_construction_grpo_long_qwen.sh

Tip: These scripts usually contain the full set of hyperparameters (model path, dataset, output dir, batch sizes, etc.).
Edit them directly to match your hardware and experiment settings.
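
Launching a run amounts to executing one of the entry scripts from the repo root. The sketch below picks the script by model family and keeps a copy of the console output; the `pick_script` and `launch` helpers and the `logs/` directory are illustrative assumptions, not part of the repo.

```shell
# Hypothetical helper: map a model family to its entry script.
pick_script() {
  case "$1" in
    llama) echo "self_construction_grpo_long_llama.sh" ;;
    qwen)  echo "self_construction_grpo_long_qwen.sh" ;;
    *)     echo "unknown model family: $1" >&2; return 1 ;;
  esac
}

# Run from the repo root. logs/ is an assumed location for console output.
# Note: the pipeline's exit status is that of tee, not of the training script.
launch() {
  script="$(pick_script "$1")" || return 1
  mkdir -p logs
  bash "$script" 2>&1 | tee "logs/${1}_$(date +%Y%m%d_%H%M%S).log"
}

# Example (after activating the conda environment):
#   launch qwen
```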

Models

Pretrained / instruction-tuned models used in this project:

Llama-3.1-8B-Instruct-14k: https://huggingface.co/YaoYX/Llama-3.1-8B-Instruct-14k
Qwen2.5-7B-Instruct-1M-14k: https://huggingface.co/YaoYX/Qwen2.5-7B-Instruct-1M-14k
Qwen2.5-7B-Instruct-1M-30k: https://huggingface.co/YaoYX/Qwen2.5-7B-Instruct-1M-30k

Dataset

Self-construction / reconstruction dataset:

reconstruction_14k: https://huggingface.co/datasets/YaoYX/reconstruction_14k
reconstruction_validation: https://huggingface.co/datasets/YaoYX/reconstruction_validation
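
The checkpoints and datasets listed above are hosted on the Hugging Face Hub, so one way to fetch them is the `huggingface-cli download` command from the `huggingface_hub` package. The sketch below only prints the commands so they can be reviewed first; the `DATA_ROOT` directory and the `hf_cmd` helper are assumptions for illustration, and the entry scripts may expect different paths.

```shell
# Assumed local root for downloads; adjust to your storage layout.
DATA_ROOT="${DATA_ROOT:-./hf_cache}"

# Hypothetical helper: print (not run) the download command for one Hub repo.
#   $1 = repo type (model or dataset), $2 = repo id
hf_cmd() {
  echo "huggingface-cli download $2 --repo-type $1 --local-dir ${DATA_ROOT}/${2##*/}"
}

hf_cmd model   YaoYX/Llama-3.1-8B-Instruct-14k
hf_cmd model   YaoYX/Qwen2.5-7B-Instruct-1M-14k
hf_cmd dataset YaoYX/reconstruction_14k
hf_cmd dataset YaoYX/reconstruction_validation
```

Once the printed paths look right, run the commands and point the model and dataset paths in the training script at the resulting directories.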

Repository structure (expected)

A typical layout for this repo:

.
├─ scripts/
│  └─ install_vllm_sglang_mcore.sh
├─ self_construction_grpo_long_llama.sh
├─ self_construction_grpo_long_qwen.sh
└─ (python package / training code)
