process-cxr

🎯

Focusing

SeptRan process-cxr

Chen Xinran. Undergraduate at SJTU; graduate student at UCAS / ISCAS. Research focus: post-training. Baidu AMU Team.

15 followers · 43 following

https://scholar.google.com/citations?user=q2ElSKsAAAAJ&hl=zh-CN

Pinned

Self-KDRL Public

Forked from lasgroup/SDPO

Reinforcement Learning via Self-Distillation (SDPO)

Python

rllm Public

Forked from rllm-org/rllm

Democratizing Reinforcement Learning for LLMs

Python

verl-recipe-opkd Public

Forked from verl-project/verl-recipe

A set of examples based on verl for end-to-end RL training recipes.

Python

verl-upstream Public

Forked from verl-project/verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python

distillm Public

Forked from jongwooko/distillm

Interesting Distillation of LLM

Python

vllm-inference Public

Forked from vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python