🎯 Focusing
Chen Xinran. Undergraduate at SJTU; graduate student at UCAS / ISCAS. Research focus: post-training. Baidu AMU Team.
Pinned
- Self-KDRL (Public, forked from lasgroup/SDPO)
  Reinforcement Learning via Self-Distillation (SDPO)
  Python
- verl-recipe-opkd (Public, forked from verl-project/verl-recipe)
  A set of examples based on verl for end-to-end RL training recipes.
  Python
- verl-upstream (Public, forked from verl-project/verl)
  verl: Volcano Engine Reinforcement Learning for LLMs
  Python
- vllm-inference (Public, forked from vllm-project/vllm)
  A high-throughput and memory-efficient inference and serving engine for LLMs
  Python