Pull requests: huggingface/trl
Fix import latency [2/N]: Implement native _is_package_available
#5129
opened Feb 19, 2026 by
albertvillanova
Fix import latency [1/N]: Extract _LazyModule to dedicated module
#5128
opened Feb 19, 2026 by
albertvillanova
Decouple rollout dispatch from vLLM backend in GRPO _generate_single_turn
#5122
opened Feb 18, 2026 by
albertvillanova
feat(experimental): Divergence Proximal Policy Optimization
#5117
opened Feb 17, 2026 by
LeonEricsson
5 tasks
Add prefix-preserving training chat template for GPT-OSS
#5109
opened Feb 17, 2026 by
qgallouedec
feature: Configurable num logprobs in vLLM generation
#5107
opened Feb 16, 2026 by
LeonEricsson
2 of 6 tasks
Add support for DGPO (ICLR 2026) to GRPO
#5102
opened Feb 15, 2026 by
YanqiDai
5 tasks done
Cast multimodal forward_kwargs to compute dtype for bf16/fp16 training
#5073
opened Feb 11, 2026 by
akshan-main
4 of 5 tasks
Fix GRPO VLM prompt handling for string prompts
#5064
opened Feb 10, 2026 by
akshan-main
5 tasks done
fix: add gradient checkpointing to PolicyAndValueWrapper
#4955
opened Feb 3, 2026 by
lvhungdev
3 of 5 tasks
[Experimental] Add SDFT trainer, config, docs, and tests
#4941
opened Jan 31, 2026 by
Shekswess
4 of 5 tasks
Update RewardFunc type to use RewardCallable protocol
#4938
opened Jan 31, 2026 by
amit9oct
2 of 5 tasks
documentation for modifying chat templates for assistant-only loss
#4937
opened Jan 30, 2026 by
jiosephlee
Add Wordle example with Qwen3 thinking activated
#4936
opened Jan 30, 2026 by
sergiopaniego
Draft
5 tasks