huggingface / trl Public

generated from fastai/nbdev_template

Pull requests: huggingface/trl

103 Open 2,661 Closed

Pull requests list

Fix SFTTrainer support for single-image data

#5132 opened Feb 19, 2026 by qgallouedec

Fix SFTTrainer crash when train_dataset=None

#5131 opened Feb 19, 2026 by albertvillanova

Fix import latency [2/N]: Implement native _is_package_available

#5129 opened Feb 19, 2026 by albertvillanova

Fix import latency [1/N]: Extract _LazyModule to dedicated module

#5128 opened Feb 19, 2026 by albertvillanova

MGPO feature addition

#5126 opened Feb 19, 2026 by damoonsh

2 of 5 tasks

wip(neuron): add neuron integration for SFT

#5125 opened Feb 18, 2026 by michaelbenayoun • Draft

Decouple rollout dispatch from vLLM backend in GRPO _generate_single_turn

#5122 opened Feb 18, 2026 by albertvillanova

feat(experimental): Divergence Proximal Policy Optimization

#5117 opened Feb 17, 2026 by LeonEricsson

5 tasks

Add prefix-preserving training chat template for GPT-OSS

#5109 opened Feb 17, 2026 by qgallouedec

feature: Configurable num logprobs in vLLM generation

#5107 opened Feb 16, 2026 by LeonEricsson

2 of 6 tasks

Add support for DGPO (ICLR 2026) to GRPO

#5102 opened Feb 15, 2026 by YanqiDai

5 tasks done

Add environment_factory to GRPOTrainer

#5093 opened Feb 13, 2026 by qgallouedec

Cast multimodal forward_kwargs to compute dtype for bf16/fp16 training

#5073 opened Feb 11, 2026 by akshan-main

4 of 5 tasks

Add support for DPPO [WIP]

#5065 opened Feb 10, 2026 by catherinelee274 • Draft

5 tasks

Fix GRPO VLM prompt handling for string prompts

#5064 opened Feb 10, 2026 by akshan-main

5 tasks done

Add CFPO objective to GRPO trainer

#5027 opened Feb 9, 2026 by asparius

Add support for MaxRL

#5026 opened Feb 9, 2026 by catherinelee274

4 of 5 tasks

Feature/ HICRA implementation

#4997 opened Feb 6, 2026 by w601sxs

2 of 5 tasks

Add OpenEnv's Rubrics support

#4994 opened Feb 6, 2026 by sergiopaniego • Draft

5 tasks

fix: add gradient checkpointing to PolicyAndValueWrapper

#4955 opened Feb 3, 2026 by lvhungdev

3 of 5 tasks

OpenEnv clients async support update

#4949 opened Feb 2, 2026 by sergiopaniego

5 tasks

[Experimental] Add SDFT trainer, config, docs, and tests

#4941 opened Jan 31, 2026 by Shekswess

4 of 5 tasks

Update RewardFunc type to use RewardCallable protocol

#4938 opened Jan 31, 2026 by amit9oct

2 of 5 tasks

documentation for modifying chat templates for assistant-only loss

#4937 opened Jan 30, 2026 by jiosephlee

Add Wordle example with Qwen3 thinking activated

#4936 opened Jan 30, 2026 by sergiopaniego • Draft

5 tasks
