ReplayBuffer for async rollouts #600

@kwanUm

Is your feature request related to a problem? Please describe.
Kudos for getting fully async rollout generation working! 🚀
Actors now sample trajectories in parallel, but the learner still waits for an entire batch before it can start backprop. This leaves GPUs idle and stretches overall wall-clock training time.

Describe the solution you’d like
Add a simple replay buffer between trajectory collection and training so the two stages run concurrently.
Actors call push(traj) as soon as a rollout finishes. There's even a Ray Queue that can be used for this.
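
To make the idea concrete, here is a rough sketch of what I have in mind. It is not tied to this repo's code: `ReplayBuffer`, `pop_batch`, `actor`, and `run_demo` are hypothetical names, the trajectories are placeholder dicts, and I use the standard-library `queue.Queue` with threads as a stand-in so the snippet runs without a cluster. My assumption is that `ray.util.queue.Queue`, which offers similar `put`/`get` calls, could take its place in a real setup.

```python
import queue
import threading
from typing import Any, List, Optional


class ReplayBuffer:
    """Bounded FIFO buffer sitting between rollout actors and the learner.

    Hypothetical sketch: backed by the stdlib queue.Queue as a stand-in
    for a distributed queue.
    """

    def __init__(self, capacity: int = 1024) -> None:
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=capacity)

    def push(self, traj: Any) -> None:
        # Blocks when the buffer is full, which applies back-pressure to actors.
        self._queue.put(traj)

    def pop_batch(self, batch_size: int, timeout: Optional[float] = None) -> List[Any]:
        # Blocks until batch_size trajectories have arrived (or raises
        # queue.Empty if a single get exceeds the timeout).
        return [self._queue.get(timeout=timeout) for _ in range(batch_size)]

    def __len__(self) -> int:
        return self._queue.qsize()


def actor(buffer: ReplayBuffer, actor_id: int, num_rollouts: int) -> None:
    for step in range(num_rollouts):
        traj = {"actor": actor_id, "step": step}  # placeholder rollout
        buffer.push(traj)  # pushed as soon as the rollout finishes


def run_demo(num_actors: int = 4, rollouts_per_actor: int = 8, batch_size: int = 8):
    buffer = ReplayBuffer(capacity=16)
    threads = [
        threading.Thread(target=actor, args=(buffer, i, rollouts_per_actor))
        for i in range(num_actors)
    ]
    for t in threads:
        t.start()

    # The learner consumes batches while actors are still producing.
    batches = []
    total = num_actors * rollouts_per_actor
    for _ in range(total // batch_size):
        batches.append(buffer.pop_batch(batch_size, timeout=5.0))

    for t in threads:
        t.join()
    return batches
```

The bounded capacity is a deliberate choice in this sketch: if the learner falls behind, actors block on `push` instead of piling up increasingly stale trajectories. Whether that is the right policy here (versus dropping the oldest samples) is open for discussion.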
