Is your feature request related to a problem? Please describe.
Kudos for getting fully async rollout generation working! 🚀
Now that actors sample trajectories in parallel, the learner still waits for an entire batch before it can start back-prop. This leaves GPUs idle and stretches overall wall-clock training time.
Describe the solution you’d like
Add a simple replay buffer between trajectory collection and training so the two stages run concurrently.
Actors call push(traj) as soon as a rollout finishes. There's even a Ray Queue that could be used for this.
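A minimal sketch of the idea, using stdlib threads and `queue.Queue` to stand in for Ray actors (Ray's `ray.util.queue.Queue` exposes the same blocking put/get API across processes); the trajectory contents and batch size here are placeholders:

```python
import queue
import threading

BATCH_SIZE = 4

# Bounded buffer between collection and training: actors block (back off)
# if the learner falls behind, instead of growing memory without limit.
buffer = queue.Queue(maxsize=64)

def actor(actor_id, n_rollouts):
    # Each actor pushes a trajectory the moment its rollout finishes,
    # rather than waiting for a synchronized generation round.
    for i in range(n_rollouts):
        traj = {"actor": actor_id, "step": i}  # placeholder trajectory
        buffer.put(traj)

def learner(n_batches, results):
    # The learner pulls a batch as soon as enough trajectories arrive,
    # so back-prop overlaps with ongoing rollout generation.
    for _ in range(n_batches):
        batch = [buffer.get() for _ in range(BATCH_SIZE)]
        results.append(batch)  # back-prop would happen here

results = []
threads = [threading.Thread(target=actor, args=(a, 4)) for a in range(2)]
threads.append(threading.Thread(target=learner, args=(2, results)))
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results), [len(b) for b in results])  # → 2 [4, 4]
```

With Ray, the actor and learner would run as separate processes sharing one distributed queue, but the decoupling is the same: collection never waits on training, and vice versa, except through the buffer's bounds.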