[BUG]: Code in ppo.py always reads the same data from prompt_dataloader #3746

@lizongzhi

Description

🐛 Describe the bug

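The line "prompts = next(iter(self.prompt_dataloader))" always reads the same data from prompt_dataloader, because iter() creates a fresh iterator on every call and next() then returns its first batch. Is this correct?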
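A minimal sketch of the behavior in question, assuming the dataloader is not shuffled. A plain list of batches stands in for prompt_dataloader here, and the names prompt_batches, first_batch_each_time, cycle_batches, and advancing_batches are hypothetical and not taken from ppo.py. It compares the reported pattern with one possible alternative that keeps a single iterator alive across steps:

```python
from typing import Iterable, Iterator, List

# Hypothetical stand-in for an unshuffled prompt_dataloader: three fixed batches.
prompt_batches: List[List[str]] = [["p0", "p1"], ["p2", "p3"], ["p4", "p5"]]


def first_batch_each_time(loader: Iterable, steps: int) -> list:
    """Reported pattern: a new iterator is built on every step."""
    # iter(loader) starts from the beginning each time, so next() always
    # returns the first batch of an unshuffled loader.
    return [next(iter(loader)) for _ in range(steps)]


def cycle_batches(loader: Iterable) -> Iterator:
    """Yield batches in order, restarting when the loader is exhausted."""
    while True:
        empty = True
        for batch in loader:
            empty = False
            yield batch
        if empty:
            # Avoid spinning forever on an empty loader.
            return


def advancing_batches(loader: Iterable, steps: int) -> list:
    """Possible alternative: one persistent iterator shared by all steps."""
    batches = cycle_batches(loader)
    return [next(batches) for _ in range(steps)]


repeated = first_batch_each_time(prompt_batches, 4)
advanced = advancing_batches(prompt_batches, 4)
print(repeated)
print(advanced)
```

With a shuffled loader, each fresh iterator would draw a new random order, so the first pattern would return varying batches but still would not walk through the dataset once per epoch.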

Also, what do "num_episodes" and "max_timesteps" mean, and how should these two parameters be set?

Thank you.

Environment

No response
