Confusion about using an RNN in MASAC #4

@Lianghj427

Developing a multi-agent version of SAC is great work, but I'm confused about how an RNN is used in this MASAC. More specifically, if we employ a GRUCell in the Actor, how can we sample a new action during training? The hidden state recorded during execution was produced by an earlier version of the policy, so it may not match the policy being trained, especially in the off-policy setting.
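To make the concern concrete, here is a minimal, dependency-free sketch of the mismatch. Everything in it is an assumption for illustration: `TinyGRUCell` is a hypothetical scalar stand-in for a real GRUCell, `unroll` is a hypothetical helper, and the `+ 0.5` weight shift merely imitates the parameter drift caused by gradient updates. It is not taken from this repository.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyGRUCell:
    """Hypothetical scalar GRU cell (1 input, 1 hidden unit) for illustration."""

    def __init__(self, weights):
        # Nine scalars: (input weight, hidden weight, bias) for the
        # update gate z, the reset gate r, and the candidate state n.
        self.weights = list(weights)

    def step(self, x, h):
        wz_x, wz_h, bz, wr_x, wr_h, br, wn_x, wn_h, bn = self.weights
        z = sigmoid(wz_x * x + wz_h * h + bz)          # update gate
        r = sigmoid(wr_x * x + wr_h * h + br)          # reset gate
        n = math.tanh(wn_x * x + r * (wn_h * h) + bn)  # candidate state
        return (1.0 - z) * n + z * h                   # new hidden state


def unroll(cell, observations, h0=0.0):
    """Run the cell over a whole observation sequence, returning every hidden state."""
    h = h0
    hiddens = []
    for x in observations:
        h = cell.step(x, h)
        hiddens.append(h)
    return hiddens


rng = random.Random(0)
episode = [rng.uniform(-1.0, 1.0) for _ in range(8)]  # toy scalar observations

# Hidden states as they would be cached in the replay buffer at execution time.
old_weights = [rng.uniform(-1.0, 1.0) for _ in range(9)]
behaviour_cell = TinyGRUCell(old_weights)
stored_hiddens = unroll(behaviour_cell, episode)

# Assumed parameter drift after some training updates.
current_cell = TinyGRUCell([w + 0.5 for w in old_weights])

# Re-unrolling the same observations with the current parameters gives the
# hidden states the current policy would actually have produced.
fresh_hiddens = unroll(current_cell, episode)

staleness = [abs(s - f) for s, f in zip(stored_hiddens, fresh_hiddens)]
print("per-step |stored - fresh| hidden gap:", [round(g, 3) for g in staleness])
```

The gap printed at the end is the mismatch in question: an action sampled for the SAC update from a stored hidden state is conditioned on a state the current actor would not have reached. One possible workaround, shown by `fresh_hiddens`, is to store observation sequences and recompute the hidden state with the current parameters before sampling; whether this MASAC does that is exactly what the question asks.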
