Hi,
This question is about the DAgger part of HW1.
I am curious how querying the expert policy with loaded_gaussian_policy works. Could someone point me to some resources? The actions are not looked up directly from the expert-labeled data but are computed by this small net, right? How was it trained? Is this net a way of storing the expert actions, or is it the "expert" itself?
Thank you!