Including policy gradient methods
Issue #25
resolved
The project description suggests that RLPy is mainly about value-function-based algorithms. However, I think it would be nice to add Will Dabney's implementation of some of the popular policy gradient methods.
https://github.com/amarack/python-rl/blob/master/pyrl/agents/policy_gradient.py
Comments (4)
reporter I think that all of Will's code should be included!
Having an implementation of REINFORCE would also be a useful baseline.
Thanks Pierre. I sent an email to Will about this.
- changed status to resolved
We totally agree with you. This is definitely a near-future goal for RLPy. Which specific method do you suggest we address first?
Btw: There is an implementation of Natural Actor Critic in RLPy, but unfortunately it has been tested very little so far (cf. the simple example in examples/gridworld/nac.py).