Including policy gradient methods

Issue #25 resolved
Pierre-Luc Bacon created an issue

The project description suggests that RLPy is mainly about value-function-based algorithms. However, I think it'd be nice to add Will Dabney's implementation of some of the popular policy gradient methods.

https://github.com/amarack/python-rl/blob/master/pyrl/agents/policy_gradient.py

Comments (4)

  1. cdann@cdann.de

    We totally agree with you. This is definitely a near-future goal for RLPy. Which specific method do you suggest we address first?

    Btw: There is an implementation of Natural Actor Critic in RLPy, but unfortunately it has been tested very little so far (see the simple example in examples/gridworld/nac.py).

  2. Pierre-Luc Bacon reporter

    I think that all of Will's code should be included!

    Having an implementation of REINFORCE would also be a useful baseline.
