Value Function Prediction Code

Issue #13 on hold
cdann@cdann.de created an issue

The current policy evaluation code seems to be broken. Replace the old code with a ValuePredictionExperiment (and maybe also a LinearValuePredictionExperiment) as a clean, proper way to do policy evaluation.

Comments (3)

  1. Will Dabney

    Could someone familiar with the original Policy Evaluation code add a comment to this issue (or the other Policy Evaluation issue) which outlines the desired behavior of the class?

  2. cdann@cdann.de reporter

    I didn't write the code, but from what I understand it is pretty experimental and hacky. I believe we did not announce policy evaluation as a feature, so I would mark it as unfinished work and not worry about it at the moment.

    I will probably be doing some policy evaluation work again soon. At that point I will code up a proper setup for Policy Evaluation with a (Linear)ValueEstimationExperiment. I may also port parts of my code from https://bitbucket.org/chrodan/tdlearn into RLPy.
