Value Function Prediction Code
Issue #13
on hold
The current policy evaluation code seems to be broken. Replace the old code with a ValuePredictionExperiment (maybe also a LinearValuePredictionExperiment) as a clean, proper way to do Policy Evaluation.
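To make the request a bit more concrete, here is a rough sketch of what a linear value prediction experiment could look like. It is not existing RLPy code: the class name is taken from the proposal above, but the constructor arguments, the methods (`features`, `value`, `step`, `run`, `rmse`) and the 5-state random-walk task are all assumptions chosen only for illustration. It evaluates a fixed uniform random policy with linear TD(0) on one-hot features and compares the estimate to the known true values.

```python
import random


class LinearValuePredictionExperiment:
    """Hypothetical sketch (not RLPy API): linear TD(0) policy evaluation
    on a small random-walk chain with a fixed uniform random policy."""

    def __init__(self, n_states=5, alpha=0.05, gamma=1.0, seed=0):
        self.n_states = n_states
        self.alpha = alpha            # constant TD step size
        self.gamma = gamma            # discount factor
        self.rng = random.Random(seed)
        self.theta = [0.0] * n_states  # linear weights

    def features(self, state):
        # One-hot features, so the linear estimate is effectively tabular.
        phi = [0.0] * self.n_states
        phi[state] = 1.0
        return phi

    def value(self, state):
        return sum(t * f for t, f in zip(self.theta, self.features(state)))

    def step(self, state):
        # Fixed policy: move left or right with equal probability.
        # Leaving the chain on the right pays 1, on the left pays 0.
        next_state = state + self.rng.choice((-1, 1))
        if next_state < 0:
            return None, 0.0
        if next_state >= self.n_states:
            return None, 1.0
        return next_state, 0.0

    def run(self, n_episodes=5000):
        for _ in range(n_episodes):
            state = self.n_states // 2
            while state is not None:
                next_state, reward = self.step(state)
                v_next = 0.0 if next_state is None else self.value(next_state)
                td_error = reward + self.gamma * v_next - self.value(state)
                phi = self.features(state)
                for i in range(self.n_states):
                    self.theta[i] += self.alpha * td_error * phi[i]
                state = next_state
        return self.theta

    def rmse(self):
        # For gamma = 1 the true value of state s is (s + 1) / (n_states + 1).
        true_values = [(s + 1) / (self.n_states + 1)
                       for s in range(self.n_states)]
        sq = [(self.value(s) - v) ** 2 for s, v in enumerate(true_values)]
        return (sum(sq) / self.n_states) ** 0.5


experiment = LinearValuePredictionExperiment()
experiment.run()
print("RMSE of the value estimate:", experiment.rmse())
```

The point of the sketch is the separation of concerns: the experiment owns the policy rollout and the error measurement, while the estimator is just a weight vector plus a feature map, so a non-linear ValuePredictionExperiment could reuse the same loop with a different `value` and update rule.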
Comments (3)
reporter I didn't write the code, but from what I understood it is pretty experimental and hacky. I believe we did not announce policy evaluation as a feature, so I would mark it as unfinished work and not worry about it at the moment.
I am probably going to do some policy evaluation work again soon. Then I will code up a proper setup for Policy Evaluation with a (Linear)ValueEstimationExperiment. I will maybe transfer parts of my code at https://bitbucket.org/chrodan/tdlearn into RLPy.
reporter - changed status to on hold
Could someone familiar with the original Policy Evaluation code add a comment to this issue (or the other Policy Evaluation issue) which outlines the desired behavior of the class?