Unify Pendulum and Cartpole base classes

Issue #14 resolved
cdann@cdann.de created an issue

Pendulum* and Cartpole* classes should be unified to use a single Cartpole base class.

Comments (6)

  1. cdann@cdann.de reporter

    What about the following class structure:

    • CartPoleBase: contains

      • all the dynamics (step, all constants etc)
      • the functions plot_policy(pi) and plot_valfun(pi), which take 2-dimensional inputs
      • a plot_cart(state, action) function which plots the state and takes a 4-dim. state

      We use the following subclasses

      1. InfTrackCartPole:
        • state property which omits 2 dimensions
        • showLearning that computes V and pi and calls the helper functions to plot
        • redefine GROUND_VERTS to not show boundaries
        • showDomain: calls plot_cart with the omitted state dimensions set to 0 (the view is always centered around the cart on the infinite track)
      2. FiniteTrackCartPole
        • showLearning: computes V and pi for a centered cart with 0 velocity, calls the helper functions to plot, and adds text to the plot stating which slice of V and pi is shown
        • showDomain: calls helper functions to plot

    The actual classes then just inherit from InfTrackCartPole or FiniteTrackCartPole.
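
    The proposed hierarchy could be sketched roughly as follows. This is only an illustration, not the actual implementation: the state ordering [theta, theta_dot, x, x_dot], the constant values, the placeholder dynamics in step, and the fact that plot_cart merely validates its input instead of drawing are all assumptions made to keep the example self-contained.

```python
import numpy as np


class CartPoleBase:
    """Sketch of the shared base: full 4-dim dynamics and all constants."""

    # Placeholder constants (assumed values, not taken from any source).
    FORCE = 1.0
    DT = 0.02
    GROUND_VERTS = ((-3.0, 0.0), (3.0, 0.0))  # track boundaries to draw

    def __init__(self):
        # Assumed ordering of the full state: [theta, theta_dot, x, x_dot]
        self._full_state = np.zeros(4)

    @property
    def state(self):
        return self._full_state.copy()

    def step(self, action):
        """Stand-in dynamics; the real class would integrate the cart-pole ODE."""
        s = self._full_state
        force = self.FORCE * action
        s[1] += self.DT * force
        s[0] += self.DT * s[1]
        s[3] += self.DT * force
        s[2] += self.DT * s[3]
        return self.state

    def plot_cart(self, state, action):
        """Expects a 4-dim state; here it only validates and returns it."""
        state = np.asarray(state, dtype=float)
        if state.shape != (4,):
            raise ValueError("plot_cart expects a 4-dim state")
        return state


class InfTrackCartPole(CartPoleBase):
    GROUND_VERTS = ()  # redefined so that no boundaries are shown

    @property
    def state(self):
        # Omit the two cart dimensions (assumed to be the last two entries).
        return self._full_state[:2].copy()

    def showDomain(self, action=0):
        # Pad the omitted dimensions with 0: the view stays centered on the cart.
        padded = np.concatenate([self.state, np.zeros(2)])
        return self.plot_cart(padded, action)


class FiniteTrackCartPole(CartPoleBase):
    def showDomain(self, action=0):
        # The full 4-dim state is passed straight to the helper.
        return self.plot_cart(self.state, action)
```

    With this layout, InfTrackCartPole().state has 2 entries while FiniteTrackCartPole().state has 4, and both reuse the same step and plot_cart code.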

    What do you think? This way we avoid lots of code duplication and it should be clearer what the Pendulum domain actually is.

  2. Robert Klein

    That's possible. The original reason we split into separate domains though was that different sources actually used different constants for cartpole vs. pendulum.

    E.g., the "Cartpole" domain from Lagoudakis + Parr (2003) specifies a 50 N force on the cart (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.118.8770&rep=rep1&type=pdf), while the "pendulum" domain from Sutton + Barto (1998) specifies 1 N*m torques (http://code.google.com/p/rl-library/wiki/CartpoleJava).

    These 'constants' are arbitrarily different for the different tasks (swingup vs inverted balance) as well.

    This has been an ongoing headache with these domains.

    For now, how about we go with your scheme and override the domain constants for Inf vs. Finite using the 2 standards above, ignoring the arbitrary differences between the tasks?

  3. cdann@cdann.de reporter

    I think this is a good idea, Bobby. We can override constants where necessary at the second or third inheritance level. We can even override the integration method (see Cartpole.int_type) without touching the dynamics implementation itself.
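
    A minimal sketch of what overriding at the second and third inheritance level might look like. The attribute name FORCE, the int_type values "euler" and "midpoint", the stand-in dynamics, and the class FiniteTrackCartPoleMidpoint are hypothetical; only the magnitudes 50 and 1 come from the numbers quoted in this thread, and assigning them to the finite and infinite track respectively is an assumption.

```python
class CartPoleBase:
    """Base level: the dynamics live here and are never re-implemented."""

    FORCE = 10.0        # placeholder default (assumed)
    DT = 0.02           # integration time step (assumed)
    int_type = "euler"  # hypothetical value; selects the integration scheme

    def _accel(self, vel, action):
        # Stand-in for the real dynamics: acceleration proportional to the action.
        return self.FORCE * action

    def step(self, pos, vel, action):
        # Dispatch on int_type so subclasses can switch schemes via one attribute.
        schemes = {"euler": self._euler, "midpoint": self._midpoint}
        return schemes[self.int_type](pos, vel, action)

    def _euler(self, pos, vel, action):
        a = self._accel(vel, action)
        return pos + self.DT * vel, vel + self.DT * a

    def _midpoint(self, pos, vel, action):
        a = self._accel(vel, action)
        vel_mid = vel + 0.5 * self.DT * a
        return pos + self.DT * vel_mid, vel + self.DT * a


class FiniteTrackCartPole(CartPoleBase):
    # Second level: 50 N, the value attributed to Lagoudakis + Parr above.
    FORCE = 50.0


class InfTrackCartPole(CartPoleBase):
    # Second level: magnitude 1, the value attributed to Sutton + Barto above.
    FORCE = 1.0


class FiniteTrackCartPoleMidpoint(FiniteTrackCartPole):
    # Third level (hypothetical class): only the integration method changes.
    int_type = "midpoint"


print(FiniteTrackCartPole().step(0.0, 0.0, 1))
print(FiniteTrackCartPoleMidpoint().step(0.0, 0.0, 1))
```

    The two finite-track classes share the same _accel code and differ only in class attributes, which is the point of the scheme.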

    Btw: we should refer to Rich Sutton's original code at http://webdocs.cs.ualberta.ca/~sutton/book/code/pole.c instead of the rl-library implementation.
