pa_ovisitor : auto-generation of visitor, folder and mapper from type definitions

CamlP4 type_conv module to auto-generate visitor, folder, mapper classes from type definitions.

This introduces 3 type_conv with extensions: ovisit, ofold and omap.

  • ovisit provides visitor pattern, iterates over nodes. You can use it for generic folding, adding a mutable state from the inheritance of generated class.
  • ofold provides generic fold. The methods take the accumulator argument to be pure.
  • omap provides generic map. It has also an accumulator for folding.

These generated classes define methods with the type names which provide the "spine" algorithm for visitor, folding and mapping. Normally you must inherit the class and extend some (or all) of the methods for your purpose. See examples at pa/test directory.

Visitor auto-generation

Let's start a small example to see how it works:

type t =
  | Foo of int list
  | Bar of int * int
with ovisit

A type definition is postfixed by with ovisit.

Preprocess this code with CamlP4 and pa_ovisitor by camlp4o pa_type_conv.cma pa_ovisitor.cma -printer Camlp4OCamlPrinter (The source code must be saved as, and the include directories (-I ...) for pa_type_conv.cma and pa_ovisitor.cma are omitted.). Then you should see an auto-created class definition ovisit_t with the following definition (you do not need to follow the entire code here):

class virtual ovisit_t =
  object (self)
    method virtual list : 'a1. ('a1 -> unit) -> 'a1 list -> unit
    method virtual int : int -> unit
    method t : t -> unit =
      fun __value ->
        match __value with
        | Foo __x1 -> (self#list self#int __x1; ())
        | Bar (__x1, __x2) -> (self#int __x1; self#int __x2; ())

This ovisit_t is a skelton class for the visitor pattern of the data type. It has methods of the same names as the data types appear in the type definition: list, int and t. They are visitors for those types. The visitor method t, for example, has type t -> unit: it takes a value of type t and visit its internal components.

Extending skelton classes

In method t, the components of t are visited by the methods int and list which should visit int and list data types. Note that methods int and list are virtual and no definitions are given. It is the user's responsibility to give the implementation of these virtual visitor methods. There are two ways of this: auto-generate underlying visitor methods by attaching with ovisit to the definitions of the underlying types, or hand writing them. Since int and list are primitive types without type deinitions, we implement them by ourselves:

class c = object
  inherit ovisit_t

  (* the state *)
  val mutable st = 0
  method st = st

  (* visitor *)
  method int n = st <- st + n
  method list = List.iter

Here, the visitor methods for int and list are defined, inheriting the skelton class ovisit_t. Note that the class also has a state st. The int visitor accumulates the value to the states, and the list visitor iterates the visitor function for the parameter type over the list elements. (See the type of the method list for details.)

As a result the class c is a class to accumulates all the integers of type t:

let () =
  let o1 = new c in
  o1#t (Foo [1;2;3;4;5]);
  assert (o1#st = 15);
  let o2 = new c in
  o2#t (Bar (1,2));
  assert (o2#st = 3)

Composing visitors

The classes created by with ovisit only concretely define visitor methods which are defined mutually in the type definition. The visitor methods for the other data types which are defined outside of the definition are kept virtual. In order to implement the complete visitor classes, you can hand-write those virtual methods like above, or compose visitor classes by inheritance which is explained in this section.

Now let's define a new data type u using t inside, and generate its visitor:

type u = { t : t; children : u list } with ovisit

The visitor skelton class ovisit_u has a virtual method t for the visitor of t. We can use the visitor class c (or c_pure) we have created above. The composition is done by inheritance:

class c_u = object
  inherit c
  inherit ovisit_u

That's it. Now the method u of c_u accumulates all the integer occurrences in a value of type u:

let () =
  let o1 = new c_u in
  o1#u { t = Foo [1;2;3;4;5]; children = [] };
  assert (o1#st = 15);
  let o2 = new c_u in
  o2#u { t = Bar (1,2);
         children = [ { t = Foo [1;2;3;4;5]; children = [] };
                      { t = Bar (3,4); children = [] } ]
  assert (o2#st = 25)


with ofold auto-generates folder: a functional version of visitor. It is like the relationship between List.iter and List.fold: a folder method takes an accumulator argument and traverses data modifying the accumulator value and returns it as the result:

type t =
  | Foo of int list
  | Bar of int * int
with ofold

class c = object
  inherit [int] ofold_t (* requires the accumulator type *)

  method int st n = st + n
  method list = List.fold_left

let () =
  let o = new c in (* folder is pure, so we only need one o *)
  assert ( o#t 0 (Foo [1;2;3;4;5]) = 15 );
  assert ( o#t 0 (Bar (1,2)) = 3 )


with omap auto-generates mapper skelton class. It is also an extension of ofold:

type t =
  | Foo of int list
  | Bar of int * int
with omap

class c = object
  inherit [int] omap_t (* requires the accumulator type *)

  method int st n = st + n, n + 1
  method list f st xs =
    List.fold_left (fun (st,xs) x ->
      let st, x = f st x in st, x :: xs) (st, []) (List.rev xs)

let () =
  let o = new c in (* folder is pure, so we only need one o *)
  assert ( o#t 0 (Foo [1;2;3;4;5]) = (15, Foo [2;3;4;5;6]) );
  assert ( o#t 0 (Bar (1,2)) = (3, Bar (2,3)) )

Evaluation order

The evaluation order of visitor/folder/mapper is depth first and left-to-right, but you should avoid relying on it when you write those methods by hand.

NO_VISIT( typename, typename, ... )

Sometimes it is desirable just to skip visiting some specific data types. Skipping visiting can be achieved by overriding the corresponding method with NOP method:

class o = object (self:'self)
    inherit ovisit_x

    method int _ = () (* Just ignore it *)

There is another way. Declaring data type names which should not be visited by NO_VISIT(...) toplevel expression. The data types in NO_VISIT(...) are excluded from the next with ovisit/ofold/omap auto-generation and it creates classes without the corresponding methods. Those data types are simply skipped in visiting process:

NO_VISIT(bool, list)

type t = ... bool ... list ... with ovisit

The class ovisit_t does not have method bool and list. Booleans and any list element in data type t is not visited by ovisit_t.

NO_VISIT(...) also applies to with ofold and with omap. For ofold no visit data types are just skiped as ovisit no visits. For omap, the values of no visit data types are kept as they are.

The effect of NO_VISIT(...) is only available at the next pa_ovisitor annotation. At each with ovisit/ofold/omap, the set of no visit data types are reset to the empty.