1. camlspotter
  2. mutated_ocaml


camlspotter  committed 74058d5

more details

  • Participants
  • Parent commits 9a3b33f
  • Branches typeloc

Comments (0)

Files changed (1)

File README.typeloc

View file
 typeloc branch: An attempt to improve unification error report.
-This typeloc branch tries to print additional information at type unification
-errors, to indicate which parts of the source code introduce the type 
-constructors caused the errors.
+This typeloc branch tries to print additional information at type unification errors, 
+to show which type constructors actually contradict, with their source code locations of introduction.
-This is inspired from the same idea implemented for Haskell type checker 
-by Lennart Augustsson.
+The idea of this implementation is given by the implementation of Haskell type checker by Lennart Augustsson, demonstrated at his talk at CUFP 2011.
+It should be also noted that the same idea was implemented in Opa langage. 
+See http://blog.opalang.org/2012/07/programming-tools-ux-better-type-error.html .
+What does it do?
+The modified OCaml compiler prints out additional lines at type error reports.
+For example, think about a simple language modeled by the following type t 
+and a function names : t -> string list, which should extract tokens used in it::
+    type t = 
+      | Int of int
+      | Null
+      | List of t list
+      | Tuple of t * t
+      | String of string
+    let rec names st = function
+      | Int n -> n :: st 
+      | Null -> st
+      | List xs -> List.fold_left names st xs
+      | Tuple (x,y) -> List.fold_left names st [x; y]
+      | String s -> s :: st
+Actually this code is rejected by the compiler::
+    $ ocamlc -c test.ml
+    File "test.ml", line 13, characters 16-17:
+    Error: This expression has type string but an expression was expected of type
+             int
+The error is reported at the last line of the code, the use of the variable s. 
+It saids s is expected to have type int. But it is sure to be string and 
+the programmer is writing a function which returns string... Nothing wrong around the line. Strange!
+And why did the compiler confuse with string and int? Where did this int come from?
+This example is so simple that it is easy to see that the problem is the first case which stores an integer to the list,
+but generally speaking, in more complex function definitions we often face more complex instance of this sort of cryptic type errors,
+whose fixes are actually far away from the reported error locations. 
+This is since the ML type inference is actually a simple (but also powerful) constraint solver, using type unification.
+Each syntax construct may introduce some constraints of types to the system, and once the infernece system finds
+the accumulated constraints have no solution, it reports a type error with the source location which added the last constraint. 
+But it does not always mean that this position is the place to be fixed. The programmer often feels::
+     The type system tells that this expression of this position has this type and that type at the same time, but they conflict. Wow.
+     But, first of all, I do not even understand why the system thinks this expression should have those types.
+Can we improve this? Pin-pointing where to be fixed is of course impossible. 
+The type inference is just a damn mathematical algorithm and cannot understand human intention, 
+like "the function has the name "names" so it should return string list rather than int list".
+But still there are things we can do . The typeloc branch provides additional information::
+    The differing types are:
+      string : introduced at File "test.ml", line 6, characters 14-20
+      int : introduced at File "test.ml", line 2, characters 11-14
+      #  GeeeZ! It is not a good example!!! The location should be at the use of variants not at the defs of them!!
+This tells you that it found a conflict between the two type constructors string and int 
+with the locations of syntax constructs which introduced them.
+Each syntactic construct introduces very small type constraint and therefore easy to understand:
+for example an expression 42 enforces its type to be an int.
+Constructor (::) asks its second argument and return type to be some list. The programmer's feeling should now change::
+     I understand this part of the code introduces this type constraint.
+     I understand that part of the code introduces that type constraint.
+     They are clearly incompatible.
+     Then, who has introduced unification of them?
+Of course this does not pin-point the place to be fixed, but you have more clues which are easier to understand.
 Current status 
 `Proof of concept' level implementation.
+Technical details
+In the typeloc branch, each type expression node (type_expr) has location information of where it is introduced.
+If two type constructors with valid introduction locations are unified, one of them are chosen randomly for the unified type.
+This extension introduces some complexity to the type system, and may impact on its efficiency:
+the same type can be introduced many times by different source code locations.
+For example, all the int types are all pointer equivalent in the original OCaml type checker, but 
+it is no longer true in typeloc, with reflecting the fact that int can be introduced by difference part of source code.